New benchmark launched: Microsoft's DELEGATE-52 measures AI performance across 52 sectors and reveals weaknesses in how models handle complex, long-running workflows. Error ...