AI still unprepared to replace humans in office settings, study finds

In real workplace assignments, Gemini topped out at 24% accuracy, with GPT-5.2 close behind at 23%

By Geo News Digital Desk

About two years ago, a tech giant predicted that artificial intelligence (AI) would soon start taking over office workforces, but a new study suggests that the time for AI to replace humans has not yet come.

Although Microsoft CEO Satya Nadella had suggested otherwise, the recent findings indicate that AI has not yet mastered the nuances and nitty-gritty of office life.

Training-data company Mercor has introduced a new benchmark called APEX-Agents, designed to test AI systems in more realistic professional scenarios.

Unlike typical evaluations that ask a model to produce a poem or solve a tidy maths puzzle, the benchmark uses genuine queries from lawyers, consultants and bankers. The tasks are multi-step and require navigating different sources of information, mirroring the complexity of real workplace assignments.

The findings of the study are striking: even the most advanced models available, including Gemini 3 Flash and GPT-5.2, failed to reach a 25% accuracy rate.

Gemini topped the chart at 24%, GPT-5.2 was close behind at 23%, and other systems scored in the teens.

According to Mercor’s chief executive, Brendan Foody, the problem is not raw processing power but context. Office work often involves checking messages, reviewing policy documents, analysing spreadsheets and combining insights into a coherent answer.

Humans switch between these sources with relative ease. AI systems, however, struggle when information is scattered and ambiguous.

The study concludes that AI behaves less like an experienced professional and more like an unreliable intern, suggesting that a full AI takeover of office jobs is not yet near.