AI models from OpenAI and Google DeepMind have achieved gold-medal-level performance at the 2025 International Mathematical Olympiad, each producing natural-language answers that solved five of six problems. It is also the first time general-purpose AI systems have surpassed most of the world's best high-school competitors on the contest while working directly from the official problem statements, without the problems first being translated into a formal, machine-readable language.

Performance Overview
OpenAI’s experimental reasoning model correctly answered five IMO questions in its “informal” evaluation, generating proof‑style solutions in English. The company hired three former IMO medalists to grade its outputs before notifying the Olympiad committee and timing its announcement to follow the students’ award ceremony. Google DeepMind entered its advanced Gemini Deep Think agent (an enhanced reasoning mode that explores multiple solution paths in parallel) and officially collaborated with IMO graders to confirm its 35‑point gold‑medal score under the contest’s guidelines.
Verification Dispute
Google publicly criticized OpenAI for announcing its results before independent IMO verification. DeepMind CEO Demis Hassabis noted that Google respected the Olympiad’s request to wait for official grading and student recognition. OpenAI responded that it initially declined formal participation to focus on natural‑language reasoning but agreed to delay its release after the IMO provided feedback. Neither company has released full model details, though both claim their systems outperform last year’s silver‑medal‑level AI and outscore most contestants.
Implications for AI Reasoning
The gold-medal results show that AI reasoning models can now solve highly abstract problems in an unstructured setting, with minimal formal encoding of the input. Experts note that five correct proofs outscore roughly 90 percent of human contestants, a sign that the gap between human and machine problem-solving is narrowing even on difficult creative tasks. The clarity and accuracy of the proofs both models produce also suggest applications in collaborative research in mathematics and allied fields.
What’s Next
OpenAI has yet to release GPT‑5, which is expected to ship with expanded reasoning features, while Google continues to refine Deep Think within the Gemini interface. Competition for AI talent and for perceived leadership remains intense, and benchmarks like the IMO serve as flagship demonstrations of each lab's growing capabilities.

Because real-world problems demand domain knowledge and iterative verification of correctness, researchers caution that the IMO milestone is one step toward more general-purpose reasoning assistants rather than a definitive instance of human-level intelligence.
For both labs, continued cooperation with organizations such as the IMO will be essential to creating an environment in which AI achievements can be tested rigorously. Mathematical competitions have become a vivid measure of how quickly AI reasoning is advancing, and this is unlikely to be the last such milestone in research and education.