When AI Does the Math: Mathematicians Debate the Future of Their Field

Over the past two years, AI systems have achieved gold-medal performance at the International Mathematical Olympiad, autonomously produced PhD-level research results, and disproved conjectures in combinatorial geometry. These milestones have triggered a fundamental debate within the mathematical community — not just about AI’s capabilities, but about what mathematics is and whether it retains its meaning without human cognitive struggle.

Key takeaways

Google DeepMind and OpenAI systems reached gold-medal level at the International Mathematical Olympiad (summer 2025)
Aletheia (Google DeepMind) autonomously produced publishable PhD-level results in arithmetic geometry
An OpenAI system disproved a conjecture in combinatorial geometry, a result judged worthy of a top mathematics journal
Reasoning agent Gauss (Math, Inc.) helped formalize Viazovska’s Fields Medal-winning sphere-packing proof in 8 dimensions, then formalized the 24-dimensional case autonomously in two weeks
At the 12th Heidelberg Laureate Forum (September 2025), AI forecasts triggered palpable existential anxiety among young mathematicians

From stochastic parrots to mathematical reasoners

For decades, computers served mathematics as computational tools, not intellectual ones. In 1976, Kenneth Appel and Wolfgang Haken proved the four-color theorem with the help of a computer that checked 1,936 cases, a controversial approach because no human could realistically verify the work by hand. Even so, the mathematician’s role remained central: humans formulated conjectures, designed proof strategies, and verified reasoning. That balance is now shifting rapidly.

Large language models (LLMs), once dismissed as “stochastic parrots” capable only of regurgitating mathematical patterns scraped from the internet, have evolved into advanced mathematical reasoning systems. In the summer of 2025, systems from Google DeepMind and OpenAI achieved results equivalent to gold-medal performance at the International Mathematical Olympiad (IMO), a competition in which contestants must solve six notoriously difficult problems across various mathematical domains over two days.

Google DeepMind’s experimental system Aletheia went further: it autonomously produced PhD-level research results suitable for publication in arithmetic geometry, specifically computing structure constants in a field that is abstract even to most mathematicians. The significance lies not in the subject matter but in the nature of the reasoning displayed: the system independently framed and resolved an open research question. Around the same time, an OpenAI system disproved an important conjecture in combinatorial geometry. Leading mathematicians confirmed that the result would have been worthy of a prestigious journal had a human been the author.

Agent Gauss and the formalization frontier

A parallel development involves combining LLMs with proof assistants — systems such as Isabelle, Lean, and Rocq — that verify the logical correctness of each step in a proof. Translating an informal mathematical proof into machine-readable code (a process called formalization) has traditionally been a manual, labor-intensive task. AI is beginning to remove that bottleneck.
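
To make the idea of formalization concrete, here is a minimal sketch in Lean 4 of a statement far simpler than anything discussed in this article: the sum of two even numbers is even. The theorem name even_add_even and the encoding of “even” as “equal to 2 times some number” are illustrative choices, not taken from any of the projects mentioned here, and the sketch assumes only Lean’s core library (no Mathlib).

```lean
-- Informal claim: if a and b are both even, then a + b is even.
-- "Even" is spelled out by hand as "equal to 2 * (some natural number)".
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ m, b = 2 * m) :
    ∃ n, a + b = 2 * n := by
  -- Unpack the two witnesses k and m from the hypotheses.
  cases ha with
  | intro k hk =>
    cases hb with
    | intro m hm =>
      -- Propose k + m as the witness for a + b.
      refine ⟨k + m, ?_⟩
      -- Substitute a and b, then distribute 2 over k + m;
      -- both sides become 2 * k + 2 * m and the goal closes.
      rw [hk, hm, Nat.mul_add]
```

Every step here is checked by the proof assistant: if the witness were wrong or a rewrite did not apply, Lean would reject the proof. Formalizing a research-level result means doing the same thing across thousands of such steps, which is why the task has been so labor-intensive.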

In February 2025, the company Math, Inc. used a reasoning agent called Gauss to formalize a proof that earned mathematician Maryna Viazovska of EPFL, Switzerland, a Fields Medal in 2022. Viazovska had solved the sphere-packing problem in 8 dimensions. Gauss first helped human mathematicians complete the formalization of that case in a matter of days, then autonomously formalized the considerably harder 24-dimensional case in just two weeks.

“If it wasn’t for this formal verification layer, opening projects up without any safeguards would just be a disaster. But in math, we can completely check and verify outputs.”

Terence Tao, UCLA professor and Fields Medalist, on formalization as the foundation of future mathematical collaboration.

Heidelberg 2025: existential anxiety in the room

In September 2025, the 12th Heidelberg Laureate Forum — an annual conference bringing hundreds of young mathematicians and computer scientists together with their intellectual idols — was dominated by AI from the outset. Yang-Hui He of the London Institute for Mathematical Sciences articulated a forecast that stayed with attendees: if AI takes over conjecture formation, solution-space search, proof verification, and generalization without human involvement, mathematicians could become “priests to oracles.”

The audience’s reaction was far from enthusiastic. Trill White, a student at Deakin University in Australia, recalled her thoughts in that hall: “That’s devastating. What will people have to contribute to mathematics? Will it become something that no one understands?” Jessica Randall of Google Developer Groups described a “collective existential dread” rising among young participants. The feeling was widespread: attendees were realizing that AI has the potential to replace them in the work they had trained for.

Not everyone reacted with alarm. Jeremy Avigad of Carnegie Mellon University captured the pragmatic position: mathematicians care about answers to the hardest open questions, regardless of whether it is a human or a machine that provides them. Some established mathematicians, including Yang-Hui He, are comfortable with AI taking on tasks currently reserved for humans.

Three visions for the future

Three broad positions are emerging from the debate. The first treats AI as a tool — a calculator for higher mathematics, where human understanding remains the primary value. Akshay Venkatesh, a Fields Medalist at Princeton, argues that mathematics serves not only to produce true statements but to build shared human understanding. Maia Fraser of the University of Ottawa adds that a beautiful, human-comprehensible proof retains value even if AI finds its own version first.

The second position is collaboration — Terence Tao’s vision. His concept of “big mathematics” envisions large-scale, decentralized collaborations between humans and machines, where AI handles the technical grunt work and humans claim the creative parts. The third position, generating the most resistance, is AI as oracle: a fully autonomous researcher for whom results matter more than the human process of discovering them.

Why this matters

AI entering mathematics is not merely a change of tools — it is an epistemological challenge. A proof incomprehensible to humans raises the question of whether it constitutes the same kind of knowledge as one that can be traced step by step. The mathematical community, divided on the vision for the future, is nonetheless united on one point: AI has transformative potential, and its direction depends on decisions being made now. Mathematicians are writing essays, organizing workshops, and building the first ethical guidelines — applying to AI the same rigor they apply to theorems.

The risk of intellectual atrophy in the next generation — if students learn to bypass the struggle with difficult problems — is taken seriously. So is accessibility: if mathematics requires expensive proprietary AI models, it becomes an elite activity for well-funded institutions only. At stake is not just the status of a profession, but the character of one of the fundamental domains of human inquiry.

What’s next

Formalization via AI (projects combining Lean/Rocq with LLM assistants) will extend the scope of autonomous machine contributions — the Gauss precedent on Viazovska’s proof establishes a new baseline
The six remaining Millennium Prize Problems will serve as the next stress test for AI’s limits in the hardest open questions in mathematics, each carrying a $1 million prize
The International Mathematical Union and community working groups are preparing guidelines on AI in research and publication — outcomes will influence funding standards and authorship criteria