As a TCS assistant professor from Eastern Europe, I'm always a little jealous of the biggest names in math having such easy access to the expensive long-thinking models.
Paying for Pro from any of my current academic budgets is completely out of the realm of possibility here -- all budgets tend to have restricted uses, and software payments fit into very few categories. Effectively, I'd have to apply for a brand-new grant, hope the grant rules allow large software payments, and hope I don't run into an anti-AI reviewer; such a thing would take at least a year.
As a final nail in the coffin, I was recently "denied" Claude Opus entirely as part of Microsoft's clampdown on individual (and academic) use of Copilot.
(ChatGPT 5.5 Plus does not seem sufficient for any deeper investigation into new research topics; I've tried.)
Apologies for the rant.
You can’t afford $200/month in Eastern Europe? How poor is Eastern Europe?
For a TCS assistant professor in Eastern Europe, $200/month would be 20% of their salary.
And the situation has improved: ten years ago it would have been 80%.
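(Back-of-the-envelope: that puts the implied salary at roughly $1,000/month today, versus about $250/month a decade ago.)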
It's a very long post with a mix of technical (math) and philosophical sections. Here are the most striking points to reflect upon IMHO.
> It seems to me that training beginning PhD students to do research [...] has just got harder, since one obvious way to help somebody get started is to give them a problem that looks as though it might be a relatively gentle one. If LLMs are at the point where they can solve “gentle problems”, then that is no longer an option. The lower bound for contributing to mathematics will now be to prove something that LLMs can’t prove, rather than simply to prove something that nobody has proved up to now and that at least somebody finds interesting.
Training must start from the basics, though. Everybody's mathematical training begins with summing small integers, which calculators have been doing flawlessly for a long time.
The point is perhaps confirmed by another comment further down in the post:
> by solving hard problems you get an insight into the problem-solving process itself, at least in your area of expertise, in a way that you simply don’t if all you do is read other people’s solutions. One consequence of this is that people who have themselves solved difficult problems are likely to be significantly better at solving problems with the help of AI, just as very good coders are better at vibe coding than not such good coders
People pay coders to build stuff that they will use to make money, so I can happily use an AI to deliver faster and keep being hired. I'm not sure there is a similar point with math. Again from the post:
> suppose that a mathematician solved a major problem by having a long exchange with an LLM in which the mathematician played a useful guiding role but the LLM did all the technical work and had the main ideas. Would we regard that as a major achievement of the mathematician? I don’t think we would.
> So maybe there should be a different repository where AI-produced results can live.
Does the author know about CAISc 2026 [0]?
[0]: https://caisc2026.github.io
As a graduate student, this piece made me sad. I always believed that my work speaks for itself and transcends my limited time in this cosmic experience. This notion of immortality was just a small intangible bonus I hoped for when I jumped into grad school. AI is making me feel less worthy.
You are worthy. You will hone your skills in grad school and be able to command these AIs better than somebody who hasn’t struggled with hard problems for a long time.
I saw Tim Gowers give a talk at the AMS-MAA joint meeting in Seattle about ten years ago where he predicted that in 100 years humans would no longer be doing research mathematics. I wonder if he’s adjusted his timeline.
At the time I thought the key missing tool was a natural language search that acted like mathoverflow, where you could explain your problem or ideas as you understood them and get references to relevant literature (possibly outside your experience or vocabulary).
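Something like that is now buildable with off-the-shelf embedding search. A minimal sketch, assuming the sentence-transformers package; the model name, the toy abstracts, and the query are all illustrative, not a real index:

```python
# Toy sketch of natural-language literature search over abstracts.
# Assumes: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

# Illustrative corpus; a real tool would index arXiv/zbMATH abstracts.
abstracts = [
    "We give lower bounds for sumsets A+A of finite integer sets.",
    "Spectral gap estimates for random regular graphs.",
    "A survey of the polynomial method in combinatorics.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice
doc_vecs = model.encode(abstracts, normalize_embeddings=True)

def search(problem_description: str, k: int = 2):
    """Return the k abstracts closest to a free-form problem description."""
    q = model.encode([problem_description], normalize_embeddings=True)[0]
    scores = doc_vecs @ q  # cosine similarity (vectors are unit-normalized)
    top = np.argsort(-scores)[:k]
    return [(float(scores[i]), abstracts[i]) for i in top]

# The query shares almost no vocabulary with the paper it should surface.
print(search("how big must the set of pairwise sums of n integers be?"))
```

The point of the embedding step is exactly the one above: you can phrase the problem in your own vocabulary and still land near papers that describe the idea in different terms.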
> So if your aim in doing mathematics is to achieve some kind of immortality, so to speak, then you should understand that that won’t necessarily be possible for much longer — not just for you, but for anybody.
This made me a little sad
Now repeat that for every sort of human achievement
After reading this post, I have to admit that I could not understand the mathematical parts at all because they are beyond my current knowledge.
But one thing seems clear to me. If I try to describe the situation in mathematics presented here, it sounds as if there were already precedents and existing pieces of knowledge, but humans had not thought to connect them. AI seems to have helped make that connection.
If AI can connect different fields in this way, then perhaps something even more significant could emerge from it.
That said, I could not understand most of the article. And if using LLMs properly requires this level of background knowledge, I honestly worry about whether I can really use them well.
On complex problems with lengthy proofs, the first step I would have taken is to ask 5.5 Pro, in a new, unrelated session, to be very critical and try to find flaws in the arguments.
And certainly not to send it first to a colleague for their opinion.
LLMs are certainly becoming capable of writing code, finding vulnerabilities, and solving mathematical problems, but we need to avoid putting their work in production, or in front of other humans, without assessing it by every possible means.
Otherwise tech leads, maintainers, and experts get overwhelmed, and that is how "AI slop" fatigue begins.
To be clear I’m talking about this step:
> That preprint would have been hard for me to read, as that would have meant carefully reading Rajagopal’s paper first, but I sent it to Nathanson, who forwarded it to Rajagopal, who said he thought it looked correct.
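For what it's worth, that "fresh, unrelated session" step is easy to script so it always happens before anything leaves your machine. A minimal sketch, assuming the OpenAI Python client; the model name is a placeholder, and the referee prompt is just one way to phrase the instruction:

```python
# Sketch of the "fresh, unrelated session as adversarial referee" step.
# Assumes: pip install openai, with OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # a brand-new client; no history from the proving session

CRITIC_PROMPT = (
    "You are a skeptical referee. Try hard to find flaws in the proof "
    "below: unjustified steps, circular reasoning, misquoted lemmas, "
    "missing edge cases. Do not assume it is correct. List every concern."
)

def adversarial_review(proof_text: str, model: str = "gpt-5.5-pro") -> str:
    """Ask an unrelated session to attack the argument, not defend it.
    The model name is a placeholder for whatever long-thinking model you use."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": CRITIC_PROMPT},
            {"role": "user", "content": proof_text},
        ],
    )
    return response.choices[0].message.content

# Only after this pass (and your own reading) does it go to a colleague.
```

The design point is that the critic session shares no history with the session that produced the proof, so it has nothing to defend.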
> but we need to avoid putting their work in production, or in front of other humans, without assessing it by every possible means.
I think this is good advice in general, maybe with an emphasis on public vs. private, friendly contact. Having zero-thought AI slop thrown at you out of the blue is rude; "could have been a prompt" indeed. But having a friend or colleague ask for a quick glance at something they know you handle well is another story for me.
If I've worked on a subject for a few years and know the particulars inside and out, I'd have no trouble skimming something that a friend or a colleague sent me. I am sparing those 5-10 minutes for the friend, not for what they sent. And for an expert in a particular domain, often 5 minutes is all it takes for a "lgtm" or "lol no".
Is the assessment system of undergraduate mathematics education no longer effective?
Undergraduate? No. We've had calculators able to solve undergraduate problems for decades. AI doesn't change the need to understand how calculus works any more than calculators did. The foundations remain valuable.
Graduate? Yes.
How should graduate school be changed then? Specifically for mathematics
I don’t think it’s just mathematics. We don’t hear enough about this, but if I think back to my undergraduate years, which were less than 10 years ago, every homework assignment and every take-home exam I had would be trivial for LLMs to solve at this point. I wonder what is actually happening on the ground.
Today I learnt that there are mathematics papers with titles like "Diversity, Equity and Inclusion for Problems in Additive Number Theory".