Here we go. Another tweet that can move hundreds of billions of dollars on the market and to justify a bail out. No accountability. Just the thrill to be the one to be able to say “first”.
Are you kidding? This is an incredible result. Stuff like this is the most important stuff happening in AI right now. Automated theorem proving? It's not too far to say the entire singular point of the technology was to get us to this.
We've had automated theorem proving since the 60s. What we need is automated theorem discovery. Erdős discovered these theorems even if he wasn't really able to prove them. Euler and Gauss discovered a ton of stuff they couldn't prove. It is weird that nobody considers this to be intelligence. Instead intelligence is a little game AI plays with Lean.
AI researchers keep trying to reduce intelligence into something tiny and approachable, like automated theorem proving. It's easy: you write the theorem you want proven and hope you get a proof. It works or it doesn't. Nice and benchmarkable.
Automated axiom creation seems a lot harder. How is an LLM supposed to know that "between any two points there is a line" formalizes an important property of physical space? Or how to suggest an alternative to Turing machines / lambda calculus that expresses the same underlying idea?
> We've had automated theorem proving since the 60s.
By that logic, we've had LLMs since the 60s!
> What we need is automated theorem discovery.
I don't see any reason you couldn't train a model to do this. You'd have to focus it on generating follow-up questions to ask after reading a corpus of literature, playing around with some toy examples in Python and making a conjecture out of it. This seems much easier than training it to actually complete an entire proof.
> Erdős discovered these theorems even if he wasn't really able to prove them. Euler and Gauss discovered a ton of stuff they couldn't prove. It is weird that nobody considers this to be intelligence.
Who says they don't? I wouldn't be surprised if HarmonicMath, DeepMind, etc have also thought about this kind of thing.
> Automated axiom creation seems a lot harder. How is an LLM supposed to know that "between any two points there is a line" formalizes an important property of physical space?
That's a good question! It would be interesting to see if this is an emergent property of multimodal LLMs trained specifically on this kind of thing. You would need mathematical reasoning, visual information and language encoded into some shared embedding space where similar things are mapped right next to each other geometrically.
This is me being snarky and ignorant, but if it solved one problem and it is automated what’s stopping it from solving all the others? That’s what’s ultimately being sold by the tweet right.
Ultimately the main thing that will stop it from solving literally "all the others" are things like the impossibility of solving the halting problem, considerations like P ≠ NP, etc. But as we have just seen, despite these impossibility theorems, AI systems are still able to make substantive progress on solving important open real-world problems.
>This is me being snarky and ignorant, but if it solved one problem and it is automated what’s stopping it from solving all the others?
Yeah that's the crux of the matter. How do AI did it? Using already existing math. If we need new math to prove Collatz, Goldbach or Riemman, LLMs are simply SOL. That's what's missing and hype boys always avoid to mention.
This is response from mathematician:
"This is quite something, congratulations to Boris and Aristotle!
On one hand, as the nice sketch provided below by tsaf confirms, the final proof is quite simple and elementary - indeed, if one was given this problem in a maths competition (so therefore expected a short simple solution existed) I'd guess that something like the below would be produced. On the other hand, if something like this worked, then surely the combined talents of Burr, Erdős, Graham, and Li would have spotted it.
Normally, this would make me suspicious of this short proof, in that there is overlooked subtlety. But (a) I can't see any and (b) the proof has been formalised in Lean, so clearly it just works!
Perhaps this shows what the real issue in the [BEGL96] conjecture is - namely the removal of 1 and the addition of the necessary gcd condition. (And perhaps at least some subset of the authors were aware of this argument for the easier version allowing 1, but this was overlooked later by Erdős in [Er97] and [Er97e], although if they were aware then one would hope they'd have included this in the paper as a remark.)
At the moment I'm minded to keep this as open, and add the gcd condition in the main statement, and note in the remarks that the easier (?) version allowing 1 and omitting the gcd condition, which was also asked independently by Erdős, has been solved."
The commentator is saying: "I can't believe this famous problem was solved so easily. I would have thought it was a fake proof, but the computer verified it. It turns out the solution works because it addresses a slightly different set of constraints (regarding the number 1) than what Erdős originally struggled with. (Generated by Gemini)
This seems to be 2nd in row proof from the same author by using the AI models. First time it was the ChatGPT which wrote the formal Lean proof for Erdos Problem #340.
> In over a dozen papers, beginning in 1976 and spanning two decades, Paul Erdős repeatedly posed one of his “favourite” conjectures: every finite Sidon set can be extended to a finite perfect difference set. We establish that {1, 2, 4, 8, 13} is a counterexample to this conjecture.
Ok… has this been verified? I see no publication or at least an announcement on Harmonics webpage. If this is a big deal, you think it would be a big deal, or is this just hype?
> My summary is that Aristotle solved "a" version of this problem (indeed, with an olympiad-style proof), but not "the" version.
> I agree that the [BEGL96] problem is still open (for now!), and your plan to keep this problem open by changing the statement is reasonable. Alternatively, one could add another problem and link them. I have no preference.
— BorisAlexeev
Here we go. Another tweet that can move hundreds of billions of dollars on the market and to justify a bail out. No accountability. Just the thrill to be the one to be able to say “first”.
Seriously, blame the investors.
Nah, everyone is to blame here in bubble.
Are you kidding? This is an incredible result. Stuff like this is the most important stuff happening in AI right now. Automated theorem proving? It's not too far to say the entire singular point of the technology was to get us to this.
I think it is way too far to say that!
We've had automated theorem proving since the 60s. What we need is automated theorem discovery. Erdős discovered these theorems even if he wasn't really able to prove them. Euler and Gauss discovered a ton of stuff they couldn't prove. It is weird that nobody considers this to be intelligence. Instead intelligence is a little game AI plays with Lean.
AI researchers keep trying to reduce intelligence into something tiny and approachable, like automated theorem proving. It's easy: you write the theorem you want proven and hope you get a proof. It works or it doesn't. Nice and benchmarkable.
Automated axiom creation seems a lot harder. How is an LLM supposed to know that "between any two points there is a line" formalizes an important property of physical space? Or how to suggest an alternative to Turing machines / lambda calculus that expresses the same underlying idea?
> We've had automated theorem proving since the 60s.
By that logic, we've had LLMs since the 60s!
> What we need is automated theorem discovery.
I don't see any reason you couldn't train a model to do this. You'd have to focus it on generating follow-up questions to ask after reading a corpus of literature, playing around with some toy examples in Python and making a conjecture out of it. This seems much easier than training it to actually complete an entire proof.
> Erdős discovered these theorems even if he wasn't really able to prove them. Euler and Gauss discovered a ton of stuff they couldn't prove. It is weird that nobody considers this to be intelligence.
Who says they don't? I wouldn't be surprised if HarmonicMath, DeepMind, etc have also thought about this kind of thing.
> Automated axiom creation seems a lot harder. How is an LLM supposed to know that "between any two points there is a line" formalizes an important property of physical space?
That's a good question! It would be interesting to see if this is an emergent property of multimodal LLMs trained specifically on this kind of thing. You would need mathematical reasoning, visual information and language encoded into some shared embedding space where similar things are mapped right next to each other geometrically.
[dead]
> What we need is automated theorem discovery.
I've been thinking mathematicians have fun doing math, making discoveries, crafting proofs.
Does Tour de France & Co. make no sense since small, lightweight and powerful e-bicycles appeared?
Using computer as a helper like bicycles is one thing, using LLMs seems more like e-bicycle and is something another.
This is me being snarky and ignorant, but if it solved one problem and it is automated what’s stopping it from solving all the others? That’s what’s ultimately being sold by the tweet right.
Ultimately the main thing that will stop it from solving literally "all the others" are things like the impossibility of solving the halting problem, considerations like P ≠ NP, etc. But as we have just seen, despite these impossibility theorems, AI systems are still able to make substantive progress on solving important open real-world problems.
>This is me being snarky and ignorant, but if it solved one problem and it is automated what’s stopping it from solving all the others?
Yeah that's the crux of the matter. How do AI did it? Using already existing math. If we need new math to prove Collatz, Goldbach or Riemman, LLMs are simply SOL. That's what's missing and hype boys always avoid to mention.
This is response from mathematician: "This is quite something, congratulations to Boris and Aristotle!
On one hand, as the nice sketch provided below by tsaf confirms, the final proof is quite simple and elementary - indeed, if one was given this problem in a maths competition (so therefore expected a short simple solution existed) I'd guess that something like the below would be produced. On the other hand, if something like this worked, then surely the combined talents of Burr, Erdős, Graham, and Li would have spotted it.
Normally, this would make me suspicious of this short proof, in that there is overlooked subtlety. But (a) I can't see any and (b) the proof has been formalised in Lean, so clearly it just works!
Perhaps this shows what the real issue in the [BEGL96] conjecture is - namely the removal of 1 and the addition of the necessary gcd condition. (And perhaps at least some subset of the authors were aware of this argument for the easier version allowing 1, but this was overlooked later by Erdős in [Er97] and [Er97e], although if they were aware then one would hope they'd have included this in the paper as a remark.)
At the moment I'm minded to keep this as open, and add the gcd condition in the main statement, and note in the remarks that the easier (?) version allowing 1 and omitting the gcd condition, which was also asked independently by Erdős, has been solved."
The commentator is saying: "I can't believe this famous problem was solved so easily. I would have thought it was a fake proof, but the computer verified it. It turns out the solution works because it addresses a slightly different set of constraints (regarding the number 1) than what Erdős originally struggled with. (Generated by Gemini)
I think that person is the owner of the website discussing it too.
He was cited https://the-decoder.com/leading-openai-researcher-announced-...
Where a while back OpenAI made a misleading claim about solving some of these problems.
More interesting discussion than on Twitter here:
https://www.erdosproblems.com/forum/thread/124#post-1892
For reference, this is from: https://harmonic.fun/
related article:
> Is Math the Path to Chatbots That Don’t Make Stuff Up?
https://www.nytimes.com/2024/09/23/technology/ai-chatbots-ch...
This seems to be 2nd in row proof from the same author by using the AI models. First time it was the ChatGPT which wrote the formal Lean proof for Erdos Problem #340.
https://arxiv.org/html/2510.19804v1#Thmtheorem3
> In over a dozen papers, beginning in 1976 and spanning two decades, Paul Erdős repeatedly posed one of his “favourite” conjectures: every finite Sidon set can be extended to a finite perfect difference set. We establish that {1, 2, 4, 8, 13} is a counterexample to this conjecture.
Meanwhile I have to use ai just to understand the problem statement.
Related, independent, and verified:
GPT-5 solved Erdős problem #848 (combinatorial number theory):
https://cdn.openai.com/pdf/4a25f921-e4e0-479a-9b38-5367b47e8...
https://lifearchitect.ai/asi/
I read how GPT-5 contributed to proof. It is not fully solved by GPT-5 instead assisted. For more look here https://www.math.columbia.edu/~msawhney/Problem_848.pdf
FTL? Seriously?
Ok… has this been verified? I see no publication or at least an announcement on Harmonics webpage. If this is a big deal, you think it would be a big deal, or is this just hype?
verified by lean so 99.99% yes
Lean verified a proof of a solution to a problem, but was it the same problem as Erdős problem #124?
https://www.erdosproblems.com/forum/thread/124#post-1899
> My summary is that Aristotle solved "a" version of this problem (indeed, with an olympiad-style proof), but not "the" version.
> I agree that the [BEGL96] problem is still open (for now!), and your plan to keep this problem open by changing the statement is reasonable. Alternatively, one could add another problem and link them. I have no preference. — BorisAlexeev
There we go, so there is hype to some degree.
Is there some good literature to read about lean? First time I’m hearing about it and it seems pretty cool.
[dead]