My main skepticism here is whether the theorems have been properly replicated in the proof. Verifying that the proof really captured the mathematical statement seems like a manual, human process, and quite hard to repeat reliably across proofs.
Maybe this is what you meant, but the dangerous part is ensuring that the final claim being proved is correct -- the actual proof of that claim is, by design, something that can be quickly and mechanically verified by applying a series of simple rules to a set of accepted axioms. Even a human could (somewhat laboriously) do this verification.
I have no experience with this, but I'd expect the danger to arise from implicit assumptions that are obvious to a mathematician in the relevant subfield but not necessarily to an LLM. E.g., whether we are dealing with real-valued or integer-valued variables can make a big difference in many problems, but might be explicitly stated only once, in the very first chapter (or even just implied by the book's title).
There are also many types of "overloaded" mathematical notation -- usually a superscripted number means "raised to the power", but if that number is -1, it might instead mean "the inverse of this function".
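A minimal Lean 4 sketch of why getting the type right matters (core Lean, no Mathlib assumed): the same-looking statement can be true over one number type and false over another, so a formalization that silently picks the wrong type is proving a different theorem than the book states.

```lean
-- Over the naturals, division truncates, so 1 / 2 really is 0:
example : (1 : Nat) / 2 = 0 := rfl

-- The "same" statement over the rationals is false -- if the
-- autoformalizer chose ℚ where the book meant ℕ (or vice versa),
-- the checker would happily verify the wrong claim.
-- example : (1 : ℚ) / 2 = 0  -- not provable (ℚ needs Mathlib)
```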
This is so cool. I think I had read that some mathematicians were trying to formalize all(?) known math in Lean, expecting it to be a long-term project. And I was recently wondering how long it would be before they turned LLMs loose on the problem.
Seems like plenty of people are already on the path. So cool.
Are there any lessons or tricks here that we can apply to software engineering? What kinds of assertions can we express more easily with Lean etc. than with unit tests, fuzz tests, property tests, using our existing testing frameworks?
The main lesson is quite simple: if you can write the test to be uncheatable, ChatGPT can write the code for it.
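For a concrete contrast (a minimal Lean 4 sketch; `my_add_comm` is just an illustrative name): a unit test checks one input, while a theorem quantifies over all inputs at once -- which is what makes it uncheatable.

```lean
-- A unit test pins down one concrete case:
example : 2 + 3 = 3 + 2 := rfl

-- A theorem covers every pair of naturals; the kernel rejects
-- any "implementation" (proof) that doesn't actually hold.
theorem my_add_comm (a b : Nat) : a + b = b + a := Nat.add_comm a b
```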
Can somebody explain like I'm five?
AI takes a math textbook and translates all the theorems and proofs into formal language that can be checked automatically for correctness.
Mathematicians are using LLMs to write proofs.