An AI that translates math problems into code to make them easier to solve


The process of automatically translating mathematical concepts written in natural language into formal specifications and proofs is called “autoformalization”. To improve on current autoformalization models and proof-verification tools, researchers have developed a new method based on an AI capable of carrying out this translation. Faster and more efficient than humans at the task, it could also lead to the discovery of new mathematical theories.

Mathematical proofs can now be verified by computers. This first requires “translating” the proof, which combines formulas, mathematical symbols and natural language, into a specific language that the machine can understand. The task is very time-consuming: it took nearly ten years to formalize (and verify) in this way the Kepler conjecture, which describes the densest way to stack identical spheres.

Not only would an automated system reduce the cost of current formalization efforts, it could also connect the various research fields that automate aspects of mathematical reasoning to the vast body of knowledge written exclusively in natural language. “A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence,” write the researchers in the article describing their approach.

Autoformalization with large language models

Only a very small fraction of mathematical knowledge has been formalized and then proven to date. A few projects tackle the understanding of formal languages, but they are limited to languages with large corpora available on the web (Python, for example).

Data on formal mathematics is very scarce: the Archive of Formal Proofs is only 180 MB, less than 0.18% of the training data of OpenAI Codex, an artificial intelligence developed by OpenAI that analyzes natural language and generates code in response. This AI was trained on large amounts of text and source code available on the web. It notably powers GitHub Copilot, a code auto-completion tool; it can satisfy approximately 37% of requests, which speeds up programming.
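The two figures quoted above imply a lower bound on the size of Codex's training data. Here is a quick back-of-the-envelope check in Python, assuming decimal units (1 GB = 1000 MB); the function and constant names are illustrative only and do not come from the researchers' work.

```python
# Back-of-the-envelope check of the figures quoted in the article.
AFP_SIZE_MB = 180        # size of the Archive of Formal Proofs, per the article
MAX_SHARE = 0.18 / 100   # "less than 0.18%" of Codex's training data


def implied_min_training_data_gb(subset_mb: float, max_share: float) -> float:
    """Smallest total size (in GB) for which subset_mb is at most max_share of it."""
    total_mb = subset_mb / max_share
    return total_mb / 1000  # assumption: decimal units, 1 GB = 1000 MB


if __name__ == "__main__":
    gb = implied_min_training_data_gb(AFP_SIZE_MB, MAX_SHARE)
    print(f"Implied Codex training data: more than about {gb:.0f} GB")
```

In other words, if 180 MB is less than 0.18% of the total, the total must exceed roughly 100 GB, which gives a sense of how small the formal-mathematics corpus is by comparison.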

[Figure: an example of a perfect translation of a natural-language statement into Isabelle code.]

To develop an AI capable of formalizing mathematical problems, Yuhuai Wu, a researcher at Google, and his colleagues had the idea of exploiting Codex, on the assumption that programming languages and mathematical proofs share some similarities.

They provided Codex with a set of 150 problems drawn from secondary school mathematics competitions. The AI was able to translate a significant portion (25.3%) of them perfectly into a language compatible with Isabelle, an open-source proof assistant used to write and check mathematical proofs. According to the team, most of the failed translations were due to the system not understanding some of the mathematical concepts used (because of a mismatch between formal and informal definitions).
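One common way to get a code model to perform this kind of translation is few-shot prompting: the model is shown a handful of worked (natural language, formal statement) pairs, followed by a new problem whose translation is left open for it to complete. The article does not describe the prompt format, so the sketch below is an assumption throughout: the wording of the prompt, the `Example` class, the `build_prompt` helper and the sample problems are all hypothetical, and no model is actually called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One (informal statement, formal statement) pair used as a demonstration."""
    informal: str
    formal: str


# Hypothetical demonstration pair written for this sketch (not from the paper).
DEMONSTRATIONS = [
    Example(
        informal="Show that for any natural number n, n + 0 = n.",
        formal='theorem\n  fixes n :: nat\n  shows "n + 0 = n"',
    ),
]


def build_prompt(demos: list[Example], problem: str) -> str:
    """Concatenate the worked pairs, then leave the final translation open."""
    parts = []
    for demo in demos:
        parts.append(
            f"Natural language version: {demo.informal}\n"
            "Translate the natural language version to an Isabelle version:\n"
            f"{demo.formal}\n"
        )
    # The last block has no formal statement: the model is expected to write it.
    parts.append(
        f"Natural language version: {problem}\n"
        "Translate the natural language version to an Isabelle version:\n"
    )
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical target problem, chosen only to show the prompt layout.
    print(build_prompt(DEMONSTRATIONS, "Prove that the sum of two even integers is even."))
```

The text the model generates after the final line would then be handed to Isabelle, which can at least confirm that the statement is syntactically valid; whether it faithfully captures the original problem still has to be judged separately.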

Automated formalization that outperforms human formalization

To test the effectiveness of this process, the researchers then submitted to Codex a set of problems that had already been formalized by humans, thus obtaining a second set of formalized problems. They then turned to MiniF2F, a benchmark for automated theorem proving, to compare the two versions. Result: Codex's formalizations fared much better than the human ones. “Our methodology leads to a new state-of-the-art result on the MiniF2F theorem proving benchmark, improving the proof rate from 29.6% to 35.2%,” the researchers specify.

The team also explored the reverse translation, informalization, that is, translating Isabelle code into natural language. Of the 38 examples tested, 36 were translated into a “reasonably coherent” statement, and 29 (76% of the 38) were more or less correct. Conclusion: the success rate of informalization is significantly higher than that of formalization.
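The percentages quoted for the two directions can be checked with a few lines of Python. This is only a sketch of the arithmetic using the counts given in the article; the helper name `percent` and the constant names are illustrative.

```python
def percent(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    return 100 * part / whole


# Formalization (natural language -> Isabelle), figures from the article.
N_PROBLEMS = 150
FORMALIZATION_RATE = 25.3  # percent of problems translated perfectly
perfect_translations = round(N_PROBLEMS * FORMALIZATION_RATE / 100)

# Informalization (Isabelle -> natural language), figures from the article.
N_EXAMPLES = 38
COHERENT = 36
CORRECT = 29

if __name__ == "__main__":
    print(f"Perfect formalizations: about {perfect_translations} of {N_PROBLEMS}")
    print(f"Coherent informalizations: {percent(COHERENT, N_EXAMPLES):.1f}%")
    print(f"Correct informalizations: {percent(CORRECT, N_EXAMPLES):.1f}%")
```

The 76% figure is therefore 29 out of all 38 examples, roughly three times the 25.3% success rate reported for formalization.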

The success rate of autoformalization may seem modest, but given the small amount of Isabelle code included in Codex's training data, the fact that the model can write syntactically correct code is already fascinating, the researchers point out. Moreover, there is virtually no aligned data pairing natural language with Isabelle code.

The team thinks that the success rate could quickly be improved, to the point of rivaling the greatest mathematicians. Autoformalization not only improves existing models, but could also be applied to many verification tasks (in both software and hardware design).

A major challenge remains, however: applying this model to mathematical research, most of which is written in LaTeX, a document preparation system with its own syntax and commands. Interpreting and translating this type of content could be much trickier for the neural network.

Source: Y. Wu et al., arXiv
