My collaborator, Jiacheng Chen, is currently applying for Ph.D. programs. Recruit him if you're looking for a strong student in AI for mathematical reasoning and theorem proving! Feel free to contact me to discuss his qualifications.
We're looking for a postdoc to work on AI for formal mathematical reasoning. If you're interested, apply as soon as possible via this link and email me. We will close the search very soon.
Research
I work on AI for Mathematics, specifically focusing on:
Large language models (LLMs) for formal theorem proving: [CoqGym], [LeanDojo]
This position paper advocates for formal mathematical reasoning, i.e., mathematical reasoning grounded in formal systems such as proof assistants. It is complementary to the informal approach (training LLMs on mathematical texts) and is arguably indispensable for advancing AI4Math to the next level.
We introduce a framework for running neural network inference directly in Lean. It enables programmers to build various LLM-based proof automation tools that integrate seamlessly into the workflow of Lean users, including tools for suggesting proof steps and completing intermediate proof goals using LLMs.
We curated SciInstruct, a diverse and high-quality dataset of college-level mathematics, physics, chemistry, and formal proofs. Using SciInstruct to finetune the ChatGLM family of LLMs, we introduce SciGLM, a suite of scientific language models for college-level mathematical/scientific reasoning.
We release LeanEuclid, a benchmark for testing autoformalization, consisting of Euclid's Elements (Book I) manually formalized in Lean. It is challenging for state-of-the-art LLMs like GPT-4V. Furthermore, the process of constructing LeanEuclid has uncovered intriguing ambiguities in Euclid's original works.
Can LLMs generate mathematical proofs that can be rigorously checked? We release LeanDojo: an open-source playground consisting of toolkits, benchmarks, and models for LLMs to prove formal theorems in the Lean proof assistant.
Data drives progress in computer vision. We introduce Infinigen: a generator of unlimited high-quality 3D data. 100% procedural, no external assets, no AI. Free and open source.
We propose MetaQNL, a symbolic "Quasi-Natural" language that can express both formal logic and natural language. Instead of manually constructing MetaQNL rules, we propose MetaInduce: an algorithm for learning rules from data.
We introduce NLProofS (Natural Language ProofSearch) for multi-step logical reasoning in natural language. Given a hypothesis and a set of supporting facts, it generates a proof tree indicating how to derive the hypothesis from supporting facts.
We annotate human faces in ImageNet and obfuscate them for privacy protection. We show that face obfuscation does not hurt image classification and transfer learning.
We propose Minimally Contrastive Data Collection: a novel crowdsourcing method for reducing dataset bias. And we use it to construct Rel3D—the first large-scale, human-annotated dataset for grounding spatial relations in 3D.
We reveal and mitigate fairness issues of ImageNet, filtering its concept vocabulary and balancing its representation of various demographic groups in images.
We introduce CoqGym, one of the first and largest datasets for theorem proving in proof assistants, and ASTactic, a deep learning prover generating tactics as programs.
We propose Adversarial Crowdsourcing to reduce dataset bias and use it to construct SpatialSense, a challenging dataset for recognizing spatial relations in images.