Kaiyu Yang

I am a Computing, Data, and Society Postdoctoral Fellow at Caltech in the Computing + Mathematical Sciences (CMS) Department. I received my Ph.D. from Princeton University, where I was advised by Jia Deng and also worked with Olga Russakovsky and Danqi Chen.

杨凯峪  /  kaiyuy [MASK] caltech [MASK] edu  /  CV  /  Bio  /  Google Scholar  /  GitHub

profile photo
News
Research

I work on AI for Mathematics, specifically focusing on:

  • Large language models (LLMs) for formal theorem proving: [CoqGym], [LeanDojo]
  • AI as copilots for mathematicians: [Lean Copilot]
  • LLMs for solving problems in mathematics and sciences: [SciGLM]
  • Synergies between reasoning with natural language or verified code generation: [NLProofS]
  • Neuro-symbolic machine learning: [MetaQNL]
SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning
Dan Zhang, Ziniu Hu, Sining Zhoubian, Zhengxiao Du, Kaiyu Yang, Zihan Wang, Yisong Yue, Yuxiao Dong, and Jie Tang
In submission, 2024
arXiv / code

We curated SciInstruct, a diverse and high-quality dataset of college-level mathematics, physics, chemistry, and formal proofs. Using SciInstruct to finetune the ChatGLM family of LLMs, we introduce SciGLM, a suite of scientific language models for college-level mathematical/scientific reasoning.

Towards Large Language Models as Copilots for Theorem Proving in Lean
Peiyang Song, Kaiyu Yang, and Anima Anandkumar
In submission, 2024
code / demo / talk

We introduce a framework for running neural network inference directly in Lean. It enables programmers to build various LLM-based proof automation tools that integrate seamlessly into the workflow of Lean users, including tools for suggesting proof steps and completing intermediate proof goals using LLMs.

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
Kaiyu Yang, Aidan Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, and Anima Anandkumar
Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2023, Oral presentation
arXiv / project / code / talk / slides / media

Can LLMs generate mathematical proofs that can be rigorously checked? We release LeanDojo: an open-source playground consisting of toolkits, benchmarks, and models for LLMs to prove formal theorems in the Lean proof assistant.

Infinite Photorealistic Worlds using Procedural Generation
Alexander Raistrick*, Lahav Lipson*, Zeyu Ma*, Lingjie Mei, Mingzhe Wang, Yiming Zuo, Karhan Kayan, Hongyu Wen, Beining Han, Yihan Wang, Alejandro Newell, Hei Law, Ankit Goyal, Kaiyu Yang, and Jia Deng
Conference on Computer Vision and Pattern Recognition (CVPR), 2023
arXiv / project / code

Data drives progress in computer vision. We introduce Infinigen: a generator of unlimited high-quality 3D data. 100% procedural, no external assets, no AI. Free and open source.

Learning Symbolic Rules for Reasoning in Quasi-Natural Language
Kaiyu Yang and Jia Deng
Transactions on Machine Learning Research (TMLR), 2023
arXiv / code

We propose MetaQNL, a symbolic "Quasi-Natural" language that can express both formal logic and natural language. Instead of manually constructing MetaQNL rules, we propose MetaInduce: an algorithm for learning rules from data.

Generating Natural Language Proofs with Verifier-Guided Search
Kaiyu Yang, Jia Deng, and Danqi Chen
Empirical Methods in Natural Language Processing (EMNLP), 2022, Oral presentation
arXiv / code / slides / poster

We introduce NLProofS (Natural Language Proof Search) for multi-step logical reasoning in natural language. Given a hypothesis and a set of supporting facts, it generates a proof tree indicating how to derive the hypothesis from supporting facts.

A Study of Face Obfuscation in ImageNet
Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng, and Olga Russakovsky
International Conference on Machine Learning (ICML), 2022
arXiv / code / slides / talk / poster / project / media

We annotate human faces in ImageNet and obfuscate them for privacy protection. We show that face obfuscation does not hurt image classification and transfer learning.

Strongly Incremental Constituency Parsing with Graph Neural Networks
Kaiyu Yang and Jia Deng
Neural Information Processing Systems (NeurIPS), 2020
arXiv / code / slides / talk / poster

We propose a novel transition-based constituency parser named Attach-Juxtapose, inspired by how humans perform parsing.

Rel3D: A Minimally Contrastive Benchmark for Grounding Spatial Relations in 3D
Ankit Goyal, Kaiyu Yang, Dawei Yang, and Jia Deng
Neural Information Processing Systems (NeurIPS), 2020, Spotlight presentation
arXiv / code

We propose Minimally Contrastive Data Collection: a novel crowdsourcing method for reducing dataset bias. And we use it to construct Rel3D—the first large-scale, human-annotated dataset for grounding spatial relations in 3D.

Filtering and Balancing the Distribution of the People Subtree in the ImageNet Hierarchy
Kaiyu Yang, Klint Qinami, Li Fei-Fei, Jia Deng, and Olga Russakovsky
Conference on Fairness, Accountability, and Transparency (FAT*), 2020
arXiv / slides / talk / blog / media

We reveal and mitigate fairness issues of ImageNet, filtering its concept vocabulary and balancing its representation of various demographic groups in images.

Learning to Prove Theorems via Interacting with Proof Assistants
Kaiyu Yang and Jia Deng
International Conference on Machine Learning (ICML), 2019
arXiv / code / slides / poster

We introduce CoqGym, one of the first and largest datasets for theorem proving in proof assistants, and ASTactic, a deep learning prover generating tactics as programs.

SpatialSense: An Adversarially Crowdsourced Benchmark for Spatial Relation Recognition
Kaiyu Yang, Olga Russakovsky, and Jia Deng
International Conference on Computer Vision (ICCV), 2019
arXiv / code / poster

We propose Adversarial Crowdsourcing to reduce dataset bias and use it to construct SpatialSense, a challenging dataset for recognizing spatial relations in images.

Stacked Hourglass Networks for Human Pose Estimation
Alejandro Newell, Kaiyu Yang, and Jia Deng
European Conference on Computer Vision (ECCV), 2016
arXiv / code

We introduce Stacked Hourglass Networks—one of the most popular architectures for human pose estimation, object detection, and more.

Workshops & Tutorials

I'm a co-organizer of the following events:

Media

My works are covered by:

Teaching

Website template credit: Jon Barron