Research
Currently, I work on LLM reasoning, alignment, and self-improvement.
Previously, I worked on unsupervised learning of perception, prediction, and planning in robotics.
Eventually, I aim to build general-purpose agents that are capable, reliable, and truth-seeking.
Selected Publications
Generative Verifiers: Reward Modeling as Next-Token Prediction
Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal
arXiv preprint, 2024
[Paper] [Website]
Reward models, too, benefit from next-token prediction and chain-of-thought reasoning.
Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion
Lunjun Zhang, Yuwen Xiong, Ze Yang, Sergio Casas, Rui Hu, Raquel Urtasun
International Conference on Learning Representations (ICLR), 2024
[Paper] [Proceedings] [Poster] [Website]
A foundation model for self-driving that explicitly reasons in both 3D space and time.
Towards Unsupervised Object Detection from LiDAR Point Clouds
Lunjun Zhang, Anqi Joyce Yang, Yuwen Xiong, Sergio Casas, Bin Yang, Mengye Ren, Raquel Urtasun
Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper] [Proceedings] [Poster] [Website]
Self-supervised, scalable object discovery in the wild.
World Model as a Graph: Learning Latent Landmarks for Planning
Lunjun Zhang, Ge Yang, Bradly Stadie
International Conference on Machine Learning (ICML), 2021 (Long Talk)
[Paper] [Proceedings] [Poster] [Website] [Code]
Unsupervised long-horizon planning via graph-structured world models.