My research interest lies in developing a mathematical understanding of LLMs, interpreting their diverse behaviors, and providing principled theoretical guidance for the design and optimization of System-2 reasoning agents. Specifically, my approach involves expanding reasoning boundaries, e.g., enabling effective long-context reasoning and precise multi-round backtracking, and consolidating reasoning skills into practical capabilities via reinforcement learning in the post-training stage. You can find my publications on Google Scholar.

I am currently looking for 2025 summer intern opportunities!

🔥 News

  • 2025.04:  🏆 Our recent work “When More is Less: Understanding Chain-of-Thought Length in LLMs” has been awarded the Best Paper Runner-up Award at ICLR 2025 Workshop on Reasoning and Planning for LLMs!
  • 2025.04:  🎤 I will present an oral talk on our recent work “When More is Less: Understanding Chain-of-Thought Length in LLMs” at ICLR 2025 Workshop on Reasoning and Planning for LLMs!
  • 2024.12:  🍁 I attended NeurIPS 2024 in Vancouver and presented our poster.
  • 2024.10:  🎉 Our paper “A Theoretical Understanding of Self-Correction through In-context Alignment” has been accepted to NeurIPS 2024!
  • 2024.06:  🏆 “A Theoretical Understanding of Self-Correction through In-context Alignment” received the Best Paper Award at ICML Workshop on In-context Learning!

📝 Publications

(*: Equal Contribution)

ICLR-W'25

When More is Less: Understanding Chain-of-Thought Length in LLMs

Yuyang Wu*, Yifei Wang*, Ziyu Ye, Tianqi Du, Stefanie Jegelka, Yisen Wang

  • Best Paper Runner-Up Award at ICLR 2025 Workshop on Reasoning and Planning for Large Language Models.
  • We revealed two counterintuitive findings: longer CoTs are not always better, and during reinforcement learning, models exhibit a simplicity bias, converging to the shortest CoT they can effectively manage.
NeurIPS 2024

A Theoretical Understanding of Self-Correction through In-context Alignment

Yifei Wang*, Yuyang Wu*, Zeming Wei, Stefanie Jegelka, Yisen Wang

  • Best Paper Award at ICML 2024 Workshop on In-context Learning.
  • I established the first rigorous understanding of LLMs’ self-correction ability and developed a simple and efficient self-correction algorithm (CaC) that shows significant improvements across different tasks.

🎖 Honors and Awards

  • 2025.04 Best Paper Runner-up Award at ICLR 2025 Workshop on Reasoning and Planning for LLMs
  • 2024.06 Best Paper Award at ICML 2024 Workshop on In-context Learning
  • 2021.12 Silver Medal, Chinese Mathematical Olympiad

🎤 Talks

  • 2025.04 “When More is Less: Understanding Chain-of-Thought Length in LLMs”, Oral presentation at ICLR 2025 Workshop on Reasoning and Planning for Large Language Models, Singapore

📖 Education

  • 2022.09 - Present, Peking University, BS in Computer Science

💻 Research Experience

  • 2023.10 - Present, Research Intern at ZERO Lab, Peking University
    • Researching in-context abilities of LLMs, including self-correction and chain-of-thought reasoning.
    • Collaborating with postdoc Yifei Wang (MIT) and advised by Prof. Yisen Wang (PKU).
  • 2025.03 - 2025.06 (terminated due to non-academic political reasons), Research Intern at Sky Computing Lab, UC Berkeley
    • Researching meta-reasoning abilities in LLMs.
    • Collaborating with Dacheng Li and advised by Prof. Ion Stoica.

💪 Skills

  • Programming Languages: Python (proficient), C++ (proficient), C#; Core Skills: Git, Linux, TeX, etc.
  • Deep Learning Technologies: PyTorch (proficient), CUDA parallel programming