Guangxuan Xiao   肖光烜
I am a fourth-year Ph.D. candidate at MIT EECS, advised by
Prof. Song Han.
My research focuses on efficient algorithms and systems for deep learning, particularly large
foundation models.
I graduated with honors from Tsinghua University in 2022 with a B.Eng. in Computer Science and a B.Econ. in Finance, and was a visiting student researcher at Stanford University from 2020 to 2021.
Email / Google Scholar / GitHub / X / LinkedIn
[Thumbnail: D_eff = W · |ln(ε)| / |ln(1-α)|]
Why Stacking Sliding Windows Can't See Very Far
August 25, 2025
A mathematical explanation of why sliding window attention's effective receptive field is O(W) rather than the theoretical O(LW), regardless of depth, due to information dilution and exponential decay from residual connections.
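To make the claim concrete, here is a back-of-the-envelope version of the argument that matches the thumbnail formula above (a hedged sketch; I am assuming ε is the threshold below which a contribution counts as negligible and α is the fraction of a relayed signal lost per window-sized hop, which may not be exactly the post's notation):

```latex
% Assumed reading of the thumbnail's symbols: a signal from distance d must cross
% at least d/W window-sized hops, and each hop keeps only a (1-\alpha) fraction of it,
% so its surviving weight is at most (1-\alpha)^{d/W}. It becomes negligible once
(1-\alpha)^{d/W} \le \varepsilon
\quad\Longleftrightarrow\quad
d \ \ge\ W \cdot \frac{|\ln \varepsilon|}{|\ln(1-\alpha)|} \;=\; D_{\mathrm{eff}},
% which grows only linearly in the window size W and does not grow with the depth L.
```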
Statistics behind Block Sparse Attention
August 22, 2025
A statistical model revealing how block sparse attention achieves efficiency and accuracy through
learned similarity gaps.
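As a rough illustration of the mechanism the post analyzes (a minimal NumPy sketch, not the post's statistical model or scoring rule; block_size and keep_blocks are made-up parameters), block sparse attention scores whole key blocks coarsely and keeps only the few blocks whose scores stand clearly above the rest, which is where a similarity gap matters:

```python
# Minimal NumPy sketch of block sparse attention (illustrative only; the post's
# statistical model and the actual scoring rule may differ).
import numpy as np

def block_sparse_attention(q, k, v, block_size=64, keep_blocks=4):
    """Attend only to the key blocks with the highest mean similarity per query.
    The full score matrix is computed here for clarity; a real kernel would
    estimate block scores cheaply and skip the unselected blocks entirely."""
    d = q.shape[-1]
    scores = (q @ k.T) / np.sqrt(d)                      # [n_q, n_k]
    n_k = k.shape[0]
    n_blocks = (n_k + block_size - 1) // block_size
    # Mean score of each key block per query: a large similarity gap means the
    # relevant blocks score well above the irrelevant ones, so coarse means suffice.
    block_scores = np.stack(
        [scores[:, b * block_size:(b + 1) * block_size].mean(axis=1)
         for b in range(n_blocks)], axis=1)              # [n_q, n_blocks]
    keep = np.argsort(-block_scores, axis=1)[:, :keep_blocks]
    mask = np.full_like(scores, -np.inf)
    for i, blocks in enumerate(keep):
        for b in blocks:
            mask[i, b * block_size:(b + 1) * block_size] = 0.0
    masked = scores + mask                               # unselected blocks get -inf
    weights = np.exp(masked - masked.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v                                   # [n_q, d_v]
```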
How Attention Sinks Keep Language Models Stable
August 7, 2025
We discovered that attention sinks—where models park unused attention on initial tokens—are crucial for language model stability. Without them, models catastrophically fail when processing long conversations, but with attention sinks, they maintain stable performance across millions of tokens.
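The practical recipe that follows from this observation (a minimal sketch under my own naming, not the actual StreamingLLM API) is a KV-cache policy that always retains the first few sink tokens and otherwise keeps only a recent window:

```python
# Minimal sketch of the sink-plus-sliding-window KV cache policy described above
# (parameter names are my own; see the paper/code for the real implementation).
def tokens_to_keep(cache_len, n_sink=4, window=1024):
    """Indices of cached tokens to retain: the first `n_sink` attention-sink
    tokens plus the `window` most recent tokens; everything in between is evicted."""
    if cache_len <= n_sink + window:
        return list(range(cache_len))
    return list(range(n_sink)) + list(range(cache_len - window, cache_len))
```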
XAttention: Block Sparse Attention with Antidiagonal Scoring
Ruyi Xu*, Guangxuan Xiao*, Haofeng Huang, Junxian Guo, Song Han
ICML 2025
[paper] [code]
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Guangxuan Xiao, Jiaming Tang, Jingwei Zuo, Junxian Guo, Shang Yang, Haotian Tang, Yao Fu, Song Han
ICLR 2025
[paper] [code] [demo]
Efficient Streaming Language Models with Attention Sinks
Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
ICLR 2024
[paper] [code] [MIT News] [NVIDIA TensorRT-LLM] [on iPhone]
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao*, Ji Lin*, Mickael Seznec, Hao Wu, Julien Demouth, Song Han
ICML 2023
[paper] [code] [NVIDIA TensorRT-LLM]
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
Guangxuan Xiao*, Tianwei Yin*, William T. Freeman, Frédo Durand, Song Han
IJCV 2024
[website] [paper] [code]
Massachusetts Institute of Technology
2022.08 - Present
S.M. in Computer Science
Ph.D. Candidate in Computer Science
Advisor: Prof. Song Han
Tsinghua University
2018.08 - 2022.07
B.Eng. in Computer Science
B.Econ. in Economics (Second Major)
Advisor: Prof. Zhiyuan Liu.
Stanford University
2020.07 - 2021.06
Visiting Research Student
Advisor: Prof. Jure Leskovec
Mentor: Jiaxuan You
Stanford University
2021.06 - 2021.11
Visiting Research Student through the UGVR program
Advisor: Prof. Jiajun Wu, Prof. Leslie Pack Kaelbling
Mentor: Jiayuan Mao
NVIDIA
2024 - 2025
Research Intern
Santa Clara, CA
with Song Han
Researching efficient large language models.
Meta Inc.
2023
Research Scientist Intern
Menlo Park, CA
with Mike Lewis
Developed efficient streaming language models.
Honors & Awards
- Hewlett Packard Fellowship, 2022
- Boeing Scholarship, 2021
- Tsinghua "Future Scholar" Scientific Research Grant ($30,000), 2021
- National Scholarship, 2020
- Contemporary Undergraduate Mathematical Contest in Modeling, 1st Prize, 2020
- Beijing "Challenge Cup" Academic Science and Technology Competition, 1st Prize, 2020
- Tsinghua Comprehensive Excellence Scholarship, 2019
Miscellaneous
I love to play soccer and was the captain and striker of my department's soccer team.
I also enjoy table tennis, Go (Weiqi), and the piano; Beethoven's works are my favorite.