Guangxuan Xiao (肖光烜)
I am a Member of Technical Staff at Thinking Machines Lab, working on pre-training.
I received my Ph.D. from MIT EECS in 2025, advised by Prof. Song Han.
My research focuses on efficient algorithms and systems for deep learning, particularly large
foundation models.
I graduated with honors from Tsinghua University in 2022, with a B.Eng. in Computer Science and a B.Econ. in Finance, and was a visiting student researcher at Stanford University from 2020 to 2021.
Email / Google Scholar / GitHub / X / LinkedIn
Blog Posts
The Memory Capacity of Attention
September 1, 2025
How much information can attention mechanisms store? Using relative error analysis, we show that linear attention's capacity scales linearly with the head dimension, while softmax attention's capacity scales exponentially.
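
As a quick illustration (a toy sketch of my own, not code from the post; the head dimension, pair counts, and softmax sharpness are illustrative), one can store n random key-value pairs and read one back, comparing a linear-attention state against a sharp softmax lookup. The linear read-out degrades once n grows past the head dimension, while the softmax one stays near-exact far longer:

    import numpy as np

    # Toy retrieval experiment (setup and constants are illustrative).
    rng = np.random.default_rng(0)
    d = 64                                             # head dimension
    for n in (16, 64, 256, 1024):                      # number of stored pairs
        K = rng.standard_normal((n, d)) / np.sqrt(d)   # keys, roughly unit norm
        V = rng.standard_normal((n, d))                # values
        S = K.T @ V                  # linear attention: sum of outer products k_i v_i^T
        lin = K[0] @ S               # linear read-out with key 0
        logits = 16.0 * (K @ K[0])   # softmax read-out with sharpness 16
        w = np.exp(logits - logits.max())
        w /= w.sum()
        soft = w @ V
        rel = lambda v: np.linalg.norm(v - V[0]) / np.linalg.norm(V[0])
        print(f"n={n:5d}  linear err={rel(lin):.2f}  softmax err={rel(soft):.2f}")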
D_eff = W · |ln ε| / |ln(1 − α)|
Why Stacking Sliding Windows Can't See Very Far
August 25, 2025
A mathematical explanation of why sliding window attention's effective receptive field is O(W)
rather than the theoretical O(LW), regardless of depth, due to information dilution and exponential
decay from residual connections.
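
One way to read the formula shown with this post (my gloss: α as the per-layer fraction of fresh information mixed in through the residual stream, ε as the threshold below which a contribution is lost in the noise): a token's influence decays like (1 − α)^k after k window-hops, so it stays detectable only while (1 − α)^k ≥ ε, i.e. for k ≤ |ln ε| / |ln(1 − α)| hops of W tokens each. Plugging in ε = 0.01 and α = 0.5:

    D_eff = W · |ln 0.01| / |ln 0.5| ≈ W · 4.61 / 0.69 ≈ 6.6 W,

a constant multiple of the window size, no matter how many layers you stack.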
Statistics behind Block Sparse Attention
August 22, 2025
A statistical model revealing how block sparse attention achieves efficiency and accuracy through
learned similarity gaps.
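
To make the mechanism concrete, here is a minimal sketch (my own construction; the block size, the top-k rule, and all names are illustrative, not the post's model): score whole blocks cheaply by their mean vectors, keep only the best-matching key blocks per query block, and run dense attention inside that subset. The shortcut is accurate precisely when there is a wide similarity gap between the kept and the skipped blocks:

    import numpy as np

    def block_sparse_attention(Q, K, V, block=16, keep=4):
        # Toy sketch: each query block attends only to the `keep` key blocks
        # whose mean key is most similar to its mean query.
        T, d = Q.shape
        nb = T // block                          # assumes T % block == 0
        Qb, Kb, Vb = (X.reshape(nb, block, d) for X in (Q, K, V))
        S = Qb.mean(1) @ Kb.mean(1).T            # cheap (nb x nb) block-level scores
        out = np.empty((nb, block, d))
        for i in range(nb):
            top = np.argsort(S[i])[-keep:]       # kept key blocks for query block i
            Ks, Vs = Kb[top].reshape(-1, d), Vb[top].reshape(-1, d)
            logits = Qb[i] @ Ks.T / np.sqrt(d)   # dense attention inside the subset
            w = np.exp(logits - logits.max(-1, keepdims=True))
            out[i] = (w / w.sum(-1, keepdims=True)) @ Vs
        return out.reshape(T, d)

    Q, K, V = np.random.default_rng(1).standard_normal((3, 128, 64))
    O = block_sparse_attention(Q, K, V)          # each query block sees 4 of 8 key blocks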
softmax([sink, a₁, ..., aₜ])
How Attention Sinks Keep Language Models Stable
August 7, 2025
We discovered that attention sinks, where models park unused attention on initial tokens, are crucial for language model stability. Without them, models fail catastrophically when processing long conversations; with them, performance stays stable across millions of tokens.
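
The fix the post describes reduces to a very small cache policy. A sketch (the function name and the 4 + 1020 split are illustrative defaults, in the spirit of the StreamingLLM paper below): never evict the first few tokens, so the softmax always has its sink to park attention on, and keep a rolling window for everything else.

    def streaming_kv_indices(t, n_sink=4, window=1020):
        # Positions kept in the KV cache after seeing t tokens: the first
        # n_sink tokens (the attention sinks) are never evicted; the rest
        # of the budget is a rolling window of the most recent tokens.
        sinks = range(min(n_sink, t))
        recent = range(max(n_sink, t - window), t)
        return list(sinks) + list(recent)

    # After a million tokens the cache still holds only 4 + 1020 positions:
    assert len(streaming_kv_indices(1_000_000)) == 1024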
Publications
(* denotes equal contribution)
Optimizing Mixture of Block Attention
Guangxuan Xiao*,
Junxian Guo*,
Kasra Mazaheri,
Song Han
arXiv 2025
[paper]
[code]
StreamingVLM: Real-Time Understanding for Infinite Video Streams
Ruyi Xu*,
Guangxuan Xiao*,
Yukang Chen,
Liuning He,
Kelly Peng,
Yao Lu,
Song Han
arXiv 2025
[paper]
[code]
XAttention: Block Sparse Attention with Antidiagonal Scoring
Ruyi Xu*,
Guangxuan Xiao*,
Haofeng Huang,
Junxian Guo,
Song Han
ICML 2025
[paper]
[code]
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Guangxuan Xiao,
Jiaming Tang,
Jingwei Zuo,
Junxian Guo,
Shang Yang,
Haotian Tang,
Yao Fu,
Song Han
ICLR 2025
[paper]
[code]
[demo]
Efficient Streaming Language Models with Attention Sinks
Guangxuan Xiao,
Yuandong Tian,
Beidi Chen,
Song Han,
Mike Lewis
ICLR 2024
[paper]
[code]
[MIT News]
[NVIDIA TensorRT-LLM]
[on iPhone]
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao*,
Ji Lin*,
Mickael Seznec,
Hao Wu,
Julien Demouth,
Song Han
ICML 2023
[paper]
[code]
[NVIDIA TensorRT-LLM]
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
Guangxuan Xiao*,
Tianwei Yin*,
William T. Freeman,
Frédo Durand,
Song Han
IJCV 2024
[website]
[paper]
[code]
Education
Massachusetts Institute of Technology
2022.08 - 2025.12
Ph.D. in Computer Science
S.M. in Computer Science
Thesis: Efficient Algorithms and Systems for Large Language Models
Advisor: Prof. Song Han
Tsinghua University
2018.08 - 2022.07
B.Eng. in Computer Science
B.Econ. in Finance (Second Major)
Advisor: Prof. Zhiyuan Liu
Stanford University
2020.07 - 2021.06
Visiting Research Student
Advisor: Prof. Jure Leskovec
Mentor: Jiaxuan You
Stanford University
2021.06 - 2021.11
Visiting Research Student through the UGVR program
Advisor: Prof. Jiajun Wu, Prof. Leslie Pack Kaelbling
Mentor: Jiayuan Mao
Experience
NVIDIA
2024 - 2025
Research Intern
Santa Clara, CA
with Song Han
Researched efficient large language models.
Meta
2023
Research Scientist Intern
Menlo Park, CA
with Mike Lewis
Developed efficient streaming language models.
Honors & Awards
- Hewlett Packard Fellowship, 2022
- Boeing Scholarship, 2021
- Tsinghua "Future Scholar" Scientific Research Grant ($30,000), 2021
- National Scholarship, 2020
- Contemporary Undergraduate Mathematical Contest in Modeling, 1st Prize, 2020
- Beijing "Challenge Cup" Academic Science and Technology Competition, 1st Prize, 2020
- Tsinghua Comprehensive Excellence Scholarship, 2019
Miscellaneous
I love to play soccer. I was the captain and striker of my department's soccer team.
I also love to play table tennis, Go (Weiqi), and piano. Beethoven's works are my favorite.