Artificial Intelligence
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
Extending the context length of Language Models (LMs) by improving Rotary Position Embedding (RoPE) has become a trend. While existing …
Ermo Hua, Che Jiang, Xingtai Lv, Kaiyan Zhang, Ning Ding, Youbang Sun, Biqing Qi, Yuchen Fan, Xuekai Zhu, Bowen Zhou
Free Process Rewards without Process Labels
Different from its counterpart outcome reward models (ORMs), which evaluate the entire responses, a process reward model (PRM) scores a …
Lifan Yuan, Wendi Li, Huayu Chen, Ganqu Cui, Ning Ding, Kaiyan Zhang, Bowen Zhou, Zhiyuan Liu, Hao Peng
How to Synthesize Text Data without Model Collapse?
Model collapse in synthetic data indicates that iterative training on self-generated data leads to a gradual decline in performance. …
Xuekai Zhu, Daixuan Cheng, Hengli Li, Kaiyan Zhang, Ermo Hua, Xingtai Lv, Ning Ding, Zhouhan Lin, Bowen Zhou
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
We introduce MedXpertQA, a highly challenging and comprehensive benchmark to evaluate expert-level medical knowledge and advanced …
Yuxin Zuo, Shang Qu, Linhai Xie, Yifei Li, Zhangren Chen, Xuekai Zhu, Ermo Hua, Kaiyan Zhang, Ning Ding, Bowen Zhou
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
Recent advancements in Large Language Models (LLMs) have shown that it is promising to utilize Process Reward Models (PRMs) as …
Jian Zhao, Runze Liu, Kaiyan Zhang, Zhimu Zhou, Junqi Gao, Dong Li, Jiafei Lyu, Zhouyi Qian, Biqing Qi, Xiu Li, Bowen Zhou
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Recent Large Reasoning Models (LRMs), such as DeepSeek-R1 and OpenAI o1, have demonstrated strong performance gains by scaling up the …
Xiaoye Qu, Yafu Li, Zhaochen Su, Weigao Sun, Jianhao Yan, Dongrui Liu, Ganqu Cui, Daizong Liu, Shuxian Liang, Junxian He, Peng Li, Wei Wei, Jing Shao, Chaochao Lu, Yue Zhang, Xian-Sheng Hua, Bowen Zhou, Yu Cheng
Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models
State Space Models (SSMs) have emerged as a promising alternative to the popular transformer-based models and have been increasingly …
Xingtai Lv, Youbang Sun, Kaiyan Zhang, Shang Qu, Xuekai Zhu, Yuchen Fan, Yi Wu, Ermo Hua, Xinwei Long, Ning Ding, Bowen Zhou
Less is More: Efficient Model Merging with Binary Task Switch
As an effective approach to equip models with multi-task capabilities without additional training, model merging has garnered …
Biqing Qi, Fangyuan Li, Zhen Wang, Junqi Gao, Dong Li, Peng Ye, Bowen Zhou
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
Test-Time Scaling (TTS) is an important method for improving the performance of Large Language Models (LLMs) by using additional …
Runze Liu, Junqi Gao, Jian Zhao, Kaiyan Zhang, Xiu Li, Biqing Qi, Wanli Ouyang, Bowen Zhou
Process Reinforcement through Implicit Rewards
Dense process rewards have proven a more effective alternative to the sparse outcome-level rewards in the inference-time scaling of …
Ganqu Cui, Lifan Yuan, Zefan Wang, Hanbin Wang, Wendi Li, Bingxiang He, Yuchen Fan, Tianyu Yu, Qixin Xu, Weize Chen, Jiarui Yuan, Huayu Chen, Kaiyan Zhang, Xingtai Lv, Shuo Wang, Yuan Yao, Xu Han, Hao Peng, Yu Cheng, Zhiyuan Liu, Maosong Sun, Bowen Zhou, Ning Ding