Maosong Sun
Latest
Process reinforcement through implicit rewards
Advancing LLM Reasoning Generalists with Preference Trees
Empowering private tutoring by chaining large language models
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations
Sparse Low-rank Adaptation of Pre-trained Language Models