I’m a CS PhD student @University of Southern California, advised by Prof. Jieyu Zhao. Before that, I was an M.Eng. student at the Graduate School of Creative Science and Engineering @Waseda University (早稲田大学), Tokyo, supervised by Prof. Masayuki Goto (Japanese only). I have also worked as a research assistant @University of Maryland, advised by Prof. Tianyi Zhou, and @University of Tokyo (東京大学), advised by Prof. Irene Li. I also work closely with Jieyu Zhang, who focuses on interactive and data-centric AI/ML.
Research Interests: My research interests lie in natural language processing and synthetic data. Specifically, I’m trying to answer the following questions:
- How can we comprehensively evaluate LLMs/VLMs across different domains?
- How can we extend the abilities of LLMs/VLMs at minimal cost?
- How can we let LLMs/VLMs collaborate safely, efficiently, and effectively to solve real-world problems?
📢 News
[04/08/2025] A new preprint has been released. Check out Efficient Reinforcement Finetuning via Adaptive Curriculum Learning for more details!
[03/31/2025] A new preprint has been released. Check out Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base for more details!
[04/10/2024] I will join CS@USC as a PhD student this fall!
📝 Selected Publications
(* denotes equal contribution)
Improving Language Models
- Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
Taiwei Shi, Yiyang Wu, Linxin Song, Tianyi Zhou, Jieyu Zhao
- ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models
Jieyu Zhang, Le Xue, Linxin Song, Jun Wang, Weikai Huang, Manli Shu, An Yan, Zixian Ma, Juan Carlos Niebles, Silvio Savarese, Caiming Xiong, Zeyuan Chen, Ranjay Krishna, Ran Xu
Posted by: VentureBeat | MarkTechPost
- Investigating the Scaling Effect of Instruction Templates for Training Multimodal Language Model
Shijian Wang*, Linxin Song*, Jieyu Zhang, Ryotaro Shimizu, Ao Luo, Li Yao, Cunjian Chen, Julian McAuley, Haiqian Wu
Language Model Evaluation
- Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base
Linxin Song, Xuwei Ding, Jieyu Zhang, Taiwei Shi, Ryotaro Shimizu, Rahul Gupta, Yang Liu, Jian Kang, Jieyu Zhao
Webpage
- NLPBench: Evaluating Large Language Models on Solving NLP Problems
Linxin Song, Jieyu Zhang, Lechao Cheng, Pengyuan Zhou, Tianyi Zhou, Irene Li
ITIF @ NeurIPS 2023.
- Explaining Length Bias in LLM-Based Preference Evaluations
Zhengyu Hu, Linxin Song, Jieyu Zhang, Zheyuan Xiao, Jingang Wang, Zhenyu Chen, Hui Xiong
Language Model Agent
- Adaptive In-conversation Team Building for Language Model Agents
Linxin Song*, Jiale Liu*, Jieyu Zhang, Shaokun Zhang, Ao Luo, Shijian Wang, Qingyun Wu, Chi Wang
- Offline Training of Language Model Agents with Functions as Learnable Weights
Shaokun Zhang*, Jieyu Zhang*, Jiale Liu, Linxin Song, Chi Wang, Ranjay Krishna, Qingyun Wu
ICML 2024.
Before PhD
- Better Explain Transformers by Illuminating Important Information
Linxin Song, Yan Cui, Ao Luo, Freddy Lecue, Irene Li
EACL 2024 (Findings).
- Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision
Jieyu Zhang*, Linxin Song*, Alexander Ratner
AISTATS 2023.
- Adaptive Ranking-based Sample Selection for Weakly Supervised Class-imbalanced Text Classification
Linxin Song, Jieyu Zhang, Tianxiang Yang, Masayuki Goto
EMNLP 2022 (Findings).
🧑🏫 Teaching
- (TA) DSCI-250: Introduction to Data Science, 2024 Fall
- (TA) DSCI-566: Deep Learning and its Applications, 2025 Spring
👨💻 Internships (Before PhD)
- University of Tokyo - Research Assistant
2023.02-2024.02
Advised by Irene Li
- University of Maryland, College Park - Research Assistant
2022.07-2023.10
Advised by Tianyi Zhou
- University of Washington - Research Intern
2022.03-2022.11
Advised by Alex Ratner and Jieyu Zhang
🏅 Professional Services
- Maintainer of AG2 (AutoGen).
- Reviewer: WACV 2023, ECML-PKDD 2023, NeurIPS 2023, DMLR 2023, ICLR 2024, AISTATS 2024, ACL 2024 (ARR Feb), NeurIPS 2024, EMNLP 2024 (ARR June), ICLR 2025, KDD 2025