I am currently a Ph.D. student at Beijing Institute of Technology (BIT), where I am advised by Prof. Dawei Song. I collaborate closely with Prof. Benyou Wang from CUHK-SZ on efficient language models and Prof. Qiuchi Li from UCPH on structural bias. I previously worked closely with Dr. Jingang Wang from Meituan NLP and Dr. Qifan Wang from Meta AI on large language models. I also enjoyed building structure-grounded language models with Binyuan Hui from Alibaba.
My current research interests lie in the general area of natural language processing, particularly efficient language models and language agents. Before that, I worked on opinion mining and model generalization.
I am on the job market in 2024-2025 and actively seeking both academic and industrial positions. Drop me an email at chenzhang9702[AT]outlook[DOT]com if you are interested in collaborating with me.
Sep. 5th, 2024. LongLLaVA is released as the first large multi-modal model that can process over 1,000 images on a single Nvidia A100.
July 25th, Aug. 8th, 2024. Invited to give talks on long-context efficiency at Li Auto and on the democratization of LLMs at ByteDance Research, respectively.
July 9th, 2024. The MiniMA family is now complete with the release of the MoE model MiniMix and the long-context model MiniLoong.
June 28th, 2024. One paper got accepted to TOIS.
Feb. 19th, 2024. Two long papers got accepted to COLING 2024.
Jan. 18th, 2024. One long paper got accepted to EACL 2024.
Dec. 27th, 2023. MiniMA and MiniChat are upgraded to MiniMA-2 and MiniChat-2 respectively. MiniMA-2, together with MiniMA and other models, completes the compute-performance Pareto frontier, and MiniChat-2 surpasses Vicuna-7B on MT-Bench.
LongLLaVA
A hybrid-architecture large multi-modal model, the first that can process over 1,000 images on a single Nvidia A100.
[huggingface]
MiniMA, MiniMA-2, MiniMix, MiniLoong
A distilled language model family that establishes a new compute-performance Pareto frontier among existing language models.
[github][huggingface][rank]
MiniChat, MiniChat-2
An instruction-following language model that achieves competitive performance at a small scale.
[github][huggingface][rank]
Phoenix
An instruction-following language model that is competitive with ChatGLM-6B.
[github][huggingface][rank][news]
WenJin
A large language model that reaches top-level performance on the CLUE benchmark.
[rank]
# indicates equal contribution.
MoDification: Mixture of Depths Made Easy.
Chen Zhang, Meizhi Zhong, Qimeng Wang, Xuantao Lu, Zheyu Ye, Chengqiang Lu, Yan Gao, Yao Hu, Kehai Chen, Min Zhang, and Dawei Song.
Preprint. [arXiv]
LongLLaVA: Scaling Multi-modal LLMs to 1,000 Images Efficiently via Hybrid Architecture.
Xidong Wang, Dingjie Song, Shunian Chen, Chen Zhang, and Benyou Wang.
Preprint. [arXiv][code]
Beyond the Speculative Game: A Survey of Speculative Execution in Large Language Models.
Chen Zhang, Zhuorui Liu, Hanqing Zhang, and Dawei Song.
Preprint. [arXiv]
Towards the Law of Capacity Gap in Distilling Language Models.
Chen Zhang, Dawei Song, Zheyu Ye, and Yan Gao.
Preprint. [arXiv][code]
How Speculative Can Speculative Decoding Be?
Zhuorui Liu, Chen Zhang, and Dawei Song.
In COLING 2024. [paper][poster][code]
Task-agnostic Distillation of Encoder-Decoder Language Models.
Chen Zhang, Yang Yang, Qiuchi Li, Jingang Wang, and Dawei Song.
In COLING 2024. [arXiv][poster][code]
Lifting the Curse of Capacity Gap in Distilling Language Models.
Chen Zhang, Yang Yang, Jiahao Liu, Jingang Wang, Yunsen Xian, Benyou Wang, and Dawei Song.
In ACL 2023. [arXiv][slides][code]
On Elastic Language Models.
Chen Zhang, Benyou Wang, and Dawei Song.
In TOIS. [arXiv]
Minimal Distillation Schedule for Extreme Language Model Compression.
Chen Zhang, Yang Yang, Qifan Wang, Jiahao Liu, Jingang Wang, Wei Wu, and Dawei Song.
In EACL 2024 Findings. [arXiv][poster][code]
Sparse Teachers Can Be Dense with Knowledge.
Yi Yang#, Chen Zhang#, and Dawei Song.
In EMNLP 2022. [arXiv][poster][code]
Phoenix: Democratizing ChatGPT across Languages.
Zhihong Chen, Feng Jiang, Junying Chen, Tiannan Wang, Fei Yu, Guiming Chen, Hongbo Zhang, Juhao Liang, Chen Zhang, Zhiyi Zhang, Jianquan Li, Xiang Wan, Benyou Wang, and Haizhou Li.
Preprint. [arXiv][code]
PyABSA: A Modularized Framework for Reproducible Aspect-based Sentiment Analysis.
Heng Yang, Chen Zhang, and Ke Li.
In CIKM 2023 Demo. [arXiv][code]
Structural Bias For Aspect Sentiment Triplet Extraction.
Chen Zhang, Lei Ren, Fang Ma, Jingang Wang, Wei Wu, and Dawei Song.
In COLING 2022. [arXiv][slides][code][data][blog]
Aspect-specific Context Modeling for Aspect-based Sentiment Analysis.
Fang Ma, Chen Zhang, Bo Zhang, and Dawei Song.
In NLPCC 2022. [arXiv][slides][data]
Exploiting Position Bias for Robust Aspect Sentiment Classification.
Fang Ma#, Chen Zhang#, and Dawei Song.
In ACL 2021 Findings. [arXiv][slides][code]
End-to-end Emotion-Cause Pair Extraction via Learning to Link.
Haolin Song, Chen Zhang, Qiuchi Li, and Dawei Song.
Preprint. [arXiv][code]
A Multi-task Learning Framework for Opinion Triplet Extraction.
Chen Zhang, Qiuchi Li, Dawei Song, and Benyou Wang.
In EMNLP 2020 Findings. [arXiv][paper][code]
Aspect-based Sentiment Classification with Aspect-specific Graph Convolutional Networks.
Chen Zhang, Qiuchi Li, and Dawei Song.
In EMNLP 2019. [arXiv][slides][code]
Syntax-Aware Aspect-Level Sentiment Classification with Proximity-Weighted Convolution Network.
Chen Zhang, Qiuchi Li, and Dawei Song.
In SIGIR 2019. [arXiv][poster][code]
Modular Retrieval for Generalization and Interpretation.
Juhao Liang, Chen Zhang, Zhengyang Tang, Jie Fu, Dawei Song, and Benyou Wang.
Preprint. [arXiv][code]
XPrompt: Exploring the Extreme of Prompt Tuning.
Fang Ma, Chen Zhang, Lei Ren, Jingang Wang, Qifan Wang, Wei Wu, Xiaojun Quan, and Dawei Song.
In EMNLP 2022. [arXiv][poster]
Making Pretrained Language Models Good Long-tailed Learners.
Chen Zhang, Lei Ren, Jingang Wang, Wei Wu, and Dawei Song.
In EMNLP 2022. [arXiv][poster][code]
Doge Tickets: Uncovering Domain-general Language Models by Playing Lottery Tickets.
Yi Yang#, Chen Zhang#, Benyou Wang, and Dawei Song.
In NLPCC 2022, Best Paper Award. [arXiv][slides][code]
Adaptable Text Matching via Meta-Weight Regulator.
Bo Zhang, Chen Zhang, Fang Ma, and Dawei Song.
In SIGIR 2022. [arXiv][paper][slides]
A Simple Baseline for Cross-domain Few-shot Text Classification.
Chen Zhang and Dawei Song.
In NLPCC 2021. [paper][slides][code]
On Long-context Efficiency at Li Auto. 2024/7/25. [slides]
Democratization of LLMs at ByteDance Research. 2024/8/8. [slides]
Organizer: WSDM Cup 2024.
Reviewer: ARR, ACL, EMNLP, NAACL, SIGIR, CIKM, AAAI.
Secondary Reviewer: WSDM, ICTIR, TOIS.
Volunteer: EMNLP.
Best Paper Award at NLPCC. 2022.
Elite Ph.D. Student at BIT. 2021.
XIAOMI Scholarship. 2021.
Excellent Undergraduate & Graduation Thesis at BIT. 2019.
SIGIR Student Travel Grant. 2019.
Excellent Prize in the International Collegiate Competition for Brain-inspired Computing. 2018.
Several Medals from Chinese (CCGC) and International (TAAI, ICGA) Computer Games Competitions. 2017, 2018, 2019.
Third Prize at China University Robot Competition (ROBOCON), as a member of the Robot Team DreamChaser at BIT. 2018.
First Prize at China Undergraduate Mathematical Contest in Modeling, Beijing Division. 2016.