Zheng Wang
I am a first-year PhD candidate in Computer Science at the University of Illinois Urbana-Champaign, and my advisor is Prof. Minjia Zhang. My current research interests focus on the following two directions:
(1) Leveraging sparsity, improving information processing efficiency, and co-designing systems and algorithms to accelerate inference and training for LLMs and VLMs.
(2) Utilizing empirical and theoretical insights to deeply understand LLMs and VLMs and optimize their performance.
Prior to joining UIUC, I was very fortunate to work as a Research Assistant in the EIC Lab at the School of Computer Science, Georgia Tech, advised by Prof. Yingyan (Celine) Lin.
Outside of my academic life, I like to stay healthy by working out regularly. I'm also really into playing 🎾 tennis 🎾, a fun, challenging sport that keeps me both physically and mentally sharp.
Email / Google Scholar

LAMB: A Training-Free Method to Enhance the Long-Context Understanding of SSMs via Attention-Guided Token Filtering
Zhifan Ye, Zheng Wang, Kejing Xia, Jihoon Hong, Leshu Li, Lexington A. Whalen, Cheng Wan, Haoran You, Celine Lin, Souvik Kundu
63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
PDF

Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks
Zheng Wang, Boxiao Jin, Zhongzhi Yu, Minjia Zhang
preprint
PDF

Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
Zhongzhi Yu*, Zheng Wang*, Yonggan Fu, Huihong Shi, Khalid Shaikh, Yingyan (Celine) Lin
2024 International Conference on Machine Learning, ICML 2024
PDF | Code

When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Haoran You, Yichao Fu, Zheng Wang, Amir Yazdanbakhsh, Yingyan (Celine) Lin
2024 International Conference on Machine Learning, ICML 2024
PDF | Code

EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning & Voting
Zhongzhi Yu, Zheng Wang, Yuhan Li, Haoran You, Ruijie Gao, Xiaoya Zhou, Sreenidhi Reddy Bommu, Yang Katie Zhao, Yingyan Celine Lin
61st ACM/IEEE Design Automation Conference, DAC 2024
PDF | Code

XRouting: Explainable Vehicle Rerouting for Urban Road Congestion Avoidance using Deep Reinforcement Learning
Zheng Wang, Shen Wang
2022 IEEE Smart City Conference, ISC2 2022
PDF | Code

Teaching
- Teaching Assistant, CSE 8803 Machine Learning for Neural/Behavior Data, Georgia Tech, Fall 2024, instructor: Prof. Anqi Wu.
- Teaching Assistant, CSE 6740 Computational Data Analysis (Machine Learning), Georgia Tech, Spring 2025, instructor: Prof. Anqi Wu.

Services
- Conference Reviewer: ICML 2025, NeurIPS 2025

Selected Awards
- [Jun. 2023] Excellent Graduates of Beijing
- [Nov. 2022] Presidential Fellowship for the 2021-2022 Academic Year
- [Nov. 2022] Xiaomi Special Scholarship for the 2021-2022 Academic Year