Mingyu Cao 曹明宇
Ph.D. Student
University of Surrey

Hi, I am Mingyu Cao (曹明宇), a first-year PhD student in the School of Computer Science and Electronic Engineering at the University of Surrey, advised by Dr. Lu Yin. My research focuses on Diffusion Language Models and Efficient LLMs.

Prior to my PhD, I worked as an NLP Algorithm Engineer at several leading Chinese internet companies, including NetEase, ByteDance, and Shopee, where I specialised in machine translation. Earlier, during my master's studies, my research focused on biomedical information extraction and knowledge-graph-based question answering, advised by Dr. Ling Luo.

Outside of research, I enjoy movies, social media, and travelling. I am a strong advocate for work-life balance. I am always happy to chat about research, life, travelling, or anything interesting.


Education
  • University of Surrey
    School of Computer Science and Electronic Engineering
    Ph.D. Student
    Jan. 2026 - present
  • Dalian University of Technology
    M.S. in Computer Science
    Sep. 2017 - Jun. 2020
  • Dalian University of Technology
    B.S. in Computer Science
    Sep. 2013 - Jun. 2017
News
2026
Our Paper "SOAR" Has Been Accepted to ICML 2026!🌈
Apr 30
Our Paper "Condense, Don't Just Prune" Has Been Accepted to TMLR. This is my first first-author publication! 🎉
Mar 20
I Started My PhD at the University of Surrey
Feb 05
Farewell, Shopee: Thank You for Everything. I said goodbye to Shopee to begin my PhD journey. Shopee has been the best workplace I have ever experienced; every colleague was kind, supportive, and genuinely fun to work with. I will deeply miss those days. On to the next adventure! 🚀
Feb 03
Selected Publications
Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models

Mingyu Cao, Alvaro H.C. Correia, Christos Louizos, Shiwei Liu#, Lu Yin# (# corresponding author)

ICML 2026 Regular

SOAR is a training-free decoding algorithm for Diffusion Language Models that adaptively switches between wider search and parallel decoding based on model confidence, improving reasoning and code generation quality without sacrificing inference speed.

Condense, Don't Just Prune: Enhancing Efficiency and Performance in MoE Layer Pruning

Mingyu Cao, Gen Li, Jie Ji, Jiaqi Zhang, Ajay Jaiswal, Li Shen, Xiaolong Ma, Shiwei Liu, Lu Yin# (# corresponding author)

TMLR 2026

A pruning method for Mixture-of-Experts models that merges multiple experts per layer into a reduced set, preserving model quality while reducing memory usage and improving inference efficiency.

A neural network-based joint learning approach for biomedical entity and relation extraction from biomedical literature

Ling Luo, Zhihao Yang, Mingyu Cao, Lei Wang, Yin Zhang, Hongfei Lin

Journal of Biomedical Informatics (JBI), 2020-03-01

A neural joint learning approach that extracts biomedical entities and the relations between them from biomedical literature.
