Homepage - Yuanmin Tang

Hi, I am Yuanmin Tang (Chinese: 唐源民), a fourth-year Ph.D. candidate (2021-2026) at the University of Chinese Academy of Sciences (UCAS), Beijing, China, advised by Dr. Jing Yu. I am currently also an intern in the Data, Knowledge, Intelligence (DKI) team at Microsoft, previously part of Microsoft Research Asia, supervised by Dr. Jue Zhang. I also collaborate closely with Prof. Qi Wu from the University of Adelaide and Prof. Massimiliano Mancini.

📖 Research

My primary research interests are multimodal learning and explainable artificial intelligence, with a specific focus on:

⭐ Zero-Shot Composed Image Retrieval
Multimodal Large Language Models
Cross-modal Sponsored Search
Explainable Machine Learning
Watermarking Techniques in Vision-Language Models

PS: If you’re interested in collaborating or just want to chat, feel free to reach out by e-mail at tangyuanmin@iie.ac.cn or on WeChat at Peter20200618.

🎉 News

2025.10: 🎉 Two papers accepted at AAAI’26, including one as corresponding author!
2025.06: 🎉 Two papers accepted at CVPR’25, including one Highlight presentation!
2025.06: 🎉 One co-authored paper accepted at ACM MM 2024!
2024.12: 🏆 Awarded Top Reviewer at NeurIPS 2024!
2024.10: 🎉 Started internship at Microsoft DKI (supervisor Dr. Jue Zhang).
2024.08: 🎉 Oral presentation at AAAI’24.
2021.10: 🎉 Co-authored journal paper published in IEEE TWC (IF: 9.6).
2021.09: 🎓 Started Ph.D. journey at UCAS.
2020.09: 🎉 Co-authored journal paper published in Sustainable Cities and Society (IF: 11.0).

📝 Selected Publications

Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval; Yuanmin Tang, Jue Zhang, Xiaoting Qin, Jing Yu, Gaopeng Gou, Gang Xiong, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Wu; Conference on Computer Vision and Pattern Recognition (CVPR'25), Highlight, 2025, CCF-A

Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval; Yuanmin Tang, Jing Yu, Keke Gai, Jiamin Zhuang, Gang Xiong, Gaopeng Gou, Qi Wu; Conference on Computer Vision and Pattern Recognition (CVPR'25), 2025, CCF-A

Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval; Yuanmin Tang, Jing Yu, Keke Gai, Jiamin Zhuang, Gang Xiong, Yue Hu, Qi Wu; The Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI'24), Oral, 2024, CCF-A

Visual-Semantic Decomposition and Partial Alignment for Document-based Zero-Shot Learning; Xiangyan Qu, Jing Yu, Keke Gai, Jiamin Zhuang, Yuanmin Tang, Gang Xiong, Gaopeng Gou, Qi Wu; ACM International Conference on Multimedia (ACM MM), 2024, CCF-A

🌟 Academic Service

Conference Committee Member

TPC Member / Reviewer for CVPR’25, ICCV’25, ICML’25, ICME’25, ACM MM’25, NeurIPS’25, AAAI’26, ICLR’26, ICASSP’26, CVPR’26
TPC Member / Reviewer for ICLR’24, NeurIPS’24 (Top Reviewer Award)
TPC Member / Reviewer for ICME’23

Journal Reviewer

Reviewer for IEEE Transactions on Image Processing (TIP)
Reviewer for IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
Reviewer for Transactions on Machine Learning Research (TMLR)

Always excited about new ideas and discussions!