Hi there! I’m currently a postdoctoral researcher at the University of Oxford, supervised by Prof. Philip Torr. I’m also visiting Le Cong’s and Mengdi Wang’s labs at Stanford and Princeton, respectively. I did my Ph.D. at the University of Sydney, working with Prof. Wanli Ouyang and Prof. Zhiyong Wang. Previously, I was a rising star research fellow at the Shanghai AI Lab, selected by Prof. Xiaoou Tang, where I collaborated with outstanding researchers such as Dr. Lei Bai and Dr. Amanda Shao. I also had a wonderful time as a visitor at the Chinese University of Hong Kong. Before starting my Ph.D., I was part of SenseTime’s AGI group, working closely with Dr. Junjie Yan. I earned my bachelor’s degree from HUST, where I had the honor of being the ACM-ICPC team captain, guided by Prof. Kun He.

News

Research Highlights & Profile

Zhenfei (Jeremy) Yin is a postdoctoral researcher at the University of Oxford, supervised by Prof. Philip Torr, and a visiting researcher at Stanford and Princeton. He received his Ph.D. from the University of Sydney. His research focuses on advancing the next generation of AI: systems that can not only understand and generate, but also act, adapt, and drive discovery in the real world. His work spans foundation model agents, multi-agent systems, self-evolving agents, embodied agents and robotics, and AI Scientist systems, with the goal of building general-purpose AI agents that can operate across both physical and virtual worlds and uncover new scaling laws for agent-based intelligence and automated scientific discovery.

Dr. Yin has authored 90+ papers including preprints, with 50+ papers published at top AI conferences and journals, and his work has received 2,000+ citations. He has also contributed to open-source AI projects with 20,000+ GitHub stars in total. Across agentic AI, multi-agent systems, embodied intelligence, and AI scientists, he has built research and open-source efforts that help push AI beyond passive assistance toward execution, continual learning, and innovation. His representative efforts include open platforms and systems for multimodal foundation models, large-scale agent societies, multi-agent systems, and embodied intelligence. His work has also received broader recognition beyond academia, including coverage by Nature and The Washington Post.

Selected Publications

Topics: Foundation Model Agents / Robotics / AI Scientists

(*: indicates equal contribution; ‡: indicates corresponding author; †: indicates project lead)

Visit Google Scholar for the complete list of publications.

Preprint

SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents

Yujiong Shen, Yajie Yang, Zhiheng Xi, Binze Hu, Huayu Sha, Jiazheng Zhang, Qiyuan Peng, Junlin Shang, Jixuan Huang, Yutao Fan, Jingqi Tong, Shihan Dou, Ming Zhang, Lei Bai, Zhenfei Yin, Tao Gui, Xingjun Ma, Qi Zhang, Xuanjing Huang, Yu-Gang Jiang

Preprint 2026

PDF

Preprint

Charting Empirical Laws for LLM Fine-Tuning in Scientific Multi-Discipline Learning

Lintao Wang, Zhuqiang Lu, Yilin Zhu, Kun Hu, Zhenfei Yin, Shixiang Tang, Zhiyong Wang, Wanli Ouyang, Xinzhu Ma

Preprint 2026

PDF

Preprint

TodoEvolve: Learning to Architect Agent Planning Systems

Jiaxi Liu, Yanzuo Jiang, Guibin Zhang, Zihan Zhang, Heng Chang, Zhenfei Yin, Qibing Ren, Junchi Yan

Preprint 2026

PDF

Preprint

LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning

Xinwu Ye, Yicheng Mao, Jia Zhang, Yimeng Liu, Li Hao, Fang Wu, Zhiwei Li, Yuxuan Liao, Zehong Wang, Zhiyuan Liu, Zhenfei Yin, Li Yuan, Philip Torr, Huan Sun, Xiangxiang Zeng, Mengdi Wang, Le Cong, Shenghua Gao, Xiangru Tang

Preprint 2026

PDF

ACL 2026 Findings

Behavioral Consistency Validation for LLM Agents: An Analysis of Trading-Style Switching through Stock-Market Simulation

Zeping Li, Guancheng Wan, Keyang Chen, Yu Chen, Yiwen Zhao, Philip Torr, Guangnan Ye, Zhenfei Yin, Hongfeng Chai

Findings of the Association for Computational Linguistics, ACL 2026

PDF

Preprint

Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

Yu Zeng, Wenxuan Huang, Zhen Fang, Shuang Chen, Yufan Shen, Yishuo Cai, Xiaoman Wang, Zhenfei Yin, Lin Chen, Zehui Chen, Shiting Huang, Yiming Zhao, Xu Tang, Yao Hu, Philip Torr, Wanli Ouyang, Shaosheng Cao

Preprint 2026

PDF

ACL 2026

Rethinking the Role of Entropy in Optimizing Tool-Use Behaviors for Large Language Model Agents

Zeping Li, Hongru Wang, Yiwen Zhao, Guanhua Chen, Yixia Li, Keyang Chen, Yixin Cao, Guangnan Ye, Hongfeng Chai, Zhenfei Yin

The 64th Annual Meeting of the Association for Computational Linguistics, Main Conference, ACL 2026

PDF

Preprint

Why Reasoning Fails to Plan: A Planning-Centric Analysis of Long-Horizon Decision Making in LLM Agents

Zehong Wang, Fang Wu, Hongru Wang, Xiangru Tang, Bolian Li, Zhenfei Yin, Yijun Ma, Yiyang Li, Weixiang Sun, Xiusi Chen, Yanfang Ye

Preprint 2026

PDF

Preprint

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

Wenxuan Huang, Yu Zeng, Qiuchen Wang, Zhen Fang, Shaosheng Cao, Zheng Chu, Qingyu Yin, Shuang Chen, Zhenfei Yin, Lin Chen, Zehui Chen, Xu Tang, Yao Hu, Philip Torr, Feng Zhao, Wanli Ouyang

Preprint 2026

PDF

Preprint

TouchGuide: Inference-Time Steering of Visuomotor Policies via Touch Guidance

Zhemeng Zhang, Jiahua Ma, Xincheng Yang, Xin Wen, Yuzhi Zhang, Boyan Li, Yiran Qin, Jin Liu, Can Zhao, Li Kang, Haoqin Hong, Zhenfei Yin, Philip Torr, Hao Su, Ruimao Zhang, Daolin Ma

Preprint 2026

PDF

Tech. Report

Advances and Innovations in the Multi-Agent Robotic System (MARS) Challenge

Li Kang, Heng Zhou, Xiufeng Song, Rui Li, Bruno NY Chen, Ziye Wang, Ximeng Meng, Stone Tao, Yiran Qin, Xiaohong Liu, Ruimao Zhang, Lei Bai, Yilun Du, Hao Su, Philip Torr, Zhenfei Yin

Technical Report, 2026

PDF

Preprint

Think3D: Thinking with Space for Spatial Reasoning

Zaibin Zhang, Yuhan Wu, Lianjie Jia, Yifan Wang, Zhongbo Zhang, Yijiang Li, Binghao Ran, Fuxi Zhang, Zhuohan Sun, Zhenfei Yin, Lijun Wang, Huchuan Lu

Preprint 2026

PDF

ICLR 2026 Workshop

RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic

Le Wang, Zonghao Ying, Xiao Yang, Quanchen Zou, Zhenfei Yin, Tianlin Li, Jian Yang, Yaodong Yang, Aishan Liu, Xianglong Liu

The First Workshop on Efficient Spatial Reasoning, ICLR 2026, Oral Presentation, Best Paper Award

PDF

ACL 2026

From Word to World: Can Large Language Models be Implicit Text-based World Models?

Yixia Li, Hongru Wang, Jiahao Qiu, Zhenfei Yin, Dongdong Zhang, Cheng Qian, Zeping Li, Pony Ma, Guanhua Chen, Heng Ji, Mengdi Wang

The 64th Annual Meeting of the Association for Computational Linguistics, Main Conference, ACL 2026

PDF

Preprint

Memory in the Age of AI Agents

Yuyang Hu*, Shichun Liu*, Yanwei Yue*, Guibin Zhang*, Boyang Liu, Fangyi Zhu, Jiahang Lin, Honglin Guo, Shihan Dou, Zhiheng Xi, Senjie Jin, Jiejun Tan, Yanbin Yin, Jiongnan Liu, Zeyu Zhang, Zhongxiang Sun, Yutao Zhu, Hao Sun, Boci Peng, Zhenrong Cheng, Xuanbo Fan, Jiaxin Guo, Xinlei Yu, Zhenhong Zhou, Zewen Hu, Jiahao Huo, Junhao Wang, Yuwei Niu, Yu Wang, Zhenfei Yin, Xiaobin Hu, Yue Liao, Qiankun Li, Kun Wang, Wangchunshu Zhou, Yixin Liu, Dawei Cheng, Qi Zhang, Tao Gui, Shirui Pan, Yan Zhang, Philip Torr, Zhicheng Dou, Ji-Rong Wen, Xuanjing Huang, Yu-Gang Jiang, Shuicheng Yan

Preprint 2025

PDF

Preprint

LiveSearchBench: An Automatically Constructed Benchmark for Retrieval and Reasoning over Dynamic Knowledge

Heng Zhou*, Ao Yu*, Yuchen Fan*, Jianing Shi, Li Kang, Hejia Geng, Yongting Zhang, Yutao Fan, Yuhao Wu, Tiancheng He, Yiran Qin, Lei Bai, Zhenfei Yin

Preprint 2025

PDF

ACL 2026

SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents

Zonghao Ying*, Yangguang Shao*, Jianle Gan, Gan Xu, Junjie Shen, Wenxin Zhang, Quanchen Zou, Junzheng Shi, Zhenfei Yin, Mingchuan Zhang, Aishan Liu, Xianglong Liu

Findings of the Association for Computational Linguistics, ACL 2026

PDF

ICLR 2026

CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards

Xiangyuan Xue, Yifan Zhou, Guibin Zhang, Zaibin Zhang, Yijiang Li, Chen Zhang, Zhenfei Yin, Philip Torr, Wanli Ouyang, Lei Bai

The Fourteenth International Conference on Learning Representations, ICLR 2026

PDF

Preprint

A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory

Qianshan Wei*, Tengchao Yang*, Yaochen Wang*, Xinfeng Li, Lijun Li, Zhenfei Yin, Yi Zhan, Thorsten Holz, Zhiqiang Lin, XiaoFeng Wang

Preprint 2025

PDF

Preprint

LatentEvolve: Self-Evolving Test-Time Scaling in Latent Space

Guibin Zhang, Fanci Meng, Guancheng Wan, Zherui Li, Kun Wang, Zhenfei Yin, Lei Bai, Shuicheng Yan

Preprint 2025

PDF

ACL 2026

Scaling Behaviors of LLM Reinforcement Learning Post-Training: An Empirical Study in Mathematical Reasoning

Zelin Tan, Hejia Geng, Mulei Zhang, Xiaohang Yu, Guancheng Wan, Yifan Zhou, Qiang He, Xiangyuan Xue, Heng Zhou, Yutao Fan, Zhongzhi Li, Zaibin Zhang, Guibin Zhang, Chen Zhang, Zhenfei Yin, Lei Bai

The 64th Annual Meeting of the Association for Computational Linguistics, Main Conference, ACL 2026, Oral Presentation

PDF

Preprint

Diagnose, Localize, Align: A Full-Stack Framework for Reliable LLM Multi-Agent Systems under Instruction Conflicts

Guancheng Wan*, Leixin Sun*, Longxu Dou, Zitong Shi, Fang Wu, Eric Hanchen Jiang, Wenke Huang, Guibin Zhang, Hejia Geng, Xiangru Tang, Zhenfei Yin, Yizhou Sun, Wei Wang

Preprint 2025

PDF

ICLR 2026

Eigen-Agent: Adaptive Multi-Agent Scientific Reasoning with Monitor-Based RAG

Xiangru Tang*, Wanghan Xu*, Yujie Wang*, Zijie Guo*, Daniel Shao, Jiapeng Chen, Cixuan Zhang, Ziyi Wang, Lixin Zhang, Guancheng Wan, Wenlong Zhang, Lei Bai, Zhenfei Yin, Philip Torr, Hanrui Wang, Di Jin

The Fourteenth International Conference on Learning Representations, ICLR 2026

PDF

ICLR 2026

Interleaving Reasoning for Better Text-to-Image Generation

Wenxuan Huang, Shuang Chen, Zheyong Xie, Shaosheng Cao, Shixiang Tang, Yufan Shen, Qingyu Yin, Wenbo Hu, Xiaoman Wang, Yuntian Tang, Junbo Qiao, Yue Guo, Yao Hu, Zhenfei Yin, Philip Torr, Yu Cheng, Wanli Ouyang, Shaohui Lin

The Fourteenth International Conference on Learning Representations, ICLR 2026

PDF

TMLR

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Guibin Zhang*, Hejia Geng*, Xiaohang Yu*, Zhenfei Yin, Zaibin Zhang, Zelin Tan, Heng Zhou, Zhongzhi Li, Xiangyuan Xue, Yijiang Li, Yifan Zhou, Yang Chen, Chen Zhang, Yutao Fan, Zihu Wang, Songtao Huang, Yue Liao, Hongru Wang, Mengyue Yang, Heng Ji, Michael Littman, Jun Wang, Shuicheng Yan, Philip Torr, Lei Bai

Transactions on Machine Learning Research, TMLR 2026

PDF

NeurIPS 2025

BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset

Zhiheng Xi*, Guanyu Li*, Yutao Fan*, Honglin Guo*, Yufang Liu, Xiaoran Fan, Jiaqi Liu, Jingchao Ding, Wangmeng Zuo, Zhenfei Yin, Lei Bai, Tao Ji, Tao Gui, Qi Zhang, Philip Torr, Xuanjing Huang

The Thirty-Ninth Annual Conference on Neural Information Processing Systems, Datasets and Benchmarks Track, NeurIPS 2025

PDF | Project Page | Code

Preprint

VeriGUI: Verifiable Long-Chain GUI Dataset

Shunyu Liu*, Minghao Liu*, Huichi Zhou, Zhenyu Cui, Yang Zhou, Yuhao Zhou, Wendong Fan, Ge Zhang, Jiajun Shi, Weihao Xuan, Jiaxing Huang, Shuang Luo, Fang Wu, Heli Qi, Qingcheng Zeng, Ziqi Ren, Jialiang Gao, Jindi Lv, Junjie Wang, Aosong Feng, Heng Zhou, Wangchunshu Zhou, Zhenfei Yin, Wenlong Zhang, Guohao Li, Wenhao Yu, Irene Li, Lei Ma, Lei Bai, Qunshu Lin, Mingli Song, Dacheng Tao

Preprint 2025

PDF | Code

NeurIPS 2025

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

Li Kang*, Xiufeng Song*, Heng Zhou*, Yiran Qin, Jie Yang, Xiaohong Liu, Philip Torr, Lei Bai, Zhenfei Yin

The Thirty-Ninth Annual Conference on Neural Information Processing Systems, Datasets and Benchmarks Track, NeurIPS 2025

PDF

NeurIPS 2025

LabUtopia: High-Fidelity Simulation and Hierarchical Benchmark for Scientific Embodied Agents

Rui Li*, Zixuan Hu*, Wenxi Qu*, Jinouwen Zhang, Zhenfei Yin, Sha Zhang, Xuantuo Huang, Hanqing Wang, Tai Wang, Jiangmiao Pang, Wanli Ouyang, Lei Bai, Wangmeng Zuo, Ling-Yu Duan, Dongzhan Zhou, Shixiang Tang

The Thirty-Ninth Annual Conference on Neural Information Processing Systems, Datasets and Benchmarks Track, NeurIPS 2025

PDF

Preprint

X-MAS: Towards Building Multi-Agent Systems with Heterogeneous LLMs

Rui Ye*, Xiangrui Liu*, Qimin Wu, Xianghe Pang, Zhenfei Yin, Lei Bai, Siheng Chen

Preprint 2025

PDF

Preprint

MASLab: A Unified and Comprehensive Codebase for LLM-based Multi-Agent Systems

Rui Ye, Keduan Huang, Qimin Wu, Yuzhu Cai, Tian Jin, Xianghe Pang, Xiangrui Liu, Jiaqi Su, Chen Qian, Bohan Tang, Kaiqu Liang, Jiaao Chen, Yue Hu, Zhenfei Yin, Rongye Shi, Bo An, Yang Gao, Wenjun Wu, Lei Bai, Siheng Chen

Preprint 2025

PDF

Preprint

AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research

Renqi Chen*, Haoyang Su*, Shixiang Tang, Zhenfei Yin, Qi Wu, Hui Li, Ye Sun, Nanqing Dong, Wanli Ouyang, Philip Torr

Preprint 2025

PDF

ICCV 2025

RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints

Yiran Qin*, Li Kang*, Xiufeng Song*, Zhenfei Yin, Xiaohong Liu, Xihui Liu, Ruimao Zhang, Lei Bai

International Conference on Computer Vision, ICCV 2025

PDF | Project Page

EMNLP 2025

ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks

Heng Zhou*, Hejia Geng*, Xiangyuan Xue, Li Kang, Yiran Qin, Zhiyong Wang, Zhenfei Yin, Lei Bai

Empirical Methods in Natural Language Processing, EMNLP 2025, Oral Presentation, SAC Highlight Award, Outstanding Paper Candidate (Top 1%)

PDF | Code

ICCV 2025

VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior

Xindi Yang, Baolu Li, Yiming Zhang, Zhenfei Yin, Lei Bai, Liqian Ma, Zhiyong Wang, Jianfei Cai, Tien-Tsin Wong, Huchuan Lu, Xu Jia

International Conference on Computer Vision, ICCV 2025

PDF | Project Page

ICML 2025

MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems

Rui Ye, Shuo Tang, Rui Ge, Yaxin Du, Zhenfei Yin, Siheng Chen, Jing Shao

Forty-Second International Conference on Machine Learning, ICML 2025

ICLR 2025 Workshop on Reasoning and Planning for Large Language Models, 2025

PDF

Preprint

Are We There Yet? Revealing the Risks of Utilizing Large Language Models in Scholarly Peer Review

Rui Ye*, Xianghe Pang*, Jingyi Chai, Jiaao Chen, Zhenfei Yin, Zhen Xiang, Xiaowen Dong, Jing Shao, Siheng Chen

Preprint 2024

PDF | Project Page

ICCV 2025

B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens

Zhuqiang Lu, Zhenfei Yin, Mengwei He, Zhihui Wang, Zicheng Liu, Zhiyong Wang, Kun Hu

International Conference on Computer Vision, ICCV 2025

PDF | Code

NeurIPS-W 2024

OASIS: Open Agents Social Interaction Simulations on One Million Agents

Ziyi Yang*, Zaibin Zhang*, Zirui Zheng, Yuxian Jiang, Ziyue Gan, Zhiyu Wang, Zijian Ling, Jinsong Chen, Martz Ma, Bowen Dong, Prateek Gupta, Shuyue Hu, Zhenfei Yin, Guohao Li, Xu Jia, Lijun Wang, Bernard Ghanem, Huchuan Lu, Wanli Ouyang, Yu Qiao, Philip Torr, Jing Shao

NeurIPS Workshop on Open-World Agents, 2024

PDF | Project Page | Code

ICML 2025

WorldSimBench: Towards Video Generation Models as World Simulators

Yiran Qin*, Zhelun Shi*, Jiwen Yu, Xijun Wang, Enshen Zhou, Lijun Li, Zhenfei Yin, Xihui Liu, Lu Sheng, Jing Shao, Lei Bai, Wanli Ouyang, Ruimao Zhang

Forty-Second International Conference on Machine Learning, ICML 2025

PDF | Project Page

ACL 2025

Two Heads Are Better Than One: A Multi-Agent System Has the Potential to Improve Scientific Idea Generation

Haoyang Su*, Renqi Chen*, Shixiang Tang, Xinzhe Zheng, Jingzhe Li, Zhenfei Yin, Wanli Ouyang, Nanqing Dong

The 63rd Annual Meeting of the Association for Computational Linguistics, Main Conference, ACL 2025

PDF | Project Page

Preprint

GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing

Yisong Xiao, Aishan Liu, QianJia Cheng, Zhenfei Yin, Siyuan Liang, Jiapeng Li, Jing Shao, Xianglong Liu, Dacheng Tao

Preprint 2024

PDF

CVPR 2025

SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model

Yongting Zhang*, Lu Chen*, Guodong Zheng, Yifeng Gao, Rui Zheng, Jinlan Fu, Zhenfei Yin, Senjie Jin, Yu Qiao, Xuanjing Huang, Feng Zhao, Tao Gui, Jing Shao

The IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025

PDF | Code

IROS 2025

RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents

Zeren Chen*, Zhelun Shi*, Xiaoya Lu*, Lehan He*, Sucheng Qian, Hao-Shu Fang, Zhenfei Yin, Wanli Ouyang, Jing Shao, Yu Qiao, Cewu Lu, Lu Sheng

IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025

NeurIPS Workshop on Open-World Agents, 2024

PDF | Project Page

Preprint

Assessment of Multimodal Large Language Models in Alignment with Human Values

Zhelun Shi*, Zhipin Wang*, Hongxing Fan*, Zaibin Zhang, Lijun Li, Yongting Zhang, Zhenfei Yin, Lu Sheng, Yu Qiao, Jing Shao

Preprint 2024

PDF | Project Page | Code

IROS 2025

MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control

Enshen Zhou*, Yiran Qin*, Zhenfei Yin, Yuzhou Huang, Ruimao Zhang, Lu Sheng, Yu Qiao, Jing Shao

IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025

NeurIPS Workshop on Open-World Agents, 2024

PDF | Project Page | Code

ACL 2024

Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models

Chen Qian*, Jie Zhang*, Wei Yao*, Dongrui Liu, Zhenfei Yin, Yu Qiao, Yong Liu, Jing Shao

The 62nd Annual Meeting of the Association for Computational Linguistics, Findings, ACL 2024

PDF | Code

Tech. Report

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Chaochao Lu, Chen Qian, Guodong Zheng, Hongxing Fan, Hongzhi Gao, Jie Zhang, Jing Shao, Jingyi Deng, Jinlan Fu, Kexin Huang, Kunchang Li, Lijun Li, Limin Wang, Lu Sheng, Meiqi Chen, Ming Zhang, Qibing Ren, Sirui Chen, Tao Gui, Wanli Ouyang, Yali Wang, Yan Teng, Yaru Wang, Yi Wang, Yinan He, Yingchun Wang, Yixu Wang, Yongting Zhang, Yu Qiao, Yujiong Shen, Yurong Mou, Yuxi Chen, Zaibin Zhang, Zhelun Shi, Zhenfei Yin, Zhipin Wang

Technical Report, 2024

PDF

ECCV 2024

Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models

Zhiyuan You*, Zheyuan Li*, Jinjin Gu*, Zhenfei Yin, Tianfan Xue, Chao Dong

The 18th European Conference on Computer Vision, ECCV 2024

PDF | Project Page | Code

CVPR 2024

MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception

Yiran Qin*, Enshen Zhou*, Qichang Liu*, Zhenfei Yin, Lu Sheng, Ruimao Zhang, Yu Qiao, Jing Shao

The IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024

PDF | Project Page | Code

Tech. Report

ChEF: A Comprehensive Evaluation Framework for Standardized Assessment of Multimodal Large Language Models

Zhelun Shi*, Zhipin Wang*, Hongxing Fan*, Zhenfei Yin, Lu Sheng, Yu Qiao, Jing Shao

Technical Report, 2024

PDF | Code

ICLR 2024

Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE

Zeren Chen*, Ziqin Wang*, Zhen Wang, Huayang Liu, Zhenfei Yin, Si Liu, Lu Sheng, Wanli Ouyang, Jing Shao

The Twelfth International Conference on Learning Representations, ICLR 2024

PDF | Code

NeurIPS 2023

LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark

Zhenfei Yin*, Jiong Wang*, Jianjian Cao*, Zhelun Shi*, Dingning Liu, Mukai Li, Xiaoshui Huang, Zhiyong Wang, Lu Sheng, Lei Bai, Jing Shao, Wanli Ouyang

The Thirty-Seventh Annual Conference on Neural Information Processing Systems, Datasets and Benchmarks Track, NeurIPS 2023

PDF | Project Page | Code

ICME 2024

3D Point Cloud Pre-Training with Knowledge Distilled from 2D Images

Yuan Yao, Yuanhan Zhang, Zhenfei Yin, Jiebo Luo, Wanli Ouyang, Xiaoshui Huang

IEEE International Conference on Multimedia and Expo, ICME 2024

PDF

ECCV 2022

Benchmarking Omni-Vision Representation Through the Lens of Visual Realms

Yuanhan Zhang, Zhenfei Yin, Jing Shao, Ziwei Liu

European Conference on Computer Vision, ECCV 2022

PDF | Project Page | Code

ECCV 2022

X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation

Yinan He*, Gengshi Huang*, Siyu Chen*, Jianing Teng*, Kun Wang, Zhenfei Yin, Lu Sheng, Ziwei Liu, Yu Qiao, Jing Shao

European Conference on Computer Vision, ECCV 2022

PDF

IJCV

Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy

Yuanhan Zhang*, Qinghong Sun*, Yichun Zhou*, Zexin He*, Zhenfei Yin, Kun Wang, Lu Sheng, Yu Qiao, Jing Shao, Ziwei Liu

International Journal of Computer Vision, IJCV, DOI: 10.1007/s11263-025-02450-2

PDF | Code

Tech. Report

INTERN: A New Learning Paradigm Towards General Vision

Jing Shao*, Siyu Chen*, Yangguang Li*, Kun Wang*, Zhenfei Yin*, Yinan He*, Jianing Teng*, Qinghong Sun*, Mengya Gao*, Jihao Liu*, Gengshi Huang*, Guanglu Song, Yichao Wu, Yuming Huang, Fenggang Liu, Huan Peng, Shuo Qin, Chengyu Wang, Yujie Wang, Conghui He, Ding Liang, Yu Liu, Fengwei Yu, Junjie Yan, Dahua Lin, Xiaogang Wang, Yu Qiao

Technical Report, 2021

PDF | Code

Professional Service