Bin Lin (林彬)

Ph.D. Candidate in Computer Science

Peking University  ·  Shenzhen, China

linbin.ece@stu.pku.edu.cn  ·  linbin203279@gmail.com

I work on multimodal large models — building open, scalable systems that understand and generate across vision and language, and increasingly unify the two. Advised by Prof. Li Yuan (袁粒) at Peking University.

I'm a strong believer in open research: I lead and contribute to widely-used projects such as Open-Sora Plan, UniWorld, and Video-LLaVA, releasing the code, models, and data behind every paper.

Multimodal Understanding · Multimodal Generation · Unified Models · Video Generation
Tencent Project UP Scholarship (腾讯青云奖学金)
One of only 15 awardees nationwide · ~400 applicants from 70+ universities · 2026
🔥 News

🚀 Selected Work

Preprint · First Author

GEAR: Guided End-to-End AutoRegression for Image Synthesis

Bin Lin, Zheyuan Liu, et al., Li Yuan

Jointly trains a VQ tokenizer and an autoregressive generator end-to-end, guided by representation alignment. AR training is up to 10× faster than with LlamaGen-REPA, and the learned spatial features are better and generalize across quantizers and to text-to-image.

Preprint · Co-First Author · #1 GitHub Trending

Open-Sora Plan: Open-Source Large Video Generation Model

Bin Lin*, Yunyang Ge*, Xinhua Cheng*, et al., Li Yuan

A fully open recipe for large-scale video generation: causal video VAE, 3D / sparse attention, and complete training pipelines. The first open model trained from scratch natively on NPUs (v1.5).

arXiv · Code 12.2k · 🤗 Models · Dataset

Preprint · First Author · #9 GitHub Trending

UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Bin Lin, Zongjian Li, Xinhua Cheng, et al., Li Yuan

One unified model for visual understanding, generation, and manipulation: 2.7M curated samples power 20+ tasks within a single framework.

arXiv · Code 875 · 🤗 Models · Dataset

EMNLP 2025 · First Author · #6 GitHub Trending

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Bin Lin, Yang Ye, Bin Zhu, et al., Li Yuan

#1 Most Influential EMNLP 2025 Paper with 1,000+ citations. It reached video-QA SOTA in just two days on a single node.

arXiv · Code 3.5k · 🤗 Models · Dataset

📚 More Publications

For the complete list, see my Google Scholar. * denotes equal contribution.

Open-Source Projects

Open source is at the core of my research: collectively, my projects have 20K+ GitHub stars and 14M+ model and data downloads.

🏆 Honors & Awards

Tencent Project UP Scholarship (腾讯青云奖学金)
One of only 15 AI scholars selected nationwide from ~400 applicants across 70+ universities.
2026
PKU Hongqiao Scholarship (北大宏桥奖学金)
Peking University Shenzhen Graduate School honor for outstanding research students.
2025
  • 2023.06 Outstanding Graduate of Sichuan Province, China.
  • 2022.11 Outstanding Student, Sichuan Agricultural University (top 10 university-wide).
  • 2022.10 National Scholarship, the highest scholarship from the Ministry of Education, China.
  • 2021.11 National First Prize, National Undergraduate Mathematical Modeling Contest.
  • 2021.10 National Scholarship, the highest scholarship from the Ministry of Education, China.
🎓 Education

2023.09 – now (exp. 2028)
Ph.D. in Computer Science
Peking University
2019.09 – 2023.06
B.E., Outstanding Graduate, ranked 1st / 263
Sichuan Agricultural University
🛠️ Academic Service

Conference Reviewer
ICLR 2026 · NeurIPS 2026 · ICML 2026 · CVPR 2026 · ECCV 2026
Journal Reviewer
IEEE TPAMI
Workshop Organizer
Working Committee, CVM @ AAAI 2026 Workshop, "Consistency in Video Generative Models: From Clip to Wild", featuring a video-generation competition.