Diffusion Models: An Overview of the Technical Roadmap
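Diffusion models are a class of generative models that have recently achieved strong results in image generation, audio synthesis, and multimodal fusion. This post traces their technical roadmap: their applications and development in computer vision (CV), followed by their later extension to NLP and multimodal settings. The goal is to help readers understand both how these models work and how the field has evolved.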
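Due to time constraints, I'm sharing the slides (PPT) first; a written blog post will follow.
Mobile users: please open the link on a PC to view the slides.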
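References
📝 Technical blogs
- [CSDN blog] Diffusion Models: a simple explanation and a simple implementation (m0_73800360)
- [CSDN blog] A detailed walkthrough of a diffusion model implementation (wshzd)
- [CSDN blog] Another diffusion model implementation tutorial (m0_61899108)
- [Cnblogs tech blog] Diffusion Models – Part 1 (rh‑li)
- [Cnblogs tech blog] Diffusion Models – Part 2 (rh‑li)
- [Zhihu column] Diffusion models: principles and implementation (Zhihu user)
- [Zhihu Q&A] Questions and answers on diffusion models (Zhihu user)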
📄 Papers
- [Berkeley] arXiv:2006.11239 – Denoising Diffusion Probabilistic Models (NeurIPS 2020)
- [Heidelberg] arXiv:2112.10752 – High‑Resolution Image Synthesis with Latent Diffusion Models (CVPR 2022)
- [FAIR] arXiv:2111.06377 – Masked Autoencoders Are Scalable Vision Learners (CVPR 2022)
- [FAIR] arXiv:2304.03283 – Diffusion Models as Masked Autoencoders (ICCV 2023)
- [Stanford] arXiv:2302.05543 – Adding Conditional Control to Text‑to‑Image Diffusion Models (ICCV 2023)
- [Minnesota] arXiv:2305.14671 – A Survey of Diffusion Models in Natural Language Processing
- [RUC & Huawei] arXiv:2406.03736 – Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
- [RUC & Ant] arXiv:2502.09992 – Large Language Diffusion Models (LLaDA)
- [RUC & Ant] arXiv:2505.19223 – LLaDA 1.5: Variance‑Reduced Preference Optimization for Large Language Diffusion Models
- [UCLA] arXiv:2504.12216 – d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
- [BD] arXiv:2505.15809 – MMaDA: Multimodal Large Diffusion Language Models
- [HKU & Apple] arXiv:2410.17891 – Scaling Diffusion Language Models via Adaptation from Autoregressive Models
- [HKU & Apple] Dream series blog posts (HKU NLP Blog)
Permalink: http://zhaojingqian.github.io/2025/06/14/Diffusion-Model-技术路线梳理/