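Diffusion Models: A Survey of the Technical Roadmap
Diffusion models are a class of generative models that have achieved notable results in recent years in image generation, audio synthesis, and multimodal fusion. This post surveys their technical roadmap: their applications in computer vision (CV), how the approach developed, and its later extension to NLP and multimodal settings. The goal is to help readers better understand how diffusion models work and how they have evolved.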
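Due to time constraints, I am sharing the slides (PPT) first; a written blog post will follow.
Mobile users: please open the link on a PC to view the slides.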
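References
📝 Technical blogs
- [CSDN blog] Diffusion Models: a simple explanation and a simple implementation (m0_73800360)
- [CSDN blog] A detailed walkthrough of a diffusion model implementation (wshzd)
- [CSDN blog] Another diffusion model implementation tutorial (m0_61899108)
- [Cnblogs tech blog] Diffusion Models – Part 1 (rh-li)
- [Cnblogs tech blog] Diffusion Models – Part 2 (rh-li)
- [Zhihu column] Diffusion model principles and implementation (Zhihu user)
- [Zhihu Q&A] Questions and answers about diffusion models (Zhihu user)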
📄 Papers
- [Berkeley] arXiv:2006.11239 – Denoising Diffusion Probabilistic Models (NeurIPS 2020)
- [Heidelberg] arXiv:2112.10752 – High-Resolution Image Synthesis with Latent Diffusion Models (CVPR 2022)
- [FAIR] arXiv:2111.06377 – Masked Autoencoders Are Scalable Vision Learners (CVPR 2022)
- [FAIR] arXiv:2304.03283 – Diffusion Models as Masked Autoencoders (ICCV 2023)
- [Stanford] arXiv:2302.05543 – Adding Conditional Control to Text-to-Image Diffusion Models (ICCV 2023)
- [Minnesota] arXiv:2305.14671 – A Survey of Diffusion Models in Natural Language Processing
- [RUC & Huawei] arXiv:2406.03736 – Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
- [RUC & Ant] arXiv:2502.09992 – Large Language Diffusion Models (LLaDA)
- [RUC & Ant] arXiv:2505.19223 – LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models
- [UCLA] arXiv:2504.12216 – d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
- [BD] arXiv:2505.15809 – MMaDA: Multimodal Large Diffusion Language Models
- [HKU & Apple] arXiv:2410.17891 – Scaling Diffusion Language Models via Adaptation from Autoregressive Models
- [HKU & Apple] Dream series blog posts (HKU NLP Blog)
http://zhaojingqian.github.io/2025/06/14/Diffusion-Model-技术路线梳理/