[Diffusion Study Group] Diffusion Bridge Models

August 27, 2024

Text of each page
1.

Diffusion Bridge Models
Photography and disclosure to third parties without permission are prohibited.
2024/08/27 O.Sodtavilan ©︎MATSUO LAB, THE UNIVERSITY OF TOKYO

2.

Introduction
Agenda
1. Score-based Diffusion Models
   - SDE
   - VP-SDE
   - VE-SDE
2. Image 2 Image translations
   - SDEdit
   - InstructPix2Pix
3. DDBM

3.

1. Score-based Diffusion Models
Stochastic Differential Equation (SDE):
Forward process: $dx_t = F(x_t, \sigma_t)\,d\sigma_t + G(\sigma_t)\,d\omega_t$
Reverse process: $dx_t = \left[ F(x_t, \sigma_t) - \frac{1}{2} G(\sigma_t)^2 \nabla_{x_t} \log p_\sigma(x_t) \right] d\sigma_t + G(\sigma_t)\,d\omega_t$
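A forward process of this form can be simulated numerically. The following is a minimal sketch using the Euler-Maruyama scheme; the function name `euler_maruyama`, the step count, and the zero-drift example are assumptions for illustration and do not come from the slides. The code variable `t` plays the role of the slides' time-like variable $\sigma_t$.

```python
import numpy as np

def euler_maruyama(x0, drift, diffusion, t0, t1, n_steps, rng):
    """Integrate dx = drift(x, t) dt + diffusion(t) dW from t0 to t1."""
    x = np.array(x0, dtype=float)
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Deterministic drift step plus a Gaussian increment with variance dt.
        x = x + drift(x, t) * dt + diffusion(t) * np.sqrt(dt) * noise
        t += dt
    return x

# Illustrative example (assumed): zero drift and G(t) = sqrt(2 t),
# so the variance accumulated on [0, 1] is approximately 1.
rng = np.random.default_rng(0)
x0 = np.zeros(20000)
x1 = euler_maruyama(x0, lambda x, t: 0.0 * x, lambda t: np.sqrt(2.0 * t),
                    0.0, 1.0, 200, rng)
```

With this choice of $G$, the empirical standard deviation of `x1` should be close to the final noise level 1.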

4.

1. Score-based Diffusion Models
VP-SDE: DDPM-like diffusion models
  $F(x_t, \sigma_t) = -\frac{1}{2} \beta(\sigma_t)\, x_t$
  $G(\sigma_t) = \sqrt{\beta(\sigma_t)}$
  Solution of the SDE: $x_t = \alpha_t x_0 + \sigma_t z_t$, where $z_t \sim N(0, I)$
VE-SDE: Variance Exploding
  $F(x_t, \sigma_t) = 0$
  $G(\sigma_t) = \sqrt{2 \sigma_t}$
  Solution of the SDE: $x_t = x_0 + \sigma_t z_t$, where $z_t \sim N(0, I)$
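Because both SDEs have closed-form solutions, a noisy sample $x_t$ can be drawn directly from $x_0$ without simulating the SDE. The sketch below illustrates this; the function names are hypothetical, and the VP relation $\sigma_t^2 = 1 - \alpha_t^2$ is an assumption added here (it is the usual DDPM convention but is not stated on the slide).

```python
import numpy as np

def perturb_ve(x0, sigma_t, rng):
    """VE-SDE solution: x_t = x_0 + sigma_t * z, z ~ N(0, I)."""
    return x0 + sigma_t * rng.standard_normal(x0.shape)

def perturb_vp(x0, alpha_t, rng):
    """VP-SDE solution: x_t = alpha_t * x_0 + sigma_t * z, z ~ N(0, I).

    Assumes sigma_t^2 = 1 - alpha_t^2 (not stated on the slide).
    """
    sigma_t = np.sqrt(1.0 - alpha_t ** 2)
    return alpha_t * x0 + sigma_t * rng.standard_normal(x0.shape)

rng = np.random.default_rng(0)
x0 = np.full(50000, 3.0)
x_ve = perturb_ve(x0, sigma_t=2.0, rng=rng)   # mean stays at 3, std grows to 2
x_vp = perturb_vp(x0, alpha_t=0.6, rng=rng)   # mean shrinks to 1.8, std is 0.8
```

The VE process leaves the signal untouched while the variance grows, whereas the VP process shrinks the signal so that the total variance stays bounded.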

5.

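1. Score-based Diffusion Models
Score matching
Reverse process: $dx_t = \left[ F(x_t, \sigma_t) - \frac{1}{2} G(\sigma_t)^2 \nabla_{x_t} \log p_\sigma(x_t) \right] d\sigma_t + G(\sigma_t)\,d\omega_t$
The score term $\nabla_{x_t} \log p_\sigma(x_t)$ is annotated "Train this parameter": the network $s_\theta$ is trained to approximate it.
Loss: $\mathcal{L}(\theta) = \mathbb{E}_{x_t \sim p(x_t \mid x_0),\ x_0 \sim p_{data},\ t \sim U(0, T)} \left[ \left\| s_\theta(x_t, \sigma_t) - \nabla_{x_t} \log p_\sigma(x_t) \right\|^2 \right]$
Sampling = solving the ODE: $dx_t = \left[ F(x_t, \sigma_t) - \frac{1}{2} G(\sigma_t)^2 s_\theta(x_t, \sigma_t) \right] d\sigma_t$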
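The loss and the ODE sampler can be checked on a toy problem where the score is known exactly. The sketch below assumes 1-D Gaussian data $N(m, s^2)$ under the VE-SDE ($F = 0$, $G^2 = 2\sigma$), so that $p_\sigma = N(m, s^2 + \sigma^2)$. All function names, the toy data, and the plain Euler solver are assumptions for illustration, and the analytic score stands in for a trained $s_\theta$.

```python
import numpy as np

def gaussian_score(x, sigma, mean, std):
    """Exact score of p_sigma = N(mean, std^2 + sigma^2)."""
    return -(x - mean) / (std ** 2 + sigma ** 2)

def score_matching_loss(model_fn, true_score_fn, x_t, sigma_t):
    """Monte Carlo estimate of E || s_theta - grad log p_sigma ||^2."""
    return np.mean((model_fn(x_t, sigma_t) - true_score_fn(x_t, sigma_t)) ** 2)

def probability_flow_sample(score_fn, sigma_max, n_steps, x_init):
    """Euler solver for dx = -0.5 * G^2 * score d(sigma), with G^2 = 2 sigma."""
    sigmas = np.linspace(sigma_max, 0.0, n_steps + 1)
    x = np.array(x_init, dtype=float)
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        dx_dsigma = -0.5 * (2.0 * s_cur) * score_fn(x, s_cur)
        x = x + dx_dsigma * (s_next - s_cur)  # step is negative: noise decreases
    return x

rng = np.random.default_rng(0)
mean, std, sigma_max = 2.0, 0.5, 10.0
score_fn = lambda x, s: gaussian_score(x, s, mean, std)

# Start from the fully noised marginal and integrate down to sigma = 0.
x_init = mean + np.sqrt(std ** 2 + sigma_max ** 2) * rng.standard_normal(20000)
samples = probability_flow_sample(score_fn, sigma_max, 1000, x_init)

# The exact score has zero loss; a model that always predicts zero does not.
x_t = mean + np.sqrt(std ** 2 + 1.0) * rng.standard_normal(20000)
loss_exact = score_matching_loss(score_fn, score_fn, x_t, 1.0)
loss_zero = score_matching_loss(lambda x, s: np.zeros_like(x), score_fn, x_t, 1.0)
```

In this toy setting the samples should recover the data statistics (mean near 2, standard deviation near 0.5), which is what "sampling = solving the ODE" means in practice.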

6.

2. Image 2 Image translations
Image-to-image translation task:
[Figure: image panels labeled "Sample" and "Generated"]

7.

2. Image 2 Image translations
Image-to-image translation methods:
- SDEdit: Denoising Translation
- InstructPix2Pix: Conditional Translation, using Classifier-Free Guidance (CFG) for 2 conditions
- Bridge Models: Optimal Transport between $y \sim q(y)$ and $x \sim p(x)$, with $T_\# q = p$

8.

2. Image 2 Image translations
SDEdit
Meng, C., He, Y., Song, Y., Song, J., Wu, J., Zhu, J., & Ermon, S. (2021). SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations. International Conference on Learning Representations.
✓ Improved sampling speed.
✓ Easy to integrate with large diffusion models.
× Realism-faithfulness trade-off.
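The trade-off can be illustrated on a toy problem. SDEdit perturbs a guide input to an intermediate noise level and then denoises it with a pretrained model. The sketch below is a simplification under stated assumptions: the "data" is 1-D Gaussian $N(0, 1)$ with an analytic score in place of a trained network, the noise model is VE, and the denoising uses a deterministic Euler ODE solver rather than the stochastic solver of the original method. The names `data_score` and `sdedit` are hypothetical.

```python
import numpy as np

def data_score(x, sigma, mean=0.0, std=1.0):
    """Exact score of toy data N(mean, std^2) after VE noise of level sigma."""
    return -(x - mean) / (std ** 2 + sigma ** 2)

def sdedit(guide, sigma0, n_steps, rng, score_fn=data_score):
    """Noise the guide up to level sigma0, then denoise back to sigma = 0."""
    x = guide + sigma0 * rng.standard_normal(guide.shape)
    sigmas = np.linspace(sigma0, 0.0, n_steps + 1)
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        # VE probability-flow ODE: dx/d(sigma) = -sigma * score
        x = x + (-s_cur * score_fn(x, s_cur)) * (s_next - s_cur)
    return x

rng = np.random.default_rng(0)
guide = np.full(5000, 5.0)  # a guide far from the data distribution N(0, 1)

faithful = sdedit(guide, sigma0=0.5, n_steps=500, rng=rng)    # stays near the guide
realistic = sdedit(guide, sigma0=50.0, n_steps=500, rng=rng)  # looks like the data
```

A small starting noise level keeps the output close to the guide (faithful but unrealistic), while a large one returns samples that match the data distribution but forget the guide. The sweet spot lies between the two.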

9.

2. Image 2 Image translations
SDEdit
[Figure: realism-faithfulness trade-off, with the sweet spot indicated]

10.

2. Image 2 Image translations
InstructPix2Pix
Brooks, T., Holynski, A., & Efros, A.A. (2022). InstructPix2Pix: Learning to Follow Image Editing Instructions. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 18392-18402.
[Figure: text-conditioned CFG (Stable Diffusion) compared with CFG for 2 conditions, both using the instruction “Turn him into a cyborg!”]
✓ No realism-faithfulness trade-off.
✓ Visual quality is good.
× Relies on guidance or projected sampling (there is a cost to integrating it into the generation process).
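"CFG for 2 conditions" refers to combining three noise predictions, one unconditional, one conditioned on the input image only, and one conditioned on both image and text, with a separate guidance scale for each condition. The sketch below follows the combination rule described in the InstructPix2Pix paper as far as I can reconstruct it; the function and argument names are hypothetical, and the arrays stand in for outputs of a real denoising network.

```python
import numpy as np

def cfg_two_conditions(eps_uncond, eps_image, eps_full, s_image, s_text):
    """Combine noise predictions with image and text guidance scales.

    eps_uncond: prediction with neither image nor text condition
    eps_image:  prediction with the image condition only
    eps_full:   prediction with both image and text conditions
    """
    return (eps_uncond
            + s_image * (eps_image - eps_uncond)
            + s_text * (eps_full - eps_image))

# Placeholder predictions (assumed values, for illustration only).
eps_uncond = np.array([0.0, 0.0])
eps_image = np.array([1.0, 0.0])
eps_full = np.array([1.0, 2.0])

plain = cfg_two_conditions(eps_uncond, eps_image, eps_full, 1.0, 1.0)
guided = cfg_two_conditions(eps_uncond, eps_image, eps_full, 1.5, 7.5)
```

With both scales set to 1 the result equals the fully conditioned prediction. Larger scales push the sample further toward the image or the text condition, at the price of three network evaluations per sampling step.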

11.

2. Image 2 Image translations
InstructPix2Pix: integration into the generation process
(1) Generate text edits: GPT-3 (finetuned)
    Input Caption: “photograph of a girl riding a horse”
    Instruction: “have her ride a dragon”
    Edited Caption: “photograph of a girl riding a dragon”
(2) Generate paired images: Stable Diffusion + Prompt2Prompt
    Input Caption: “photograph of a girl riding a horse”
    Edited Caption: “photograph of a girl riding a dragon”
Generated training examples: “have her ride a dragon”, “Color the cars pink”, “Make it lit by fireworks”, “convert to brick”, …

12.

3. DDBM
De Bortoli, V., et al. “Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling.” ArXiv abs/2106.01357 (2021).
Definition of the Schrödinger Bridge Problem
Proposed by Schrödinger when computing the state transitions of particles (?)
- State of the particle before motion: $\psi_1$
- State of the particle after motion: $\psi_2$
Transition of the probability density: $|\psi_1|^2 \rightarrow |\psi_2|^2$
Optimal transport problem on a probability space:
Parameters:
  $p_{data}$: dataset distribution, $x_0 \sim p_{data} = \pi_0$
  $p_{prior}$: condition distribution, $x_N \sim p_{prior} = \pi_N$
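For reference, the discrete-time Schrödinger bridge problem can be written roughly as below, following the formulation I recall from the cited paper. The symbols $p^{\mathrm{ref}}$ (a reference path measure, for example the forward noising process) and $\mathcal{P}_{N+1}$ (the set of path measures over $N+1$ time points) are notation assumed here and are not recovered from the slide.

```latex
\pi^\star = \arg\min \left\{ \mathrm{KL}\left(\pi \,\|\, p^{\mathrm{ref}}\right) \;:\; \pi \in \mathcal{P}_{N+1},\ \pi_0 = p_{data},\ \pi_N = p_{prior} \right\}
```

Among all path measures whose two end marginals match $p_{data}$ and $p_{prior}$, the bridge is the one closest to the reference process in KL divergence.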

13.

3. DDBM
$p(X_t \mid X_0) \rightarrow p(X_t \mid X_0, X_T)$
$X_T \sim p_{prior}$, $X_0 \sim p_{data}$
Scheduler: $\gamma_k = ?$
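Conditioning on both endpoints turns the usual Gaussian perturbation kernel into a Gaussian bridge. The sketch below samples from $p(X_t \mid X_0, X_T)$ in the VE case, using the closed form I believe is given in the DDBM paper with $\alpha_t = 1$ and $\mathrm{SNR}_t = 1/\sigma_t^2$. Treat the formula as an assumption to check against the paper; the function name is hypothetical.

```python
import numpy as np

def ve_bridge_sample(x0, xT, sigma_t, sigma_T, rng):
    """Sample X_t given both endpoints X_0 and X_T under a VE bridge.

    Assumed closed form:
      mean = (sigma_t^2 / sigma_T^2) * xT + (1 - sigma_t^2 / sigma_T^2) * x0
      var  = sigma_t^2 * (1 - sigma_t^2 / sigma_T^2)
    """
    ratio = sigma_t ** 2 / sigma_T ** 2  # equals SNR_T / SNR_t in the VE case
    mean = ratio * xT + (1.0 - ratio) * x0
    std = sigma_t * np.sqrt(1.0 - ratio)
    return mean + std * rng.standard_normal(np.shape(x0))

rng = np.random.default_rng(0)
x0 = np.array([1.0, -1.0])   # sample from p_data (illustrative values)
xT = np.array([4.0, 2.0])    # paired sample from p_prior (illustrative values)

at_start = ve_bridge_sample(x0, xT, sigma_t=0.0, sigma_T=10.0, rng=rng)
at_end = ve_bridge_sample(x0, xT, sigma_t=10.0, sigma_T=10.0, rng=rng)
midway = ve_bridge_sample(x0, xT, sigma_t=5.0, sigma_T=10.0, rng=rng)
```

The variance vanishes at both ends, so the process is pinned to $X_0$ at the start and to $X_T$ at the end, and is noisy only in between.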

14.

3. DDBM
Denoising Diffusion Bridge Model
Zhou, L., Lou, A., Khanna, S., & Ermon, S. (2023). Denoising Diffusion Bridge Models. ArXiv, abs/2309.16948.
✓ No realism-faithfulness trade-off.
✓ Visual quality is good.
✓ Does not rely on guidance or projected sampling.

15.

3. DDBM
Scheduler: $\sigma(t)$, with $\sigma_{max} = 10$ (VE-SDE sampling)

16.

3. DDBM
Sampling algorithm: solving the ODE or SDE.

17.

3. DDBM
Benchmarks: conditional generation

18.

3. DDBM
Benchmarks: unconditional generation

19.

3. DDBM
Research Ideas:
- Computational costs: computing one step requires three model inferences.
- Apply to large models: can the method be integrated with Stable Diffusion by extending the schedule of step t? → Since unconditional sampling may be possible, fine-tune Stable Diffusion into a bridge model?
