[DL Reading Group] Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies


October 12, 23

#deep learning #Deep Learning #Disentangled Representation Learning #InfoGAN #β-VAE #Lifelong Learning

Slide overview

"2018/09/14
Deep Learning JP:
http://deeplearning.jp/seminar-2/"

Deep Learning JP

@DeepLearning2023


Text of each page

DEEP LEARNING JP [DL Papers] “Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies” (NIPS2018) Yusuke Iwasawa, Matsuo Lab http://deeplearning.jp/


DEEP LEARNING JP [DL Papers] “Unsupervised Learning” Disentangled Representation Yusuke Iwasawa, Matsuo Lab http://deeplearning.jp/


Bibliographic information
• Title: “Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies”
• Authors:
– Alessandro Achille, Tom Eccles, Loic Matthey, Christopher P Burgess, Nick Watters, Alexander Lerchner, Irina Higgins
– The first author is at UCLA; the rest are at DeepMind
• Why this paper was chosen
– The word "disentangle" stood out at NIPS
– Lifelong learning matters (in the sense of research on intelligence)

Disentanglement in NIPS2018
VAE (β-VAE) family
• “Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies”
• “Isolating Sources of Disentanglement in Variational Autoencoders”
• “Learning Disentangled Joint Continuous and Discrete Representations”
• “Learning to Decompose and Disentangle Representations for Video Prediction”
Others
• “A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation”
• “Image-to-image translation for cross-domain disentanglement”
• “Learning Deep Disentangled Embeddings with the F-Statistic Loss”

Agenda
• Disentangled Representation Learning
• Methods for Disentangled Representation Learning
– InfoGAN [Chen, NIPS2016]
– β-VAE [Higgins, ICLR2017]
– Advances in β-VAE [Chen, ICML2018]
• Disentanglement for Lifelong Learning [Achille, NIPS2018]

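What is Disentangled Representation Learning?
• disentangle = to untangle
• Disentangled representation learning: learning representations that are free of entanglement
• Example: the factors that make up a face image
– Gender
– Face orientation
– Hair length
– Presence of glasses
– etc.
These factors can inherently be controlled independently of one another => we would like the representations learned by a NN to behave the same way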

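Why is Disentanglement Important?
1. Humans seem to learn representations of this kind as well
– Face position and eye size are probably represented separately
2. Easy to interpret
3. Efficient (can be expressed with a minimal number of units)
4. Downstream tasks become easier to solve (or so it seems)
– In particular, when considering transfer, having several factors mixed together is troublesome
• Concrete applied research
– Concept Learning [Higgins, ICLR2018]
– Reinforcement Learning [Higgins, ICML2017]
– Lifelong Learning [Achille, NIPS2018]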

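Difficulty
1. It needs to be (or should preferably be) unsupervised
– DL may untangle representations on its own (especially with supervision)
– Labeling every factor for every image is impractical
2. The method needs to be predictable
– Rather than "we tried it and it happened to be disentangled", we want a justification for why it will be disentangled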

Agenda
• Disentangled Representation Learning
• Methods for Unsupervised Disentangled Representation Learning
– InfoGAN [Chen, NIPS2016]
– β-VAE [Higgins, ICLR2017]
– Advances in β-VAE [Chen, ICML2018]
• Disentanglement for Lifelong Learning [Achille, NIPS2018]

10.

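Two representative lineages
• InfoGAN [Chen, NIPS2016]
– GAN-based
– Makes images generated from a factorizable latent code retain information about that latent code
• β-VAE [Higgins, ICLR2017]
– VAE-based
– Pushes the posterior q(z|x) toward a factorizable prior p(z)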

11.

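InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (NIPS2016) Xi Chen et al.
Labels on the objective: the ordinary GAN term; the term that maximizes the mutual information between the latent code c and the generated image
• D: Discriminator
• G: Generator
• z: noise
• c: factorizable latent code (e.g., c ~ Cat(K=10, p=0.1) or c ~ Unif(-1, 1))
• λ: weighting parameter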
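The equation on this slide was an image and did not survive extraction. For reference, the objective as given in the InfoGAN paper combines the two labeled terms as follows, where V(D, G) is the ordinary GAN value function and I(·;·) is mutual information (in practice a variational lower bound on I is optimized):

```latex
\min_{G}\,\max_{D}\; V_{I}(D, G) \;=\; V(D, G) \;-\; \lambda\, I\bigl(c;\, G(z, c)\bigr)
```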

12.

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (NIPS2016) Xi Chen et al.

13.

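Problems with InfoGAN
• It is GAN-based, so training is difficult
– This may have been mitigated by WGAN and related methods
– Adding the mutual-information constraint reportedly also reduces sample diversity (according to the β-VAE paper; it may also simply depend on the magnitude of the noise z)
• Choosing the prior p(c) is difficult (it relies on knowledge about the task)
– Example: 10 categories for MNIST
• It is GAN-based, so there is no inference distribution (network)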

14.

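β-VAE: LEARNING BASIC VISUAL CONCEPTS WITH A CONSTRAINED VARIATIONAL FRAMEWORK (ICLR2017) Irina Higgins et al.
• Basic idea: impose a constraint so that the learned latent variable z approaches a factorizable distribution
• Applying the method of Lagrange multipliers turns the constrained problem into a VAE with one extra parameter, β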
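The objective itself was an image on the slide. In the β-VAE paper it reads E_q(z|x)[log p(x|z)] − β·KL(q(z|x) || p(z)). The following is a minimal NumPy sketch of the corresponding loss (the negative objective), assuming a diagonal Gaussian posterior and a standard normal prior; the function names and the toy numbers are illustrative, not taken from the slides.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def beta_vae_loss(recon_nll, mu, logvar, beta=4.0):
    """Negative beta-VAE objective, averaged over the batch.

    recon_nll : per-sample reconstruction negative log-likelihood, shape (B,)
    mu, logvar: parameters of q(z|x), shape (B, D)
    beta      : weight on the KL term (beta = 1 recovers the ordinary VAE)
    """
    return float(np.mean(recon_nll + beta * gaussian_kl(mu, logvar)))

# Toy batch (made-up numbers): 2 samples, 3 latent dimensions.
mu = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
logvar = np.zeros((2, 3))
recon_nll = np.array([10.0, 12.0])

loss_b1 = beta_vae_loss(recon_nll, mu, logvar, beta=1.0)  # ordinary VAE
loss_b4 = beta_vae_loss(recon_nll, mu, logvar, beta=4.0)  # stronger pull toward the prior
```

A larger β penalizes the same posterior more heavily, which is the source of the reconstruction trade-off discussed on the following slides.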

15.

β-VAE: LEARNING BASIC VISUAL CONCEPTS WITH A CONSTRAINED VARIATIONAL FRAMEWORK (ICLR2017) Irina Higgins et al.

16.

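Problem with β-VAE: the trade-off controlled by β
Figure excerpted from “Understanding disentangling in β-VAE”
• With β=150, reconstruction does not work well
• Simply pushing toward a Gaussian flattens the distribution q(z|x) (different z come to overlap)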

17.

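Problem with β-VAE: the trade-off controlled by β
Excerpted from “Disentangling by Factorising”
• The KL term decomposes into the mutual information between x and z and the KL between q(z) and p(z)
• Naturally, reconstruction is impossible unless the mutual information is preserved => we want to constrain only KL(q(z)||p(z))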
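Written out, the decomposition referred to here is the standard identity below, where q(z) = E_{p(x)}[q(z|x)] is the aggregate posterior and I_q(x; z) is the mutual information under the joint p(x) q(z|x):

```latex
\mathbb{E}_{p(x)}\bigl[\mathrm{KL}\bigl(q(z \mid x)\,\|\,p(z)\bigr)\bigr]
  \;=\; I_q(x; z) \;+\; \mathrm{KL}\bigl(q(z)\,\|\,p(z)\bigr)
```

Increasing β therefore penalizes both terms at once, including the mutual information that reconstruction depends on.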

18.

Papers addressing this problem
[Burgess+, NIPS2017] “Understanding disentangling in β-VAE”
[Kim+, ICML2018] “Disentangling by Factorising”
[Chen+, NIPS2018] “Isolating Sources of Disentanglement in VAE”

19.

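Understanding disentangling in β-VAE (NIPS2017) Christopher P. Burgess et al.
Controlled Capacity Increase β-VAE (CCI-VAE)
Drive the KL term toward a target C (this relaxes the information bottleneck on z)
• C: target amount of information
• C is increased gradually during training
• (z is allowed to acquire progressively more information)
• In the experiments C is increased linearly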
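A minimal sketch of this idea, assuming the loss form reconstruction + γ·|KL − C| with a linearly increasing C. The default values for `gamma`, `c_max`, and `anneal_steps` are illustrative placeholders, not settings stated on the slides.

```python
def capacity(step, c_max=25.0, anneal_steps=100_000):
    """Target capacity C, increased linearly from 0 to c_max and then held."""
    return c_max * min(step / anneal_steps, 1.0)

def cci_vae_loss(recon_nll, kl, step, gamma=1000.0, c_max=25.0, anneal_steps=100_000):
    """recon_nll + gamma * |KL - C(step)| for one sample (or a batch mean)."""
    return recon_nll + gamma * abs(kl - capacity(step, c_max, anneal_steps))

# Halfway through annealing the target is c_max / 2, so a KL of exactly
# 12.5 incurs no penalty, while a KL of 0 is penalized for using too little capacity.
loss_on_target = cci_vae_loss(recon_nll=10.0, kl=12.5, step=50_000)
loss_collapsed = cci_vae_loss(recon_nll=10.0, kl=0.0, step=50_000)
```

Unlike a plain β penalty, the absolute value penalizes a KL that is too small as well as one that is too large, which is what lets z take on more information as C grows.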

20.

Understanding disentangling in β-VAE (NIPS2017) Christopher P. Burgess et al.

21.

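Isolating Sources of Disentanglement in VAE (NIPS2018) Ricky T. Q. Chen et al.
Annotations on the decomposed KL term:
• Mutual information between x and z: 0 when x and z are independent (bad if it becomes small)
• Independence among the z_i (called Total Correlation) => we want this to be small
※ q(z) is estimated by importance sampling
※ β-TCVAE: set α=γ=1 and increase only β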
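The decomposition being annotated was an image on the slide. Assuming a factorized prior p(z) = ∏_j p(z_j), it can be written as below; the three terms are the mutual information, the total correlation, and the dimension-wise KL, and α, β, γ are the weights mentioned in the notes. The paper indexes the mutual information by data-point index; x is used here to match the slide.

```latex
\mathbb{E}_{p(x)}\bigl[\mathrm{KL}\bigl(q(z \mid x)\,\|\,p(z)\bigr)\bigr]
  \;=\; I_q(x; z)
  \;+\; \mathrm{KL}\Bigl(q(z)\,\Big\|\,\prod_{j} q(z_j)\Bigr)
  \;+\; \sum_{j} \mathrm{KL}\bigl(q(z_j)\,\|\,p(z_j)\bigr)

\mathcal{L}_{\beta\text{-TC}}
  \;=\; \mathbb{E}_{q(z \mid x)\,p(x)}\bigl[\log p(x \mid z)\bigr]
  \;-\; \alpha\, I_q(x; z)
  \;-\; \beta\, \mathrm{KL}\Bigl(q(z)\,\Big\|\,\prod_{j} q(z_j)\Bigr)
  \;-\; \gamma \sum_{j} \mathrm{KL}\bigl(q(z_j)\,\|\,p(z_j)\bigr)
```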

22.

Isolating Sources of Disentanglement in VAE (NIPS2018) Ricky T. Q. Chen et al.

23.

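Disentangling by Factorising (ICML2018) Hyunjik Kim and Andriy Mnih
Labels on the objective: the ordinary VAE term; the Total Correlation term
• How do we obtain q(z)?
• MCMC and the like is cumbersome (and the distribution is multimodal to begin with) => Density Ratio Trick (see figure)
https://www.slideshare.net/DeepLearningJP2016/dldisentangling-by-factorising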
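A rough sketch of the two ingredients of the density ratio trick, assuming a discriminator that outputs the probability that a latent vector came from q(z) rather than from the product of its marginals. The discriminator itself is not shown; `permute_dims` and `tc_from_discriminator` are illustrative names.

```python
import numpy as np

def permute_dims(z, rng):
    """Shuffle each latent dimension independently across the batch.

    If the rows of z are samples from q(z), the rows of the result are
    approximately samples from the product of marginals prod_j q(z_j).
    """
    z = np.asarray(z)
    out = np.empty_like(z)
    for j in range(z.shape[1]):
        out[:, j] = z[rng.permutation(z.shape[0]), j]
    return out

def tc_from_discriminator(d_prob):
    """Estimate TC as E_q[ log D(z) - log(1 - D(z)) ].

    d_prob holds D(z) for samples z ~ q(z), where D(z) is the probability
    assigned to "z came from q(z)" rather than from prod_j q(z_j).
    """
    d_prob = np.asarray(d_prob)
    return float(np.mean(np.log(d_prob) - np.log1p(-d_prob)))

rng = np.random.default_rng(0)
z = np.arange(12.0).reshape(4, 3)      # toy batch: 4 samples, 3 latent dims
z_perm = permute_dims(z, rng)          # "fake" samples for the discriminator
tc_uninformed = tc_from_discriminator([0.5, 0.5, 0.5])  # D cannot tell them apart
```

The discriminator would be trained to separate `z` from `z_perm`; when it cannot (D = 0.5 everywhere), the estimated total correlation is zero.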


24.

Disentangling by Factorising (ICML2018) Hyunjik Kim and Andriy Mnih

25.

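Summary so far
• Disentanglement matters
• Representative method 1: InfoGAN
– Difficulties inherited from GANs (optimization, no inference network)
• Representative method 2: β-VAE
– Trade-off between reconstruction and disentanglement => a variety of follow-up work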

26.

27.

What is Lifelong Learning (Continual Learning)?
• Aspect 1: “The ability to acquire new knowledge from a sequence of experiences to solve progressively more tasks, while maintaining performance on previous ones”
• Aspect 2: “The ability to sensibly reuse previously learnt representations in new domains”
• In short: acquire the knowledge needed to solve tasks that arrive one after another, quickly and without forgetting past information

28.

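Why is Lifelong Learning Important?
• Scientific: humans do this too (again)
– It looks like intelligence
– It points toward general-purpose AI
• Engineering: unless past knowledge can be put to good use, large amounts of data will be needed forever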

29.

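Proposal: Disentanglement for Lifelong Learning
• Sequences of tasks that arise in the real world should share some factors
– i.e., the laws of physics and chemistry stay the same
• Given a disentangled representation that describes each task minimally (plus a means of deciding which factors are useful for each task), could a wide range of tasks be solved without forgetting?
Disentanglement Prior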

30.

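Difficulty
• β-VAE (or ordinary disentanglement) assumes that the data distribution and generative process do not change
• This is clearly false in lifelong learning (because the task changes)
• => Extend β-VAE to lifelong learning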

31.

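Concrete method: assumptions about the data distribution
• S = {s_1, s_2, s_3, …, s_K}: K environments (tasks)
• Z = {z_1, z_2, z_3, …, z_N}: data generative factors shared across all environments
• Z^s ⊆ Z: the latent factors relevant to environment s
• a^s: a^s_n = 1 if z_n ∈ Z^s (0 otherwise)
• Generative process: x^s ~ p(·|z^s, s)
– That is, the data of environment s are generated from the factors z^s relevant to that environment
[Figure: illustration of the assumptions about the data distribution]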

32.

Reparameterization
Variational Autoencoder with Shared Embeddings (VASE)
Reference: Controlled Capacity Increase β-VAE (CCI-VAE)
• Almost identical to CCI-VAE
• However, modeling q(z^s|x^s) and estimating s are nontrivial

33.

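q(z|s)
• For each z that is (believed to be) part of the generative process of x: an ordinary VAE posterior
• For each z believed not to be part of it: simply the prior
• Here a^s is set by the following criterion: (1) 1 if the atypicality score is below a threshold, (2) 0 otherwise
※ atypicality = any state that is not typical
• Atypicality score: the KL between the average behavior of a given z in a given s and the prior
(Intuition: if a z is part of the generative process of x, then as training progresses it should, on average, approach the prior)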
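The masking rule above can be sketched as follows. This is only an illustration under stated assumptions: the batch-averaged posterior for each dimension is approximated by a moment-matched Gaussian (a simplification introduced here, since the average of Gaussians is a mixture), the prior is N(0, 1), and `threshold` is a hypothetical value.

```python
import numpy as np

def atypicality(mu, logvar):
    """Per-dimension KL( N(m, v) || N(0, 1) ) of the batch-averaged posterior.

    mu, logvar: parameters of q(z|x) for a batch from one environment, shape (B, D).
    (m, v) are the mean and variance of the batch mixture (moment matching).
    """
    var = np.exp(logvar)
    m = mu.mean(axis=0)
    v = (var + mu**2).mean(axis=0) - m**2
    return 0.5 * (v + m**2 - 1.0 - np.log(v))

def latent_mask(mu, logvar, threshold=0.5):
    """a_n = 1 if the atypicality of dimension n is below the threshold, else 0."""
    return (atypicality(mu, logvar) < threshold).astype(int)

# Toy batch (made-up numbers): dimension 0 spreads over the prior on average,
# dimension 1 sits far from it for every sample.
mu = np.array([[-0.8, 3.0], [0.8, 3.0], [-0.8, 3.0], [0.8, 3.0]])
logvar = np.log(np.array([[0.36, 0.01]] * 4))
scores = atypicality(mu, logvar)
mask = latent_mask(mu, logvar)
```

In this toy case dimension 0 is kept (mask 1) and dimension 1 is masked out (mask 0), so only dimension 0 would use the encoder's posterior while dimension 1 falls back to the prior.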

34.

Estimating s

35.

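Countermeasure against catastrophic forgetting: hallucinating
• Forgetting past information is a problem
• Periodically ensure that samples generated from a past snapshot are still modeled correctly by the current version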
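A toy sketch of this idea, assuming the encoder and decoder are plain callables and using squared error as a stand-in distance. The slide does not state the distance measures, and the conditioning on the environment s is omitted here; all names and the linear toy models are illustrative.

```python
import numpy as np

def dreaming_loss(enc_old, dec_old, enc_new, dec_new, z_samples):
    """Penalize drift of the current model on samples 'dreamed' by a snapshot."""
    x_dream = dec_old(z_samples)  # hallucinate data with the old snapshot
    # The current encoder should map dreamed data to the same latents as before.
    enc_term = np.mean((enc_new(x_dream) - enc_old(x_dream)) ** 2)
    # The current decoder should still reproduce what the snapshot generated.
    dec_term = np.mean((dec_new(z_samples) - x_dream) ** 2)
    return float(enc_term + dec_term)

# Toy linear "networks": latent dimension 2, data dimension 5.
rng = np.random.default_rng(0)
W_dec = rng.normal(size=(2, 5))
W_enc = rng.normal(size=(5, 2))
dec_old = lambda z: z @ W_dec
enc_old = lambda x: x @ W_enc
dec_drifted = lambda z: z @ (W_dec + 0.1)  # current decoder after some drift

z = rng.normal(size=(8, 2))
loss_same = dreaming_loss(enc_old, dec_old, enc_old, dec_old, z)
loss_drift = dreaming_loss(enc_old, dec_old, enc_old, dec_drifted, z)
```

Adding such a term to the training loss discourages the current model from drifting on data the snapshot could generate, without storing any past data.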

36.

Overall algorithm

37.

Experiment 1: VASE vs. CCI-VAE

38.

Experiment 2: Ablation Study

39.

Experiment 3: Dealing with ambiguity

40.

Experiment 4: Semantic Transfer

41.

Experiment 5: Imagination-driven exploration

42.

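Summary
• β-VAE is extended into a form suited to lifelong learning
– Ordinary β-VAE does not account for a changing data distribution
– Concretely, the model is trained under the assumption that multiple environments share generative factors
• Catastrophic forgetting is avoided by dreaming
• Effectiveness is confirmed through extensive experiments
– See the paper for details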