[DL Reading Group] "Masked Siamese Networks for Label-Efficient Learning"

May 06, 22

#Deep Learning #Masked Siamese Networks #Self-supervised Learning #Transfer Learning #Label-efficient Learning

Slide overview

2022/05/06
Deep Learning JP:
http://deeplearning.jp/seminar-2/

Deep Learning JP

@DeepLearning2023

Text of each page

DEEP LEARNING JP [DL Papers] “Masked Siamese Networks for Label-Efficient Learning” Naoki Nonaka http://deeplearning.jp/ 2022/5/5 1

Bibliographic information
• Venue: ?
• Authors: Meta AI

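Overview
• Proposes Masked Siamese Networks (MSN), a self-supervised learning method
• Novelty: trains the representation of a randomly masked set of patches to match the representation of the unmasked original image
• Achieves SOTA among self-supervised learning methods on low-shot image learning tasks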

Background: Mask-denoising + Joint-embedding
• Mask-denoising [1]
• Joint-embedding [2]

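Background: Mask-denoising + Joint-embedding
• Mask-denoising
  • Strong performance in vision
  • Requires reconstruction at the pixel or token level
• Joint-embedding
  • No reconstruction required
Proposes a method that performs self-supervised learning without reconstruction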

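Proposed method: Masked Siamese Networks (MSN)
Conceptual diagram of the proposed method: MSN
• Masked prediction + Joint-embedding
• Difference from prior work: trains the masked view's representation to move toward the representation of the unmasked data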

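Proposed method: MSN (training procedure)
1. Transform the input image into two views with random data augmentation (anchor & target)
2. Apply a random mask to the anchor (the target is left unmasked)
3. For both anchor and target, compute a soft distribution over a set of prototypes and train on it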
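The three steps above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: `toy_encoder` is a hypothetical stand-in for the ViT encoder, the same toy weights are shared by both branches for simplicity, additive noise stands in for real data augmentation, and the sizes, the masking ratio, and both temperature values are assumed for the example.

```python
import numpy as np

def random_patch_mask(num_patches, mask_ratio, rng):
    """Return sorted indices of the patches kept after random masking."""
    num_keep = max(1, int(round(num_patches * (1.0 - mask_ratio))))
    return np.sort(rng.choice(num_patches, size=num_keep, replace=False))

def l2_normalize(x, eps=1e-12):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def soft_assignment(z, prototypes, temperature):
    """Softmax over cosine similarities between embeddings and prototypes."""
    logits = l2_normalize(z) @ l2_normalize(prototypes).T / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)

def toy_encoder(patches, weight):
    """Hypothetical encoder: mean-pool the patch vectors, then project linearly."""
    return patches.mean(axis=0) @ weight

rng = np.random.default_rng(0)
num_patches, patch_dim, embed_dim, num_prototypes = 196, 32, 16, 8

# Step 1: two "augmented" views of one image (noise stands in for augmentation).
image = rng.normal(size=(num_patches, patch_dim))
anchor_view = image + 0.1 * rng.normal(size=image.shape)
target_view = image + 0.1 * rng.normal(size=image.shape)

# Step 2: mask only the anchor by dropping patches; the target stays intact.
keep = random_patch_mask(num_patches, mask_ratio=0.5, rng=rng)
anchor_patches = anchor_view[keep]

# Step 3: soft distributions over the prototypes for both views.
weight = rng.normal(size=(patch_dim, embed_dim))
prototypes = rng.normal(size=(num_prototypes, embed_dim))
z_anchor = toy_encoder(anchor_patches, weight)[None, :]
z_target = toy_encoder(target_view, weight)[None, :]
p_anchor = soft_assignment(z_anchor, prototypes, temperature=0.1)    # assumed value
p_target = soft_assignment(z_target, prototypes, temperature=0.025)  # assumed, sharper
```

Training would then push `p_anchor` toward `p_target`, so the masked view is matched in prototype space rather than reconstructed pixel by pixel.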

Proposed method: MSN (training procedure)
Loss function = cross-entropy term + regularization term (mean entropy maximization; ME-MAX)
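The slide's formula did not survive extraction, so the following is only a sketch of a loss with the two named parts: a cross-entropy between target and anchor predictions, minus a weighted entropy of the mean anchor prediction. The function name `msn_style_loss`, the weight `lam`, and the toy inputs are assumptions for illustration.

```python
import numpy as np

def msn_style_loss(p_anchor, p_target, lam=1.0, eps=1e-12):
    """Return (total, cross_entropy, mean_entropy) for (N, K) prediction arrays.

    cross_entropy: H(p_target, p_anchor), averaged over the N anchor views.
    mean_entropy:  entropy of the anchor prediction averaged over the batch;
                   subtracting it rewards using many prototypes (ME-MAX).
    """
    cross_entropy = -(p_target * np.log(p_anchor + eps)).sum(axis=-1).mean()
    p_mean = p_anchor.mean(axis=0)
    mean_entropy = -(p_mean * np.log(p_mean + eps)).sum()
    return cross_entropy - lam * mean_entropy, cross_entropy, mean_entropy

# Four views spread over four prototypes vs. four views collapsed onto one.
p_spread = np.eye(4)
p_collapsed = np.tile([1.0, 0.0, 0.0, 0.0], (4, 1))

total_spread, ce_spread, ent_spread = msn_style_loss(p_spread, p_spread)
total_collapsed, ce_collapsed, ent_collapsed = msn_style_loss(p_collapsed, p_collapsed)
```

In both toy cases the cross-entropy term is essentially zero, so only the regularizer separates them: the collapsed solution has near-zero mean entropy and therefore a higher total loss than the spread one.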

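Experiment overview
• Label-efficient learning
• Linear Evaluation & Fine-tuning
• Transfer Learning
• Ablations
Goal: show that the learned representations are good, as measured by performance with few labels, with a linear classifier, after fine-tuning, and in transfer learning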

Experiment overview
• Label-efficient learning
  • Extreme Low-shot
  • ImageNet-1K
• Linear Evaluation & Fine-tuning
• Transfer Learning
• Ablations

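Experiment: results in the extreme low-shot setting
• Dataset: ImageNet-1K
• The pre-trained weights are frozen; only a linear classifier is trained and evaluated
• Training uses a very small number of labeled images (1, 2, or 5 per class)
• Evaluated by mean Top-1 accuracy over 3 runs
MSN achieved the highest accuracy → it learns features that can solve classification even with few labels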
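The evaluation protocol above (frozen features, k labeled examples per class, a linear classifier, mean Top-1 over 3 runs) can be sketched as below. This is an illustrative setup only: synthetic Gaussian clusters stand in for frozen encoder features, and the gradient-descent logistic regression, its hyperparameters, and all function names are assumptions rather than the paper's exact recipe.

```python
import numpy as np

def sample_k_per_class(labels, k, rng):
    """Pick k training indices for every class."""
    picks = [rng.choice(np.flatnonzero(labels == c), size=k, replace=False)
             for c in np.unique(labels)]
    return np.concatenate(picks)

def fit_linear_probe(feats, labels, num_classes, l2=1e-3, lr=0.5, steps=200):
    """Multinomial logistic regression on frozen features (plain gradient descent)."""
    w = np.zeros((feats.shape[1], num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(steps):
        logits = feats @ w + b
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / len(feats)
        w -= lr * (feats.T @ grad + l2 * w)
        b -= lr * grad.sum(axis=0)
    return w, b

def low_shot_top1(train_x, train_y, test_x, test_y, k, num_classes, seeds=(0, 1, 2)):
    """Mean Top-1 accuracy over several random draws of the k-shot training set."""
    accs = []
    for seed in seeds:
        idx = sample_k_per_class(train_y, k, np.random.default_rng(seed))
        w, b = fit_linear_probe(train_x[idx], train_y[idx], num_classes)
        accs.append(((test_x @ w + b).argmax(axis=1) == test_y).mean())
    return float(np.mean(accs))

# Synthetic "frozen features": one well-separated Gaussian cluster per class.
rng = np.random.default_rng(0)
num_classes, dim, per_class = 5, 16, 50
centers = 10.0 * np.eye(num_classes, dim)

def make_split(n):
    y = np.repeat(np.arange(num_classes), n)
    return centers[y] + rng.normal(size=(len(y), dim)), y

train_x, train_y = make_split(per_class)
test_x, test_y = make_split(per_class)
results = {k: low_shot_top1(train_x, train_y, test_x, test_y, k, num_classes)
           for k in (1, 2, 5)}
```

Because the encoder is never updated here, any accuracy differences between methods in this protocol reflect the quality of the pre-trained features rather than the classifier.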

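Experiment: results on 1% ImageNet-1K
• Dataset: ImageNet-1K
• Uses 1% of the labeled data for each class (about 10 images)
• Results for MSN (proposed method), DINO, and iBOT are without fine-tuning
• Outperforms SimCLRv2, which has more parameters
• Without fine-tuning, substantially outperforms models of comparable size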

Experiment overview
• Label-efficient learning
• Linear Evaluation & Fine-tuning
  • Linear Evaluation
  • Fine-Tuning
• Transfer Learning
• Ablations

Experiment: results using all labels
• Linear Evaluation → only a linear classifier is trained
• Fine-Tuning → the entire network is retrained
Achieves near-SOTA performance in both settings

Experiment overview
• Label-efficient learning
• Linear Evaluation & Fine-tuning
• Transfer Learning
  • Fine-Tuning Transfer Learning
  • Linear Evaluation Transfer Learning
• Ablations

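Experiment: transfer learning performance of the learned representations
• Linear Evaluation → only a linear classifier is trained on the target task
• Fine-Tuning → the entire network is trained on the target task
Achieves performance comparable to DINO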

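Conclusion and summary
• Proposes Masked Siamese Networks (MSN), a self-supervised learning method
• Trains the representation of a randomly masked set of patches to match the representation of the unmasked original image
• In experiments on few-label classification, linear classification on the learned features, fine-tuning, and transfer learning, MSN outperformed or matched existing methods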

References
1. Masked Autoencoders Are Scalable Vision Learners
2. Exploring Simple Siamese Representation Learning

Appendix

Experiment: summary of low-shot learning results
• 1% ImageNet-1K
• Extreme Low-shot Evaluation

Experiment overview
• Ablations
  • Combining Random and Focal Masking
  • Random Masking ratio
  • Augmentation Invariance and Low-shot Learning
  • Random Masking Compute and Memory

Experiment: ablation study
• Masking strategy → combining Random and Focal masking gives the best results
• Masking ratio → the larger the architecture, the higher the optimal masking ratio
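To make the two masking strategies concrete, here is a small sketch on a grid of patches. It assumes that random masking drops patches uniformly at random and that focal masking keeps one contiguous window and masks everything around it. The grid size, masking ratio, and window size are illustrative values, not settings taken from the slides.

```python
import numpy as np

def random_mask(grid, mask_ratio, rng):
    """Boolean keep-mask of shape (grid, grid) with patches dropped at random."""
    num = grid * grid
    num_keep = max(1, int(round(num * (1.0 - mask_ratio))))
    keep = np.zeros(num, dtype=bool)
    keep[rng.choice(num, size=num_keep, replace=False)] = True
    return keep.reshape(grid, grid)

def focal_mask(grid, block, rng):
    """Boolean keep-mask that keeps one contiguous block x block window."""
    top = rng.integers(0, grid - block + 1)   # upper bound is exclusive
    left = rng.integers(0, grid - block + 1)
    keep = np.zeros((grid, grid), dtype=bool)
    keep[top:top + block, left:left + block] = True
    return keep

rng = np.random.default_rng(0)
grid = 14  # e.g. a 224x224 image split into 16x16 patches

# Combining both strategies: one anchor view of each kind from the same image.
random_keep = random_mask(grid, mask_ratio=0.7, rng=rng)
focal_keep = focal_mask(grid, block=6, rng=rng)
```

A random mask leaves scattered patches across the whole image, while a focal mask leaves a single local region, so the two kinds of anchor views expose the encoder to different types of missing context.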

Experiment: ablation study
• Effect of augmentation
• Masking and compute efficiency
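As a rough illustration of why masking can save compute, the sketch below counts how many patch tokens remain when masked patches are dropped before the encoder. It assumes a ViT-style encoder whose per-token layers scale linearly with the token count and whose attention matrix scales quadratically. The helper names are hypothetical, and the class token and other constant overheads are ignored.

```python
def tokens_after_masking(image_size, patch_size, mask_ratio):
    """Number of patch tokens left when masked patches are dropped."""
    num_patches = (image_size // patch_size) ** 2
    return max(1, round(num_patches * (1.0 - mask_ratio)))

def relative_cost(kept_tokens, total_tokens):
    """Cost relative to the unmasked input under the stated scaling assumptions."""
    frac = kept_tokens / total_tokens
    return {"linear_layers": frac, "attention_matrix": frac ** 2}

total = tokens_after_masking(224, 16, mask_ratio=0.0)
kept = tokens_after_masking(224, 16, mask_ratio=0.5)
cost = relative_cost(kept, total)
```

Under these assumptions, halving the number of tokens halves the cost of the per-token layers and quarters the size of the attention matrix.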