Introduction to ai-edge-torch

October 31, 2024

#ai-edge-torch #PyTorch #TFLite #model-conversion #quantization

Slide overview

https://github.com/motokimura/timm2tflite

Motoki Kimura

@motokimura

Text of each slide

Introduction to ai-edge-torch
2024.10.30
Motoki Kimura, GO Inc.

ai-edge-torch overview
▪ https://github.com/google-ai-edge/ai-edge-torch
▪ An open-source tool that converts PyTorch models to TFLite (TensorFlow Lite)
▪ Development is led by Google (google-ai-edge)
▪ TFLite was renamed LiteRT in September 2024, but this deck uses the name TFLite throughout
▪ This deck assumes ai-edge-torch v0.2.0

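Advantages over existing conversion tools
▪ Most existing tools convert via ONNX (onnx2tf, onnx-tensorflow)
▪ ai-edge-torch does not go through ONNX, which improves conversion coverage
▪ Inference speed is also on par with onnx2tf
Table: conversion coverage compared with existing tools (from "AI Edge Torch: High Performance Inference of PyTorch Models on Mobile Devices")
Table: inference speed compared with existing tools (from the same article)
The article states that the tool has been "validated on over 70 models from torchvision, timm, torchaudio, and HuggingFace", but it does not clearly say which models.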

Usage (environment setup)
1) Run pip install as described in the README
2) Add the Python path to LD_LIBRARY_PATH
3) On a GPU machine, set the environment variable PJRT_DEVICE=CPU
Reference: Error during convertion custom LSTM model to edge model · Issue #145 · google-ai-edge/ai-edge-torch · GitHub
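
Steps 2 and 3 can be sketched in Python as follows. This is a minimal sketch under assumptions: it assumes a Linux environment where the directory to add is the shared-library directory of the running Python installation, and the helper name `ld_library_path_export` is hypothetical. Because the dynamic loader reads LD_LIBRARY_PATH at process start, the helper only prints the `export` line to run in the shell beforehand.

```python
import os
import sys
import sysconfig


def ld_library_path_export() -> str:
    """Build a shell `export` line that prepends Python's lib directory.

    LD_LIBRARY_PATH has to be set in the shell before Python starts, so this
    function only produces the command text.
    """
    # LIBDIR is where libpython lives on typical Linux builds. Falling back to
    # <sys.prefix>/lib is an assumption for builds that do not define it.
    lib_dir = sysconfig.get_config_var("LIBDIR") or os.path.join(sys.prefix, "lib")
    return f"export LD_LIBRARY_PATH={lib_dir}:$LD_LIBRARY_PATH"


# Step 3: ask the PJRT runtime to use the CPU even on a GPU machine.
# Setting it before importing ai_edge_torch is the safe order.
os.environ["PJRT_DEVICE"] = "CPU"

print(ld_library_path_export())
```

Running the printed line in the shell (or adding it to the shell profile) before starting Python covers step 2.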

Usage (example of PyTorch→TFLite conversion)
▪ Just pass the model and dummy inputs to ai_edge_torch.convert()
▪ The result can be saved as a .tflite file with model.export()
https://github.com/google-ai-edge/ai-edge-torch/blob/v0.2.0/README.md#pytorch-converter
A notebook that runs on Colab is also available
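
The two calls above can be sketched as follows, modeled on the linked README. The small `torch.nn.Sequential` network, the function name `convert_and_export`, and the output file name are hypothetical stand-ins chosen for illustration; imports are deferred into the function so that defining it does not require torch or ai_edge_torch to be installed.

```python
def convert_and_export(tflite_path: str = "tiny_cnn.tflite"):
    """Convert a small PyTorch model to TFLite and save it to `tflite_path`."""
    import torch
    import ai_edge_torch

    # Hypothetical stand-in model; replace it with the model to convert.
    model = torch.nn.Sequential(
        torch.nn.Conv2d(3, 4, kernel_size=3, padding=1),
        torch.nn.ReLU(),
        torch.nn.Flatten(),
        torch.nn.Linear(4 * 8 * 8, 10),
    ).eval()

    # Dummy inputs are passed as a tuple of positional arguments.
    sample_inputs = (torch.randn(1, 3, 8, 8),)

    # Convert the PyTorch model into an edge model.
    edge_model = ai_edge_torch.convert(model, sample_inputs)

    # The converted model is callable, which allows a quick sanity check.
    edge_output = edge_model(*sample_inputs)

    # Save the converted model as a .tflite file.
    edge_model.export(tflite_path)
    return edge_output
```

Calling `convert_and_export()` in an environment with ai-edge-torch installed should leave a `tiny_cnn.tflite` file in the working directory.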

Usage (side note)
▪ After ai_edge_torch.convert(), the model is already in a state that runs on the TFLite runtime
▪ An exported .tflite file can be loaded with ai_edge_torch.load()
▪ Looking at https://github.com/google-ai-edge/ai-edge-torch/blob/v0.2.0/ai_edge_torch/model.py#L56-L142, edge_model (the TfLiteModel class) is implemented on top of tensorflow.lite.Interpreter (TFLite's existing Python API)
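
Both ways of running an exported file can be sketched as below. The function names and the single-input, single-output assumption are mine, not from the slides; the second function uses only the standard `tf.lite.Interpreter` calls, which is roughly what the slide says the edge model wraps.

```python
def run_with_edge_model(tflite_path, sample_inputs):
    """Load a .tflite file through ai_edge_torch and run it."""
    import ai_edge_torch

    edge_model = ai_edge_torch.load(tflite_path)
    return edge_model(*sample_inputs)


def run_with_interpreter(tflite_path, input_array):
    """Run the same file with the plain TFLite Python API.

    Assumes a model with exactly one input and one output tensor, and an
    `input_array` (NumPy) whose shape and dtype match the model input.
    """
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()

    input_detail = interpreter.get_input_details()[0]
    output_detail = interpreter.get_output_details()[0]

    interpreter.set_tensor(input_detail["index"], input_array)
    interpreter.invoke()
    return interpreter.get_tensor(output_detail["index"])
```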

Experiment: converting timm models to TFLite
▪ Arbitrarily chosen models were evaluated on 5,000 images randomly sampled from ImageNet val
▪ For some reason, efficientformerv2 loses most of its accuracy 🤔

| Model (from timm v1.0.9) | top-1 (PT) [%] | top-1 (TFLite) [%] |
|---|---|---|
| resnet18.a1_in1k | 73.94 | 73.94 |
| convnextv2_tiny.fcmae_ft_in22k_in1k | 84.82 | 84.82 |
| tf_efficientnetv2_s.in21k_ft_in1k | 84.42 | 84.42 |
| efficientnet_lite0.ra_in1k | 75.44 | 75.44 |
| mobilenetv4_conv_small.e2400_r224_in1k | 75.06 | 75.06 |
| mobilenetv4_hybrid_medium.e200_r256_in12k_ft_in1k | 83.04 | 83.04 |
| maxvit_small_tf_224.in1k | 84.54 | 84.54 |
| efficientformerv2_s0.snap_dist_in1k | 76.02 | 2.82 |

Implementation: https://github.com/motokimura/timm2tflite
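
The top-1 metric used in this comparison can be sketched in plain Python. This is an illustrative helper rather than the evaluation code from the linked repository, and the logits and labels below are made-up toy values, not data from the experiment.

```python
from typing import Sequence


def top1_accuracy(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the top-1 accuracy in percent."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(logits, labels):
        # Index of the largest logit is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy outputs for the same four samples from a PyTorch model and its TFLite
# counterpart (made-up numbers for illustration only).
pt_logits = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.1], [0.0, 0.1, 0.9], [0.3, 0.2, 0.1]]
tflite_logits = [[0.1, 1.9, 0.3], [1.4, 0.2, 0.1], [0.0, 0.2, 0.8], [0.1, 0.4, 0.2]]
labels = [1, 0, 2, 0]

print("PT top-1:", top1_accuracy(pt_logits, labels))
print("TFLite top-1:", top1_accuracy(tflite_logits, labels))
```

Evaluating both models on the same images with the same metric is what makes a gap such as the efficientformerv2 row attributable to the conversion itself.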

How is PyTorch converted to TFLite?

The PyTorch→TFLite conversion flow in ai-edge-torch
PyTorch → PyTorch ExportedProgram → StableHLO → TFLite
https://github.com/google-ai-edge/ai-edge-torch/blob/v0.2.0/ai_edge_torch/convert/conversion.py#L76-L117

10.

PyTorch → PyTorch ExportedProgram
▪ ExportedProgram is a new intermediate representation for running PyTorch models efficiently in other environments and backends
▪ It can be generated with torch.export.export(), which was introduced in PyTorch 2.1
▪ Internally, torch.export.export() uses TorchDynamo and torch.fx
https://github.com/google-ai-edge/ai-edge-torch/blob/v0.2.0/ai_edge_torch/convert/conversion.py#L90-L93
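
This first step can be tried in isolation. The sketch below exports a tiny, hypothetical module (`TinyNet`) with `torch.export.export()` and prints the captured graph; it only illustrates the API and is not the code path ai-edge-torch itself runs.

```python
def export_tiny_model():
    """Export a tiny module and return the resulting ExportedProgram."""
    import torch
    from torch.export import export

    class TinyNet(torch.nn.Module):  # hypothetical example model
        def __init__(self):
            super().__init__()
            self.linear = torch.nn.Linear(4, 2)

        def forward(self, x):
            return torch.relu(self.linear(x))

    model = TinyNet().eval()
    sample_args = (torch.randn(1, 4),)

    # Trace the model into an ExportedProgram.
    exported_program = export(model, sample_args)

    # The graph lists the operators captured during export.
    print(exported_program.graph_module.graph)
    return exported_program
```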

11.

PyTorch ExportedProgram → StableHLO
▪ StableHLO is a set of high-level operations (HLO) for ML models
▪ It provides a portable layer between different ML frameworks (TensorFlow, JAX, PyTorch) and between different ML compilers
▪ Within OpenXLA (an ML compiler ecosystem), it plays the role of an intermediate representation for ML models
▪ ai-edge-torch uses torch_xla for this conversion
  ▪ https://github.com/google-ai-edge/ai-edge-torch/blob/v0.2.0/ai_edge_torch/convert/conversion_utils.py
Figure source: Google Open Source Blog, "OpenXLA is available now to accelerate and simplify machine learning"
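
To look at what StableHLO text is like, torch_xla's public helper can be applied to an ExportedProgram. This is a sketch under assumptions: it uses `torch_xla.stablehlo.exported_program_to_stablehlo`, which may differ from the entry point ai-edge-torch calls internally, and the wrapper name `to_stablehlo_text` is hypothetical.

```python
def to_stablehlo_text(exported_program) -> str:
    """Lower an ExportedProgram to StableHLO and return its textual form."""
    from torch_xla.stablehlo import exported_program_to_stablehlo

    stablehlo_module = exported_program_to_stablehlo(exported_program)

    # "forward" is the method name torch_xla uses for the model's forward pass.
    return stablehlo_module.get_stablehlo_text("forward")
```

Printing the returned string for a small model shows the StableHLO operations that the exported graph is lowered to.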

12.

StableHLO → TFLite
▪ The StableHLO is mapped into TensorFlow and saved as a saved_model
▪ The TFLite model is then generated with tensorflow.lite.TFLiteConverter()
https://github.com/google-ai-edge/ai-edge-torch/blob/v0.2.0/ai_edge_torch/convert/conversion_utils.py#L333-L421
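
The final step resembles the ordinary SavedModel-to-TFLite conversion. The sketch below shows only that generic step with the standard TensorFlow API; the function name is hypothetical, and any converter options that ai-edge-torch sets internally are not reproduced here.

```python
def saved_model_to_tflite(saved_model_dir: str, tflite_path: str) -> int:
    """Convert a SavedModel directory to a .tflite file.

    Returns the size of the generated flatbuffer in bytes.
    """
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    tflite_bytes = converter.convert()

    with open(tflite_path, "wb") as f:
        f.write(tflite_bytes)
    return len(tflite_bytes)
```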

13.

Converting models quantized in PyTorch (PT2E) to TFLite

14.

Converting a model quantized in PyTorch to TFLite
▪ Models quantized with PyTorch 2 Export (PT2E) Quantization can also be converted to TFLite with ai-edge-torch
  ▪ TFLite has its own quantization API, but PyTorch's quantization offers more flexibility
  ▪ PyTorch also supports quantization-aware training (QAT)
▪ Note that PyTorch has three quantization modes (PT2E, FX Graph Mode, and Eager Mode), and ai-edge-torch supports only PT2E

15.

Converting a model quantized in PyTorch to TFLite
Example of quantizing a trained model in PyTorch and converting it to TFLite:
▪ After quantizing with PyTorch (PT2E), just call ai_edge_torch.convert()
https://github.com/google-ai-edge/ai-edge-torch/blob/main/docs/pytorch_converter/README.md#quantization
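
The quantize-then-convert flow can be sketched as below, following the structure of the linked README as I understand it for the v0.2.0 era (including `capture_pre_autograd_graph`, which newer PyTorch releases deprecate). The function name and the `calibration_batches` argument are hypothetical; the quantization flags are set to the static per-tensor setting used in the experiment in this deck.

```python
def quantize_and_convert(torch_model, sample_args, calibration_batches):
    """Quantize a model with PT2E (static, per-tensor) and convert it to TFLite.

    `calibration_batches` is an iterable of argument tuples used to populate
    the observers inserted by prepare_pt2e.
    """
    import ai_edge_torch
    from ai_edge_torch.quantize.pt2e_quantizer import (
        PT2EQuantizer,
        get_symmetric_quantization_config,
    )
    from ai_edge_torch.quantize.quant_config import QuantConfig
    from torch._export import capture_pre_autograd_graph
    from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

    quantizer = PT2EQuantizer().set_global(
        get_symmetric_quantization_config(is_per_channel=False, is_dynamic=False)
    )

    # Capture the graph and insert observers.
    graph_model = capture_pre_autograd_graph(torch_model, sample_args)
    prepared = prepare_pt2e(graph_model, quantizer)

    # Calibration: run representative inputs through the observed model.
    for batch in calibration_batches:
        prepared(*batch)

    # Replace observers with quantize/dequantize ops.
    quantized = convert_pt2e(prepared, fold_quantize=False)

    # Convert the quantized graph to an edge model.
    return ai_edge_torch.convert(
        quantized,
        sample_args,
        quant_config=QuantConfig(pt2e_quantizer=quantizer),
    )
```

The returned edge model can then be saved with `export()` in the same way as an fp32 model.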

16.

Experiment: quantizing timm models and converting them to TFLite
▪ Quantization conditions
  ▪ get_symmetric_quantization_config(is_dynamic=False, is_per_channel=False)
  ▪ post-training static quantization
  ▪ weight: per-tensor, symmetric, MinMaxObserver
  ▪ activation: per-tensor, affine, HistogramObserver
  ▪ 512 randomly selected train images were used for calibration
▪ Results
  ▪ efficientnet_lite0 loses a lot of accuracy in the conversion to TFLite (PyTorch int8 → TFLite int8) 🤔
  ▪ Since the fp32 conversion showed no accuracy difference between PyTorch and TFLite (the table on p.7, the earlier fp32 experiment), the quantization parameters may not be converted correctly

| Model (from timm v1.0.9) | top-1 (PT fp32) [%] | top-1 (PT int8) [%] | top-1 (TFLite int8) [%] |
|---|---|---|---|
| resnet18.a1_in1k | 73.94 | 68.70 | 67.96 |
| efficientnet_lite0.ra_in1k | 75.44 | 68.20 | 62.90 |

Implementation: https://github.com/motokimura/timm2tflite
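
The terms "symmetric" and "affine" in the conditions above can be made concrete with a small worked example in plain Python. This is a simplified sketch: it derives the ranges from min/max only (a HistogramObserver picks the activation range differently), it assumes the weight range [-127, 127] and the activation range [-128, 127], and all values are made up for illustration.

```python
def symmetric_params(values, qmax=127):
    """Symmetric quantization: zero_point is fixed at 0."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return scale, 0


def affine_params(values, qmin=-128, qmax=127):
    """Affine quantization: the zero_point shifts the range to fit the data."""
    lo = min(min(values), 0.0)  # the range must contain 0.0
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin, qmax):
    """Map real values to integers and clamp to [qmin, qmax]."""
    return [min(qmax, max(qmin, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate real values."""
    return [(q - zero_point) * scale for q in q_values]


# Toy weights (symmetric, per-tensor) and toy activations (affine, per-tensor).
weights = [-1.27, 0.64, 0.0, 1.0]
w_scale, w_zp = symmetric_params(weights)
w_q = quantize(weights, w_scale, w_zp, -127, 127)

activations = [0.0, 1.0, 2.55]
a_scale, a_zp = affine_params(activations)
a_q = quantize(activations, a_scale, a_zp, -128, 127)

print("weights:", w_q, "scale:", w_scale, "zero_point:", w_zp)
print("activations:", a_q, "scale:", a_scale, "zero_point:", a_zp)
```

A converter has to carry each tensor's scale and zero_point over to the target format, so a mismatch at that step would be consistent with the accuracy gap discussed above, although the slides do not confirm the cause.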

17.

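Experiment: quantizing timm models and converting them to TFLite
▪ Because the accuracy drop is large (already at the PyTorch int8 stage), try per-channel weight quantization
  ▪ get_symmetric_quantization_config(is_dynamic=False, is_per_channel=True)
▪ After TFLite conversion, for some reason every conv weight is transposed 🤔
▪ Possibly because of this, inference is very slow (slower than TFLite fp32)
▪ According to the issue, support is being considered, but for now the maintainers suggest using TFLite's existing quantization API or trying ai-edge-quantizer, a new quantization library for TFLite
▪ At present, quantizing weights per-channel with PyTorch (PT2E) quantization does not give good inference speed
Figure: part of the resnet18 graph after TFLite conversion (left: per-tensor quantization, right: per-channel quantization)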
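
Why per-channel weight quantization is expected to help accuracy can be illustrated with a toy calculation in plain Python. The two "channels" below are made-up values with very different magnitudes, and the helper names are hypothetical; the sketch covers only quantization error, not the TFLite speed issue described above.

```python
def quantize_dequantize(values, scale, qmax=127):
    """Symmetric round trip: quantize to [-qmax, qmax], then dequantize."""
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def max_abs_error(original, restored):
    return max(abs(o - r) for o, r in zip(original, restored))


def symmetric_scale(values, qmax=127):
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def per_tensor_error(channels):
    """One scale shared by all channels."""
    scale = symmetric_scale([v for ch in channels for v in ch])
    return max(max_abs_error(ch, quantize_dequantize(ch, scale)) for ch in channels)


def per_channel_error(channels):
    """A separate scale for each channel."""
    return max(
        max_abs_error(ch, quantize_dequantize(ch, symmetric_scale(ch)))
        for ch in channels
    )


# Channel 0 has large weights, channel 1 has small ones (toy values).
channels = [[12.7, -6.0, 3.3], [0.012, -0.05, 0.127]]

print("per-tensor max error:", per_tensor_error(channels))
print("per-channel max error:", per_channel_error(channels))
```

With a shared scale, the small-magnitude channel is rounded almost entirely to zero, whereas a per-channel scale keeps its resolution.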

18.

Limitations of ai-edge-torch

19.

Limitations of ai-edge-torch
▪ A model that cannot be converted to a PyTorch ExportedProgram cannot be converted
  ▪ See "Limitations of torch.export"
  ▪ Ultralytics YOLO fails to convert. Related issue:
    ▪ https://github.com/google-ai-edge/ai-edge-torch/issues/170
▪ If conversion fails at a stage after ExportedProgram, a tool for identifying the cause of the error is provided, and the maintainers ask that its output be pasted into an issue
  ▪ https://github.com/google-ai-edge/ai-edge-torch/blob/v0.2.0/docs/pytorch_converter/README.md#error-during-exportedprogram-to-edge-model-lowering
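
A quick way to tell which of the two cases applies is to try `torch.export` on its own first. The sketch below does that and, for the second case, calls the culprit finder in the way I understand the linked README to describe it (`ai_edge_torch.debug.find_culprits`); treat that usage and both function names as assumptions to check against the README.

```python
def check_exportable(model, sample_args):
    """Return None if torch.export succeeds, otherwise the raised exception."""
    from torch.export import export

    try:
        export(model, sample_args)
    except Exception as exc:  # torch.export can raise several exception types
        return exc
    return None


def print_first_culprit(model, sample_args):
    """Print code for the first failing piece found after the export stage.

    Assumption: find_culprits yields culprit objects that offer print_code(),
    as described in the linked README.
    """
    from ai_edge_torch.debug import find_culprits

    culprits = find_culprits(model, sample_args)
    culprit = next(culprits)
    culprit.print_code()
```

If `check_exportable` returns an exception, the problem lies in torch.export itself; otherwise the output of `print_first_culprit` is the kind of information the maintainers ask for in an issue.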

20.

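Summary
▪ ai-edge-torch is an open-source tool for converting PyTorch models to TFLite
▪ It uses PyTorch ExportedProgram and StableHLO as intermediate representations, which gives it better conversion coverage than existing tools that go through ONNX
▪ Models quantized in PyTorch (PT2E) can also be converted to TFLite, but caution is needed: at present, some models lose a lot of accuracy after conversion, and some quantization settings produce models with poor inference efficiency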