Cutting-Edge Insights: A Deep Dive into Gen-AI Tech and Trends. What's New and What's Next?

March 9, 2024

Slide overview

Lecture material for the Monodzukuri Study Group, Supercomputing Technology Industrial Application Council (2024-03-08)

Generative AI Study Group Master

Text of each slide
1.

Cutting-Edge Insights: A Deep Dive into Gen-AI Tech and Trends. What's New and What's Next? 8th Mar 2024. Kunihiro Sugiyama, a.k.a. Generative AI Study Group host, AI Technology Consortium @ AIST

2.

Agenda

3.

https://www.linkedin.com/in/kunihiro-sugiyama-49b0372a/

4.

Introduction • GASG brochure • http://tinyurl.com/ysjh5ua4 • Next session: Tuesday, March 12, from 18:00

5.

Agenda

6.

Theme • Title • Cutting-Edge Insights ▪ A Deep Dive into Gen-AI Tech and Trends. ▪ What's New and What's Next? • Contents • Tech trend, Use case, Issue

7.

Theme • "Multimodal AI" and "small language models": key generative-AI trends for 2024 (Forbes JAPAN) - Yahoo! News https://news.yahoo.co.jp/articles/6330194175339a101c2f6e8ad36b0c4109943939?page=1

8.

Tech trend

9.

Tech trend • Small model • Beyond Transformer • Related tech

10.

Tech trend • Small model • Open LLM Leaderboard - a Hugging Face Space by HuggingFaceH4 https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard • Points ▪ Compute requirements ▪ Runtime memory footprint ▪ Inference speed ▪ Domain-specific customization

11.

Tech trend • Small model • Pickup ▪ TinyLlama • jzhang38/TinyLlama: The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. https://github.com/jzhang38/TinyLlama • [A lightweight, fast LLM] A summary of TinyLlama #LLM - Qiita https://qiita.com/sergicalsix/items/7cd7665ab90b9f3b343c • Architecture and tokenizer fully compatible with Llama • The 1.1B-parameter model runs in roughly 550 MB of RAM with 4-bit quantization
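
To make the memory point above concrete, here is a minimal sketch of loading TinyLlama with 4-bit quantization via Hugging Face transformers and bitsandbytes (the chat-tuned model id is an assumption, and the exact RAM footprint depends on the checkpoint and runtime):

```python
# Minimal sketch: load TinyLlama with 4-bit quantization (assumes
# transformers, accelerate, and bitsandbytes installed, CUDA GPU available).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # public 1.1B checkpoint

quant_cfg = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit weights: roughly the ~550 MB cited above
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant_cfg, device_map="auto"
)

inputs = tokenizer("Small models are useful because", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```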

12.

Tech trend • Small model • Pickup ▪ Phi-2 • Phi-2: The surprising power of small language models - Microsoft Research https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/ • [2306.11644] Textbooks Are All You Need https://arxiv.org/abs/2306.11644 • 2.7B-parameter model • Model quality secured through a high-quality training dataset

13.

Tech trend • Small model • Pickup ▪ Orca 2 • Orca - Microsoft Research https://www.microsoft.com/en-us/research/project/orca/ • Microsoft's Orca 2 LLM Outperforms Models That Are 10x Larger https://www.infoq.com/news/2023/12/microsoft-orca-2-llm/ • 7B and 13B parameter models • Fine-tuned from LLaMA-2 • Trained on synthetic datasets that include reasoning

14.

Tech trend • Small model • Pickup ▪ DeciLM • Deci/DeciLM-7B · Hugging Face https://huggingface.co/Deci/DeciLM-7B • [2305.13245] GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints https://arxiv.org/abs/2305.13245 • What is Grouped Query Attention (GQA)? — Klu https://klu.ai/glossary/grouped-query-attention • Adopts the GQA mechanism, making it lightweight yet high-performing • GQA is also used in, for example, LLaMA 2 70B
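
A minimal sketch of the GQA idea from the links above: many query heads share a smaller set of key/value heads, shrinking the KV cache. Head counts and shapes here are illustrative, not DeciLM's actual configuration:

```python
# Grouped Query Attention (GQA) sketch: n_q query heads share n_kv < n_q
# key/value heads, shrinking the KV cache by a factor of n_q / n_kv.
import torch
import torch.nn.functional as F

batch, seq, d_model = 2, 16, 64
n_q, n_kv = 8, 2                 # 8 query heads grouped over 2 KV heads
d_head = d_model // n_q

q = torch.randn(batch, n_q, seq, d_head)    # stand-ins for learned projections
k = torch.randn(batch, n_kv, seq, d_head)
v = torch.randn(batch, n_kv, seq, d_head)

# Each KV head serves n_q // n_kv query heads: repeat KV along the head dim.
k = k.repeat_interleave(n_q // n_kv, dim=1)  # -> (batch, n_q, seq, d_head)
v = v.repeat_interleave(n_q // n_kv, dim=1)

scores = q @ k.transpose(-2, -1) / d_head ** 0.5   # (batch, n_q, seq, seq)
out = F.softmax(scores, dim=-1) @ v                # (batch, n_q, seq, d_head)
print(out.shape)  # torch.Size([2, 8, 16, 8])
```

With n_kv = 1 this reduces to Multi-Query Attention; standard multi-head attention is the n_kv = n_q case.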

15.

New!! Tech trend • Small model • Pickup ▪ Gemma • Gemma: Google introduces new state-of-the-art open models https://blog.google/technology/developers/gemma-open-models • 2B and 7B

16.

New!! Tech trend • Small model • Pickup ▪ [2310.11453] BitNet: Scaling 1-bit Transformers for Large Language Models https://arxiv.org/abs/2310.11453 ▪ Advancing AI for humanity | Foundation of AI https://thegenerality.com/agi/

17.

Tech trend • Beyond Transformer • Quadratic complexity (O(n²)) http://tinyurl.com/yqs8pwec • Points ▪ Attempts to solve the Transformer's problems • Inference speed • Runtime memory footprint • Sequence length • Compute requirements
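
The O(n²) point is easy to see from the attention score matrix, which has one entry per pair of positions; a back-of-the-envelope sketch:

```python
# Self-attention materializes an (n x n) score matrix per head, so memory
# and compute grow quadratically with sequence length n.
for n in (1_000, 10_000, 100_000):
    floats = n * n                       # entries in one head's score matrix
    print(f"n={n:>7}: {floats * 4 / 1e9:8.2f} GB per head (fp32)")
# n=   1000:     0.00 GB per head (fp32)
# n=  10000:     0.40 GB per head (fp32)
# n= 100000:    40.00 GB per head (fp32)
```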

18.

Tech trend • Beyond Transformer • Pickup ▪ MoE (Mixture of Experts) Reference: http://tinyurl.com/ylxsvomj

19.

Tech trend • Beyond Transformer • Pickup ▪ MoE (Mixture of Experts) • Paper explainer: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (MoE) - Deep Learning Blog https://deeplearning.hatenablog.com/entry/moe • Mixture of Experts Explained https://huggingface.co/blog/moe • Reading MixtralSparseMoeBlock https://zenn.dev/if001/articles/fcea9fe9f1bdb1 • Introducing Gemini 1.5, Google's next-generation AI model
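
A minimal sketch of the sparsely-gated MoE idea from the references above: a router scores experts per token, only the top-k experts run, and their outputs are mixed by the renormalized router weights. Sizes and k=2 are illustrative, not Mixtral's actual block:

```python
# Sparse Mixture-of-Experts sketch: per token, a router selects top-k experts
# and mixes their outputs by the softmaxed router scores.
import torch
import torch.nn as nn
import torch.nn.functional as F

d_model, n_experts, k, n_tokens = 32, 8, 2, 10

experts = nn.ModuleList(
    [nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                   nn.Linear(4 * d_model, d_model)) for _ in range(n_experts)]
)
router = nn.Linear(d_model, n_experts)

x = torch.randn(n_tokens, d_model)
logits = router(x)                                 # (tokens, experts)
weights, chosen = logits.topk(k, dim=-1)           # top-k experts per token
weights = F.softmax(weights, dim=-1)               # renormalize over chosen k

out = torch.zeros_like(x)
for t in range(n_tokens):                          # loop for clarity, not speed
    for slot in range(k):
        e = chosen[t, slot].item()
        out[t] += weights[t, slot] * experts[e](x[t])
print(out.shape)  # torch.Size([10, 32])
```

The parameter count grows with the number of experts, but per-token compute only grows with k, which is why MoE models can be large yet comparatively cheap to run.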

20.

Tech trend • Beyond Transformer • Pickup ▪ RWKV (Reinventing RNNs for the Transformer Era) • An explanation of RWKV | AGIRobots Blog https://developers.agirobots.com/jp/rwkv/ • Reading RWKV through its paper and implementation https://zenn.dev/jow/articles/f66d6403b9a509 • RWKV, an RNN that achieves Transformer-level performance, is remarkable https://zenn.dev/hikettei/articles/5d6c1318998411 • Stunning! An ultra-high-performance 1.5B LLM! RWKV-5-World-v2 | shi3z https://note.com/shi3zblog/n/nfc8dd1abf494

21.

Tech trend • Beyond Transformer • Pickup ▪ Mamba • state-spaces/mamba https://github.com/state-spaces/mamba • [2312.00752] Mamba: Linear-Time Sequence Modeling with Selective State Spaces https://arxiv.org/abs/2312.00752 • Mamba: Redefining Sequence Modeling and Outforming Transformers Architecture - Unite.AI https://www.unite.ai/mamba-redefining-sequence-modeling-and-outforming-transformers-architecture/ • Mamba: Linear-Time Sequence Modeling with Selective State Spaces — Arxiv Dives | by Oxen | Dec, 2023 | Medium https://medium.com/@oxenai/mamba-linear-time-sequence-modeling-with-selective-state-spaces-arxiv-dives-cf96518d7ec4 ▪ StripedHyena • Architectures for longer sequences and efficient inference: StripedHyena | hessian.AI https://hessian.ai/architectures-for-longer-sequences-and-efficient-inference-stripedhyena/ • [2302.10866] Hyena Hierarchy: Towards Larger Convolutional Language Models https://arxiv.org/abs/2302.10866
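
At the core of Mamba-style models is a selective state space recurrence that costs O(1) per token, hence linear time in sequence length. A heavily simplified toy sketch (diagonal state, scalar input, and an input-dependent B standing in for "selectivity"; the real model uses a hardware-aware parallel scan, not this loop):

```python
# Toy selective state space model: h_t = a * h_{t-1} + b(x_t) * x_t,
# y_t = c . h_t. One pass over the sequence -> O(n), vs O(n^2) attention.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_state = 12, 4

a = 0.9 * np.ones(d_state)            # decay of each state channel
c = rng.normal(size=d_state)          # readout vector
x = rng.normal(size=seq_len)          # scalar input sequence

h = np.zeros(d_state)
ys = []
for t in range(seq_len):
    b = np.tanh(x[t]) * np.ones(d_state)  # "selective": B depends on the input
    h = a * h + b * x[t]                  # linear recurrence, O(1) per step
    ys.append(c @ h)
print(np.round(ys, 3))
```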

22.

Tech trend • Beyond Transformer • Pickup ▪ MoE-Mamba • [2401.04081] MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts https://arxiv.org/abs/2401.04081 ▪ Vision Mamba • [2401.09417] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model https://arxiv.org/abs/2401.09417 ▪ MambaByte • [2401.13660] MambaByte: Token-free Selective State Space Model https://arxiv.org/abs/2401.13660

23.

New!! Tech trend • Beyond Transformer • Pickup ▪ [2402.13753] LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens https://arxiv.org/abs/2402.13753 ▪ LongRoPE extends LLM context windows to 2 million tokens, using non-uniform positional embeddings and a progressive context-extension strategy to enhance model performance. ▪ Supports the recovery of short contexts, demonstrating reduced complexity and effective retrieval of large content. ▪ Through improved performance in benchmark tests with extended context windows, it enables deeper text analysis and more accurate information extraction.

24.

New!! Tech trend • Beyond Transformer • Pickup ▪ [2402.13753] LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens https://arxiv.org/abs/2402.13753 ▪ Exploitation of Non-uniform Positional Interpolation: Optimizes RoPE for extended contexts using evolutionary search to minimize interpolation loss. ▪ Progressive Extension Strategy: Extends context first to 256k, then to 2048k tokens, avoiding initial fine-tuning on long contexts. ▪ Adjustment for Shorter Context Recovery: Ensures sustained performance across context lengths by readjusting embeddings post-extension.
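
A minimal sketch of the positional-interpolation idea underlying LongRoPE: rescale positions so a longer window reuses the rotation angles seen in training. The uniform scale factor here is a simplification; LongRoPE instead searches for non-uniform per-frequency factors, as the slide above notes:

```python
# RoPE with (uniform) positional interpolation: divide positions by `scale`
# so a context `scale`x longer reuses the rotation angles seen in training.
import numpy as np

d_head, base = 8, 10_000
inv_freq = base ** (-np.arange(0, d_head, 2) / d_head)   # RoPE frequencies

def rope_angles(pos, scale=1.0):
    return (pos / scale) * inv_freq                      # interpolated angles

def apply_rope(vec, pos, scale=1.0):
    ang = rope_angles(pos, scale)
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = vec[0::2], vec[1::2]                        # rotate channel pairs
    return np.stack([x1 * cos - x2 * sin, x1 * sin + x2 * cos], 1).ravel()

v = np.ones(d_head)
# Position 8192 with scale=4 gets the same angles as position 2048 unscaled:
print(np.allclose(apply_rope(v, 8192, scale=4), apply_rope(v, 2048)))  # True
```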

25.

New!! Tech trend • Beyond Transformer • Pickup ▪ [2402.08268] World Model on Million-Length Video And Language With RingAttention https://arxiv.org/abs/2402.08268 ▪ LWM processes long video sequences and textual data, handling up to 1 million tokens using the RingAttention technique for scalable training. ▪ It tackles vision-language training challenges, enabling efficient training and the creation of a model-driven QA dataset for improved chat functionalities. ▪ Achieves notable outcomes in understanding long videos and fact retrieval, showcasing adaptability across different task contexts.

26.

Tech trend • Related tech • RAG (Retrieval-Augmented Generation) • Agent • Synthetic data • Distributed

27.

Tech trend • Related tech • RAG (Retrieval-Augmented Generation) ▪ Covered in GASG session #6 on September 5, 2023 ▪ The evolving RAG architecture ▪ [2312.10997] Retrieval-Augmented Generation for Large Language Models: A Survey https://arxiv.org/abs/2312.10997

28.

Theme • Tech trend • Related tech ▪ RAG (Retrieval-Augmented Generation). Reference: https://arxiv.org/pdf/2312.10997.pdf, Figure 6: RAG compared with other model optimization methods

29.

Theme • Tech trend • Related tech ▪ RAG (Retrieval-Augmented Generation). Reference: https://arxiv.org/pdf/2312.10997.pdf, Figure 2: A representative instance of the RAG process applied to question answering

30.

Theme • Tech trend • Related tech ▪ RAG (Retrieval-Augmented Generation). Reference: https://arxiv.org/pdf/2312.10997.pdf, Figure 3: Comparison between the three paradigms of RAG

31.

Tech trend • Related tech • RAG (Retrieval-Augmented Generation) ▪ Advanced • Optimizing data indexing • Pre-retrieval process • Post-retrieval process

32.

Tech trend • Related tech • RAG (Retrieval-Augmented Generation) ▪ Modular • Diverse functional modules • Pipelines tailored to specific needs

33.

Tech trend • Related tech • RAG (Retrieval-Augmented Generation) ▪ A Cheat Sheet and Some Recipes For Building Advanced RAG | by Andrei | Jan, 2024 | LlamaIndex Blog https://blog.llamaindex.ai/a-cheat-sheet-and-some-recipes-for-building-advanced-rag-803a9d94c41b

34.

Tech trend • Related tech • RAG (Retrieval-Augmented Generation) ▪ Scaling context window ▪ Robustness • Hallucination ▪ Hybrid (RAG+FT) ▪ Expanding LLM role ▪ Scaling law • Embedding model ▪ Production ready • Accuracy, reproducibility, security (access control) ▪ Multimodal • Image, Audio and video, Code
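
For reference, the naive retrieve-then-generate baseline that these advanced/modular variants extend can be sketched in a few lines; `embed` and `llm` below are hypothetical stand-ins for a real embedding model and LLM client, and an in-memory array stands in for a vector store:

```python
# Naive RAG sketch: embed docs, retrieve top-k by cosine similarity,
# and stuff the hits into the prompt. `embed` and `llm` are placeholders.
import numpy as np

def embed(text: str) -> np.ndarray:          # placeholder embedding model
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    v = rng.normal(size=64)
    return v / np.linalg.norm(v)

def llm(prompt: str) -> str:                 # placeholder LLM call
    return f"(answer grounded in: {prompt[:60]}...)"

docs = ["GQA shares KV heads across query groups.",
        "Mamba uses a selective state space recurrence.",
        "MoE routes tokens to a few experts."]
doc_vecs = np.stack([embed(d) for d in docs])

def rag_answer(question: str, k: int = 2) -> str:
    q = embed(question)
    top = np.argsort(doc_vecs @ q)[::-1][:k]          # cosine top-k (unit vectors)
    context = "\n".join(docs[i] for i in top)
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

print(rag_answer("What does MoE do?"))
```

The "Advanced" and "Modular" slides above correspond to adding stages around this loop: better indexing and query rewriting before retrieval, and reranking or compression of the hits after it.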

35.

New!! Tech trend • Related tech • RAG (Retrieval-Augmented Generation) ▪ RAG vs. LLM context-window extension

36.

New!! Tech trend • Related tech • RAG (Retrieval-Augmented Generation) ▪ Impact of Gemini 1.5 with its 1M+ token context window • Details are unclear. ▪ Where might RAG remain superior to Gemini 1.5? • Cost • Latency • Accuracy

37.

New!! Tech trend • Related tech • RAG (Retrieval-Augmented Generation) ▪ The future of RAG is...?

38.

Tech trend • Related tech • Agent ▪ Covered in GASG session #11 on November 14, 2023. Reference: https://medium.com/scisharp/understand-the-llm-agent-orchestration-043ebfaead1f

39.

Tech trend • Related tech • Agent ▪ 2024 AI Agent ▪ https://e2b.dev/blog/ai-agents-in-2024 ▪ Evaluation frameworks • THUDM/AgentBench: A Comprehensive Benchmark to Evaluate LLMs as Agents https://github.com/THUDM/AgentBench • AutoGPT/benchmark at master · Significant-Gravitas/AutoGPT https://github.com/Significant-Gravitas/AutoGPT/tree/master/benchmark • Benchmarking Agent Tool Use https://blog.langchain.dev/benchmarking-agent-tool-use
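
A minimal sketch of the orchestration loop these benchmarks evaluate: the model either invokes a tool or returns a final answer, and tool observations are fed back into the transcript. The `llm` stub and single `calculator` tool are hypothetical placeholders, not any specific framework's API:

```python
# Toy agent loop: the "LLM" (a stub here) emits either TOOL:<expr> or
# FINAL:<answer>; tool output is appended to the transcript and we loop.
def calculator(expr: str) -> str:            # the agent's only tool
    return str(eval(expr, {"__builtins__": {}}))   # demo only; never eval untrusted input

def llm(transcript: str) -> str:             # hypothetical model call (stubbed)
    if "Observation:" not in transcript:
        return "TOOL:12*34"
    return "FINAL:12*34 = " + transcript.rsplit("Observation:", 1)[1].strip()

def run_agent(task: str, max_steps: int = 5) -> str:
    transcript = f"Task: {task}"
    for _ in range(max_steps):
        action = llm(transcript)
        if action.startswith("FINAL:"):
            return action[len("FINAL:"):]
        expr = action[len("TOOL:"):]
        transcript += f"\nAction: calculator({expr})\nObservation: {calculator(expr)}"
    return "gave up"

print(run_agent("What is 12*34?"))  # -> 12*34 = 408
```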

40.

Tech trend • Related tech • Synthetic data ▪ Synthetic data: Anthropic's CAI, scaling, OpenAI's Superalignment, tips, and open-source examples https://www.interconnects.ai/p/llm-synthetic-data ▪ [2305.15041] Generating Faithful Synthetic Data with Large Language Models: A Case Study in Computational Social Science https://arxiv.org/abs/2305.15041 ▪ [2310.07849] Synthetic Data Generation with Large Language Models for Text Classification: Potential and Limitations https://arxiv.org/abs/2310.07849 ▪ [2401.00368] Improving Text Embeddings with Large Language Models https://arxiv.org/abs/2401.00368 ▪ [2312.17742] Learning Vision from Models Rivals Learning Vision from Data https://arxiv.org/abs/2312.17742
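
Most of the papers above follow the same basic recipe: prompt a strong LLM for labeled examples in a fixed format, then parse and filter. A minimal sketch, with a stubbed hypothetical `llm` client and illustrative filtering rules:

```python
# Synthetic data sketch: ask an LLM for labeled sentiment examples in a
# fixed "label<TAB>text" format, then parse and lightly filter the output.
def llm(prompt: str) -> str:                  # hypothetical LLM client (stubbed)
    return ("positive\tThe new firmware made the device noticeably faster.\n"
            "negative\tSupport never replied and the battery died in a day.")

PROMPT = ("Generate 2 sentiment-classification examples, one per line, "
          "formatted exactly as: label<TAB>text. Labels: positive, negative.")

examples = []
for line in llm(PROMPT).splitlines():
    label, _, text = line.partition("\t")
    if label in {"positive", "negative"} and len(text.split()) >= 4:  # filter
        examples.append({"label": label, "text": text})

print(examples)
```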

41.

Tech trend • Related tech • Distributed ▪ Petals • Petals – Run LLMs at home, BitTorrent-style https://petals.dev/ • bigscience-workshop/petals: Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading https://github.com/bigscience-workshop/petals • [2209.01188] Petals: Collaborative Inference and Fine-tuning of Large Models https://arxiv.org/abs/2209.01188 • [2312.08361] Distributed Inference and Fine-tuning of Large Language Models Over The Internet https://arxiv.org/abs/2312.08361 ▪ AI Horde • AI Horde https://stablehorde.net/ • Haidra-Org/AI-Horde: A crowdsourced distributed cluster for AI art and text generation https://github.com/Haidra-Org/AI-Horde?tab=readme-ov-file
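
Petals exposes a transformers-style client; a minimal usage sketch following the project's README (assumes the `petals` package is installed and public swarm peers are currently serving the model):

```python
# Minimal Petals sketch: run a large model over the public swarm, where
# transformer blocks are served by volunteer peers (BitTorrent-style).
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "petals-team/StableBeluga2"  # model id from the Petals README
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("A distributed LLM is", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0]))
```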

42.

Use case

43.

Use case • The future of generative AI in the enterprise: beyond ChatGPT | Gartner https://www.gartner.co.jp/ja/articles/beyond-chatgpt-the-future-of-generative-ai-for-enterprises • Top 100+ Generative AI Applications / Use Cases in 2024 https://research.aimultiple.com/generative-ai-applications/ • 2024 AI Predictions | NVIDIA Blog https://blogs.nvidia.com/blog/2024-ai-predictions/

44.

Use case • Device ▪ Order Ai Pin Now https://hu.ma.ne/ ▪ rabbit — home https://www.rabbit.tech/ ▪ Brilliant Labs https://brilliant.xyz/ ▪ adamcohenhillel/ADeus: An open source AI wearable device that captures what you say and hear in the real world and then transcribes and stores it on your own server. You can then chat with Adeus using the app, and it will have all the right context about what you want to talk about - a truly personalized, personal AI. https://github.com/adamcohenhillel/ADeus

45.

Issue

46.

Issue • Security • Data contamination • Socialization

47.

Issue • Security
• Overview
▪ Safety and security risks of generative artificial intelligence to 2025 (Annex B) - GOV.UK https://www.gov.uk/government/publications/frontier-ai-capabilities-and-risks-discussion-paper/safety-and-security-risks-of-generative-artificial-intelligence-to-2025-annex-b
▪ OWASP Top 10 for Large Language Model Applications | OWASP Foundation https://owasp.org/www-project-top-10-for-large-language-model-applications/
▪ https://owasp.org/www-project-top-10-for-large-language-model-applications/assets/PDF/OWASP-Top-10-for-LLMs-2023-slides-v1_1.pdf
• Prompt hack
▪ Prompt-leaking countermeasures for GPTs | ぬこぬこ https://note.com/schroneko/n/n6d6c2e645119
▪ Prompt Hacking | Learn Prompting: Your Guide to Communicating with AI https://learnprompting.org/docs/category/-prompt-hacking
• Solution
▪ Introducing Purple Llama for Safe and Responsible AI Development | Meta https://about.fb.com/news/2023/12/purple-llama-safe-responsible-ai-development/
▪ New generative AI-powered SaaS security expert from AppOmni | VentureBeat https://venturebeat.com/security/new-generative-ai-powered-saas-security-expert-from-appomni
▪ Cloudflare announces Firewall for AI https://blog.cloudflare.com/ja-jp/firewall-for-ai-ja-jp/

48.

New!! Issue • Security • ComPromptMized https://sites.google.com/view/compromptmized

49.

Issue • Security • ComPromptMized https://sites.google.com/view/compromptmized

50.

Issue • Data contamination • Why data contamination is a big issue for LLMs - TechTalks https://bdtechtalks.com/2023/07/17/llm-data-contamination/ • [2312.16337] Task Contamination: Language Models May Not Be Few-Shot Anymore https://arxiv.org/abs/2312.16337 • Socialization • Sotopia https://www.sotopia.world/ • [2310.11667] SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents https://arxiv.org/abs/2310.11667
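
A common decontamination check discussed in the first link is n-gram overlap between evaluation items and the training corpus; a minimal sketch (n=13 echoes GPT-3-style decontamination; the toy strings just keep the overlap visible):

```python
# N-gram contamination check sketch: flag an eval example if it shares any
# n-gram with the training corpus.
def ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def contaminated(eval_text: str, train_texts: list[str], n: int = 13) -> bool:
    train = set().union(*(ngrams(t, n) for t in train_texts))
    return bool(ngrams(eval_text, n) & train)

train = ["the quick brown fox jumps over the lazy dog near the old riverbank today"]
eval_item = "we saw the quick brown fox jumps over the lazy dog near the old riverbank today"
print(contaminated(eval_item, train))                        # True -> likely leaked
print(contaminated("completely novel sentence " * 3, train))  # False
```

Task contamination (the second link) is harder to detect this way, since whole task formats rather than exact strings may have leaked into pretraining data.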

51.

Agenda

53.

Introduction How to make AI do it is all you need!!

54.

Introduction Yes, AI can!!

55.

EOF https://www.linkedin.com/in/kunihiro-sugiyama-49b0372a/ https://www.ai-tech-c.jp/generative-ai-study-group-gasg/