October 14, 2025
Slide overview
These are the slides from a talk at Unreal Fest Bali 2025 on performance analysis with Unreal Insights.
The video is available on the following page.
https://www.youtube.com/watch?v=HQLYkwoDoT4
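This is the official account of Epic Games Japan, which develops and provides Unreal Engine. It publishes presentation materials from study sessions, livestreams, and similar events. Official site: https://www.unrealengine.com/ja/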
Mastering Performance Analysis with Unreal Insights Ken Kuwano Lead Software Engineer, Developer Relations Epic Games Japan
Table of contents
● What Are Common Performance Issues?
● Basic Analysis
● Advanced Analysis
● Best Practices & Mindset
Section 1 What Are Common Performance Issues?
What Are Common Performance Issues?
Hitching Frame Drops
What is Performance Optimization?
● Smooth gameplay = better user experience
● Performance affects battery life, heat, platform compliance
● Optimization reduces crashes and load time
● It helps your team work faster
Start With "stat unit"
Stat: Description
● Frame: Total time for 1 frame
● Game: Game Thread
● Draw: Rendering Thread
● RHIT: RHI Thread
● GPUTime: GPU time for rendering
※ RHI: Rendering Hardware Interface
Data Flow: Input → Game (Game Thread) → Draw (Render Thread) → RHIT (RHI Thread) → GPU → Display. Game, Draw, and RHIT run on the CPU.
Data Flow & Architecture Game Thread Render Thread RHI Thread GPU
Section 2 Basic Analysis Game, Render, GPU
Performance Capture
● Insights capture command: -trace=[Channel]
● Options (Command: Description)
  ○ Trace.Bookmark: Embed bookmark points
  ○ Trace.Screenshot: Embed screenshots
  ○ Trace.RegionBegin <Name> / Trace.RegionEnd <Name>: Create a tracing region that spans multiple frames with the same string name
  ○ Bugit / BugitGo <location> <rotation>: Logs the current camera position and rotation / Moves the camera to the specified position and rotation
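As a minimal sketch of how the capture arguments can be assembled consistently across a team, the helper below builds the `-trace=` argument from a list of channel names and optionally appends `-statnamedevents`. The function name `build_trace_args` and the executable name `MyGame.exe` are hypothetical; only the `-trace=` and `-statnamedevents` arguments and the channel names come from the slides.

```python
from typing import Iterable, List


def build_trace_args(channels: Iterable[str], named_events: bool = False) -> List[str]:
    """Assemble launch arguments that enable the given trace channels.

    Hypothetical helper: it only formats strings and does not launch anything.
    """
    # Normalize names and drop duplicates while keeping the caller's order.
    cleaned = (c.strip().lower() for c in channels)
    ordered = list(dict.fromkeys(c for c in cleaned if c))
    if not ordered:
        raise ValueError("at least one trace channel is required")

    args = ["-trace=" + ",".join(ordered)]
    if named_events:
        args.append("-statnamedevents")
    return args


# "MyGame.exe" is a placeholder executable name, not a real path.
command_line = " ".join(["MyGame.exe"] + build_trace_args(["cpu", "frame"], named_events=True))
print(command_line)
```

Keeping the channel list in one place makes it easier to capture comparable traces before and after a change.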
High load frames can be identified
Show more about events and processing times
Know the process and timing for each thread
How to Find the Bottleneck Point
● Sort by frame time
● Zoom in on the longest frame
● Find the thread with the longest processing time (bar) on that frame
● Find the long-running processing in that thread
  ○ Exclude markers like "xxx thread idle time", "Wait for Tasks", etc.
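The steps above are performed interactively in Unreal Insights, but the same logic can be sketched over exported timing data. The example below is an illustration only: the `Event` and `Frame` records, the marker filter, and the sample numbers are all hypothetical, and it assumes each frame holds a flat list of non-overlapping top-level events.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical substrings used to skip idle/wait markers.
WAIT_MARKERS = ("idle", "wait for tasks")


@dataclass
class Event:
    thread: str
    name: str
    duration_ms: float


@dataclass
class Frame:
    index: int
    frame_ms: float
    events: List[Event]


def is_wait_marker(name: str) -> bool:
    lowered = name.lower()
    return any(marker in lowered for marker in WAIT_MARKERS)


def find_bottleneck(frames: List[Frame]) -> Tuple[int, str, str]:
    """Return (frame index, thread, event name) for the heaviest work."""
    # Steps 1-2: pick the longest frame.
    worst = max(frames, key=lambda f: f.frame_ms)

    # Step 3: sum real work per thread, ignoring idle/wait markers.
    work = [e for e in worst.events if not is_wait_marker(e.name)]
    if not work:
        raise ValueError("the longest frame contains only wait markers")
    busy: Dict[str, float] = {}
    for event in work:
        busy[event.thread] = busy.get(event.thread, 0.0) + event.duration_ms
    thread = max(busy, key=busy.get)

    # Step 4: find the longest event on that thread.
    heaviest = max((e for e in work if e.thread == thread), key=lambda e: e.duration_ms)
    return worst.index, thread, heaviest.name


sample_frames = [
    Frame(11, 16.6, [Event("GameThread", "TickActors", 5.0),
                     Event("RenderThread", "DrawScene", 6.0)]),
    Frame(12, 41.0, [Event("GameThread", "TickActors", 30.0),
                     Event("GameThread", "Wait for Tasks", 8.0),
                     Event("RenderThread", "DrawScene", 7.0),
                     Event("RenderThread", "Render thread idle time", 30.0)]),
]
print(find_bottleneck(sample_frames))
```

Without the marker filter, the Render Thread in frame 12 would look busiest (37 ms) even though 30 ms of that is idle time, which is why the slide recommends excluding those markers.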
Game Thread
How should we investigate Game Thread bottlenecks?
● Command: -trace=cpu,frame,assetloadtime -statnamedevents
● Option: log, bookmark, object, objectproperties, net, slate, animation, stats, counter, screenshot, regions, metadata
[Figure: Timing view. Frames track: Game Frames. Threads: Game Thread. Frames 10 to 13.]
[Figure: Timing view. Frames track: Game Frames. Threads: Game Thread, Worker Thread. Frames 10 to 13.]
Game Thread Optimization https://www.youtube.com/watch?v=KxREKDYu70&t
Notes: Game Thread profile capture
● Game main processing:
  ○ Intensive Game Thread processing leads directly to frame delays
  ○ A large number of processes called every frame and every tick results in an accumulated load
  ○ Identify temporarily intensive processing such as loading, GC, etc.
● Dependencies with other threads:
  ○ Check whether Worker Threads are stuck waiting for processing
  ○ If the Game Thread is waiting for other threads to complete processing, improve the processing on those threads
Render Thread
How should we investigate Render Thread bottlenecks?
● Command: -trace=cpu,frame,rendercommands,rhicommands,rdg
● Use cases
  ○ Long Render Thread = too many draw calls
  ○ Look for heavy RenderCommand queues
  ○ Visualize Scene rendering and PostProcess passes
[Figure: Timing view. Frames tracks: Game Frames, Render Frames. Threads: Game Thread (frames 10 to 13), Render Thread (frames 9 to 12).]
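The frame labels in this figure pair Game Thread frame 10 with Render Thread frame 9, and so on, which suggests the Render Thread works one frame behind the Game Thread. A minimal sketch of that pairing follows; the function name and the `render_lag` parameter are hypothetical, and the one-frame lag is read off the figure rather than taken from engine documentation.

```python
from typing import List, Tuple


def pipeline_timeline(first_game_frame: int, slots: int, render_lag: int = 1) -> List[Tuple[int, int]]:
    """Return (game frame, render frame) pairs for consecutive time slots.

    While the Game Thread simulates frame N, the Render Thread is assumed
    to be drawing frame N - render_lag.
    """
    return [(first_game_frame + i, first_game_frame + i - render_lag) for i in range(slots)]


for game_frame, render_frame in pipeline_timeline(10, 4):
    print(f"Game Thread: frame {game_frame} | Render Thread: frame {render_frame}")
```

When reading a trace, this offset matters: a spike on the Render Thread usually belongs to the frame the Game Thread finished one slot earlier.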
Notes: Render Thread profile capture
● Long Render Thread times affect the RHI Thread and GPU
● Understand the processing time per frame of the Render Thread
  ○ Understand the cost of each rendering function
    ■ Are specific paths (PostProcess, Shadow, Lumen, etc.) heavy?
● Understand the dependency on the GPU
  ○ Check whether the Render Thread is faster or slower than the GPU
    ■ Render Thread > GPU: CPU (Render Thread) bottleneck
    ■ Render Thread < GPU: GPU bottleneck
GPU
How should we investigate GPU bottlenecks?
● Command: -trace=cpu,gpu,frame (default)
● GPU track shows render passes & timings
  ○ PostProcess, Shadow Pass, or full-screen effects may dominate
● If you want to analyze detailed shader content and rendering commands, use dedicated GPU profiler / debugger tools
Notes: GPU profile capture
Capture the profile at the target screen resolution
● r.DynamicRes.TestScreenPercentage 60
  Console: fix Dynamic Resolution to a specific screen percentage
● r.screenpercentage 60
  PC: the render resolution can be set explicitly as a percentage of the output resolution
● r.RDG.AsyncCompute 0
  Disable asynchronous work if you want to accurately collect the performance of individual render passes
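To see why the screen percentage matters when comparing GPU captures, the worked example below converts a percentage into a render resolution and a pixel-count ratio. It assumes the percentage scales each axis; the helper names are hypothetical and the engine's exact rounding may differ.

```python
from typing import Tuple


def scaled_resolution(width: int, height: int, screen_percentage: float) -> Tuple[int, int]:
    """Scale each axis of the output resolution by the screen percentage."""
    if screen_percentage <= 0:
        raise ValueError("screen percentage must be positive")
    return (round(width * screen_percentage / 100),
            round(height * screen_percentage / 100))


def pixel_ratio(screen_percentage: float) -> float:
    """Fraction of output pixels that are actually rendered."""
    return (screen_percentage / 100) ** 2


render_width, render_height = scaled_resolution(1920, 1080, 60)
print(render_width, render_height, f"{pixel_ratio(60):.0%} of the pixels")
```

At 60 percent, a 1920x1080 output renders at 1152x648, roughly 36 percent of the pixels, so captures taken at different percentages are not directly comparable.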
New GPU Profiler
[Figure: GPU Fence Relations (Signal Fence to Wait Fence). Legend: Signal Fence, Wait Fence, GPU Work, GPU Wait.]
GPU Profiler and Debugger
Platform: Tools
● Windows: DumpGPU, RenderDoc, PIX for Windows
● iOS: Xcode GPU capture/trace
● Android: RenderDoc (OpenGL ES / Vulkan), ARM Graphics Analyzer (Mali GPU specific), Snapdragon Profiler (Qualcomm Adreno GPUs)
● Switch/Xbox/PlayStation: Platform GPU profiler
Chapter Recap: Basic Analysis
● For Game, Draw, and GPU, first check the load on each thread and unit
  ○ Frame ≈ Game → Game logic is too heavy
  ○ Frame ≈ Draw → Scene complexity
  ○ Frame ≈ GPU → Rendering/shader overload
● Data flows Game → Render → RHI → GPU
  ○ Frame times can be stretched in conjunction with other threads
  ○ Need to understand which threads are waiting, where they are waiting, and how threads relate to each other
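The "Frame ≈ Game / Draw / GPU" rule of thumb can be written down as a small check over the numbers read from "stat unit". This is a sketch only: the function name and the 1 ms tolerance are assumptions, and a `None` result simply means the figures do not point at one unit, in which case the thread relationships need to be inspected in Unreal Insights.

```python
from typing import Optional

# Hints quoted from the recap slide.
HINTS = {
    "Game": "Game logic is too heavy",
    "Draw": "Scene complexity",
    "GPU": "Rendering/shader overload",
}


def classify_bottleneck(frame_ms: float, game_ms: float, draw_ms: float,
                        gpu_ms: float, tolerance_ms: float = 1.0) -> Optional[str]:
    """Return "Game", "Draw", or "GPU" if that timing roughly equals Frame."""
    timings = {"Game": game_ms, "Draw": draw_ms, "GPU": gpu_ms}
    slowest = max(timings, key=timings.get)
    # Only report a unit when its time accounts for nearly the whole frame.
    if abs(frame_ms - timings[slowest]) <= tolerance_ms:
        return slowest
    return None


unit = classify_bottleneck(frame_ms=33.4, game_ms=33.0, draw_ms=12.0, gpu_ms=15.0)
print(unit, "->", HINTS.get(unit, "inspect thread waits in Unreal Insights"))
```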
Section 3 Advanced Analysis Worker, FileIO
Worker Thread
How should we investigate Worker Thread bottlenecks?
● Command
  -trace=cpu,frame,task -statnamedevents
  -trace=taskgraph (cpu,gpu,frame,log,bookmark,screenshot,region,task)
● Threads: Background Worker, Foreground Worker
● Use cases: Navmesh building, Texture Streaming, Eval Animation, RHI TaskPipe
Notes: Worker Thread profile capture
● Worker Threads are a group of threads mainly used for parallel processing (multi-threading)
● Check for task dependencies or a lack of parallelism
  ○ Length of processing time per task: is a particular task taking a long time?
    ■ Cases where one thread is waiting for others to finish processing
  ○ Idle thread time: are threads sitting idle for long periods?
    ■ Is there async processing that can be assigned to unused threads?
  ○ Low concurrency: multiple Workers unused
    ■ Not enough threads? Low concurrency?
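The three checks above (long tasks, idle time, low concurrency) can be quantified once task start and end times are available. The sketch below assumes a hypothetical export of per-worker task intervals in milliseconds, with no overlapping intervals on the same worker; it is not an Unreal Insights API.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start_ms, end_ms)


def worker_utilization(tasks: Dict[str, List[Interval]],
                       window_start: float, window_end: float) -> Dict[str, float]:
    """Fraction of the time window each worker spent executing tasks."""
    window = window_end - window_start
    if window <= 0:
        raise ValueError("window_end must be greater than window_start")
    utilization = {}
    for worker, intervals in tasks.items():
        busy = 0.0
        for start, end in intervals:
            # Clip each task to the window before adding it up.
            clipped = min(end, window_end) - max(start, window_start)
            if clipped > 0:
                busy += clipped
        utilization[worker] = busy / window
    return utilization


def longest_task(tasks: Dict[str, List[Interval]]) -> Tuple[str, float]:
    """Return (worker, duration_ms) of the single longest task."""
    return max(((worker, end - start)
                for worker, intervals in tasks.items()
                for start, end in intervals),
               key=lambda item: item[1])


sample_tasks = {
    "Worker#1": [(0.0, 7.0), (8.0, 10.0)],
    "Worker#2": [(0.0, 1.0)],
    "Worker#3": [],
}
print(worker_utilization(sample_tasks, 0.0, 10.0))
print(longest_task(sample_tasks))
```

In this sample, one worker is busy 90 percent of the window while the others are nearly idle, which is the "low concurrency" pattern the slide describes.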
Waiting Other Tasks
[Figure: Task Overview (Activity: Execute Task). Frames track: Game Frames. Threads: Game Thread, Worker Thread#1. Frames 10 to 13.]
[Figure: Task Overview (Activity). Frames track: Game Frames. Threads: Game Thread, Worker Thread#1. Frames 10 to 14.]
[Figure: Timing view. Frames track: Game Frames. Threads: Game Thread, Render Thread, RHI Thread, Worker Thread#1, Worker Thread#2, Worker Thread#3. Frames 10 to 13.]
ContextSwitch Trace
ContextSwitch Trace: How to
● Command: -trace=cpu,frame,contextswitch,task -statnamedevents
  On Windows, the game/editor runtime should be "run as administrator"
● A context switch is the process in which the CPU saves the state (context) of the currently running process/thread and switches to another process or thread
  ○ Too many short-lived threads = excessive context switches
  ○ Worker threads and the Game thread may block other threads
FileIO
How should we investigate File I/O bottlenecks?
● Command: -trace=cpu,file,assetloadtime,loadtimes,iostore -statnamedevents
● Use cases
  ○ Find blocking I/O calls during gameplay
  ○ Inspect AsyncLoadingThread for long stalls
Notes: FileIO profile capture
● Asset loading and file access are performed by a thread dedicated to file IO
  ○ Game Thread issues requests, involving multiple threads and the system
● Identify where and when access occurs
  ○ Which thread and which process is generating IO?
  ○ Is the Game Thread or other threads stuck waiting for IO?
● Check the load size and frequency
  ○ Are there frequent accesses to small files?
  ○ Are there long accesses to a single file?
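The last two questions (frequent small accesses, long single-file accesses) can be answered mechanically from a list of read records. The sketch below uses a hypothetical `IoRead` record and arbitrary thresholds (64 KiB, 10 ms); the file names are made up, and real data would come from the File Activity information in a trace.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class IoRead:
    path: str
    size_bytes: int
    duration_ms: float


def summarize_io(reads: List[IoRead], small_bytes: int = 64 * 1024,
                 slow_ms: float = 10.0) -> Tuple[Dict[str, int], List[IoRead]]:
    """Return (files hit by repeated small reads, individual slow reads)."""
    small_counts: Dict[str, int] = defaultdict(int)
    slow_reads: List[IoRead] = []
    for read in reads:
        if read.size_bytes < small_bytes:
            small_counts[read.path] += 1
        if read.duration_ms >= slow_ms:
            slow_reads.append(read)
    # A single small read is normal; repeated ones suggest chatty access.
    frequent_small = {path: n for path, n in small_counts.items() if n > 1}
    return frequent_small, slow_reads


sample_reads = [
    IoRead("ui_icons.bin", 4096, 0.4),
    IoRead("ui_icons.bin", 4096, 0.5),
    IoRead("ui_icons.bin", 2048, 0.3),
    IoRead("level_chunk.bin", 32 * 1024 * 1024, 45.0),
    IoRead("config.bin", 1024, 0.2),
]
frequent, slow = summarize_io(sample_reads)
print(frequent, [r.path for r in slow])
```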
[Figure: Asset Loading. Game / AsyncLoading threads: Game Thread, Async Loading, IoDispatcher, IoServices. Activity: I/O (File, IOStore, Memory).]
[Figure: Asset Loading, same diagram with three improvement points marked]
1. Optimized blocking process
2. Reduction of waiting time
3. Efficient file access
Chapter Recap: Advanced Analysis
● Task Graph and File IO bottlenecks cannot be determined directly from stat unit
  ○ Check thread processing with Unreal Insights
  ○ Understand in advance what kinds of cases will cause bottlenecks
● Task Graph
  ○ Use threads efficiently
    ■ Assign work to idle threads → Avoid blocking the main process
    ■ Avoid intensive processing → Frame distribution
● FileIO
  ○ Identify blocking points in related threads and manage traffic
    ■ Since there are limits to file R/W at the OS level, control streaming effectively in the main process
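"Frame distribution" means spreading a heavy batch of work over several frames instead of running it all in one. A minimal sketch of that idea follows, using a fixed number of items per frame; the function name and the per-frame budget are hypothetical, and a real implementation would more likely budget by elapsed time.

```python
from collections import deque
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def distribute_over_frames(items: Iterable[T], per_frame: int) -> List[List[T]]:
    """Split work into per-frame batches of at most `per_frame` items."""
    if per_frame < 1:
        raise ValueError("per_frame must be at least 1")
    pending = deque(items)
    frames: List[List[T]] = []
    while pending:
        # Take this frame's share and leave the rest for later frames.
        count = min(per_frame, len(pending))
        frames.append([pending.popleft() for _ in range(count)])
    return frames


for frame_number, batch in enumerate(distribute_over_frames(range(7), per_frame=3)):
    print(f"frame {frame_number}: process {batch}")
```

The total work is unchanged, but no single frame pays for all of it, which trades a visible hitch for a slightly longer completion time.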
Section 4 Best Practices & Mindset
What We've Learned
● "stat unit" detects symptoms
● Detailed analysis with "Unreal Insights"
● Break down from high to low level
● Iterate verification cycles
Optimization Mindset
● Data first
  ○ Measure everything before acting
● Avoid assumptions
  ○ Use tools, not guesswork
● Prioritize
  ○ Prioritize fixing what matters most
● Iterate
  ○ Re-test after every change
Cycle: Trace → Diagnose → Optimize → Validate
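"Re-test after every change" is easier to follow when the comparison is numeric. The sketch below summarizes two sets of frame times and reports the change in mean, 95th percentile, and maximum; the function names and sample values are hypothetical, and the percentile uses the simple nearest-rank method.

```python
import math
import statistics
from typing import Dict, List


def summarize(frame_times_ms: List[float]) -> Dict[str, float]:
    """Mean, nearest-rank 95th percentile, and max of frame times."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    ordered = sorted(frame_times_ms)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "mean": statistics.fmean(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


def compare(before: List[float], after: List[float]) -> Dict[str, float]:
    """Per-metric change (after minus before); negative means faster."""
    b, a = summarize(before), summarize(after)
    return {key: a[key] - b[key] for key in b}


before_ms = [16.0, 16.0, 17.0, 40.0]
after_ms = [16.0, 16.0, 17.0, 18.0]
print(compare(before_ms, after_ms))
```

Looking at the percentile and maximum alongside the mean helps confirm that a change removed hitches rather than only shifting the average.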
Optimization Best Practice
● Make an effort to process as little as possible
● Keep it as simple and uncomplicated as possible
● Keep processes short and complete them quickly
● Use as much mature functionality as possible
  ○ New features may not be optimized yet
● If you do something, keep the following in mind
  ○ Parallelization
  ○ Pooling, Reuse
  ○ Adjust gameplay priorities as needed
Again, keep in mind to process as little as possible
References
● Documentation: Unreal Insights
  https://dev.epicgames.com/documentation/en-us/unreal-engine/unreal-insights-in-unreal-engine
● Documentation: Unreal Insights Reference
  https://dev.epicgames.com/documentation/en-us/unreal-engine/unreal-insights-reference-in-unreal-engine-5
● Documentation: Stat Commands
  https://dev.epicgames.com/documentation/en-us/unreal-engine/stat-commands-in-unreal-engine
● Superluminal
  https://superluminal.eu/
● Guidelines for Optimizing Rendering for Real-Time
  https://dev.epicgames.com/documentation/en-us/unreal-engine/guidelines-for-optimizing-rendering-for-real-time-in-unreal-engine
● [Intel] Unreal Engine* Optimization Guide: Profiling Fundamentals
  https://www.intel.com/content/www/us/en/developer/articles/technical/unreal-engine-optimization-profiling-fundamentals.html
● Optimize Unreal Engine 5 performance on low-end computers
  https://irendering.net/optimize-unreal-engine-5-performance-on-low-end-computers/
● Unreal Engine Performance Guide
  https://gpuopen.com/learn/unreal-engine-performance-guide/
Thank you!