diff --git a/docs/html/preview/images/perf-test-frame-latency.png b/docs/html/preview/images/perf-test-frame-latency.png new file mode 100644 index 0000000000000..87d1cfc2c0bff Binary files /dev/null and b/docs/html/preview/images/perf-test-frame-latency.png differ diff --git a/docs/html/preview/images/perf-test-framestats.png b/docs/html/preview/images/perf-test-framestats.png new file mode 100644 index 0000000000000..589a92319f70c Binary files /dev/null and b/docs/html/preview/images/perf-test-framestats.png differ diff --git a/docs/html/preview/testing/performance.jd b/docs/html/preview/testing/performance.jd new file mode 100644 index 0000000000000..a61091f9ff6c5 --- /dev/null +++ b/docs/html/preview/testing/performance.jd @@ -0,0 +1,667 @@ +page.title=Testing Display Performance + +@jd:body + + +
+ User interface (UI) performance testing ensures that your app not only meets its functional + requirements, but that user interactions with your app are buttery smooth, running at a + consistent 60 frames per second (why + 60fps?), without any dropped or delayed frames, or as we like to call it, jank. This + document explains tools available to measure UI performance, and lays out an approach to + integrate UI performance measurements into your testing practices. +
+ + ++ In order to improve performance you first need the ability to measure the performance of + your system, and then diagnose and identify problems that may arise from various parts of your + pipeline. +
+ ++ dumpsys is an + Android tool that runs on the device and dumps interesting information about the status of system + services. Passing the gfxinfo command to dumpsys provides an output in logcat with + performance information relating to frames of animation that are occurring during the recording + phase. +
+ ++> adb shell dumpsys gfxinfo <PACKAGE_NAME> ++ +
+ This command can produce multiple different variants of frame timing data. +
+ ++ With the M Preview the command prints out aggregated analysis of frame data to logcat, collected + across the entire lifetime of the process. For example: +
+ ++Stats since: 752958278148ns +Total frames rendered: 82189 +Janky frames: 35335 (42.99%) +90th percentile: 34ms +95th percentile: 42ms +99th percentile: 69ms +Number Missed Vsync: 4706 +Number High input latency: 142 +Number Slow UI thread: 17270 +Number Slow bitmap uploads: 1542 +Number Slow draw: 23342 ++ +
+ These high-level statistics convey the rendering performance of the app, as well + as its stability across many frames. +
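The aggregate block shown above is regular enough to parse with a script. The following is a minimal sketch, assuming the output keeps the `Label: value` layout of the example; the function names `parse_gfxinfo_summary` and `jank_ratio` are hypothetical helpers, not part of any Android tool.

```python
import re

# Matches values such as "82189", "34ms", "752958278148ns" or "35335 (42.99%)".
_VALUE = re.compile(r"\s*(\d+)(?:ns|ms)?(?:\s*\(([\d.]+)%\))?\s*$")


def parse_gfxinfo_summary(text):
    """Return the numeric 'Label: value' pairs found in gfxinfo output.

    Assumes the layout of the example above; lines that do not fit are skipped.
    """
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        match = _VALUE.match(value)
        if match is None:
            continue
        stats[key.strip()] = int(match.group(1))
        if match.group(2) is not None:
            # Keep the percentage printed in parentheses under a separate key.
            stats[key.strip() + " percent"] = float(match.group(2))
    return stats


def jank_ratio(stats):
    """Fraction of rendered frames that were reported as janky."""
    return stats["Janky frames"] / stats["Total frames rendered"]
```

A test script could call `jank_ratio(parse_gfxinfo_summary(output))` after each run and record the value for later comparison.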
+ + ++ With the M Preview comes a new command for gfxinfo, framestats, which provides + extremely detailed frame timing information from recent frames, so that you can track down and + debug problems more accurately. +
+ ++>adb shell dumpsys gfxinfo <PACKAGE_NAME> framestats ++ +
+ This command prints out frame timing information, with nanosecond timestamps, from the last 120 + frames produced by the app. Below is example raw output from adb shell dumpsys gfxinfo + <PACKAGE_NAME> framestats: +
+ ++0,49762224585003,49762241251670,9223372036854775807,0,49762257627204,49762257646058,49762257969704,49762258002100,49762265541631,49762273951162,49762300914808,49762303675954, +0,49762445152142,49762445152142,9223372036854775807,0,49762446678818,49762446705589,49762447268818,49762447388037,49762453551527,49762457134131,49762474889027,49762476150120, +0,49762462118845,49762462118845,9223372036854775807,0,49762462595381,49762462619287,49762462919964,49762462968454,49762476194547,49762476483454,49762480214964,49762480911527, +0,49762479085548,49762479085548,9223372036854775807,0,49762480066370,49762480099339,49762481013089,49762481085850,49762482232152,49762482478350,49762485657620,49762486116683, ++ +
+ Each line of this output represents a frame produced by the app. Each line has a fixed number of + columns describing time spent in each stage of the frame-producing pipeline. The next section + describes this format in detail, including what each column represents. +
+ + ++ Since the block of data is output in CSV format, it's very straightforward to paste it to your + spreadsheet tool of choice, or collect and parse with a script. The following table explains the + format of the output data columns. All timestamps are in nanoseconds. +
+ ++ You can use this data in different ways. One simple but useful visualization is a + histogram showing the distribution of frame times (FRAME_COMPLETED - INTENDED_VSYNC) in + different latency buckets, as in the figure below. This graph tells us at a glance that most + frames were very good - well below the 16ms deadline (depicted in red), but a few frames + were significantly over the deadline. We can look at changes in this histogram over time + to see wholesale shifts or new outliers being created. You can also graph input latency, + time spent in layout, or other similar interesting metrics based on the many timestamps + in the data. +
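The frame-time histogram described above can be computed with a short script. This is a sketch under one stated assumption: that INTENDED_VSYNC is the second column and FRAME_COMPLETED the thirteenth in each framestats row, which matches the example rows shown earlier but should be checked against the column table for your platform version. The helper names are hypothetical.

```python
from collections import Counter

# ASSUMPTION: zero-based column positions in each framestats row.
INTENDED_VSYNC = 1
FRAME_COMPLETED = 12


def frame_times_ms(raw_output):
    """Return FRAME_COMPLETED - INTENDED_VSYNC for every data row, in ms."""
    times = []
    for line in raw_output.splitlines():
        # Rows end with a trailing comma, so drop empty fields.
        fields = [f for f in line.strip().split(",") if f]
        if len(fields) <= FRAME_COMPLETED:
            continue
        if not all(f.isdigit() for f in fields):
            continue  # skip headers and other non-data lines
        delta_ns = int(fields[FRAME_COMPLETED]) - int(fields[INTENDED_VSYNC])
        times.append(delta_ns / 1e6)
    return times


def histogram(times, bucket_ms=4):
    """Count frames per latency bucket, keyed by the bucket's lower bound."""
    counts = Counter(int(t // bucket_ms) * bucket_ms for t in times)
    return dict(sorted(counts.items()))
```

Under that column assumption, the first example row works out to a frame time of about 79 ms, far over the 16ms deadline, while the fourth is about 7 ms.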
+ +
+
+
+
+ If Profile GPU rendering is set to "In adb shell dumpsys gfxinfo"
+ in Developer Options, the adb shell dumpsys gfxinfo command prints out timing
+ information for the most recent 120 frames, broken into a few different categories with
+ tab-separated values. This data can be useful for indicating, at a high level, which parts
+ of the drawing pipeline may be slow.
+
+ Similar to framestats above, it's very + straightforward to paste it to your spreadsheet tool of choice, or collect and parse with + a script. The following graph shows a breakdown of where many frames produced by the app + were spending their time. +
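As an alternative to a spreadsheet, the tab-separated rows can be summed per frame with a script. The sketch below assumes only that each data row is a tab-separated list of per-stage durations in milliseconds; it does not depend on the stage names, and `parse_stage_timings` and `slow_frames` are hypothetical helper names.

```python
def parse_stage_timings(text):
    """Return one list of per-stage durations (ms) for each data row.

    Rows whose cells are not all numeric (for example a header row) are skipped.
    """
    frames = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.split("\t") if c.strip()]
        try:
            values = [float(c) for c in cells]
        except ValueError:
            continue
        if len(values) >= 2:
            frames.append(values)
    return frames


def slow_frames(frames, budget_ms=16.0):
    """Return (frame index, total ms) for frames whose stages exceed the budget."""
    return [(i, sum(f)) for i, f in enumerate(frames) if sum(f) > budget_ms]
```

Each returned total corresponds to the height of one stacked bar, so `slow_frames` lists the bars that would cross the deadline line in such a graph.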
+ +
+
++ The result of running gfxinfo, copying the output, pasting it into a spreadsheet + application, and graphing the data as stacked bars. +
+ ++ Each vertical bar represents one frame of animation; its height represents the number of + milliseconds it took to compute that frame of animation. Each colored segment of the bar + represents a different stage of the rendering pipeline, so that you can see what parts of + your application may be creating a bottleneck. For more information on understanding the + rendering pipeline, and how to optimize for it, see the + Invalidations Layouts and Performance video. +
+ + ++ Both the framestats and simple frame timings gather data over a very short window - about + two seconds' worth of rendering. In order to precisely control this window of time - for + example, to constrain the data to a particular animation - you can reset all counters + and the aggregate statistics gathered. +
+ ++>adb shell dumpsys gfxinfo <PACKAGE_NAME> reset ++ +
+ This can also be used in conjunction with the dumping commands themselves to collect and + reset at a regular cadence, capturing less-than-two-second windows of frames + continuously. +
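The collect-and-reset cadence can be scripted from the host machine. This is a minimal sketch, assuming `adb` is on the PATH and a single device is connected; the interval, the helper names, and the injectable `run` and `sleep` parameters (there so the loop can be exercised without a device) are illustrative choices, not part of the tool.

```python
import subprocess
import time


def adb_gfxinfo(package, *extra, run=subprocess.check_output):
    """Run 'adb shell dumpsys gfxinfo <package> [extra...]' and return its text."""
    out = run(["adb", "shell", "dumpsys", "gfxinfo", package, *extra])
    return out.decode("utf-8", "replace") if isinstance(out, bytes) else out


def collect_windows(package, windows, interval_s=1.5,
                    run=subprocess.check_output, sleep=time.sleep):
    """Dump framestats every interval_s seconds, resetting after each dump."""
    adb_gfxinfo(package, "reset", run=run)  # start from a clean window
    dumps = []
    for _ in range(windows):
        sleep(interval_s)
        dumps.append(adb_gfxinfo(package, "framestats", run=run))
        adb_gfxinfo(package, "reset", run=run)
    return dumps
```

Keeping the interval under two seconds means each dump stays within the window of recent frames that the tool retains.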
+ + ++ Identification of regressions is a good first step to tracking down problems, and + maintaining high application health. However, dumpsys just identifies the existence and + relative severity of problems. You still need to diagnose the particular cause of the + performance problems, and find appropriate ways to fix them. For that, it’s highly + recommended to use the systrace tool. +
+ + ++ For more information on how Android’s rendering pipeline works, common problems that you + can find there, and how to fix them, some of the following resources may be useful to + you: +
+ ++ One approach to UI Performance testing is to simply have a human tester perform a set of + user operations on the target app, and either visually look for jank, or spend a very + large amount of time using a tool-driven approach to find it. But this manual approach is + fraught with peril - human ability to perceive frame rate changes varies tremendously, + and this is also time consuming, tedious, and error prone. +
+ ++ A more efficient approach is to log and analyze key performance metrics from automated UI + tests. The Android M developer preview includes new logging capabilities which make it + easy to determine the amount and severity of jank in your application’s animations, and + that can be used to build a rigorous process to determine your current performance and + track future performance objectives. +
+ ++ This article walks you through a recommended approach to using that data to automate your + performance testing. +
+ ++ This is mostly broken down into two key actions: first, identifying what you're + testing and how you’re testing it; and second, setting up and maintaining an + automated testing environment. +
+ + ++ Before you can get started with automated testing, it’s important to make a few high-level + decisions, in order to properly understand your test space and the needs you may have. +
+ ++ Remember that bad performance is most visible to users when it interrupts a smooth + animation. As such, when identifying what types of UI actions to test for, it’s useful to + focus on the key animations that users see most, or are most important to their + experience. For example, here are some common scenarios that may be useful to identify: +
+ ++ Work with engineers, designers, and product managers on your team to prioritize these key + product animations for test coverage. +
+ ++ At a high level, it may be critical to identify your specific performance goals, and + focus on writing tests and collecting data around them. For example: +
+ ++ In all of these cases, you’ll want historical tracking which shows performance across + multiple versions of your application. +
+ ++ Application performance varies depending on the device it's running on. Some devices may + contain less memory, less powerful GPUs, or slower CPU chips. This means that animations + which perform well on one set of hardware may not on others, and worse, the slowdown may be the + result of a bottleneck in a different part of the pipeline. So, to account for this + variation in what a user might see, pick a range of devices to execute tests on: + current high end devices, low end devices, tablets, etc. Look for variation in CPU + performance, RAM, screen density, size, and so on. Tests that pass on a high end device + may fail on a low end device. +
+ ++ Tool suites like UIAutomator, + and Espresso are built to help + automate the action of a user moving through your application. These are simple + frameworks which mimic user interaction with your device. To use these frameworks, you + effectively create unique scripts, which run through a set of user-actions, and play them + out on the device itself. +
+ +
+ By combining these automated tests with dumpsys gfxinfo, you can quickly
+ create a reproducible system that allows you to execute a test, and measure the
+ performance information of that particular condition.
+
+ Once you have the ability to execute a UI test, and a pipeline to gather the data from a + single test, the next important step is to embrace a framework which can execute that + test multiple times, across multiple devices, and aggregate the resulting performance + data for further analysis by your development team. +
+ ++ It’s worth noting that UI testing frameworks (like UIAutomator) + run on the target device/emulator directly, while the performance information gathered + by dumpsys gfxinfo is driven by a host machine, sending commands over ADB. To + help bridge the automation of these separate entities, the MonkeyRunner framework was + developed: a scripting system that runs on your host machine, which can issue commands to + a set of connected devices, as well as receive data from them. +
+ ++ A set of scripts for properly automating UI Performance testing should, at a minimum, + be able to use monkeyRunner to accomplish the following tasks: +
+ ++ Once problem patterns or regressions are identified, the next step is identifying and + applying the fix. If your automated test framework preserves precise timing breakdowns + for frames, it can help you scrutinize recent suspicious code/layout changes (in the case + of regression), or narrow down the part of the system you’re analyzing when you switch to + manual investigation. For manual investigation, systrace is a great place to start, showing + precise timing information about every stage of the rendering pipeline, every thread and + core in the system, as well as any custom event markers you define. +
+ ++ It is important to note the difficulties in obtaining and measuring timings that come from + rendering performance. These numbers are, by nature, non-deterministic, and often + fluctuate depending on the state of the system, amount of memory available, thermal + throttling, and the last time a sun flare hit your area of the earth. The point is that + you can run the same test twice and get slightly different numbers that may be close to + each other, but not exact. +
+ ++ Properly gathering and profiling data in this manner means running the same test + multiple times, and accumulating the results as an average or median value (for the + sake of simplicity, let’s call this a ‘batch’). This gives you a rough approximation of + the performance of the test, while not needing exact timings. +
+ ++ Batches can be used between code changes to see the relative impact of those changes on + performance. If the average frame rate for the post-change batch is higher than for the + pre-change batch, then you generally have an overall win with respect to performance for that + particular change. +
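A batch comparison can be sketched in a few lines. The example below assumes each run has already been reduced to one frame-rate figure (higher is better) and uses the median so that a single anomalous run has less influence; the 2% tolerance and the function name `compare_batches` are hypothetical choices, not recommendations from the tooling.

```python
from statistics import median


def compare_batches(before_fps, after_fps, tolerance=0.02):
    """Compare two batches of per-run frame rates.

    Returns a (verdict, relative_change) pair, where verdict is 'improved',
    'regressed' or 'unchanged' and relative_change is measured against the
    pre-change median. Changes within the tolerance are treated as noise.
    """
    before = median(before_fps)
    after = median(after_fps)
    change = (after - before) / before
    if change > tolerance:
        return "improved", change
    if change < -tolerance:
        return "regressed", change
    return "unchanged", change
```

For example, a pre-change batch with a median of 55 fps and a post-change batch with a median of 59 fps gives a relative change of about 7%, which this sketch reports as an improvement.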
+ ++ This means that any Automated UI testing you do should take this concept into + consideration, and also account for any anomalies that might occur during a test. For + example, if your application performance suddenly dips, due to some device issue (that + isn’t caused by your application) then you may want to re-run the batch in order to get + less chaotic timings. +
+ ++ So, how many times should you run a test before the measurements become meaningful? 10 + times should be the minimum, with higher numbers like 50 or 100 yielding more accurate + results (of course, you’re now trading off time for accuracy). +