The Gate of the AOSP #4 : Gerrit, Memory & Performance
Memory and Performance
11th Kandroid Conference (www.kandroid.org), 2013. 3. 29.
Yang Jeongsoo (양정수, yangjeongsoo at gmail.com)
Lee Keimin (이경민, LG Electronics, keimin.lee at lge.com)
Memory
1. Introduction : Historical Distribution & Change History
2. Android Memory Management Overview
- Interpreting GC Log Messages
- Reference Management
- onLowMem(), onTrimMem(), OutOfMemoryError
3. OOM Case Study #1 : Snapshot Analysis w/ MAT
4. OOM Case Study #2 : Historical Analysis w/ kandroid MemTracer
5. Conclusion : Memory Issue Troubleshooting Flow
Performance
1. Introduction : Historical Distribution & Change History
2. Performance Analysis Tools
3. Case Study #1 : Scrolling performance
4. Case Study #2 : Android SMP Programming Guide
5. Conclusion : Performance Sensitive Paths
1. Introduction : Historical Distribution
Source : http://en.wikipedia.org/wiki/Android_version_history
[Chart annotation: memory issue occurrence area]
1. Introduction : Change History for Memory Mgt.
Garbage Collection
  ~ Froyo (2.2)
    • Stop-the-world
    • Full heap collection
    • Pause times often > 100ms
  Gingerbread (2.3) ~
    • Concurrent (mostly)
    • Partial collections
    • Pause times usually < 5ms
Bitmaps
  ~ Gingerbread (2.3)
    • Pixel data is stored in native memory
    • Freed via recycle() or finalizer
    • Hard to debug
  Honeycomb (3.1) ~
    • Pixel data is stored on the Dalvik heap
    • Freed synchronously by GC
    • Easy to debug
Large Heaps
  Honeycomb (3.1) ~
    • android:largeHeap="true"
+ No Swap(?), Low Memory Killer, Out of Memory Killer
2. Android Memory Mgt. Overview : Interpreting GC Log Messages
Pre-Honeycomb log format:
GC_HPROF_DUMP_HEAP freed 47K, 50% free 2730K/5379K, external 8825K/9762K, paused 2262ms
GC_EXPLICIT freed 39K, 49% free 3434K/6727K, external 1625K/2137K, paused 61ms
GC_EXTERNAL_ALLOC freed 113K, 48% free 2849K/5447K, external 2069K/2137K, paused 29ms
GC_FOR_MALLOC freed 672K, 46% free 4505K/8263K, external 1625K/2137K, paused 36ms
GC_CONCURRENT freed 469K, 35% free 5537K/8455K, external 1625K/2137K, paused 2ms+3ms
Log format from Honeycomb (API level 11):
GC_EXPLICIT freed 298K, 5% free 9395K/9820K, paused 5ms+4ms, total 46ms
GC_FOR_ALLOC freed 62K, 4% free 9981K/10356K, paused 14ms, total 14ms
GC_CONCURRENT freed 452K, 5% free 11442K/11924K, paused 2ms+3ms, total 32ms
GC_BEFORE_OOM freed <1K, 25% free 48653K/64327K, paused 20ms
Fields, in order: reason for GC, GCed size ("freed"), Java heap ("% free", allocated/footprint), bitmap heap ("external", allocated/limit, pre-Honeycomb only), pause time ("paused").
1. What is GC ? / What is GC Pause time ?
2. What is java heap ?
3. What is bitmap heap ?
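The field layout of the GC log messages above can be checked mechanically. The following is a minimal sketch, not the runtime's own code: the class name GcLogLine, its field names, and the regular expression are assumptions made for illustration. The "% free" recomputation uses integer arithmetic on the K values, which reproduces the sample lines above; the runtime itself may compute it from byte counts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses one Dalvik GC log message of the kind shown above. */
final class GcLogLine {
    // Groups: 1 reason, 2 freed, 3 % free, 4 allocated, 5 footprint,
    // 6/7 external allocated/limit (optional), 8/9 pause(s), 10 total (optional).
    private static final Pattern LINE = Pattern.compile(
            "(GC_\\w+) freed (<?\\d+)K, (\\d+)% free (\\d+)K/(\\d+)K"
            + "(?:, external (\\d+)K/(\\d+)K)?"
            + ", paused (\\d+)ms(?:\\+(\\d+)ms)?(?:, total (\\d+)ms)?");

    final String reason;
    final String freed;               // kept as text because of values like "<1K"
    final int percentFree;
    final long allocatedKb;
    final long footprintKb;
    final long externalAllocatedKb;   // -1 when the line has no "external" field
    final long externalLimitKb;       // -1 when the line has no "external" field
    final long pausedMs;              // sum of both pauses for concurrent collections

    private GcLogLine(Matcher m) {
        reason = m.group(1);
        freed = m.group(2) + "K";
        percentFree = Integer.parseInt(m.group(3));
        allocatedKb = Long.parseLong(m.group(4));
        footprintKb = Long.parseLong(m.group(5));
        externalAllocatedKb = m.group(6) == null ? -1 : Long.parseLong(m.group(6));
        externalLimitKb = m.group(7) == null ? -1 : Long.parseLong(m.group(7));
        long secondPause = m.group(9) == null ? 0 : Long.parseLong(m.group(9));
        pausedMs = Long.parseLong(m.group(8)) + secondPause;
    }

    static GcLogLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not a GC log line: " + line);
        }
        return new GcLogLine(m);
    }

    /** Recomputes "% free" as 100 - allocated * 100 / footprint, in integer arithmetic. */
    int recomputedPercentFree() {
        return (int) (100 - allocatedKb * 100 / footprintKb);
    }
}
```

Feeding logcat lines through GcLogLine.parse() and collecting allocatedKb, footprintKb and the external values per line gives the raw series for a heap-over-time chart.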
What Does a Garbage Collector Do?
- Allocating memory,
- Ensuring that any referenced objects remain in memory, and
- Recovering memory used by objects that are no longer reachable
GC Roots (every object is either reachable or unreachable from them)
- Local variables
- Active Java threads
- Static variables
- JNI references
Performance Metrics
- Throughput = 1 / (GC overhead)
- Pause time
- GC frequency
- Footprint
- Promptness
Tradeoff (time vs. space vs. frequency)
Heap Size   GC Time   GC Freq.
    ↓          ↓         ↑
    ↑          ↑         ↓
Reference: http://www.oracle.com/technetwork/java/javase/tech/memorymanagement-whitepaper-1-150020.pdf
How Does a Garbage Collector Work?
- Reference counting
- Tracing
Design Choices
- Serial vs. Parallel
- Concurrent vs. Stop-the-world
- Compacting vs. Non-compacting vs. Copying
Source: http://www.techpaste.com/2012/02/java-garbage-collectors-gc/
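The tracing approach can be illustrated with a toy mark phase: start from the GC roots, follow references, and treat everything that was never visited as garbage. This is a minimal sketch over a graph of object names, not how Dalvik implements its collector; the class TracingSketch and the map-based heap model are assumptions for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy model of the mark phase of a tracing collector. */
final class TracingSketch {
    /**
     * Returns every object reachable from the roots.
     * "references" maps an object name to the names of the objects it refers to.
     */
    static Set<String> mark(Map<String, List<String>> references, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String object = pending.pop();
            // add() returns false for objects that were already visited,
            // which also keeps reference cycles from looping forever.
            if (marked.add(object)) {
                pending.addAll(references.getOrDefault(object, Collections.<String>emptyList()));
            }
        }
        return marked;
    }
}
```

Two objects that only reference each other are never marked, so a tracing collector reclaims them, whereas plain reference counting would keep both alive.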
GC Reason           Partial  Concurrent  Preserving*  When is it triggered?
GC_FOR_[M]ALLOC     Y        N           Y            Not enough space for an "ordinary" Object to be allocated.
GC_CONCURRENT       Y        Y           Y            Automatic GC triggered by exceeding a heap occupancy threshold.
GC_EXPLICIT         N        Y           Y            Explicit GC via Runtime.gc(), VMRuntime.gc(), or SIGUSR1
GC_BEFORE_OOM       N        N           N            Final attempt to reclaim memory before throwing an OOM.
GC_EXTERNAL_ALLOC   -        -           -            GC to try to reduce heap footprint to allow more non-GC'ed memory.
GC_HPROF_DUMP_HEAP  N        N           N            GC to dump heap contents to a file, only used under WITH_HPROF
* Preserving softly reachable objects or not
[Chart: curFootprint, curAllocated, extAllocated, extLimit, and curFootprint + extLimit in KB (left axis, 0 to 35,000) per GC log sample, with curAllocated * 100 / curFootprint (%) on the right axis]
In a log line such as "50% free 2730K/5379K, external 8825K/9762K", the Java heap pair is curAllocated/curFootprint and the external pair is extAllocated/extLimit.
extAllocated vs. extLimit(?)
extAllocated is the sum of
• system bitmaps (1625K),
• user-defined bitmaps (drawables),
• view-inflated bitmaps (drawables),
• the dip-based image conversion buffer
(depends on image size; a process-wide image upscale buffer), and
• the first activity stack (20K).
2. Android Memory Mgt. Overview : Reference Mgt. / Low, Trim, Out of Mem
Platform Layer : Memory Management Features
Linux Kernel : LMK, OOM Killer
User Space Lib
  Bionic libc : dlmalloc
  C++ libutils : RefBase ▶ sp<>, wp<>
  Java core lib : java.lang.ref ▶ soft, weak, phantom (GC)
Framework APIs
  onLowMemory() and onTrimMemory() Callback
  OutOfMemoryError Exception
  android.os.Debug Class
  java.lang.Runtime Class
Tools
  DDMS
  dumpsys meminfo
  MAT
  valgrind
  libc.debug.malloc = 1, DDMS (ddms.cfg, native=true)
  /proc fs based utilities : top, procrank, smem
2. Android Memory Mgt. Overview : Reference Management
[Diagram: user space and kernel space. User applications sit on the C/C++ libraries, which sit on the system call interface, the kernel, the architecture dependent kernel code, and the hardware platform. The conventional Linux libraries (glibc with NPTL pthread, gnu-libstdc++) are shown beside the Android ones: bionic libc (dlmalloc), libcutils, libstdc++, gabi++, stlport, gnustl, libutils (RefBase, sp<>, wp<>), and libbinder.]
2. Android Memory Mgt. Overview : Reference Management – C++ RefBase
[Class diagram: the Binder classes build on RefBase and sp<>/wp<>: IBinder, BpBinder, BBinder, BpRefBase, IInterface, BpInterface<>, BnInterface<>, with ProcessState, IPCThreadState, and Parcel attached through <<use>> relationships.]
Source : 9th Kandroid Conference, Android Native Infrastructure : C++, Looper, Binder – Ahn JoonSeok, NHN Corp.
2. Android Memory Mgt. Overview : Reference Management – C++ RefBase
• sp : strong pointer (object lifecycle)
• wp : weak pointer (comparison, promotion to sp)
#include <iostream>
#include <utils/RefBase.h>

using namespace android;
using namespace std;

class Point : public RefBase {
private:
    int mx, my;
public:
    Point() : mx(1), my(2) { }
    Point(int x, int y) : mx(x), my(y) { }
    ~Point() { }
    // RefBase lifecycle hooks
    void onFirstRef() { }
    void onLastStrongRef(const void* id) { }
    void onLastWeakRef(const void* id) { }
    void printPoint(void) {
        cout << "point " << mx << " " << my << endl;
    }
};

int main(int argc, char** argv)
{
    {
        sp<Point> strp;
        {
            Point p(3,4);
            p.printPoint();
            Point *pp = new Point(3,4);   // raw pointer: never owned by an sp<>
            pp->printPoint();
            wp<Point> weakp(new Point(10, 20));
            strp = weakp.promote();       // a wp<> must be promoted to an sp<> before use
            (*strp).printPoint();
            strp->printPoint();
            strp.get()->printPoint();
            // Compile time errors (based on template and operator overloading):
            // weakp->printPoint();
            // weakp.printPoint();
            weakp.unsafe_get()->printPoint();
        }
    }
    return 0;
}
2. Android Memory Mgt. Overview : Reference Management – Java References
Lifecycle of Java Object
  Created → Initialized → In Use → Unreachable → Finalized
With the java.lang.ref package (introduced in JDK 1.2)
  Created → Initialized → Strongly Reachable → Softly Reachable → Weakly Reachable → Finalized → Phantom Reachable
Reference: http://www.kdgregory.com/index.php?page=java.refobj
• GC will attempt to preserve the Softly Reachable Object as long as possible,
but will collect it before throwing an OutOfMemoryError.
• GC is free to collect the Weakly Reachable Object at any time, with no attempt to
preserve it. In practice, the object will be collected during a major collection, but may
survive a minor collection.
• The Phantom Reachable Object has already been finalized, and the GC is ready to
reclaim its memory. It allows you to perform resource (non-memory) cleanup with
more flexibility than you get from finalizers.
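The difference between the reference strengths above is easiest to see in code. The following is a minimal sketch using only java.lang.ref; the class names SoftCache and ReferenceDemo are hypothetical and exist only for illustration. It shows the usual pattern for a softly reachable cache entry: always check get() for null, because the GC may have cleared the referent.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache whose values the GC may reclaim under memory pressure. */
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();

    void put(K key, V value) {
        entries.put(key, new SoftReference<>(value));
    }

    /** Returns the cached value, or null if it was never cached or has been collected. */
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key); // the referent was collected; drop the stale entry
        }
        return value;
    }
}

final class ReferenceDemo {
    /** A weak reference is never cleared while a strong reference to the object exists. */
    static boolean weakSurvivesWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // only a hint; the result below does not depend on it
        return weak.get() == strong;
    }
}
```

Once the last strong reference is gone, weak.get() may return null after any collection, while a soft referent is only guaranteed to be cleared before an OutOfMemoryError is thrown.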
2. Android Memory Mgt. Overview : onLowMem, onTrimMem, OutOf Memory
[Diagram: memory pressure handling across the Android stack]
LINUX KERNEL : the memory state moves from Normal to Low Mem to Out of Mem. LowMemoryKiller (/proc/<pid>/oom_adj), OutOfMemoryKiller (/proc/<pid>/oom_score), Ashmem (Shrinker).
APPLICATION FRAMEWORK : ActivityManager Service (Process Record, …), Package Manager, Window Manager, Resource Manager. It reaches the application through scheduleLowMemory() and scheduleTrimMemory().
APPLICATIONS : HelloAndroid with ActivityThread (Looper, Message Queue, H.handleMessage()), ViewRoot.handleMessage(), Activity, Service, Receiver, Provider, View.
LIBRARIES / RUNTIME : Dalvik Virtual Machine, Core Libraries. Hitting the Heap Limit raises OutOfMemoryError.
Callback or Error : onLowMemory()
  Descriptions : This callback is called after onTrimMemory() with trim level TRIM_MEMORY_RUNNING_CRITICAL.
  Practices : Clear cache
Callback or Error : onTrimMemory()
  Descriptions : Trim Level
    > 80 : scheduleDestroyActivities()
    80 : TRIM_MEMORY_COMPLETE
    60 : TRIM_MEMORY_MODERATE
    40 : TRIM_MEMORY_BACKGROUND
    20 : TRIM_MEMORY_UI_HIDDEN
    15 : TRIM_MEMORY_RUNNING_CRITICAL
    10 : TRIM_MEMORY_RUNNING_LOW
    5 : TRIM_MEMORY_RUNNING_MODERATE
  Practices : Clear cache
Callback or Error : OutOfMemoryError
  Descriptions : This error is thrown when a heap allocation request fails because of the process heap's limit. It is not related to the Linux kernel's OOM.
  Practices : Bitmap resampling, Logging
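The "Clear cache" practice can be tied to the trim levels with a small policy function. This is a sketch only: the constants copy the numeric values listed above so that the code runs outside Android, and the class TrimPolicy and its percentages are hypothetical choices, not a platform recommendation. In an app the level would arrive as the argument of onTrimMemory(int).

```java
/** Hypothetical policy mapping a trim level to how much of an in-memory cache to keep. */
final class TrimPolicy {
    // Values copied from the trim level list above.
    static final int TRIM_MEMORY_COMPLETE = 80;
    static final int TRIM_MEMORY_MODERATE = 60;
    static final int TRIM_MEMORY_BACKGROUND = 40;
    static final int TRIM_MEMORY_UI_HIDDEN = 20;
    static final int TRIM_MEMORY_RUNNING_CRITICAL = 15;
    static final int TRIM_MEMORY_RUNNING_LOW = 10;
    static final int TRIM_MEMORY_RUNNING_MODERATE = 5;

    /** Returns the percentage (0 to 100) of the cache to keep; the numbers are illustrative. */
    static int percentOfCacheToKeep(int level) {
        if (level >= TRIM_MEMORY_BACKGROUND) {
            return 0;    // process is in the background LRU list: release the whole cache
        }
        if (level >= TRIM_MEMORY_UI_HIDDEN) {
            return 50;   // UI no longer visible: keep only part of the cache
        }
        if (level >= TRIM_MEMORY_RUNNING_CRITICAL) {
            return 0;    // still running, but the system is critically low on memory
        }
        if (level >= TRIM_MEMORY_RUNNING_LOW) {
            return 50;
        }
        return 100;      // TRIM_MEMORY_RUNNING_MODERATE or lower: nothing to do yet
    }
}
```

The branches are ordered by numeric value because the running levels (5 to 15) and the background levels (40 to 80) describe two different situations, so a single "higher is worse" comparison would treat UI_HIDDEN as more severe than RUNNING_CRITICAL.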
3. OOM Case Study : Snapshot vs. Historical Analysis
[Charts: four plots of memory in KB (0 to 40,000) per sample. Series: External (alloc), Java (alloc), Java (footprint), External (limit), Heap (max/current)]
3. OOM Case Study : Snapshot vs. Historical Analysis – Activity Overhead
[Chart: memory in KB (0 to 35,000) over 20 samples, with a second plot labeled Action and Time (0 to 3,000,000)]
3. OOM Case Study : Snapshot vs. Historical Analysis – Bitmap Overhead
[Chart: memory in KB (0 to 35,000) over 22 samples]
3. OOM Case Study : Snapshot vs. Historical Analysis – External Alloc
[Chart: memory in KB (0 to 35,000) over 20 samples]
3. OOM Case Study #1 : Snapshot Analysis w/ MAT
MAT is a Java heap analyzer that helps you find memory leaks & reduce memory footprint
http://www.eclipse.org/mat/
Memory Leak
[Diagram: objects reachable from the GC Roots, unreachable objects, and objects that are reachable but unused; the reachable but unused objects are the memory leak]
Shallow Heap vs. Retained Heap
[Diagram: Dominator Tree of the objects O1, O2, O3, O4]
      Shallow  Retained
O1    1k       4k
O2    1k       1k
O3    1k       2k
O4    1k       1k
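The retained sizes in the O1 to O4 table follow from the dominator tree: an object's retained heap is its own shallow heap plus the retained heap of everything it dominates. The sketch below recomputes the table under an assumption the text does not state explicitly, namely that O1 dominates O2 and O3, and O3 dominates O4 (one tree that is consistent with the numbers). The class RetainedHeap is hypothetical and is not part of MAT.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: retained size computed from a dominator tree. */
final class RetainedHeap {
    /**
     * @param object    name of the object to measure
     * @param shallowKb shallow size of every object, in KB
     * @param dominated maps an object to the objects it immediately dominates
     * @return shallow size of the object plus the retained size of all objects it dominates
     */
    static long retainedKb(String object,
                           Map<String, Long> shallowKb,
                           Map<String, List<String>> dominated) {
        long total = shallowKb.get(object);
        for (String child : dominated.getOrDefault(object, Collections.<String>emptyList())) {
            total += retainedKb(child, shallowKb, dominated);
        }
        return total;
    }
}
```

With 1 KB per object this gives 4 KB for O1, 2 KB for O3, and 1 KB each for O2 and O4, matching the table. Freeing O1 would therefore release 4 KB, which is why the retained column is the one to sort by when hunting leaks.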
3. OOM Case Study #1 : Snapshot Analysis w/ MAT
Histogram View
Shows a list of classes sortable by # of instances, shallow heap size, or retained heap size.
Comparison
List objects with incoming references
List objects with outgoing references
Dominator Tree View
Path to GC Root
Typical Memory Leaks in Android
- Static field of non-static inner class
- Instance referenced inside a Singleton
Reference: http://developer.samsung.com/android/technical-docs/Memory-Profiler-Identifying-Potential-Problems
// Leak: instance referenced inside a Singleton.
// The static singleton outlives the Activity but keeps the Activity it was given as Context.
public class MainActivity2 extends Activity {
    SingletonClass mSingletonClass = null;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        mSingletonClass = SingletonClass.getInstance(this);
    }
}

class SingletonClass {
    private Context mContext = null;
    private static SingletonClass mInstance;

    private SingletonClass(Context context) {
        mContext = context;
    }

    public static SingletonClass getInstance(Context context) {
        if (mInstance == null) {
            mInstance = new SingletonClass(context);
        }
        return mInstance;
    }
}

// Leak: static field of a non-static inner class.
// The inner class instance holds an implicit reference to the Activity that created it.
public class MainActivity extends Activity {
    static MyLeakedInnerClass leakInstance = null;

    class MyLeakedInnerClass {
        int someInt;
    }

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        if (leakInstance == null)
            leakInstance = new MyLeakedInnerClass();
        setContentView(R.layout.activity_main);
    }
}
3. OOM Case Study #2 : Historical Analysis w/ kmemtracer
[Chart: External and Java memory in KB (0 to 40,000) over 91 samples, against the Dalvik Heap Limitation : 32,768 (32M)]
[Chart: External and Java memory in KB (0 to 35,000) at each traced lifecycle event. Events on the x-axis, in order:]
onCreate-Activity_A, inflate-Activity_A, onResume-Activity_A, onPause-Activity_A,
onCreate-Activity_B, inflate-Activity_B, onResume-Activity_B, onDestroy-Activity_A, onPause-Activity_B,
onCreate-Activity_C, inflate-Activity_C, onResume-Activity_C, onDestroy-Activity_B, onPause-Activity_C,
onCreate-Activity_D,
onCreate-Activity_E, inflate-Activity_E, onResume-Activity_E, onPause-Activity_E,
onCreate-Activity_F, inflate-Activity_F, onResume-Activity_F, onPause-Activity_F,
onRestart-Activity_D,
onRestart-Activity_E, onResume-Activity_E, onDestroy-Activity_F, onPause-Activity_E,
onRestart-Activity_C, onResume-Activity_C, onDestroy-Activity_E, onPause-Activity_C, onDestroy-Activity_C
How to monitor or track the lifecycle of an Activity?
1. Override lifecycle callbacks (onCreate, onStart, …) of your Activity class
2. Modify the android.app.Activity class in the framework
3. Use the Instrumentation framework
[Diagram: a Test package drives the Application package through android.test.InstrumentationTestRunner. In the same way, a Trace package drives the Application package through org.kandroid.memtracer.MemoryInstrumentation. Both are based on android.app.Instrumentation.]
android.app.Instrumentation
  callActivityOnCreate(Activity, Bundle)
  callActivityOnDestroy(Activity)
  callActivityOnNewIntent(Activity, Intent)
  callActivityOnPause(Activity)
  callActivityOnPostCreate(Activity, Bundle)
  callActivityOnRestart(Activity)
  callActivityOnRestoreInstanceState(Activity, Bundle)
  callActivityOnResume(Activity)
  callActivityOnSaveInstanceState(Activity, Bundle)
  callActivityOnStart(Activity)
  callActivityOnStop(Activity)
  callActivityOnUserLeaving(Activity)
  callApplicationOnCreate(Application)
How to get memory usage information ?
Native Heap (android.os.Debug)
    long nativeMax = Debug.getNativeHeapSize() / 1024;
    long nativeAllocated = Debug.getNativeHeapAllocatedSize() / 1024;
    long nativeFree = Debug.getNativeHeapFreeSize() / 1024;
Dalvik Heap (java.lang.Runtime)
    Runtime runtime = Runtime.getRuntime();
    // totalMemory() is the current heap footprint; Runtime.maxMemory() returns the heap limit.
    long dalvikMax = runtime.totalMemory() / 1024;
    long dalvikFree = runtime.freeMemory() / 1024;
    long dalvikAllocated = dalvikMax - dalvikFree;
PSS (android.os.Debug.MemoryInfo)
    Debug.MemoryInfo memInfo = new Debug.MemoryInfo();
    Debug.getMemoryInfo(memInfo);
    Fields of android.os.Debug.MemoryInfo:
    public int dalvikPss;  public int dalvikPrivateDirty;  public int dalvikSharedDirty;
    public int nativePss;  public int nativePrivateDirty;  public int nativeSharedDirty;
    public int otherPss;   public int otherPrivateDirty;   public int otherSharedDirty;
Allocation Count/Size (android.os.Debug)
    getGlobalAllocCount() getGlobalAllocSize() getGlobalClassInitCount() getGlobalClassInitTime()
    getGlobalExternalAllocCount() getGlobalExternalAllocSize() getGlobalExternalFreedCount() getGlobalExternalFreedSize()
    getGlobalFreedCount() getGlobalFreedSize() getGlobalGcInvocationCount()
    getThreadAllocCount() getThreadAllocSize() getThreadExternalAllocCount() getThreadExternalAllocSize()
    getThreadGcInvocationCount() getLoadedClassCount() printLoadedClasses(int)
    resetAllCounts() resetGlobalAllocCount() resetGlobalAllocSize() resetGlobalClassInitCount() resetGlobalClassInitTime()
    resetGlobalExternalAllocCount() resetGlobalExternalAllocSize() resetGlobalExternalFreedCount() resetGlobalExternalFreedSize()
    resetGlobalFreedCount() resetGlobalFreedSize() resetGlobalGcInvocationCount()
    resetThreadAllocCount() resetThreadAllocSize() resetThreadExternalAllocCount() resetThreadExternalAllocSize()
    resetThreadGcInvocationCount()
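The java.lang.Runtime part of the measurements above works on any JVM, so it can be tried outside a device. The following is a minimal sketch: the class HeapSnapshot and its CSV row layout are hypothetical and do not reproduce the kmemtracer file format. It separates the heap limit (maxMemory()) from the current footprint (totalMemory()).

```java
/** Hypothetical snapshot of the managed heap, taken with java.lang.Runtime only. */
final class HeapSnapshot {
    final long maxKb;        // upper limit the heap may grow to
    final long footprintKb;  // memory currently reserved for the heap
    final long freeKb;       // unused part of the current footprint
    final long allocatedKb;  // footprint minus free

    HeapSnapshot() {
        Runtime runtime = Runtime.getRuntime();
        maxKb = runtime.maxMemory() / 1024;
        footprintKb = runtime.totalMemory() / 1024;
        freeKb = runtime.freeMemory() / 1024;
        allocatedKb = footprintKb - freeKb;
    }

    /** Formats the snapshot as one CSV row: label,max,footprint,allocated,free (illustrative layout). */
    String toCsvRow(String label) {
        return label + "," + maxKb + "," + footprintKb + "," + allocatedKb + "," + freeKb;
    }
}
```

Taking one snapshot per lifecycle event, for example new HeapSnapshot().toCsvRow("onCreate-Activity_A"), yields rows that can be charted over time in a spreadsheet.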
How to use kmemtracer?
1. Create an Android Test Project for the trace package.
2. Add 'kmemtracer.jar' in the 'libs' directory.
3. Modify the name of the Instrumentation in the manifest file.
4. Install the trace package as well as the application package.
5. Start the instrumentation with 'am instrument' in the shell.
6. Do something with the application.
7. Pull the trace file 'kmemtrace.csv' from /sdcard/kmemtracer.
8. Open the trace file with Excel and create charts.
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.android.apis.test"
    android:versionCode="1"
    android:versionName="1.0" >
    <instrumentation
        android:name="org.kandroid.memtracer.MemoryInstrumentation"
        android:targetPackage="com.example.android.apis" />
    <application
        android:icon="@drawable/ic_launcher"
        android:label="@string/app_name" >
        <uses-library android:name="android.test.runner" />
    </application>
</manifest>
D:\dev\android-sdk\tools>adb shell am instrument -e class com.example.android.apis.ApiDemos com.example.android.apis.test/org.kandroid.memtracer.MemoryInstrumentation
D:\dev\android-sdk\tools>adb pull /sdcard/kmemtracer/kmemtrace.csv .
286 KB/s (2932 bytes in 0.010s)
How to customize kmemtracer?
1. Create a subclass extending MemoryInstrumentation class.
2. Override createMemoryTracer() method.
3. Override getMainActivityName() method.
Etc.) Custom MemoryTracer.ResultWriter, Precise tracing points with MemoryTracer.addSnapshot()
public class MyMemoryInstrumentation extends MemoryInstrumentation {
    // Columns to write to the trace file.
    private static final String[] MY_METRICS = {
        MemoryTracer.METRIC_KEY_LABEL,
        MemoryTracer.METRIC_KEY_JAVA_SIZE,
        MemoryTracer.METRIC_KEY_JAVA_ALLOCATED,
        MemoryTracer.METRIC_KEY_JAVA_FREE,
        MemoryTracer.METRIC_KEY_JAVA_PSS,
        MemoryTracer.METRIC_KEY_JAVA_PRIVATE_DIRTY,
        MemoryTracer.METRIC_KEY_JAVA_SHARED_DIRTY,
    };

    @Override
    protected MemoryTracer createMemoryTracer() {
        return new MemoryTracer(new MemoryTraceCsvWriter(MY_METRICS));
    }

    @Override
    protected String getMainActivityName() {
        return "com.example.android.apis.ApiDemos";
    }
}
…
<instrumentation
    android:name="com.example.android.apis.MyMemoryInstrumentation"
    android:targetPackage="com.example.android.apis" />
…
D:\dev\android-sdk\tools>adb shell am instrument com.example.android.apis.test/com.example.android.apis.test.MyMemoryInstrumentation
D:\dev\android-sdk\tools>adb pull /sdcard/kmemtracer/kmemtrace.csv ./my_kmemtrace.csv
286 KB/s (2932 bytes in 0.010s)
4. Conclusion : Memory Issue Troubleshooting Flow
OOM Occurrence
→ Check Platform Version
  • Pre-honeycomb or Not.
→ Check OOM Condition
  • Memory Leak vs. Memory Overload
→ Memory Leak
  • Analyze & compare heap dumps
  • Check objects' count & size
  • Check paths to GC Roots
→ Memory Overload
  • Check historical vs. snapshot issues
  • Select optimization areas
  • Optimize code for low footprint
    (if pre-honeycomb, optimize java footprint & external alloc.)
Performance
1. Introduction : Historical Distribution
Source : http://en.wikipedia.org/wiki/Android_version_history
[Chart annotation: versions that include a rich performance analysis framework]
1. Introduction : Change History for Performance Features
Froyo (2.2)
  • JIT
Gingerbread (2.3)
  • StrictMode
  • NativeActivity
Honeycomb (3.1)
  • SMP
  • GPUI
Jellybean (4.1)
  • Jank Buster
+ NDK (Cupcake), RenderScript, ….
2. Performance Analysis Tools : systrace
http://developer.android.com/tools/debugging/systrace.html
By default, the tool will capture events for 5 seconds. I simply scrolled the main timeline up and down. The resulting trace is a
stand-alone HTML document.
Tip: to navigate a systrace document, use the WASD keys to pan and zoom. W will zoom in on the mouse cursor.
A systrace document shows a lot of very interesting information. For instance, it shows you whether a process is scheduled,
and on which CPU. If you zoom in on the last row, called 10440: m.jv.falcon.pro you can see what the application was doing. If
you click on one of the performTraversals blocks you can see how long the application spent drawing a frame.
While most of the performTraversals are below the 16 ms threshold, some take more time, thus confirming the measurements
previously obtained (zoom in at the 935 ms marker to see such a block.)
More interestingly, you can see that the application sometimes misses a frame because it doesn’t manage to schedule
a draw operation. Zoom in at the 270 ms marker to find a deliverInputEvent block taking 25 ms. This block indicates that the
application spent 25 ms processing a touch event. Since the application is using a ListView, this is likely due to a problem in
the adapter but we’ll get back to this later.
Systrace was useful to not only confirm that the application is spending too much time drawing, but also to help us find
another potential performance bottleneck. It is a very useful tool but it has its limitations. It only provides high level data and
we must turn to other tools to understand what is truly going on.
Source : http://www.curious-creature.org/2012/12/01/android-performance-case-study/
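The 16 ms threshold in the passage above corresponds to a 60 frames per second display, where each frame has about 16.7 ms. As a minimal sketch, frame durations read off a trace can be checked against that budget as follows; the class FrameBudget, the 60 fps assumption, and the sample durations are illustrative and are not part of systrace.

```java
/** Hypothetical helper: counts frames that exceed the per-frame time budget. */
final class FrameBudget {
    /** Time available per frame at 60 frames per second, in milliseconds (about 16.7 ms). */
    static final double BUDGET_MS = 1000.0 / 60.0;

    /** Returns how many of the given frame durations (in ms) are over budget. */
    static int countSlowFrames(double[] frameDurationsMs) {
        int slow = 0;
        for (double duration : frameDurationsMs) {
            if (duration > BUDGET_MS) {
                slow++;
            }
        }
        return slow;
    }
}
```

For example, countSlowFrames(new double[] {8, 12, 25, 16, 17}) counts the 25 ms and 17 ms frames as over budget. Note that time spent outside drawing, such as a 25 ms deliverInputEvent block, consumes the same budget.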
Conclusion:
• Look over the whole trace first!
• What terms appear in the systrace report, and what do they mean?
• Architecture of systrace and atrace:
  framework internal code + kernel trace plugin features
2. Performance Analysis Tools : systrace
[Diagram: systrace architecture]
Host : systrace (python) talks to the device through adb and adbd, and runs atrace:
• atrace_args = ['adb', 'shell', 'setprop', 'debug.atrace.tags.enableflags', hex(flags)]
• atrace_args = ['adb', 'shell', 'atrace', '-z'] + args
Device : atrace performs startTrace, stopTrace, and dumpTrace on the Linux Kernel (ftrace feature) : trace_marker, trace, tracing_on, …
Tracing Point (Java) : android.os.Trace, through the android_os_Trace jni
Tracing Point (Native) : ATRACE_CALL(), ATRACE_INT(), ScopedTrace, Tracer
Viewer : https://code.google.com/p/trace-viewer/
/init.trace.rc
## Permissions to allow system-wide tracing to the kernel trace buffer.
on boot
# Allow writing to the kernel trace log.
chmod 0222 /sys/kernel/debug/tracing/trace_marker
# Allow the shell group to enable (some) kernel tracing.
chown root shell /sys/kernel/debug/tracing/trace_clock
chown root shell /sys/kernel/debug/tracing/buffer_size_kb
chown root shell /sys/kernel/debug/tracing/options/overwrite
chown root shell /sys/kernel/debug/tracing/events/sched/sched_switch/enable
chown root shell /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable
chown root shell /sys/kernel/debug/tracing/events/power/cpu_frequency/enable
chown root shell /sys/kernel/debug/tracing/events/power/cpu_idle/enable
chown root shell /sys/kernel/debug/tracing/events/cpufreq_interactive/enable
chown root shell /sys/kernel/debug/tracing/tracing_on
chmod 0664 /sys/kernel/debug/tracing/trace_clock
chmod 0664 /sys/kernel/debug/tracing/buffer_size_kb
chmod 0664 /sys/kernel/debug/tracing/options/overwrite
chmod 0664 /sys/kernel/debug/tracing/events/sched/sched_switch/enable
chmod 0664 /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable
chmod 0664 /sys/kernel/debug/tracing/events/power/cpu_frequency/enable
chmod 0664 /sys/kernel/debug/tracing/events/power/cpu_idle/enable
chmod 0664 /sys/kernel/debug/tracing/events/cpufreq_interactive/enable
chmod 0664 /sys/kernel/debug/tracing/tracing_on
# Allow only the shell group to read and truncate the kernel trace.
chown root shell /sys/kernel/debug/tracing/trace
chmod 0660 /sys/kernel/debug/tracing/trace
shell@android:/sys/kernel/debug/tracing $ cat README
tracing mini-HOWTO:
# mount -t debugfs nodev /sys/kernel/debug
# cat /sys/kernel/debug/tracing/available_tracers
wakeup preemptirqsoff preemptoff irqsoff function sched_switch nop
# cat /sys/kernel/debug/tracing/current_tracer
nop
# echo sched_switch > /sys/kernel/debug/tracing/current_tracer
# cat /sys/kernel/debug/tracing/current_tracer
sched_switch
# cat /sys/kernel/debug/tracing/trace_options
noprint-parent nosym-offset nosym-addr noverbose
# echo print-parent > /sys/kernel/debug/tracing/trace_options
# echo 1 > /sys/kernel/debug/tracing/tracing_enabled
# cat /sys/kernel/debug/tracing/trace > /tmp/trace.txt
# echo 0 > /sys/kernel/debug/tracing/tracing_enabled
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
Tracing Point (Java) : android.os.Trace
jsyang@jsyang-desktop:~/android$ jgrep Trace.traceBegin
./frameworks/base/services/java/com/android/server/wm/WindowManagerService.java
Trace.traceBegin(Trace.TRACE_TAG_WINDOW_MANAGER, "wmAnimate");
Trace.traceBegin(Trace.TRACE_TAG_WINDOW_MANAGER, "wmLayout");
Trace.traceBegin(Trace.TRACE_TAG_WINDOW_MANAGER, "wmLayout");
Trace.traceBegin(Trace.TRACE_TAG_WINDOW_MANAGER, "wmUpdateFocus");
./frameworks/base/core/java/android/view/ViewRootImpl.java
Trace.traceBegin(Trace.TRACE_TAG_VIEW, "performTraversals");
Trace.traceBegin(Trace.TRACE_TAG_VIEW, "measure");
Trace.traceBegin(Trace.TRACE_TAG_VIEW, "layout");
Trace.traceBegin(Trace.TRACE_TAG_VIEW, "draw");
Trace.traceBegin(Trace.TRACE_TAG_VIEW, "deliverInputEvent");
./frameworks/base/core/java/android/view/HardwareRenderer.java
Trace.traceBegin(Trace.TRACE_TAG_VIEW, "getDisplayList");
Trace.traceBegin(Trace.TRACE_TAG_VIEW, "drawDisplayList");
./frameworks/base/core/java/android/content/AbstractThreadedSyncAdapter.java
Trace.traceBegin(Trace.TRACE_TAG_SYNC_MANAGER, mAuthority);
./frameworks/base/core/java/android/app/LoadedApk.java
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "broadcastReceiveReg");
./frameworks/base/core/java/android/app/ActivityThread.java
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityStart");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityRestart");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityPause");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityPause");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityStop");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityStop");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityShowWindow");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityHideWindow");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityResume");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityDeliverResult");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityDestroy");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "bindApplication");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityNewIntent");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "broadcastReceiveComp");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "serviceCreate");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "serviceBind");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "serviceUnbind");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "serviceStart");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "serviceStop");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "requestThumbnail");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "configChanged");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "lowMemory");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "activityConfigChanged");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "backupCreateAgent");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "backupDestroyAgent");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "providerRemove");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "broadcastPackage");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "sleeping");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "setCoreSettings");
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "trimMemory");
Tracing Point : Java (android.os.Trace) / Native (ATRACE_CALL())
2. Performance Analysis Tools : systrace
./frameworks/native/libs/gui/BufferQueue.cpp
BufferQueue::acquireBuffer()
BufferQueue::cancelBuffer()
BufferQueue::connect()
BufferQueue::dequeueBuffer()
BufferQueue::disconnect()
BufferQueue::query()
BufferQueue::queueBuffer()
BufferQueue::releaseBuffer()
BufferQueue::requestBuffer()
BufferQueue::setBufferCountServer()
BufferQueue::setSynchronousMode()
./frameworks/native/libs/gui/SurfaceTexture.cpp
SurfaceTexture::attachToContext()
SurfaceTexture::detachFromContext()
SurfaceTexture::updateTexImage()
./frameworks/native/libs/gui/SurfaceTextureClient.cpp
SurfaceTextureClient::cancelBuffer()
SurfaceTextureClient::connect()
SurfaceTextureClient::dequeueBuffer()
SurfaceTextureClient::disconnect()
SurfaceTextureClient::query()
SurfaceTextureClient::queueBuffer()
SurfaceTextureClient::setBufferCount()
SurfaceTextureClient::setBuffersDimensions()
SurfaceTextureClient::setBuffersTransform()
SurfaceTextureClient::setBuffersUserDimensions()
SurfaceTextureClient::setCrop()
SurfaceTextureClient::setScalingMode()
SurfaceTextureClient::setSwapInterval()
./frameworks/native/libs/ui/GraphicBufferAllocator.cpp
GraphicBufferAllocator::alloc()
GraphicBufferAllocator::free()
./frameworks/native/libs/ui/GraphicBufferMapper.cpp
GraphicBufferMapper::lock()
GraphicBufferMapper::registerBuffer()
GraphicBufferMapper::unlock()
GraphicBufferMapper::unregisterBuffer()
./frameworks/native/services/surfaceflinger/SurfaceFlinger.cpp
SurfaceFlinger::captureScreenImplLocked()
SurfaceFlinger::computeVisibleRegions()
SurfaceFlinger::handlePageFlip()
SurfaceFlinger::handleRepaint()
SurfaceFlinger::handleTransaction()
SurfaceFlinger::onMessageReceived()
SurfaceFlinger::postFramebuffer()
SurfaceFlinger::renderScreenToTextureLocked()
SurfaceFlinger::turnElectronBeamOffImplLocked()
./frameworks/native/services/surfaceflinger/Layer.cpp
Layer::doTransaction()
Layer::lockPageFlip()
Layer::onDraw()
Layer::unlockPageFlip()
./frameworks/native/opengl/libs/EGL/eglApi.cpp
eglBeginFrame()
eglSwapBuffers()
./frameworks/av/media/libstagefright/AwesomePlayer.cpp
AwesomeNativeWindowRender::render()
AwesomePlayer::finishSeekIfNecessary()
AwesomePlayer::finishSetDataSource_l()
AwesomePlayer::initAudioDecoder()
AwesomePlayer::initRenderer_l()
AwesomePlayer::initVideoDecoder()
AwesomePlayer::invoke()
AwesomePlayer::notifyVideoSize_l()
AwesomePlayer::onStreamDone()
AwesomePlayer::onVideoEvent()
AwesomePlayer::pause()
AwesomePlayer::play()
AwesomePlayer::postVideoEvent_l()
AwesomePlayer::prepare()
AwesomePlayer::prepareAsync()
AwesomePlayer::seekTo()
AwesomePlayer::selectTrack()
2. Performance Analysis Tools : systrace
Tracing Point : Java (android.os.Trace) / Native (ATRACE_INT())
2. Performance Analysis Tools : systrace
./frameworks/base/services/input/InputDispatcher.cpp
ATRACE_INT("iq", mInboundQueue.count());
ATRACE_INT(counterName, connection->outboundQueue.count());
ATRACE_INT(counterName, connection->waitQueue.count());
./frameworks/native/libs/gui/BufferQueue.cpp
ATRACE_INT(mConsumerName.string(), mQueue.size());
ATRACE_INT(mConsumerName.string(), mQueue.size());
./frameworks/native/services/surfaceflinger/DisplayHardware/HWComposer.cpp
ATRACE_INT("VSYNC", ++mVSyncCount&1);
./frameworks/native/opengl/libs/EGL/eglApi.cpp
ATRACE_INT("GPU Frames Outstanding", thread->mQueue.size());
ATRACE_INT("GPU Frames Outstanding", mQueue.size());
./frameworks/av/media/libstagefright/AwesomePlayer.cpp
ATRACE_INT("Video Lateness (ms)", latenessUs / 1E3);
ATRACE_INT("Video Lateness (ms)", latenessUs / 1E3);
./frameworks/av/services/audioflinger/FastMixer.cpp
ATRACE_INT(traceName, framesReady);
ATRACE_INT("cycle_ms", monotonicNs / 1000000);
ATRACE_INT("load_us", loadNs / 1000);
[Figure: frame rendering pipeline between an Activity and SurfaceFlinger. Stage labels: Event, Set Property Value, Invalidate, Measure & Layout, Prepare Draw, Update DisplayList, Draw DisplayList, Swap Buffers, Dequeue Buffer, Enqueue Buffer, Composite Windows, Post Buffer. Summary labels: Something Happens, Draw, Display.]
2. Performance Analysis Tools : systrace
Graphical viewer for execution logs
Timeline Panel
Profile Panel
① Time spent in the method + time spent in any called functions (children)
② Time spent in the method
③ Number of calls to the method + number of recursive calls
Number of calls out of the total number of calls made to that method
④ Average time spent in the method
2. Performance Analysis Tools : traceview
// Start tracing; the output goes to "/sdcard/kandroid.trace".
Debug.startMethodTracing("kandroid");
// ... code to profile ...
// Stop tracing.
Debug.stopMethodTracing();
How to generate trace files? (Pros & Cons)
Use the android.os.Debug class in your code
• Precise control
• Need to modify code
Use the method profiling feature in DDMS
• Easy to do
• No code changes needed
• Less precise
Use 'am profile' in the shell
2. Performance Analysis Tools : traceview
2. Performance Analysis Tools : traceview
Restrictions
• If you are using the Debug class, your device or emulator must have an SD card and your application must have permission to write to the SD card.
• If you are using DDMS, Android 1.5 devices are not supported.
• If you are using DDMS, Android 2.1 and earlier devices must have an SD card present and your application must have permission to write to the SD card.
• If you are using DDMS, Android 2.2 and later devices do not need an SD card. The trace log files are streamed directly to your development machine.
• If the system reaches the maximum buffer size (8 MB by default) before stopMethodTracing() is called, the system stops tracing and sends a notification to the console.
• Interpreted code will run more slowly when profiling is enabled. Don't try to generate absolute timings from the profiler results.
• JIT is disabled when profiling is enabled (?)
Known Issues
• If a thread exits during profiling, the thread name is not emitted.
• The VM reuses thread IDs. If a thread stops and another starts, they may get the same ID.
Other Tools
• dmtracedump can generate graphical call-stack diagrams from trace log files.
• android.os.Debug.start[stop]NativeTracing() enables qemu tracing.
  - Traces every CPU instruction of every process, including kernel code.
  - Can have more complete information, including all context switches and cache misses.
  - Works only inside the qemu emulator.
android:sdk $ cd platform-tools/
android:platform-tools $ adb shell
dumpsys SurfaceFlinger

    type    |  handle  |        source crop        |           frame           name
------------+----------+---------------------------+--------------------------------
        HWC | 41a2fa08 | [ 0, 50, 720, 1184] | [ 0, 50, 720, 1184] com.facebook.katana
        HWC | 41a2fa60 | [ 0, 0, 720, 50] | [ 0, 0, 720, 50] StatusBar
        HWC | 41101f68 | [ 0, 0, 720, 96] | [ 0, 1184, 720, 1280] NavigationBar

    type    |  handle  |        source crop        |           frame           name
------------+----------+---------------------------+--------------------------------
        HWC | 41a2f910 | [ 0, 50, 720, 1184] | [ 0, 50, 720, 1184] org.kandroid.adapterview
        HWC | 41a2fa60 | [ 0, 0, 720, 50] | [ 0, 0, 720, 50] StatusBar
        HWC | 411c8fa8 | [ 0, 0, 720, 96] | [ 0, 1184, 720, 1280] NavigationBar

    type    |  handle  |        source crop        |           frame           name
------------+----------+---------------------------+--------------------------------
       GLES | 40e802a8 | [ 0, 50, 720, 1184] | [ 0, 50, 720, 1184] org.kandroid.testApp.MainActivity
       GLES | 41ac0368 | [ 0, 0, 720, 50] | [ 0, 0, 720, 50] StatusBar
       GLES | 400e0bd8 | [ 0, 0, 720, 96] | [ 0, 1184, 720, 1280] NavigationBar
Is there a meaningful performance difference
between HWC and GLES composition?
2. Performance Analysis Tools : dumpsys SurfaceFlinger
[Chart: stacked per-frame times in ms (axis 0 to 30) over 128 frames, with series Draw, Process, and Execute and a reference line at 16.7 ms.]
• Profile GPU rendering
• 128 frames
• 16.7 ms = 1000 ms / 60 fps
• Draw : Update DisplayList
• Process : Draw DisplayList
• Execute : Swap Buffers
2. Performance Analysis Tools : dumpsys gfxinfo <pid>
Each column gives an estimate of how long each frame takes to render:
1. Draw is the time spent building display lists in Java. It indicates how much time is spent running methods such
as View.onDraw(Canvas).
2. Process is the time spent by Android’s 2D renderer to execute the display lists. The more Views in your hierarchy,
the more drawing commands must be executed.
3. Execute is the time it took to send a frame to the compositor. This part of the graph is usually small.
Reminder: to render smoothly at 60 fps, each frame must take less than 16 ms to complete.
About Execute: if Execute takes a long time, it means you are running ahead of the graphics pipeline. Android can
have up to 3 buffers in flight and if you need another one the application will block until one of these buffers is freed up.
This can happen for two reasons. The first one is that your application is quick to draw on the Dalvik side but its display
lists take a long time to execute on the GPU. The second reason is that your application took a long time to execute the
first few frames; once the pipeline is full it will not catch up until the animation is done. This is something we’d like to
improve in a future version of Android.
Source : http://www.curious-creature.org/2012/12/01/android-performance-case-study/
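The 16.7 ms budget can be checked mechanically once the three columns have been collected. The following is a minimal sketch in plain Java, assuming each row holds three whitespace-separated millisecond values in the order Draw, Process, Execute; the class name FrameBudgetSketch and the sample values are hypothetical and not part of any Android tool.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: flag frames whose Draw + Process + Execute time exceeds the 60 fps budget. */
final class FrameBudgetSketch {
    /** 1000 ms / 60 fps, the per-frame budget quoted on the slide (about 16.7 ms). */
    static final double FRAME_BUDGET_MS = 1000.0 / 60.0;

    /** Parses one "Draw Process Execute" row (milliseconds) and returns the sum. */
    static double totalMs(String row) {
        String[] cols = row.trim().split("\\s+");
        if (cols.length != 3) {
            throw new IllegalArgumentException("expected 3 columns: " + row);
        }
        double sum = 0;
        for (String c : cols) {
            sum += Double.parseDouble(c);
        }
        return sum;
    }

    /** Returns the indices of the rows whose total exceeds the frame budget. */
    static List<Integer> slowFrames(List<String> rows) {
        List<Integer> slow = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            if (totalMs(rows.get(i)) > FRAME_BUDGET_MS) {
                slow.add(i);
            }
        }
        return slow;
    }
}
```

Summing the three columns is a simplification: it treats the stages as strictly sequential, which is the same reading the stacked bar chart suggests.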
2. Performance Analysis Tools : dumpsys gfxinfo <pid>
Caution :
Hierarchy Viewer does not work on a real target device running a user-mode build.
In that case, you can use the custom ViewServer written by Romain Guy.
https://github.com/romainguy/ViewServer
2. Performance Analysis Tools : Overdraw Visualizer – Hierarchy Viewer
2. Performance Analysis Tools : Overdraw Visualizer – Tracer for OpenGL ES
2. Performance Analysis Tools : Overdraw Visualizer – Dev Opt.
It is easy to interpret the results if you remember the meaning of each color:
• No color means there is no overdraw. The pixel was painted only once. In this example, you can see that the background is intact.
• Blue indicates an overdraw of 1x. The pixel was painted twice. Large blue areas are acceptable (if the entire window is blue, you can get rid of one layer.)
• Green indicates an overdraw of 2x. The pixel was painted three times. Medium-sized green areas are acceptable but you should try to optimize them away.
• Light red indicates an overdraw of 3x. The pixel was painted four times. Small light red areas are acceptable.
• Dark red indicates an overdraw of 4x or more. The pixel was painted 5 times or more. This is wrong. Fix it.
Based on this information you can see that Settings is a well behaved application that does not require any extra work.
There is a little bit of red in the switches but nothing worth our efforts.
Transparent pixels: look closely at the previous screenshots. Each icon is painted blue. You can see that the transparent pixels of the bitmaps count against your overdraw. Transparent pixels must be processed by the GPU and can be expensive. Android uses optimizations to avoid drawing transparent pixels in layers and 9-patches so you should only worry about bitmaps.
Overdraw and the GPU: there are two types of mobile GPU architectures. The first uses deferred rendering, for instance ImaginationTech’s SGX series. This architecture allows the GPU to detect and fix overdraw in specific situations (it doesn’t work if you are blending transparent or translucent pixels.) The second architecture uses immediate rendering and can be found in NVIDIA’s Tegra GPUs. This architecture cannot optimize overdraw for you, which is why I like to test on Nexus 7.
There are many advantages and disadvantages to both architectures but it’s beyond the scope of this article. Just know that both work really well.
Source : http://www.curious-creature.org/2012/12/01/android-performance-case-study/
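The color legend above is a simple mapping from "times painted" to a tint. As a quick reference, here is a small sketch in plain Java; the class and method names are hypothetical and this is not how the platform implements the overlay.

```java
/** Sketch: the overdraw legend as a lookup from paint count to overlay color. */
final class OverdrawLegendSketch {
    /**
     * @param timesPainted how many times the pixel was painted in one frame (at least 1)
     * @return the overlay color described in the legend
     */
    static String colorFor(int timesPainted) {
        if (timesPainted < 1) {
            throw new IllegalArgumentException("a visible pixel is painted at least once");
        }
        switch (timesPainted) {
            case 1:  return "none";      // no overdraw
            case 2:  return "blue";      // overdraw 1x
            case 3:  return "green";     // overdraw 2x
            case 4:  return "light red"; // overdraw 3x
            default: return "dark red";  // overdraw 4x or more
        }
    }
}
```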
2. Performance Analysis Tools : Overdraw Visualizer – Dev Opt.
[Figure: an AdapterView asks its Adapter for views ("Gimme views") through View getView(pos, convertView, parent). Three getView() variants are shown: Dumb (uses neither convertView nor a holder), Recycling Views (uses convertView), and ViewHolder (uses convertView and a view holder).]
Dumb approach :
• For each position : Adapter.getView()
• A new View is returned : Expensive
• What if I have 1,000,000 items?
3. Case Study #1 : Scrolling Performance
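To make the difference between the three getView() variants concrete, the sketch below models them in plain Java and counts the two expensive operations. It is a simulation only: Row stands in for an inflated row View, inflate() for LayoutInflater.inflate(), and findLabel() for findViewById(); none of these are Android classes, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch: counts simulated inflations and child lookups for three getView() styles. */
final class RecyclingSketch {
    enum Mode { DUMB, RECYCLING, VIEW_HOLDER }

    /** Stand-in for an inflated row; NOT android.view.View. */
    static final class Row {
        String text;
        Object tag; // plays the role of View.setTag() / getTag()
    }

    int inflations; // simulated LayoutInflater.inflate() calls
    int lookups;    // simulated findViewById() calls

    private Row inflate() { inflations++; return new Row(); }
    private Row findLabel(Row row) { lookups++; return row; }

    Row getView(Mode mode, int position, Row convertView) {
        // DUMB ignores convertView; the other modes reuse it when it is available.
        Row row = (mode == Mode.DUMB || convertView == null) ? inflate() : convertView;
        Row label;
        if (mode == Mode.VIEW_HOLDER) {
            if (row.tag == null) {
                row.tag = findLabel(row); // look up once, cache the result in the tag
            }
            label = (Row) row.tag;
        } else {
            label = findLabel(row);       // look up on every bind
        }
        label.text = "item " + position;
        return row;
    }

    /** Scrolls through total positions with visible rows on screen; returns {inflations, lookups}. */
    static int[] simulate(Mode mode, int total, int visible) {
        RecyclingSketch adapter = new RecyclingSketch();
        Deque<Row> onScreen = new ArrayDeque<>();
        for (int pos = 0; pos < total; pos++) {
            // Once the screen is full, the row that scrolls off is offered as convertView.
            Row scrap = (onScreen.size() == visible) ? onScreen.removeFirst() : null;
            onScreen.addLast(adapter.getView(mode, pos, scrap));
        }
        return new int[] { adapter.inflations, adapter.lookups };
    }
}
```

With 1,000 positions and 10 visible rows, the dumb variant inflates 1,000 rows, recycling inflates 10 but still performs 1,000 lookups, and the holder variant performs 10 of each.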
[Chart: scrolling FPS (axis 0 to 60) for the Dumb, Recycling views, and ViewHolder adapters.]
Test Environments : List of 10,000 items, Nexus One device, Froyo
Source : Tuesday, June 1, 2010, Google I/O, The World of List View
3. Case Study #1 : Scrolling Performance
How to Generate Touch Events?
android.view.IWindowManager
• void injectPointerEvent(MotionEvent motionevent, boolean flag)
• Internal, unpublished API
• INJECT_EVENTS permission required
android.app.Instrumentation
• void sendPointerSync(MotionEvent event)
• INJECT_EVENTS permission required
MonkeyDevice in monkeyrunner
• void drag(tuple start, tuple end, float duration, integer steps)
• Default duration is 1.0 sec
• Default number of steps is 10
UiDevice in uiautomator
• boolean swipe(Point[] segments, int segmentSteps)
• boolean swipe(int startX, int startY, int endX, int endY, int steps)
• Takes 5 ms per step
Event injection to /dev/input/eventX
• Use getevent / sendevent in the shell
References
• http://developer.android.com/tools/help/monkeyrunner_concepts.html
• http://developer.android.com/tools/help/uiautomator/index.html
• https://code.google.com/p/android-event-injector/
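Because a uiautomator swipe is throttled to about 5 ms per step, the steps argument effectively sets the swipe duration. The helper below is a small plain-Java sketch of that arithmetic; SwipeStepsSketch and stepsFor() are hypothetical names, not part of uiautomator.

```java
/** Sketch: convert a desired swipe duration into a step count at 5 ms per step. */
final class SwipeStepsSketch {
    /** Per-step throttle quoted on the slide. */
    static final int MS_PER_STEP = 5;

    /** Returns the number of steps that approximates the requested duration (at least 1). */
    static int stepsFor(int durationMs) {
        if (durationMs <= 0) {
            throw new IllegalArgumentException("duration must be positive");
        }
        return Math.max(1, Math.round(durationMs / (float) MS_PER_STEP));
    }
}
```

For example, a half-second swipe would use stepsFor(500), which is 100 steps, as the steps argument of swipe(startX, startY, endX, endY, steps).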
3. Case Study #1 : Scrolling Performance – Measurement (Touch Event, FPS)
How to Calculate the Frame Rate (FPS)?
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    // 'view' is the View whose frame rate is measured (its setup is not shown on the slide).
    ViewTreeObserver observer = view.getViewTreeObserver();
    observer.addOnPreDrawListener(new ViewTreeObserver.OnPreDrawListener() {
        @Override
        public boolean onPreDraw() {
            trackFPS();   // called once per frame, just before drawing
            return true;  // proceed with the draw
        }
    });
}

private long mFpsStartTime = -1;
private long mFpsPrevTime = -1;
private int mFpsNumFrames;

private void trackFPS() {
    long nowTime = System.currentTimeMillis();
    if (mFpsStartTime < 0) {
        // First frame: start the measurement window.
        mFpsStartTime = mFpsPrevTime = nowTime;
        mFpsNumFrames = 0;
    } else {
        ++mFpsNumFrames;
        long frameTime = nowTime - mFpsPrevTime;
        long totalTime = nowTime - mFpsStartTime;
        Log.v(TAG, "Frame time:\t" + frameTime);
        mFpsPrevTime = nowTime;
        if (totalTime > 1000) {
            // Report the average FPS roughly once per second, then restart the window.
            float fps = (float) mFpsNumFrames * 1000 / totalTime;
            Log.v(TAG, "\tFPS:\t" + fps);
            mFpsStartTime = nowTime;
            mFpsNumFrames = 0;
        }
    }
}
Source taken from android.view.ViewRootImpl
3. Case Study #1 : Scrolling Performance – Measurement (Touch Event, FPS)
[Chart: scrolling FPS (axis 0.0 to 70.0) for the Dumb, Recycling Views, and View Holder adapters in four configurations: Jellybean (27 rows), Gingerbread (24 rows), Jellybean (2 rows), Gingerbread (2 rows).]
Test Environments :
• 3 views per 1 row (LinearLayout, TextView, ImageView = null)
• Nexus One (Gingerbread) / Galaxy Nexus (JellyBean)
• Touch Event Generation : Gingerbread (Manual), JellyBean (Auto)
• Auto Event Gen. Method : getevent / custom sendevent
3. Case Study #1 : Scrolling Performance – Measurement (Result #1)
This result means :
• View inflation cost
• findViewById cost
• GC cost & change history
[Chart: FPS (axis 0 to 60) at 9 sample points across scroll events #1, #2, and #3, comparing View Holder with Async and View Holder with no Async.]
Test Environments :
• 2 rows in the screen
• 3 views per 1 row (LinearLayout, TextView, ImageView)
• Galaxy Nexus (Jellybean)
• Touch Event Generation - Auto Generation
• Bitmap Decoding Method (with Async / with No Async)
This result raises some additional questions.
• What is FPS?
• What are the details of the bitmap overhead?
3. Case Study #1 : Scrolling Performance – Measurement (Result #2)
[Chart: time in ms (axis 0.00 to 60.00) for File load vs. Bitmap Decode.]
File load (read) 0.27 ms
Bitmap Decode 53.29 ms
This result means
• Most of the bitmap overhead comes from decoding (53.29 ms); very little comes from reading the file (0.27 ms).
• How can we handle bitmap overhead?
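One common answer is to decode a subsampled bitmap so that fewer pixels are produced in the first place. The sketch below shows the usual power-of-two sample size calculation in plain Java, in the style of the Android "Loading Large Bitmaps Efficiently" training page; the class name and the memory estimate helper are hypothetical, and the 4 bytes per pixel figure assumes ARGB_8888. The returned value is what would be assigned to BitmapFactory.Options.inSampleSize.

```java
/** Sketch: choose a power-of-two sample size so the decoded bitmap still covers the target. */
final class SampleSizeSketch {
    /**
     * @param width     source image width in pixels
     * @param height    source image height in pixels
     * @param reqWidth  width actually needed on screen
     * @param reqHeight height actually needed on screen
     * @return largest power of two that keeps both decoded dimensions >= the requested ones
     */
    static int calculateInSampleSize(int width, int height, int reqWidth, int reqHeight) {
        int inSampleSize = 1;
        if (height > reqHeight || width > reqWidth) {
            int halfHeight = height / 2;
            int halfWidth = width / 2;
            while ((halfHeight / inSampleSize) >= reqHeight
                    && (halfWidth / inSampleSize) >= reqWidth) {
                inSampleSize *= 2;
            }
        }
        return inSampleSize;
    }

    /** Rough decoded size in bytes, assuming 4 bytes per pixel (ARGB_8888). */
    static long decodedBytes(int width, int height, int inSampleSize) {
        return (long) (width / inSampleSize) * (height / inSampleSize) * 4L;
    }
}
```

For a 2048x1536 source shown in a 100x100 thumbnail, the sample size comes out as 8, which shrinks the decoded bitmap from roughly 12 MB to under 200 KB.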
3. Case Study #1 : Scrolling Performance – Measurement (Result #3)
3. Case Study #1 : Scrolling Performance – Efficient Use of Bitmaps
Caching Bitmaps
• http://developer.android.com/training/displaying-bitmaps/cache-bitmap.html
• Allowing components to quickly reload processed images.
• Memory cache vs. disk cache vs. database (ContentProvider)
• LRU vs. other policy?
• Component-wide vs. process-wide
Reusing Bitmaps
• http://developer.android.com/training/displaying-bitmaps/manage-memory.html#inBitmap
• BitmapFactory.Options.inBitmap field introduced in Android 3.0 (API Level 11)
• Improved performance by removing both memory allocation and de-allocation.
Subsampling Bitmaps
• http://developer.android.com/training/displaying-bitmaps/load-bitmap.html
• BitmapFactory.Options.inSampleSize field
• Higher resolution image: no visible benefit, unnecessary memory consumption, and additional on-the-fly scaling.
Using recycle()
• http://developer.android.com/training/displaying-bitmaps/manage-memory.html#recycle
• To reclaim memory as soon as possible on Android 2.3.3 (API level 10) and lower.
• May require reference counting.
Using AsyncTask
• http://developer.android.com/training/displaying-bitmaps/process-bitmap.html
• Processing bitmaps in a background thread using AsyncTask.
• Handling concurrency with AsyncDrawable.
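For the memory-cache option with an LRU policy, the idea can be sketched in plain Java with an access-ordered LinkedHashMap. This is a stand-in for illustration: on Android one would more likely use android.util.LruCache and bound the cache by bitmap byte size rather than entry count. The class name LruSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: a small thread-safe LRU cache bounded by entry count. */
final class LruSketch<K, V> {
    private final LinkedHashMap<K, V> map;

    LruSketch(final int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries; // evict the least recently used entry
            }
        };
    }

    /** Returns the cached value or null; a hit marks the entry as recently used. */
    synchronized V get(K key) { return map.get(key); }

    synchronized void put(K key, V value) { map.put(key, value); }

    synchronized int size() { return map.size(); }
}
```

A component-wide cache would hold one such instance per adapter, while a process-wide cache would share a single instance, which ties back to the "component-wide vs. process-wide" question on the slide.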
Test Environments
• Number of files : 31
• Total file size : 1,070,931 bytes
• Avg. file size : 34,546 bytes
• Test repeat count : 100
• Read I/O only
[Charts: (1) total-file and single-file average read time in ms (axis 0 to 90) per storage type; (2) read time (axis 0 to 120) over the 100 repetitions. Series: Asset, File-Int, File-Ext, Database, Cursor.]
3. Case Study #1 : Scrolling Performance – I/O Performance Measurement
This result means
• Fast : asset, internal f/s, database cursor
• Normal : database
• Slow : external f/s
• Not included : write io, per size, per total size
• If a bitmap image is in the local storage, then …
[Diagram: an Activity using a remote DownloadProvider. Components: Remote Server, Remote File, Local File, DownloadProvider, DownloadService, Download Thread, Activity, AdapterView, ImageAdapter, Main Thread, AsyncDrawable.]
3. Case Study #1 : Scrolling Performance – Networked I/O Issues
What do you think about this topic?
• If you want to use a bitmap cache, what is the best choice?
• If you decide to use a custom component for download,
which is the better choice: a component based on
ContentProvider or on Service?
3. Case Study #1 : Scrolling Performance – Networked I/O Issues
Bitmap Management
• Reference Mgt. Method
• Holder (Recycle)
Cache Storage
• Memory
• Database
• File (I/E)
Cache Policy
• LRU
• …
Cache Boundary
• Process
• Component
Memory Handler
• onLowMem
• onTrimMem
• OOM
Some consideration for bitmap cache implementation
[Diagram: the pre-built Download Provider. Components: Download App (Activity, Receiver), DownloadManager, DownloadProvider (database), DownloadService, Update Thread, Download Handler, Download Queue, N Download Threads each with an HTTP Client, and the downloaded file. Connections are labeled ContentResolver and Intent.]
3. Case Study #1 : Scrolling Performance – Networked I/O Issues
Custom downloader implementation approaches : Provider based vs. Service based
• Why did Google implement the downloader component on top of a content provider rather than a service?
- System-wide binder connection management overhead
• Why does DownloadProvider use an Intent for the callback response?
- DownloadManager's enqueue() method uses ContentResolver's insert() method.
This mechanism can't use a callback interface like ContentObserver.
DownloadProvider must notify the completion of a download using an Intent.
• What do you think about the performance comparison between provider and service?
- The performance comparison of Binder and Intent
- The performance comparison of single-threaded and thread-pooled
3. Case Study #1 : Scrolling Performance – Networked I/O Issues
[Charts over 100 runs (axis 0 to 100):
Binder connection establish time (ms) : Gingerbread (13.76), JellyBean (17.75)
Binder call response time (ms), parcel with 1 int : Gingerbread (0.67), JellyBean (0.48)]
3. Case Study #1 : Scrolling Performance – Networked I/O Issues
[Chart: message delivery time (axis 0 to 160) over about 100 runs. Series: JellyBean (intent with interval), JellyBean (intent no interval), JellyBean (remote call : 0.501), JellyBean (internal call : 0.008).]
Message delivery time comparison : Intent vs. Binder Call
3. Case Study #1 : Scrolling Performance – Networked I/O Issues
[Diagram: Event (AdapterView), Invalidate, Adapter getView(), Measurement, Layout, Draw. Part A covers the getView() strategy (Dumb, Recycle, ViewHolder). Part B covers Bitmap Decoding and Network I/O (Storage, AsyncDrawable).]
3. Case Study #1 : Scrolling Performance – Conclusion
• If an AdapterView has many children,
part A matters as much as part B.
• Otherwise, the bottleneck is in part B.
• Within part B, the bottleneck is bitmap decoding.
4. Case Study #2 : Android SMP Programming Guide
Source : http://www.kandroid.org/board/board.php?board=AndroidBeginner&command=body&no=102
Reference : http://developer.android.com/training/articles/smp.html
Theory
• Memory consistency models
  - Processor consistency
  - CPU cache behavior
  - Observability
  - ARM's weak ordering
• Data memory barriers
  - Store/store and load/load
  - Load/store and store/load
  - Barrier instructions
  - Address dependencies and causal consistency
  - Memory barrier summary
• Atomic operations
  - Atomic essentials
  - Atomic + barrier pairing
  - Acquire and release
4. Case Study #2 : Android SMP Programming Guide
[Diagram: user-space / kernel-space software stack. Common layers: User Applications, System Call Interface, Kernel, Architecture Dependent Kernel Code, Hardware Platform. Conventional Linux user space: glibc, NPTL (pthread), gnu-libstdc++. Android user space: bionic libc (pthread operations, dlmalloc), libcutils (Android Atomic API), libutils (RefBase, sp<>, wp<>), libbinder, libstdc++, gabi++, stlport, gnustl.]
The original API was developed for a multi-threaded uniprocessor environment.
Most of the function names were left unchanged when the SMP updates were made in Honeycomb.
4. Case Study #2 : Android SMP Programming Guide
Explicit Parallelization
Android Atomic API
• android_atomic_acquire_store()
• android_atomic_release_store()
• …
Bionic libc
• pthread functions
• Bionic atomic APIs
• …
Example :
pthread_mutex_lock(lock);
oldValue = *addr;
*addr = value;
pthread_mutex_unlock(lock);

Implicit Parallelization
• Compiler
• Linker
• Loader
Example :
#include <iostream>
int main()
{
#pragma omp parallel
{
std::cout << "Hello World!\n";
}
}
※ Android does not support implicit parallelization such as OpenMP.

Pre-Honeycomb : multi-threaded uniprocessor environment.
Honeycomb and after : multi-threaded (symmetric) multiprocessor environment.
4. Case Study #2 : Android SMP Programming Guide
• http://developer.android.com/training/articles/smp.html
• http://www.kandroid.org/board/board.php?board=AndroidBeginner&command=body&no=102
In C/C++
① Use the pthread synchronization primitives, like mutexes and semaphores, correctly.
② Avoid using atomic functions directly as much as possible.
③ Be extremely circumspect with "volatile", which has no atomicity guarantees and
no memory barrier provisions.
In Java
① Use an appropriate utility class from the java.util.concurrent package.
② Make your class immutable.
③ Use the "synchronized" statement to guard any field that can be accessed by
more than one thread.
④ Declare shared fields "volatile", which will help you avoid the mysterious failures
associated with optimizing compilers and SMP mishaps.
⑤ Follow safe construction practices. (Refer to http://www.ibm.com/developerworks/java/library/j-jtp0618/index.html)
Don't publish the "this" reference during construction.
Don't implicitly expose the "this" reference.
Don't start threads from within constructors.
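As an illustration of the first Java guideline, the sketch below shares a counter between threads through java.util.concurrent.atomic.AtomicInteger instead of a plain int field. The class and method names are hypothetical; the point is only that incrementAndGet() is an atomic read-modify-write, so no increments are lost on an SMP device.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: several threads increment one shared counter without losing updates. */
final class SharedCounterSketch {
    /** Starts the given number of threads, waits for them, and returns the final count. */
    static int countWithThreads(int threads, final int incrementsPerThread) {
        final AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int n = 0; n < incrementsPerThread; n++) {
                        counter.incrementAndGet(); // atomic increment
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join(); // join() also makes the workers' writes visible to this thread
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }
}
```

With a plain int and counter++ in place of the AtomicInteger, the same loop could return less than threads × incrementsPerThread, because ++ is a separate load, add, and store.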
5. Conclusion : Performance Sensitive Paths
General Performance Sensitive Paths
[Diagram: Event (AdapterView), Invalidate, Adapter, Measurement, Layout, Draw, Update DisplayList, Draw DisplayList, Swap Buffers, plus AsyncDrawable and Downloader.]
• dispatchTouchEvent()
• onTouchEvent()
• getView()
- Dumb
- Recycling
- ViewHolder
• onMeasure()
• onLayout()
• draw()
• dispatchDraw()
• onDraw()
Performance Monitoring Tools / APIs
• systrace
• TraceView
• dumpsys
• HierarchyViewer
• Tracer for OpenGL ES
• Setting App (Dev Opt.)
• StrictMode
Q & A
12th Kandroid Conference
The Gate of the AOSP #5 : So-called, App?
Date : Friday, October 25, 2013, 9:00 AM to 5:00 PM
Venue : International Conference Room, Korea Science and Technology Center (과학기술회관)
Session (Draft)
• Android Application Architecture Pattern
• Android Chrome Browser
• Google GMS III
• InputMethodsService & Keyboard
• Networking & Download Provider
• GDK & DOOM
• Security & Anti-Virus App
If you would like to present at the seminar or have a topic to suggest, please email us at any time.
Contact : [email protected] / kandroid.org administrator, 양정수 (nickname: 들풀)