Optimization & Performance Tuning in Mobile Systems, Advanced Level Raghu Sesha Iyengar
Jun 19, 2015
Agenda
• Optimization
  • What is optimization
  • Why optimize
  • When to optimize
• Performance parameters
• Profilers
• Benchmarks
• References
What is Optimization
• "The design and operation of a system or process to make it as good as possible in some defined sense." (Wiktionary)
• "An act, process, or methodology of making something (as a design, system, or decision) as fully perfect, functional, or effective as possible; specifically: the mathematical procedures (as finding the maximum of a function) involved in this." (Merriam-Webster's Online Dictionary)
• Our definition: finding the "best" route to Orchid
• For software optimization:
  • "as good as possible" generally means achieving a preset benchmark
  • "defined" generally means the parameters used to measure how close we are to the most optimized state
What is Optimization
Optimization applies to all aspects of a mobile system.
[Figure: Typical software architecture of a mobile phone (Android as an example), layered above the hardware (processor, RAM, peripherals). Source: http://developer.android.com]
Why optimize
Major considerations of mobile phone software are:
• Portability
  • Multitude of hardware choices
  • APIs, protocols and specifications
• User experience
  • Usability
  • Performance (speed, memory, responsiveness)
  • Look and feel
• Functionality
  • Phone or supercomputer
  • Reliability
• Current drain
  • Battery life
  • Battery technology is not catching up with the growth in processor power
• Security
• Software development time
  • Open source
  • Create differentiator software
Pitfalls of optimization
• Portability
• Repetitive process
  • Optimization is rarely a one-time process
  • A change in hardware platform or software framework can cause the optimization cycle to start again
• Increasing processor power
  • Faster processors
  • Coprocessors
  • Hardware accelerators
• Unplanned time and effort
  • Set your target
  • What is the optimal time to calculate the next move in a chess game?
• Open source or 3rd party code
  • A new drop may invalidate all your effort
When to optimize
[Figure: The Optimization Cycle. Start, then platform and framework decisions, compiler/assembler decisions, design, and code. Measure the parameters and compare with the benchmark. If the measured parameter values are acceptable, stop. If they are not acceptable, optimize the code or optimize the design and measure again.]
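The measure-and-compare part of the cycle can be sketched as a small loop. This is only an illustration under stated assumptions: `measure_seconds`, `optimization_cycle`, the two candidate functions, and the target value are hypothetical names invented for this example, and wall-clock time stands in for whichever parameter is actually being measured.

```python
import time


def measure_seconds(func, *args, repeats=5):
    """Return the best wall-clock time of func(*args) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def sum_squares_loop(n):
    """First version of the code: a straightforward loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_formula(n):
    """Optimized design: closed-form sum of the squares 0..n-1."""
    return (n - 1) * n * (2 * n - 1) // 6


def optimization_cycle(candidates, arg, target_seconds):
    """Measure candidates in order and stop at the first acceptable one.

    Returns (function_name, measured_seconds), or None if no candidate
    meets the target (hypothetical stand-in for the preset benchmark).
    """
    for func in candidates:
        measured = measure_seconds(func, arg)
        if measured <= target_seconds:  # compare with benchmark
            return func.__name__, measured
    return None
```

With a generous target the first candidate is accepted and the cycle stops immediately; tightening the target forces the loop on to the next, more optimized candidate.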
When to optimize
[Figure: the four levels of optimization decisions (platform and framework decisions, compiler decisions, design optimization, code optimization) shown against Effort and Returns, each ranging from Low to High.]

Example hardware choices (source: http://www.phonegg.com):

Phone               Processor speed  RAM     Internal memory  Display      OS        Camera
Droid               550 MHz          256 MB  16 GB            480x854 px   Android   5 MP
HTC Nexus One       1 GHz            512 MB  512 MB           480x800 px   Android   5 MP
iPhone 3GS          600 MHz          256 MB  16 GB            320x480 px   Mac OS X  3 MP
Nokia N900          600 MHz          256 MB  32 GB            800x480 px   Linux     5 MP
Samsung S8500 Wave  1 GHz            512 MB  2 GB             480x800 px   Bada      5 MP
LG GW990            -                -       -                480x1024 px  MeeGo     5 MP
When to optimize
Factors affecting CPU performance:
• Co-processors and hardware accelerators
• Advanced microarchitecture (for the same ISA)
  • Vector registers
  • Pipeline optimizations
    • Data forwarding
    • Branch prediction
    • Predicate registers
    • Delayed branch
  • Cache
Factors affecting platform choice:
• Dev tools, IDE
• Target applications
• Support from vendor
When to optimize
Know thy compiler
• Optimization types
  • Programming language dependent vs programming language independent
  • Machine dependent vs machine independent
• Optimization levels
• Dynamic and static linking
• Prelinking
• Pragma
• Compiler options
• JIT
When to optimize
Know thy framework
• Threading
  • UI interactions
  • Background computations
• File operation choices
  • System calls for file operations
  • fopen vs open
  • Write in standard chunks
  • Background writes
• Algorithm choices
  • Speed vs memory
  • Hash vs binary
  • Analyze choices before optimizing
  • Spend enough time before picking
• Create/destroy resources
  • Memory
  • Peripherals
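The "hash vs binary" and "speed vs memory" choices can be illustrated with two interchangeable lookup structures. This sketch assumes the slide means a hash table versus binary search over sorted data; the function names are hypothetical, and Python's built-in `set` and the standard `bisect` module stand in for whatever the target platform provides.

```python
import bisect


def build_hash_index(keys):
    """Hash-based index: average O(1) lookups, extra memory for the table."""
    return set(keys)


def build_sorted_index(keys):
    """Sorted index: O(log n) lookups by binary search, compact storage."""
    return sorted(keys)


def contains_hash(index, key):
    """Membership test against the hash-based index."""
    return key in index


def contains_binary(index, key):
    """Membership test by binary search over the sorted index."""
    pos = bisect.bisect_left(index, key)
    return pos < len(index) and index[pos] == key
```

Both indexes answer the same question, so the choice can be made by measuring lookup time and memory use on representative data before committing to either one.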
When to optimize
Other decisions
• Database decisions
  • Do we need a database?
  • Which database to use?
• Fixed vs floating point
• Coding language decisions
  • Application code
  • Native libraries
• API definitions
  • Extensibility
• Avoid costly message passing
  • Sockets vs Intents
• XML parser
  • SAX vs DOM
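The SAX vs DOM decision comes down to streaming events versus building the whole tree in memory. The sketch below uses Python's standard `xml.sax` and `xml.dom.minidom` modules as stand-ins for the parsers available on a phone platform; the sample document and the names `count_phones_dom`, `PhoneCounter`, and `count_phones_sax` are hypothetical.

```python
import xml.dom.minidom
import xml.sax

SAMPLE = "<phones><phone name='A'/><phone name='B'/><phone name='C'/></phones>"


def count_phones_dom(xml_text):
    """DOM: parse the whole document into a tree, then query the tree."""
    document = xml.dom.minidom.parseString(xml_text)
    return len(document.getElementsByTagName("phone"))


class PhoneCounter(xml.sax.ContentHandler):
    """SAX: react to events while the parser streams through the input."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "phone":
            self.count += 1


def count_phones_sax(xml_text):
    """Count <phone> elements without building a tree in memory."""
    handler = PhoneCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.count
```

The DOM version keeps every node alive until the query finishes, while the SAX version holds only a counter, which is the kind of memory difference that matters for large documents on a constrained device.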
When to optimize
Know thy language
• Strive for good programs, not fast ones
• Document optimizations
• Optimization vs readability
• 80/20 rule
• Area for compiler research
• Control flow optimizations
  • Implement state machines
• Optimization techniques for assembly
  • Highly ISA dependent
  • Group instructions
  • Efficient loops
  • Use MAC/FMA
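One way to read "implement state machines" as a control flow optimization is to replace nested conditionals with a transition table. The sketch below is an assumption about what the slide intends; the call-handling states, the events, and the names `TRANSITIONS`, `next_state`, and `run` are all hypothetical.

```python
# Transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("idle", "incoming"): "ringing",
    ("ringing", "answer"): "in_call",
    ("ringing", "reject"): "idle",
    ("in_call", "hang_up"): "idle",
}


def next_state(state, event):
    """Look up the transition; an unknown event leaves the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="idle"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state
```

Every step costs one table lookup regardless of how many states exist, and adding a transition means adding a table entry instead of another branch.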
When to optimize
High-level language
• Language dependent
• Lookup table
• Bounds checking
• Integer multiplication
• Dead code elimination
• Common subexpression elimination
• Loop unrolling
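As an illustration of the lookup table technique in the list above, the sketch below counts the set bits of a 32-bit word either bit by bit or with a precomputed 256-entry table. The example problem and the names `BIT_COUNT_TABLE`, `popcount_loop`, and `popcount_table` are chosen for this sketch and do not come from the slides.

```python
# Precompute the number of set bits for every byte value once, up front.
BIT_COUNT_TABLE = [bin(value).count("1") for value in range(256)]


def popcount_loop(word):
    """Count set bits by testing one bit at a time."""
    count = 0
    while word:
        count += word & 1
        word >>= 1
    return count


def popcount_table(word):
    """Count set bits of a 32-bit word with four table lookups."""
    return (BIT_COUNT_TABLE[word & 0xFF]
            + BIT_COUNT_TABLE[(word >> 8) & 0xFF]
            + BIT_COUNT_TABLE[(word >> 16) & 0xFF]
            + BIT_COUNT_TABLE[(word >> 24) & 0xFF])
```

The table trades 256 entries of memory for a fixed number of operations per word, which is the speed vs memory tradeoff in miniature.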
Parameters for Optimization
• Speed
• Memory
  • Heap and stack
  • Image size
• Response time
  • Launch time
  • Response to user input
• Current drain
  • Active current drain
  • Idle current drain
• Security
  • Application security
  • Device security
  • Network security
Profilers
• Speed profilers
  • Based on how data is collected
    • Event based (e.g., gcov, gprof's number of calls, CodeWarrior, Eclipse profiler)
    • Sampling based (e.g., gprof's runtime figures)
  • Based on amount of information provided
    • Trace based (e.g., gcov)
    • Statistical (e.g., AMD CodeAnalyst)
  • Based on instrumentation of code
    • Manual (e.g., Eclipse TraceView)
    • Automatic source level (e.g., CodeWarrior)
    • Compiler assisted
    • Binary instrumented
• Memory profilers
  • Heap and stack usage (e.g., Valgrind)
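An event-based speed profiler records every function call, not periodic samples. As a small sketch, Python's standard `cProfile` module (a deterministic profiler, used here only as a stand-in for the tools named above) can report how many calls a piece of code makes. The helper `profile_call_count` and the `fib` workload are hypothetical names for this example.

```python
import cProfile
import pstats


def fib(n):
    """Deliberately naive recursion, so the profiler has many calls to count."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def profile_call_count(func, *args):
    """Run func under cProfile and return (result, total_calls_recorded)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    return result, stats.total_calls
```

The call count is exact and repeatable from run to run, unlike the runtime figures of a sampling profiler, but recording every call adds overhead that can distort timing.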
Profilers
• Response time profiler
  • Manual
  • High-speed camera setup
• Current drain profiler
  • Customized setup
• Security profiler
  • Application security
    • FindBugs, Klocwork
  • Device security
    • Antivirus
  • Network security
    • Netfilter
• Appropriate IDEs
  • From vendor (e.g., CodeWarrior)
  • Open source (e.g., Eclipse, MOTODEV Studio, Carbide.c++)
Benchmarks
• Internal benchmarks
  • Benchmark against another phone
  • Benchmark against another similar application
• External benchmarks
  • Industry standard
    • Embedded Microprocessor Benchmark Consortium (EEMBC)
    • Standard Performance Evaluation Corporation (SPEC)
  • Open source
    • I/O intensive (Iometer)
    • CPU intensive (PI calculator, LINPACK)
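A CPU-intensive benchmark of the "PI calculator" kind can be as simple as timing a fixed amount of arithmetic. The sketch below is not any of the named benchmark suites. It uses the Leibniz series as an arbitrary workload, and `leibniz_pi`, `run_benchmark`, and the default term count are hypothetical choices for illustration.

```python
import time


def leibniz_pi(terms):
    """Approximate pi with the Leibniz series 4 * sum((-1)^k / (2k + 1))."""
    total = 0.0
    sign = 1.0
    for k in range(terms):
        total += sign / (2 * k + 1)
        sign = -sign
    return 4.0 * total


def run_benchmark(terms=200_000, repeats=3):
    """Return (pi_estimate, best_seconds) for the CPU-bound loop."""
    best = float("inf")
    estimate = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        estimate = leibniz_pi(terms)
        best = min(best, time.perf_counter() - start)
    return estimate, best
```

Running the same workload with the same term count on two phones, or on two builds of the same software, gives a number that can be compared directly, which is how an internal benchmark is used.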
References
Power dissipation details for various processors: http://en.wikipedia.org/wiki/List_of_CPU_power_dissipation
Summary of various processors based on different architectures: http://www.linuxfordevices.com/c/a/Linux-For-Devices-Articles/Embedded-Processor-and-SystemonChip-Quick-Reference-Guide/
OpenMP: http://openmp.org/wp/
BoostPro libraries: http://www.boostpro.com
Vector signal processing library: http://gpu-vsipl.gtri.gatech.edu/
Assembly and high-level optimization techniques: http://www.agner.org/optimize/
Optimizing Java code: http://www.glenmccl.com/jperf/index.htm
Optimization levels in GCC: http://www.linuxjournal.com/article/7269
Optimization levels in CodeWarrior: http://www.freescale.com/infocenter/Codewarrior/index.jsp?topic=/com.freescale.doc.microcontrollers.targeting.manual/050_CWBuildProps.ColdFire_Compiler__Optimization.html
Profiler basics: http://en.wikipedia.org/wiki/Profiling_%28computer_programming%29
Summary of profiling tools: http://ktown.kde.org/~seli/memory/analysis.html
Valgrind: http://valgrind.org/
Example CodeWarrior profiler: http://www.freescale.com/files/soft_dev_tools/doc/user_guide/IDE_5.5_UG_Profiler.pdf
Kernel profile and OProfile: http://omappedia.org/wiki/Android_Debugging
WinCE kernel profiler: http://discovertheexperience.blogspot.com/2009/02/windows-ce-kernel-profiler.html
iPhone vs Android development: http://blogs.zdnet.com/Burnette/?p=682&tag=col1;post-682
Low-level optimization resources: http://www.agner.org/optimize/
iPhone security: http://www.itsecurity.com/features/iphone-security-threat/
Email security: http://www.itsecurity.com/email-security/
JIT: http://www.answers.com/topic/just-in-time-compilation
Assembly optimization tips: http://www.mark.masmcode.com/
Compiler and C++ optimization: http://www.agner.org/optimize/
Book on software optimization: Software Optimization for High-Performance Computing, by Kevin R. Wadleigh and Isom L. Crawford
Java benchmarks: http://java-phones.com/tools/fpc-bench-303
Android benchmarks: http://mobileswdev.wordpress.com/2009/10/27/android-benchmarks/
Additional Slides
When to optimize
Source: Software Optimization for High-Performance Computing, by Kevin R. Wadleigh and Isom L. Crawford
When to optimize
Source: Optimizing Software in C++
When to optimize
Source: http://www.linuxjournal.com/article/7269
Optimization Levels in GCC
When to optimize
Source: http://www.linuxjournal.com/article/7269
Optimization levels (gcc)
• -O0: no optimization
• -O1: produce an optimized image in a short time
• -O2: all optimizations that do not involve a space-speed tradeoff
• -Os: optimize for size over speed
• -O3: optimize for best speed. Global optimization. A size increase is expected, and this can have an adverse effect if the larger code no longer fits in the cache and leads to more memory accesses.
• Use -f to enable any individual option (which may not be included in the chosen -O level), e.g., gcc -finline-functions.
• Use -fno- to exclude an option, e.g., gcc -O1 -fno-defer-pop.
When to optimize
Source: http://reallylongword.org/prelink-2/
700 MHz Athlon, 768 MB RAM, Linux 2.4.21
When to optimize
A 100 MB file was read sequentially, starting from the beginning of the file, to produce these times. The file system buffer cache was 500 MB.
Source: Software Optimization for High-Performance Computing, by Kevin R. Wadleigh and Isom L. Crawford
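A measurement of this kind can be reproduced in outline with a small harness that reads a file front to back and times each pass. The sketch below is not the book's experiment: it assumes the read chunk size is the variable of interest, and `read_sequential`, `time_chunk_sizes`, and `make_test_file` are hypothetical helpers. A much smaller file than 100 MB is enough to try it out.

```python
import os
import tempfile
import time


def read_sequential(path, chunk_size):
    """Read the whole file front to back in chunk_size pieces; return bytes read."""
    total = 0
    # buffering=0 gives raw, unbuffered reads, so chunk_size controls each read
    with open(path, "rb", buffering=0) as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def time_chunk_sizes(path, chunk_sizes):
    """Return {chunk_size: seconds} for one sequential pass per chunk size."""
    timings = {}
    for size in chunk_sizes:
        start = time.perf_counter()
        read_sequential(path, size)
        timings[size] = time.perf_counter() - start
    return timings


def make_test_file(size_bytes):
    """Create a temporary file of the given size and return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(os.urandom(size_bytes))
        return handle.name
```

Because the file is likely to sit in the file system buffer cache after the first pass, later passes mostly measure call overhead and not storage speed, which is why the cache size is stated alongside the original times.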
When to optimize
Lookup Tables
Boundary checking
Source: Software Optimization for High-Performance Computing, by Kevin R. Wadleigh and Isom L. Crawford