Performance Analysis & Code Profiling
It’s 2:00AM -- do you know where your
program counter is?
Code is tooooo slooooooowwwwww....
Real world: performance matters. Often an app
killer.
Perceived performance usually bottom line for
consumer apps.
“In academia, the constant in the O() is
nothing. In industry, it’s the only thing.”
Good algorithms/data structures are crucial
starting points
After that, different implementations can have
huge impact
Where’s the bottleneck?
Assuming same data structs/algs, why is one
implementation slower than another?
Where is code spending most of its time?
Related question: where is all of the memory? Are
objects going away when they should?
Rule of thumb: 80/20 rule
In most programs, 80% of the time is
spent in 20% of the code
Problem: humans are very bad at finding the
20%!
Even worse at predicting where the 20% will be
when writing/designing program!
Non-solutions
Blame the language. Write in C/FORTRAN/etc.
Some languages do impose runtime penalties
Mostly, small compared to choice of alg, etc.
Use foreign language calls (assembly, C, etc.)
Sub-case of above
Can be useful for critical chunks of code
Still stuck with -- which chunks? (80/20)
Micro-optimize while writing code
Lot of pain
Makes code hard to read/follow
Typically doesn’t help
More non-solutions
Wait for HW to get faster
Solution used in practice :-P
Encourages sloppy design/programming
Some problems won’t go away with time
Let the compiler handle it
Good for micro-optimizations (esp. block
local)
Compiler is smarter than you, usually
Doesn’t handle design choices, overuse of
function calls, poor data structs, indescribable
invariants, data-dependent performance, etc.
Watching the code run...
Ultimate answer: look and see
Instrument and monitor code; see where most
time is being spent
Must run program under realistic data/load conditions!
You’ve already done some of this
Program timing
Counting get()/put()/remove() calls, etc.
In principle, could get everything you need that
way
Massive pain in the rear...
80/20 rule strikes again
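A minimal sketch of the by-hand instrumentation the slide describes: wrapping a map's get() with a call counter and a System.nanoTime() timer. The class and field names here are illustrative, not from the course code.

```java
import java.util.HashMap;
import java.util.Map;

public class ManualProfiling {
    // Hand-rolled bookkeeping: count calls and accumulate time spent in get()
    static long getCalls = 0;
    static long getNanos = 0;
    static Map<String, Integer> table = new HashMap<>();

    static Integer timedGet(String key) {
        getCalls++;
        long start = System.nanoTime();
        Integer v = table.get(key);          // the operation under measurement
        getNanos += System.nanoTime() - start;
        return v;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100_000; i++) {
            table.put("k" + (i % 100), i);
            timedGet("k" + (i % 100));
        }
        System.out.println("get() calls: " + getCalls);
        System.out.println("total time in get(): " + getNanos + " ns");
    }
}
```

This works, but every measured operation needs its own counter, timer, and reporting code, which is exactly the "massive pain" the slide is complaining about.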
Profiling tools to the rescue...
Automated instrumentation of code
Use external tool to monitor execution
Run under realistic conditions
Post-mortem examine results for critical 20%
Typically work by:
Rewriting compiled executable (gcc -p)
Monitoring the runtime system/JVM (java -Xrunhprof)
Hprof: the Java profiler
Runs JVM in special mode; watches code as it
runs. Tracks:
Subroutine calls/stacks
Object allocations
Thread execution
CPU usage
Invoke with:
java -Xrunhprof:[hprof opt list] ClassToProf
Produces text summary of run, post-mortem
Note: JavaSoft demo tool; NOT a professional-quality,
industrial-strength tool (but free!)
Hprof options
file=fname: set output/dump file
cpu=samples|times: set profiling method for
CPU utilization, method calls, stack trace, etc.
heap=dump|sites|all: set tracing of heap
(dynamically allocated) objects
depth=#: set depth of stack traces to report
(max # of nested calls)
thread=y|n: report thread IDs?
Example:
java -Xrunhprof:file=hprof.txt,cpu=samples,depth=6 \
Analyzer -u 8 -a 4 -x 842 -r results.txt -m model.dat
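To see what cpu=samples reports, a toy target helps: a program where one method deliberately dominates runtime. This class is made up for illustration; run with something like java -Xrunhprof:cpu=samples,depth=4 HotSpotDemo (on JVMs that still ship the HPROF agent) and the dump should attribute nearly all samples to HotSpotDemo.hot.

```java
public class HotSpotDemo {
    // The hot 20%: essentially all CPU time lands here
    static double hot() {
        double s = 0;
        for (int i = 1; i <= 5_000_000; i++) {
            s += Math.sqrt(i);
        }
        return s;
    }

    // The cold 80%: negligible work
    static double cold() {
        return Math.sqrt(42);
    }

    public static void main(String[] args) {
        System.out.println(hot() + cold());
    }
}
```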
Problems with hprof
Unreadable output (ugh)
Static analysis
Only gives you snapshot of results at end of run
Uses sampling
Only checks state of JVM periodically
Can miss very short/infrequent calls
Doesn’t check many things
Dynamic heap state; memory leaks (yes, even in
Java)
File I/O
Strange data access patterns
Multi-thread accesses
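The "memory leak, yes, even in Java" point deserves a concrete case: a classic Java leak is an object that should be garbage but stays reachable through a long-lived collection, so the GC can never reclaim it. A hedged sketch (names are invented for the example):

```java
import java.util.ArrayList;
import java.util.List;

public class LeakDemo {
    // Long-lived static collection that is never cleared
    static final List<byte[]> cache = new ArrayList<>();

    static void handleRequest(int id) {
        byte[] buffer = new byte[1024];  // per-request scratch space
        cache.add(buffer);               // oops: retained forever, never removed
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            handleRequest(i);
        }
        // Every buffer is still reachable, so none can be collected
        System.out.println("retained buffers: " + cache.size());
    }
}
```

No memory is "lost" in the C sense, but heap usage grows without bound; a heap profile over time would show it, which is why a snapshot-at-exit tool like hprof struggles here.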