
Performance Analysis & Code Profiling
It’s 2:00AM -- do you know where your program counter is?
Code is tooooo slooooooowwwwww....



Real world: performance matters. Often an app killer.
 Perceived performance usually bottom line for consumer apps.
 “In academia, the constant in the O() is nothing. In industry, it’s the only thing.”
Good algorithms/data structures are crucial starting points
After that, different implementations can have huge impact
Where’s the bottleneck?




Assuming same data structs/algs, why is one implementation slower than another?
Where is code spending most of its time?
Related question: where is all of the memory? Are objects going away when they should?
Rule of thumb: 80/20 rule


In most programs, 80% of the time is spent in 20% of the code
Problem: humans are very bad at finding the 20%!
 Even worse at predicting where the 20% will be when writing/designing the program!
Non-solutions



Blame the language. Write in C/FORTRAN/etc.
 Some languages do impose runtime penalties
 Mostly small compared to choice of alg, etc.
Use foreign language calls (assembly, C, etc.)
 Sub-case of above
 Can be useful for critical chunks of code
 Still stuck with -- which chunks? (80/20)
Micro-optimize while writing code
 Lots of pain
 Makes code hard to read/follow
 Typically doesn’t help (see the sketch below)
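
A hedged illustration of that last point (hypothetical code, not from the slides): both methods below sum an int array, but the hand-unrolled version is harder to read and, with a modern JIT, usually no faster.

// Hypothetical example: readable vs. hand-micro-optimized code.
// Both compute the sum of an int array; the JIT typically makes them
// perform about the same, so the second buys pain for little gain.
public class SumExample {
    // Straightforward, readable version.
    static long sumClear(int[] a) {
        long total = 0;
        for (int x : a) {
            total += x;
        }
        return total;
    }

    // "Micro-optimized" version: manual unrolling, cached length.
    // Harder to read and verify -- and usually no faster.
    static long sumUnrolled(int[] a) {
        long total = 0;
        int n = a.length;
        int i = 0;
        for (; i + 4 <= n; i += 4) {
            total += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
        }
        for (; i < n; i++) {
            total += a[i];
        }
        return total;
    }
}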
More non-solutions


Wait for HW to get faster
 Solution used in practice :-P
 Encourages sloppy design/programming
 Some problems won’t go away with time
Let the compiler handle it
 Good for micro-optimizations (esp. block-local)
 Compiler is smarter than you, usually
 Doesn’t handle design choices, overuse of function calls, poor data structs, indescribable invariants, data-dependent performance, etc.
Watching the code run...







Ultimate answer: look and see
Instrument and monitor code; see where most time is being spent
Must run the program under real data conditions!
You’ve already done some of this (sketch below)
 Program timing
 Counting get()/put()/remove() calls, etc.
In principle, could get everything you need that way
Massive pain in the rear...
80/20 rule strikes again
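
A minimal sketch of that kind of hand instrumentation (class name and workload are hypothetical stand-ins): wall-clock timing with System.nanoTime() plus a hand-maintained call counter.

import java.util.HashMap;
import java.util.Map;

// Hand-rolled instrumentation: wall-clock timing plus a call counter.
// Everything here (class name, workload) is a made-up stand-in.
public class ManualTiming {
    static long getCalls = 0;   // count interesting operations by hand

    public static void main(String[] args) {
        Map<Integer, Integer> table = new HashMap<>();

        long start = System.nanoTime();        // start the clock
        for (int i = 0; i < 1000000; i++) {
            table.put(i % 1000, i);
            table.get(i % 1000);
            getCalls++;                        // manual call counting
        }
        long elapsed = System.nanoTime() - start;

        System.out.printf("elapsed: %.3f ms, get() calls: %d%n",
                          elapsed / 1e6, getCalls);
    }
}

Doing this by hand for every suspect routine is exactly the "massive pain" the slide complains about.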
Profiling tools to the rescue...


Automated instrumentation of code
 Use external tool to monitor execution
 Run under realistic conditions
 Post-mortem, examine results for critical 20%
Typically work by:
 Rewriting compiled executable (gcc -p)
 Monitoring the runtime system/JVM (java -Xrunhprof)
Hprof: the Java profiler




Runs JVM in special mode; watches code as it runs. Tracks:
 Subroutine calls/stacks
 Object allocations
 Thread execution
 CPU usage
Invoke with (example below):
 java -Xrunhprof:[hprof opt list] ClassToProf
Produces text summary of run, post-mortem
Note: Javasoft demo tool; NOT a professional-quality, industrial-strength tool (but free!)
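
A tiny, hypothetical target to try it on (the class and its hot spot are invented for illustration); with cpu=samples, most samples should land in slowMethod():

// Hypothetical target program for hprof; the deliberately slow method
// should dominate the CPU samples.
public class ProfileMe {
    static double slowMethod(int n) {
        double sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += Math.sqrt(i);   // arbitrary work to burn CPU
        }
        return sum;
    }

    static double fastMethod() {
        return Math.random();
    }

    public static void main(String[] args) {
        double total = 0;
        for (int i = 0; i < 200; i++) {
            total += slowMethod(2000000);
            total += fastMethod();
        }
        System.out.println(total);
    }
}

Invocation, following the option syntax on this slide (file name is arbitrary):

java -Xrunhprof:cpu=samples,depth=4,file=profileme.txt ProfileMe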
Hprof options

file=fname: set output/dump file
cpu=samples|times: set profiling method for CPU utilization, method calls, stack traces, etc.
heap=dump|sites|all: set tracing of heap (dynamically allocated) objects
depth=#: set depth of stack traces to report (max # of nested calls)
thread=y|n: report thread IDs?

Example:




java -Xrunhprof:file=hprof.txt,cpu=samples,depth=6 \
     Analyzer -u 8 -a 4 -x 842 -r results.txt -m model.dat
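
The dump lands in hprof.txt when the program exits. For orientation only -- the exact layout varies by JDK version, and every number, trace ID, and method name below is invented -- the CPU-samples section of the dump looks roughly like this:

CPU SAMPLES BEGIN (total = 1264) ...
rank   self  accum   count trace method
   1 45.12% 45.12%     570 300123 Analyzer.computeModel
   2 20.33% 65.45%     257 300087 java.io.BufferedReader.readLine
   ...
CPU SAMPLES END

Each trace number refers to a TRACE block elsewhere in the file giving the call stack (up to depth frames) behind that sample count.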
Problems with hprof




Unreadable output (ugh)
Static analysis
 Only gives you a snapshot of results at end of run
Uses sampling
 Only checks state of JVM periodically
 Can miss very short/infrequent calls
Doesn’t check many things
 Dynamic heap state; memory leaks (yes, even in Java) -- see the sketch below for a crude workaround
 File I/O
 Strange data access patterns
 Multi-thread accesses
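
One crude, hedged workaround for the heap blind spot (helper class invented for illustration; it is not part of hprof): poll the JVM’s own memory counters around the code you suspect of leaking.

// Rough-and-ready heap watching via the standard Runtime API.
// Class and method names are made up for illustration.
public class HeapWatch {
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.gc();                        // encourage a collection first
        long before = usedHeap();

        // ... run the code suspected of leaking here ...

        System.gc();
        long after = usedHeap();
        System.out.printf("heap growth: %d bytes%n", after - before);
    }
}

Steady growth across repeated runs of the suspect code suggests objects are not going away when they should.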