1/21
Efficient Path Profiling
Thomas Ball, James R. Larus
2005. Dec. 6, Chihun Kim
Jan 21, 2016
2/21
Contents
• Introduction
• Limitation of edge profiling
• Algorithm of efficient path profiling
– Basic idea
– Compactly representing paths with unique integer
– Finding instrumentation point
– Placing instrumentations
– Regenerating path from the profile result
– Transforming general CFG to acyclic DAG
• Experiment
3/21
Introduction
• This paper describes a new path profiling algorithm
– Simple
– Fast
– Minimized run-time overhead
• Efficient edge profiling : average overhead 16%
• Efficient path profiling : average overhead 31%
4/21
Limitation of edge profiling
• Problems
– Edge profiling does not identify the most frequently executed paths
– Example : two different path profiles can be consistent with one edge profile
• We need path profiling
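The ambiguity can be illustrated with a small sketch. The graph (two consecutive branches, nodes A to G) and the counts below are hypothetical and are not taken from the slides; `edge_profile` is an illustrative helper that derives edge counts from path counts.

```python
from collections import Counter

def edge_profile(path_counts):
    """Derive per-edge execution counts from per-path execution counts."""
    edges = Counter()
    for path, count in path_counts.items():
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += count
    return edges

# Two different (hypothetical) path profiles over the same graph.
profile_1 = {"ABDEG": 50, "ACDFG": 50}
profile_2 = {"ABDFG": 50, "ACDEG": 50}

# Both collapse to the same edge profile: every edge is executed 50 times,
# so edge counts alone cannot tell the two path profiles apart.
same_edges = edge_profile(profile_1) == edge_profile(profile_2)
```

Here `same_edges` is true even though the two profiles have no executed path in common.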
5/21
Basic idea
• Paths are identified by a unique integer (path identifier)
– Transitions in the control flow graph change the path identifier
• The control flow graph acts as a state machine
• In the final state, instrumentation code updates the execution count of the path
– The path identifier is used to index an array of counters
• The identifiers must be represented compactly, because a sparse set of identifiers wastes memory
6/21
Basic idea
• Place instrumentation so that
– Transitions of the path identifier need not occur at every branch
• The basic algorithm assumes that the control flow graph is a DAG (Directed Acyclic Graph)
– A general CFG must be transformed into a DAG
7/21
Compactly representing paths with sum
• The first step is to assign a constant value to each edge
• The path identifier is the sum of the edge values along the path
– Every path has a unique identifier
– Example : path ABCDEF = 2+0+0+1+0 = 3
8/21
Compactly representing paths with sum
• What is an edge value?
– Node K has two successors, B and A
– X = number of paths from B to exit
– Y = number of paths from A to exit
– Edge (K→B) has value 0 and edge (K→A) has value X
– All paths prefixed with KB are numbered 0 to X-1
– All paths prefixed with KA are numbered X to X+Y-1
– The number of paths from K to exit is X+Y
9/21
Compactly representing paths with sum
• Algorithm
– Example graph : A→B, A→C, B→D, C→D
– Number of paths to exit, computed from the exit upward : D = 1, C = 1, B = 1, A = 2
– Edge values : (C→D) = 0, (B→D) = 0, (A→C) = 0, (A→B) = 1
– Resulting path identifiers : ACD : 0, ABD : 1
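The numbering on this slide can be sketched as follows. This is a minimal sketch, not the authors' code: the function and variable names are illustrative, the nodes are assumed to be supplied in reverse topological order, and the successor order of A (C before B) is chosen so that the result agrees with the identifiers shown on the slide.

```python
def assign_edge_values(succ, reverse_topo_order):
    """Give each edge a value so that path sums are unique and compact.

    succ maps each node to its ordered list of successors.
    """
    num_paths = {}   # number of paths from a node to the exit
    val = {}         # value assigned to each edge
    for v in reverse_topo_order:
        if not succ[v]:            # exit node: exactly one (empty) path
            num_paths[v] = 1
            continue
        num_paths[v] = 0
        for w in succ[v]:
            # Paths through (v, w) are numbered after all paths through
            # the successors of v that were handled earlier.
            val[(v, w)] = num_paths[v]
            num_paths[v] += num_paths[w]
    return num_paths, val

succ = {"A": ["C", "B"], "B": ["D"], "C": ["D"], "D": []}
num_paths, val = assign_edge_values(succ, ["D", "C", "B", "A"])

def path_id(path):
    """Sum the edge values along a path given as a string of node names."""
    return sum(val[(a, b)] for a, b in zip(path, path[1:]))
```

With this input, `num_paths["A"]` is 2, `path_id("ACD")` is 0 and `path_id("ABD")` is 1, matching the slide.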
10/21
Finding instrumentation point
• Uses the technique from the paper
– Thomas Ball. Efficiently counting program events with support for on-line queries, ACM Transactions on Programming Languages and Systems, Sep 1994
• The path identifier is preserved, but
– The number of transition events along the paths is reduced
– Example:
• The number of transitions on path ABDEF drops from 3 to 2
11/21
Inserting Instrumentation
• Initialize a register r for the path identifier in the entry node
• Update register r on each edge by the edge value
• Increment a table entry in the exit node
– The value in register r is used to index the result table
r = 0
r += 2
r += 4
r += 1
table[ r ] ++
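The three instrumentation steps can be simulated in a short sketch. The edge values below are an assumption chosen to agree with the worked examples in these slides (ABCDEF = 3, ABDF = 4); they are the plain, unoptimized values, so the increments differ from those in the figure. `run_instrumented` and the list of executed paths are hypothetical.

```python
# Assumed edge values for the six-node example graph (A is entry, F is exit).
EDGE_VAL = {
    ("A", "B"): 2, ("A", "C"): 0,
    ("B", "C"): 0, ("B", "D"): 2,
    ("C", "D"): 0,
    ("D", "E"): 1, ("D", "F"): 0,
    ("E", "F"): 0,
}
NUM_PATHS = 6  # the graph has six entry-to-exit paths

def run_instrumented(path, table):
    """Simulate one execution of the instrumented code along `path`."""
    r = 0                             # entry node: initialize the register
    for edge in zip(path, path[1:]):
        r += EDGE_VAL[edge]           # each edge: add the edge value
    table[r] += 1                     # exit node: bump the path's counter

table = [0] * NUM_PATHS
for executed in ["ABDF", "ABDF", "ACDEF", "ABCDEF"]:
    run_instrumented(executed, table)
```

After these four simulated runs, `table` is `[0, 1, 0, 1, 2, 0]`: path 4 (ABDF) ran twice, paths 1 and 3 once each.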
12/21
Inserting Instrumentation
• Updates to the path register r can be optimized
– The initialization of r can be combined with the first update of r
– The table update can be combined with the final update of r
r = 0
r += 2
r += 4
r += 1
table[ r ] ++
13/21
Regenerating path from the profile result
• Given information
– R = path identifier
– v = current block (initialized to the entry block)
– e = outgoing edge from vertex v to w
– Val(e) = edge value of edge e
• At each block, find the outgoing edge e (v→w) of v with the largest Val(e) ≤ R
• Subtract Val(e) from R, move to w, and repeat until the exit block is reached
14/21
Regenerating path from the profile result
• Example: path register is 4
– Current node is A
– Choose edge (A→B)
• Because (A→B) has the largest edge value that does not exceed the current path register value 4
Figure: CFG with nodes A to F, edge values 2, 1, 2, R=4
15/21
Regenerating path from the profile result
• Example: path register is 4
– Current node is B
– The path register is decreased by 2, to R=2
• Because edge (A→B), with value 2, was chosen
– The second edge is (B→D)
Figure: CFG with nodes A to F, edge values 2, 1, 2, R=2
16/21
Regenerating path from the profile result
• Example: path register is 4
– Current node is D
– Edge (D→E) cannot be selected
• Because the current path register is only 0
– The final edge is (D→F)
• Edge (D→F) has value 0
Figure: CFG with nodes A to F, edge values 2, 1, 2, R=0
17/21
Regenerating path from the profile result
• Example: path register is 4
– Regenerated path is ABDF
Figure: CFG with nodes A to F, edge values 2, 1, 2, R=0
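The walk-through above can be sketched as a short decoding routine. The edge values are an assumption consistent with the examples in these slides (A→B = 2, B→D = 2, D→E = 1, all other edges 0); `regenerate_path` and `SUCC_VAL` are illustrative names.

```python
# Assumed successor map with edge values for the example graph.
SUCC_VAL = {
    "A": {"B": 2, "C": 0},
    "B": {"C": 0, "D": 2},
    "C": {"D": 0},
    "D": {"E": 1, "F": 0},
    "E": {"F": 0},
    "F": {},
}

def regenerate_path(r, entry="A", exit_node="F"):
    """Rebuild the path that produced path identifier r."""
    path = [entry]
    v = entry
    while v != exit_node:
        # Pick the outgoing edge with the largest value not exceeding r.
        candidates = [(val, w) for w, val in SUCC_VAL[v].items() if val <= r]
        val, w = max(candidates)
        r -= val            # consume that edge's contribution
        path.append(w)
        v = w
    return "".join(path)

decoded = regenerate_path(4)
```

Here `decoded` is `"ABDF"`, as in the example, and the identifiers 0 to 5 decode to six distinct paths.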
18/21
Transforming general CFG to acyclic DAG
• If there is a backedge (E→B)
– Insert dummy edge (entry→B)
– Insert dummy edge (E→exit)
– Remove the backedge (E→B)
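The transformation can be sketched on an edge list. This is a minimal sketch under stated assumptions: the backedges are taken as already identified, the example graph and the node names `ENTRY` and `EXIT` are hypothetical, and `remove_backedges` is an illustrative helper.

```python
def remove_backedges(edges, backedges, entry, exit_node):
    """Return an acyclic edge list: drop each backedge, add two dummy edges."""
    dag = [e for e in edges if e not in backedges]
    for tail, head in backedges:          # a backedge goes tail -> head
        dag.append((entry, head))         # dummy edge: entry -> loop head
        dag.append((tail, exit_node))     # dummy edge: loop tail -> exit
    return dag

# Hypothetical CFG with a loop whose backedge is E -> B.
edges = [
    ("ENTRY", "A"), ("A", "B"), ("B", "C"), ("B", "D"),
    ("C", "E"), ("D", "E"), ("E", "B"), ("E", "F"), ("F", "EXIT"),
]
dag = remove_backedges(edges, [("E", "B")], "ENTRY", "EXIT")
```

The resulting `dag` no longer contains (E, B) and gains (ENTRY, B) and (E, EXIT), so the edge-value assignment for DAGs can be applied to it.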
19/21
Experiment result
• PP : path profiling
• QPT : edge profiling
• PP’s overhead averaged 2.8 times QPT overhead
20/21
Experiment result
• Correct : the fraction of paths predicted entirely correctly by edge profiling
21/21
Conclusion = Introduction
• New path profiling algorithm
– Simple
– Fast
– Minimized run-time overhead
• Efficient edge profiling : average overhead 16%
• Efficient path profiling : average overhead 31%