Feb 24, 2016
Advanced Compiler Techniques
LIU Xianhua
School of EECS, Peking University
Inter-procedural Analysis
Topics
• Up to now
  – Intra-procedural analysis
    • Dataflow analysis
    • PRE
    • Loops
    • SSA
  – Just for individual procedures
• Today: Inter-procedural analysis
  – Across/between procedures
Modularity is a Virtue
• Decomposing programs into procedures aids in readability and maintainability
• Object-oriented languages have pushed this trend even further
• In a good design, procedures should be:
  – An interface
  – A black box
The Catch
• This inhibits optimization!
• The compiler must assume:
  – The called procedure may use or change any accessible variable
  – The procedure’s caller provides arbitrary values as parameters
• Interprocedural optimizations use the calling relationships between procedures to optimize one or both of them
Recall
• Function calls can affect our points-to sets

p1 = &x;
p2 = &p1;
...
foo();

• Without knowledge of foo(), we must be conservative and lose a lot of information
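A minimal Python sketch of what "being conservative" can mean here. The function name `after_opaque_call` and the assumption that x, p1 and p2 are all globals accessible to foo() are illustrative and not from the slides. The sketch also ignores types, which a real analysis would use to narrow the result.

```python
def after_opaque_call(points_to, accessible):
    """Conservatively model a call to a procedure we know nothing about.

    Any pointer the callee can access may be overwritten with the address
    of any variable the callee can access, so its points-to set is widened.
    Pointers the callee cannot access keep their points-to sets.
    """
    result = {}
    for ptr, targets in points_to.items():
        if ptr in accessible:
            result[ptr] = set(targets) | set(accessible)
        else:
            result[ptr] = set(targets)
    return result


# Points-to sets just before the call: p1 = &x; p2 = &p1;
before = {"p1": {"x"}, "p2": {"p1"}}

# Assumption: x, p1 and p2 are all globals, hence accessible to foo().
after = after_opaque_call(before, accessible={"x", "p1", "p2"})
```

Before the call each pointer has exactly one target. After the call both sets contain every accessible variable, which is the information loss the slide refers to.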
Applications of IPA
• Virtual method invocation
• Pointer alias analysis
• Parallelization
• Detection of software errors and vulnerabilities
• SQL injection
• Buffer overflow analysis & protection
Basic Concepts
• Procedure (Function)
• Caller/Callee
• Call Site
• Call Graph
• Call Context
• Call Strings
• Formal Arguments
• Actual Arguments
Terminology
• Goal
  – Avoid making overly conservative assumptions about the effects of procedures and the state at call sites

int a, e                  // globals
procedure foo(var b, c)   // formal args
  b := c
end
program main
  int d                   // locals
  foo(a, d)               // call site with actual args
end

• In procedure body
  – formals and/or globals may be aliased (two names refer to same location)
  – formals may have constant value
• At procedure call
  – global vars may be modified or used
  – actual args may be modified or used
Interprocedural Analysis vs. Interprocedural Optimization
• Interprocedural analysis
  – Gather information across multiple procedures (typically across the entire program)
  – Can use this information to improve intraprocedural analysis and optimization (e.g., CSE)
• Interprocedural optimizations
  – Optimizations that involve multiple procedures, e.g., inlining, procedure cloning, interprocedural register allocation
  – Optimizations that use interprocedural analysis
The Call Graph
• Represent procedure call relationships by a call graph
  – G = (V, E, start)
  – Each procedure is a unique vertex
  – Call site = edge between caller & callee
    • (u, v) = call from u to v (u may call v)
    • Can label with source line
  – Cycles represent recursion
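The definition above can be sketched in a few lines of Python. This is an illustration only: the class name `CallGraph`, its methods, and the example procedures main, f, g, h with their line numbers are assumptions made for this sketch.

```python
from collections import defaultdict


class CallGraph:
    """G = (V, E, start): one vertex per procedure, one edge per call site."""

    def __init__(self, start):
        self.start = start
        # caller -> set of (callee, source_line) pairs
        self.edges = defaultdict(set)

    def add_call(self, caller, callee, line):
        self.edges[caller].add((callee, line))
        self.edges[callee]  # touch the key so the callee is registered as a vertex

    def callees(self, u):
        return {v for v, _ in self.edges[u]}

    def recursive_procedures(self):
        """Return the procedures that lie on a cycle of the call graph."""
        result = set()
        for p in list(self.edges):
            # Depth-first search from p's callees; reaching p again means a cycle.
            stack = list(self.callees(p))
            seen = set()
            while stack:
                q = stack.pop()
                if q == p:
                    result.add(p)
                    break
                if q not in seen:
                    seen.add(q)
                    stack.extend(self.callees(q))
        return result


graph = CallGraph(start="main")
graph.add_call("main", "f", line=3)
graph.add_call("main", "h", line=4)
graph.add_call("f", "g", line=7)
graph.add_call("g", "f", line=12)   # f and g are mutually recursive

recursive = graph.recursive_procedures()
```

Here `recursive` contains f and g, the two procedures on the cycle, while main and h are not recursive.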
Call Graph
Super Graph
Validity of Interprocedural Control Flow Paths
Safety, Precision, and Efficiency of Data Flow Analysis
• Data flow analysis uses a static representation of programs to compute summary information along paths
• Ensuring safety: all valid paths must be covered
  – Valid path: a path which represents legal control flow
• Ensuring precision: only valid paths should be covered
  – Subject to merging data flow values at shared program points without creating invalid paths
• Ensuring efficiency: only relevant valid paths should be covered
  – Relevant path: a path which yields information that affects the summary information
Flow and Context Sensitivity
• Flow sensitive analysis: considers intraprocedurally valid paths
• Context sensitive analysis: considers interprocedurally valid paths
• For maximum statically attainable precision, analysis must be both flow and context sensitive.
Context Sensitivity in Interprocedural Analysis
Example of Context Sensitivity
Staircase Diagrams of Interprocedurally Valid Paths
“You can descend only as much as you have ascended!”
Every descending step must match a corresponding ascending step.
Context Sensitivity in Presence of Recursion
• For a path from u to v, g must be applied exactly the same number of times as f.
• For a prefix of the above path, g can be applied at most as many times as f.
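The matching rule behind the staircase picture can be checked with a stack. The sketch below is a hedged illustration: the event encoding (`"call"` / `"return"` paired with a call-site label such as `c1`) and the function name `unmatched_calls` are assumptions of this example, and paths are assumed to start at the program entry.

```python
def unmatched_calls(path):
    """Check interprocedural validity of a path.

    `path` is a list of (kind, site) events, where kind is "call" or
    "return" and site labels the call site. Returns the stack of call
    sites that are still pending (the calling context), or None if some
    return does not match the most recent unmatched call.
    """
    stack = []
    for kind, site in path:
        if kind == "call":
            stack.append(site)           # ascending step
        elif kind == "return":
            if not stack or stack[-1] != site:
                return None              # descended more than ascended, or to the wrong caller
            stack.pop()                  # descending step matches its ascending step
    return stack


complete = [("call", "c1"), ("call", "c2"), ("return", "c2"), ("return", "c1")]
prefix = [("call", "c1"), ("call", "c2"), ("return", "c2")]
invalid = [("call", "c1"), ("return", "c2")]

complete_context = unmatched_calls(complete)   # []      : calls and returns balance
prefix_context = unmatched_calls(prefix)       # ["c1"]  : still inside the call at c1
invalid_context = unmatched_calls(invalid)     # None    : return goes to the wrong call site
```

An empty result corresponds to a complete path with equally many ascending and descending steps. A non-empty result is a valid prefix whose remaining entries are the calling context.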
Interprocedural Analysis
• Goals
  – Enable standard optimizations even with procedure calls
  – Reduce call overhead for procedures
  – Enable optimizations not possible for single procedures
• Optimizations
  – Register allocation
  – Loop transformations
  – CSE, etc.
Analysis Sensitivity
• Flow-insensitive
  – What may happen (on at least one path)
  – Linear-time
• Flow-sensitive
  – Consider control flow (what must happen)
  – Iterative data-flow: possibly exponential
• Context-insensitive
  – Call treated the same regardless of caller
  – “Monovariant” analysis
• Context-sensitive
  – Reanalyze callee for each caller
  – “Polyvariant” analysis
• Path-sensitive vs. path-insensitive
  – Computes one answer for every execution path
  – Subsumes flow-sensitivity
  – Extremely expensive
• More sensitivity: more accuracy, but more expensive
Increasing Precision in Data Flow Analysis
(figure annotation: actually, only caller sensitive)
Precision of IPA
• Flow-insensitive
  – Result not affected by control flow in procedure
• Flow-sensitive
  – Result affected by control flow in procedure
Context Sensitivity
• Re-analyze callee as if procedure was inlined
• Too expensive in space & time
  – Recursion?
• Approximate context sensitivity:
  – Reanalyze callee for k levels of calling context

a = id(3); b = id(4);
id(x) { return x; }
(x is 3 in one calling context and 4 in the other)

a = min(3, 4); s = min(“aardvark”, “vacuum”);
min(x, y) { if (x <= y) return x; else return y; }
(x and y are ints in one calling context and strings in the other)
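To make the id example concrete, here is a small Python sketch of constant propagation through id under both regimes. The names `NAC`, `context_insensitive` and `context_sensitive` are illustrative assumptions. The sketch hard-codes the fact that id returns its argument rather than analyzing a procedure body.

```python
NAC = "not-a-constant"

# Result variable at each call site -> constant actual argument passed to id.
call_sites = {"a": 3, "b": 4}


def context_insensitive(sites):
    """One summary for id: merge the arguments of all call sites first."""
    merged = None
    for arg in sites.values():
        if merged is None:
            merged = arg
        elif merged != arg:
            merged = NAC          # two different constants meet: no longer constant
    # id returns its parameter, so every call site sees the merged value.
    return {var: merged for var in sites}


def context_sensitive(sites):
    """id is re-analyzed per call site, so each result keeps its own argument."""
    return {var: arg for var, arg in sites.items()}


insensitive_result = context_insensitive(call_sites)
sensitive_result = context_sensitive(call_sites)
```

The context-insensitive result maps both a and b to `NAC`, while the context-sensitive result recovers a = 3 and b = 4, at the cost of analyzing id once per call site.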
Path Sensitivity
• Path-sensitive analysis
  – Computes an answer for every path:
    • x is 4 at the end of the left path
    • x is 5 at the end of the right path
• Path-insensitive analysis
  – Computes one answer for all paths:
    • x is not constant
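A tiny Python sketch of the same contrast, assuming a program with two branches that assign x = 4 on the left and x = 5 on the right. The original figure is not available, so the `paths` encoding and the helper names are assumptions of this example.

```python
# Each path is the list of assignments executed along it.
paths = {
    "left": [("x", 4)],
    "right": [("x", 5)],
}


def run(assignments):
    """Evaluate one path and return the final variable values."""
    env = {}
    for var, value in assignments:
        env[var] = value
    return env


def merge(values):
    """Merge the values from all paths into a single answer."""
    distinct = set(values)
    return distinct.pop() if len(distinct) == 1 else "not constant"


# Path-sensitive: one answer per path.
path_sensitive = {name: run(p)["x"] for name, p in paths.items()}

# Path-insensitive: a single answer covering all paths.
path_insensitive = merge(path_sensitive.values())
```

The per-path answers are 4 and 5, and merging them yields "not constant". The number of paths can grow exponentially with the number of branches, which is why path sensitivity is so expensive.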
Key Challenges for Interprocedural Analysis
• Compilation time, memory
  – Key problem: scalability to large programs
  – Dominated by analysis time/memory
  – Flow-sensitive analysis: bottleneck often memory, not time
  – Often limited to fast but imprecise analysis
• Multiple calling environments
  – Different calls to P() have different properties:
    • Known constants
    • Aliases
    • Surrounding execution context (e.g., enclosing loops)
    • Function pointer arguments
    • Frequency of the call
• Recursion
Brute Force: Full Context-Sensitive Interprocedural Analysis
• Invocation Graph [Emami94]
  – Use an invocation graph, which distinguishes all calling chains
  – Re-analyze callee for all distinct calling paths
  – Pro: precise
  – Cons: exponentially expensive, recursion is tricky
Middle Ground: Use Call Graph and Compute Summaries
• Goal
  – Represent procedure call relationships
• Definition
  – If program P consists of n procedures: p1, . . ., pn
  – Static call graph of P is GP = (N, S, E, r)
    • N = {p1, . . ., pn}
    • S = {call-site labels}
    • E ⊆ N × N × S
    • r ∈ N is start node
Summary Information
• Compute summary information for each procedure
  – Summarize effect of called procedure for callers
  – Summarize effect of callers for called procedure
• Store summaries in database
  – Use later when optimizing procedures
• Pros
  + Concise
  + Can be fast to compute and use
  + Separate compilation practical
• Cons
  – Imprecise if only have one summary per procedure
Two Types of Information
• Track info that flows into procedures
  – “Propagation problems”, e.g.:
    • Which formals are constant?
    • Which formals are aliased to globals?
• Track info that flows out of procedures
  – “Side effect problems”, e.g.:
    • Which globals defined/used by procedure?
    • Which locals defined/used by procedure?
    • Which actual parameters defined by procedure?

proc(x, y) { . . . }
Propagation Summaries: Examples
• MAY-ALIAS
  – Formals that may be aliased to globals
• MUST-ALIAS
  – Formals definitely aliased to globals
• CONSTANT
  – Formals that are definitely constant
Side-Effect Summaries: Examples
• MOD
  – Variables possibly modified (defined) by procedure call
• REF
  – Variables possibly referenced (used) by procedure
• KILL
  – Variables that are definitely killed in procedure
Computing Summaries
• Bottom-up (MOD, REF, KILL)
  – Summarizes call effects
• Top-down (MAY-ALIAS)
  – Summarizes information about caller
• Bi-directional (AVAIL, CONSTANT)
  – Info to/from caller & callee
Side-Effect Summarization
• At procedure boundaries:
  – Translate formal args to actuals at call site
• Compute:
  – GMOD, GREF = procedure side effects
  – MOD, REF = effects at call site
    • Possibly specific to call
Parameter Binding
• At procedure boundaries, we need to translate formal arguments of procedure to actual arguments of procedure at call site

int a, b
program main              // MOD(foo) = b
  foo(b)                  // REF(foo) = a,b
end
procedure foo (var c)     // GMOD(foo) = b
  int d                   // GREF(foo) = a,b
  d := b
  bar(b)                  // MOD(bar) = b
end                       // REF(bar) = a
procedure bar (var d)
  if (...)                // GMOD(bar) = d
    d := a                // GREF(bar) = a
end
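The GMOD and MOD annotations in this example can be reproduced with a small fixed-point computation. The Python sketch below is one possible formulation, not necessarily the algorithm the lecture has in mind: the `procs` table, the `imod` field (variables assigned directly in a body) and the helper names are assumptions, and name shadowing between locals and globals is ignored.

```python
# The program from the slide: globals a, b; main calls foo(b); foo calls bar(b).
procs = {
    "main": {"formals": [], "locals": set(), "imod": set(),
             "calls": [("foo", ["b"])]},
    "foo": {"formals": ["c"], "locals": {"d"}, "imod": {"d"},   # d := b
            "calls": [("bar", ["b"])]},
    "bar": {"formals": ["d"], "locals": set(), "imod": {"d"},   # d := a
            "calls": []},
}


def bind(callee, actuals, names):
    """Translate callee-side names to caller-side names at one call site."""
    mapping = dict(zip(procs[callee]["formals"], actuals))
    return {mapping.get(name, name) for name in names}


def compute_gmod():
    """Iterate to a fixed point so that recursion would also be handled."""
    gmod = {p: set() for p in procs}
    changed = True
    while changed:
        changed = False
        for p, info in procs.items():
            new = set(info["imod"])
            for callee, actuals in info["calls"]:
                new |= bind(callee, actuals, gmod[callee])
            new -= info["locals"]        # locals are invisible to callers
            if new != gmod[p]:
                gmod[p] = new
                changed = True
    return gmod


gmod = compute_gmod()
# MOD at the call site bar(b) inside foo: bar's formal d is bound to b.
mod_at_bar_call = bind("bar", ["b"], gmod["bar"])
```

With this table, `gmod` is {d} for bar and {b} for foo, and `mod_at_bar_call` is {b}, matching the comments in the slide. REF and GREF can be computed the same way from the variables each body reads.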
Constructing Summary Flow Functions Iteratively
Termination is possible only if all function compositions and confluences can be reduced to a finite set of functions
An Example of Interprocedural Liveness Analysis
e ∈ InSp but e ∉ Inc1
Interprocedural Validity and Calling Contexts
“You can descend only as much as you have ascended!”
Every descending step must match a corresponding ascending step.
Calling context is represented by the remaining descending steps.
Available Expressions Analysis Using Call Strings Approach
Is a * b available?

int a, b, t;
void p() {
  if (a == 0) {
    a = a-1;
    p();
    t = a * b;
  }
}

YES!
Alternatives to IPA: Inlining
• Replaces calls to procedures with copies of their bodies
• Converts calls from opaque objects to local code
  – Exposes the “effects” of the called procedure
  – Extends the compilation region
• Language support: the inline attribute
  – But the compiler can decide per call-site, rather than per procedure
Inlining Decisions
• Must be based on
  – Heuristics, or
  – Profile information
• Considerations
  – The size of the procedure body (smaller=better)
  – Number of call sites (1=usually wins)
  – If call site is in a loop (yes=more optimizations)
  – Constant-valued parameters
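The considerations above could be combined into a per-call-site heuristic along the following lines. This Python sketch is purely illustrative: the `CallSite` fields, the size limit of 40, and the bonuses for loops and constant arguments are invented for the example and do not come from the slides or from any particular compiler.

```python
from dataclasses import dataclass


@dataclass
class CallSite:
    callee_size: int        # e.g. number of IR instructions in the callee body
    callee_call_sites: int  # how many call sites in the program target the callee
    in_loop: bool           # is this call site inside a loop?
    constant_args: int      # number of constant-valued actual parameters


def should_inline(site, size_limit=40):
    """Decide per call site, not per procedure (all thresholds are hypothetical)."""
    if site.callee_call_sites == 1:
        return True                       # single caller: no code growth
    budget = size_limit
    if site.in_loop:
        budget *= 2                       # more optimization opportunities in loops
    budget += 10 * site.constant_args     # constants let the inlined body be specialized
    return site.callee_size <= budget


single_caller = should_inline(CallSite(500, 1, False, 0))
big_cold_call = should_inline(CallSite(60, 3, False, 0))
big_call_in_loop = should_inline(CallSite(60, 3, True, 0))
```

With these made-up thresholds the single-caller procedure is always inlined, the 60-instruction callee is rejected at a straight-line call site, and the same callee is accepted when the call sits in a loop. A profile-guided variant would replace `in_loop` with a measured call frequency.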
Inlining Policies
• The hard question
  – How do we decide which calls to inline?
• Many possible heuristics
  – Only inline small functions
  – Let the programmer decide using an inline directive
  – Use a code expansion budget [Ayers, et al ’97]
  – Use profiling or instrumentation to identify hot paths; inline along the hot paths [Chang, et al ’92]
    • JIT compilers do this
  – Use inlining trials for object oriented languages [Dean & Chambers ’94]
    • Keep a database of functions, their parameter types, and the benefit of inlining
    • Keeps track of indirect benefit of inlining
    • Effective in an incrementally compiled language
Study on Real Compilers
Cooper, Hall, Torczon (92)
• Eight programs, five compilers, five processors
• Eliminated 99% of dynamic calls in 5 of the programs
• Measured speed of original vs. transformed code

What do you expect?
Results on real compilers
What happened?
• Input code violated assumptions made by compiler writers
  – Longer procedures
  – More names
  – Different code shapes
• Exacerbated problems that are unimportant on “normal” code
  – Imprecise analysis
  – Algorithms that scale poorly
  – Tradeoffs between global and local speed
  – Limitations in the implementations

The compiler writers were surprised!
Inlining: Summary
• Pros
  + Exposes context & side effects
  + Simple
• Cons
  - Code bloat (bad for caches, branch predictor)
  - Can’t decide statically for OOPs
  - Library source?
  - Recursion?
  - How do we decide when to inline?
Alternatives to IPA: Cloning
• Cloning: customize procedure for certain call sites
• Partition call sites to procedure p into equivalence classes
  – e.g., {{call3, call1}, {call4}}
• Equivalence based on optimization
  – Constant propagation: partition based on parameter value
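Partitioning call sites by constant parameter values can be sketched as follows. The Python below is an illustration under assumptions: the argument values (0 and 8) and the use of `None` for a non-constant argument are invented, and only the call-site names call1, call3 and call4 come from the slide.

```python
from collections import defaultdict


def partition_call_sites(call_sites):
    """Group the call sites of one procedure by their constant arguments.

    `call_sites` maps a call-site name to its list of actual arguments,
    where None marks an argument that is not a known constant. Each
    resulting class would get its own specialized clone of the procedure.
    """
    classes = defaultdict(list)
    for name, args in call_sites.items():
        classes[tuple(args)].append(name)
    return list(classes.values())


# Hypothetical call sites of procedure p(x, y).
call_sites = {
    "call1": [0, None],
    "call3": [0, None],
    "call4": [8, None],
}

classes = partition_call_sites(call_sites)
```

Here call1 and call3 share a clone specialized for x = 0 and call4 gets a clone for x = 8, giving the partition {{call3, call1}, {call4}} from the slide with two copies of p instead of one inlined copy per call site.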
Cloning
• Pros
  + Compromise between inlining & IPA
  + Less code bloat compared to inlining
  + No problem with recursion
  + Better caller/callee optimization potential (compared to IPA)
• Cons
  - Some code bloat (compared to IPA)
  - May have to do interprocedural analysis anyway
    • e.g., interprocedural constant propagation can guide cloning
Summary
• Interprocedural analysis
  – Difficult and expensive
    • Need source code, recompilation analysis
    • Trade-offs for precision & speed/space
    • Better than inlining
  – Useful for many optimizations
  – IPA and cloning likely to become more important
    • Java: many small procedures
Summary
• Most compilers avoid interprocedural analysis
  – It’s expensive and complex
  – Not beneficial for most classical optimizations
  – Separate compilation + interprocedural analysis requires recompilation analysis [Burke and Torczon ’93]
  – Can’t analyze library code
• When is it useful?
  – Pointer analysis
  – Constant propagation
  – Object oriented class analysis
  – Security and error checking
  – Program understanding and re-factoring
  – Code compaction
  – Parallelization
(figure annotation: “Modern” uses of compilers)
Trends
• Cost of procedures is growing
  – More of them and they’re smaller (OO languages)
  – Modern machines demand precise information (memory op aliasing)
• Cost of inlining is growing
  – Code bloat degrades efficacy of many modern structures
  – Procedures are being used more extensively
• Programs are becoming larger
• Cost of interprocedural analysis is shrinking
  – Faster machines
  – Better methods
Next Time
• Homework
  – Convert program to SSA form
  – Exercise 12.1.1
• Pointer Analysis
  – Reading: Dragon chapter 12
• Mid-term Review