(Slide content courtesy of Greg Steffan, U. of Toronto)
Todd C. Mowry 15-745: Pointer Analysis
Carnegie Mellon
Pros and Cons of Pointers
• Many procedural languages have pointers
  – e.g., C or C++: int *p = &x;
• Pointers are powerful and convenient
  – can build arbitrary data structures
• Pointers can also hinder compiler optimization
  – hard to know where pointers are pointing
  – must be conservative in their presence
• Has inspired much research
  – analyses to decide where pointers are pointing
  – many options and trade-offs
  – open problem: a scalable, accurate analysis
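A minimal sketch of why the compiler must be conservative. The function name `store_then_load` is a hypothetical illustration, not something from the slides. Unless an analysis proves that `p` and `q` never alias, the final load cannot be replaced by the constant 1.

```c
#include <assert.h>

/* Hypothetical example: the result depends on whether p and q alias. */
int store_then_load(int *p, int *q) {
    assert(p && q);  /* both pointers must be valid */
    *p = 1;          /* store through p */
    *q = 2;          /* overwrites *p only if q aliases p */
    return *p;       /* 1 when p and q differ, 2 when they alias */
}
```

Called with two distinct variables, the function returns 1; called with the same address twice, it returns 2. An optimizer that folded the return value to 1 would therefore be wrong in the aliasing case.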
Pointer Analysis Basics: Aliases
• Two variables are aliases if:
  – they reference the same memory location
• More useful:
  – prove variables reference different locations
int x,y;
int *p = &x;
int *q = &y;
int *r = p;
int **s = &q;
Alias sets: {x, *p, *r}, {y, *q, **s}, {q, *s}
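The declarations above can be checked at run time. The following is a sketch only: the function name `check_alias_sets` is hypothetical, and the variables are made local so the block is self-contained. Each assertion compares the addresses named by two expressions.

```c
#include <assert.h>

/* Hypothetical check: asserts which expressions name the same location. */
int check_alias_sets(void) {
    int x = 0, y = 0;
    int *p = &x;
    int *q = &y;
    int *r = p;
    int **s = &q;

    assert(&*p == &x && &*r == &x);   /* x, *p and *r are one location */
    assert(&*q == &y && &**s == &y);  /* y, *q and **s are one location */
    assert(&*s == &q);                /* q and *s are one location */
    assert(&p != &r);                 /* p and r are distinct variables */

    *r = 5;    /* a write through r is visible in x */
    **s = 7;   /* a write through s is visible in y */
    return x == 5 && y == 7;
}
```

Note that `p` and `r` hold the same address but are themselves different locations, so they are not aliases of each other; only the expressions `*p` and `*r` are.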
The Pointer Alias Analysis Problem
• Decide for every pair of pointers at every program point:
  – do they point to the same memory location?
• A difficult problem
  – shown to be undecidable by Landi, 1992
• Correctness:
  – report all pairs of pointers which do/may alias
• Ambiguous:
  – two pointers which may or may not alias
• Accuracy/Precision:
  – how few pairs of pointers are reported while remaining correct
  – i.e., reduce ambiguity to improve accuracy
Many Uses of Pointer Analysis
• Basic compiler optimizations
  – register allocation, CSE, dead code elimination, live variables, instruction …
• Behavioral synthesis
  – automatically converting C code into gates
• Error detection and program understanding
  – memory leaks, wild pointers, security holes
Challenges for Pointer Analysis
• Complexity: huge in space and time
  – compare every pointer with every other pointer
  – at every program point
  – potentially considering all program paths to that point
• Scalability vs accuracy trade-off
  – different analyses motivated for different purposes
  – many useful algorithms (adds to confusion)
• Allocation site (any call to malloc/calloc)
  – Consider each to be a unique location
  – Doesn’t differentiate between multiple objects allocated by the same allocation site
• Shape analysis
  – Recognize linked lists, trees, DAGs, etc.
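A small sketch of the imprecision of allocation-site modeling. The helper `make_cell` and the checking function are hypothetical names introduced for illustration. Both objects come from the single `malloc` call inside `make_cell`, so an allocation-site model represents them as one abstract location even though they are distinct at run time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: every object it returns comes from one allocation site. */
int *make_cell(int value) {
    int *cell = malloc(sizeof *cell);   /* the single allocation site */
    if (cell != NULL)
        *cell = value;
    return cell;
}

/* Returns 1 if two objects from the same site are distinct at run time. */
int same_site_objects_are_distinct(void) {
    int *a = make_cell(1);
    int *b = make_cell(2);
    assert(a && b);          /* demo only: treat allocation failure as fatal */
    int ok = (a != b) && (*a == 1) && (*b == 2);
    free(a);
    free(b);
    return ok;
}
```

Under allocation-site modeling, an analysis would report that `*a` and `*b` may alias, which is correct but imprecise: the two pointers never refer to the same object.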
Aggregate Modeling Options
Arrays
  – Elements are treated as individual locations
  – or treat entire array as a single location
  – or treat first element separate from others
Structures
  – Elements are treated as individual locations (“field sensitive”)
  – or treat entire structure as a single location
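Field sensitivity can be illustrated with a two-field structure. This is a sketch under assumptions: `struct Pair` and the function name are hypothetical. A field-sensitive analysis tracks `first` and `second` separately; an analysis that treats the whole structure as one location must assume either field may point to `x` or `y`.

```c
#include <assert.h>

/* Hypothetical aggregate with two pointer fields. */
struct Pair {
    int *first;
    int *second;
};

/* Returns 1 if a store through one field leaves the other field's target alone. */
int fields_have_separate_targets(void) {
    int x = 1, y = 2;
    struct Pair pr = { &x, &y };

    *pr.first = 10;                  /* writes x only */
    assert(pr.first != pr.second);   /* the fields hold different addresses */
    return x == 10 && y == 2;
}
```

With the structure modeled as a single location, the store through `pr.first` would have to be treated as a possible write to `y` as well.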
Flow Sensitivity Options
• Flow insensitive
  – The order of statements doesn’t matter
    • Result of analysis is the same regardless of statement order
  – Uses a single global state to store results as they are computed
  – Not very accurate
• Flow sensitive
  – The order of the statements matters
  – Need a control flow graph
  – Must store results for each program point
  – Improves accuracy
• Path sensitive
  – Each path in a control flow graph is considered
Flow Sensitivity Example
(assuming allocation-site heap modeling)
S1: a = malloc(…);
S2: b = malloc(…);
S3: a = b;
S4: a = malloc(…);
S5: if(c) a = b;
S6: if(!c) a = malloc(…);
S7: … = *a;
Flow Insensitive: aS7 → {heapS1, heapS2, heapS4, heapS6}
Flow Sensitive: aS7 → {heapS2, heapS4, heapS6}
Path Sensitive: aS7 → {heapS2, heapS6}
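The three analyses can be hand-simulated for statements S1 to S7. The sketch below is specialized to this one example and is not a general analysis: points-to sets are bitmasks, and the constants `H1`, `H2`, `H4`, `H6` (one per allocation site) and all function names are hypothetical.

```c
#include <assert.h>

/* Abstract heap objects, one bit per allocation site (hypothetical encoding). */
enum { H1 = 1, H2 = 2, H4 = 4, H6 = 8 };

/* Flow-insensitive: ignore statement order and conditions, iterate to a fixed point. */
unsigned a_flow_insensitive(void) {
    unsigned a = 0, b = 0, old_a, old_b;
    do {
        old_a = a;
        old_b = b;
        a |= H1;   /* S1: a = malloc */
        b |= H2;   /* S2: b = malloc */
        a |= b;    /* S3: a = b */
        a |= H4;   /* S4: a = malloc */
        a |= b;    /* S5: a = b (condition ignored) */
        a |= H6;   /* S6: a = malloc (condition ignored) */
    } while (a != old_a || b != old_b);
    return a;
}

/* Flow-sensitive: follow statement order with strong updates, merge at branches. */
unsigned a_flow_sensitive(void) {
    unsigned a, b;
    a = H1;        /* S1 */
    b = H2;        /* S2 */
    a = b;         /* S3 kills the S1 fact */
    a = H4;        /* S4 kills the S3 fact */
    a = a | b;     /* S5: join of "taken" (a = b) and "not taken" */
    a = a | H6;    /* S6: join of "taken" (a = malloc) and "not taken" */
    return a;
}

/* Path-sensitive: analyze each feasible path (c false, c true) separately. */
unsigned a_path_sensitive(void) {
    unsigned result = 0;
    for (int c = 0; c <= 1; c++) {
        unsigned a, b;
        a = H1;
        b = H2;
        a = b;
        a = H4;
        if (c)  a = b;    /* S5 */
        if (!c) a = H6;   /* S6 */
        result |= a;      /* union over the paths reaching S7 */
    }
    return result;
}

/* Usage: each analysis is at least as precise as the one before it. */
void check_flow_example(void) {
    assert(a_flow_insensitive() == (H1 | H2 | H4 | H6));
    assert(a_flow_sensitive() == (H2 | H4 | H6));
    assert(a_path_sensitive() == (H2 | H6));
}
```

The path-sensitive result is smallest because it knows that S5 and S6 are guarded by complementary conditions, so exactly one of them executes on any path.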
Context Sensitivity Options
• Context insensitive/sensitive
  – whether to consider different calling contexts
  – e.g., what are the possibilities for p at S6?
int a, b, *p;
int main() {
S1: f();
S2: p = &a;
S3: g();
}
int f() {
S4: p = &b;
S5: g();
}
int g() {
S6: … = *p;
}
Context Insensitive: pS6 → {a, b}
Context Sensitive: called from S5: pS6 → {b}; called from S3: pS6 → {a}
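A runnable version of the example makes the two calling contexts visible. This is a sketch with added instrumentation: the array `seen_at_S6`, the counter `calls_to_g`, and the wrapper `run_example` (standing in for the slide's `main`) are hypothetical additions.

```c
#include <assert.h>
#include <stddef.h>

int a, b, *p;

/* Hypothetical instrumentation: what p pointed to on each call of g. */
int *seen_at_S6[2];
int calls_to_g = 0;

void g(void) {
    assert(p != NULL);                    /* S6 dereferences p */
    if (calls_to_g < 2)
        seen_at_S6[calls_to_g] = p;       /* S6: ... = *p */
    calls_to_g++;
}

void f(void) {
    p = &b;   /* S4 */
    g();      /* S5 */
}

/* Body of the slide's main(), wrapped so it can be called from a test. */
void run_example(void) {
    calls_to_g = 0;
    f();      /* S1 */
    p = &a;   /* S2 */
    g();      /* S3 */
}
```

After `run_example()`, `seen_at_S6[0]` is `&b` (the call from S5) and `seen_at_S6[1]` is `&a` (the call from S3). A context-sensitive analysis can keep these two facts apart, while a context-insensitive one merges them into a single set for S6.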
Pointer Alias Analysis Algorithms
References:
• “Points-to analysis in almost linear time”, Steensgaard, POPL 1996
• “Program Analysis and Specialization for the C Programming Language”, Andersen, Technical Report, 1994
• “Context-sensitive interprocedural points-to analysis in the presence of function pointers”, Emami et al., PLDI 1994
• “Pointer analysis: haven't we solved this problem yet?”, Hind, PASTE 2001
• “Which pointer analysis should I use?”, Hind et al., ISSTA 2000
Address Taken
• Basic, fast, ultra-conservative algorithm
  – flow-insensitive, context-insensitive
  – often used in production compilers
• Algorithm:
  – Generate the set of all variables whose addresses are assigned to another variable.
  – Assume that any pointer can potentially point to any variable in that set.
• Complexity: O(n), linear in size of program
• Accuracy: very imprecise
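The two algorithm steps above can be sketched over a toy statement list. Everything here is an assumption made for illustration: the mini-IR (`Stmt`, `STMT_ADDR_OF`, `STMT_OTHER`), the numbering of variables as 0..31, and the function names. Heap allocations are left out of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mini-IR: a statement is either "dst = &src" or anything else. */
typedef enum { STMT_ADDR_OF, STMT_OTHER } StmtKind;
typedef struct { StmtKind kind; int dst; int src; } Stmt;

/* Step 1: one linear pass collects every variable whose address is taken.
   Variables are numbered 0..31 so the set fits in a bitmask. */
unsigned address_taken_set(const Stmt *prog, size_t n) {
    unsigned taken = 0;
    for (size_t i = 0; i < n; i++) {
        if (prog[i].kind == STMT_ADDR_OF) {
            assert(prog[i].src >= 0 && prog[i].src < 32);
            taken |= 1u << prog[i].src;
        }
    }
    return taken;
}

/* Step 2: any pointer may point to any variable in the set,
   so the answer does not depend on which pointer is asked about. */
unsigned may_point_to(unsigned taken, int pointer_var) {
    (void)pointer_var;
    return taken;
}
```

For the statements `p = &x; q = p; q = &y;` with x, y, p, q numbered 0 to 3, the set is {x, y}, and both `p` and `q` are reported as possibly pointing to either variable, even though `p` only ever points to `x`. The single pass over the statements is what gives the linear cost.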
Address Taken Example
T *p, *q, *r;
int main() {
S1: p = alloc(T);
    f();
    g(&p);
S4: p = alloc(T);
S5: … = *p;
}
pS5 =
References:
• “Cloning-based context-sensitive pointer alias analysis using binary decision diagrams”, Whaley and Lam, PLDI 2004
• “Symbolic pointer analysis revisited”, Zhu and Calman, PLDI 2004
• “Points-to analysis using BDDs”, Berndl et al., PLDI 2003
Binary Decision Diagram (BDD)
[Figure panels: Truth Table, Binary Decision Tree, BDD]
BDD-Based Pointer Analysis
• Use a BDD to represent transfer functions
  – encode procedure as a function of its calling context
  – compact and efficient representation
• Perform context-sensitive, inter-procedural analysis
  – similar to dataflow analysis
  – but across the procedure call graph
• Gives accurate results
  – and scales up to large programs
Probabilistic Pointer Analysis
References:
• “A Probabilistic Pointer Analysis for Speculative Optimizations”, DaSilva and Steffan, ASPLOS 2006
• “Compiler support for speculative multithreading architecture with probabilistic points-to analysis”, Shen et al., PPoPP 2003
• “Speculative Alias Analysis for Executable Code”, Fernandez and Espasa, PACT 2002
• “A General Compiler Framework for Speculative Optimizations Using Data Speculative Code Motion”, Dai et al., CGO 2005
• “Speculative register promotion using Advanced Load Address Table (ALAT)”, Lin et al., CGO 2003
Pointer Analysis: Yes, No, & Maybe
• Do pointers a and b point to the same location?
  – Repeat for every pair of pointers at every program point
• How can we optimize the “maybe” cases?
[Figure: Pointer Analysis takes the store “*a = ~” and the load “~ = *b” and answers Definitely, Definitely Not, or Maybe before the code is optimized]
Let’s Speculate
• Implement a potentially unsafe optimization
  – Verify and Recover if necessary
Before (a is probably loop invariant):
int *a, x;
…
while(…) {
  x = *a;
  …
}
After:
int *a, x, tmp;
…
tmp = *a;
while(…) {
  x = tmp;
  …
}
<verify, recover?>
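One way to picture the verify-and-recover step is a software check after each store that might alias the hoisted load. This is a sketch under assumptions, not the mechanism from the slides: the functions `fill_original` and `fill_speculative` and the loop body are hypothetical, and the address comparison stands in for whatever hardware or compiler check a real system would use.

```c
#include <assert.h>

/* Original loop: reloads *a every iteration because out[i] might alias *a. */
void fill_original(int *a, int *out, int n) {
    assert(a && out && n >= 0);
    for (int i = 0; i < n; i++) {
        int x = *a;
        out[i] = x + 1;
    }
}

/* Speculative version: hoist the load, then verify and recover in software. */
void fill_speculative(int *a, int *out, int n) {
    assert(a && out && n >= 0);
    int tmp = *a;                 /* speculatively hoisted load */
    for (int i = 0; i < n; i++) {
        int x = tmp;
        out[i] = x + 1;
        if (&out[i] == a)         /* verify: did this store hit *a? */
            tmp = *a;             /* recover: redo the load */
    }
}
```

When `out` never overlaps `a`, the check always fails and the loop runs with a single load. When some `out[i]` is the same location as `*a`, the recovery reload keeps the result identical to the original loop.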
Data Speculative Optimizations
• EPIC instruction sets
  – Support for speculative load/store instructions (e.g., Itanium)
• Speculative compiler optimizations
  – Dead store elimination, redundancy elimination, copy propagation, strength