Type Systems For Distributed Data Sharing Ben Liblit Alex Aiken Kathy Yelick

Dec 21, 2015

Transcript
Page 1

Type Systems For Distributed Data Sharing

Ben Liblit, Alex Aiken, Kathy Yelick

Page 2

Distributed Sharing: Many Uses

• Data location management
• Cache coherence
• Race condition detection
• Program/algorithm documentation
• Consistency model relaxation
• Synchronization elimination
• Autonomous garbage collection
• Security

Page 3

Distributed Memory Model

• Multiple machines, each with local memory

• Global memory is union of local memories

• Distinguish two types of pointers:
  – Local: points to local memory only
  – Global: points anywhere: ⟨machine, address⟩
  – Different representations & operations
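
As a rough illustration of the two pointer kinds, the sketch below models each machine's memory as a Python dictionary. The names `Machine`, `LocalPtr`, `GlobalPtr`, `deref`, and `widen` are hypothetical and do not come from the slides; the point is only that a local pointer is a bare address while a global pointer also carries a machine.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """One machine with its own local memory (address -> value)."""
    ident: int
    memory: dict = field(default_factory=dict)


@dataclass(frozen=True)
class LocalPtr:
    """Bare address; meaningful only on the machine that holds it."""
    address: int


@dataclass(frozen=True)
class GlobalPtr:
    """(machine, address) pair; meaningful from any machine."""
    machine: int
    address: int


def deref(ptr, here, cluster):
    """Dereference ptr as seen from machine `here`."""
    if isinstance(ptr, LocalPtr):
        return here.memory[ptr.address]              # plain local load
    return cluster[ptr.machine].memory[ptr.address]  # may need communication


def widen(ptr, here):
    """Turn a local pointer held on `here` into the equivalent global pointer."""
    return GlobalPtr(here.ident, ptr.address)


m0 = Machine(0, {100: 5})
m1 = Machine(1, {100: 7})
cluster = {0: m0, 1: m1}
p = LocalPtr(100)
```

The same local address names different cells on different machines, so `deref(p, m0, cluster)` and `deref(p, m1, cluster)` disagree, whereas `widen(p, m0)` refers to the same cell from either machine.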

Page 4

Type Grammar

τ ::= int | ⟨τ, τ⟩ | boxed ω τ

ω ::= local | global

• Boxed and unboxed values
• Integers, pointers, and pairs
  – Pairs are not assumed boxed
• References to boxes are either local or global
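
One way to picture this grammar is as a small algebraic data type. The sketch below is only an illustration: the class names `Int`, `Pair`, and `Boxed` and the helper `show` are assumptions made for the example, not part of the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Int:
    """Unboxed integer."""


@dataclass(frozen=True)
class Pair:
    """Unboxed pair of two component types."""
    first: object
    second: object


@dataclass(frozen=True)
class Boxed:
    """Pointer to a box; width is "local" or "global"."""
    width: str
    referent: object

    def __post_init__(self):
        if self.width not in ("local", "global"):
            raise ValueError(f"unknown pointer width: {self.width!r}")


def show(t):
    """Render a type in a plain-ASCII version of the slide notation."""
    if isinstance(t, Int):
        return "int"
    if isinstance(t, Pair):
        return f"<{show(t.first)}, {show(t.second)}>"
    if isinstance(t, Boxed):
        return f"boxed {t.width} {show(t.referent)}"
    raise TypeError(f"not a type: {t!r}")


example = Boxed("global", Pair(Int(), Boxed("local", Int())))
```

Here `example` is a global pointer to an unboxed pair whose second component is a local pointer to an integer.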

Page 5

Review of Global Dereferencing: Standard Approach Unsound

x : boxed global τ
dereference of x : τ, where τ = boxed local int

[Figure: pointer diagram for x and a cell holding 5]

Page 6

Review of Global Dereferencing: Sound With Type Expansion

x : boxed global τ
dereference of x : expand(τ)

[Figure: pointer diagram for x and a cell holding 5]

Page 7

Type Expansion in Detail

expand(boxed ω τ) = boxed global τ
expand(⟨τ1, τ2⟩) = ⟨pop(τ1), pop(τ2)⟩
expand(int) = int

pop(boxed global τ) = boxed global τ
pop(⟨τ1, τ2⟩) = ⟨pop(τ1), pop(τ2)⟩
pop(int) = int

Page 8

Representation Versus Sharing

• Locally pointed-to data might not be private

[Figure: pointer diagram for a cell holding 5]

Page 9

Representation Versus Sharing

• Locally pointed-to data might not be private
  – Because of local / global aliasing

[Figure: pointer diagram; x and a cell holding 5]

Page 10

Representation Versus Sharing

• Locally pointed-to data might not be private
  – Because of transitivity + pointer widening

[Figure: pointer diagram; y and a cell holding 5]

Page 11

Representation Versus Sharing

• Globally pointed-to data might not be shared
  – What if the assignment to “y” never actually happens?

[Figure: pointer diagram; y and a cell holding 5]

Page 12

Representation Versus Sharing

• But globally used data must be shared
  – If the assignment to “y” can happen, the local pointer cell is shared.
  – What about the cell containing “5”?

[Figure: pointer diagram; y and a cell holding 5]

Page 13

Data Sharing as Types

• Shared data allows certain operations
  – Access by way of global pointer

• Private data allows other operations
  – Optimizations, GC, fast monitors, etc.

• Some form of polymorphism is essential
  – Neither subsumes the other
  – But we can have a common supertype

Page 14

Augmented Type Grammar

τ ::= int | ⟨τ, τ⟩ | boxed ω δ τ

δ ::= shared | mixed | private

ω ::= local | global

• Allow subtyping of pointers, pairs
  – But not across pointers, since we allow assignment

• Allocation is explicitly shared or private
• Question: what can you do with mixed data?
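
The subtyping restriction can be sketched as a small checker. This is an illustration under stated assumptions: `mixed` is taken to be the common supertype of `shared` and `private`, pointer widths are required to match exactly (local-to-global widening is treated as a separate coercion and is not modeled), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Int:
    """Unboxed integer."""


@dataclass(frozen=True)
class Pair:
    """Unboxed pair of two component types."""
    first: object
    second: object


@dataclass(frozen=True)
class Boxed:
    """Pointer with a width (local/global) and a sharing qualifier."""
    width: str
    sharing: str
    referent: object


def qualifier_le(a, b):
    """shared <= mixed, private <= mixed, and every qualifier <= itself."""
    return a == b or b == "mixed"


def subtype(s, t):
    """Covariant in pairs and in a pointer's own qualifier, invariant beneath pointers."""
    if isinstance(s, Int) and isinstance(t, Int):
        return True
    if isinstance(s, Pair) and isinstance(t, Pair):
        return subtype(s.first, t.first) and subtype(s.second, t.second)
    if isinstance(s, Boxed) and isinstance(t, Boxed):
        return (s.width == t.width
                and qualifier_le(s.sharing, t.sharing)
                # The referent must match exactly: the box can be assigned through.
                and s.referent == t.referent)
    return False


private_int = Boxed("local", "private", Int())
mixed_int = Boxed("local", "mixed", Int())
```

With these definitions `subtype(private_int, mixed_int)` holds, but a pointer to `private_int` is not a subtype of a pointer to `mixed_int`, because the check does not reach across the outer pointer.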

Page 15

Late Enforcement: Limited Use of Global Pointers

expand(boxed ω δ τ) = boxed global δ τ

x : boxed local δ τ
dereference of x : τ

x : boxed global shared τ
dereference of x : expand(τ)

Page 16

Late Enforcement: Applicability

• Data location management
• Cache coherence
• Race condition detection
• Program/algorithm documentation
• Consistency model relaxation
• Synchronization elimination
• Autonomous garbage collection (in practice)
• Security

Page 17

Why Garbage Collection Breaks

1. Send out global pointer to my private data

2. Destroy all my local pointers to it

3. GC locally unreachable private data

4. …

5. Get that global pointer back again later

6. It points to my data, so coerce to local

7. Use this local pointer to my private data
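
The seven steps can be replayed in a toy simulation. The sketch assumes a collector that traces only from local roots and knows nothing about pointers held on other machines; the function name `simulate` and the address `0x10` are made up for the example.

```python
def simulate():
    """Replay the seven steps; return True if the final access still finds the data."""
    heap = {0x10: "private data"}      # machine 0's local memory
    local_roots = {0x10}               # local pointers held on machine 0

    escaped = (0, 0x10)                # 1. global pointer (machine, address) sent out
    local_roots.clear()                # 2. all local pointers destroyed

    # 3. Local collection: anything not reachable from a local root is freed.
    for address in list(heap):
        if address not in local_roots:
            del heap[address]

    # 4. ... time passes ...
    machine, address = escaped         # 5. the global pointer comes back
    assert machine == 0                # 6. it names this machine, so coerce to local
    return address in heap             # 7. use the local pointer


still_valid = simulate()
```

Under these assumptions the final access finds nothing at the address, which is the dangling pointer the slide warns about.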

Page 18

Slightly Earlier Enforcement: No Escape of Private Addresses

expand(boxed ω shared τ′) = boxed global shared τ′

x : boxed local δ τ
dereference of x : τ

x : boxed global shared τ
dereference of x : expand(τ)

• Note that τ′ might reference private data
• Autonomous garbage collection: OK
• Security: not OK

Page 19

Early Enforcement: Shared is Transitively Closed

x : τ and allShared(τ)
shared allocation of x : boxed local shared τ

x : τ
private allocation of x : boxed local private τ

allShared(boxed ω shared τ) = true
allShared(⟨τ1, τ2⟩) = allShared(τ1) ∧ allShared(τ2)
allShared(int) = true
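
The allShared predicate used by early enforcement can be sketched as below. Two points are assumptions rather than things the transcript states: a boxed type whose qualifier is not `shared` is taken to fail the predicate, and the check does not look beneath a pointer, on the reasoning that the referent was already checked when its own box was allocated. The class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Int:
    """Unboxed integer."""


@dataclass(frozen=True)
class Pair:
    """Unboxed pair of two component types."""
    first: object
    second: object


@dataclass(frozen=True)
class Boxed:
    """Pointer with a width (local/global) and a sharing qualifier."""
    width: str
    sharing: str
    referent: object


def all_shared(t):
    """True if every pointer stored directly in a value of type t is shared."""
    if isinstance(t, Int):
        return True
    if isinstance(t, Pair):
        return all_shared(t.first) and all_shared(t.second)
    if isinstance(t, Boxed):
        return t.sharing == "shared"   # assumption: mixed and private both fail
    raise TypeError(f"not a type: {t!r}")


def can_allocate_shared(contents):
    """Early enforcement: a shared box may only hold all-shared contents."""
    return all_shared(contents)


shared_ptr = Boxed("local", "shared", Int())
private_ptr = Boxed("local", "private", Int())
```

For instance, a pair holding `shared_ptr` and an integer may be placed in a shared box, while a pair holding `private_ptr` may not.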

Page 20

Recap of Enforcement Strategies

• Late enforcement
  – Anything can point to anything
  – Restricted global dereference & assignment

[Figure: pointer diagram; y and cells holding 5 and 3]

Page 21

Recap of Enforcement Strategies

• Slightly earlier enforcement
  – Can only reveal shared addresses
  – Still restrict global pointer operations

[Figure: pointer diagram; y and cells holding 5 and 3]

Page 22

Recap of Enforcement Strategies

• Early enforcement
  – Shared universe is transitively closed
  – Global pointer restrictions trivially satisfied

[Figure: pointer diagram; y and cells holding 5 and 3]

Page 23

Type Inference: Constraint Generation

• Type structure already known
  – Including local / global

• Induce constraints on sharing qualifiers
    δ = shared from global deref / assign
    δ ≤ δ′ from assignments
    δ = δ′ from various other operations

• Stricter enforcement adds more constraints
    δ = shared ⇒ δ′ = shared
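
A constraint generator along these lines might look like the sketch below. The operation tuples (`"global_deref"`, `"global_assign"`, `"assign"`, `"same"`) are an invented encoding, and the direction chosen for assignment (qualifier of the assigned value ≤ qualifier of the destination) is an assumption. Equalities are recorded as a pair of ≤ constraints.

```python
SHARED = "shared"


def generate(operations):
    """Return a list of (lower, upper) pairs, each meaning lower <= upper."""
    constraints = []

    def equal(a, b):
        # a = b is recorded as a <= b together with b <= a.
        constraints.append((a, b))
        constraints.append((b, a))

    for op in operations:
        kind = op[0]
        if kind in ("global_deref", "global_assign"):
            equal(op[1], SHARED)                 # qualifier must be exactly shared
        elif kind == "assign":
            constraints.append((op[1], op[2]))   # value <= destination (assumed direction)
        elif kind == "same":
            equal(op[1], op[2])                  # all other operations
        else:
            raise ValueError(f"unknown operation: {kind!r}")
    return constraints


program = [
    ("global_deref", "d1"),
    ("assign", "d1", "d2"),
    ("same", "d2", "d3"),
]
result = generate(program)
```

The stricter enforcement policies would add implications of the form δ = shared ⇒ δ′ = shared on top of these; that step is left out of the sketch.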

Page 24

Type Inference: Constraint Resolution

• Given constraints:
    private ≤ δ1    δ ≤ δ1
    shared ≤ δ2     δ ≤ δ2

[Figure: constraint graph over private, δ, shared, δ1, δ2]

Page 25

Type Inference: Constraint Resolution

• Two “minimal” solutions
    δ = shared    δ1 = mixed    δ2 = shared

[Figure: constraint graph labeled with this solution]

Page 26

Type Inference: Constraint Resolution

• Two “minimal” solutions
    δ = shared    δ1 = mixed    δ2 = shared
    δ = private   δ1 = private  δ2 = mixed

[Figure: constraint graph labeled with the second solution]
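
The claim that this system has exactly two minimal solutions can be checked by brute force. The sketch assumes the qualifier order shared ≤ mixed and private ≤ mixed, compares solutions componentwise, and uses the ad hoc names `le`, `satisfies`, and `below`.

```python
from itertools import product

QUALIFIERS = ("shared", "mixed", "private")


def le(a, b):
    """shared <= mixed, private <= mixed, and every qualifier <= itself."""
    return a == b or b == "mixed"


def satisfies(d, d1, d2):
    """The four constraints from the slide, over (delta, delta1, delta2)."""
    return (le("private", d1) and le(d, d1)
            and le("shared", d2) and le(d, d2))


# Every assignment of qualifiers to (delta, delta1, delta2) that satisfies the system.
solutions = [s for s in product(QUALIFIERS, repeat=3) if satisfies(*s)]


def below(s, t):
    """s is strictly below t in the componentwise order."""
    return s != t and all(le(a, b) for a, b in zip(s, t))


# A solution is minimal if no other solution lies strictly below it.
minimal = [s for s in solutions if not any(below(t, s) for t in solutions)]
```

Neither minimal solution lies below the other, so there is no single least solution, which is why the resolution procedure has to pick a bias.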

Page 27

Type Inference: Biased Constraint Resolution

1. Push “shared” and “mixed” forward

[Figure: constraint graph; δ2 now annotated with shared ≤ δ2]

Page 28

Type Inference: Biased Constraint Resolution

1. Push “shared” and “mixed” forward

2. Identify qualifiers which cannot be private

[Figure: constraint graph; δ2 annotated with shared ≤ δ2]

Page 29

Type Inference: Biased Constraint Resolution

1. Push “shared” and “mixed” forward
2. Identify qualifiers which cannot be private
3. Set all other qualifiers to private

[Figure: constraint graph; δ = private, δ1 = private, shared ≤ δ2]

Page 30

Type Inference: Biased Constraint Resolution

2. Identify qualifiers which cannot be private
3. Set all other qualifiers to private
4. Push “private” forward

[Figure: constraint graph; δ = private, δ1 = private, shared ≤ δ2, private ≤ δ2]

Page 31

Type Inference: Biased Constraint Resolution

3. Set all other qualifiers to private
4. Push “private” forward
5. Set remaining qualifiers to “shared” or “mixed”

[Figure: constraint graph; δ = private, δ1 = private, δ2 = mixed]
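
The five steps can be sketched as a small graph algorithm. This is a simplified reading, not the Titanium implementation: it assumes every constraint has the form constant ≤ variable or variable ≤ variable (as in the running example), so constants never appear as upper bounds, and the names `reachable` and `solve_biased` are made up.

```python
from collections import defaultdict, deque

SHARED, MIXED, PRIVATE = "shared", "mixed", "private"
CONSTANTS = {SHARED, MIXED, PRIVATE}


def reachable(sources, successors):
    """All nodes reachable from `sources` by following <= edges forward."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in successors[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def solve_biased(constraints):
    """constraints: iterable of (lower, upper) pairs meaning lower <= upper."""
    successors = defaultdict(list)
    variables = set()
    for lower, upper in constraints:
        if upper in CONSTANTS:
            raise ValueError("sketch only handles variables as upper bounds")
        successors[lower].append(upper)
        variables.add(upper)
        if lower not in CONSTANTS:
            variables.add(lower)

    # 1. Push "shared" and "mixed" forward.
    from_shared = reachable([SHARED], successors)
    from_mixed = reachable([MIXED], successors)
    # 2. Identify qualifiers which cannot be private.
    not_private = (from_shared | from_mixed) & variables
    # 3. Set all other qualifiers to private.
    solution = {v: PRIVATE for v in variables - not_private}
    # 4. Push "private" forward, from the constant and from private variables.
    from_private = reachable([PRIVATE, *solution], successors)
    # 5. Remaining qualifiers: mixed if private or mixed flows in, else shared.
    for v in not_private:
        solution[v] = MIXED if (v in from_private or v in from_mixed) else SHARED
    return solution


example = solve_biased([
    ("private", "d1"), ("d", "d1"),
    ("shared", "d2"), ("d", "d2"),
])
```

On the running example this yields δ = private, δ1 = private, δ2 = mixed, the private-biased solution shown on the slide. Handling exact upper bounds such as δ ≤ shared would need a backward pass that this sketch omits.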

Page 32

Implementation For Titanium

• Java + extensions
  – Objects, classes, interfaces, methods
  – Multidimensional arrays, templates
  – Local / global, communications primitives

• Sharing validation as type checking
• Sharing inference as compiler analysis
  – Late or early enforcement
  – Whole-program or partial

Page 33

Experimental Findings: Static Metrics

• How much data is “private”?
  – 16% - 75% of all static declaration sites
  – 46% overall; 50% on largest benchmark

• Is “mixed” really needed?
  – Up to 6% of static sites, but large impact
  – Some utility code: could use parametric polymorphism

Page 34

Experimental Findings: Static Metrics

• Why have “local shared”?
  – 24% - 53% of shared data is locally addressed
  – Bad idea to force these to global

• Does enforcement policy affect results?
  – No change for small benchmarks (<1000 lines)
  – 1% - 4% shift for larger codes

Page 35

Experimental Findings: Consistency Model Relaxation

• Impose sequentially consistent semantics
  – Restrict both Titanium & C optimizers
  – Relax restrictions for private data

• Performance impact varies widely
  – Negligible sequential slowdown: nothing to do
  – Sequential slowdown, offset by inference
  – Sequential slowdown, better inference needed

Page 36

Experimental Findings: Other Dynamic Metrics

• Data location management
  – 1% - 100% of allocated bytes are private
    • 45% in gas benchmark
  – amr: highly sensitive to enforcement policy
    • 74% late / 19% early

• Synchronization elimination
  – Statically, one third eliminated
  – Dynamically, not significant for these codes

Page 37

Summary

• “Shared” might not mean what you think
  – Related to local/global, but not the same
  – Different degrees of privacy to choose from
    • Escape analysis, or several weaker alternatives
  – Generalizes on earlier language designs

• Experimental implementation
  – Ideas & algorithms scale to real system
  – More aggressive clients needed
  – Potential for stronger (phase-aware) inference
