Scaling Formal Methods toward Hierarchical Protocols in Shared Memory Processors: Annual Review Presentation – April 2007. Presenters: Ganesh Gopalakrishnan, Xiaofang Chen. School of Computing, University of Utah, Salt Lake City, UT. Intel SRC Customization Award 2005-TJ-1318

Dec 19, 2015

Page 1

Scaling Formal Methods toward Hierarchical Protocols in Shared Memory Processors: Annual Review Presentation – April 2007

Presenters: Ganesh Gopalakrishnan, Xiaofang Chen

School of Computing, University of Utah, Salt Lake City, UT

Intel SRC Customization Award 2005-TJ-1318

Page 2

Project Personnel

IBM Mentor: Dr. Steven M. German
Intel Mentor: Dr. Ching-Tsun Chou
Primary Student: Xiaofang Chen
  Summer internship planned at IBM T.J. Watson (6/07), where the research discussed here in Project 2 will be furthered
Other SRC Student: Robert Palmer (work involving TLA+ modeling of communication libraries)
  Defense May 10; expected to join Intel (6/07)
3 other PhD students, 1 MS student, and 2 UGs in FV, all working on FV of threading / msg-passing software

Page 3

Multicores are the future!
Their caches are visibly central…

(photo courtesy of Intel Corporation.)

> 80% of chips shipped will be multi-core

Page 4

…and the number of organizations of multiprocessor caches is mind-boggling (e.g. imagine 80 cores and deeper hierarchies).

[Figure: three clusters (Cluster 1, Cluster 2, Cluster 3), each with two L1 caches, an L2 Cache + Local Dir, and an Interface, connected to a Global Dir and Main Memory. Design choices annotated: Shared / Private, Inclusive / Exclusive.]

Page 5

Protocol design happens in “the thick of things” (many interfaces, constraints of performance, power, testability).

From “High-throughput coherence control and hardware messaging in Everest,” by Nanda et al., IBM J. R&D 45(2), 2001.

Page 6

Future Coherence Protocols

Cache coherence protocols that are tuned for the contexts in which they are operating can significantly increase performance and reduce power consumption [Liqun Cheng]

Producer-consumer sharing pattern-aware protocol [Cheng, HPCA07]
  21% speedup and 15% reduction in network traffic

Interconnect-aware coherence protocols [Cheng, ISCA06]
  Heterogeneous interconnect
  Improve performance AND reduce power
  11% speedup and 22% wire power savings

Bottom line: Protocols are going to get more complex!

Page 7

Designers have poor conceptual tools (e.g., “Informal MSC drawings”). Need better notations and tools.

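[Figure: informal message sequence chart among L1-1, L1-2, LDir, and GDir, with messages Req_S, Swap, Broadcast, NAck, Fwd_Req, and Gnt_S, and state annotations (I), (S), (S: L1-1), (S: L1-2).]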

Page 8

Design Abstractions in More Modern Flows

An Interleaving Protocol Model (Murphi or TLA+ are the languages of choice here)
  FV here eliminates concurrency bugs

Detailed HDL model
  FV here eliminates implementation bugs; however, correspondence with the Interleaving Model is lost

Need more detailed models anyhow
  Interleaving Models are very abstract
  Monolithic Verification of HDL Code Does not Scale

Design optimizations captured at HDL level
  Interleaving model becomes more obsolete

Need an Integrated Flow: Interleaving -> High level HW View -> Final HDL

Page 9

Related Work in Formal HW Design

BlueSpec
  High level design is expressed using atomic transactions
  Synthesizes high level designs into hardware implementations
  Automatic scheduling of high level design steps in hardware
  May not meet performance goals

Malik et al.
  Formal Architecture and Microarchitecture Modeling for Verification
  Meant for Instruction Set Processors

Need Formal theory of Refinement from Interleaving to High level HW Models

Page 10

Our Goals

Develop Methodology to Verify “Realistic” Interleaving Models
  Useful Benchmarks for others
  Our particular contributions are towards Hierarchical protocols
  Largely Inspired by Chou et al.’s work (FMCAD’04)
  Xiaofang Chen’s PhD is wrapping up a nice story here!

Develop Language and Formal Theory for Higher Level HW Specification & Refinement
  Ideas largely due to German & Janssen
  Xiaofang Chen’s PhD work is taking ideas from initial proposal all the way to practical realization!

Page 11

A summary of our work over Y1-2

1. Three progressively better approaches to verify hierarchical cache coherence protocols at the interleaving level

   1. A/G method of complementary abstractions (FMCAD’06)
   2. Extensions to Non-inclusive hierarchies (TR 06-014)
   3. Abstract each level separately (to be submitted)
   4. Error-trace checking (to be submitted)

2. A theory of transaction based design and verification (writeup finished; initial experiments finished)

3. Modular verification of transactions (writeup in progress; initial experiments finished)

We number the projects 1.1, 1.2, 1.3, 1.4, 2, and 3.

Page 12

Project 1.[1-4] Timeline

1.1: FMCAD’06 results

1.2: Another hierarchical benchmark (non-inclusive)

1.3: Abstraction per level (more scalable)

1.4: Automatic Recognition of spurious/real bugs

Page 13

1.[1-4]: Hierarchical Protocols

[Figure: hierarchical protocol with a Home Cluster, Remote Cluster 1, and Remote Cluster 2. Each cluster has two L1 caches, an L2 Cache + Local Dir, and a RAC; the clusters connect to a Global Dir and Main Memory.]

Page 14

Abstracted Protocol #1

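[Figure: abstracted protocol over the Home Cluster, Remote Cluster 1, and Remote Cluster 2. One cluster keeps its two L1 caches, L2 Cache + Local Dir, and RAC in full; the other two are reduced to a RAC and an abstracted L2 Cache + Local Dir’. Global Dir and Main Memory are retained.]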

Page 15

Abstracted Protocol #2

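[Figure: abstracted protocol over the Home Cluster, Remote Cluster 1, and Remote Cluster 2. One cluster keeps its two L1 caches, L2 Cache + Local Dir, and RAC in full; the other two are reduced to a RAC and an abstracted L2 Cache + Local Dir’. Global Dir and Main Memory are retained.]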

Page 16

Non-Circular Assume/Guarantee

We can’t verify this due to state explosion:
  h ║ r1 ║ r2 ╞ Coh

Instead (lowercase denotes an original cluster, uppercase its abstracted version):
  Check-1: h ║ R1 ║ R2 ╞ Coh1 ∧ Guarant1
  Check-2: H ║ r1 ║ R2 ╞ Coh2 ∧ Guarant2

Page 17

1.2: We applied the non-circular A/G method to a Non-Inclusive Hierarchical Protocol…

Protocol features
  Broadcast channels
  Non-imprecise local dir

Verification challenges
  A/G cannot infer local dir from just intra-clusters
  Coherence may involve multiple L1 caches

Page 18

Verifying Non-Inclusive Protocols

Inferring “L2.State = Excl” from
  Outside the cluster
  Inside the cluster

Use history variables to change non-inclusive to inclusive protocols

Page 19

Experimental Results

Protocols    # of States        Mem (GB)   Model Check
Hierarchy    > 1,521,900,000    20         No
Abs-1        234,478,105        20         Yes
Abs-2        283,124,383        20         Yes

Reduction is over 65%
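
The quoted reduction can be reproduced with a little arithmetic. The sketch below assumes (an interpretation, not something stated on the slide) that the figure compares the combined state count of Abs-1 and Abs-2 against the lower bound reached for the full hierarchy before that run was abandoned. The function name `reduction` is illustrative only.

```python
# State counts copied from the table above. The hierarchy figure is only a
# lower bound, because that model-checking run did not complete.
HIERARCHY_LOWER_BOUND = 1_521_900_000
ABS_1 = 234_478_105
ABS_2 = 283_124_383


def reduction(abstract_states: int, original_states: int) -> float:
    """Fraction of states saved relative to the original model."""
    return 1.0 - abstract_states / original_states


# Assumed reading: both abstract checks together versus the full hierarchy.
combined = reduction(ABS_1 + ABS_2, HIERARCHY_LOWER_BOUND)
# Alternative reading: only the larger of the two abstract checks.
largest = reduction(max(ABS_1, ABS_2), HIERARCHY_LOWER_BOUND)

print(f"combined: {combined:.1%}, largest single check: {largest:.1%}")
```

Under the combined reading the saving is a little under 66%, which is consistent with "over 65%"; since the hierarchy count is a lower bound, the true saving can only be larger.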

Page 20

1.3: We then tried a “Split Hierarchy Per Level Approach” to using non-circular A/G

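[Figure: the hierarchy split per level into three abstractions, labeled ABS #1, ABS #2, and ABS #3. One is an inter-cluster model with three RACs, each with an abstracted L2 Cache + Local Dir’, plus the Global Dir and Main Memory; the other two are intra-cluster models, each with an L2 Cache + Local Dir and two L1 caches.]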

Page 21

A Sample Scenario

[Figure: message sequence across the Home Cluster, Remote Cluster 1, and Remote Cluster 2: 1. Req_Ex, 2. Fwd Req_Ex, 3. Fwd Req_Ex, 4. Fwd Req_Ex, 5. Grant, 6. Grant. State annotations: Excl, Invld.]

Page 22

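Map to Abstracted Protocols

[Figure: the same scenario mapped onto the abstracted protocols for Remote Cluster 1 and Remote Cluster 2, with messages 1. Req_Ex, 2. Fwd Req_Ex, 3. Fwd Req_Ex, 4. Fwd Req_Ex, 5. Grant, 6. Grant and state annotations Invld, Excl.]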

Page 23

Experimental Results

Protocols    # of States      Exec time (sec)   Mem (GB)   Model Check
Hierarchy    > 438,120,000    > 125,799         18         No
Inter        1,500,621        269               2          Yes
Intra-1      564,878          48                2          Yes
Intra-2      188,842          18                2          Yes

Reduction is over 95%!
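
As a sanity check on the figure above, the three per-level checks can be totalled and compared with the incomplete monolithic run. This is a sketch: the dictionary layout and variable names are illustrative, and it assumes the three checks are simply run one after another.

```python
# Numbers copied from the table above. Both hierarchy figures are lower
# bounds, since the monolithic run did not finish.
HIERARCHY_STATES = 438_120_000
HIERARCHY_SECONDS = 125_799

# check name -> (states explored, execution time in seconds)
CHECKS = {
    "Inter": (1_500_621, 269),
    "Intra-1": (564_878, 48),
    "Intra-2": (188_842, 18),
}

total_states = sum(states for states, _ in CHECKS.values())
total_seconds = sum(seconds for _, seconds in CHECKS.values())

state_reduction = 1.0 - total_states / HIERARCHY_STATES
time_reduction = 1.0 - total_seconds / HIERARCHY_SECONDS

print(f"states: {total_states:,} ({state_reduction:.2%} fewer)")
print(f"time:   {total_seconds} s ({time_reduction:.2%} less)")
```

Even with all three checks added together, both the state count and the run time come out more than 99% below the monolithic lower bounds, so "over 95%" is a conservative summary.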

Page 24

Project 1.4: Automatic Recognition of Spurious / Real Bugs in these approaches

Problem statement
  Given an error trace of the ABS protocol
  Is it a real bug of the original protocol?

Solution
  In the original protocol, use BFS to guide the model checking to match the error trace

Reason: our abstraction is just projection

Page 25

Basic Idea of Automatic Recognition

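[Figure: on one side, an error trace of the abstracted protocol over (v1, v2): (0, 0), (1, 2), (6, 8), … On the other side, a directed BFS of the original protocol over (v1, v2, v3), starting from (0, 0, 0) with candidate states such as (1, 2, 1), (0, 0, 3), and (3, 1, 0), each marked keep or drop according to whether it is consistent with the abstract trace.]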
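
The directed search on this slide can be sketched as follows. This is a minimal illustration, not the project's tool: it assumes states are tuples, that the abstraction is a projection onto the indices in `observed`, and that an original-protocol step may leave the projected variables unchanged (a stuttering step). The names `replay_abstract_trace` and `toy_successors`, and the toy transition relation, are hypothetical.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Optional, Sequence, Tuple

State = Tuple[int, ...]
Node = Tuple[State, int]  # (concrete state, position in the abstract trace)


def project(state: State, observed: Sequence[int]) -> State:
    """Abstraction by projection: keep only the observed variables."""
    return tuple(state[i] for i in observed)


def replay_abstract_trace(
    init: State,
    successors: Callable[[State], Iterable[State]],
    observed: Sequence[int],
    abstract_trace: List[State],
    max_nodes: int = 100_000,
) -> Optional[List[State]]:
    """BFS over the original protocol, keeping only states that agree with
    the abstract error trace. Returns a concrete trace if one is found,
    otherwise None (the abstract trace is spurious within the bound)."""
    if project(init, observed) != abstract_trace[0]:
        return None
    start: Node = (init, 0)
    parent: Dict[Node, Optional[Node]] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        state, pos = node
        if pos == len(abstract_trace) - 1:
            path = []
            cursor: Optional[Node] = node
            while cursor is not None:  # walk parents back to the start
                path.append(cursor[0])
                cursor = parent[cursor]
            return path[::-1]
        for nxt in successors(state):
            seen = project(nxt, observed)
            if seen == abstract_trace[pos + 1]:
                child = (nxt, pos + 1)   # keep: matches the next abstract state
            elif seen == abstract_trace[pos]:
                child = (nxt, pos)       # keep: stuttering step
            else:
                continue                 # drop: inconsistent with the trace
            if child not in parent and len(parent) < max_nodes:
                parent[child] = node
                queue.append(child)
    return None


def toy_successors(state: State) -> List[State]:
    """Hypothetical transition relation over (v1, v2, v3)."""
    table = {(0, 0, 0): [(1, 2, 1), (0, 0, 3), (3, 1, 0)], (1, 2, 1): [(6, 8, 1)]}
    return table.get(state, [])


real = replay_abstract_trace((0, 0, 0), toy_successors, (0, 1), [(0, 0), (1, 2), (6, 8)])
spurious = replay_abstract_trace((0, 0, 0), toy_successors, (0, 1), [(0, 0), (6, 8)])
```

In the toy model, `real` is a concrete trace that reproduces the abstract error, while `spurious` is `None` because no original-protocol step takes (v1, v2) directly from (0, 0) to (6, 8).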

Page 26

Y3 Plans for Project 1:

Considerable Experience Gained
  Three Large Benchmark Protocols (each is 3000+ lines of Murphi Code) on the web
  Have Reduced Verif Complexity of Hier Protocols by 90%
  Can Identify Spurious Errors Automatically

All Finite-state
  Not Parameterized
  No plans for Parameterized

Y3 Plans: Build Tool to support this methodology

Page 27

Summary of Projects 2 and 3

1. Three progressively better approaches to verify hierarchical cache coherence protocols at the interleaving level

   1. A/G method of complementary abstractions (FMCAD’06)
   2. Extensions to deeper, and non-inclusive hierarchies (TR 06-014)
   3. Latest method that abstracts each level separately (to be submitted)
   4. Error-trace checking (to be submitted)

2. A theory of transaction based design and verification (writeup finished)

3. Modular verification of transactions (writeup in progress)

Page 28

Transaction Level HW Modeling

The problem addressed: Bridge the gap between high-level specifications and RTL implementations

Global properties cannot be formally verified at RTL Level!

Specifications can be verified, but do they correctly represent the implementations?

Page 29

Driving Design Benchmark due to German and Geert Janssen

Page 30

What changes when moving from a spec to an implementation?

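Atomicity
Concurrency
Granularity in modeling

[Figure: a spec-level transition 1 between client and home, and the implementation-level transitions 1.1, 1.2, 1.3 among client, router buffer, and home.]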

Page 31

General Mappings between high level transitions and transactions that help implement them

[Figure: High Level Transition 1 mapped to the Low Level Transitions 1.1, 1.2, 1.3 that help realize it.]

High Level Transitions take some non-zero unit of time (conceptual)

Each Low Level Transition takes One Clock Cycle

Page 32

High-Level and Low-Level Computations

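[Figure: a high-level computation with transitions 1, 2, 3 and the corresponding low-level transitions 1.1, 1.2, 1.3; 2.1, 2.2; 3.1, 3.2, 3.3.]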

Page 33

Specification of High and Low Levels

[Figure: high-level transition 1 and low-level transitions 1.1, 1.2, 1.3.]

In Murphi as a Guard Action Rule

In HMurphi as Multiple Guard Action Rules enclosed in a Begin Transaction / End Transaction

The Guards Decide when each low level transition can fire

The maximal set of Low Level Transitions enabled in any state is fired concurrently within each clock tick

Page 34

Transaction

A transaction is a set of transitions in Impl that correspond to a transition in Spec

Transaction
  Rule 1
  ……
  Rule n
Endtransaction;

Page 35

Executions

Spec: interleaving
  One enabled transition fires at each step

Impl: concurrent
  All enabled transitions fire at each step

[Figure: a Spec execution … 1, 2, 3 and an Impl execution … {1.1, 2.1}, {1.2}, {2.2, 3.1, 3.2}.]
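
The difference between the two execution models can be made concrete with a small sketch. It is written in Python rather than Murphi or HMurphi, and everything in it is an assumption for illustration: the rule encoding as (name, guard, action) triples, the helper names, the two toy rules, and the choice to reject conflicting writes to the same variable within one tick.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
# A rule is (name, guard, action); the action returns only the variables it writes.
Rule = Tuple[str, Callable[[State], bool], Callable[[State], State]]


def enabled(rules: List[Rule], state: State) -> List[Rule]:
    """Rules whose guard holds in the given state."""
    return [rule for rule in rules if rule[1](state)]


def interleaving_successors(rules: List[Rule], state: State) -> List[Tuple[str, State]]:
    """Spec-style semantics: exactly one enabled rule fires per step, so each
    enabled rule yields its own successor state."""
    return [(name, {**state, **action(state)}) for name, _guard, action in enabled(rules, state)]


def concurrent_step(rules: List[Rule], state: State) -> Tuple[List[str], State]:
    """Impl-style semantics: all enabled rules fire in the same clock tick,
    each reading the state as it was at the start of the tick."""
    fired = enabled(rules, state)
    updates: State = {}
    for _name, _guard, action in fired:
        for var, value in action(state).items():
            if var in updates and updates[var] != value:
                raise ValueError(f"write/write conflict on {var!r}")
            updates[var] = value
    return [name for name, _guard, _action in fired], {**state, **updates}


# Two toy rules standing in for low-level transitions 1.1 and 2.1.
RULES: List[Rule] = [
    ("1.1", lambda s: s["a"] == 0, lambda s: {"a": 1}),
    ("2.1", lambda s: s["b"] == 0, lambda s: {"b": 1}),
]
START: State = {"a": 0, "b": 0}

print(interleaving_successors(RULES, START))
print(concurrent_step(RULES, START))
```

From the same start state, the interleaving model offers two successors (one per enabled rule), while the concurrent model takes a single step in which both rules fire together.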

Page 36

A Few Notations

Observable variables: VH

These are Variables used in both Spec and Impl

Impl has additional internal variables also

A variable v is inactive at a state s if all transactions in Impl that can write to v are quiescent at s

Page 37

A Formal Notion of Simulation

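For every concurrent execution of Impl, there exists an interleaving execution of Spec such that, at each step i, the variables in VH ∩ inactive(li) match

[Figure: an Impl execution l0, l1, l2, … with the set {…} of transitions fired at each step, aligned with a Spec execution h0, h1, h2, … with transitions t0, t1, t2, …]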
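
The matching condition can be illustrated on a pair of traces. The sketch below only checks one given Impl execution against one given Spec execution, aligned state by state; it does not search for the Spec execution whose existence the definition requires. The `Transaction` record, the `busy` flag used to model quiescence, and the example traces are all hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple, Set

State = Dict[str, int]


class Transaction(NamedTuple):
    name: str
    writes: Set[str]                    # variables this transaction can write
    quiescent: Callable[[State], bool]  # True when the transaction is not in flight


def inactive(state: State, transactions: List[Transaction], variables: Set[str]) -> Set[str]:
    """Variables all of whose writer transactions are quiescent in this state."""
    return {
        v for v in variables
        if all(t.quiescent(state) for t in transactions if v in t.writes)
    }


def states_match(impl: State, spec: State, observable: Set[str],
                 transactions: List[Transaction]) -> bool:
    """Compare only observable variables that are inactive in the Impl state."""
    return all(impl[v] == spec[v] for v in inactive(impl, transactions, observable))


def traces_match(impl_trace: List[State], spec_trace: List[State],
                 observable: Set[str], transactions: List[Transaction]) -> bool:
    return len(impl_trace) == len(spec_trace) and all(
        states_match(l, h, observable, transactions)
        for l, h in zip(impl_trace, spec_trace)
    )


# Hypothetical transaction that updates State and Data over two clock ticks.
GRANT = Transaction("grant", {"State", "Data"}, lambda s: s["busy"] == 0)
VH = {"State", "Data"}

IMPL = [
    {"State": 0, "Data": 0, "busy": 0},
    {"State": 1, "Data": 0, "busy": 1},  # mid-transaction: Data not yet written
    {"State": 1, "Data": 7, "busy": 0},
]
SPEC = [{"State": 0, "Data": 0}, {"State": 1, "Data": 7}, {"State": 1, "Data": 7}]
BAD_SPEC = [{"State": 0, "Data": 0}, {"State": 1, "Data": 7}, {"State": 1, "Data": 8}]

ok = traces_match(IMPL, SPEC, VH, [GRANT])
mismatch = traces_match(IMPL, BAD_SPEC, VH, [GRANT])
```

The middle Impl state disagrees with the Spec on `Data`, but that is tolerated because the only transaction that writes `Data` is still in flight there; once it is quiescent again, the observable variables must agree.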

Page 38

Simulation Checks

[Figure: commuting diagram. An Impl transaction takes state I to I’; the corresponding Spec transition takes Spec(I) to Spec(I’).]

Guard for Spec transition must hold

I is a reachable state where the commit guard is true

Observable vars changed by either Spec or Impl must match

Page 39

Model Checking Approaches

Monolithic
  Cross product construction

Compositional
  Abstraction
  Assume/Guarantee

Page 40

Compositional Approach

Abstraction
  Change read to an access of an input var
  Self-sourced read
  Add all transitions that write to a var

Assume/Guarantee
  Require all writes to var guarantee prop P
  Assume P holds on all reads

Page 41

Example of Abstraction

Transaction
  …
  Rule (v1 = d1) => ...
  …
Endtransaction

[Figure: Transaction 1, Transaction 2, ……, Transaction n.]

Page 42

Example of Assume/Guarantee

Transaction 1: Request granted

Transaction 2: Update Cache

State := Excl

Data := d

Impl.State = Spec.State

Page 43

Benchmarks

High level in FMCAD’04 tutorial

Low level provided by German and Janssen

Sizes: 1 Home node, 1 remote node
  Sizes are constrained by accessible VHDL tools!

Page 44

Implementations

Muv: HMurphi -> VHDL
  Written by German

Mud: Static analyzer for possible conflicts / dependencies

VHDL verifier
  IBM RuleBase

Page 45

Preliminary Results

Approaches                   # Flip-Flops   # Gates   Time (min)
Monolithic                   212            8574      17
Decomposed: W/W conflicts    108            5763      11
Decomposed: closures         89             2194      3

* This is for datapath = 1 bit
* Intel Xeon CPU 3.0GHz, 2GB memory

Page 46

When Datapath > 1 bit

Cannot check monolithic approach
  RuleBase 300 F-F academic license restriction

Decomposed approach
  W/W checks not affected

Datapath bits   # of F-F   # of Gates
1               89         2194
2               97         2380
26              289        6659
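
The three rows above are consistent with the flip-flop count growing linearly in the datapath width, which is one way to see why a 26-bit datapath still fits under the 300 F-F limit. The sketch below is only an observation fitted to these three data points, not a model taken from the tools; the function names are illustrative.

```python
# Rows copied from the table above: bits -> (flip-flops, gates).
POINTS = {1: (89, 2194), 2: (97, 2380), 26: (289, 6659)}
RULEBASE_FF_LIMIT = 300  # academic license restriction quoted on the slide

# Slopes estimated from the first two rows only.
ff_per_bit = POINTS[2][0] - POINTS[1][0]
gates_per_bit = POINTS[2][1] - POINTS[1][1]


def predicted_flip_flops(bits: int) -> int:
    """Linear extrapolation of the flip-flop count from the 1-bit row."""
    return POINTS[1][0] + ff_per_bit * (bits - 1)


def predicted_gates(bits: int) -> int:
    """Same extrapolation for gates, shown for contrast."""
    return POINTS[1][1] + gates_per_bit * (bits - 1)


print(predicted_flip_flops(26), POINTS[26][0])  # extrapolated vs. reported flip-flops
print(predicted_gates(26), POINTS[26][1])       # extrapolated vs. reported gates
```

The flip-flop extrapolation reproduces the reported 26-bit figure exactly, whereas the gate count does not follow the same straight line, so only the flip-flop trend should be read as linear here.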

Page 47

Future Work

Reduce the cost of W/W conflicts checking
  Localized reasoning

Apply to pipeline

More benchmarks

Try other VHDL tools
  SixthSense etc.

Page 48

Publications, Software, Models

FMCAD 2006 paper
Presentation at Intel
Journal version of hierarchical coherence protocol verification (under prep)
TR on Theory of Transaction Based Specification and Verification (under prep)
Detailed VHDL-level German Protocol developed
Analysis Framework for HMurphi Developed
Preliminary Verification Experiments using Cadence IFV, IBM RuleBase, and IBM SixthSense
Xiaofang Chen’s Summer Internship at IBM T.J. Watson Res. Ctr.
Robert’s SRC Poster Techcon 2007 submission

There will be more publications during 2007-8 following hiatus due to infrastructure build-up (many delays!)