Design Patterns and Computer Architecture
Mark Murphy, Scott Beamer, Henry Cook, Andrew Waterman, Krste Asanovic, Kurt Keutzer

Dec 19, 2015

Design Patterns and Architecture

Design patterns (so far) are good at exposing ||ism
  Only half of the battle / There is parallelism everywhere we look!
We need to incorporate Architectural information
  But not too much: we don't want to drown in detail!
Computer Architects need patterns too!
  Dwarfs were supposed to supplant benchmarks, remember?
  Dwarfs -> Computational Patterns: too vague for architects
Do design pattern writers need architectural patterns?
  Standardize a vocabulary to discuss performance issues?

Work In Progress

The point of this talk is not to present any results
I want your input on the results of brainstorming sessions between myself and the Architecture research group
There are 40 minutes for this: ~20 of me presenting slides and the rest for discussion

Pattern Language Exposes ||ism

[Figure: the pattern language diagram. Labels: Applications, Productivity Layer, Efficiency Layer]

Structural Patterns: choose your high-level structure
  Agent and repository
  Layered systems
  Arbitrary static task graph
  Map reduce
  Iterative refinement
  Model view controller
  Process control
  Pipe-and-filter
  Event based, implicit invocation
  Puppeteer

Computational Patterns: identify the key computations
  Dense linear algebra
  Backtrack branch and bound
  Monte Carlo methods
  Sparse linear algebra
  Finite state machine
  Dynamic programming
  Unstructured grids
  Graphical models
  Graph algorithms
  Structured grids
  N-body methods
  Circuits
  Spectral methods

Parallel Algorithm Strategy Patterns: refine the structure - what concurrent approach do I use? Guided re-organization
  Task Parallelism
  Geometric Decomposition
  Data Parallelism
  Pipeline
  Discrete Event
  Recursive Splitting

Implementation Strategy Patterns: utilize supporting structures - how do I implement my concurrency? Guided mapping
  (grouped on the slide under Program Structure and Data Structure)
  Actors, SPMD, Master/Worker, Task queue, Strict data parallel, Loop parallelism, Graph partitioning, Fork/Join, BSP, Shared queue, Distributed array, Shared data, Shared hash table, Memory parallelism

Concurrent Execution Patterns: implementation methods - what are the building blocks of parallel programming? Guided implementation
  (grouped on the slide under Advancing Program Counters and Coordination)
  MIMD, Thread pool, Task graph, Speculation, SIMD, Data flow, Digital circuits, Message passing, Mutual exclusion, Collective communication, Transactional memory, Collective synchronization, P2P synchronization

Pattern Language Exposes ||ism

Example from Machine Learning:
  Compute the gradient of a scalar function w.r.t. a matrix B
  Each entry of the gradient requires NxN BLAS2 matrix computations

Pattern Language Exposes ||ism

Example from Quantum Chemistry:
  Need to compute a matrix of size <# basis functions> x <# electrons>
  Each entry of the matrix requires evaluating a number of functions, and summing the results

Pattern Language Exposes ||ism

In both examples, we have (at least) two levels of ||ism
  Many entries in matrix (Task Parallel)
  Much work in computing each entry (Map/Reduce Data Parallel)
  The pattern language can pretty much tell us this
However, the right parallel program for a GPU-like manycore processor looks different in the two cases
  For the Machine Learning problem, only parallelize the computation of each matrix element
  For the Chemistry problem, parallelize at both levels
Knowing this requires understanding that GPU-like processors implement fine-grained data parallelism best
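
The two mappings described on this slide can be sketched in a few lines of Python. This is only a structural illustration under stated assumptions: term() is a placeholder rather than a real chemistry or machine-learning kernel, the function names are hypothetical, and Python threads stand in for the hardware's parallel resources without showing any real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def term(i, j, k):
    # Placeholder for one function evaluation contributing to entry (i, j).
    return i * j + k


def entry(pool, i, j, terms=8):
    # Inner level (Map/Reduce): map term() over k on the pool, then reduce with sum.
    return sum(pool.map(lambda k: term(i, j, k), range(terms)))


def inner_only(rows, cols, workers=4):
    # Entries are computed one after another; only the work inside an entry is parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [[entry(pool, i, j) for j in range(cols)] for i in range(rows)]


def both_levels(rows, cols, workers=4):
    # Outer level (Task Parallel): one task per entry. A separate pool serves the
    # inner map so that blocked outer tasks cannot starve the inner work.
    with ThreadPoolExecutor(max_workers=workers) as outer, \
         ThreadPoolExecutor(max_workers=workers) as inner:
        futures = [[outer.submit(entry, inner, i, j) for j in range(cols)]
                   for i in range(rows)]
        return [[f.result() for f in row] for row in futures]
```

Both functions compute the same matrix; they differ only in which level of parallelism is exposed to the machine, which is exactly the decision the pattern language alone does not make.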

SW writers understand HW arch?

There has been a sentiment that the pattern language should be architecture-agnostic
Architectural savvy is required for decisions like these.
Otherwise, the options are all unattractive:
  Implement every possible parallelization, choose best? ...
  Choose one parallelization, hope it works? ...
  Ask Bryan to parallelize your code?
But clearly we can't write a pattern language around GTX200, just as we can't write it around LRB or Nehalem

Performance Models?

Abstract, simplistic models to capture the essence of low-level performance issues.
Extant example: LogP for distributed memory machines
  L -- network latency for a message
  o -- CPU overhead of sending a message
  g -- gap = inverse of NIC bandwidth
  P -- number of processors

[Figure label: L-latency network]
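
To show how such parameters turn into a cost estimate, here is a minimal sketch. It assumes a common reading of LogP: the sender pays o, the message spends L in the network, the receiver pays o, and consecutive sends from one processor are spaced at least max(g, o) apart. The function name and the example numbers are illustrative and do not come from the talk.

```python
def logp_send_time(n, L, o, g):
    """Time until the last of n back-to-back small messages has been received.

    Assumes one sender, one receiver, and unit-size messages.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    spacing = max(g, o)              # minimum interval between consecutive sends
    return (n - 1) * spacing + o + L + o


# One message costs L + 2o; each additional message adds max(g, o).
single = logp_send_time(1, L=6, o=2, g=4)
burst = logp_send_time(3, L=6, o=2, g=4)
```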

Performance Models?

Could imagine a similar model for current manycores.
How about this one? The BLIMP model:
  B(L) -- Bandwidth as function of load/store block size
  I -- # Instruction Fetch units
  M -- # Load/Store units
  P -- # Execution Pipelines

[Figure: example machine with I = 4, P = 8]
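
The talk does not say how the BLIMP parameters would be combined into a prediction. Purely as an assumption, the sketch below records the four parameters and derives a roofline-style lower bound: a kernel can finish no sooner than the slowest of its compute, load/store issue, and memory traffic. The class name, the bound itself, and the values of M and B(L) are hypothetical; I is stored but not used by this particular bound.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Blimp:
    bandwidth: Callable[[int], float]  # B(L): bytes per cycle at block size L bytes
    ifetch_units: int                  # I
    ldst_units: int                    # M
    pipelines: int                     # P


def lower_bound_cycles(machine, arith_ops, mem_ops, block_bytes):
    """Assumed bound: the slowest of compute, load/store issue, and memory traffic."""
    compute = arith_ops / machine.pipelines
    issue = mem_ops / machine.ldst_units
    traffic = mem_ops * block_bytes / machine.bandwidth(block_bytes)
    return max(compute, issue, traffic)


# I = 4 and P = 8 follow the slide's figure; M = 2 and this B(L) are made up.
example = Blimp(bandwidth=lambda L: 16.0 if L >= 64 else 2.0,
                ifetch_units=4, ldst_units=2, pipelines=8)
```

With this example machine, a kernel with 8000 arithmetic operations and 1000 loads of 64-byte blocks is bound by memory traffic, while raising the arithmetic count tenfold makes it compute-bound.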

Performance Models?

Problems are obvious
  Sure -- you can analyze the FFT algorithm and Matrix Multiply
  But what about my code?
Can't handle data dependence in computational intensity
  Example: SIFT Feature Extraction
    Compute a "scale space"
    For each maximum in scale space:
      Do a whole bunch of work
    How many maxima are there?
"Interesting" architectural features cannot be described
Still .... better than nothing?
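
The data-dependence problem in the SIFT example can be made concrete with a toy stand-in. The sketch below is not SIFT: it finds local maxima in a 1-D list and charges a fixed, made-up cost per maximum, simply to show that two inputs of the same size can require very different amounts of work.

```python
def local_maxima(xs):
    # Indices whose value is strictly greater than both neighbours.
    return [i for i in range(1, len(xs) - 1) if xs[i - 1] < xs[i] > xs[i + 1]]


def work_units(xs, cost_per_maximum=100):
    # One unit per element to scan, plus a (hypothetical) heavy cost per maximum.
    return len(xs) + cost_per_maximum * len(local_maxima(xs))


flat = [0] * 8
bumpy = [0, 1, 0, 1, 0, 1, 0, 1]
```

Both lists have eight elements, yet the second one triggers three rounds of per-maximum work and the first none, so a model parameterized only by input size cannot predict the cost.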

Architects need patterns too!

"Benchmark Addiction" was part of the motivation for Dwarfs
  Reliance upon C source-code benchmarks pigeon-holed architectural innovation
  Dwarfs were supposed to be anti-benchmarks: provide a non-source-code description of the computations that were important
We (i.e. Tim) quickly discovered that Dwarfs were far too vague and high-level to serve this purpose
  A Computational Pattern (~Dwarf) doesn't even imply a particular problem to be solved, much less a particular algorithm
Can the fleshed-out pattern language be the solution?

Anti-Benchmarks?

Architecture-agnostic patterns-based analysis of a program enumerates the space of implementations

[Figure labels: Task Parallel, Map/Reduce]

But architects still need their benchmark fix
  What does this actually tell them?
  They need to know:
    Is my cache big enough?
    Should I include my whiz-bang u-arch widget?
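
As a small illustration of what "enumerating the space of implementations" could mean for the two levels named in the figure, the sketch below lists every combination of running each level serially or in parallel. The serial/parallel choice per level and the function name are assumptions made for this example, not something the talk specifies.

```python
from itertools import product

LEVELS = ("Task Parallel", "Map/Reduce")   # the two levels named on the slide
CHOICES = ("serial", "parallel")           # assumed options for each level


def implementation_space(levels=LEVELS, choices=CHOICES):
    # One dict per candidate implementation, mapping each level to a choice.
    return [dict(zip(levels, combo))
            for combo in product(choices, repeat=len(levels))]


for candidate in implementation_space():
    print(candidate)
```

The enumeration yields four candidates but says nothing about which one a given machine will run fastest, which is the gap the slide is pointing at.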

Anti-Benchmarks

Suppose that the pattern language somehow included the architectural savvy needed to make every possible implementation decision
What happens when the architect changes the rules?

Multiple Levels of Description

Level 0: A patterns-based description
Level 1: An "Abstract Machine" model?
Level 2: A performance model?
Level 3: A cycle-accurate simulation?
Level 4: A joule-accurate simulation?

Abstract Machines

Alternate proposal for performance model (K. Asanovic)
Given a microarchitectural widget, how does its presence/absence affect the performance of a program?
  Map the program to two different machines (one with, one without the widget). How are the programs different?
  Mapping process TBD. SEJITS?
Examples:
  An "Infinite ILP" machine. The superscalar analogue of PRAM
  An Infinite Vector-width machine.
  An infinite thread machine
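
One way to picture the "with and without the widget" comparison is to take a program as a dependency graph of operations and count cycles on two idealized machines. The sketch below assumes unit-latency operations; the scalar machine issues one operation per cycle, while the hypothetical "Infinite ILP" machine issues every ready operation at once and so is limited only by the longest dependency chain. The graph and function names are illustrative.

```python
def scalar_cycles(deps):
    # One operation per cycle: total work.
    return len(deps)


def infinite_ilp_cycles(deps):
    # Unlimited issue width: the length of the longest dependency chain.
    memo = {}

    def depth(op):
        if op not in memo:
            memo[op] = 1 + max((depth(d) for d in deps[op]), default=0)
        return memo[op]

    return max((depth(op) for op in deps), default=0)


# Each operation maps to the operations it depends on (an acyclic graph).
program = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"], "e": []}
```

For this graph the scalar machine needs five cycles and the infinite-ILP machine three; the gap between the two numbers is a rough measure of how much the program could gain from wider issue.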

Architectural Meta-Patterns

Hopefully by now I've conveyed my concern about the lack of architectural / performance information in design patterns
Also, hopefully it is clear that I don't know the answer
Maybe someone can write me a pattern?
  How should I tell you what I know about architecture?

Thank You
