slide 1 Outline Outline • Classification • ILP Architectures • Data Parallel Architectures • Process level Parallel Architectures • Issues in parallel architectures • Cache coherence problem • Interconnection networks

Dec 30, 2015


Alannah Daniel
Transcript
Page 1: Slide 1 OutlineOutline Classification ILP Architectures Data Parallel Architectures Process level Parallel Architectures Issues in parallel architectures.

slide 1

Outline

• Classification

• ILP Architectures

• Data Parallel Architectures

• Process level Parallel Architectures

• Issues in parallel architectures

• Cache coherence problem

• Interconnection networks

slide 2

Outline

• Classification

• ILP Architectures

• Data Parallel Architectures

• Process level Parallel Architectures

• Issues in parallel architectures

• Cache coherence problem

• Interconnection networks

• Flynn’s [66]
• Feng’s [72]
• Händler’s [77]
• Modern (Sima, Fountain & Kacsuk)

slide 3

Flynn’s Classification

Architecture Categories

SISD SIMD MISD MIMD
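
The four categories follow from whether a machine has one or many instruction streams and one or many data streams. A minimal sketch of that two-by-two lookup; the function name `flynn_category` is illustrative and not from the slides:

```python
def flynn_category(instruction_streams: int, data_streams: int) -> str:
    """Return Flynn's category for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    i = "S" if instruction_streams == 1 else "M"   # Single or Multiple Instruction
    d = "S" if data_streams == 1 else "M"          # Single or Multiple Data
    return f"{i}I{d}D"


if __name__ == "__main__":
    for streams in [(1, 1), (1, 8), (4, 1), (4, 4)]:
        print(streams, flynn_category(*streams))
```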

slide 4

SISD

[Figure: one control unit (C) issues a single instruction stream (IS) to one processor (P), which exchanges a single data stream (DS) with memory (M).]

slide 5

SIMD

[Figure: one control unit (C) broadcasts a single instruction stream (IS) to several processors (P), each with its own data stream (DS) to memory (M).]

slide 6

MISD

[Figure: several control units (C), each issuing its own instruction stream (IS) to its own processor (P); the diagram also labels the data streams (DS) and memory (M).]

slide 7

MIMD

[Figure: several control units (C), each issuing its own instruction stream (IS) to its own processor (P); each processor has its own data stream (DS) to memory (M).]

slide 8

Feng’s Classification

[Figure: machines plotted by word length (axis ticks 1, 16, 32, 64) against bit slice length (axis ticks 1, 16, 64, 256, 16K). Machines shown: MPP, STARAN, C.mmP, PDP11, PEPE, IBM370, IlliacIV, CRAY-1.]
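
Feng's scheme places each machine by its word length n and its bit slice length m, and takes the product n × m as the maximum degree of parallelism. The sketch below computes that product and names the resulting processing mode. The (n, m) pairs are values commonly quoted in textbooks for these machines, not coordinates read from the slide's figure, so treat them as assumptions.

```python
# (word length n, bit slice length m). Assumed textbook values, not taken
# from the slide's figure.
MACHINES = {
    "PDP11": (16, 1),
    "CRAY-1": (64, 1),
    "C.mmP": (16, 16),
    "IlliacIV": (64, 64),
    "STARAN": (1, 256),
    "MPP": (1, 16384),
}


def max_parallelism(word_length: int, bit_slice_length: int) -> int:
    """Maximum number of bits processed at once: n * m."""
    return word_length * bit_slice_length


def processing_mode(word_length: int, bit_slice_length: int) -> str:
    """Word-serial/parallel and bit-serial/parallel mode for (n, m)."""
    word = "word-parallel" if bit_slice_length > 1 else "word-serial"
    bit = "bit-parallel" if word_length > 1 else "bit-serial"
    return f"{word}, {bit}"


if __name__ == "__main__":
    for name, (n, m) in MACHINES.items():
        print(f"{name}: P = {max_parallelism(n, m)} ({processing_mode(n, m)})")
```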

slide 9

Händler’s Classification

< K x K’ , D x D’ , W x W’ >

K: control, D: data, W: word; the dashed (primed) terms give the degree of pipelining.

TI-ASC <1, 4, 64 x 8>

CDC 6600 <1, 1 x 10, 60> x <10, 1, 12> (I/O)

C.mmP <16,1,16> + <1x16,1,16> + <1,16,16>

PEPE <1 x 3, 288, 32>

Cray-1 <1, 12 x 8, 64 x (1 ~ 14)>

slide 10

Modern Classification

Parallel architectures

• Data-parallel architectures

• Function-parallel architectures

slide 11

Data Parallel Architectures

Data-parallel architectures

• Vector architectures

• Associative and neural architectures

• SIMDs

• Systolic architectures

slide 12

Function Parallel Architectures

Function-parallel architectures

• Instruction level parallel architectures (ILPs)
– Pipelined processors
– VLIWs
– Superscalar processors

• Thread level parallel architectures

• Process level parallel architectures (MIMDs)
– Distributed Memory MIMD
– Shared Memory MIMD

slide 13

Outline

• Classification

• ILP Architectures

• Data Parallel Architectures

• Process level Parallel Architectures

• Issues in parallel architectures

• Cache coherence problem

• Interconnection networks

• Pipelining
• VLIW
• Superscalar

slide 14

Pipelining

Pipeline stages: IF, D, RF, EX/AG, M, WB

• faster throughput with pipelining

• resource sharing across cycles

• not all instructions take the same number of cycles

slide 15

Hazards in Pipelining

• Procedural dependencies => Control hazards
– conditional and unconditional branches, calls/returns

• Data dependencies => Data hazards
– RAW (read after write)
– WAR (write after read)
– WAW (write after write)

• Resource conflicts => Structural hazards
– use of same resource in different stages
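
The three data dependency types can be detected by comparing the registers that two instructions read and write. The sketch below is a simplified illustration: the `Instr` record and the register names are hypothetical, and whether a dependency actually stalls the pipeline depends on the pipeline design, which is not modelled here.

```python
from typing import NamedTuple, Set, Tuple


class Instr(NamedTuple):
    """A register-to-register instruction (hypothetical, simplified form)."""
    text: str
    dest: str                # register written
    srcs: Tuple[str, ...]    # registers read


def data_dependencies(first: Instr, second: Instr) -> Set[str]:
    """Dependencies of `second` on `first`, where `first` is earlier in program order."""
    found = set()
    if first.dest in second.srcs:
        found.add("RAW")     # second reads what first writes
    if second.dest in first.srcs:
        found.add("WAR")     # second overwrites what first reads
    if first.dest == second.dest:
        found.add("WAW")     # both write the same register
    return found


if __name__ == "__main__":
    add = Instr("add r1, r2, r3", dest="r1", srcs=("r2", "r3"))
    sub = Instr("sub r4, r1, r5", dest="r4", srcs=("r1", "r5"))
    mul = Instr("mul r2, r6, r7", dest="r2", srcs=("r6", "r7"))
    print(data_dependencies(add, sub))   # sub reads r1, which add writes
    print(data_dependencies(add, mul))   # mul overwrites r2, which add reads
```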

slide 16

Pipeline Performance

CPI = 1 + (S - 1) * b

Time = CPI * T / S

where T is the time for one instruction to pass through all S stages, and b is the frequency of interruptions.
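
As a worked illustration of the two formulas, the sketch below evaluates them for a few stage counts. The sample values (T = 10 time units, b = 0.2) are made up for illustration, and the model assumes, as the formulas do, that each interruption costs S - 1 extra cycles.

```python
def pipeline_cpi(stages: int, b: float) -> float:
    """CPI = 1 + (S - 1) * b, with b the frequency of interruptions."""
    return 1 + (stages - 1) * b


def time_per_instruction(stages: int, b: float, total_time: float) -> float:
    """Time = CPI * T / S, where T / S is the cycle time of an S-stage pipeline."""
    return pipeline_cpi(stages, b) * total_time / stages


if __name__ == "__main__":
    T, b = 10.0, 0.2   # illustrative values only
    for s in (1, 2, 5, 10, 20):
        cpi = pipeline_cpi(s, b)
        print(f"S={s:2d}  CPI={cpi:.2f}  time per instruction={time_per_instruction(s, b, T):.2f}")
```

With b > 0 the time per instruction approaches b * T as S grows, so deeper pipelines give diminishing returns.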

slide 17

ILP in VLIW processors

[Figure: cache/memory feeds a fetch unit, which delivers a single multi-operation instruction to several functional units (FU) that share a register file.]

slide 18

ILP in Superscalar processors

[Figure: cache/memory feeds a fetch unit, which passes multiple instructions from a sequential stream of instructions to a decode and issue unit; that unit drives several functional units (FU) that share a register file. Legend: instruction/control paths, data paths, FU = functional unit.]

slide 19

Why Superscalars are popular?

• Binary code compatibility among scalar & superscalar processors of same family

• Same compiler works for all processors (scalars and superscalars) of same family

• Assembly programming of VLIWs is tedious

• Code density in VLIWs is very poor - Instruction encoding schemes

slide 20

Issues in VLIW Architecture

[Figure: several functional units (FU) sharing one register file.]

• Instruction encoding

• Scalability: access time, area and power consumption increase sharply with the number of register ports

slide 21

Tasks of superscalar processing

• Parallel decoding

• Superscalar instruction issue

• Parallel instruction execution

• Preserving the sequential consistency of execution

• Preserving the sequential consistency of exception processing

slide 22

Outline

• Classification

• ILP Architectures

• Data Parallel Architectures

• Process level Parallel Architectures

• Issues in parallel architectures

• Cache coherence problem

• Interconnection networks

• SIMD Processors
• Vector Processors
• Associative Processors
• Systolic Arrays

slide 23

Data Parallel Architectures

• SIMD Processors
– Multiple processing elements driven by a single instruction stream

• Vector Processors
– Uni-processors with vector instructions

• Associative Processors
– SIMD like processors with associative memory

• Systolic Arrays
– Application specific VLSI structures

slide 24

Systolic Arrays [H.T. Kung 1978]

Simplicity, Regularity, Concurrency, Communication

Example : Band matrix multiplication

[Figure: C = A × B for 6 × 6 band matrices A and B. The nonzero entries of each matrix have subscripts 11 12 / 21 22 23 / 31 32 33 34 / 42 43 44 45 / 53 54 55 56 / 64 65 66; all other entries are 0.]

slide 25

[Figure: the systolic array at T=0, showing elements B11, B12, B21, B31 and A11, A12, A21, A22, A31, A23.]
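
To make the data movement concrete, the sketch below simulates a systolic matrix multiplication step by step. It is a simplified stand-in, not the array from the slides: it assumes a rectangular mesh in which cell (i, j) accumulates C[i][j], values of A move right and values of B move down, and the inputs are skewed so that A[i][k] and B[k][j] meet at cell (i, j) at step i + j + k. The function name `systolic_matmul` is hypothetical.

```python
def systolic_matmul(A, B):
    """Multiply A (n x m) by B (m x p) on a simulated n x p mesh of cells."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    a_reg = [[0] * p for _ in range(n)]   # A values held in each cell, moving right
    b_reg = [[0] * p for _ in range(n)]   # B values held in each cell, moving down

    for t in range(n + m + p - 2):        # steps until the last operands meet
        new_a = [[0] * p for _ in range(n)]
        new_b = [[0] * p for _ in range(n)]
        for i in range(n):
            for j in range(p):
                if j == 0:                # left edge: row i of A enters, delayed by i steps
                    k = t - i
                    new_a[i][j] = A[i][k] if 0 <= k < m else 0
                else:                     # interior: take the value from the left neighbour
                    new_a[i][j] = a_reg[i][j - 1]
                if i == 0:                # top edge: column j of B enters, delayed by j steps
                    k = t - j
                    new_b[i][j] = B[k][j] if 0 <= k < m else 0
                else:                     # interior: take the value from the upper neighbour
                    new_b[i][j] = b_reg[i - 1][j]
        a_reg, b_reg = new_a, new_b
        for i in range(n):                # every cell does one multiply-accumulate per step
            for j in range(p):
                C[i][j] += a_reg[i][j] * b_reg[i][j]
    return C


if __name__ == "__main__":
    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Each cell only talks to its neighbours and repeats the same multiply-accumulate, which is the regularity and local communication the slide lists as the appeal of systolic designs.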

slide 26

Outline

• Classification

• ILP Architectures

• Data Parallel Architectures

• Process level Parallel Architectures

• Issues in parallel architectures

• Cache coherence problem

• Interconnection networks

• MIMD Processors
– Shared Memory
– Distributed Memory

slide 27

Why Process level Parallel Architectures?

• Data-parallel architectures

• Function-parallel architectures
– Instruction level PAs
– Thread level PAs
– Process level PAs (MIMDs): Distributed Memory MIMD, Shared Memory MIMD

Built using general purpose processors

slide 28

MIMD Architectures

Design Space

• Extent of address space sharing

• Location of memory modules

• Uniformity of memory access

slide 29

Outline

• Classification

• ILP Architectures

• Data Parallel Architectures

• Process level Parallel Architectures

• Issues in parallel architectures

• Cache coherence problem

• Interconnection networks

• User’s perspective
• Architect’s perspective

slide 30

Issues from user’s perspective

• Specification / Program design
– explicit parallelism or
– implicit parallelism + parallelizing compiler

• Partitioning / mapping to processors

• Scheduling / mapping to time instants
– static or dynamic

• Communication and Synchronization

slide 31

Parallel programming models

Concurrent control flow

Functional or logic program

Vector/array operations

Concurrent tasks/processes/threads/objects, with shared variables or message passing

Relationship between programming model and architecture?
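
The difference between the shared-variable and message-passing styles can be sketched with threads. Both functions below compute the same sum; the first updates one shared variable under a lock, the second has each worker send its partial result as a message. The function names and the use of Python's `threading` and `queue` modules are illustrative choices, not something the slides prescribe.

```python
import queue
import threading


def sum_shared(data, workers=4):
    """Workers add their partial sums into one shared variable, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:                     # synchronise access to the shared variable
            total += partial

    threads = [threading.Thread(target=work, args=(data[i::workers],)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def sum_message_passing(data, workers=4):
    """Workers share nothing; each sends its partial sum as a message."""
    results = queue.Queue()

    def work(chunk):
        results.put(sum(chunk))        # communicate by sending a message

    threads = [threading.Thread(target=work, args=(data[i::workers],)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results.get() for _ in range(workers))


if __name__ == "__main__":
    numbers = list(range(1000))
    print(sum_shared(numbers), sum_message_passing(numbers))
```

The first style maps naturally onto shared memory MIMD machines and the second onto distributed memory MIMD machines, although either model can be emulated on either kind of hardware.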

slide 32

Issues from architect’s perspective

• Coherence problem in shared memory with caches

• Efficient interconnection networks

slide 33

Outline

• Classification

• ILP Architectures

• Data Parallel Architectures

• Process level Parallel Architectures

• Issues in parallel architectures

• Cache coherence problem

• Interconnection networks

• Coherence Protocols
– Bus or directory based
– Invalidate or update
– Definition of states

slide 34

Cache Coherence Problem

Multiple copies of data may exist

Problem of cache coherence

Options for coherence protocols

• What action is taken?
– Invalidate or Update

• Which processors/caches communicate?
– Snoopy (broadcast) or directory based

• Status of each block?
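
One concrete combination of these options is a snoopy, write-invalidate protocol. The toy simulation below makes assumptions the slides leave open: it uses three block states (Modified, Shared, Invalid), treats each address as its own block, and models the bus as a loop over all caches. The class name `SnoopySystem` is hypothetical.

```python
class SnoopySystem:
    """Toy snoopy write-invalidate protocol with M/S/I states (an assumed choice)."""

    def __init__(self, n_caches: int):
        self.memory = {}                               # addr -> value
        self.caches = [{} for _ in range(n_caches)]    # addr -> [state, value]

    def state(self, cpu: int, addr) -> str:
        return self.caches[cpu].get(addr, ["I", None])[0]

    def read(self, cpu: int, addr):
        line = self.caches[cpu].get(addr)
        if line is not None and line[0] in ("M", "S"):
            return line[1]                             # read hit
        # Read miss: broadcast on the bus. A Modified copy elsewhere is
        # written back to memory and downgraded to Shared.
        for other_id, other in enumerate(self.caches):
            other_line = other.get(addr)
            if other_id != cpu and other_line is not None and other_line[0] == "M":
                self.memory[addr] = other_line[1]
                other_line[0] = "S"
        value = self.memory.get(addr, 0)
        self.caches[cpu][addr] = ["S", value]
        return value

    def write(self, cpu: int, addr, value) -> None:
        # Broadcast an invalidate: every other copy becomes Invalid
        # (a Modified copy is written back first).
        for other_id, other in enumerate(self.caches):
            other_line = other.get(addr)
            if other_id != cpu and other_line is not None:
                if other_line[0] == "M":
                    self.memory[addr] = other_line[1]
                other_line[0] = "I"
        self.caches[cpu][addr] = ["M", value]


if __name__ == "__main__":
    system = SnoopySystem(2)
    system.memory["x"] = 1
    system.read(0, "x")
    system.read(1, "x")
    system.write(0, "x", 5)                            # invalidates the copy in cache 1
    print(system.state(0, "x"), system.state(1, "x"))
    print(system.read(1, "x"))                         # miss; cache 0 supplies the new value
```

An update protocol would instead push the new value into the other caches on a write, and a directory based scheme would replace the broadcast loop with a lookup of which caches hold the block.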

slide 35

Outline

• Classification

• ILP Architectures

• Data Parallel Architectures

• Process level Parallel Architectures

• Issues in parallel architectures

• Cache coherence problem

• Interconnection networks

• Switching and control
• Topology

slide 36

Interconnection Networks

• Architectural Variations:
– Topology
– Direct or Indirect (through switches)
– Static (fixed connections) or Dynamic (connections established as required)
– Routing type (store and forward / wormhole)

• Efficiency:
– Delay
– Bandwidth
– Cost
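
The effect of the routing type on delay can be sketched with a first-order latency model. The formulas below are a common textbook approximation, not taken from the slides: they ignore contention and per-router processing time, and the parameter names (`hops`, `message_bits`, `flit_bits`, `bandwidth`) are assumptions.

```python
def store_and_forward_latency(hops: int, message_bits: float, bandwidth: float) -> float:
    """The whole message is received at each node before being sent on,
    so the full transfer time is paid once per hop."""
    return hops * (message_bits / bandwidth)


def wormhole_latency(hops: int, message_bits: float, flit_bits: float, bandwidth: float) -> float:
    """The message is cut into flits that follow the header in a pipeline,
    so only the header flit pays the per-hop cost."""
    return message_bits / bandwidth + hops * (flit_bits / bandwidth)


if __name__ == "__main__":
    # Illustrative values: 1000-bit message, 10-bit flits, 100 bits per time unit.
    for hops in (1, 2, 5, 10):
        sf = store_and_forward_latency(hops, 1000, 100)
        wh = wormhole_latency(hops, 1000, 10, 100)
        print(f"hops={hops:2d}  store-and-forward={sf:6.1f}  wormhole={wh:5.1f}")
```

Under this model store-and-forward delay grows in proportion to distance, while wormhole delay is nearly independent of it when messages are much longer than a flit.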

slide 37

Books

• D. Sima, T. Fountain, P. Kacsuk, "Advanced Computer Architectures: A Design Space Approach", Addison Wesley, 1997.

• M.J. Flynn, "Computer Architecture: Pipelined and Parallel Processor Design", Narosa Publishing House / Jones and Bartlett, 1996.

• D.A. Patterson, J.L. Hennessy, "Computer Architecture: A Quantitative Approach", Morgan Kaufmann Publishers, 2002.

• K. Hwang, "Advanced Computer Architecture: Parallelism, Scalability, Programmability", McGraw Hill, 1993.

• H.G. Cragon, "Memory Systems and Pipelined Processors", Narosa Publishing House / Jones and Bartlett, 1998.

• D.E. Culler, J.P. Singh, A. Gupta, "Parallel Computer Architecture: A Hardware/Software Approach", Harcourt Asia / Morgan Kaufmann Publishers, 2000.