Understanding Parallel Computers

Post on 30-Dec-2015

Parallel Processing EE 613

Balancing Machine Specifics With Portability

• How much do we need to know about the machine architecture?
  – Performance goal
    • Game developers, embedded systems, and hardware vendors
    • Coding specific to machine
  – Lifetime goal
    • Portability
    • Generalized code

A Look At Six Parallel Computers

• Chip Multiprocessors
  – Intel Core Duo
  – AMD Dual Core Opteron

• Heterogeneous Chip Designs
  – GPU, FPGA, Cell
  – Vector

• Clusters
  – Node: processors, RAM, disk; memory not shared
  – Typical: eight nodes, control processor, switch
  – Blade server: includes com ports and cooling fans

• Supercomputers
  – BlueGene/L (PowerPC 440 processors)

CSE524 Parallel Algorithms Lawrence Snyder

Now we can put multiple cores on a single chip.

MESI – Modified, Exclusive, Shared, Invalid
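
The four MESI states can be read as a small state machine kept per cache line. The following is a minimal sketch of the common textbook transitions, not the logic of any particular processor; the event names (`local_read`, `local_write`, `bus_read`, `bus_write`) and the `others_have_copy` flag are assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def next_state(state, event, others_have_copy=False):
    """Return the next MESI state of one cache line.

    The event names are hypothetical labels for illustration:
    local_read / local_write come from this cache's own processor,
    bus_read / bus_write are requests from other caches seen by snooping.
    """
    if event == "local_read":
        if state is State.INVALID:
            # Read miss: Shared if another cache holds the block, else Exclusive.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state  # A read hit does not change the state.
    if event == "local_write":
        # Other copies are invalidated over the bus; this copy becomes dirty.
        return State.MODIFIED
    if event == "bus_read":
        if state in (State.MODIFIED, State.EXCLUSIVE):
            # Another cache now shares the block (a Modified line
            # supplies its dirty data first).
            return State.SHARED
        return state
    if event == "bus_write":
        # Another cache is writing, so this copy is stale.
        return State.INVALID
    raise ValueError(f"unknown event: {event}")


if __name__ == "__main__":
    s = State.INVALID
    for ev in ("local_read", "local_write", "bus_read", "bus_write"):
        s = next_state(s, ev)
        print(ev, "->", s.value)
```

Starting from Invalid with no other sharers, the loop walks the line through Exclusive, Modified, Shared, and back to Invalid.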

MOESI – Modified, Owned, Exclusive, Shared, Invalid
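
The practical effect of the extra Owned state shows up when another cache reads a dirty block. The sketch below contrasts the two protocols for that single event, under the usual textbook description; the function name `snooped_read` and the string state codes are assumptions for illustration.

```python
def snooped_read(protocol, state):
    """Return (next_state, writes_back_to_memory) for one cache line
    when another cache's read of the same block is snooped.

    protocol is "MESI" or "MOESI"; state is a one-letter state code.
    """
    valid = {"MESI": "MESI", "MOESI": "MOESI"}
    if protocol not in valid or state not in valid[protocol]:
        raise ValueError(f"state {state!r} is not valid for {protocol!r}")

    if state == "M":
        if protocol == "MOESI":
            # The dirty copy moves to Owned and supplies the data itself;
            # main memory is allowed to stay stale.
            return "O", False
        # MESI has no Owned state, so the dirty data is written back
        # and the line drops to Shared.
        return "S", True
    if state == "O":
        return "O", False  # The owner keeps supplying the data.
    if state == "E":
        return "S", False  # A clean copy simply becomes shared.
    return state, False    # Shared and Invalid lines are unaffected.


if __name__ == "__main__":
    print(snooped_read("MESI", "M"))
    print(snooped_read("MOESI", "M"))
```

Under these assumptions, MOESI avoids the write-back to memory that MESI performs when a Modified block becomes shared.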

Both designs implement a coherent shared memory.

Symmetric Multiprocessor (SMP)

• Each processor makes memory requests over the common memory bus.

• All cache controllers snoop the memory bus and adjust the tags on their cached values to ensure coherent cache usage.

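Example: p0 and p1 have copies of Block X, but p2 writes to Block X. Under a write-invalidate protocol such as MESI, the cache controllers of p0 and p1 snoop the write on the bus and mark their copies invalid.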
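
The example can be traced with a toy model of a snooping bus. This is a minimal sketch assuming a write-invalidate policy and only three states (M, S, I); the `Cache` and `Bus` classes are hypothetical and exist only for this illustration.

```python
class Cache:
    """One processor's cache, tracking a state per block."""

    def __init__(self, name):
        self.name = name
        self.state = {}  # block name -> "M", "S", or "I"

    def snoop_write(self, block):
        # Another processor wrote the block: any local copy is now stale.
        if self.state.get(block, "I") != "I":
            self.state[block] = "I"


class Bus:
    """Shared memory bus that every cache controller snoops."""

    def __init__(self, caches):
        self.caches = caches

    def write(self, writer, block):
        # Broadcast the write so all other caches can adjust their tags.
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_write(block)
        writer.state[block] = "M"  # The writer holds the only valid copy.


p0, p1, p2 = Cache("p0"), Cache("p1"), Cache("p2")
bus = Bus([p0, p1, p2])

# p0 and p1 start with shared copies of Block X.
p0.state["X"] = "S"
p1.state["X"] = "S"

# p2 writes to Block X.
bus.write(p2, "X")

states = {c.name: c.state.get("X", "I") for c in bus.caches}
print(states)  # {'p0': 'I', 'p1': 'I', 'p2': 'M'}
```

After the write, only p2 holds a valid (Modified) copy; a later read by p0 or p1 would miss and fetch the new value.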
