CS 584

Jan 19, 2016
Transcript
Page 1: CS 584

CS 584

Remember to read and take the quiz!

Page 2: CS 584

Parallelism

What is parallelism? Multiple tasks working at the same time on the same problem.

Why parallelism? "I feel the need for speed!" (Top Gun, 1986)

Page 3: CS 584

Parallel Computing

What is a parallel computer? A set of processors that are able to work cooperatively to solve a computational problem.

Examples:
- Parallel supercomputers (IBM SP-2, Cray T3E, Intel Paragon, etc.)
- Clusters of workstations
- Symmetric multiprocessors

Page 4: CS 584

Won't serial computers be fast enough?

Moore's Law: speed doubles roughly every 18 months.

Predictions of need:
- The British government in the 1940s predicted it would only need about 2-3 computers.
- The market for the Cray was predicted to be about 10 machines.

Problem: these predictions don't take new applications into account.
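The doubling claim above compounds quickly. A minimal sketch of what "double every 18 months" implies over a decade (the 10-year horizon is just an illustration):

```python
# Moore's law as stated on the slide: speed doubles every 18 months.
# What does that compound to over ten years?

def growth_factor(months, doubling_period=18):
    """Multiplicative speedup after `months`, doubling every `doubling_period` months."""
    return 2 ** (months / doubling_period)

decade = growth_factor(120)   # 10 years = 120 months
print(f"Speedup over 10 years: {decade:.0f}x")   # roughly 100x
```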

Page 5: CS 584

Applications Drive Supercomputing

Traditional:
- Weather simulation and prediction
- Climate modeling
- Chemical and physical computing

New applications:
- Collaborative environments
- Virtual reality
- Parallel databases

Page 6: CS 584

Application Needs

Graphics:
- 10^9 volume elements
- 200 operations per element
- Real-time display

Weather & Climate:
- A 10-year simulation involves 10^16 operations.
- Accuracy can be improved by higher-resolution grids, which involve more operations.
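A back-of-the-envelope check of the slide's numbers (the 30 frames/s rate for "real time" and the 1 GFLOPS baseline machine are illustrative assumptions, not from the slide):

```python
# Rough arithmetic behind the application needs above.

graphics_ops_per_frame = 1e9 * 200                    # 10^9 volume elements x 200 ops each
graphics_ops_per_sec = graphics_ops_per_frame * 30    # assume 30 frames/s for "real time"

weather_ops = 1e16                                    # 10-year simulation
days_at_1_gflops = weather_ops / 1e9 / 86400          # wall-clock days on a 1 GFLOPS machine

print(f"Graphics needs about {graphics_ops_per_sec:.1e} ops/s")   # ~6 teraops/s
print(f"Weather at 1 GFLOPS takes about {days_at_1_gflops:.0f} days")
```

Either way the demand is far beyond what a single processor of the era could deliver, which is the point of the slide.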

Page 7: CS 584

Cost-Performance Trend

[Figure: cost vs. performance curves for the 1960s, 1970s, 1980s, and 1990s.]

Page 8: CS 584

What does this suggest?

More performance is easy, up to a point. Pushing current serial computers significantly beyond that saturation point is extremely expensive. Connecting large numbers of microprocessors into a parallel computer overcomes the saturation point: cost stays low while performance increases.

Page 9: CS 584

Computer Design

Single-processor performance has lately been increased by raising the level of internal parallelism:
- Multiple functional units
- Pipelining

Higher performance gains come from incorporating multiple "computers on a chip."

Page 10: CS 584

Computer Performance

[Figure: peak performance from 1e2 to 1e12 FLOPS (approaching TFLOPS) plotted against years 1950-2000, for the Eniac, IBM 704, IBM 7090, CDC 7600, Cray 1, Cray X-MP, Cray C90, and IBM SP-2.]

Page 11: CS 584

Communication Performance

- Early 1990s: Ethernet, 10 Mbits
- Mid 1990s: FDDI, 100 Mbits
- Mid 1990s: ATM, 100s of Mbits
- Late 1990s: Fast Ethernet, 100 Mbits
- Late 1990s: Gigabit Ethernet, 100s of Mbits
- Soon, 1000 Mbits will be commonplace

Page 12: CS 584

Performance Summary

Applications are demanding more speed.

Performance trends:
- Processors are increasing in speed.
- Communication performance is increasing.

Future: performance trends suggest a future where parallelism pervades all computing. Concurrency is key to performance increases.

Page 13: CS 584

Parallel Processing Architectures

Architectures:
- Single computer with lots of processors
- Multiple interconnected computers

Architecture governs programming:
- Shared memory and locks
- Message passing

Page 14: CS 584

Shared Memory Computers

[Diagram: processors connected to memory modules through an interconnection network.]

Examples: SMP machines, SGI Origin series

Page 15: CS 584

Message Passing Computers

[Diagram: processors, each with local memory, connected through an interconnection network.]

Examples: IBM SP-2, nCube, Cray T3E, Intel Paragon, workstation clusters

Page 16: CS 584

Distributed Shared Memory

[Diagram: processors and memory modules connected through an interconnection network.]

Examples: SGI Origin series, workstation clusters, Kendall Square Research KSR1 and KSR2

Page 17: CS 584

Parallel Computers

Flynn's Taxonomy

Machines are classified by instruction stream and data stream:

                        Single Data    Multiple Data
Single Instruction      SISD           SIMD
Multiple Instruction    MISD           MIMD

Nice, but doesn't fully account for all architectures.

Page 18: CS 584

Message Passing Architectures

Requires some form of interconnection.

The network is the bottleneck:
- Latency and bandwidth
- Diameter
- Bisection bandwidth

Page 19: CS 584

Message Passing Architectures

Line/Ring

Mesh/Torus

Page 20: CS 584

Message Passing Architectures

Tree/Fat Tree

Page 21: CS 584

Message Passing Architectures

Hypercube
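The topologies on the last few slides are usually compared by the metrics from slide 18. A minimal sketch of the standard diameter and bisection-width formulas for a p-node machine (assuming p is a power of two, and a perfect square for the mesh):

```python
import math

# Textbook diameter / bisection-width formulas for the topologies above.
# This is for intuition only, not a network simulator.

def ring(p):
    return {"diameter": p // 2, "bisection_width": 2}

def mesh_2d(p):                      # sqrt(p) x sqrt(p) mesh, no wraparound links
    side = int(math.isqrt(p))
    return {"diameter": 2 * (side - 1), "bisection_width": side}

def hypercube(p):                    # p = 2^d nodes, dimension d
    d = int(math.log2(p))
    return {"diameter": d, "bisection_width": p // 2}

for name, fn in [("ring", ring), ("2D mesh", mesh_2d), ("hypercube", hypercube)]:
    print(f"{name:9s} p=64: {fn(64)}")
```

At p = 64 the hypercube's diameter is 6 versus the ring's 32, which is why it appears so often in these architectures despite its wiring cost.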

Page 22: CS 584
Page 23: CS 584

Embedding

The mapping of nodes from one static network onto another:
- Ring onto hypercube
- Tree onto mesh
- Tree onto hypercube

Useful for transporting algorithms between architectures.

Page 24: CS 584

Communication Methods

Circuit switching:
- Path establishment
- Dedicated links

Packet switching:
- Message is split up
- Store-and-forward or virtual cut-through
- Wormhole routing (less storage)
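The difference between the two packet-switching strategies shows up in the first-order latency models; a sketch using the standard textbook formulas, with made-up parameter values (t_s = startup, t_h = per-hop, t_w = per-word time, m = words, l = hops):

```python
# Contrast store-and-forward with cut-through / wormhole latency.
# Parameter values below are illustrative, not measured.

t_s, t_h, t_w = 100.0, 2.0, 0.5   # microseconds

def store_and_forward(m, l):
    # The whole message is received and stored at every intermediate node.
    return t_s + l * (t_h + m * t_w)

def cut_through(m, l):            # also models wormhole routing
    # The header reserves the path; the body pipelines behind it.
    return t_s + l * t_h + m * t_w

m, l = 1000, 8
print(store_and_forward(m, l))    # 4116.0
print(cut_through(m, l))          # 616.0
```

Cut-through pays the per-word cost once rather than once per hop, which is why it (and its low-storage variant, wormhole routing) won out.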

Page 25: CS 584

Trends

Supercomputers used to have the best processing and communication. However, commodity processors (Pentium III and Alpha) are now doing quite well, switched 100 Mbps networking is cheap, and high-speed networks aren't too highly priced.

Page 26: CS 584

Cost-Performance

Supercomputer (128-node SP-2):
- 100's of gigaflops
- Millions of dollars

Cluster of workstations (128 nodes):
- 10's to 100's of gigaflops
- Hundreds of thousands of dollars
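Put as performance per dollar, the comparison above is stark. A rough sketch using midpoints of the slide's ranges (the exact figures here are assumptions for illustration, not from the slide):

```python
# Flops-per-dollar comparison; assumed midpoints of the slide's ranges.

super_gflops, super_cost = 300.0, 3_000_000.0     # "100's of gigaflops", "millions"
cluster_gflops, cluster_cost = 100.0, 300_000.0   # "10's to 100's", "hundreds of thousands"

super_ratio = super_gflops / super_cost * 1e6     # GFLOPS per $1M
cluster_ratio = cluster_gflops / cluster_cost * 1e6

print(f"Supercomputer: {super_ratio:.0f} GFLOPS per $1M")
print(f"Cluster:       {cluster_ratio:.0f} GFLOPS per $1M")
```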

Page 27: CS 584

Advantages of Clusters

- Low cost
- Easily upgradeable
- Can use existing software
- Low cost

Page 28: CS 584

What about scalability?

Current switches support about 256 connections. What about 1000's of connections?

Interconnecting switches:
- Fat tree
- Hypercube
- Etc.

Page 29: CS 584

Serial vs. Parallel Programming

Serial programming has been aided by the von Neumann computer design and by procedural and object languages: program for one machine ---> program for all.

Parallel programming needs the same type of standardization:
- Machine model: the multicomputer
- Language and communication: MPI

Page 30: CS 584

The Multicomputer

- Multiple von Neumann computers (nodes)
- Interconnection network
- Each node executes its own program, accesses its local memory faster than remote memory (locality), and sends and receives messages.

Page 31: CS 584

The Multicomputer

[Diagram: nodes, each a CPU with its own memory, connected by an interconnect.]

Page 32: CS 584

Parallel Programming Properties

Concurrency: performance should increase by employing multiple processors.

Scalability: performance should continue to increase as we add more processors.

Locality of reference: performance will be greater if we only access local memory.

Page 33: CS 584

Summary

- Applications drive supercomputing.
- Processor and network performance is increasing.
- The trend is toward ubiquitous parallelism.
- Serial programming was aided by a standardized machine and programming model.
- Standardized machine and programming models for parallel computing are emerging.