CS 584
Remember to read and take the quiz!
Parallelism
What is parallelism? Multiple tasks working at the same
time on the same problem.
Why parallelism? "I feel the need for speed!" (Top Gun, 1986)
Parallel Computing
What is a parallel computer? A set of processors that are able to
work cooperatively to solve a computational problem
Examples:
- Parallel supercomputers: IBM SP-2, Cray T3E, Intel Paragon, etc.
- Clusters of workstations
- Symmetric multiprocessors
Won't serial computers be fast enough?
Moore's Law: speed doubles about every 18 months.
Predictions of need:
- The British government in the 1940s predicted it would need only about 2-3 computers.
- The market for the Cray was predicted to be about 10 machines.
Problem: such predictions don't take new applications into account.
Applications Drive Supercomputing
Traditional: weather simulation and prediction, climate modeling, chemical and physical computing.
New apps: collaborative environments, virtual reality, parallel databases.
Application Needs
Graphics: 10^9 volume elements, 200 operations per element, real-time display.
Weather & climate: a 10-year simulation involves about 10^16 operations. Accuracy can be improved with higher-resolution grids, which require even more operations.
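The graphics figures above can be sanity-checked with quick arithmetic. The 30 frames/sec rate for "real-time display" is an assumption for illustration; the rest comes from the numbers in the slide.

```python
# Back-of-the-envelope estimate of the compute demand for real-time
# volume graphics, using the slide's numbers. The frame rate is assumed.
elements = 10**9          # volume elements
ops_per_element = 200     # operations per element
fps = 30                  # assumed "real-time" frame rate

ops_per_second = elements * ops_per_element * fps
print(f"{ops_per_second:.1e} ops/sec")  # on the order of 10^12: terascale
```

Even under modest assumptions, the demand lands in the teraflops range, far beyond any single processor of the era.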
Cost-Performance Trend
[Chart: performance vs. cost for serial computers, 1960s through 1990s; performance flattens out as cost keeps rising.]
What does this suggest?
More performance is easy, up to a point. Significant performance increases for current serial computers beyond the saturation point are extremely expensive. Connecting large numbers of microprocessors into a parallel computer overcomes the saturation point: cost stays low while performance increases.
Computer Design
Single-processor performance has lately been increased by raising the level of internal parallelism: multiple functional units, pipelining.
Higher performance gains come from incorporating multiple "computers on a chip."
Computer Performance
[Chart: peak performance (10^2 to 10^12 ops/sec) vs. year (1950-2000) for the ENIAC, IBM 704, IBM 7090, CDC 7600, Cray 1, Cray X-MP, Cray C90, and IBM SP-2; the trend approaches a TFLOPS.]
Communication Performance
- Early 1990s: Ethernet, 10 Mbits
- Mid 1990s: FDDI, 100 Mbits
- Mid 1990s: ATM, 100s of Mbits
- Late 1990s: Fast Ethernet, 100 Mbits
- Late 1990s: Gigabit Ethernet, 100s of Mbits
- Soon, 1000 Mbits will be commonplace
Performance Summary
Applications are demanding more speed.
Performance trends: processors are increasing in speed; communication performance is increasing.
Future: these trends suggest a future where parallelism pervades all computing. Concurrency is key to performance increases.
Parallel Processing Architectures
Architectures:
- A single computer with many processors
- Multiple interconnected computers
Architecture governs programming: shared memory and locks, or message passing.
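As a minimal sketch of the shared-memory-and-locks style (plain Python threads, not a real SMP workload): several threads update one shared counter and must serialize through a lock, or updates can be lost.

```python
# Shared-memory style: all threads see one counter; a lock serializes updates.
import threading

counter = 0
lock = threading.Lock()

def work(n):
    global counter
    for _ in range(n):
        with lock:          # without the lock, read-modify-write races lose updates
            counter += 1

threads = [threading.Thread(target=work, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000
```

In the message-passing style there is no shared counter at all; each process keeps a local count and the totals are exchanged as explicit messages.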
Shared Memory Computers
[Diagram: processors connected through an interconnection network to shared memory modules]
Examples: SMP machines, the SGI Origin series
Message Passing Computers
[Diagram: processor/memory pairs connected by an interconnection network]
Examples: IBM SP-2, nCube, Cray T3E, Intel Paragon, workstation clusters
Distributed Shared Memory
[Diagram: processors with local memory modules sharing one address space across an interconnection network]
Examples: the SGI Origin series, workstation clusters, Kendall Square Research KSR1 and KSR2
Parallel Computers
Flynn's Taxonomy
Classifies machines by instruction stream and data stream:
- SISD: Single Instruction, Single Data
- SIMD: Single Instruction, Multiple Data
- MISD: Multiple Instruction, Single Data
- MIMD: Multiple Instruction, Multiple Data
Nice, but it doesn't fully account for all architectures.
Message Passing Architectures
Requires some form of interconnection network.
The network is the bottleneck. Key metrics: latency and bandwidth, diameter, bisection bandwidth.
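These metrics can be made concrete with the standard formulas for a few topologies. The functions below are an illustrative sketch; bisection bandwidth is counted in links cut, not bytes/sec.

```python
def ring_metrics(p):
    # Ring of p nodes: longest shortest path goes halfway around;
    # splitting the ring in half severs exactly 2 links.
    return {"diameter": p // 2, "bisection_links": 2}

def mesh_metrics(side):
    # side x side 2D mesh (no wraparound): corner-to-corner path;
    # a straight cut down the middle crosses `side` links.
    return {"diameter": 2 * (side - 1), "bisection_links": side}

def hypercube_metrics(dim):
    # dim-dimensional hypercube with 2^dim nodes: diameter is dim
    # (flip one bit per hop); a bisection cuts 2^(dim-1) links.
    return {"diameter": dim, "bisection_links": 2 ** (dim - 1)}

for name, m in [("ring(64)", ring_metrics(64)),
                ("mesh(8x8)", mesh_metrics(8)),
                ("hypercube(6)", hypercube_metrics(6))]:
    print(name, m)
```

For 64 nodes, the hypercube's diameter of 6 versus the ring's 32 shows why richer topologies pay off as machines scale.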
Common topologies:
- Line/Ring
- Mesh/Torus
- Tree/Fat Tree
- Hypercube
Embedding
The mapping of the nodes of one static network onto another: ring onto hypercube, tree onto mesh, tree onto hypercube.
Useful for transporting algorithms between networks.
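The ring-onto-hypercube embedding is the classic example, done with the binary-reflected Gray code: consecutive Gray codes differ in exactly one bit, so neighboring ring nodes land on neighboring hypercube nodes. A small sketch:

```python
def gray(i):
    # Binary-reflected Gray code: consecutive values differ in one bit.
    return i ^ (i >> 1)

def embed_ring_in_hypercube(dim):
    # Map ring node i (0 .. 2^dim - 1) to hypercube node gray(i).
    return [gray(i) for i in range(2 ** dim)]

mapping = embed_ring_in_hypercube(3)
# Check the embedding: every pair of ring neighbors (including the
# wraparound) maps to hypercube nodes at Hamming distance 1.
for i in range(len(mapping)):
    a, b = mapping[i], mapping[(i + 1) % len(mapping)]
    assert bin(a ^ b).count("1") == 1
print(mapping)  # [0, 1, 3, 2, 6, 7, 5, 4]
```

Because every ring edge maps to a single hypercube link, a ring algorithm runs on the hypercube with no extra communication cost (dilation 1).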
Communication Methods
Circuit switching: a path is established and the links are dedicated for the duration.
Packet switching: the message is split up; delivery uses store-and-forward or virtual cut-through.
Wormhole routing: like cut-through, but requires less storage per node.
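The difference between store-and-forward and cut-through/wormhole shows up in a simple latency model. This is an idealized sketch with made-up numbers; real networks add per-hop routing and software overheads.

```python
def store_and_forward_latency(msg_bytes, bandwidth, hops):
    # Each intermediate node receives the entire message before
    # forwarding it, so the full transmission time is paid per hop.
    return hops * (msg_bytes / bandwidth)

def cut_through_latency(msg_bytes, header_bytes, bandwidth, hops):
    # Only the header pays the per-hop delay; the body pipelines behind it.
    return hops * (header_bytes / bandwidth) + msg_bytes / bandwidth

# Illustrative numbers: 1 MB message, 100 Mbit/s links (12.5 MB/s),
# 10 hops, 64-byte header.
sf = store_and_forward_latency(1_000_000, 12_500_000, 10)
ct = cut_through_latency(1_000_000, 64, 12_500_000, 10)
print(f"store-and-forward: {sf:.3f} s, cut-through: {ct:.4f} s")
```

In this sketch store-and-forward pays the message transmission time ten times over, while cut-through pays it once plus a tiny per-hop header cost.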
Trends
Supercomputers used to have the best processing and communication.
However: commodity processors are doing pretty well (Pentium III, Alpha).
And: switched 100 Mbps networking is cheap, and high-speed networks aren't too highly priced.
Cost-Performance
Supercomputer: a 128-node SP-2 delivers 100's of gigaflops and costs millions of dollars.
Cluster of workstations: 128 nodes deliver 10's to 100's of gigaflops and cost hundreds of thousands of dollars.
Advantages of Clusters
- Low cost
- Easily upgradeable
- Can use existing software
- Low cost
What about scalability?
Current switches support about 256 connections. What about 1000's of connections?
Interconnect the switches: fat tree, hypercube, etc.
Serial vs. Parallel Programming
Serial programming has been aided by the von Neumann computer design: procedural and object-oriented languages; program one ---> program all.
Parallel programming needs the same type of standardization: a machine model (the multicomputer) and a language and communication standard (MPI).
The Multicomputer
- Multiple von Neumann computers (nodes)
- An interconnection network
- Each node executes its own program, accesses its local memory faster than remote memory (locality), and sends and receives messages
[Diagram: nodes, each a CPU with local memory, connected by an interconnect]
Parallel Programming Properties
Concurrency: performance should increase by employing multiple processors.
Scalability: performance should continue to increase as we add more processors.
Locality of reference: performance will be greater if we access only local memory.
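Concurrency and scalability are usually quantified as speedup and efficiency. The timings below are hypothetical, for illustration only.

```python
def speedup(t_serial, t_parallel):
    # S = T1 / Tp: how many times faster the parallel run is.
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    # E = S / p: ideally 1.0; it falls as communication overhead grows.
    return speedup(t_serial, t_parallel) / p

# Hypothetical timings: 100 s serial, 30 s on 4 processors.
print(speedup(100.0, 30.0))        # ~3.33x
print(efficiency(100.0, 30.0, 4))  # ~0.83
```

A program scales well when efficiency stays near 1.0 as processors are added; poor locality and rising communication costs are what drag it down.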
Summary
Applications drive supercomputing.
Processor and network performance is increasing.
The trend is toward ubiquitous parallelism.
Serial programming was aided by a standardized machine and programming model.
Standardized machine and programming models for parallel computing are emerging.