Taxonomy of parallel machines
• Memory
  – Shared memory
  – Distributed memory
• Control
  – SIMD
  – MIMD
Shared Memory Multiprocessor
Conventional Computer
Consists of a processor executing a program stored in a (main) memory.
Each main memory location is located by its address. Addresses start at 0 and extend to 2^b - 1 when there are b bits (binary digits) in the address.
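As a quick check of the address range stated above, the following minimal sketch computes the lowest and highest address for a given number of address bits. The function name `address_range` is an illustrative assumption, not something from the slides.

```python
def address_range(b: int) -> tuple[int, int]:
    """Return (lowest, highest) address when addresses have b bits."""
    if b < 1:
        raise ValueError("b must be at least 1")
    # b bits can encode 2**b distinct values, numbered 0 .. 2**b - 1.
    return 0, 2**b - 1


for bits in (8, 16, 32):
    low, high = address_range(bits)
    print(f"{bits}-bit address: {low} .. {high} ({high + 1} locations)")
```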
[Figure: a processor connected to main memory; instructions flow to the processor, and data flows to or from the processor]
Shared Memory Multiprocessor System
A natural way to extend the single-processor model is to have multiple processors connected to multiple memory modules, such that each processor can access any memory module:
[Figure: processors connected through an interconnection network to memory modules that form one address space]
Simplistic view of a small shared memory multiprocessor
[Figure: processors and shared memory connected by a bus]
Typical Shared Memory Multiprocessor
[Figure: four processors, each with an L1 cache, an L2 cache, and a bus interface, attached to a processor/memory bus; a memory controller connects the bus to the shared memory, and an I/O interface connects it to the I/O bus]
Programming Shared Memory Multiprocessors
• Threads: the programmer decomposes the program into individual parallel sequences (threads), each able to access variables declared outside the threads.
Example: Pthreads
• A sequential programming language with preprocessor compiler directives to declare shared variables and specify parallelism.
Example: OpenMP or Cilk (needs an OpenMP or Cilk compiler)
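The thread model in the first bullet can be sketched as follows. This is an illustration using Python's standard `threading` module, not Pthreads itself; the names `parallel_sum`, `shared`, and `worker` are assumptions made for the example. It shows the programming model only: standard CPython's global interpreter lock limits the actual parallel speedup.

```python
import threading


def parallel_sum(num_threads: int, items_per_thread: int) -> int:
    """Sum 0 .. num_threads*items_per_thread - 1 using several threads."""
    # Declared outside the threads: every thread can read and write it.
    shared = {"total": 0}
    lock = threading.Lock()

    def worker(start: int) -> None:
        # Each thread works on its own slice of the range.
        partial = sum(range(start, start + items_per_thread))
        # The lock protects the shared variable during the update.
        with lock:
            shared["total"] += partial

    threads = [
        threading.Thread(target=worker, args=(i * items_per_thread,))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["total"]


print(parallel_sum(4, 1000))
```

The lock matters because the read-modify-write on the shared total is not atomic; without it, updates from different threads could be lost.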
Distributed Memory Multiprocessor
Computers connected through an interconnection network:
[Figure: computers, each with a processor and local memory, exchanging messages through an interconnection network]
Interconnection Networks
• Limited and exhaustive interconnections
• 2- and 3-dimensional meshes
• Hypercube (not now common)
• Using switches:
  – Crossbar
  – Trees
  – Multistage interconnection networks
Two-dimensional array (mesh)
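Three-dimensional meshes also exist and are used in some large high-performance systems.
[Figure: computers/processors arranged in a two-dimensional grid, with links between adjacent computers/processors]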
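A minimal sketch of mesh connectivity, assuming a plain grid with no wraparound links (that is, not a torus): each node is linked only to the nodes directly above, below, left, and right of it. The function name `mesh_neighbors` and the (row, column) addressing are illustrative assumptions.

```python
def mesh_neighbors(row: int, col: int, rows: int, cols: int) -> list[tuple[int, int]]:
    """Return the nodes directly linked to (row, col) in a rows x cols mesh."""
    candidates = [
        (row - 1, col),  # up
        (row + 1, col),  # down
        (row, col - 1),  # left
        (row, col + 1),  # right
    ]
    # Assumption: no wraparound, so positions outside the grid are dropped.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


print(mesh_neighbors(0, 0, 4, 4))  # corner node: two links
print(mesh_neighbors(1, 1, 4, 4))  # interior node: four links
```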
Three-dimensional hypercube
[Figure: eight nodes labeled 000 to 111; links join nodes whose labels differ in exactly one bit]
IBM Blue Gene
Tree
[Figure: tree network with switch elements at the root and internal nodes, processors at the leaves, and links between levels]
Four-dimensional hypercube
Hypercubes were popular in the 1980s and 1990s but are not now.
[Figure: sixteen nodes labeled 0000 to 1111; links join nodes whose labels differ in exactly one bit]
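The binary node labels make hypercube connectivity easy to compute: a node's neighbors are found by flipping one bit of its label at a time. The sketch below also shows one common routing scheme (correcting differing bits in a fixed order, from lowest to highest); the slides do not specify a routing method, so that part is an assumption. Function names are illustrative.

```python
def hypercube_neighbors(node: int, dim: int) -> list[str]:
    """Return the labels of the dim neighbors of node in a dim-dimensional hypercube."""
    # Flipping bit k gives the neighbor across dimension k.
    return [format(node ^ (1 << k), f"0{dim}b") for k in range(dim)]


def hypercube_route(src: int, dst: int, dim: int) -> list[str]:
    """Return one path from src to dst, fixing differing bits from lowest to highest."""
    path = [src]
    current = src
    for k in range(dim):
        if (current ^ dst) & (1 << k):  # bit k still differs from the destination
            current ^= 1 << k           # cross the link in dimension k
            path.append(current)
    return [format(n, f"0{dim}b") for n in path]


print(hypercube_neighbors(0b0101, 4))
print(hypercube_route(0b0000, 0b1011, 4))
```

The number of hops on such a path equals the number of bits in which the source and destination labels differ, so it is at most the dimension of the hypercube.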
Multistage Interconnection Network
Example: Omega network
[Figure: Omega network connecting eight inputs (000 to 111) to eight outputs (000 to 111) through stages of 2 × 2 switch elements, each set to a straight-through or crossover connection]
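A hedged sketch of how a message can find its way through an Omega network, assuming the standard wiring (a perfect shuffle before each stage of 2 × 2 switches) and destination-tag routing, where stage i uses bit i of the destination address, most significant bit first, to pick the upper (0) or lower (1) switch output. The slides show only the figure, so the routing rule and the function name `omega_route` are assumptions for illustration.

```python
def omega_route(src: int, dst: int, n_bits: int) -> tuple[int, list[str]]:
    """Route from input src to output dst; return (final position, switch settings)."""
    mask = (1 << n_bits) - 1
    pos = src
    settings = []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the n-bit line number left by one bit.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        in_port = pos & 1                              # switch input the message arrives on
        out_port = (dst >> (n_bits - 1 - stage)) & 1   # destination bit for this stage
        settings.append("straight" if in_port == out_port else "crossover")
        pos = (pos & ~1) | out_port                    # leave on the chosen switch output
    return pos, settings


print(omega_route(0b010, 0b110, 3))
```

With 8 inputs there are 3 stages, and after the last stage the line number equals the destination address regardless of the source.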
Crossbar switch
[Figure: processors connected to memories through a grid of switches]
Message-Passing
Distributed memory parallel machines are usually programmed via message passing. Industry standard: MPI
[Figure: computers, each with a processor and memory (labeled "Shared memory" in the figure), exchanging messages through an interconnection network]
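The message-passing style can be illustrated with a small simulation. This is not MPI: threads and `queue.Queue` objects from Python's standard library stand in for separate computers and the interconnection network, and the workers exchange data only through explicit send (`put`) and receive (`get`) operations, never through shared variables. In a real MPI program these would be calls such as `MPI_Send` and `MPI_Recv`. The names `worker` and `distributed_sum` are assumptions for the example.

```python
import queue
import threading


def worker(rank: int, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive a list of numbers, sum it locally, and send the result back."""
    data = inbox.get()             # blocking receive of this worker's share
    outbox.put((rank, sum(data)))  # send the partial result to the coordinator


def distributed_sum(values: list[int], num_workers: int = 4) -> int:
    """Scatter values to the workers, then gather and combine their partial sums."""
    inboxes = [queue.Queue() for _ in range(num_workers)]
    results = queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(r, inboxes[r], results))
        for r in range(num_workers)
    ]
    for w in workers:
        w.start()
    for r in range(num_workers):
        inboxes[r].put(values[r::num_workers])  # send each worker every num_workers-th item
    total = sum(results.get()[1] for _ in range(num_workers))
    for w in workers:
        w.join()
    return total


print(distributed_sum(list(range(100))))
```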
Flynn’s Classifications
Taxonomy of parallel machines
                     MIMD          SIMD
Distributed memory   clusters      CM/2 (legacy)
Shared memory        multi-core    GPU