Seminar on parallel computing
• Goal: provide an environment for exploration of parallel computing
• Driven by participants
• Weekly hour for discussion, show & tell
• Focus primarily on distributed memory computing on Linux PC clusters
• Target audience:
  – Experience with Linux computing & Fortran/C
  – Requires parallel computing for own studies
• 1 credit possible for completion of ‘proportional’ project
Main idea
• Distribute a job over multiple processing units
• Do bigger jobs than is possible on single machines
• Memory addressing:
  – 32-bit addresses in PCs: 4 GB RAM max.
Machine architecture: serial
– Single processor
– Hierarchical memory:
• Small number of registers on CPU
• Cache (L1/L2)
• RAM
• Disk (swap space)
– Operations require multiple steps:
  • Fetch two floating-point numbers from main memory
• Add and store
• Put back into main memory
Vector processing
• Speed up single instructions on vectors
  – E.g., while adding two floating-point numbers, fetch two new ones from main memory
  – Pushing vectors through the pipeline
• Useful in particular for long vectors
• Requires good memory control:
– Bigger cache is better
• Common on most modern CPUs
  – Implemented in both hardware and software
SIMD
• Same instruction works simultaneously on different data sets
• Extension of vector computing
• Example:

  DO IN PARALLEL
    for i = 1, n
      x(i) = a(i)*b(i)
    end
  DONE PARALLEL
MIMD
• Multiple instruction, multiple data
• Most flexible; encompasses SIMD/serial
• Often best for ‘coarse-grained’ parallelism
• Message passing
• Example: domain decomposition
– Divide the computational grid into equal chunks
– Work on each domain with one CPU
– Communicate boundary values when necessary
• 1976 Cray-1 at Los Alamos (vector)
• 1980s Control Data Cyber 205 (vector)
• 1980s Cray X-MP
  – 4 coupled Cray-1s
• 1985 Thinking Machines Connection Machine
  – SIMD, up to 64k processors