Page 1:

Multiple Processor Systems I

CS 423, Klara Nahrstedt/Sam King

Page 2:

Content

• Administrative
• Introduction
• Multiprocessors
  – Hardware
  – OS Types
  – Synchronization
  – Scheduling

Page 3:

Administrative

• MP4 is out
• If you are changing groups for MP4, let the staff (profs, TAs) and your partners know
  – We have to change multiple state machines to get to a consistent state about your group status
  – If you take the extra days, you don't have to ask/inform TAs/profs about taking them, since we see the status in Compass

Page 4:

Introduction

• Goal of computer architects, engineers, and scientists: get more and more computing power
• Problem with the goal:
  – Previous approach: make the clock run faster to speed things up
  – We are beginning to hit some fundamental limits on clock speed
  – Reason: Einstein's special theory of relativity
    • No electrical signal can propagate faster than the speed of light: 30 cm/ns in a vacuum and 20 cm/ns over copper or optical fiber
    • This means:
      – with a 10 GHz clock, signals cannot travel more than 2 cm total,
      – with a 100 GHz clock, the total signal path length is at most 2 mm, and
      – at 1000 GHz (1 THz) the computer would have to be smaller than 100 µm
  – Reason: heat dissipation
    • Making computers this small may be possible, but the faster a computer runs, the more heat it generates, and the smaller the computer is, the harder it is to get rid of the heat
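A quick worked check of those figures (my arithmetic, not on the slide): the maximum signal path per clock period is the propagation speed divided by the clock frequency.

```latex
% Maximum signal path per clock period, using the slide's 20 cm/ns figure:
\[
d = \frac{v}{f}, \qquad
d_{10\,\mathrm{GHz}} = \frac{20\ \mathrm{cm/ns}}{10\ \mathrm{GHz}} = 2\ \mathrm{cm},
\qquad
d_{100\,\mathrm{GHz}} = \frac{20\ \mathrm{cm/ns}}{100\ \mathrm{GHz}} = 2\ \mathrm{mm}.
\]
```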

Page 5:

Computer Speedup

Moore’s Law: “The density of transistors on a chip doubles every 18 months, for the same cost” (1965)

Image: Tom’s Hardware

Page 6:

Multiprocessor Systems

• Continuous need for faster computers
  – shared-memory multiprocessor (tightly coupled)
  – message-passing multicomputer
  – wide-area distributed system (loosely coupled)

Page 7:

Multiprocessors

• Shared-memory multiprocessor: a computer system in which two or more CPUs share full access to a common RAM
• Inter-processor communication: one CPU writes some data into memory and another one reads the data out
• A multiprocessor OS performs:
  – regular OS functions: handling system calls, memory management, file system access, management of I/O devices
  – functions unique to a multiprocessor OS: process synchronization, resource management, scheduling
• Uniform vs. non-uniform memory access (UMA, NUMA) multiprocessors

Page 8:

Multiprocessor Hardware (1): UMA Bus-Based Multiprocessors

Problem: if the bus is busy when a CPU wants to read or write memory, the CPU waits until the bus becomes idle. As CPU speeds scale up, this imposes an unacceptable performance penalty.

What is the solution?

Page 9:

Multiprocessor Hardware (2): UMA Multiprocessors Using a Crossbar Switch

Problem: even with the best caching, a single bus can support only 16 or 32 CPUs
Solution: a crossbar switch, a non-blocking network

Page 10:

Multiprocessor Hardware (3)

• (a) A 256-node directory-based multiprocessor
• (b) Fields of a 32-bit memory address
• (c) The directory at node 36
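A hedged reconstruction of the arithmetic behind (b), assuming the textbook's usual parameters of 16 MB of RAM per node and 64-byte cache lines; the slide itself fixes only the 256 nodes and the 32-bit address:

```latex
% 32-bit address split for 256 nodes, assuming 64-byte blocks:
\[
\underbrace{8\ \text{bits}}_{\text{node},\ 2^{8}=256}
\;+\;
\underbrace{18\ \text{bits}}_{\text{block}}
\;+\;
\underbrace{6\ \text{bits}}_{\text{offset},\ 2^{6}=64\ \mathrm{B}}
\;=\; 32\ \text{bits},
\qquad
2^{18}\times 64\ \mathrm{B}=16\ \mathrm{MB}\ \text{per node}.
\]
```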

Page 11:

Multiprocessor OS Types (1): Each CPU Has Its Own OS

Characteristics:
• share a single copy of the OS code
• partition the multiprocessor memory among the four CPUs
• share the set of disks and other I/O devices

Design aspects:
• when a process makes a system call, the system call is handled by its own CPU
• no sharing of processes; consequence: no load balancing across CPUs
• no sharing of pages; consequence: no load balancing of memory
• independent buffer caching of disk blocks
• problem: no load balancing of CPUs or memory, and buffer-cache inconsistencies

Page 12:

Multiprocessor OS Types (2): Master-Slave Multiprocessors

Characteristics:
• one copy of the OS and its tables is present on one CPU and not on the others
• all system calls are redirected to that one CPU for processing
• the CPU with the OS is called the master; all other CPUs, which run applications, are called slaves

Design aspects:
• there is only one data structure that keeps track of ready processes, which is good for load balancing
• pages can be allocated among all processes dynamically
• there is only one buffer cache, so inconsistencies never occur

What are the problems with this OS type?
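One classic answer: the master becomes the bottleneck, since every system call funnels through it. A minimal user-space simulation of that funneling (illustrative only, with made-up names; a real OS redirects via traps and inter-processor interrupts):

```c
/* Illustrative sketch of master-slave dispatch, NOT real kernel code:
 * slaves hand every request to one master thread, which is the only
 * thread that touches "OS state". */
#include <pthread.h>
#include <stdio.h>

#define NSLAVES 3

static pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
static int pending = -1;   /* -1: no request; else requesting slave's id */
static int served  = 0;    /* requests the master has completed          */

static void *slave(void *arg) {
    long id = (long)arg;
    pthread_mutex_lock(&m);
    while (pending != -1)               /* wait for the request slot   */
        pthread_cond_wait(&cv, &m);
    pending = (int)id;                  /* "redirect syscall" to master */
    pthread_cond_broadcast(&cv);
    pthread_mutex_unlock(&m);
    return NULL;
}

static void *master(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    while (served < NSLAVES) {
        while (pending == -1)           /* wait for the next request   */
            pthread_cond_wait(&cv, &m);
        printf("master handles request from slave %d\n", pending);
        pending = -1;                   /* slot free for the next slave */
        served++;
        pthread_cond_broadcast(&cv);
    }
    pthread_mutex_unlock(&m);
    return NULL;
}

int main(void) {
    pthread_t t[NSLAVES + 1];
    pthread_create(&t[NSLAVES], NULL, master, NULL);
    for (long i = 0; i < NSLAVES; i++)
        pthread_create(&t[i], NULL, slave, (void *)i);
    for (int i = 0; i <= NSLAVES; i++)
        pthread_join(t[i], NULL);
    return 0;
}
```

Because everything serializes through the master, adding slaves beyond the master's service rate buys nothing.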


Page 13:

Multiprocessor OS Types (3): Symmetric Multiprocessors

Characteristics:
• eliminates the master/slave asymmetry
• there is one copy of the OS, but any CPU can run it
• when a system call is made, the CPU on which it was made traps to the kernel, which processes the system call

Design aspects:
• dynamic balancing of processes and memory, since there is only one set of OS tables
• eliminates the master as a bottleneck

What are the problems with this OS type?
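A common answer, not stated on the slide: all CPUs now contend for the same kernel and its tables. Early SMP systems coped with a single "big kernel lock"; a minimal sketch of that idea, with a pthread mutex standing in for the kernel's lock:

```c
/* Big-kernel-lock sketch: any CPU may run the kernel, but only one at
 * a time. (A simplification; modern kernels use fine-grained locks.) */
#include <pthread.h>

static pthread_mutex_t big_kernel_lock = PTHREAD_MUTEX_INITIALIZER;

void syscall_entry(void (*handler)(void)) {
    pthread_mutex_lock(&big_kernel_lock);    /* serialize all kernel work  */
    handler();                               /* run the actual system call */
    pthread_mutex_unlock(&big_kernel_lock);  /* let the next CPU enter     */
}
```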


Page 14:

Parallel Programming: Parallelization Idea

• Parallelization is “easy” if processing can be cleanly split into n units:

[Figure: a "work" box is partitioned into units w1, w2, w3]

Page 15:

Parallelization Idea (2)

Spawn worker threads:

[Figure: each unit w1, w2, w3 is handed to its own thread]

In a parallel computation, we would like to have as many threads as we have processors; e.g., a four-processor computer would be able to run four threads at the same time. The sketch below shows one way to do this.
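A minimal pthread sketch of that rule of thumb (my example; `sysconf(_SC_NPROCESSORS_ONLN)` is the POSIX call that reports the number of online processors, and the work units are placeholders):

```c
/* Spawn one worker thread per online processor. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_THREADS 64

static void *worker(void *arg) {
    long unit = (long)arg;                   /* stands in for w1, w2, ... */
    printf("thread processing work unit w%ld\n", unit + 1);
    return NULL;
}

int main(void) {
    long n = sysconf(_SC_NPROCESSORS_ONLN);  /* number of online CPUs    */
    if (n < 1) n = 1;
    if (n > MAX_THREADS) n = MAX_THREADS;
    pthread_t tid[MAX_THREADS];
    for (long i = 0; i < n; i++)             /* one thread per processor */
        pthread_create(&tid[i], NULL, worker, (void *)i);
    for (long i = 0; i < n; i++)
        pthread_join(tid[i], NULL);          /* wait for all workers     */
    return 0;
}
```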

Page 16:

Parallelization Idea (3)

Workers process data:

[Figure: the threads work on w1, w2, w3 in parallel]

Page 17:

Parallelization Idea (4)

Report results:

[Figure: the threads' outputs for w1, w2, w3 are combined into the results]

Page 18:

Parallelization Pitfalls

But this model is too simple!

• How do we assign work units to worker threads?
• What if we have more work units than threads?
• How do we aggregate the results at the end?
• How do we know all the workers have finished?
• What if the work cannot be divided into completely separate tasks?

Page 19:

Parallelization Pitfalls (2)

• Each of these problems represents a point at which multiple threads must communicate with one another, or access a shared resource.

• Golden rule: Any memory that can be used by multiple threads must have an associated synchronization system!
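The golden rule in miniature (a sketch, not from the slides): `counter` is the shared memory and `counter_lock` is its associated synchronization; deleting the lock calls turns the loop into a data race.

```c
#include <pthread.h>
#include <stdio.h>

static long counter = 0;                              /* shared state   */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&counter_lock);            /* protect access */
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld (always 200000 with the lock)\n", counter);
    return 0;
}
```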

Page 20:

The Moral: Be Careful!

• Synchronization is hard
  – Need to consider all possible shared state
  – Must keep locks organized and use them consistently and correctly
• Knowing there are bugs may be tricky; fixing them can be even worse!
• Keeping shared state to a minimum reduces total system complexity

Page 21:

Multiprocessor Synchronization (1)

• To ensure atomic execution of TSL (test and set lock), we have to be able to lock the bus
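A sketch of such a lock in portable C11 (my example, not the slide's code): `atomic_flag_test_and_set` is the standard test-and-set, which the hardware makes atomic by locking the bus or the cache line holding the flag.

```c
#include <stdatomic.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;

void acquire(void) {
    /* TSL: atomically set the flag and return its previous value. */
    while (atomic_flag_test_and_set(&lock_flag))
        ;                            /* previous value 1: keep spinning  */
}

void release(void) {
    atomic_flag_clear(&lock_flag);   /* store 0: the lock is free again  */
}
```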

Page 22:

Multiprocessor Synchronization (2)

• Multiple locks are used to avoid cache thrashing
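One standard realization of this idea is the test-and-test-and-set lock: spin on an ordinary read, which stays in the local cache, and issue the bus-locking atomic operation only when the lock looks free. A sketch (again my example, not from the slides):

```c
#include <stdatomic.h>

static atomic_int lock_word = 0;     /* 0 = free, 1 = held */

void acquire_ttas(void) {
    for (;;) {
        while (atomic_load_explicit(&lock_word, memory_order_relaxed))
            ;                        /* plain read: spins in local cache */
        if (!atomic_exchange(&lock_word, 1))
            return;                  /* old value was 0: lock acquired   */
        /* someone beat us to it: go back to the read-only spin */
    }
}

void release_ttas(void) {
    atomic_store(&lock_word, 0);
}
```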

Page 23:

Multiprocessor Synchronization (3)

• Spinning versus switching
• In some cases the CPU must wait
  – e.g., it waits to acquire the ready list
• In other cases a choice exists
  – spinning wastes CPU cycles
  – switching uses up CPU cycles too
  – it is possible to make a separate decision each time a locked mutex is encountered (see the sketch below)
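A hedged sketch of that per-encounter decision: spin for a bounded number of iterations, then give up the CPU rather than burn more cycles. User code can only `sched_yield`; a kernel would block the thread and switch. The bound below is illustrative, not tuned:

```c
#include <stdatomic.h>
#include <sched.h>

#define SPIN_LIMIT 1000              /* illustrative bound, not tuned */

void acquire_adaptive(atomic_int *lock) {
    for (;;) {
        for (int i = 0; i < SPIN_LIMIT; i++)
            if (!atomic_load(lock) && !atomic_exchange(lock, 1))
                return;              /* got the lock while spinning */
        sched_yield();               /* spinning is losing: switch  */
    }
}
```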

Page 24:

Multiprocessor Scheduling (1): Time Sharing

• Scheduling of independent processes
  – use of a single data structure for scheduling
• Priorities and quantum expiration need to be carefully considered
  – What are the differences between a uniprocessor and a multiprocessor when the quantum expires?
• Smart scheduling, affinity scheduling / two-level scheduling (see the sketch below)
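Affinity can also be requested from user space on Linux; `sched_setaffinity` pins a thread to a CPU so it keeps its warm cache, the same intuition affinity/two-level scheduling applies inside the kernel. A minimal sketch:

```c
#define _GNU_SOURCE                   /* for CPU_SET and sched_getcpu */
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);                              /* allow only CPU 0   */
    if (sched_setaffinity(0, sizeof(set), &set))   /* 0 = calling thread */
        perror("sched_setaffinity");
    printf("now running on CPU %d\n", sched_getcpu());
    return 0;
}
```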

Page 25:

Multiprocessor Scheduling (2): Space Sharing

• Used when processes are related to each other, or for a process consisting of multiple threads
• Simple algorithm:
  – all threads are created at the same time,
  – each thread is given a dedicated CPU, and all threads start at the same time across multiple CPUs,
  – if there are not enough CPUs to start all threads, they wait until enough CPUs become available
• Consider a central server that keeps track of available CPUs (see the sketch below)
• Advantage: no multiprogramming
• Disadvantage: CPU time is wasted when a thread blocks, since its dedicated CPU sits idle
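A sketch of that central server's core logic (names and the machine size are made up): a job with k threads is admitted only when k CPUs are free, and those CPUs stay dedicated until the job finishes.

```c
#include <pthread.h>

static pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
static int free_cpus = 8;            /* assumed machine size */

void acquire_cpus(int k) {           /* called when a k-thread job arrives */
    pthread_mutex_lock(&m);
    while (free_cpus < k)            /* not enough CPUs: wait              */
        pthread_cond_wait(&cv, &m);
    free_cpus -= k;                  /* dedicate k CPUs to the job         */
    pthread_mutex_unlock(&m);
}

void release_cpus(int k) {           /* called when the job finishes       */
    pthread_mutex_lock(&m);
    free_cpus += k;
    pthread_cond_broadcast(&cv);     /* wake waiting jobs                  */
    pthread_mutex_unlock(&m);
}
```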

Page 26:

Multiprocessor Scheduling (3): Gang Scheduling

• Problem: communication between two threads
  – both belong to process A
  – both run out of phase: when one is scheduled, the other is not, so each request waits for the partner's next time slice

Page 27:

Multiprocessor Scheduling (4): Gang Scheduling

• Solution: gang scheduling
  1. Groups of related threads are scheduled as a unit, a gang
  2. All members of a gang run simultaneously on different timeshared CPUs
  3. All gang members start and end their time slices together

Page 28:

Multiprocessor Scheduling (5): Gang Scheduling
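The figure on this slide is not reproduced in the transcript. An illustrative, made-up schedule in its spirit, with gangs A (4 threads), B (3 threads), and C (1 thread) on a 4-CPU machine:

```
Time slot   CPU 0   CPU 1   CPU 2   CPU 3
    0        A0      A1      A2      A3     <- all of gang A together
    1        B0      B1      B2      C0
    2        A0      A1      A2      A3
    3        B0      B1      B2      C0
```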

Page 29:

Conclusion

• Multiprocessors
  – Popular and attractive
  – Offer a simple communication model: all CPUs share a common memory
  – Processes can write messages to memory that can then be read by other processes
  – Synchronization uses mutexes, semaphores, monitors, and other well-established schemes
• Problem:
  – Large multiprocessors are difficult to build, and they are expensive