Parallel Processing: Architecture Overview
Rajkumar Buyya
Grid Computing and Distributed Systems (GRIDS) Lab, The University of Melbourne, Melbourne, Australia
www.gridbus.org

Dec 22, 2015

Page 1: Title slide — Parallel Processing: Architecture Overview

Page 2: Serial vs. Parallel

[Figure: customers queuing at a single COUNTER (serial) versus the same queue split across COUNTER 1 and COUNTER 2 (parallel).]
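The counter analogy above can be sketched numerically. This is a minimal illustration, not from the slides; `service_time` is a hypothetical helper that assumes every customer takes the same fixed time and the queue splits evenly across counters:

```python
import math

def service_time(customers: int, counters: int, per_customer: float = 1.0) -> float:
    # Total time to drain the queue when it is split evenly across counters
    # and every customer takes the same fixed amount of time.
    return math.ceil(customers / counters) * per_customer

serial = service_time(8, 1)    # one counter serves all 8 customers
parallel = service_time(8, 2)  # two counters halve the waiting time
```

With 8 customers, one counter needs 8 time units while two counters need only 4 — the essence of the serial-versus-parallel contrast.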

Page 3: Overview of the Talk

- Introduction
- Why Parallel Processing?
- Parallel System H/W Architecture
- Parallel Operating Systems

Page 4: Computing Elements

[Figure: layered view of computing elements — Applications on top of Programming paradigms, a Threads interface, the Operating system (microkernel), and Hardware (a multi-processor computing system); the basic units are the process, the processor, and the thread.]

Page 5: Two Eras of Computing

[Figure: timeline from 1940 to 2030. The Sequential Era and then the Parallel Era each progress through Architectures, System Software/Compilers, Applications, and Problem-Solving Environments (P.S.Es), and each moves through R&D, Commercialization, and Commodity phases.]

Page 6: History of Parallel Processing

The notion of parallel processing can be traced to a tablet dated around 100 BC. The tablet has three calculating positions capable of operating simultaneously. From this we can infer that they were aimed at "speed" or "reliability".

Page 7: Motivating Factors

Just as we learned to fly not by constructing a machine that flaps its wings like a bird, but by applying the aerodynamic principles demonstrated by nature, parallel processing has been modeled on biological species. The aggregate speed with which complex calculations are carried out by (billions of) neurons demonstrates the feasibility of parallel processing, even though an individual neuron's response is slow (on the order of milliseconds).

Page 8: Why Parallel Processing?

Computation requirements are ever increasing: visualization, distributed databases, simulations, scientific prediction (e.g., earthquakes), etc. Silicon-based (sequential) architectures are reaching their physical processing limits, constrained by the speed of light and thermodynamics.

Page 9: Human Architecture! Growth Performance

[Figure: growth versus age (5 to 45 years and beyond) — vertical growth early in life, horizontal growth later.]

Page 10: Computational Power Improvement

[Figure: computational power improvement (C.P.I.) versus the number of processors — a multiprocessor curve rising with processor count against a flat uniprocessor baseline.]

Page 11: Why Parallel Processing?

Hardware improvements such as pipelining and superscalar execution are not scaling well and require sophisticated compiler technology to extract performance from them. Techniques such as vector processing work well only for certain kinds of problems.

Page 12: Why Parallel Processing?

Significant developments in networking technology are paving the way for network-based, cost-effective parallel computing. Parallel processing technology is mature and is being exploited commercially.

Page 13: Parallel Programs

Parallel programs consist of multiple active "processes" simultaneously solving a given problem. The communication and synchronization between these parallel processes form the core of parallel programming effort.
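As a rough sketch of this idea (using Python threads in place of true parallel processes, an assumption not made by the slides), two workers each solve part of a summation and communicate their partial results back through a shared queue:

```python
import queue
import threading

def worker(chunk, out: "queue.Queue"):
    # Each parallel "process" (here a thread) computes a partial result
    # and communicates it back through a shared queue.
    out.put(sum(chunk))

data = list(range(100))
out = queue.Queue()
threads = [threading.Thread(target=worker, args=(data[i::2], out)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()                      # synchronization: wait for both workers
total = out.get() + out.get()     # communication: collect the partial sums
```

The join/get pair is exactly the communication-and-synchronization effort the slide describes.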

Page 14: Types of Parallel Systems

Tightly Coupled Systems:
- Shared Memory Parallel: smallest extension to existing systems; program conversion is incremental.
- Distributed Memory Parallel: completely new systems; programs must be reconstructed.

Loosely Coupled Systems:
- Clusters: built using commodity systems; centralised management.
- Grids: aggregation of distributed systems; decentralised management.

Page 15: Processing Elements Architecture

Page 16: Processing Elements

Flynn proposed a classification of computer systems based on the number of instruction and data streams that can be processed simultaneously:
- SISD (Single Instruction, Single Data): conventional computers
- SIMD (Single Instruction, Multiple Data): data-parallel, vector computing machines
- MISD (Multiple Instruction, Single Data): systolic arrays
- MIMD (Multiple Instruction, Multiple Data): general-purpose machines

Page 17: SISD — A Conventional Computer

Speed is limited by the rate at which the computer can transfer information internally.

[Figure: a single processor with one instruction stream, one data input, and one data output.]

Ex: PCs, workstations.

Page 18: The MISD Architecture

More of an intellectual exercise than a practical configuration; a few were built, but none is commercially available.

[Figure: a single data input stream passing through Processors A, B, and C, each driven by its own instruction stream (A, B, C), producing a single data output stream.]

Page 19: SIMD Architecture

Ex: Cray vector processing machines, Thinking Machines CM*, Intel MMX (multimedia support).

Ci <= Ai * Bi

[Figure: a single instruction stream driving Processors A, B, and C, each with its own data input stream (A, B, C) and data output stream (A, B, C).]
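The Ci <= Ai * Bi operation can be mimicked in plain Python; this is only an illustration of the single-instruction, multiple-data idea, not real SIMD hardware:

```python
# One "instruction" (multiply) is applied uniformly across all data elements;
# in a real SIMD machine each element pair would go to its own processor.
A = [2, 3, 4]
B = [5, 6, 7]
C = [a * b for a, b in zip(A, B)]
```

Every element of C is produced by the same operation on independent data pairs, which is what lets the hardware execute them in lockstep.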

Page 20: MIMD Architecture

Unlike SISD and MISD machines, a MIMD computer works asynchronously.
- Shared memory (tightly coupled) MIMD
- Distributed memory (loosely coupled) MIMD

[Figure: Processors A, B, and C, each with its own instruction stream (A, B, C), data input stream (A, B, C), and data output stream (A, B, C).]

Page 21: Shared Memory MIMD Machine

Communication: a source PE writes data to global memory and the destination PE retrieves it.
- Easy to build; conventional SISD operating systems can easily be ported.
- Limitation: reliability and expandability — a failure of a memory component or any processor affects the whole system, and increasing the number of processors leads to memory contention.

Ex: Silicon Graphics supercomputers.

[Figure: Processors A, B, and C connected over memory buses to a global memory system.]
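A minimal sketch of the shared-memory style, with Python threads standing in for processing elements (an illustrative simplification, not the slide's hardware): all workers update one "global memory" cell, and a lock serializes access, hinting at the memory contention the slide warns about:

```python
import threading

counter = {"value": 0}      # plays the role of a global memory cell
lock = threading.Lock()     # serializes access; models memory contention

def pe(increments: int):
    # Each processing element communicates by writing to the shared cell.
    for _ in range(increments):
        with lock:
            counter["value"] += 1

threads = [threading.Thread(target=pe, args=(1000,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Adding more workers only lengthens the queue at the lock, which is the software analogue of more processors contending for one memory bus.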

Page 22: Distributed Memory MIMD

Communication: inter-process communication (IPC) over a high-speed network. The network can be configured as a tree, mesh, cube, etc.

Unlike shared memory MIMD, such machines are:
- easily/readily expandable
- highly reliable (a CPU failure does not affect the whole system)

[Figure: Processors A, B, and C, each attached to its own memory system (A, B, C) over a memory bus, connected to one another by IPC channels.]
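The distributed-memory style can be sketched the same way (again with threads as stand-ins for nodes, and queues for the IPC channels; the data values are made up for illustration): each node computes only on its private data and sends its result over its own channel, with no shared memory:

```python
import queue
import threading

def node(local_data, channel: "queue.Queue"):
    # Each node computes on its own private memory, then sends the
    # result over an IPC channel; no memory is shared between nodes.
    channel.put(max(local_data))

channels = [queue.Queue() for _ in range(3)]
parts = [[4, 9, 1], [7, 2, 8], [3, 6, 5]]
nodes = [threading.Thread(target=node, args=(parts[i], channels[i]))
         for i in range(3)]
for n in nodes:
    n.start()
for n in nodes:
    n.join()
overall = max(ch.get() for ch in channels)   # combine the messages
```

Because each node touches only its own partition, adding nodes means adding memory systems and channels rather than contending for one bus — the expandability the slide highlights.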

Page 23: Laws of Caution

- Speed of computation is proportional to the square root of system cost, i.e. S ∝ √C.
- Speedup by a parallel computer increases as the logarithm of the number of processors: Speedup = log2(P).
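A quick numerical reading of the two laws (the helper names are hypothetical and the proportionality constants are taken as 1):

```python
import math

def speed_from_cost(cost: float) -> float:
    # S ∝ √C: speed grows only as the square root of system cost.
    return math.sqrt(cost)

def speedup_from_processors(p: int) -> float:
    # Speedup = log2(P): adding processors gives diminishing returns.
    return math.log2(p)

# Quadrupling the cost only doubles the speed...
ratio = speed_from_cost(4.0) / speed_from_cost(1.0)
# ...and 1024 processors yield a speedup of only 10.
speedup = speedup_from_processors(1024)
```

Both curves flatten quickly, which is exactly why the slide frames them as laws of caution rather than promises.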

Page 24: Caution...

Very fast developments in network computing and related areas have blurred concept boundaries, causing a lot of terminological confusion: concurrent computing, parallel computing, multiprocessing, supercomputing, massively parallel processing, cluster computing, distributed computing, Internet computing, grid computing, etc.

At the user level, even well-defined distinctions such as shared memory versus distributed memory are disappearing due to new advances in technology. Good tools for parallel program development and debugging are yet to emerge.

Page 25: Caution...

There are no strict delimiters for contributors to the area of parallel processing: computer architecture, operating systems, high-level languages, algorithms, databases, computer networks, ... All have a role to play.

Page 26: Operating Systems for High Performance Computing

Page 27: Types of Parallel Systems

- Shared Memory Parallel: smallest extension to existing systems; program conversion is incremental.
- Distributed Memory Parallel: completely new systems; programs must be reconstructed.
- Clusters: a slow-communication form of distributed memory parallel systems.

Page 28: Operating Systems for PP

MPP systems with thousands of processors require an OS radically different from current ones. Every CPU needs an OS to manage its resources and to hide its details. Traditional operating systems are heavy and complex, and are not suitable for MPP.

Page 29: Operating System Models

An OS model is a framework that unifies the features, services, and tasks performed. Three approaches to building an OS:
- Monolithic OS
- Layered OS
- Microkernel-based (client-server) OS — suitable for MPP systems

Simplicity, flexibility, and high performance are crucial for such an OS.

Page 30: Monolithic Operating System

[Figure: application programs in user mode calling into a single kernel-mode block of system services above the hardware.]

Better application performance, but difficult to extend. Ex: MS-DOS.

Page 31: Layered OS

Easier to enhance; each layer of code accesses the interface of the layer below it; low application performance.

[Figure: application programs in user mode; kernel mode split into layers — system services, memory and I/O device management, and process scheduling — above the hardware.]

Ex: UNIX.

Page 32: Traditional OS

[Figure: application programs in user mode on top of a single OS layer (produced by the OS designer) in kernel mode, above the hardware.]

Page 33: New Trend in OS Design

[Figure: application programs and servers in user mode on top of a small microkernel in kernel mode, above the hardware.]

Page 34: Microkernel/Client-Server OS (for MPP Systems)

A tiny OS kernel provides the basic primitives (process, memory, IPC); traditional services become user-level subsystems, with application performance competence comparable to a monolithic OS: OS = Microkernel + User Subsystems.

[Figure: client applications (with thread libraries), a file server, a network server, and a display server in user mode, exchanging send/reply messages through the microkernel, which sits in kernel mode above the hardware.]

Ex: Mach, PARAS, Chorus, etc.
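The send/reply structure in the figure can be sketched with a toy client and file-server subsystem (queues standing in for the microkernel's IPC, and the file name and contents are made up for illustration):

```python
import queue
import threading

requests = queue.Queue()    # "send" channel through the microkernel
replies = queue.Queue()     # "reply" channel back to the client

def file_server():
    # User-level subsystem: serves a single request, then exits.
    op, name = requests.get()
    if op == "read":
        replies.put(f"contents of {name}")

server = threading.Thread(target=file_server)
server.start()
requests.put(("read", "notes.txt"))   # client application sends a message
reply = replies.get()                 # ...and blocks waiting for the reply
server.join()
```

The client never calls the file service directly; everything is message passing, which is what lets the subsystems live outside the kernel.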

Page 35: Few Popular Microkernel Systems

- MACH, CMU
- PARAS, C-DAC
- Chorus
- QNX
- (Windows)