
Copyright © Amnon Barak 2001

The MOSIX Scalable Cluster Computing for Linux

Prof. Amnon Barak
Computer Science
Hebrew University

http://www.mosix.org


Presentation overview

• Part I: Why computing clusters
• Part II: What is MOSIX
• Part III: Parallel file systems
• Part IV: Tools
• Part V: Experience / parallel applications
• Part VI: Summary / current / future projects

Color code: good properties are highlighted in blue, not-so-good properties in red


Computing Clusters are not just for High Performance Computing (HPC)

• Many organizations need powerful systems to run “demanding applications” and for high availability. Some examples:
  • Internet: web servers, ISP, ASP, GRID computing
  • Telephony: scalable switches, billing systems
  • Image/signal proc: DVD, rendering, movies
  • Databases: OLTP, decision support, data warehousing
  • Commercial applications, financial modeling
  • RT, FT, Intensive I/O jobs, large compilations, . . .


Outcome

• Low cost Computing Clusters are replacing traditional super-computers and mainframes, because they can provide good solutions for many demanding applications
• Computing clusters are especially suitable for small and medium organizations that cannot afford expensive systems
  • Made of commodity components
  • Gradual growth - tuned to budget and needs


Advantage and disadvantage of Clusters

• Advantage: the ability to do parallel processing
  • Increased performance and higher availability
• Disadvantage: more complex to use and maintain
• Question: is an SMP (NUMA) a good alternative?
• Answer: by classification of parallel systems using 4 criteria:
  • Ease (complexity) of use, overall performance (resource utilization), cost (affordability), scalability


2 alternatives: SMP vs. CC

• SMP (NUMA): easy to use, all the processors work in harmony, efficient multiprocessing and IPC, supports shared memory, good resource allocation, transparent to applications, expensive when scaled up

• Clusters: low-cost, unlimited scalability, more difficult to use: insufficient “bonds” between nodes - each node works independently, many OS services are locally confined, limited provisions for remote services, inefficient resource utilization, may require modifications to applications


SMP vs. Clusters

Criterion      SMP / NUMA    Cluster
Ease of use    excellent     inconvenient
Utilization    excellent     tedious
Cost           high          excellent
Scalability    limited       unlimited

Better property: SMP / NUMA for ease of use and utilization, Cluster for cost and scalability


Goal of MOSIX

Make cluster computing as efficient and easy to use as an SMP


Without MOSIX: user level control

[Diagram: parallel and sequential applications run through PVM / MPI / RSH on top of independent Linux machines]

• Not transparent to applications
• Rigid management, lower performance


What is MOSIX

• A kernel layer that pools together the cluster-wide resources to provide users’ processes with the illusion of running on one big machine
• Model: Single System (like an SMP)
  • Ease of use - transparent to applications - no need to modify applications
  • Maximal overall performance - adaptive resource management, load-balancing by process migration
  • Overhead-free scalability (unlike SMP)


MOSIX is a unifying kernel layer

[Diagram: parallel and sequential applications, with PVM / MPI / RSH as an optional layer, run on MOSIX, a transparent, unifying kernel layer over independent Linux machines]


A two tier technology

1. Information gathering and dissemination
  • Provides each node with sufficient cluster information
  • Supports scalable configurations
  • Fixed overhead, whether for 16 or 6000 nodes

2. Preemptive process migration
  • Can migrate any process, anywhere, anytime
  • Transparent to applications - no change to user interface
  • Supervised by adaptive algorithms that respond to global resource availability


Tier 1: Information exchange

• All the nodes gather and disseminate information about relevant resources: CPU speed, load, free memory, local/remote I/O, IPC
• Info exchanged in a random fashion (to support scalable configurations and overcome failures)
• Applicable to high volume transaction processing
  • Scalable web servers, telephony, billing
• Scalable storage area cluster (SAN + Cluster) for intensive, parallel access to shared files
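
The random exchange of load information can be illustrated with a small simulation. This is a minimal sketch and not the MOSIX algorithm itself: the `Node` class, the `gossip_round` and `simulate` functions, the `window` size and the age-based freshness rule are all assumptions made for illustration. Each node sends a bounded number of load reports to one randomly chosen peer per round.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """A cluster node with a partial, possibly stale view of other nodes."""
    node_id: int
    load: float
    # Hypothetical structure: node_id -> (load, age of the report in rounds).
    view: dict = field(default_factory=dict)


def gossip_round(nodes, rng, window=4):
    """Each node sends at most `window` load reports to one random peer."""
    # Reports already held grow one round older.
    for node in nodes:
        node.view = {k: (load, age + 1) for k, (load, age) in node.view.items()}
    for sender in nodes:
        peer = rng.choice([n for n in nodes if n is not sender])
        # Message: the sender's own fresh report plus its freshest known reports.
        freshest = sorted(sender.view.items(), key=lambda kv: kv[1][1])
        message = dict(freshest[: window - 1])
        message[sender.node_id] = (sender.load, 0)
        for node_id, (load, age) in message.items():
            if node_id == peer.node_id:
                continue  # a node does not need a report about itself
            known = peer.view.get(node_id)
            if known is None or age < known[1]:
                peer.view[node_id] = (load, age)  # keep the fresher report


def simulate(num_nodes=16, rounds=10, seed=0):
    """Run a few rounds with static loads and return the nodes (needs >= 2 nodes)."""
    rng = random.Random(seed)
    nodes = [Node(i, load=rng.uniform(0.0, 4.0)) for i in range(num_nodes)]
    for _ in range(rounds):
        gossip_round(nodes, rng)
    return nodes
```

In this sketch the message size is bounded by `window`, so the cost per node per round does not depend on the number of nodes; a larger cluster only needs more rounds for a report to spread. A node that fails simply stops sending, and its old reports age out of the freshest entries.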


Storage Area Cluster

[Diagram: a cluster whose nodes are connected by a LAN switch and, over Fibre-channel, to a storage area network]

Problem: how to preserve file consistency


Example: parallel “make” at EMC2

• Assign the next compilation to the least loaded node
• A cluster of ~320 CPUs (4-way Xeon nodes)
• Runs 100-150 “builds”, with millions of lines of code, concurrently
• Serial time ~30 hours, cluster time ~3 hours
• Much better performance and unlimited scalability vs. a commercial package, at lower cost
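
The rule of giving the next compilation to the least loaded node can be sketched as a greedy scheduler. This is an illustration only: the function name `assign_least_loaded`, the job costs and the use of accumulated cost as the load measure are assumptions, whereas a real cluster would read live load information.

```python
import heapq


def assign_least_loaded(job_costs, num_nodes):
    """Greedily give each job to the node with the smallest accumulated load.

    Returns (assignment, makespan): assignment[i] is the node chosen for job i,
    makespan is the load of the busiest node after all jobs are placed.
    """
    # Heap of (accumulated_load, node_id); ties go to the lower node id.
    heap = [(0.0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    assignment = []
    for cost in job_costs:
        load, node = heapq.heappop(heap)
        assignment.append(node)
        heapq.heappush(heap, (load + cost, node))
    makespan = max(load for load, _ in heap)
    return assignment, makespan
```

For example, `assign_least_loaded([3, 1, 2, 2], 2)` places the jobs on nodes `[0, 1, 1, 0]` with a makespan of 5, against a serial time of 8. The figures quoted on the slide, ~30 hours serial versus ~3 hours on the cluster, correspond to a speedup of about 10.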


Tier 2: Process migration for

• Load balancing: to improve the overall performance
  • Responds to uneven load distribution
• Memory ushering: to prevent disk paging
  • Migrate processes when RAM is exhausted
• Parallel file operations
  • Bring the process to the data
• Speed of migration (Fast Ethernet): 88.8 Mb/sec
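
As a rough guide to what the quoted migration rate means, the transfer time of a process image can be estimated from its size. This is a back-of-the-envelope sketch that assumes "Mb" means megabits (10^6 bits), that 1 MB is 10^6 bytes, and that any fixed per-migration cost is ignored; the helper name `transfer_seconds` is made up for illustration.

```python
def transfer_seconds(size_mb, rate_mbit_per_sec=88.8):
    """Estimate the time to ship `size_mb` megabytes at the given megabit rate."""
    if rate_mbit_per_sec <= 0:
        raise ValueError("rate must be positive")
    # 8 bits per byte converts megabytes to megabits.
    return size_mb * 8.0 / rate_mbit_per_sec
```

Under these assumptions, moving an 8 MB process takes about 0.72 seconds (`transfer_seconds(8)`), so migration pays off mainly for processes that will keep running for much longer than that.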


Intelligent load-balancing

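Round-Robin placement - IPC overhead
[Diagram: processes A, B, C and D placed on Node 1 and Node 2, with IPC crossing between the nodes]

Optimal assignment - No IPC overhead
[Diagram: processes A, B, C and D placed on Node 1 and Node 2, with communicating processes on the same node]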
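
The difference between the two placements can be made concrete by counting the communicating pairs that end up on different nodes. This is a minimal sketch: the communication pattern (A with B, C with D) is an assumption chosen for illustration, not read from the slide, and `cross_node_ipc` and `round_robin` are hypothetical helper names.

```python
def cross_node_ipc(placement, ipc_pairs):
    """Count communicating pairs whose two processes sit on different nodes."""
    return sum(1 for a, b in ipc_pairs if placement[a] != placement[b])


def round_robin(processes, num_nodes):
    """Place processes on nodes in arrival order, cycling through the nodes."""
    return {p: i % num_nodes for i, p in enumerate(processes)}


# Assumed communication pattern: A talks to B, C talks to D.
IPC_PAIRS = [("A", "B"), ("C", "D")]

# Round-robin ignores communication: A and C land on node 0, B and D on node 1.
rr = round_robin(["A", "B", "C", "D"], 2)

# An IPC-aware placement keeps each communicating pair on one node.
grouped = {"A": 0, "B": 0, "C": 1, "D": 1}
```

With this pattern, `cross_node_ipc(rr, IPC_PAIRS)` is 2 and `cross_node_ipc(grouped, IPC_PAIRS)` is 0. Both placements put two processes on each node, so the CPU load is balanced either way; only the IPC cost differs.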


Single system image model

• Users can “login” to any node in the cluster. This node is the “home-node” for all the user’s processes
• Migrated processes always seem to run at the home node, e.g., “ps” shows all your processes, even if they run elsewhere
• Migrated processes use local resources (at the remote nodes), while interacting with the home-node to access their environment, e.g. to perform I/O
• Drawback: extra overhead for file and I/O operations


Splitting the Linux process

[Diagram: Node 1 and Node 2, each with user-space and kernel levels and a local path; the deputy and the remote are joined by the MOSIX link]

• Process context (remote) is site independent - may migrate
• System context (deputy) is site dependent - must stay at “home”
• Connected by an exclusive link for both synchronous (syscalls) and asynchronous (signals, MOSIX events) interactions
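
The division of labour between the two halves can be sketched as a small dispatcher. This is a toy model, not the kernel implementation: the `Deputy` and `Remote` classes, their methods and the caller-supplied set of locally executable calls are all hypothetical, and the slide does not say which calls fall on which side.

```python
class Deputy:
    """Home-node half: executes site-dependent calls on behalf of the process."""

    def __init__(self, home_node):
        self.home_node = home_node
        self.log = []  # calls that travelled over the link

    def handle(self, call, *args):
        self.log.append(call)
        return (self.home_node, call, args)


class Remote:
    """Migratable half: runs user code and forwards what it cannot do locally."""

    def __init__(self, current_node, deputy, local_calls):
        self.current_node = current_node
        self.deputy = deputy
        # Which calls are site independent is an input here, not a claim.
        self.local_calls = set(local_calls)

    def syscall(self, call, *args):
        if call in self.local_calls:
            return (self.current_node, call, args)  # executed where the process runs
        return self.deputy.handle(call, *args)      # forwarded over the link

    def migrate(self, new_node):
        # Only the remote half moves; the deputy stays at the home node.
        self.current_node = new_node
```

A process created with `Remote("node1", Deputy("node1"), {"compute"})` and then migrated to `"node2"` executes `"compute"` on node2, while a call such as `"open"` is still answered at node1. That round trip is the extra overhead for file and I/O operations.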


Example: MOSIX vs. PVM

Fixed number of processes per node

Random process size with average 8MB

MOSIX is scalable
PVM is not scalable


Direct File System Access (DFSA)

• I/O access through the home node incurs high overhead
• Direct File System Access (DFSA) allows processes to perform file operations directly in the current node - not via the home node
• Available operations: all common file and I/O system-calls on conforming file systems
• Conforming FS: GFS, MOSIX File System (MFS)


DFSA Requirements

• The FS (and symbolic links) are identically mounted on the same-named mount-points (/mfs in all nodes)
• File consistency: a completed operation in one node is seen in any other node
  • Required because a MOSIX process may perform consecutive syscalls from different nodes
• Time-stamp consistency: if file A is modified after B, A's time stamp must be greater than B's time stamp
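
The mount-point requirement suggests a simple test for whether a file operation may bypass the home node. This is a sketch under assumptions: mount tables are modelled as plain dictionaries from mount point to file system type, the names `find_mount` and `can_use_dfsa` are hypothetical, and the consistency requirements are taken for granted for the file systems listed as conforming (GFS and MFS).

```python
import posixpath

# File systems named as conforming in this presentation.
CONFORMING_FS = {"gfs", "mfs"}


def find_mount(path, mounts):
    """Return (mount_point, fs_type) for the longest mount point containing path."""
    path = posixpath.normpath(path)
    best = None
    for mount_point, fs_type in mounts.items():
        mp = posixpath.normpath(mount_point)
        inside = path == mp or path.startswith(mp.rstrip("/") + "/")
        if inside and (best is None or len(mp) > len(best[0])):
            best = (mp, fs_type)
    return best


def can_use_dfsa(path, home_mounts, current_mounts):
    """True if path sits on a conforming FS mounted identically on both nodes."""
    here = find_mount(path, current_mounts)
    home = find_mount(path, home_mounts)
    return here is not None and here == home and here[1] in CONFORMING_FS
```

With `{"/": "ext2", "/mfs": "mfs"}` on both nodes, a path under `/mfs` qualifies for direct access, while a path under `/home` resolves to the non-conforming root file system and would still go through the home node.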


The MOSIX File System (MFS)

• Provides a unified view of all files and all mounted FSs on all nodes, as if they were within a single FS

• Makes all directories and regular files throughout a MOSIX cluster available from all the nodes

• Provides file consistency from different nodes by maintaining one cache at the server (disk) node

• Parallel file access by process migration


Global File System (GFS) with DFSA

• Same as MFS, with local cache over the cluster using a unique locking mechanism
• GFS + process migration combine the advantages of load-balancing with direct disk access from any node - for parallel file operations
• GFS for Linux 2.2 includes support for DFSA
• Not written yet for Linux 2.4


Postmark (heavy FS load)
Client - Server performance

Access Method           Data Transfer Block Size
                        64B    512B   1KB    2KB    4KB    8KB    16KB
Local (in the server)   102.6  102.1  100.0  102.2  100.2  100.2  101.0
MFS                     104.8  104.0  103.9  104.1  104.9  105.5  104.4
NFS                     184.3  169.1  158.0  161.3  156.0  159.5  157.5

* All numbers in seconds
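
The Postmark timings are easier to compare as overhead relative to local access. The sketch below assumes the table is read with block sizes running from 64B to 16KB and with rows Local, MFS and NFS; the names `TIMES` and `overhead_percent` are introduced here for illustration.

```python
BLOCK_SIZES = ["64B", "512B", "1KB", "2KB", "4KB", "8KB", "16KB"]

# Run times in seconds, one value per block size (assumed row/column reading).
TIMES = {
    "Local": [102.6, 102.1, 100.0, 102.2, 100.2, 100.2, 101.0],
    "MFS":   [104.8, 104.0, 103.9, 104.1, 104.9, 105.5, 104.4],
    "NFS":   [184.3, 169.1, 158.0, 161.3, 156.0, 159.5, 157.5],
}


def overhead_percent(method):
    """Extra run time of `method` over local access, in percent per block size."""
    return [
        round(100.0 * (t - base) / base, 1)
        for t, base in zip(TIMES[method], TIMES["Local"])
    ]
```

Under that reading, MFS stays within roughly 2 to 5 percent of local access at every block size, while NFS is roughly 56 to 80 percent slower.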


Parallel Read from a Quad server (4K block size)

[Figure: time (sec), axis from 0 to 300, versus number of nodes (1, 2, 4, 8, 16), for Local, MFS and NFS]


Tools

• Linux kernel debugger - on-line access to the kernel memory, process properties, stack, etc.
• Cluster installation, administration and partition
• Kernel monitor (mon) - displays load, speed, total and used memory and CPU utilization
• JAVA monitor - displays cluster properties
  • CPU load, speed, utilization, memory utilization, CPU temp., fan speed
  • Access via the Internet


Java Monitor

• Leds
• Dials
• Radar
• Pie
• Bars
• Matrix
• Graphs
• Fan
• Tables…


MOSIX is best for

• Multi-user platforms - where many users share the cluster resources

• High performance demanding applications

• Parallel applications - HPC

• Non-uniform clusters - with different speed machines, different memory sizes, etc.


API and implementation

• API: no new system-calls - all done via /proc
• MOSIX 0.9 for Linux 2.2 (since May 1999):
  • 80 new files (40K lines), 109 files modified (7K lines)
  • 3K lines in load-balancing algorithms
• MOSIX 1. for Linux 2.4 (since May 2001):
  • 45 new files (35K lines), 124 files modified (4K lines)
  • 48 user-level files (10K lines)


Our “scalable” configuration


Examples of HPC Applications

• Programming environments: PVM, MPI
• Examples:
  • Protein classification - 20 days on 32 nodes
  • High energy molecular dynamics
  • Quantum dynamics simulations - 100 days non-stop
  • Weather forecasting (MM5)
  • Computational fluid dynamics (CFD)
  • Car crash simulations (parallel Autodyn)


Summary

• MOSIX brings a new dimension to Cluster and GRID Computing
  • Ease of use - like an SMP
  • Overhead-free scalability
  • Near optimal performance
  • May be used over LAN and WAN (GRID)
• Production quality
  • Large production clusters installed


Ongoing projects

• DSM - strict consistency model
• High availability - recovery model
• Migratable sockets - for IPC optimization
• Scalable web server - cluster IP
• Network (cluster) RAM - bring process to data
• Parallel File System


Future projects*

• Port MOSIX to:
  • New platforms, e.g. Itanium, PPC
  • Other OS, e.g. FreeBSD, Mac OS X
• Shared disk file system, e.g. GFS
• Parallel DBMS, e.g. billing systems
• Tune to specific parallel applications, e.g. rendering

* subject to external funding


The MOSIX group

• Small R&D group - 7 people
• Over 20 years of experience in Unix kernel
• Development of 7 generations of MOSIX

• We would like to work with a large company on R&D of cluster technology - not only in Linux

• Possible directions: HPC, telephony, high volume transaction/web servers, rendering, movies, etc.


MOSIX

Thank you for your attention

Questions?

For further details please contact: [email protected]