Open Source Cluster Applications Resources
Open Source Cluster Applications Resources. Overview What is O.S.C.A.R.? History Installation Operation Spin-offs Conclusions.

Dec 28, 2015

Janel Gordon
Transcript
Page 1: Open Source Cluster Applications Resources. Overview What is O.S.C.A.R.? History Installation Operation Spin-offs Conclusions.

Open Source Cluster Applications Resources

Page 2

Overview
- What is O.S.C.A.R.?
- History
- Installation
- Operation
- Spin-offs
- Conclusions

Page 3

History
- CCDK (Community Cluster Development Kit)
- OCG (Open Cluster Group)
- OSCAR (Open Source Cluster Application Resources)
- IBM, Dell, SGI and Intel working closely together
- ORNL (Oak Ridge National Laboratory)

Page 4

First Meeting
- Tim Mattson and Stephen Scott
- Decided on these:
  - That the adoption of clusters for mainstream, high-performance computing is inhibited by a lack of well-accepted software stacks that are robust and easy to use by the general user.
  - That the group embraces the open-source model of software distribution. Anything contributed to the group must be freely distributable, preferably as source code under the Berkeley open-source license.
  - That the group can accomplish its goals by propagating best-known practices built up through many years of hard work by cluster computing pioneers.

Page 5

Initial Thoughts
- Differing architectures (small, medium, large)
- Two paths of progress: R&D and ease of use
- Primarily for non-computer-savvy users
  - Scientists
  - Academics
- Homogeneous system

Page 6

Timeline
- Initial meeting in 2000
- Beta development started the same year
- First distribution, OSCAR 1.0, in 2001 at LinuxWorld Expo in New York City
- Today up to OSCAR 5.1
  - Heterogeneous system
  - Far more robust
  - More user friendly

Page 7

Supported Distributions – 5.0

Distribution and Release      Architecture   Status
Red Hat Enterprise Linux 4    x86            Fully supported
Red Hat Enterprise Linux 4    x86_64         Fully supported
Red Hat Enterprise Linux 4    ia64           Fully supported
Fedora Core 4                 x86            Fully supported
Fedora Core 4                 x86_64         Fully supported
Fedora Core 5                 x86            Fully supported
Fedora Core 5                 x86_64         Fully supported
Mandriva Linux 2006           x86            Fully supported
SUSE Linux 10.0               x86            Fully supported

Page 8

Installation
- Detailed installation notes
- Detailed user guide
- Basic idea:
  - Configure head node (server)
  - Configure image for client nodes
  - Configure network
  - Distribute node images
  - Manage your own cluster!

Page 9

Head Node
- Install by running the ./install_cluster eth1 script
- GUI will auto-launch
- Choose the desired step in the GUI; make sure each step is complete before proceeding to the next one
- All configuration can be done from this system from now on

Page 11

Download
- Subversion is used
- Default is the OSCAR SVN
- Can set up custom SVN
- Allows for up-to-date installation
- Allows for controlled rollouts of multiple clusters
- OPD also has powerful command line functionality (LWP for proxy servers)

Page 12

Select & Configure OSCAR packages
- Customize the server to your liking/needs
- Some packages can be customized
- This step is crucial: the choice of packages can affect performance as well as compatibility

Page 13

Installation of Server Node
- Simply installs the packages that were selected
- Automatically configures the server node
- Now the head (server) node is ready to manage, administer and schedule jobs for its client nodes

Page 14

Build Client Image
- Choose name
- Specify packages within the package file
- Specify distribution
- Be wary of automatic reboot if network boot is manually selected as default

Page 15

Building the Client Image …

Page 16

Define Clients
- This step creates the network structure of the nodes
- It’s advisable to assign IPs based on physical links
- The GUI has shortcomings regarding multiple IP spans
- Incorrect setup can lead to an error during node installation
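One way to follow the advice about assigning IPs by physical links is to derive each address from the node's position, so that address order mirrors cabling order. The sketch below is only an illustration in Python: the 10.0.0.0/24 subnet, the `oscarnode` name prefix and the choice to reserve the first host address for the head node are assumptions, not values taken from the slides or from OSCAR itself.

```python
import ipaddress


def assign_client_ips(subnet, node_count, prefix="oscarnode", first_host=2):
    """Map node names to consecutive IPs so name order matches rack order.

    subnet, prefix and first_host are illustrative assumptions, not OSCAR
    defaults taken from the slides.
    """
    network = ipaddress.ip_network(subnet)
    hosts = list(network.hosts())
    # The first host address is left for the head node; clients start
    # at the first_host-th usable address.
    available = hosts[first_host - 1:]
    if node_count > len(available):
        raise ValueError("subnet too small for the requested number of nodes")
    return {f"{prefix}{i + 1}": str(available[i]) for i in range(node_count)}


plan = assign_client_ips("10.0.0.0/24", 4)
for name, ip in plan.items():
    print(name, ip)
```

Checking the subnet size up front is the point of the sketch: a range that is too small is the kind of mistake that otherwise surfaces only during node installation.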

Page 17

Define Clients

Page 18

Setup Networking
- SIS – System Installation Suite
- SystemImager
- MAC addresses are scanned for
- Must link a MAC to a node
- Must select network boot method (rsync, multicast, bt)
- Must make sure clients support PXE boot or create boot CDs
- Own kernel can be used if the one supplied with SIS does not work
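Linking a MAC to a node amounts to building a one-to-one table between the scanned hardware addresses and the clients defined earlier. The following Python sketch shows the checks such a table needs; the MAC addresses and node names are made up, and this is not SystemImager's own code.

```python
import re

MAC_PATTERN = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$")


def link_macs_to_nodes(scanned_macs, node_names):
    """Pair each scanned MAC with a node name, in order of discovery."""
    # Accept upper case and dash-separated input, store one canonical form.
    normalized = [mac.strip().lower().replace("-", ":") for mac in scanned_macs]
    for mac in normalized:
        if not MAC_PATTERN.match(mac):
            raise ValueError(f"malformed MAC address: {mac}")
    if len(set(normalized)) != len(normalized):
        raise ValueError("the same MAC address was scanned twice")
    if len(normalized) != len(node_names):
        raise ValueError("every node needs exactly one MAC address")
    return dict(zip(node_names, normalized))


links = link_macs_to_nodes(
    ["00:16:3E:00:00:01", "00-16-3e-00-00-02"],  # hypothetical scan results
    ["oscarnode1", "oscarnode2"],                # hypothetical node names
)
print(links)
```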

Page 20

Client Installation and Test
- After the network is properly configured, installation can begin
- All nodes are installed and rebooted
- Once the system imaging is complete, a test can be run to ensure the cluster is working properly
- At this point, the cluster is ready to begin parallel job scheduling

Page 21

Operation
Admin packages are:
- Torque Resource Manager
- Maui Scheduler
- C3
- pfilter
- System Imager Suite
- Switcher Environment Manager
- OPIUM
- Ganglia

Page 22

Operation
Library packages:
- LAM/MPI
- Open MPI
- MPICH
- PVM

Page 23

Torque Resource Manager
- Server on head node
- “mom” daemon on clients
- Handles job submission and execution
- Keeps track of cluster resources
- Has its own scheduler but uses Maui by default
- Commands are not intuitive; documentation must be read
- Derived from OpenPBS
- http://svn.oscar.openclustergroup.org/wiki/oscar:5.1:administration_guide:ch4.1.1_torque_overview
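Because the commands are not intuitive, it can help to see what a job description looks like. Torque jobs are usually written as a shell script whose `#PBS` comment lines request resources. The Python sketch below only renders such a script as text; the job name, resource values and program path are hypothetical, and the directives shown (`-N`, `-l nodes=...:ppn=...`, `-l walltime=...`, `-j oe`) should be checked against the Torque documentation for the installed version.

```python
def render_pbs_script(job_name, nodes, procs_per_node, walltime, command):
    """Return the text of a Torque job script for the given resources."""
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",                          # job name
        f"#PBS -l nodes={nodes}:ppn={procs_per_node}",  # nodes and cores per node
        f"#PBS -l walltime={walltime}",                 # run-time limit
        "#PBS -j oe",                                   # merge stdout and stderr
        "cd $PBS_O_WORKDIR",                            # directory of submission
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: 2 nodes x 2 processes running an MPI program.
script = render_pbs_script("hello", 2, 2, "00:10:00", "mpirun -np 4 ./hello")
print(script)
```

Saved to a file, a script like this would be handed to `qsub` on the head node, and `qstat` would then report its state.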

Page 24

Maui Scheduler
- Handles job scheduling
- Sophisticated algorithms
- Customizable
- Much literature on its algorithms
- A commercial generation of Maui, called Moab, exists
- Accepted as the unofficial HPC standard for scheduling
- http://www.clusterresources.com/pages/resources/documentation.php

Page 25

C3 - Cluster Command and Control
- Developed by ORNL
- Collection of tools for cluster administration
- Commands:
  - cget, cpush, crm, cpushimage
  - cexec, cexecs, ckill, cshutdown
  - cnum, cname, clist
- Cluster configuration file
- http://svn.oscar.openclustergroup.org/wiki/oscar:5.1:administration_guide:ch4.3.1_c3_overview
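Tools such as cexec apply one command across the nodes named in the cluster configuration file. The Python sketch below illustrates that fan-out under the assumption that nodes are written as a bracketed range such as `oscarnode[1-3]`. The node names are hypothetical, the parsing is not C3's actual parser, and no command is executed.

```python
import re

RANGE_PATTERN = re.compile(r"^(?P<prefix>[^\[\]]+)\[(?P<start>\d+)-(?P<end>\d+)\]$")


def expand_node_spec(spec):
    """Expand 'node[1-3]' into ['node1', 'node2', 'node3'].

    A spec without brackets is returned as a single-element list.
    """
    match = RANGE_PATTERN.match(spec)
    if match is None:
        return [spec]
    start, end = int(match.group("start")), int(match.group("end"))
    if start > end:
        raise ValueError(f"empty range in {spec!r}")
    return [f"{match.group('prefix')}{i}" for i in range(start, end + 1)]


def fan_out(command, specs):
    """List the (node, command) pairs a cexec-style tool would run."""
    nodes = [node for spec in specs for node in expand_node_spec(spec)]
    return [(node, command) for node in nodes]


for node, cmd in fan_out("uptime", ["oscarnode[1-3]", "spare"]):
    print(f"{node}: {cmd}")
```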

Page 26

pfilter
- Cluster traffic filter
- By default, client nodes can only send outgoing communications beyond the scope of the cluster
- If it is desirable to open up client nodes, the pfilter config file must be modified

Page 27

System Imager Suite
- Tool for network Linux installations
- Image based; can even chroot into an image
- Also has a database which contains cluster configuration information
- Tied in with C3
- Can handle multiple images per cluster
- Completely automated once the image is created
- http://wiki.systemimager.org/index.php/Main_Page

Page 28

Switcher Environment Manager
- Handles “dot” files
- Does not limit advanced users
- Designed to help non-savvy users
- Has guards in place that prevent system destruction
- Which MPI to use: per-user basis
- Operates on two levels: user and system
- Modules package is included for advanced users (and used by switcher)
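The two levels can be pictured as a simple lookup in which a per-user choice, when present, wins over the system-wide default. The Python sketch below models only that idea; the assumption that the user level takes precedence, the `mpi` tag and the version strings are all illustrative and do not describe switcher's real data files or commands.

```python
def resolve_setting(tag, user_settings, system_settings):
    """Return the user's choice for a tag, else the system-wide default."""
    if tag in user_settings:
        return user_settings[tag]
    if tag in system_settings:
        return system_settings[tag]
    raise KeyError(f"no user or system default for {tag!r}")


system_defaults = {"mpi": "lam-7.1.2"}  # hypothetical system-level default
alice = {"mpi": "openmpi-1.1"}          # hypothetical user-level override
bob = {}                                # no override: falls back to system

print(resolve_setting("mpi", alice, system_defaults))
print(resolve_setting("mpi", bob, system_defaults))
```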

Page 29

OPIUM
- Login is handled by the head node
- Once connection is established, client nodes do not require authentication
- Synchronization is run by root, at intervals
- It stores hash values of the password in the .ssh folder along with a “salt”
- Password changes must be done at the head node, as all changes propagate from there
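Storing a salt next to the hash makes identical passwords produce different stored values while still allowing verification. The Python sketch below is a generic illustration of salted hashing using PBKDF2-SHA256 from the standard library; that algorithm is a stand-in chosen for the example, and nothing here describes the scheme OPIUM actually uses.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using salted PBKDF2-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each new password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```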

Page 30

Ganglia
- Distributed monitoring system
- Low overhead per node
- XML for data representation
- Robust
- Used in most cluster and grid solutions
- http://ganglia.info/papers/science.pdf
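Since the data is represented as XML, any XML parser can read a node's metrics. The Python sketch below parses a hand-written, heavily simplified sample; the element and attribute names are modelled on the output of Ganglia's gmond daemon but should be checked against a real dump, and the host names and values are made up.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written sample; real gmond output carries many more
# attributes and metrics per host.
SAMPLE = """\
<GANGLIA_XML VERSION="3.0" SOURCE="gmond">
  <CLUSTER NAME="oscar-demo">
    <HOST NAME="oscarnode1" IP="10.0.0.2">
      <METRIC NAME="load_one" VAL="0.25" TYPE="float" UNITS=""/>
    </HOST>
    <HOST NAME="oscarnode2" IP="10.0.0.3">
      <METRIC NAME="load_one" VAL="1.50" TYPE="float" UNITS=""/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>
"""


def metric_by_host(xml_text, metric_name):
    """Return {host name: value} for one metric across all hosts."""
    root = ET.fromstring(xml_text)
    values = {}
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == metric_name:
                values[host.get("NAME")] = float(metric.get("VAL"))
    return values


loads = metric_by_host(SAMPLE, "load_one")
print(loads)
```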

Page 31

LAM/MPI
- LAM - Local Area Multicomputer
- LAM initializes the runtime environment on a select number of nodes
- MPI 1 and some of MPI 2
- MPICH2 can be used if installed
- Two-tiered debugging system exists: snapshot and communication log
- Daemon based
- http://www.lam-mpi.org/

Page 32

Open MPI
- Replacement for LAM/MPI
- Same team working on it
- LAM/MPI relegated to upkeep only; all new development in Open MPI
- Much more robust (OS, schedulers)
- Full MPI-2 compliance
- Much higher performance
- http://www.open-mpi.org/

Page 33

PVM – Parallel Virtual Machine
- Same as LAM/MPI
- Can be run outside of the scope of Torque and Maui
- Supports Windows nodes as well
- Much better portability
- Not as robust and powerful as Open MPI
- http://www.csm.ornl.gov/pvm/

Page 34

Spin-offs
- HA-OSCAR - http://xcr.cenit.latech.edu/ha-oscar/
- VMware with OSCAR - http://www.vmware.com/vmtn/appliances/directory/341
- SSI-OSCAR - http://ssi-oscar.gforge.inria.fr/
- SSS-OSCAR - http://www.csm.ornl.gov/oscar/sss/

Page 35

Conclusions
- Future direction
  - Open MPI
  - Windows, Mac OS?