1 Lectures on Grid Computing Tuğba Taşkaya-Temizel January 2006
Page 1: Lecture01.ppt

1

Lectures on Grid Computing

Tuğba Taşkaya-Temizel

January 2006

Page 2: Lecture01.ppt

2

What is Grid? Access Grid

The Access Grid is a collection of familiar resources (projectors, cameras, microphones) linked by networked computers to enable audiovisual collaboration between remote participants: videoconferencing.

Just as importantly, the Access Grid provides interfaces to Grid middleware enabling the creation of new tools for collaborative visualization, data-sharing, remote control of instruments and interaction with other grid resources.

Images:
http://www.cisl.ucar.edu/news/02/features/vislab/trustees5.html
http://www.informatics.bangor.ac.uk/~ade/gallery/ag/IMG_0746

http://www.accessgrid.org/

Page 3: Lecture01.ppt

3

Today’s Program

14:00-15:20 Introduction to Grid (LTA)

15:30-15:50 Access Grid Demo (41AD03)

15:50-16:00 Visit to Grid Environment at our Department (BB02)

16:10-18:00 Laboratory (Registering to the Grid environment) (APLAB2)

Page 4: Lecture01.ppt

4

What is Grid? Power Grids

• A network of high-voltage transmission lines and connections that supply electricity from a number of generating stations to various distribution centres in a country or a region, so that no consumer is dependent on a single station.

http://science.howstuffworks.com/power.htm

Page 5: Lecture01.ppt

5

Grid Computing Everywhere

Business: Sectors like financial services, industrial manufacturing, energy…

Humanitarian works

Research: Health, Aerospace, Astronomy, Finance…

Government

Page 6: Lecture01.ppt

6

Grid Computing

The internet took 20 years to be taken seriously by business. By comparison the grid is happening far more rapidly. (Tom Hawk, IBM)

Insight Research says the worldwide market for grid technology and services is doubling every year and will reach $5 billion by 2008.

Grid computing is just one of the technologies the UK government says, in its latest report, should receive more support and funding. (December 17, 2003)

Page 7: Lecture01.ppt

7

Grid Computing

"We really do believe that grid computing is real," CEO of Hewlett-Packard Carly Fiorina said. "It is driving the R&D in our industry. For the first time our energy is focused on something else than building a killer app or a hot box. We are more focused on making system that combines the best of IT and business. Imagine what is possible." (September 11, 2003)

"The Grid will be the major new direction for IT," said Geoff Brown, technical director for ATS Core Technologies at Oracle. (October 28, 2002)

Page 8: Lecture01.ppt

8

DEFINITIONS: Grid?

GRID:

The Grid is envisaged to be ‘the computing and data management infrastructure that will provide the electronic underpinning for a global society in business, government, science and entertainment’

Berman, Fox and Hey (2003:9)

Page 9: Lecture01.ppt

9

DEFINITIONS: Grid?

GRID:

A virtual information processing environment where the user has the ‘illusion’ of a seamless single-source computing power which is actually distributed.

Page 10: Lecture01.ppt

10

Why should you care?

Ian Foster explains why we should care about Grids in three points:

Vision

Reality

Future

Page 11: Lecture01.ppt

11

Why should you care?

Grid is a disruptive technology [Vision]

It ushers in a virtualized, collaborative, distributed world.

Two interrelated opportunities:
1) Enhance economy, flexibility, access by virtualizing computing resources
2) Deliver entirely new capabilities by integrating distributed resources

Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003

Page 12: Lecture01.ppt

12

Why should you care? Virtualization

Application Virtualization

• Automatically connect applications to services
• Dynamic & intelligent provisioning

Infrastructure Virtualization

• Dynamic & intelligent provisioning
• Automatic failover

Source: The Grid: Blueprint for a New Computing Infrastructure (2nd Edition), 2004

Page 13: Lecture01.ppt

13

Why should you care? Distributed System Integration


UK e-Science Centres

Source: http://www.nesc.ac.uk/centres/

Page 14: Lecture01.ppt

14

Why should you care?

Source: “The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001

The real and specific problem that underlies the Grid concept is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.

Page 15: Lecture01.ppt

15

Why should you care? Terminology

Grid has strong links with “Utility Computing”, “Autonomic Computing” and “Service Oriented Architecture”.


Page 16: Lecture01.ppt

16

Why should you care?

Grid addresses pain points now [Reality]

Grids are built, not bought, but are delivering real benefits in commercial settings

• Low utilization of enterprise resources
• High cost of provisioning for peak demand
• Inadequate resources prevent use of advanced applications
• Lack of information integration

Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003

Page 17: Lecture01.ppt

17

Why should you care? Early Commercial Applications

“Gridified” Infrastructure: Grid Services Market Opportunity 2005

• Financial Services: Derivatives Analysis, Statistical Analysis, Portfolio Risk Analysis
• Manufacturing: Mechanical/Electronic Design, Process Simulation, Finite Element Analysis, Failure Analysis
• LS / Bioinformatics: Cancer Research, Drug Discovery, Protein Folding, Protein Sequencing
• Energy: Seismic Analysis, Reservoir Analysis
• Entertainment: Digital Rendering, Massive Multi-Player Games, Streaming Media
• Other: Web Applications, Weather Analysis, Code Breaking/Simulation, Academic

Sources: IDC, 2000 and Bear Stearns- Internet 3.0 - 5/01 Analysis by SAI

Leading adopters (Oct 2003)*
• Financial services: 31%
• Life sciences: 26%
• Manufacturing: 18%

*Grids 2004: From Rocket Science To Business Service, The 451 Group

Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003

Page 18: Lecture01.ppt

18

Why should you care? Grid Deployment Strategies

A range of excellent commercial & open source products for resource federation
• Federate enterprise computing resources
• Federate enterprise information resources
• Globus Toolkit®: inter-enterprise sharing

But, “Grids are built, not bought”
• Integration with other enterprise systems is needed to deliver complete solution
• Start small & with well-defined ROI case
• Grow based on experience

Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003

Page 19: Lecture01.ppt

19

Data Grids for High Energy Physics

Fastest particle accelerator: Large Hadron Collider

When completed in 2005, CERN's Large Hadron Collider will send protons and ions from hydrogen nuclei rushing through a 17-mile circular tunnel at speeds of up to 52,200,000 miles per hour.

Image courtesy Christian Richters. Source: Wired News

Page 20: Lecture01.ppt

20

Data Grids for High Energy Physics

Image courtesy Harvey Newman, Caltech

[Figure: tiered computing model for LHC data. Labels recoverable from the slide:]
• Online System; Offline Processor Farm (~20 TIPS)
• Tier 0: CERN Computer Centre
• Tier 1: FermiLab (~4 TIPS), France Regional Centre, Italy Regional Centre, Germany Regional Centre
• Tier 2: Caltech (~1 TIPS) and further Tier2 Centres (~1 TIPS each)
• Institutes (~0.25 TIPS), each with a physics data cache
• Tier 4: Physicist workstations
• Link rates shown: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec (or Air Freight, deprecated), ~1 MBytes/sec

Notes on the figure:
• There is a “bunch crossing” every 25 nsecs.
• There are 100 “triggers” per second.
• Each triggered event is ~1 MByte in size.
• Physicists work on analysis “channels”.
• Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
• 1 TIPS is approximately 25,000 SpecInt95 equivalents.
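As a rough check of the figures quoted on this slide, the trigger rate and event size can be multiplied out to give the sustained data rate. The sketch below is only an illustrative calculation: the function names are made up for this example, and decimal units (1 TB = 10^6 MB) are an assumption.

```python
# Back-of-envelope check of the numbers quoted on the slide (illustrative only).
BUNCH_SPACING_S = 25e-9   # one bunch crossing every 25 ns (from the slide)
TRIGGERS_PER_S = 100      # triggered events per second (from the slide)
EVENT_SIZE_MB = 1.0       # ~1 MByte per triggered event (from the slide)


def crossing_rate_hz(spacing_s: float = BUNCH_SPACING_S) -> float:
    """Bunch crossings per second."""
    return 1.0 / spacing_s


def kept_fraction(triggers_per_s: float = TRIGGERS_PER_S,
                  spacing_s: float = BUNCH_SPACING_S) -> float:
    """Fraction of bunch crossings that end up as a triggered event."""
    return triggers_per_s / crossing_rate_hz(spacing_s)


def stored_rate_mb_per_s(triggers_per_s: float = TRIGGERS_PER_S,
                         event_size_mb: float = EVENT_SIZE_MB) -> float:
    """Sustained data rate after triggering, in MBytes per second."""
    return triggers_per_s * event_size_mb


def daily_volume_tb(rate_mb_per_s: float) -> float:
    """Data volume per day, assuming decimal units (1 TB = 1e6 MB)."""
    return rate_mb_per_s * 86_400 / 1e6
```

With the slide's values this gives about 40 million crossings per second, of which 100 are kept, and 100 events/s × 1 MByte = 100 MBytes/sec, which agrees with the ~100 MBytes/sec link rate shown in the figure.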

Page 21: Lecture01.ppt

21

Mathematicians Solve NUG30

Looking for the solution to the NUG30 quadratic assignment problem

An informal collaboration of mathematicians and computer scientists

Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)

NUG30 Solution: 14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23

MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin

Source: Shawn McKee, The Grid: The Future of High Energy Physics Computing? January 7, 2002
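The numbers on this slide can be sanity-checked with a few lines of arithmetic. The sketch below checks that the quoted solution is a valid assignment (each of the 30 values used exactly once) and converts the delivered CPU seconds into an average processor count. The helper names are invented for this example, and reading "7 days" as exactly 7 × 24 hours of wall-clock time is an assumption.

```python
# Illustrative checks on the NUG30 figures quoted on the slide.
NUG30_SOLUTION = [14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25,
                  22, 13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23]


def is_permutation(assignment, n):
    """True if the assignment uses each of 1..n exactly once."""
    return sorted(assignment) == list(range(1, n + 1))


def average_processors(cpu_seconds, days):
    """Average number of busy processors over the run (assumes 24-hour days)."""
    return cpu_seconds / (days * 86_400)
```

Under that assumption, 3.46E8 CPU seconds over 7 days corresponds to roughly 572 processors busy on average, compared with the quoted peak of 1009.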

Page 22: Lecture01.ppt

22

Network for Earthquake Engineering Simulation

NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other

On-demand access to experiments, data streams, computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC

Page 23: Lecture01.ppt

23

The 13.6 TF TeraGrid: Computing at 40 Gb/s

[Figure: TeraGrid sites, each with its own site resources and external network connections. Labelled sites: NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, Argonne. Storage systems shown: HPSS, UniTree.]

TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne www.teragrid.org

Page 24: Lecture01.ppt

24

Why should you care?

An open Grid is to your advantage [Future]

Standards are being defined now that will determine the future of this technology

Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003

Page 25: Lecture01.ppt

25

Grid Vision, Marketing, and Reality

Vision
• Computing & data resources can be shared like content on the Web

Marketing
• Have we got a [Data, compute, knowledge, information, desktop, PC, enterprise, cluster, …] Grid for you!

Reality
• Commercial products mostly noninteroperable
• Open source tools offer de facto standards, but are also far from a complete solution

Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003

Page 26: Lecture01.ppt

26

Standards Matter!

Open, standard protocols
• Enable interoperability
• Avoid product/vendor lock-in
• Enable innovation/competition on end points
• Enable ubiquity

In Grid space, must address how we
• Describe, discover, & access resources
• Monitor, manage, & coordinate resources
• Account & charge for resources

For many different types of resource

Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003

Page 27: Lecture01.ppt

27

Open Grid Services Architecture

Define a service-oriented architecture…
• the key to effective virtualization

…that addresses vital “Grid” requirements
• AKA utility, on-demand, system management, collaborative computing
• in particular, distributed service management

…building on Web services standards
• extending those standards where needed

“The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, Foster, Kesselman, Nick, Tuecke, 2002

Page 28: Lecture01.ppt

28

Latest Step Forward: WS-Resource Framework

A family of six Web services specifications
• A design pattern to specify how to use Web services to access “stateful” components
• Message-based publish-subscribe to Web services

Figure labels: Groups, References, Notification, Faults, Properties, Lifetime

Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
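The two ideas on this slide, access to "stateful" components through a service and message-based publish-subscribe, can be illustrated with a small in-process analogy. This is a plain-Python sketch, not the WS-Resource Framework message formats or any toolkit's API; the class and method names (CounterService, create, get_property, subscribe, add, destroy) are hypothetical.

```python
import itertools
from typing import Callable, Dict, List


class CounterService:
    """A front end that holds no per-client state of its own.

    State lives in separate resources, each addressed by a key that the
    caller passes with every request (loosely, a reference to a resource).
    """

    def __init__(self) -> None:
        self._resources: Dict[str, Dict[str, int]] = {}
        self._subscribers: Dict[str, List[Callable[[str, int], None]]] = {}
        self._ids = itertools.count(1)

    def create(self) -> str:
        """Create a new stateful resource and return its key."""
        key = f"counter-{next(self._ids)}"
        self._resources[key] = {"value": 0}
        self._subscribers[key] = []
        return key

    def get_property(self, key: str, name: str) -> int:
        """Read one property of a resource; unknown keys raise KeyError."""
        return self._resources[key][name]

    def subscribe(self, key: str, callback: Callable[[str, int], None]) -> None:
        """Register a callback to be notified when the resource changes."""
        self._subscribers[key].append(callback)

    def add(self, key: str, amount: int) -> int:
        """Change the resource state and publish the new value to subscribers."""
        self._resources[key]["value"] += amount
        new_value = self._resources[key]["value"]
        for callback in self._subscribers[key]:
            callback(key, new_value)
        return new_value

    def destroy(self, key: str) -> None:
        """End the resource's lifetime explicitly."""
        del self._resources[key]
        del self._subscribers[key]
```

In this analogy the key plays the part of a reference, get_property stands for properties, subscribe and the callbacks for notification, destroy for lifetime, and the KeyError for a fault; the mapping is loose and only meant to make the slide's labels concrete.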

Page 29: Lecture01.ppt

29

WS-Resource Framework Completes Grid-WS Convergence

[Figure: Grid and Web technologies converging on WSRF]
• Grid: GT1, GT2, OGSI
• Web: HTTP, WSDL, WS-*, WSDL 2, WSDM
• Started far apart in apps & tech; have been converging

The definition of WSRF means that Grid and Web communities can move forward on a common base

Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003

Page 30: Lecture01.ppt

30

The Evolution of the GRID

[Figure: performance and QoS increasing from a single device to a global grid, across administrative barriers]
• Systems: Personal Device, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Grid
• Axis: PERFORMANCE + QoS
• Administrative Barriers: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe

Source: www.gridbus.org

Page 31: Lecture01.ppt

31

The Evolution of the GRID

The first generation involved proprietary solutions for sharing high-performance computing resources

The second generation introduced middleware to cope with scale and heterogeneity

The third generation introduced a service-oriented approach leading to commercial projects in addition to the scientific projects now collectively known as e-Science

Page 32: Lecture01.ppt

32

The Evolution of the GRID

The first generation
• FAFNER, I-WAY

The second generation
• Technologies: Globus, Legion
• Distributed object systems (Jini and RMI, the Common Component Architecture Forum)
• Grid resource brokers and schedulers
• Grid portals
• Integrated systems
• Peer-to-Peer computing

The third generation
• Service-oriented architecture (web services, OGSA, Agents)
• Information aspects: relation with the World Wide Web
• Live information systems

Page 33: Lecture01.ppt

33

The Evolution of the GRID

Grid is being developed not only to make distributed resources available to end users but also to co-ordinate such usage for the sharing and aggregation of resources.

Page 34: Lecture01.ppt

34

The Evolution of the GRID

Moore’s law improvements in computing produce highly functional end-systems

The internet and burgeoning wired and wireless networks provide wide-spread connectivity

Changing modes of working and problem solving emphasise teamwork, computation

Network growth produces dramatic changes in topology and geography

Page 35: Lecture01.ppt

35

GRID: Key Issues

Resources: Discovery, Allocation, Scheduling

Availability: Access, Security, Networks

Efficiency: Economy, Management, Administration

Hardware: Computers, Services, Networks

Application: Development, Testing

Page 36: Lecture01.ppt

36

GRID: Key Issues

Sharing

A biochemist will be able to exploit 10,000 computers to screen 100,000 compounds in an hour

1,000 physicists worldwide will be able to pool resources for petaop analyses of petabytes of data

A multidisciplinary analysis in aerospace that couples code and data across geographically distributed organisations may be possible

Civil engineers collaborate to design, execute, and analyse shake table experiments

Climate scientists will be able to visualise, annotate, and analyse terabyte simulation datasets
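The biochemist example can be made concrete with a little arithmetic: 100,000 compounds spread over 10,000 computers is 10 compounds per computer, so each screening has about 6 minutes if the whole batch is to finish in an hour. The sketch below assumes the compounds are independent, cost the same to screen, and that scheduling and data transfer are free; the helper names are made up for this example.

```python
def split_evenly(n_tasks: int, n_workers: int) -> list:
    """Number of tasks per worker, with counts differing by at most one."""
    base, extra = divmod(n_tasks, n_workers)
    return [base + 1 if i < extra else base for i in range(n_workers)]


def minutes_per_task(n_tasks: int, n_workers: int, deadline_minutes: float) -> float:
    """Time budget per task so that the busiest worker meets the deadline."""
    busiest = max(split_evenly(n_tasks, n_workers))
    return deadline_minutes / busiest
```

Real grids rarely meet these assumptions, which is why resource discovery, allocation and scheduling appear among the key issues.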

Page 37: Lecture01.ppt

37

GRID: Key Issues

Sharing: Online Access to Scientific Instruments

[Figure labels: Advanced Photon Source; real-time collection; tomographic reconstruction; archival storage; wide-area dissemination; desktop & VR clients with shared controls]

DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago

Page 38: Lecture01.ppt

38

MORE DEFINITIONS
• Resource
• Network protocol
• Network enabled service
• Application Programming Interface (API)
• Software Development Kit (SDK)
• Syntax

Page 39: Lecture01.ppt

39

MORE DEFINITIONS: Resource

An entity that is to be shared
• E.g., computers, storage, data, software

Does not have to be a physical entity
• E.g., Condor pool, distributed file system, …

Defined in terms of interfaces, not devices
• E.g. schedulers such as LSF and PBS define a compute resource
• Open/close/read/write define access to a distributed file system, e.g. NFS, AFS, DFS
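"Defined in terms of interfaces, not devices" can be shown with the read/write example from this slide. In the sketch below, copy_stream (a name chosen for this example) relies only on read() and write(), so it works with any object offering those calls. An in-memory buffer stands in here for a file on a distributed file system; that substitution is an assumption made to keep the example self-contained.

```python
import io
from typing import BinaryIO


def copy_stream(src: BinaryIO, dst: BinaryIO, chunk_size: int = 4096) -> int:
    """Copy src to dst using only read() and write(); return bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:          # an empty read means end of data
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# The caller never needs to know what kind of device sits behind the interface.
source = io.BytesIO(b"grid data " * 1000)
target = io.BytesIO()
copied = copy_stream(source, target)
```

The same function would accept objects returned by open() on a locally mounted path, since they expose the same read/write interface.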

Page 40: Lecture01.ppt

40

MORE DEFINITIONS: Network protocol

A formal description of message formats and a set of rules for message exchange
• Rules may define sequence of message exchanges
• Protocol may define state-change in endpoint, e.g. file system state change

Good protocols are designed to do one thing
• Protocols can be layered

Examples of protocols
• IP, TCP, TLS (was SSL), HTTP, Kerberos
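The three ingredients named on this slide (a message format, rules for the sequence of exchanges, and state change in the endpoint) can be sketched with a toy protocol. Everything here is hypothetical and invented for illustration: the verbs HELLO, PUT and GET, the newline-terminated ASCII format, and the Endpoint class do not correspond to any real protocol.

```python
class ProtocolError(Exception):
    """Raised when a message does not follow the message format."""


def encode(verb: str, arg: str = "") -> bytes:
    """Message format: ASCII 'VERB arg' terminated by a newline."""
    return f"{verb} {arg}".rstrip().encode("ascii") + b"\n"


def decode(message: bytes) -> tuple:
    """Parse one message into (verb, arg)."""
    text = message.decode("ascii")
    if not text.endswith("\n"):
        raise ProtocolError("message must end with a newline")
    verb, _, arg = text[:-1].partition(" ")
    return verb, arg


class Endpoint:
    """Enforces the exchange rule: HELLO must come before PUT or GET."""

    def __init__(self) -> None:
        self.greeted = False
        self.store = {}          # endpoint state, changed by PUT

    def handle(self, message: bytes) -> bytes:
        verb, arg = decode(message)
        if verb == "HELLO":
            self.greeted = True
            return encode("OK")
        if not self.greeted:
            return encode("ERR", "say HELLO first")
        if verb == "PUT":
            key, _, value = arg.partition("=")
            self.store[key] = value
            return encode("OK")
        if verb == "GET":
            if arg in self.store:
                return encode("VALUE", self.store[arg])
            return encode("ERR", "unknown key")
        return encode("ERR", "unknown verb")
```

The messages are plain bytes, so they could be carried over any lower layer, which is one way to read "protocols can be layered".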

Page 41: Lecture01.ppt

41

MORE DEFINITIONS: Network enabled services

Implementation of a protocol that defines a set of capabilities
• Protocol defines interaction with service
• All services require protocols
• Not all protocols are used to provide services (e.g. IP, TLS)

Examples: FTP and Web servers

Page 42: Lecture01.ppt

42

MORE DEFINITIONS: Application Programming Interface (API)

A specification for a set of routines to facilitate application development

Spec often language specific (or IDL)
• Routine name, number, order and type of arguments; mapping to language constructs
• Behaviour or function of routine

Examples
• GSS API (security), MPI (message passing)

Page 43: Lecture01.ppt

43

MORE DEFINITIONS: Software Development Kit (SDK)

A particular instantiation of an API

SDK consists of libraries and tools
• Provides implementation of API specification

Can have multiple SDKs for an API

Examples of SDKs
• MPICH, Motif Widgets
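The API/SDK distinction (a specification versus one implementation of it, with several implementations possible) can be sketched as an abstract class with two interchangeable concrete classes. The names below (MessageQueueAPI, DequeQueue, ListQueue, relay) are hypothetical and chosen only for this example; they are not part of MPI, MPICH or any other library.

```python
from abc import ABC, abstractmethod
from collections import deque


class MessageQueueAPI(ABC):
    """The 'API': routine names, arguments and behaviour, no implementation."""

    @abstractmethod
    def send(self, message: str) -> None:
        """Append a message to the queue."""

    @abstractmethod
    def receive(self) -> str:
        """Remove and return the oldest message."""


class DequeQueue(MessageQueueAPI):
    """One 'SDK': implements the specification with a deque."""

    def __init__(self) -> None:
        self._items = deque()

    def send(self, message: str) -> None:
        self._items.append(message)

    def receive(self) -> str:
        return self._items.popleft()


class ListQueue(MessageQueueAPI):
    """A second 'SDK' for the same API, built on a plain list."""

    def __init__(self) -> None:
        self._items = []

    def send(self, message: str) -> None:
        self._items.append(message)

    def receive(self) -> str:
        return self._items.pop(0)


def relay(queue: MessageQueueAPI, messages: list) -> list:
    """Application code written against the API only."""
    for message in messages:
        queue.send(message)
    return [queue.receive() for _ in messages]
```

Because relay depends only on the specification, either implementation can be swapped in without changing the application, in the same way that a program written to the MPI specification can be built against MPICH or another MPI implementation.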

Page 44: Lecture01.ppt

44

MORE DEFINITIONS: Syntax

Rules for encoding information, e.g. XML, Condor ClassAds, Globus RSL

Distinct from protocols
• One syntax may be used by many protocols

Syntaxes may be layered
• E.g., Condor ClassAds -> XML -> ASCII
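Layered syntaxes can be illustrated by encoding a small attribute record as XML and then serialising the XML as ASCII bytes, mirroring the ClassAds -> XML -> ASCII chain on this slide. The record layout and the element names "ad" and "attr" are assumptions made for this sketch and are not the actual Condor ClassAd XML format; only Python's standard xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET


def ad_to_xml(ad: dict) -> ET.Element:
    """Layer 1 -> 2: express an attribute/value record as an XML tree."""
    root = ET.Element("ad")
    for name, value in ad.items():
        attr = ET.SubElement(root, "attr", name=name)
        attr.text = str(value)
    return root


def xml_to_ascii(root: ET.Element) -> bytes:
    """Layer 2 -> 3: serialise the XML tree as ASCII-encoded bytes."""
    return ET.tostring(root, encoding="us-ascii")


def ascii_to_ad(data: bytes) -> dict:
    """Reverse both layers: bytes -> XML tree -> record (values as strings)."""
    root = ET.fromstring(data)
    return {attr.get("name"): attr.text for attr in root.findall("attr")}
```

Each layer only needs to understand the one directly beneath it, and the resulting bytes could be carried by any protocol, which is the sense in which syntax is distinct from protocol.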

Page 45: Lecture01.ppt

45

References

Berman F., Fox G., Hey T. (2003) Grid Computing: Making the Global Infrastructure a Reality, Chichester, John Wiley & Sons.

http://www.computing.surrey.ac.uk/courses/csm23/list.html

Page 46: Lecture01.ppt

46

CSM23 Assessment and Weighting

Components of Assessment | Method(s) | Percentage weighting
Annotated bibliography | Students are required to write a 200 word summary of each of 5 key research papers. | 10%
Oral Examination | | 20%
Laboratory Exercise | Students are required to implement small-scale laboratory homework during the semester. | 20%
Project | Students are expected to implement a Grid project and write an IEEE-formatted report about their projects. In addition, the students are asked to give a presentation. | Implementation: 20%, IEEE Report: 20%, Presentation: 10%

Page 47: Lecture01.ppt

47

CSM23 Timetable

Date | Topic | Lecturer
16/01/2006 | Overview and Motivation | Mrs. Tugba Taskaya-Temizel
23/01/2006 | Grid Architecture and Technologies | Mrs. Tugba Taskaya-Temizel
30/01/2006 | Security | Dr. James Heather
6/02/2006 | Parallel Computing | Dr. Roger M A Peel
13/02/2006 | Resource Allocation, Data Management, Information Services and Peer-to-Peer Networks | Mrs. Tugba Taskaya-Temizel
20/02/2006 | Grid Applications | Mrs. Tugba Taskaya-Temizel
27/02, 6/03, 13/03, 20/03 | Seminars | Mrs. Tugba Taskaya-Temizel