1
Lectures on Grid Computing
Tuğba Taşkaya-Temizel, January 2006
2
What is Grid? Access Grid
The Access Grid is a collection of familiar resources (projectors, cameras, microphones) linked by networked computers to enable audiovisual collaboration between remote participants: videoconferencing.
Just as importantly, the Access Grid provides interfaces to Grid middleware enabling the creation of new tools for collaborative visualization, data-sharing, remote control of instruments and interaction with other grid resources.
Images:
http://www.cisl.ucar.edu/news/02/features/vislab/trustees5.html
http://www.informatics.bangor.ac.uk/~ade/gallery/ag/IMG_0746
http://www.accessgrid.org/
3
Today’s Program
14:00-15:20 Introduction to Grid (LTA)
15:30-15:50 Access Grid Demo (41AD03)
15:50-16:00 Visit to Grid Environment at our Department (BB02)
16:10-18:00 Laboratory (Registering to the Grid environment) (APLAB2)
4
What is Grid? Power Grids
• A network of high-voltage transmission lines and connections that supply electricity from a number of generating stations to various distribution centres in a country or a region, so that no consumer is dependent on a single station.
http://science.howstuffworks.com/power.htm
5
Grid Computing Everywhere
Business: Sectors like financial services, industrial manufacturing, energy…
Humanitarian works
Research: Health, Aerospace, Astronomy, Finance…
Government
6
Grid Computing
The internet took 20 years to be taken seriously by business. By comparison the grid is happening far more rapidly. (Tom Hawk, IBM)
Insight Research says the worldwide market for grid technology and services is doubling every year and will reach $5 billion by 2008.
Grid computing is just one of the technologies the UK government says, in its latest report, should receive more support and funding. (December 17, 2003)
7
Grid Computing
"We really do believe that grid computing is real," CEO of Hewlett-Packard Carly Fiorina said. "It is driving the R&D in our industry. For the first time our energy is focused on something else than building a killer app or a hot box. We are more focused on making system that combines the best of IT and business. Imagine what is possible." (September 11, 2003)
"The Grid will be the major new direction for IT," said Geoff Brown, technical director for ATS Core Technologies at Oracle. (October 28, 2002)
8
DEFINITIONS: Grid?
GRID:
The Grid is envisaged to be ‘the computing and data management infrastructure that will provide the electronic underpinning for a global society in business, government, science and entertainment’
Berman, Fox and Hey (2003:9)
9
DEFINITIONS: Grid?
GRID:
A virtual information processing environment where the user has the ‘illusion’ of a seamless single-source computing power which is actually distributed.
10
Why should you care?
Ian Foster explains why we should care about Grids in three points:
Vision
Reality
Future
11
Why should you care?
Grid is a disruptive technology [Vision]
It ushers in a virtualized, collaborative, distributed world.
Two interrelated opportunities
1) Enhance economy, flexibility, access by virtualizing computing resources
2) Deliver entirely new capabilities by integrating distributed resources
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
12
Why should you care? Virtualization
Application Virtualization
• Automatically connect applications to services
• Dynamic & intelligent provisioning
Infrastructure Virtualization
• Dynamic & intelligent provisioning
• Automatic failover
Source: The Grid: Blueprint for a New Computing Infrastructure (2nd Edition), 2004
13
Why should you care? Distributed System Integration
UK e-Science Centres
Source: http://www.nesc.ac.uk/centres/
14
Why should you care?
Source: “The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001
The real and specific problem that underlies the Grid concept is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.
15
Why should you care? Terminology
Grid has strong links with “Utility Computing”, “Autonomic Computing” and “Service Oriented Architecture”.
16
Why should you care? Grid addresses pain points now [Reality]
Grids are built not bought, but are delivering real benefits in commercial settings
• Low utilization of enterprise resources
• High cost of provisioning for peak demand
• Inadequate resources prevent use of advanced applications
• Lack of information integration
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
17
Why should you care? Early Commercial Applications
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
[Figure: “Gridified” Infrastructure. Grid Services Market Opportunity 2005, by sector]
• Financial Services: Derivatives Analysis, Statistical Analysis, Portfolio Risk Analysis
• Manufacturing: Mechanical/Electronic Design, Process Simulation, Finite Element Analysis, Failure Analysis
• LS / Bioinformatics: Cancer Research, Drug Discovery, Protein Folding, Protein Sequencing
• Energy: Seismic Analysis, Reservoir Analysis
• Entertainment: Digital Rendering, Massive Multi-Player Games, Streaming Media
• Other: Web Applications, Weather Analysis, Code Breaking/Simulation, Academic
Sources: IDC, 2000 and Bear Stearns - Internet 3.0 - 5/01, Analysis by SAI
Leading adopters (Oct 2003)*
• Financial services: 31%
• Life sciences: 26%
• Manufacturing: 18%
*Grids 2004: From Rocket Science To Business Service, The 451 Group
18
Why should you care? Grid Deployment Strategies
A range of excellent commercial & open source products for resource federation
• Federate enterprise computing resources
• Federate enterprise information resources
• Globus Toolkit®: inter-enterprise sharing
But, “Grids are built, not bought”
• Integration with other enterprise systems is needed to deliver complete solution
• Start small & with well-defined ROI case
• Grow based on experience
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
19
Data Grids for High Energy Physics
Fastest particle accelerator: Large Hadron Collider
When completed in 2005, CERN's Large Hadron Collider will send protons and ions from hydrogen nuclei rushing through a 17-mile circular tunnel at speeds of up to 52,200,000 miles per hour.
Image courtesy Christian Richters. Source: Wired News
20
Data Grids for High Energy Physics
[Figure: tiered data grid for LHC computing. Image courtesy Harvey Newman, Caltech]
Components shown: Online System; Offline Processor Farm (~20 TIPS); CERN Computer Centre; FermiLab (~4 TIPS); France, Italy and Germany Regional Centres; Tier2 Centres (~1 TIPS each); Caltech (~1 TIPS); Institutes (~0.25 TIPS) with a physics data cache; physicist workstations.
Tier labels shown: Tier 0, Tier 1, Tier 2, Tier 4.
Link rates shown: ~PBytes/sec; ~100 MBytes/sec; ~622 Mbits/sec (or Air Freight, deprecated); ~1 MBytes/sec.
• There is a “bunch crossing” every 25 nsecs.
• There are 100 “triggers” per second.
• Each triggered event is ~1 MByte in size.
• Physicists work on analysis “channels”.
• Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
• 1 TIPS is approximately 25,000 SpecInt95 equivalents.
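The trigger figures on this slide can be checked with a few lines of arithmetic. The sketch below is only a back-of-envelope calculation: it assumes decimal units (1 TByte = 1e6 MBytes) and a trigger that runs continuously, neither of which the slide states. All names are illustrative.

```python
# Back-of-envelope check of the trigger figures quoted on the slide.
BUNCH_CROSSING_INTERVAL_S = 25e-9   # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100           # events kept per second
EVENT_SIZE_MBYTES = 1.0             # ~1 MByte per triggered event
SECONDS_PER_DAY = 24 * 60 * 60


def crossing_rate_hz(interval_s: float) -> float:
    """Bunch crossings per second for a given spacing."""
    return 1.0 / interval_s


def triggered_rate_mbytes_per_s(triggers_per_s: float, event_mbytes: float) -> float:
    """Data rate leaving the trigger, in MBytes per second."""
    return triggers_per_s * event_mbytes


crossings = crossing_rate_hz(BUNCH_CROSSING_INTERVAL_S)
rate = triggered_rate_mbytes_per_s(TRIGGERS_PER_SECOND, EVENT_SIZE_MBYTES)
kept_fraction = TRIGGERS_PER_SECOND / crossings
# Assumption: continuous running and decimal units (1 TByte = 1e6 MBytes).
daily_tbytes = rate * SECONDS_PER_DAY / 1e6

print(f"bunch crossings per second: {crossings:.3g}")
print(f"triggered data rate: {rate:.0f} MBytes/sec")
print(f"fraction of crossings kept: {kept_fraction:.2g}")
print(f"triggered data per day: {daily_tbytes:.2f} TBytes")
```

The product of 100 triggers per second and ~1 MByte per event matches the ~100 MBytes/sec link rate shown in the figure.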
21
Mathematicians Solve NUG30
Looking for the solution to the NUG30 quadratic assignment problem
An informal collaboration of mathematicians and computer scientists
Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
NUG30 Solution: 14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
Source: Shawn McKee, The Grid: The Future of High Energy Physics Computing?, January 7, 2002
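A minimal sketch of what a quadratic assignment problem asks, using a hypothetical 4-facility toy instance. The FLOW and DIST matrices are invented for illustration and are not the NUG30 data. The sketch also checks that the solution printed on this slide is a valid permutation of 1..30, and derives the average processor count implied by 3.46E8 CPU seconds over 7 days. Exhaustive enumeration is only viable at toy sizes; it is not how NUG30 was solved.

```python
from itertools import permutations

# Hypothetical toy instance: FLOW[i][j] is traffic between facilities i and j,
# DIST[a][b] is the distance between locations a and b.
FLOW = [[0, 3, 0, 2],
        [3, 0, 0, 1],
        [0, 0, 0, 4],
        [2, 1, 4, 0]]
DIST = [[0, 22, 53, 53],
        [22, 0, 40, 62],
        [53, 40, 0, 55],
        [53, 62, 55, 0]]


def qap_cost(flow, dist, assignment):
    """Sum of flow(i, j) * distance(location of i, location of j)."""
    n = len(assignment)
    return sum(flow[i][j] * dist[assignment[i]][assignment[j]]
               for i in range(n) for j in range(n))


def brute_force_qap(flow, dist):
    """Try every assignment of facilities to locations (toy sizes only)."""
    return min(permutations(range(len(flow))),
               key=lambda p: qap_cost(flow, dist, p))


best = brute_force_qap(FLOW, DIST)
best_cost = qap_cost(FLOW, DIST, best)

# The solution as printed on the slide: facility-to-location numbers 1..30.
NUG30_SOLUTION = [14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25,
                  22, 13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23]
is_permutation = sorted(NUG30_SOLUTION) == list(range(1, 31))

# Average number of busy processors implied by the slide's figures.
CPU_SECONDS = 3.46e8
WALL_SECONDS = 7 * 24 * 3600
avg_processors = CPU_SECONDS / WALL_SECONDS

print("toy optimum:", best, "cost", best_cost)
print("slide solution is a permutation of 1..30:", is_permutation)
print(f"average processors over 7 days: {avg_processors:.0f}")
```

The average is well below the quoted peak of 1009 processors, which is what one would expect from a pool whose machines come and go.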
22
Network for Earthquake Engineering Simulation
NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
23
The 13.6 TF TeraGrid: Computing at 40 Gb/s
[Figure: TeraGrid/DTF network diagram. Four sites, each with site resources and links to external networks: NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, Argonne. Storage systems labelled HPSS and UniTree.]
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne
www.teragrid.org
24
Why should you care?
An open Grid is to your advantage [Future]
Standards are being defined now that will determine the future of this technology
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
25
Grid Vision, Marketing, and Reality
Vision
• Computing & data resources can be shared like content on the Web
Marketing
• Have we got a [Data, compute, knowledge, information, desktop, PC, enterprise, cluster, …] Grid for you!
Reality
• Commercial products mostly noninteroperable
• Open source tools offer de facto standards, but are also far from a complete solution
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
26
Standards Matter!
Open, standard protocols
• Enable interoperability
• Avoid product/vendor lock-in
• Enable innovation/competition on end points
• Enable ubiquity
In Grid space, must address how we
• Describe, discover, & access resources
• Monitor, manage, & coordinate resources
• Account & charge for resources
For many different types of resource
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
27
Open Grid Services Architecture
Define a service-oriented architecture …
• the key to effective virtualization
… that addresses vital “Grid” requirements
• AKA utility, on-demand, system management, collaborative computing
• in particular, distributed service management
… building on Web services standards
• extending those standards where needed
“The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, Foster, Kesselman, Nick, Tuecke, 2002
28
Latest Step Forward: WS-Resource Framework
• A family of six Web services specifications
• A design pattern to specify how to use Web services to access “stateful” components
• Message-based publish-subscribe to Web services
[Figure: the specification areas: Groups, References, Notification, Faults, Properties, Lifetime]
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
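The design pattern named on this slide (a stateless service giving access to “stateful” components) can be illustrated with a purely conceptual sketch. This is not the WSRF API and involves no SOAP or XML: the class names, method names and the string used as a reference are all hypothetical, chosen only to echo the Properties, References, Notification, Lifetime and Faults areas listed above.

```python
import itertools
from typing import Callable, Dict, List


class UnknownResourceFault(Exception):
    """Raised when a request names a resource that does not (or no longer) exist."""


class StatefulResource:
    """The state itself: named properties plus a list of subscribers."""

    def __init__(self, properties: Dict[str, object]) -> None:
        self.properties = dict(properties)
        self.subscribers: List[Callable[[str, object], None]] = []


class ResourceService:
    """Stateless front end: every call must name the resource it acts on."""

    def __init__(self) -> None:
        self._resources: Dict[str, StatefulResource] = {}
        self._ids = itertools.count(1)

    def _lookup(self, reference: str) -> StatefulResource:
        try:
            return self._resources[reference]
        except KeyError:
            raise UnknownResourceFault(reference) from None

    def create(self, properties: Dict[str, object]) -> str:
        """Create a resource and hand back a reference to it."""
        reference = f"resource-{next(self._ids)}"
        self._resources[reference] = StatefulResource(properties)
        return reference

    def get_property(self, reference: str, name: str) -> object:
        return self._lookup(reference).properties[name]

    def set_property(self, reference: str, name: str, value: object) -> None:
        resource = self._lookup(reference)
        resource.properties[name] = value
        for notify in resource.subscribers:  # publish-subscribe on change
            notify(name, value)

    def subscribe(self, reference: str,
                  callback: Callable[[str, object], None]) -> None:
        self._lookup(reference).subscribers.append(callback)

    def destroy(self, reference: str) -> None:
        """Explicit end of the resource's lifetime."""
        self._lookup(reference)
        del self._resources[reference]


service = ResourceService()
job = service.create({"status": "queued"})
events = []
service.subscribe(job, lambda name, value: events.append((name, value)))
service.set_property(job, "status", "running")
print(service.get_property(job, "status"), events)
service.destroy(job)
```

The point of the pattern is that the service object holds no per-client conversation state; everything a request needs is identified by the reference passed with it.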
29
WS-Resource Framework Completes Grid-WS Convergence
The definition of WSRF means that Grid and Web communities can move forward on a common base
[Figure: Grid and Web timelines converging on WSRF. Grid: GT1, GT2, OGSI. Web: HTTP, WSDL, WS-*, WSDL 2, WSDM. Annotations: “Started far apart in apps & tech”, “Have been converging”.]
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
30
The Evolution of the GRID
[Figure: performance + QoS growing from Personal Device, through SMPs or SuperComputers, Local Cluster and Enterprise Cluster/Grid, to Global Grid, crossing administrative barriers: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe]
Source: www.gridbus.org
31
The Evolution of the GRID
The first generation involved proprietary solutions for sharing high-performance computing resources
The second generation introduced middleware to cope with scale and heterogeneity
The third generation introduced a service-oriented approach leading to commercial projects in addition to the scientific projects now collectively known as e-Science
32
The Evolution of the GRID
The first generation
• FAFNER, I-WAY
The second generation
• Technologies: Globus, Legion
• Distributed object systems (Jini and RMI, the Common Component Architecture Forum)
• Grid resource brokers and schedulers
• Grid portals
• Integrated systems
• Peer-to-Peer computing
The third generation
• Service-oriented architecture (web services, OGSA, Agents)
• Information aspects: relation with the World Wide Web
• Live information systems
33
The Evolution of the GRID
Grid is being developed not only to make distributed resources available to the end-user but also to co-ordinate such usage for sharing and aggregation of resources.
34
The Evolution of the GRID
Moore’s law improvements in computing produce highly functional end-systems
The internet and burgeoning wired and wireless networks provide wide-spread connectivity
Changing modes of working and problem solving emphasise teamwork, computation
Network growth produces dramatic changes in topology and geography
35
GRID: Key Issues
Resources: Discovery, Allocation, Scheduling
Availability: Access, Security, Networks
Efficiency: Economy, Management, Administration
Hardware: Computers, Services, Networks
Application: Development, Testing
36
GRID: Key Issues: Sharing
A biochemist will be able to exploit 10,000 computers to screen 100,000 compounds in an hour
1,000 physicists worldwide will be able to pool resources for petaop analyses of petabytes of data
A multidisciplinary analysis in aerospace that couples code and data in geographically distributed organisations may be possible
Civil engineers collaborate to design, execute, and analyse shake table experiments
Climate scientists will be able to visualise, annotate, and analyse terabyte simulation datasets
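The first sharing example implies a per-machine workload that is easy to work out. The sketch below assumes that compounds are independent, that work divides evenly, and that there is no scheduling or transfer overhead; none of these assumptions is stated on the slide, and the function name is illustrative.

```python
def per_machine_load(compounds: int, machines: int, window_minutes: float = 60.0):
    """Return (compounds per machine, minutes available per compound)."""
    per_machine = compounds / machines
    minutes_each = window_minutes / per_machine
    return per_machine, minutes_each


per_machine, minutes_each = per_machine_load(100_000, 10_000)
# Time for one machine to do the whole job at the same per-compound pace.
serial_days = 100_000 * minutes_each / 60 / 24

print(f"{per_machine:.0f} compounds per machine, {minutes_each:.0f} minutes each")
print(f"a single machine would need about {serial_days:.0f} days")
```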
37
GRID: Key Issues: Sharing: Online Access to Scientific Instruments
[Figure: Advanced Photon Source pipeline: real-time collection, tomographic reconstruction, archival storage, wide-area dissemination, desktop & VR clients with shared controls]
DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago
38
MORE DEFINITIONS
• Resource
• Network protocol
• Network enabled service
• Application Programming Interface (API)
• Software Development Kit (SDK)
• Syntax
39
MORE DEFINITIONS: Resource
An entity that is to be shared
• E.g., computers, storage, data, software
Does not have to be a physical entity
• E.g., Condor pool, distributed file system, …
Defined in terms of interfaces, not devices
• E.g. a scheduler such as LSF or PBS defines a compute resource
• Open/close/read/write define access to a distributed file system, e.g. NFS, AFS, DFS
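“Defined in terms of interfaces, not devices” can be illustrated with a small sketch: the caller below only relies on a read/write interface and cannot tell whether the storage behind it is a disk or a dictionary in memory. The Storage interface and both implementations are hypothetical stand-ins, not part of any Grid toolkit.

```python
import os
import tempfile
from typing import Dict, Protocol


class Storage(Protocol):
    """The interface that defines the resource, whatever backs it."""

    def write(self, name: str, data: bytes) -> None: ...
    def read(self, name: str) -> bytes: ...


class LocalDiskStorage:
    """Backed by files under a directory."""

    def __init__(self, root: str) -> None:
        self.root = root

    def write(self, name: str, data: bytes) -> None:
        with open(os.path.join(self.root, name), "wb") as handle:
            handle.write(data)

    def read(self, name: str) -> bytes:
        with open(os.path.join(self.root, name), "rb") as handle:
            return handle.read()


class InMemoryStorage:
    """Not a physical device at all, yet the same resource to its users."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def write(self, name: str, data: bytes) -> None:
        self._blobs[name] = data

    def read(self, name: str) -> bytes:
        return self._blobs[name]


def round_trip(store: Storage, name: str, data: bytes) -> bytes:
    """Application code: depends on the interface only."""
    store.write(name, data)
    return store.read(name)


with tempfile.TemporaryDirectory() as root:
    disk_result = round_trip(LocalDiskStorage(root), "sample.dat", b"grid")
memory_result = round_trip(InMemoryStorage(), "sample.dat", b"grid")
print(disk_result == memory_result)
```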
40
MORE DEFINITIONS: Network protocol
A formal description of message formats and a set of rules for message exchange
• Rules may define sequence of message exchanges
• Protocol may define state-change in endpoint, e.g. file system state change
Good protocols are designed to do one thing
• Protocols can be layered
Examples of protocols
• IP, TCP, TLS (was SSL), HTTP, Kerberos
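The two halves of this definition, a message format and rules for exchange, can be shown with an invented toy protocol. The 3-byte header layout, the HELLO/DATA/BYE message types and the rule that HELLO must come first are all assumptions made up for this sketch; they do not correspond to any of the real protocols listed above.

```python
import struct

# Message format: 1-byte type, 2-byte payload length (network byte order), payload.
HELLO, DATA, BYE = 1, 2, 3
HEADER = struct.Struct("!BH")


def encode(msg_type: int, payload: bytes = b"") -> bytes:
    """Build one frame according to the message format."""
    return HEADER.pack(msg_type, len(payload)) + payload


def decode(frame: bytes):
    """Split a frame into (type, payload), rejecting truncated frames."""
    msg_type, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return msg_type, payload


class Endpoint:
    """Exchange rules: HELLO opens, DATA is only valid while open, BYE closes."""

    def __init__(self) -> None:
        self.state = "closed"
        self.received = []

    def handle(self, frame: bytes) -> None:
        msg_type, payload = decode(frame)
        if msg_type == HELLO and self.state == "closed":
            self.state = "open"            # state change in the endpoint
        elif msg_type == DATA and self.state == "open":
            self.received.append(payload)
        elif msg_type == BYE and self.state == "open":
            self.state = "closed"
        else:
            raise ValueError(f"message type {msg_type} not allowed in state {self.state}")


endpoint = Endpoint()
for frame in (encode(HELLO), encode(DATA, b"hi"), encode(BYE)):
    endpoint.handle(frame)
print(endpoint.state, endpoint.received)
```

Sending DATA before HELLO raises an error, which is the “sequence of message exchanges” rule in action.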
41
MORE DEFINITIONS: Network enabled services
Implementation of a protocol that defines a set of capabilities
• Protocol defines interaction with service
• All services require protocols
• Not all protocols are used to provide services (e.g. IP, TLS)
Examples: FTP and Web servers
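A service, in this sense, is what sits behind a protocol and gives it capabilities. The sketch below assumes an invented line-based text protocol (PUT key value, GET key) and a hypothetical key-value capability; the network transport is deliberately left out so the example stays self-contained.

```python
class KeyValueService:
    """Implements the toy protocol; its capability is storing and fetching values."""

    def __init__(self) -> None:
        self._data = {}

    def handle(self, request: str) -> str:
        """Map one protocol request line to one response line."""
        parts = request.strip().split(" ", 2)
        command = parts[0].upper()
        if command == "PUT" and len(parts) == 3:
            self._data[parts[1]] = parts[2]
            return "OK"
        if command == "GET" and len(parts) == 2:
            if parts[1] in self._data:
                return "OK " + self._data[parts[1]]
            return "ERR not found"
        return "ERR bad request"


service = KeyValueService()
for line in ("PUT site Guildford", "GET site", "GET missing", "DELETE site"):
    print(line, "->", service.handle(line))
```

A client only needs to know the protocol to use the service; how the values are stored is invisible to it.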
42
MORE DEFINITIONS: Application Programming Interface (API)
A specification for a set of routines to facilitate application development
Spec often language specific (or IDL)
• Routine name, number, order and type of arguments; mapping to language constructs
• Behaviour or function of routine
Examples
• GSS API (security), MPI (message passing)
43
MORE DEFINITIONS: Software Development Kit (SDK)
A particular instantiation of an API
SDK consists of libraries and tools
• Provides implementation of API specification
Can have multiple SDKs for an API
Examples of SDKs
• MPICH, Motif Widgets
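The API/SDK distinction can be sketched as one specification with two interchangeable implementations. The message-passing interface below is a hypothetical two-routine toy, far smaller than MPI and not modelled on its actual routines; the two classes stand in for different SDKs of the same API.

```python
import queue
from abc import ABC, abstractmethod
from collections import deque


class MessagePassingAPI(ABC):
    """The API: routine names, argument order and types, intended behaviour."""

    @abstractmethod
    def send(self, dest: int, message: str) -> None:
        """Deliver message to the mailbox of rank dest."""

    @abstractmethod
    def recv(self, rank: int) -> str:
        """Return the oldest undelivered message for rank."""


class DequeSDK(MessagePassingAPI):
    """One implementation, built on collections.deque."""

    def __init__(self, size: int) -> None:
        self._boxes = [deque() for _ in range(size)]

    def send(self, dest: int, message: str) -> None:
        self._boxes[dest].append(message)

    def recv(self, rank: int) -> str:
        return self._boxes[rank].popleft()


class QueueSDK(MessagePassingAPI):
    """A second implementation of the same API, built on queue.Queue."""

    def __init__(self, size: int) -> None:
        self._boxes = [queue.Queue() for _ in range(size)]

    def send(self, dest: int, message: str) -> None:
        self._boxes[dest].put(message)

    def recv(self, rank: int) -> str:
        return self._boxes[rank].get_nowait()


def relay(api: MessagePassingAPI, size: int) -> str:
    """Application code written against the API: pass a token along ranks 0..size-1."""
    api.send(0, "token")
    for rank in range(size - 1):
        api.send(rank + 1, api.recv(rank) + f">{rank}")
    return api.recv(size - 1)


print(relay(DequeSDK(3), 3))
print(relay(QueueSDK(3), 3))
```

Because relay is written against the specification, it runs unchanged on either implementation.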
44
MORE DEFINITIONS: Syntax
Rules for encoding information, e.g. XML, Condor ClassAds, Globus RSL
Distinct from protocols
• One syntax may be used by many protocols
Syntaxes may be layered
• E.g., Condor ClassAds -> XML -> ASCII
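Layered syntaxes can be sketched as successive encodings: an attribute list is written as XML, and the XML text is written as ASCII bytes. The attribute names and the ad/attr element layout below are invented for illustration; this is not the actual Condor ClassAd XML representation.

```python
import xml.etree.ElementTree as ET


def attrs_to_xml(attrs: dict) -> str:
    """Layer 1 -> 2: encode attribute/value pairs as XML text."""
    root = ET.Element("ad")
    for name, value in attrs.items():
        item = ET.SubElement(root, "attr", name=name)
        item.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_ascii(xml_text: str) -> bytes:
    """Layer 2 -> 3: encode the XML text as ASCII bytes (fails on non-ASCII text)."""
    return xml_text.encode("ascii")


def ascii_to_attrs(raw: bytes) -> dict:
    """Undo both layers; values come back as strings."""
    root = ET.fromstring(raw.decode("ascii"))
    return {item.get("name"): item.text for item in root.findall("attr")}


machine = {"OpSys": "LINUX", "Memory": 512}   # hypothetical attributes
wire = xml_to_ascii(attrs_to_xml(machine))
print(wire)
print(ascii_to_attrs(wire))
```

Each layer only knows the rules of the one beneath it, which is why the same XML syntax can be carried by many different protocols.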
45
References
Berman F., Fox G., Hey T. (2003) Grid Computing: Making the Global Infrastructure a Reality, Chichester, John Wiley & Sons.
http://www.computing.surrey.ac.uk/courses/csm23/list.html
46
CSM23 Assessment and Weighting
Components of Assessment | Method(s) | Percentage weighting
Annotated bibliography | Students are required to write a 200 word summary of each of 5 key research papers. | 10%
Oral Examination | | 20%
Laboratory Exercise | Students are required to implement small-scale laboratory homework during the semester. | 20%
Project | Students are expected to implement a Grid project and write an IEEE formatted report about their projects. In addition, the students are asked to give a presentation. | Implementation: 20%, IEEE Report: 20%, Presentation: 10%
47
CSM23 Timetable
Date | Topic | Lecturer
16/01/2006 | Overview and Motivation | Mrs. Tugba Taskaya-Temizel
23/01/2006 | Grid Architecture and Technologies | Mrs. Tugba Taskaya-Temizel
30/01/2006 | Security | Dr. James Heather
6/02/2006 | Parallel Computing | Dr. Roger M A Peel
13/02/2006 | Resource Allocation, Data Management, Information Services and Peer-to-Peer Networks | Mrs. Tugba Taskaya-Temizel
20/02/2006 | Grid Applications | Mrs. Tugba Taskaya-Temizel
27/02, 6/03, 13/03, 20/03 | Seminars | Mrs. Tugba Taskaya-Temizel