Fundamentals of Grid Computing
IBM Redbooks paper, Viktors Berstis
Presented by: Saeed Ghanbari
What is Grid Computing?
• The term Grid computing originated in the early 1990s as a metaphor for making computer power as easy to access as an electric power grid.
  – The definitive definition of a Grid is provided by Ian Foster in his article "What is the Grid?", which lists three criteria:
    • Computing resources are not administered centrally.
    • Open standards are used.
    • Non-trivial quality of service is achieved.
  – Plaszczak/Wellner define Grid technology as "the technology that enables resource virtualization, on-demand provisioning, and service (resource) sharing between organizations."
  – IBM: "A Grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across multiple administrative domains based on the resources availability, capacity, performance, cost and users' quality-of-service requirements."
Topics to be covered
• What grid computing can do
• Grid concepts and components
• Grid construction
• Using a grid
  – A user’s perspective
  – An administrator’s perspective
  – An application developer’s perspective
What grid computing can do (1)
• Exploiting underutilized resources
  – Computing:
    • Desktops: busy less than 5% of the time
    • Even servers in many organizations are underutilized
  – Unused disk capacity
  – Implications:
    • Jobs must be able to run remotely without undue overhead.
    • The remote machine must meet any special hardware, software, or resource requirements.
• Parallel CPU capacity
  – Subjobs run on different machines
  – Barriers often exist to perfect scalability.
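The scalability barrier mentioned above can be made concrete with Amdahl's law. The slides do not name this model; it is used here only as a standard way to reason about the limit, and the function name and the 95% figure are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, machines: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work can be
    split into subjobs and spread over `machines` grid nodes.

    Assumption: no communication or scheduling overhead, so real grids
    will do worse than this figure.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if machines < 1:
        raise ValueError("machines must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / machines)


if __name__ == "__main__":
    # Hypothetical job in which 95% of the work can run as subjobs.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.95, n), 2))
```

Under this model a job that is 95% parallel can never run more than 20 times faster, however many machines the grid contributes, because the serial 5% always remains.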
What grid computing can do (2)
• Applications
  – Grid-enabled applications
  – No practical tools exist for transforming arbitrary applications to exploit the parallel capabilities of a grid.
What grid computing can do (3)
• Virtual resources and virtual organizations for collaboration
  – More capable than distributed computing
    • Wider audience
    • Open standards, hence highly heterogeneous systems
  – Data, equipment, software, services, licenses, …
  – Several real and virtual organizations
What grid computing can do (3)
• Access to additional resources
  – Special equipment, software, licenses, and other services
• Resource balancing
What grid computing can do (4)
• Reliability
  – Now: redundancy in hardware
  – Future: software
  – Utilize “autonomic computing”
• Management
  – More dispersed IT infrastructure
  – Priority among projects
Grid concepts and components (1)
Types of resources
– Computation
– Storage
  • Primary/secondary storage
  • Mountable networked file system
    – AFS, NFS, DFS, GPFS
  • Capacity increase
  • Uniform name space
  • Data striping
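Data striping, listed above, can be sketched as dealing fixed-size chunks out to storage nodes in round-robin order. This is a minimal in-memory illustration, not how AFS, NFS, DFS, or GPFS implement it; the function names and the default chunk size are assumptions.

```python
def stripe(data: bytes, num_nodes: int, chunk_size: int = 4) -> list[list[bytes]]:
    """Split `data` into chunks and deal them out round-robin.

    Returns one list of chunks per (hypothetical) storage node.
    """
    if num_nodes < 1 or chunk_size < 1:
        raise ValueError("num_nodes and chunk_size must be positive")
    nodes: list[list[bytes]] = [[] for _ in range(num_nodes)]
    for index, start in enumerate(range(0, len(data), chunk_size)):
        nodes[index % num_nodes].append(data[start:start + chunk_size])
    return nodes


def reassemble(nodes: list[list[bytes]]) -> bytes:
    """Read the chunks back in the order they were dealt out."""
    chunks = []
    longest = max((len(node) for node in nodes), default=0)
    for row in range(longest):
        for node in nodes:
            if row < len(node):
                chunks.append(node[row])
    return b"".join(chunks)


if __name__ == "__main__":
    striped = stripe(b"grid computing data", num_nodes=3)
    print(striped)
    print(reassemble(striped))
```

Because consecutive chunks sit on different nodes, a large sequential read can in principle be served by several machines at once.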
Grid concepts and components (2)
Types of resources (cont.)
• Communications
  – Redundant communication paths
• Software and licenses
  – License management software
• Special equipment, capacities, architectures, and policies
  – Different architectures, operating systems, devices, capacities, and equipment
• Jobs and applications
  – An application is a collection of jobs
  – Specific dependencies
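One way to picture an application as a collection of jobs with dependencies is as a directed graph that is released in waves: every job in a wave has all of its prerequisites finished, so the jobs in that wave could run on different machines at once. The sketch below uses Python's standard `graphlib` module; the job names are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group jobs into waves that respect their dependencies.

    `dependencies` maps each job to the set of jobs it must wait for.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    # Hypothetical application made of four jobs.
    app = {
        "simulate_a": {"stage_input"},
        "simulate_b": {"stage_input"},
        "render": {"simulate_a", "simulate_b"},
    }
    for wave in execution_waves(app):
        print(wave)
```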
Grid concepts and components (3)
Types of resources (cont.)
• Scheduling, reservation, and scavenging
  – Scheduler
    • Automatically finds the most appropriate machine on which to run any given job
  – Scavenging
    • An idle machine reports its idle status to the grid management node.
      – SETI@home: Search for Extraterrestrial Intelligence at Home
  – Reserved
    • Dedicated resources
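The scheduler and scavenging ideas above can be sketched together: machines report themselves idle, and the scheduler picks among idle machines that satisfy a job's requirements. Everything here is an illustrative assumption, including the `Machine` and `Job` fields and the choice of "smallest machine that fits" as the meaning of "most appropriate".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Machine:
    name: str
    arch: str
    free_memory_mb: int
    idle: bool = False  # set when the machine reports idle status


@dataclass
class Job:
    name: str
    arch: str
    memory_mb: int


def pick_machine(job: Job, machines: list[Machine]) -> Optional[Machine]:
    """Return the idle machine that fits the job most tightly, or None."""
    candidates = [
        m for m in machines
        if m.idle and m.arch == job.arch and m.free_memory_mb >= job.memory_mb
    ]
    if not candidates:
        return None
    # Assumed policy: best fit, so that larger machines stay free for larger jobs.
    return min(candidates, key=lambda m: m.free_memory_mb)


if __name__ == "__main__":
    pool = [
        Machine("desk-1", "x86_64", 2048, idle=True),
        Machine("desk-2", "x86_64", 8192, idle=True),
        Machine("server-1", "x86_64", 4096, idle=False),
    ]
    chosen = pick_machine(Job("render", "x86_64", 3000), pool)
    print(chosen.name if chosen else "no machine available")
```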
Grid concepts and components (4)
• Intragrid to Intergrid
  – Cluster
    • Same hardware/software
  – Intragrid
    • Heterogeneous machines/software
    • Multiple departments/same organization
  – Intergrid
    • Heterogeneous machines/software
    • Multiple departments/multiple organizations
Grid construction (1)
Grid software components
• Management components
  – Resource accounting
    • Load sensors
  – Resource evaluation
    • Overall usage patterns
  – Autonomic computing
• Donor software
  – Each machine needs to enroll as a member of the grid and install some software that manages the grid’s use of its resources
  – Authentication
  – Monitoring
  – Checkpointing / resuming
• Submission software
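Checkpointing and resuming, listed under donor software above, can be illustrated with a job that saves its progress after every step so that a later run continues where the previous one stopped. This is a toy sketch under stated assumptions: the JSON state file, the function name, and the `stop_after` parameter (which simulates the donor machine reclaiming its resources) are all hypothetical.

```python
import json
import os
import tempfile


def run_with_checkpoints(numbers, checkpoint_path, stop_after=None):
    """Sum `numbers`, saving progress after each item.

    Returns the total when finished, or None if interrupted after
    `stop_after` items in this call.
    """
    state = {"next_index": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the saved position

    processed = 0
    while state["next_index"] < len(numbers):
        if stop_after is not None and processed >= stop_after:
            return None  # simulated preemption
        state["total"] += numbers[state["next_index"]]
        state["next_index"] += 1
        processed += 1
        # Write to a temporary file and rename it, so an interruption
        # during the write cannot leave a half-written checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state["total"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "job.ckpt")
        print(run_with_checkpoints([1, 2, 3, 4, 5], path, stop_after=2))
        print(run_with_checkpoints([1, 2, 3, 4, 5], path))
```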
Grid construction (2)
Grid software components (cont.)
• Distributed grid management
  – Hierarchy of clusters
• Schedulers
  – Job priority system
  – React to immediate load
  – Monitor the progress of scheduled jobs and resubmit them when needed
  – Reservation system
  – Meta-scheduler
• Communications
  – Jobs communicate with each other.
    • The open standard Message Passing Interface (MPI)
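The job priority system and resubmission behaviour listed under schedulers can be sketched with a priority queue. This is a simplified single-process illustration, not the design of any particular grid scheduler; the class name, the "lower number runs first" convention, and the retry limit are assumptions.

```python
import heapq
from itertools import count


class PriorityScheduler:
    """Runs jobs in priority order and resubmits the ones that fail."""

    def __init__(self, max_attempts: int = 3):
        self._heap = []
        self._order = count()  # tie-breaker: first submitted runs first
        self.max_attempts = max_attempts

    def submit(self, name: str, priority: int, attempts: int = 0) -> None:
        # Lower priority number means the job runs earlier.
        heapq.heappush(self._heap, (priority, next(self._order), name, attempts))

    def run_all(self, execute):
        """Call `execute(name)` for each job; it returns True on success."""
        completed, abandoned = [], []
        while self._heap:
            priority, _, name, attempts = heapq.heappop(self._heap)
            if execute(name):
                completed.append(name)
            elif attempts + 1 < self.max_attempts:
                self.submit(name, priority, attempts + 1)  # resubmission
            else:
                abandoned.append(name)
        return completed, abandoned


if __name__ == "__main__":
    remaining_failures = {"flaky": 1}  # hypothetical job that fails once

    def execute(name):
        if remaining_failures.get(name, 0) > 0:
            remaining_failures[name] -= 1
            return False
        return True

    scheduler = PriorityScheduler()
    scheduler.submit("low", priority=5)
    scheduler.submit("flaky", priority=1)
    scheduler.submit("high", priority=1)
    print(scheduler.run_all(execute))
```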
Using a grid: A user’s perspective (1)
• Enrolling and installing grid software
  – Authentication for security purposes
  – Certificate authority
  – Decide which resources to donate to the grid
• Logging onto the grid
  – Grid login ID
Using a grid: A user’s perspective (2)
• Queries and submitting jobs
  – Staging the input data
  – Different architectures: multiple versions of the program
  – Job execution
    • Sandbox
  – Collect results
• Data configuration
  – Data replication
  – Networked file system
    • Caching feature enabled
Using a grid: A user’s perspective (3)
• Monitoring progress and recovery
  – Degree of recovery for subjobs that fail
  – Failures
    • Programming error
    • Hardware or power failure
    • Communications interruption
    • Excessive slowness
  – Recovery
    • Scheduler
    • User
Using a grid: An administrator’s perspective (1)
• Planning
• Installation
• Managing enrollment of donors and users
• Certificate authority
  – It is critical to ensure the highest levels of security in a grid because the grid is designed to execute code and not just share data.
    • Positively identifying entities requesting certificates
    • Issuing, removing, and archiving certificates
    • Protecting the certificate authority server
    • Maintaining a namespace of unique names for certificate owners
    • Serving signed certificates to those needing to authenticate entities
    • Logging activity
Using a grid: An administrator’s perspective (2)
• Resource management
  – Setting permissions
  – Tracking resource usage
  – Implementing a billing system
  – Policies to achieve better utilization
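Tracking resource usage and billing for it, as listed above, can be sketched as aggregating per-user CPU hours and applying a flat rate. The record format, the function name, and the rate are hypothetical; a real grid would pull these figures from its accounting components.

```python
from collections import defaultdict


def bill_usage(usage_records, rate_per_cpu_hour):
    """Total the CPU hours per user and convert them to charges.

    `usage_records` is an iterable of (user, cpu_hours) pairs.
    """
    cpu_hours = defaultdict(float)
    for user, hours in usage_records:
        if hours < 0:
            raise ValueError("cpu_hours cannot be negative")
        cpu_hours[user] += hours
    return {
        user: round(hours * rate_per_cpu_hour, 2)
        for user, hours in cpu_hours.items()
    }


if __name__ == "__main__":
    # Hypothetical usage log and rate.
    records = [("alice", 2.0), ("bob", 1.5), ("alice", 0.5)]
    print(bill_usage(records, rate_per_cpu_hour=0.10))
```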
Using a grid: An application developer’s perspective (1)
• Applications that are not enabled for using multiple processors but can be executed on different machines.
• Applications that are already designed to use the multiple processors of a grid setting.
• Applications that need to be modified or rewritten to better exploit a grid
  – Tools for debugging and measuring the behavior of grid applications
Using a grid: An application developer’s perspective (2)
• Globus
  – Developer’s toolkit
    • Manage grid operations
    • Measurement
    • Repair
    • Debug grid applications
• Open Grid Services Architecture (OGSA)
A brief survey
A quick survey
Enabling Grids for E-sciencE (EGEE)
• CERN's new particle accelerator
  – 15 petabytes (15 million gigabytes) a year
    • A stack of CDs more than 20 km high
  – 200 sites around the globe
  – Over 20 000 computers
  – Running up to 30 000 jobs per day
• Has already been used for:
  – 300 000 chemical compounds in search of potential drugs for flu
  – Simulations of over 40 million potential drug molecules against malaria
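The data-volume figures above can be checked with a little arithmetic. The sketch assumes decimal units, a 700 MB disc capacity, and a 1.2 mm disc thickness; none of these appear in the slides.

```python
BYTES_PER_PETABYTE = 10**15
BYTES_PER_GIGABYTE = 10**9
CD_CAPACITY_BYTES = 700 * 10**6  # assumption: standard 700 MB disc
CD_THICKNESS_MM = 1.2            # assumption: standard disc thickness


def petabytes_to_gigabytes(petabytes: int) -> int:
    return petabytes * BYTES_PER_PETABYTE // BYTES_PER_GIGABYTE


def cd_stack_height_km(petabytes: int) -> float:
    """Height of the stack of CDs needed to hold `petabytes` of data."""
    discs = petabytes * BYTES_PER_PETABYTE / CD_CAPACITY_BYTES
    return discs * CD_THICKNESS_MM / 1_000_000  # millimetres to kilometres


if __name__ == "__main__":
    print(petabytes_to_gigabytes(15))         # gigabytes per year
    print(round(cd_stack_height_km(15), 1))   # kilometres of CDs per year
```

Under these assumptions, 15 petabytes comes to 15 million gigabytes and a stack of roughly 25 to 26 km, consistent with the "more than 20 km" figure on the slide.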
Questions?