EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE A Large-scale Production Grid Infrastructure Erwin Laure EGEE Technical Director ISSGC06 July 16-28, 2006 Ischia, Italy
Transcript
Slide 1
EGEE: A Large-scale Production Grid Infrastructure
Erwin Laure, EGEE Technical Director
ISSGC06, July 16-28, 2006, Ischia, Italy
EGEE-II INFSO-RI-031688, Enabling Grids for E-sciencE
www.eu-egee.org
Slide 2
Lost in Definitions?
Defining the Grid:
- Access to (high performance) computing power
- Distributed parallel computing
- Improved resource utilization through resource sharing
- Increased storage provision
- Controlled access to distributed storage
- Interconnection of arbitrary resources (sensors, instruments, ...)
- Collaboration between users/resources
- Higher abstraction layer above network services
- Corresponding security
Slide 3
Defining the Grid
- A Grid is the combination of networked resources and the corresponding middleware, which provides services for the user.
- This interconnection of users, resources, and services for jointly addressing dedicated tasks is called a virtual organization.
Comparison between Grids and Networks:
- Networks realize message exchange between endpoints
- Grids realize services for the users: a higher level of abstraction
Slide 4
Defining the Grid
A Grid is the combination of networked resources and the corresponding middleware, which provides services for the user.
Slide 5
The EGEE Project
- Aim of EGEE: to establish a seamless European Grid infrastructure for the support of the European Research Area (ERA)
- EGEE
  - 1 April 2004 to 31 March 2006
  - 71 partners in 27 countries, federated in regional Grids
- EGEE-II
  - 1 April 2006 to 31 March 2008
  - Expanded consortium: 91 partners
Slide 6
Defining the Grid
A Grid is the combination of networked resources and the corresponding middleware, which provides services for the user.
Slide 7
EGEE Infrastructure
[Map legend: country participating in EGEE]
Scale (June 2006):
- ~200 sites in 40 countries
- ~25 000 CPUs
- > 10 PB storage
- > 35 000 jobs per day
- > 60 Virtual Organizations
Slide 8
EGEE Infrastructures
- Production service
  - Scaling up the infrastructure with resource centres around the globe
  - Stable, well-supported infrastructure, running only well-tested and reliable middleware
- Pre-production service
  - Run in parallel with the production service (restricted number of sites)
  - First deployment of new versions of the gLite middleware
  - Test-bed for applications and other external functionality
- T-Infrastructure (Training & Education)
  - Complete suite of Grid elements and applications (testbed, CA, VO, monitoring, support, ...)
  - Everyone can register and use GILDA for training and testing
  - 20 sites on 3 continents
Slide 9
EGEE Operations Process
- Geographically distributed responsibility for operations:
  - There is no central operation
  - Regional Operation Centers (ROCs): responsible for resource centers in their region
  - Tools are developed/hosted at different sites: GOC DB (RAL), SFT (CERN), GStat (Taipei), CIC Portal (Lyon)
- Grid operator on duty
  - 6 teams working in weekly rotation: CERN, IN2P3, INFN, UK/I, Ru, Taipei
  - Crucial in improving site stability and management
  - Expanding to all ROCs in EGEE-II
- Operations coordination
  - Weekly operations meetings
  - Regular ROC managers meetings
  - Series of EGEE Operations Workshops: Nov 04, May 05, Sep 05, June 06
- Procedures described in Operations Manual
  - Introducing new sites
  - Site downtime scheduling
  - Suspending a site
  - Escalation procedures; etc.
Highlights:
- Distributed operation
- Evolving and maturing procedures
- Procedures being introduced into and shared with the related infrastructure projects
Slide 10
Defining the Grid
A Grid is the combination of networked resources and the corresponding middleware, which provides services for the user.
Slide 11
Production Grid Middleware
Key factors in EGEE Grid Middleware Development:
1. Strict software process
  - Use industry standard software engineering methods
  - Software configuration management, version control, defect tracking, automatic build system, ...
2. Conservative approach in what software to use
  - Avoid cutting-edge software
    - Deployment on over 100 sites cannot assume a homogeneous environment: middleware needs to work with many underlying software flavors
  - Avoid evolving standards
    - Evolving standards change quickly (and sometimes significantly, cf. OGSI vs. WSRF): impossible to keep pace on > 100 sites
Long (and tedious) path from prototypes to production
Slide 12
EGEE Middleware: gLite
- Exploit experience & existing components
  - VDT (Condor, Globus)
  - EDG/LCG
  - AliEn
- Develop a lightweight stack of EGEE generic middleware
  - Dynamic deployment
  - Pluggable components
- Focus is on re-engineering and hardening
- March 4, 2006: gLite 3.0
[Timeline figure, 2004 to 2006: LCG-2 (product) and gLite (prototyping, then product) leading to gLite 3.0]
Slide 13
Developing gLite
- gLite 3.0 now available on production infrastructure
- After gLite 3.0:
  - Continuous release of single components, as needed by users and as made available by developers
  - Major releases provide a check-point, in general coinciding with major application challenges
- Continuing development to
  - Bring components not yet included in release to maturity
  - Improve functionality
  - Increase robustness
  - Increase usability
  - Improve the compliance with international standards
Slide 14
Grid Interoperability
- Leading role in building world-wide grids
  - Incubator for new Grid projects world-wide
- Interoperation efforts
  - Bilateral: EGEE/OSG, EGEE/NDGF, EGEE/NAREGI
  - Multilateral: Grid Interoperability Now (GIN)
- Experiences and requirements fed back into standardization process (GGF, now OGF)
- Strengthening contacts with industry
Slide 15
Building Software for the Grid
[Layer diagram]
- Applications: Environmental Sciences, Life & Pharmaceutical Sciences, Geo Sciences
- Middleware: Globus GT4, Condor, APST
- Platform Infrastructure: Unix, Windows, JVM, TCP/IP, MPI, .Net Runtime, VPN, SSH
Courtesy IBM. Slide courtesy David Abramson.
Slide 16
Building Software for the Grid
[Layer diagram]
- Applications: Environmental Sciences, Life & Pharmaceutical Sciences, Geo Sciences
- Middleware: Globus GT4, Condor, APST
- Platform Infrastructure: Unix, Windows, JVM, TCP/IP, MPI, .Net Runtime, VPN, SSH
- Annotations: Upper Middleware & Tools, Lower Middleware, Bonds
Courtesy IBM. Slide courtesy David Abramson.
Slide 17
Middleware structure
- Higher-Level Grid Services
  - may or may not be used by the applications
  - should help them but not be mandatory
- Foundation Grid Middleware
  - is deployed on the infrastructure
  - should not assume the use of Higher-Level Grid Services
  - must be complete and robust
  - should allow interoperation with other major grid infrastructures
Slide 18
gLite Grid Middleware Services
Overview paper:
http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-001.pdf
Slide 19
Job submission
[Architecture diagram]
- Components: User Interface, Resource Broker, Information System, File and Replica Catalogs, Authorization Service, Site X (Computing Element, Storage Element)
- Arrow labels: submit, query, retrieve, publish state, query, update credential, publish state, discover services
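The transcript does not preserve which arrow connects which components, so the following is only a minimal sketch of a generic broker pattern using the component names from the slide. It assumes that sites publish their state to the Information System and that the Resource Broker queries it before handing a job to a Computing Element. All class and method names are hypothetical and are not the gLite API.

```python
from dataclasses import dataclass, field


@dataclass
class ComputingElement:
    """Hypothetical stand-in for a site's Computing Element."""
    site: str
    free_cpus: int
    queue: list = field(default_factory=list)

    def run(self, job_name: str) -> str:
        # Accept the job and mark one CPU as busy.
        self.queue.append(job_name)
        self.free_cpus -= 1
        return f"{job_name} accepted at {self.site}"


class InformationSystem:
    """Toy registry: sites publish state, the broker queries it."""

    def __init__(self):
        self._published = {}

    def publish(self, ce: ComputingElement) -> None:
        self._published[ce.site] = ce

    def query(self, min_free_cpus: int) -> list:
        return [ce for ce in self._published.values()
                if ce.free_cpus >= min_free_cpus]


class ResourceBroker:
    """Picks the published Computing Element with the most free CPUs."""

    def __init__(self, info: InformationSystem):
        self.info = info

    def submit(self, job_name: str) -> str:
        candidates = self.info.query(min_free_cpus=1)
        if not candidates:
            raise RuntimeError("no Computing Element has free CPUs")
        best = max(candidates, key=lambda ce: ce.free_cpus)
        return best.run(job_name)


# The "User Interface" side: publish two assumed sites, then submit a job.
info = InformationSystem()
info.publish(ComputingElement("site-x", free_cpus=2))
info.publish(ComputingElement("site-y", free_cpus=5))
broker = ResourceBroker(info)
first = broker.submit("job-1")
print(first)
```

The ranking rule (most free CPUs) is an arbitrary choice for illustration; a real broker would also consult the catalogs and the authorization service shown in the diagram.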
Slide 20
gLite Software Process
[Flow diagram]
- SA3 Testing & Certification: Functional Tests, Testbed Deployment
- JRA1 Development: Software, Error Fixing
- SA3 Integration: Deployment Packages, Integration Tests, Installation Guide, Release Notes, etc.
- SA1 Pre-Production: Scalability Tests, Pre-Production Deployment
- SA1 Production Infrastructure: Release
- Arrow labels: Fail, Pass, Problem, Serious problem, Directives
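The flow of the diagram can be sketched as a small state machine. This is an illustration under assumptions, not the project's actual procedure: the stage order below (development, integration, certification, pre-production, production) and the rule that every failure returns to development are inferred from the stage names and the Pass/Fail labels, and the function names are hypothetical.

```python
# Assumed stage order, using the activity names from the slide.
STAGES = [
    "JRA1 Development",
    "SA3 Integration",
    "SA3 Testing & Certification",
    "SA1 Pre-Production",
    "SA1 Production",
]


def next_stage(current: str, passed: bool) -> str:
    """Return the stage a release candidate moves to after a check."""
    if current not in STAGES:
        raise ValueError(f"unknown stage: {current}")
    if not passed:
        # Assumption: every failure goes back to the developers for fixing.
        return STAGES[0]
    index = STAGES.index(current)
    # A pass in the last stage leaves the release in production.
    return STAGES[min(index + 1, len(STAGES) - 1)]


def trace(results):
    """Follow a candidate through a sequence of pass/fail outcomes."""
    stage = STAGES[0]
    path = [stage]
    for passed in results:
        stage = next_stage(stage, passed)
        path.append(stage)
    return path


# One failed certification, then a clean run through to production.
path = trace([True, True, False, True, True, True, True])
for step in path:
    print(step)
```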
Slide 21
Defining the Grid
A Grid is the combination of networked resources and the corresponding middleware, which provides services for the user.
Slide 22
EGEE Applications
- >20 applications
  - Astronomy
  - Biomedicine
  - Computational Chemistry
  - Earth Sciences
  - Financial Simulation
  - Fusion
  - Geo-Physics
  - High Energy Physics
- Further applications in evaluation
Applications now moving from testing to routine and daily usage
Slide 23
High Energy Physics
Large Hadron Collider (LHC):
- One of the most powerful instruments ever built to investigate matter
- 4 Experiments: ALICE, ATLAS, CMS, LHCb
- 27 km circumference tunnel
- Due to start up in 2007
[Photo labels: Mont Blanc (4810 m), Downtown Geneva]
Slide 24
Accelerating and colliding particles
Slide 25
The LHC Accelerator
The accelerator generates 40 million particle collisions (events) every second at the centre of each of the four experiments' detectors
Slide 26
LHC Data
- The collision rate is reduced by online computers that filter out a few hundred good events per second,
- which are recorded on disk and magnetic tape at 100-1,000 MegaBytes/sec
- ~15 PetaBytes per year for all four experiments
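As a rough consistency check on these figures, the yearly volume can be converted into an average recording rate. The sketch assumes decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes) and recording spread evenly over a full 365-day year, which the slide does not state; real data taking is not continuous.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
PETABYTE = 10**15  # assumed decimal petabyte
MEGABYTE = 10**6   # assumed decimal megabyte

# ~15 PB per year for all four experiments, as quoted on the slide.
yearly_bytes = 15 * PETABYTE
average_mb_per_s = yearly_bytes / SECONDS_PER_YEAR / MEGABYTE
print(f"{average_mb_per_s:.0f} MB/s averaged over a full year")
```

Under these assumptions the average comes out at roughly 476 MB/s, which sits inside the 100-1,000 MegaBytes/sec range quoted above.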
Slide 27
Data Handling and Computation for Physics Analysis
[Data-flow diagram]
- Data: detector, raw data, event summary data, processed data, analysis objects (extracted by physics topic)
- Processing: event filter (selection & reconstruction), reconstruction, event reprocessing, event simulation, simulation, analysis, batch physics analysis, interactive physics analysis
Slide 28
LCG depends on two major science grid infrastructures:
- EGEE: Enabling Grids for E-sciencE
- OSG: US Open Science Grid
Slide 29
Example: HEP
- LHC data and service challenges
  - Preparing for LHC start-up in 2007
  - Ensure key services & infrastructure are in place
  - Emphasis on providing a service
- Computing needs of experiments
  - E.g. LHCb: ~700 CPU years in 2005 on the EGEE infrastructure
  - E.g. ATLAS: over 10,000 jobs per day
- Massive data transfers > 1.5 GB/s
[Chart labels: ATLAS, LHCb, ATLAS]
Slide 30
Example: Addressing emerging diseases
- Emerging diseases know no frontiers. Time is a critical factor
  - Avian influenza: human casualties
- International collaboration is required for:
  - Early detection
  - Epidemiological watch
  - Prevention
  - Search for new drugs
  - Search for vaccines
Slide 31
WISDOM, the first step
- WISDOM focuses on drug discovery for neglected and emerging diseases.
- Summer 2005: World-wide In Silico Docking On Malaria
  - 46 million ligands docked in 6 weeks
  - ~1 million virtual ligands selected
  - 1 TB of data produced
  - 1000 computers in 15 countries
  - Equivalent to 80 CPU years
- Spring 2006: drug design against H5N1 neuraminidase involved in virus propagation
  - impact of selected point mutations on the efficiency of existing drugs
  - identification of new potential drugs acting on mutated N1
[Figure labels: N1, H5]
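The malaria figures can be cross-checked with a short calculation. The sketch below takes only the numbers quoted on the slide (80 CPU years, 46 million ligands, 6 weeks) and assumes a 365-day year; it derives the average CPU time per docking and the average number of CPUs that must have been busy. The variable names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
SECONDS_PER_WEEK = 7 * 24 * 3600

cpu_years = 80        # "Equivalent to 80 CPU years"
ligands = 46_000_000  # "46 million ligands docked"
weeks = 6             # "in 6 weeks"

cpu_seconds = cpu_years * SECONDS_PER_YEAR
seconds_per_docking = cpu_seconds / ligands
average_busy_cpus = cpu_seconds / (weeks * SECONDS_PER_WEEK)

print(f"{seconds_per_docking:.0f} CPU-seconds per docking")
print(f"{average_busy_cpus:.0f} CPUs busy on average")
```

This gives roughly 55 CPU-seconds per docking and about 695 CPUs busy on average, which is plausible against the 1000 computers quoted.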
Slide 32
Challenges for high throughput virtual docking
[Workflow diagram]
- 300,000 chemical compounds: ZINC & chemical combinatorial library
- Target (PDB): Neuraminidase (8 structures)
- Millions of chemical compounds available in laboratories
- High Throughput Screening: $2/compound, nearly impossible
- Molecular docking (Autodock): ~100 CPU years, 600 GB data
- Data challenge on EGEE, Auvergrid, TWGrid: ~6 weeks on ~2000 computers
- In vitro screening of 100 hits
- Hits sorting and refining
Slide 33
Example: Pharmacokinetics
- A lesion is detected in an MRI study of a patient: start with virtual biopsy
- The process requires obtaining a sequence of MRI volumetric images.
- Different images are obtained in different breath-holds.
- Before analyzing the variation of each voxel, images must be co-registered to minimize deformation due to different breath-holds.
- The total computational cost of a clinical trial of 20 patients is around 100 CPU days.
Slide 34
Example: Determining earthquake mechanisms
- Seismic software application determines epicentre, magnitude, mechanism
- Analysis of Indonesian earthquake (28 March 2005)
  - Seismic data within 12 hours after the earthquake
  - Solution found within 30 hours after earthquake occurred
  - 10 times faster on the Grid than on local computers
- Results
  - Not an aftershock of December 2004 earthquake
  - Different location (different part of fault line further south)
  - Different mechanism
Rapid analysis of earthquakes important for relief efforts
[Figure labels: Peru, June 23, 2001, Mw=8.4; Sumatra, March 28, 2005, Mw=8.5]
Slide 35
Flood forecasting problem
Many kinds of data:
- Meteorological, hydrological, hydraulic
- Generated by simulations or obtained from sensors
- Permanent or periodically updated
- Publicly available or with restricted access
Slide 36
ITU-BR system for RRC 2006
- ITU-BR developed a system for RRC 2006
  - Run compatibility and complementary analysis
  - 84 PCs executing 168 parallel tasks
  - Compatibility analysis < 4h
  - Great success!
- ITU-BR wanted to be sure and do even better
  - Provide more CPU power
  - Reduce risks by providing a supplementary system
  - Gain experience on how to access large and reliable computing resources on demand
- EGEE used a subset of its Grid for RRC 2006
  - Over 400 PCs
  - Compatibility analysis < 1h
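The two timings are consistent with simple scaling. The sketch below treats the quoted "< 4h" as an upper bound on the local run and assumes perfectly parallel work on equivalent PCs, which is an idealisation the slide does not claim.

```python
local_pcs = 84
local_hours = 4.0  # upper bound quoted for the ITU-BR system
grid_pcs = 400     # "Over 400 PCs" on the EGEE subset

# Total work, as an upper bound in PC-hours.
work_pc_hours = local_pcs * local_hours
# Ideal time if the same work is spread evenly over the Grid PCs.
ideal_grid_hours = work_pc_hours / grid_pcs

print(f"work: at most {work_pc_hours:.0f} PC-hours")
print(f"ideal time on {grid_pcs} PCs: {ideal_grid_hours:.2f} h")
```

At most 336 PC-hours spread over 400 PCs gives about 0.84 h, in line with the reported compatibility analysis time of under one hour.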
Slide 37
The Future of Grids
- Increasing the number of infrastructure users by increasing awareness
  - Dissemination and outreach
  - Training and education
- Increasing the number of applications by improving application support and middleware functionality
  - Improved usability through high level grid middleware extensions
- Increasing the grid infrastructure
  - Incubating related projects
  - Ensuring interoperability between projects
- Protecting user investments
  - Towards a sustainable grid infrastructure
Slide 38
User Information & Support
- More than 170 training events and summer schools across many countries
  - >3000 people trained: induction; application developer; advanced; retreats
  - Material archive online with ~250 presentations
- Public and technical websites
  - Dissemination material constantly evolving to expand information and keep it up to date
- 4 conferences organized (~460 @ Pisa)
  - Next conference: September 2006 in Geneva ~600 participants
Slide 39
Industry and EGEE-II
- Industry Task Force
  - Group of industry partners in the project
  - Links related industry projects (NESSI, BEinGRID, ...)
  - Works with EGEE's Technical Coordination Group
- Collaboration with CERN openlab project
  - IT industry partnerships for hardware and software development
- EGEE Business Associates (EBA)
  - Companies sponsoring work on joint-interest subjects
- Industry Forum
  - Led by Industry to improve Grid take-up in Industry
  - Organises industry events and disseminates grid information, e.g. this Wednesday here at the school
Slide 40
The Future of Grids
- Increasing the number of infrastructure users by increasing awareness
  - Dissemination and outreach
  - Training and education
- Increasing the number of applications by improving application support and middleware functionality
  - Improved usability through high level grid middleware extensions
- Increasing the grid infrastructure
  - Incubating related projects
  - Ensuring interoperability between projects
- Protecting user investments
  - Towards a sustainable grid infrastructure
Slide 41
Building Software for the Grid
[Layer diagram]
- Applications: Environmental Sciences, Life & Pharmaceutical Sciences, Geo Sciences
- Middleware: Globus GT4, Condor, APST
- Platform Infrastructure: Unix, Windows, JVM, TCP/IP, MPI, .Net Runtime, VPN, SSH
- Annotations: Lower Middleware, Upper Middleware & Tools, Bonds, ???
Courtesy IBM. Slide courtesy David Abramson.
Slide 42
Portals on EGEE
- P-Grade
- Genius
Slide 43
Example: Biomedicine
- Parallel simulation of blood flow on the Grid
- Online visualization of simulation results on the desktop
- Interactive steering of simulation
- Grid is invisible
- Cooperation with University Amsterdam
Slide 44
Example: Flooding Crisis Support
- Simulation of flooding on the Grid
- Online visualization of simulation results in the CAVE
- Interactive steering of simulation
- Grid is invisible
- Cooperation with Slovak Academy of Sciences
Slide 45
Scientific Visualization
Use your favourite device to connect to the Grid: Sony PSP (PlayStation Portable)
Slide 46
Not only portals
- Portals are a good way to bring computing power to end-users
  - In most cases domain specific
- Application programmers (and portal programmers) need more powerful interfaces
  - Workflow engines
  - Higher level programming abstractions (SAGA, DRMAA, ...)
  - Programming environments (gEclipse)
  - Compilers?
Slide 47
The Future of Grids
- Increasing the number of infrastructure users by increasing awareness
  - Dissemination and outreach
  - Training and education
- Increasing the number of applications by improving application support and middleware functionality
  - Improved usability through high level grid middleware extensions
- Increasing the grid infrastructure
  - Incubating related projects
  - Ensuring interoperability between projects
- Protecting user investments
  - Towards a sustainable grid infrastructure
Slide 48
Projects related to EGEE
[Figure label: EUGRID]
Slide 49
Related Infrastructures
[Figure label: GIN]
Slide 50
The Future of Grids
- Increasing the number of infrastructure users by increasing awareness
  - Dissemination and outreach
  - Training and education
- Increasing the number of applications by improving application support and middleware functionality
  - Improved usability through high level grid middleware extensions
- Increasing the grid infrastructure
  - Incubating related projects
  - Ensuring interoperability between projects
- Protecting user investments
  - Towards a sustainable grid infrastructure
Slide 51
Sustainability: Beyond EGEE-II
- Need to prepare for permanent Grid infrastructure
  - Maintain Europe's leading position in global science Grids
  - Ensure a reliable and adaptive support for all sciences
  - Independent of project funding cycles
  - Modelled on success of GÉANT
  - Infrastructure managed centrally in collaboration with national bodies (in EGEE-II: JRUs)
Slide 52
Grids in Europe
- Great investment in developing Grid technology
- Sample of National Grid projects:
  - Austrian Grid Initiative
  - DutchGrid
  - France: Grid5000
  - Germany: D-Grid; Unicore
  - Greece: HellasGrid
  - Grid Ireland
  - Italy: INFNGrid; GRID.IT
  - NorduGrid
  - Swiss Grid
  - UK e-Science: National Grid Service; OMII; GridPP
- EGEE provides framework for national, regional and thematic Grids
Slide 53
Evolution
[Timeline figure]
- Projects: EDG, EGEE, EGEE-II, EGEE-III, European e-Infrastructure Coordination
- Labels: Testbeds, Utility Service, Routine Usage
Slide 54
Summary
- Grids represent a powerful new tool for science
- Today we have a window of opportunity to move grids from research prototypes to permanent production systems (as networks did a few years ago)
- EGEE offers:
  - a mechanism for linking together people, resources and data of many scientific communities
  - a basic set of middleware for gridifying applications, with documentation, training and support
  - regular forums for linking with grid experts, other communities and industry
Slide 55
Summary
- Success will lead to the adoption of grids as the main computing infrastructure for science
- If we succeed then the potential return to international scientific communities will be enormous and open the path for commercial and industrial applications
Slide 56
EGEE06 Conference
- EGEE06: Capitalising on e-infrastructures
  - Demos
  - Related Projects
  - Industry
  - International community (UN organisations in Geneva etc.)
- 25-29 September 2006, Geneva, Switzerland
- http://www.eu-egee.org/egee06