The EU DataGrid - Introduction
The European DataGrid Project Team
http://www.eu-datagrid.org/
Dec 20, 2015
The EDG Intro– Tutorial - n° 2
Contents
The EDG Project scope
Achievements
EDG structure
Middleware Workpackages: Goals, Achievements, Issues
Testbed Release Plans
Glossary
RB Resource Broker
VO Virtual Organisation
CE Computing Element
SE Storage Element
GDMP Grid Data Mirroring Package
LDAP Lightweight Directory Access Protocol
LCFG Local Configuration System
LRMS Local Resource Management System (batch system, e.g. PBS, LSF)
WMS Workload Management System
LFN Logical File Name (e.g. MyMu.dat)
SFN Site File Name (e.g. storageEl1.cern.ch:/home/data/MyMu.dat)
The Grid vision
Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources
From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”
Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals -- assuming the absence of…
central location,
central control,
omniscience,
existing trust relationships.
Grids: Elements of the Problem
Resource sharing: computers, storage, sensors, networks, …
Sharing is always conditional: issues of trust, policy, negotiation, payment, …
Coordinated problem solving: beyond client-server, covering distributed data analysis, computation, collaboration, …
Dynamic, multi-institutional virtual organisations: community overlays on classic organisational structures; large or small, static or dynamic
EDG overview: goals
DataGrid is a project funded by the European Union whose objective is to build and exploit the next-generation computing infrastructure, providing intensive computation and analysis of shared large-scale databases.
Enable data-intensive sciences by providing worldwide Grid testbeds to large distributed scientific organisations ("Virtual Organisations", VOs)
Start (kick-off): Jan 1, 2001. End: Dec 31, 2003
Application/end-user communities: HEP, Earth Observation, Biology
Specific project objectives:
Middleware for fabric & grid management
Large-scale testbed
Production-quality demonstrations
Collaborate and coordinate with other projects (Globus, Condor, CrossGrid, DataTAG, etc.)
Contribute to open standards and international bodies (GGF, Industry & Research Forum)
DataGrid Main Partners
CERN – International (Switzerland/France)
CNRS - France
ESA/ESRIN – International (Italy)
INFN - Italy
NIKHEF – The Netherlands
PPARC - UK
Assistant Partners
Research and Academic Institutes
• CESNET (Czech Republic)
• Commissariat à l'énergie atomique (CEA) – France
• Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI)
• Consiglio Nazionale delle Ricerche (Italy)
• Helsinki Institute of Physics – Finland
• Institut de Fisica d'Altes Energies (IFAE) – Spain
• Istituto Trentino di Cultura (IRST) – Italy
• Konrad-Zuse-Zentrum für Informationstechnik Berlin – Germany
• Royal Netherlands Meteorological Institute (KNMI)
• Ruprecht-Karls-Universität Heidelberg – Germany
• Stichting Academisch Rekencentrum Amsterdam (SARA) – Netherlands
• Swedish Research Council – Sweden
Industrial Partners
• Datamat (Italy)
• IBM-UK (UK)
• CS-SI (France)
Project Schedule
Project started on 1/Jan/2001
TestBed 0 (early 2001): international testbed 0 infrastructure deployed; Globus 1 only, no EDG middleware
TestBed 1 (now): first release of EU DataGrid software to defined users within the project: HEP experiments (WP 8), Earth Observation (WP 9), Biomedical applications (WP 10)
Successful Project Review by EU: March 1st 2002
TestBed 2 (October 2002): builds on TestBed 1 to extend facilities of DataGrid
TestBed 3 (March 2003) & 4 (September 2003)
Project stops on 31/Dec/2003
EDG Highlights
The project is up and running!
All 21 partners are now contributing at the contractual level: a total of ~60 man-years for the first year
All EU deliverables (40 deliverables, >2000 pages) were submitted in time for the review, according to the contract's technical annex
First testbed delivered with real production demos
All deliverables (code & documents) are available via www.edg.org: http://eu-datagrid.web.cern.ch/eu-datagrid/Deliverables/default.htm
These cover requirements, surveys, architecture, design, procedures, testbed analysis, etc.
DataGrid work packages
The EDG collaboration is structured in 12 Work Packages:
WP1: Work Load Management System
WP2: Data Management
WP3: Grid Monitoring / Grid Information Systems
WP4: Fabric Management
WP5: Storage Element
WP6: Testbed and demonstrators (production-quality international infrastructure)
WP7: Network Monitoring
WP8: High Energy Physics Applications
WP9: Earth Observation
WP10: Biology
WP11: Dissemination
WP12: Management
Objectives for the first year of the project
Collect requirements for middleware
Take into account requirements from application groups
Survey current technology: for all middleware
Core Services testbed: Testbed 0, Globus (no EDG middleware)
First Grid testbed release: Testbed 1, first release of EDG middleware
WP1 (workload): job resource specification & scheduling
WP2 (data management): data access, migration & replication
WP3 (grid monitoring services): monitoring infrastructure, directories & presentation tools
WP4 (fabric management): framework for fabric configuration management & automatic software installation
WP5 (mass storage management): common interface for Mass Storage Systems
WP7 (network services): network services and monitoring
DataGrid Architecture
[Figure: layered DataGrid architecture. Side labels: Local Computing, Grid, Fabric]
Local Computing: Local Application, Local Database
Grid Application Layer: Data Management, Job Management, Metadata Management, Object to File Mapping
Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler
Underlying Grid Services: Computing Element Services, Authorization Authentication and Accounting, Replica Catalog, Storage Element Services, SQL Database Services
Fabric services: Configuration Management, Node Installation & Management, Monitoring and Fault Tolerance, Resource Management, Fabric Storage Management
Also shown: Logging & Bookkeeping
EDG Interfaces
[Figure: the DataGrid architecture diagram (Grid Application Layer, Collective Services, Underlying Grid Services, Fabric services) surrounded by the people and systems it interfaces with]
People: Scientists, Application Developers, System Managers
Systems: Computing Elements, Storage Elements, Operating Systems, File Systems, Mass Storage Systems (HPSS, Castor), Batch Systems (PBS, LSF), User Accounts, Certificate Authorities
WP1: Work Load Management
Goals
Maximize use of resources by efficient scheduling of user jobs
Achievements
Analysis of workload management system requirements & survey of existing mature implementations, Globus & Condor (D1.1)
Definition of architecture for scheduling & resource management (D1.2)
Development of "super scheduling" component using application data and computing element requirements
Issues
Integration with software from other WPs
Advanced job submission facilities
Current components
Job Description Language
Resource Broker
Job Submission Service
Information Index
User Interface
Logging & Bookkeeping Service
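To make the Resource Broker's role more concrete, here is a minimal matchmaking sketch in Python. It is not EDG code: the `ComputingElement` and `Job` classes, their fields, and the ranking rule (prefer CEs close to the input data, then those with more free CPUs) are assumptions made for illustration, standing in for what the Information Index and a job description might provide.

```python
from dataclasses import dataclass, field


@dataclass
class ComputingElement:
    """Hypothetical summary of what an information index might publish for a CE."""
    name: str
    free_cpus: int
    os_release: str
    close_storage: set = field(default_factory=set)  # names of nearby SEs


@dataclass
class Job:
    """Hypothetical job requirements, standing in for a job description."""
    min_cpus: int
    os_release: str
    input_storage: str  # SE that holds the job's input data


def match(job, elements):
    """Return the CEs that satisfy the job, best candidate first."""
    candidates = [
        ce for ce in elements
        if ce.free_cpus >= job.min_cpus and ce.os_release == job.os_release
    ]
    # Prefer CEs close to the input data, then those with more free CPUs.
    return sorted(
        candidates,
        key=lambda ce: (job.input_storage not in ce.close_storage, -ce.free_cpus),
    )


# Illustrative data only; site names are made up.
index = [
    ComputingElement("ce-a", 4, "RH6.2", {"se-a"}),
    ComputingElement("ce-b", 16, "RH6.2", {"se-b"}),
    ComputingElement("ce-c", 32, "RH7.2", {"se-a"}),
]
job = Job(min_cpus=2, os_release="RH6.2", input_storage="se-a")
ranked = match(job, index)
print([ce.name for ce in ranked])  # ['ce-a', 'ce-b']
```

Filtering and ranking are kept as separate steps so that hard requirements and preferences can be changed independently.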
WP2: Data Management
Goals
Coherently manage and share petabyte-scale information volumes in high-throughput production-quality grid environments
Achievements
Survey of existing tools and technologies for data access and mass storage systems (D2.1)
Definition of architecture for data management (D2.2)
Deployment of Grid Data Mirroring Package (GDMP) in testbed 1
Close collaboration with Globus, PPDG/GriPhyN & Condor
Working with GGF on standards
Issues
Security: clear mechanisms handling authentication and authorization
Current components
GDMP
Replica Catalog
Replica Manager
Spitfire
WP3: Grid Monitoring Services
Goals
Provide information system for discovering resources and monitoring status
Achievements
Survey of current technologies (D3.1)
Coordination of schemas in testbed 1
Development of Ftree caching backend based on OpenLDAP (Lightweight Directory Access Protocol) to address shortcomings in MDS v1
Design of Relational Grid Monitoring Architecture (R-GMA) (D3.2), to be further developed with GGF
GRM and PROVE adapted to grid environments to support end-user application monitoring
Issues
MDS vs. R-GMA
Components
MDS/Ftree
R-GMA
GRM/PROVE
WP4: Fabric Management
Goals
Manage clusters (~thousands) of nodes
Achievements
Survey of existing tools, techniques and protocols (D4.1)
Defined an agreed architecture for fabric management (D4.2)
Initial implementations deployed at several sites in testbed 1
Issues
How to ensure the node configurations are consistent and handle updates to the software suites
Components
LCFG
PBS & LSF info providers
Image installation
Config. Cache Mgr
WP5: Mass Storage Management
Goals
Provide common user and data export/import interfaces to existing local mass storage systems
Achievements
Review of Grid data systems, tape and disk storage systems and local file systems (D5.1)
Definition of Architecture and Design for DataGrid Storage Element (D5.2)
Collaboration with Globus on GridFTP/RFIO
Collaboration with PPDG on control API
First attempt at exchanging Hierarchical Storage Manager (HSM) tapes
Issues
Scope and requirements for storage element
Inter-working with other Grids
Components
Storage Element info. providers
RFIO
MSS staging
WP7: Network Services
Goals
Review the network service requirements for DataGrid
Establish and manage the DataGrid network facilities
Monitor the traffic and performance of the network
Deal with the distributed security aspects
Achievements
Analysis of network requirements for testbed 1 & study of available network physical infrastructure (D7.1)
Use of European backbone GEANT since Dec. 2001
Initial network monitoring architecture defined (D7.2) and first tools deployed in testbed 1
Collaboration with Dante & DataTAG
Working with GGF (Grid High Performance Networks) & Globus (monitoring/MDS)
Issues
Resources for study of security issues
End-to-end performance for applications depends on a complex combination of components
Components
Network monitoring tools: PingER, Udpmon, Iperf
WP6: TestBed Integration
Goals
Deploy testbeds for the end-to-end application experiments & demos
Integrate successive releases of the software components
Achievements
Integration of EDG software release 1.0 and deployment
Working implementation of multiple Virtual Organisations (VOs) & basic security infrastructure
Definition of acceptable usage contracts and creation of Certification Authorities group
Issues
Procedures for software integration
Test plan for software release
Support for production-style usage of the testbed
Components
Globus packaging & EDG config
Build tools
End-user documents
[Figure labels: Globus; WP6 additions to Globus; EDG release]
Grid aspects covered by EDG testbed 1
VO servers: LDAP directory for mapping users (with certificates) to correct VO
Storage Element: Grid-aware storage area, situated close to a CE
User Interface: Submit & monitor jobs, retrieve output
Replica Manager: Replicates data to one or more CEs
Job Submission Service: Manages submission of jobs to the Resource Broker
Replica Catalog: Keeps track of multiple data files "replicated" on different CEs
Information Index: Provides info about grid resources via GIIS/GRIS hierarchy
Information & Monitoring: Provides info on resource utilization & performance
Resource Broker: Uses Info Index to discover & select resources based on job requirements
Grid Fabric Mgmt: Configures, installs & maintains grid software packages and environment
Logging and Bookkeeping: Collects resource usage & job status
Network performance, security and monitoring: Provides efficient network transport, security & bandwidth monitoring
Computing Element: Gatekeeper to a grid computing resource
Testbed admin.: Certificate authorities, user registration, usage policy, etc.
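The Replica Catalog's bookkeeping can be pictured as a mapping from a Logical File Name (LFN) to the Site File Names (SFNs) of its copies. The Python sketch below is a toy in-memory version written for illustration only; the `ReplicaCatalog` class and its methods are hypothetical and do not reflect the EDG implementation, and the second storage host is an invented example.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy in-memory catalogue mapping a logical file name to its replicas."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register(self, lfn, sfn):
        """Record that a copy of `lfn` exists at the site file name `sfn`."""
        self._replicas[lfn].add(sfn)

    def lookup(self, lfn):
        """Return all known site file names for `lfn`, sorted for stable output."""
        return sorted(self._replicas.get(lfn, ()))


catalog = ReplicaCatalog()
catalog.register("MyMu.dat", "storageEl1.cern.ch:/home/data/MyMu.dat")
# Hypothetical second replica on an invented host.
catalog.register("MyMu.dat", "storageEl2.example.org:/data/MyMu.dat")
print(catalog.lookup("MyMu.dat"))
```

A job only needs to know the LFN; which physical copy to read can then be decided at scheduling time.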
Tasks for the WP6 integration team
Testing and integration of the Globus package
Exact definition of RPM lists (components) for the various testbed machine profiles (CE service, RB, UI, SE service, NE, WN), with dependency checks
Perform preliminary, centrally managed (CERN) tests on EDG middleware before giving the green light for deployment across the EDG testbed sites
Provide and update end-user documentation for installers/site managers, developers and end users
Define EDG release policies and coordinate the integration team staff with the various Work Package managers, keeping coordination tight
Assign reported bugs to the corresponding developers/site managers (BugZilla)
Complete support for the iTeam testing VO
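The dependency check on per-profile RPM lists can be sketched as follows. This is an illustrative Python example, not the integration team's tooling: the function name, the package names and the `requires` table are all hypothetical, and only direct dependencies are checked.

```python
def missing_dependencies(profiles, requires):
    """Map each profile to the packages its list needs but does not include."""
    problems = {}
    for profile, packages in profiles.items():
        installed = set(packages)
        missing = set()
        for pkg in packages:
            # Collect direct dependencies that are absent from this profile.
            missing |= set(requires.get(pkg, ())) - installed
        if missing:
            problems[profile] = sorted(missing)
    return problems


# Hypothetical package names; real profile lists would come from the release.
profiles = {
    "UI": ["edg-ui", "globus-client"],
    "CE": ["edg-ce", "globus-gatekeeper"],
}
requires = {
    "edg-ui": ["globus-client"],
    "edg-ce": ["globus-gatekeeper", "batch-info-provider"],
}
print(missing_dependencies(profiles, requires))  # {'CE': ['batch-info-provider']}
```

Profiles with a complete list are left out of the result, so an empty dictionary means every profile passed.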
EDG overview: Middleware release schedule
Planned intermediate release schedule:
Release 1.1: January 2002
Release 1.2: March 2002
Release 1.3: May 2002
Release 1.4: July 2002
Similar schedule for 2003
Each release includes:
feedback from use of previous release by application groups
planned improvements/extensions by middleware WPs
use of WP6 software infrastructure
feeds into architecture group
Slide annotations: July 1.1.3; Internal; August
Release Plan details
Current release EDG 1.1.4
Deployed on testbed under RedHat 6.2
Finalising build of EDG 1.2 (now)
GDMP 3.0
GSI-enabled RFIO client and server
EDG 1.3 (internal)
Build using autobuild tools – to ease future porting
Support for MPI on single site
EDG 1.4 (August)
Support RH 6.2 & 7.2
Basic support for interactive jobs
Integration of Condor DAGman
Use MDS 2.2 with first GLUE schema
EDG 2.0 (Oct)
Still based on Globus 2.x (pre-OGSA)
Use updated GLUE schema
Job partitioning & check-pointing
Advanced reservation/co-allocation
See http://edms.cern.ch/document/333297 for further details
EDG overview : testbed schedule
Planned intermediate testbed schedule:
Testbed 0: March 2001
Testbed 1: November 2001-January 2002
Testbed 2: September-October 2002
Testbed 3: March 2003
Testbed 4: September-October 2003
The number of EDG testbed sites is steadily increasing: currently 9 sites are visible to the CERN resource broker
Each site normally implements at least:
A central install & config server (LCFG server)
WMS (WP1) dedicated machines: UI, CE (gatekeeper & worker node(s))
MDS Info Providers to the global EDG GIIS/GRIS
Network Monitoring
Development & Production testbeds
Development
Initial set of 5 sites will keep a small cluster of PCs for development purposes, to test new versions of the software, configurations, etc.
Production
More stable environment for use by application groups: more sites, more nodes per site (growing to a meaningful size at major centres), more users per VO
Usage already foreseen in Data Challenge schedules for LHC experiments: harmonize release schedules