HENP DATA GRIDS and STARTAP
Worldwide Analysis at Regional Centers
Harvey B. Newman (Caltech)
HPIIS Review
San Diego, October 25, 2000
http://l3www.cern.ch/~newman/hpiis2000.ppt
Next Generation Experiments: Physics and Technical Goals
The extraction of small or subtle new "discovery" signals from large and potentially overwhelming backgrounds; or "precision" analysis of large samples
Providing rapid access to event samples and subsets from massive data stores: from ~300 Terabytes in 2001, Petabytes by ~2003, ~10 Petabytes by 2006, to ~100 Petabytes by ~2010
Providing analyzed results with rapid turnaround, by coordinating and managing the LIMITED computing, data handling and network resources effectively
Enabling rapid access to the data and the collaboration, across an ensemble of networks of varying capability, using heterogeneous resources
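The data-volume milestones above imply roughly a factor of ten every three to four years. A quick arithmetic check (a sketch; it treats "Petabytes by ~2003" as ~1 PB, which is our assumption, not a figure from the slide):

```python
# Implied growth from the milestones quoted above, in Terabytes.
# 2003 is taken as ~1 PB = 1000 TB -- an assumption on our part.
milestones_tb = {2001: 300, 2003: 1_000, 2006: 10_000, 2010: 100_000}

years = sorted(milestones_tb)
for y0, y1 in zip(years, years[1:]):
    factor = milestones_tb[y1] / milestones_tb[y0]
    annual = factor ** (1 / (y1 - y0))   # compound annual growth factor
    print(f"{y0}->{y1}: x{factor:.1f} total, x{annual:.2f} per year")
```

Each interval works out to roughly a doubling of the stored data every year.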
The Large Hadron Collider (2005-)
A next-generation particle collider: the largest superconducting installation in the world
A bunch-bunch collision will take place every 25 nanoseconds, each generating ~20 interactions
But only one in a trillion may lead to a major physics discovery
Real-time data filtering: Petabytes per second to Gigabytes per second
Accumulated data of many Petabytes/Year
Large data samples explored and analyzed by thousands of geographically dispersed scientists, in hundreds of teams
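The figures above tie together with back-of-envelope arithmetic: a 25 ns bunch spacing gives a 40 MHz crossing rate, ~20 interactions per crossing gives nearly a billion interactions per second, and the on-line filter must cut Petabytes per second down to Gigabytes per second. A sketch (the PB/s and GB/s values are the orders of magnitude quoted on the slide, not precise rates):

```python
# Back-of-envelope LHC rates, using the figures quoted above.
BUNCH_SPACING_S = 25e-9          # one bunch crossing every 25 ns
INTERACTIONS_PER_CROSSING = 20   # ~20 overlapping pp interactions

crossing_rate_hz = 1.0 / BUNCH_SPACING_S                            # 40 MHz
interaction_rate_hz = crossing_rate_hz * INTERACTIONS_PER_CROSSING  # 8e8 per second

# Real-time filtering: ~PB/s at the detector down to ~GB/s written to storage
raw_rate_bps = 1e15       # ~1 Petabyte/s, order of magnitude
stored_rate_bps = 1e9     # ~1 Gigabyte/s
rejection_factor = raw_rate_bps / stored_rate_bps  # ~1e6 reduction on-line

print(crossing_rate_hz, interaction_rate_hz, rejection_factor)
```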
Computing Challenges: LHC Example
Geographical dispersion: of people and resources
Complexity: the detector and the LHC environment
Scale: Tens of Petabytes per year of data
1800 Physicists, 150 Institutes, 34 Countries
Major challenges associated with:
Communication and collaboration at a distance
Network-distributed computing and data resources
Remote software development and physics analysis
R&D: New Forms of Distributed Systems: Data Grids
Four LHC Experiments: The Petabyte to Exabyte
Higgs + New particles; Quark-Gluon Plasma; CP Violation
Data written to tape: ~25 Petabytes/Year and UP; 0.25 Petaflops and UP
0.1 to 1 Exabyte (1 EB = 10^18 Bytes) (~2010) (~2015 ?) Total for the LHC Experiments
Higgs Search, LEPC September 2000
All charged tracks with pt > 2 GeV
Reconstructed tracks with pt > 25 GeV
(+30 minimum bias events)
10^9 events/sec, selectivity: 1 in 10^13 (1 person in a thousand world populations)
LHC: Higgs Decay into 4 muons (tracker only); 1000X LEP Data Rate
On-line Filter System
Large variety of triggers and thresholds: select physics à la carte
Multi-level trigger: Filter out less
Managed, fair-shared access for Physicists everywhere
Maximize total funding resources while meeting the total computing and data handling needs
Balance between proximity of datasets to appropriate resources, and to the users
Tier-N Model
Efficient use of network: higher throughput
Per Flow: Local > regional > national > international
Utilizing all intellectual resources, in several time zones
CERN, national labs, universities, remote sites
Involving physicists and students at their home institutions
Greater flexibility to pursue different physics interests, priorities, and resource allocation strategies by region
And/or by Common Interests (physics topics, subdetectors, …)
Manage the System's Complexity
Partitioning facility tasks, to manage and focus resources
Data stores, networks, computers, display devices, …; associated local services
A Rich Set of HEP Data-Analysis Related Applications
[*] Adapted from Ian Foster
SDSS Data Grid (In GriPhyN): A Shared Vision
Three main functions:
Raw data processing on a Grid (FNAL): Rapid turnaround with TBs of data; Accessible storage of all image data
Fast science analysis environment (JHU): Combined data access + analysis of calibrated data; Distributed I/O layer and processing layer, shared by whole collaboration
Public data access: SDSS data browsing for astronomers, and students; Complex query engine for the public
RD45, GIOD: Networked Object Databases
Clipper/GC: High speed access to Objects or File data
FNAL/SAM: for processing and analysis
SLAC/OOFS: Distributed File System + Objectivity Interface
NILE, Condor: Fault Tolerant Distributed Computing
Developed Java 3D OO Reconstruction, Analysis and Visualization Prototypes that Work Seamlessly Over Worldwide Networks
Deployed facilities and database federations as testbeds for Computing Model studies
The Particle Physics Data Grid (PPDG)
First Round Goal: Optimized cached read access to 10-100 Gbytes drawn from a total data set of 0.1 to ~1 Petabyte
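The "optimized cached read access" goal can be illustrated with a minimal sketch: a small, fast cache (standing in for the 10-100 GB working set) in front of a much larger mass store, so that repeated analysis passes over a hot subset avoid the slow path. This is purely illustrative; the class and method names below are our invention, not PPDG's actual interfaces:

```python
from collections import OrderedDict

class CachedStore:
    """Toy model of cached reads from a large mass store (illustrative only)."""
    def __init__(self, capacity_objects, mass_store):
        self.capacity = capacity_objects
        self.mass_store = mass_store        # dict-like: object id -> bytes
        self.cache = OrderedDict()          # LRU order: oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, obj_id):
        if obj_id in self.cache:
            self.hits += 1
            self.cache.move_to_end(obj_id)  # mark as most recently used
            return self.cache[obj_id]
        self.misses += 1
        data = self.mass_store[obj_id]      # slow path: fetch from tape/MSS
        self.cache[obj_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return data

# Repeated analysis passes over a "hot" subset mostly hit the cache.
store = {f"event-{i}": b"..." for i in range(1000)}
cs = CachedStore(capacity_objects=100, mass_store=store)
for _ in range(3):
    for i in range(100):                    # hot working set of 100 objects
        cs.read(f"event-{i}")
print(cs.hits, cs.misses)                   # 200 hits, 100 misses
```

The design choice mirrors the slide's goal: the first pass pays the mass-store cost once, and subsequent reads of the same subset are served from the cache.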
A New Form of Integrated Distributed System
Meeting the Scientific Goals of LIGO, SDSS and the LHC Experiments
Focus on Tier2 Centers at Universities, in a Unified Hierarchical Grid of Five Levels
18 Centers; with Four Sub-Implementations: 5 each in US for LIGO, CMS, ATLAS; 3 for SDSS
Near Term Focus on LIGO, SDSS handling of real data; LHC "Data Challenges" with simulated data
Cooperation with PPDG, MONARC and EU DataGrid
Genomics, Proteomics, ...
The Earth System Grid and EOSDIS
Federating Brain Data
Computed MicroTomography
...
NVO, GVO
GRIDs In 2000: Summary
Grids are changing the way we do science and engineering: From Computation to Data
Key services and concepts have been identified, and development has started
Major IT challenges remain: An Opportunity & Obligation for HEP/CS Collaboration
Transition of services and applications to production use is starting to occur
In future more sophisticated integrated services and toolsets (Inter- and IntraGrids+) could drive advances in many fields of science & engineering
HENP, facing the need for Petascale Virtual Data, is both an early adopter, and a leading developer of Data Grid technology
Installed Link BW in Mbps (Incl. New SLAC), with Throughput [*] in parentheses:
310 (120)
622 (250)
1600 (400)
2400 (600)
4000 (1000)
6500 [#] (1600)
[#] Includes ~1.5 Gbps Each for ATLAS and CMS, Plus BaBar, Run2 and Other
[*] D0 and CDF at Run2: Needs Presumed to Be Comparable to BaBar
Daily, Weekly, Monthly and Yearly Statistics on the 45 Mbps US-CERN Link
HEP Network Requirements and STARTAP
Beyond the requirement of adequate bandwidth, physicists in HENP's major experiments depend on:
Network and user software that will work together to provide high throughput and to manage the bandwidth effectively
A suite of videoconference and high-level tools for remote collaboration that make data analysis from the US (and from other world regions) effective
An integrated set of local, regional, national and international networks that interoperate seamlessly, without bottlenecks
Configuration at Chicago with KPN/Qwest
HEP Network Requirements and STARTAP
The STARTAP, a professionally managed international peering point with an open HP policy, has been and will continue to be vital for US involvement in the LHC, and thus for the progress of the LHC physics program.
Our development of worldwide Data Grid systems, in collaboration with the European Union and other world regions, will depend on the STARTAP for joint prototyping, tests and developments using next-generation network, software and database technology.
A scalable and cost-effective growth path for the STARTAP will be needed, as a central component of international networks for HENP, and other fields.
An optical STARTAP handling OC-48 and OC-192 links, with favorable peering and transit arrangements across the US, would be well-matched to our future plans.
US-CERN line connection to ESnet: to HENP Labs Through STARTAP
TCP throughput performance: Caltech/CERN Via STARTAP
From Caltech to CERN
From CERN to Caltech
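On a long transatlantic path like Caltech-CERN, a single TCP stream is bounded by its window divided by the round-trip time, so throughput falls far short of the 45 Mbps link unless large windows are used. The sketch below assumes an RTT of ~170 ms (an illustrative figure; the slide does not quote one) against the 45 Mbps link capacity mentioned above:

```python
# TCP throughput upper bound for one stream: rate <= window / RTT.
def max_tcp_throughput_bps(window_bytes, rtt_s):
    return 8 * window_bytes / rtt_s      # bits per second

RTT_S = 0.170                # assumed Caltech<->CERN round-trip time (illustrative)
LINK_BPS = 45e6              # 45 Mbps US-CERN link, from the slide above

default_window = 64 * 1024   # classic 64 KB TCP window
tuned_window = 1024 * 1024   # large window (RFC 1323 window scaling)

print(max_tcp_throughput_bps(default_window, RTT_S) / 1e6)  # ~3.1 Mbps
print(max_tcp_throughput_bps(tuned_window, RTT_S) / 1e6)    # ~49 Mbps: link-limited

# Window needed to fill the link: the bandwidth-delay product.
bdp_bytes = LINK_BPS * RTT_S / 8
print(bdp_bytes / 1024)      # ~934 KB
```

This is why "network and user software that work together" matters: without window tuning at both ends, even a high-capacity link delivers only a few Mbps per flow.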
CA*net 4 Possible Architecture
[Map: CA*net 4 nodes at Vancouver, Calgary, Regina, Winnipeg, Toronto, Ottawa, Montreal, Fredericton, Charlottetown, Halifax and St. John's, with links to Seattle, Chicago, New York, Los Angeles, Miami and Europe. Legend: Dedicated Wavelength or SONET channel; OBGP switches; Optional Layer 3 aggregation service; Large channel WDM system.]
[Diagram: Pasadena; Intermediate ISP; Tier 1 ISP; Tier 2 ISP; AS 1 through AS 5; a router dual-connected to AS 5.]
Optical switch looks like a BGP router; AS 1 is direct connected to the Tier 1 ISP but still transits AS 5.
Router redirects networks with heavy traffic load to the optical switch, but routing policy is still maintained by the ISP.
Beyond Grid Prototype Components: Integration of Grid Prototypes for End-to-end Data Transport
Particle Physics Data Grid (PPDG) ReqM
PPDG/EU DataGrid GDMP for CMS HLT Productions
Start Building the Grid System(s): Integration with Experiment-specific software frameworks
Derivation of Strategies (MONARC Simulation System): Data caching, query estimation, co-scheduling
Load balancing and workload management amongst Tier0/Tier1/Tier2 sites (SONN by Legrand)
Transaction robustness: simulate and verify
Transparent Interfaces for Replica Management: Deep versus shallow copies; Thresholds; tracking, monitoring and control
Onset of large scale optimized Production file transfers, involving both HENP Labs & Universities: BaBar, CMS, ATLAS; Upcoming: D0, CDF at FNAL/Run2; RHIC
Seamless remote access to Object databases: CMSOO demos at iGrid2000 (Yokohama); Now starting on distributed CMS ORCA OO (TB to PB) DB Access
CMS User Analysis Environment (UAE): Worldwide Grid-enabled view of the data, along with visualizations, data presentation and analysis; A User-view across the Data Grid
A Principal testbed to develop production Grid systems, of worldwide scope
Grid Data Management Prototype (GDMP; US/EU)
GriPhyN: 18-20 University facilities serving CMS, ATLAS, LIGO and SDSS, built on a strong foundation of grid security and information infrastructure
Deploying a Grid Virtual Data Toolkit (VDT)
VRVS: Worldwide-extensible videoconferencing and shared virtual spaces
Future: Forward-looking view of Mobile Agent Coordination Architectures; Survivable Loosely Coupled Systems with