Transcript
Page 1

Paul Avery, University of Florida

http://www.phys.ufl.edu/~avery/
[email protected]

Grids for 21st Century Data Intensive Science

University of Michigan, May 8, 2003

Page 2

Grids and Science

Page 3

The Grid Concept

Grid: Geographically distributed computing resources configured for coordinated use

Fabric: Physical resources & networks provide raw capability

Middleware: Software ties it all together (tools, services, etc.)

Goal: Transparent resource sharing

Page 4

Fundamental Idea: Resource Sharing

Resources for complex problems are distributed
  Advanced scientific instruments (accelerators, telescopes, …)
  Storage, computing, people, institutions

Communities require access to common services
  Research collaborations (physics, astronomy, engineering, …)
  Government agencies, health care organizations, corporations, …

“Virtual Organizations” (see the sketch after this list)
  Create a “VO” from geographically separated components
  Make all community resources available to any VO member
  Leverage strengths at different institutions

Grids require a foundation of strong networking
  Communication tools, visualization
  High-speed data transmission, instrument operation
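To make the “Virtual Organization” idea above concrete, here is a minimal sketch (Python, with hypothetical names; not any real Grid middleware API) of a VO that pools members and resources contributed by separate institutions:

```python
# Minimal sketch of a Virtual Organization (VO): members from different
# institutions share a common pool of distributed resources.
# Illustrative only; names and classes are hypothetical, not a Grid API.
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str   # e.g. a compute cluster or storage system
    site: str   # institution that operates it
    kind: str   # "cpu", "storage", "instrument", ...


@dataclass
class VirtualOrganization:
    name: str
    members: set = field(default_factory=set)      # users across institutions
    resources: list = field(default_factory=list)  # pooled community resources

    def add_member(self, user: str) -> None:
        self.members.add(user)

    def add_resource(self, resource: Resource) -> None:
        self.resources.append(resource)

    def can_use(self, user: str, resource_name: str) -> bool:
        # Any VO member may use any resource contributed to the VO.
        return user in self.members and any(
            r.name == resource_name for r in self.resources
        )


if __name__ == "__main__":
    vo = VirtualOrganization("example-vo")
    vo.add_member("alice")                                    # member at one university
    vo.add_resource(Resource("campus-cluster", "Site A", "cpu"))
    vo.add_resource(Resource("lab-storage", "Site B", "storage"))
    print(vo.can_use("alice", "lab-storage"))                 # True
```

In a production Grid, membership and access would of course be enforced by certificate-based authentication and per-site policies rather than an in-memory check.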

Page 5

Some (Realistic) Grid Examples

High energy physics
  3,000 physicists worldwide pool Petaflops of CPU resources to analyze Petabytes of data

Fusion power (ITER, etc.)
  Physicists quickly generate 100 CPU-years of simulations of a new magnet configuration to compare with data

Astronomy
  An international team remotely operates a telescope in real time

Climate modeling
  Climate scientists visualize, annotate, & analyze Terabytes of simulation data

Biology
  A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour

Page 6

Grids: Enhancing Research & Learning

Fundamentally alters conduct of scientific research
  Central model: people and resources flow inward to labs
  Distributed model: knowledge flows between distributed teams

Strengthens universities
  Couples universities to data intensive science
  Couples universities to national & international labs
  Brings front-line research and resources to students
  Exploits intellectual resources of formerly isolated schools
  Opens new opportunities for minority and women researchers

Builds partnerships to drive advances in IT/science/engineering
  “Application” sciences ↔ Computer Science
  Physics ↔ Astronomy, biology, etc.
  Universities ↔ Laboratories
  Scientists ↔ Students
  Research Community ↔ IT industry

Page 7

Grid Challenges

Operate a fundamentally complex entity
  Geographically distributed resources
  Each resource under different administrative control
  Many failure modes

Manage workflow across the Grid
  Balance policy vs. instantaneous capability to complete tasks
  Balance effective resource use vs. fast turnaround for priority jobs
  Match resource usage to policy over the long term
  Goal-oriented algorithms: steering requests according to metrics

Maintain a global view of resources and system state
  Coherent end-to-end system monitoring
  Adaptive learning for execution optimization

Build high level services & integrated user environment

Page 8

Data Grids

Page 9

Data Intensive Science: 2000-2015

Scientific discovery increasingly driven by data collection
  Computationally intensive analyses
  Massive data collections
  Data distributed across networks of varying capability
  Internationally distributed collaborations

Dominant factor: data growth (1 Petabyte = 1000 TB)

  2000: ~0.5 Petabyte
  2005: ~10 Petabytes
  2010: ~100 Petabytes
  2015: ~1000 Petabytes?

How to collect, manage, access and interpret this quantity of data?

Drives demand for “Data Grids” to handle the additional dimension of data access & movement
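The growth figures above imply a steep compound rate; a quick back-of-the-envelope check (Python, using only the 2000 and 2015 endpoints quoted on this slide):

```python
# Implied compound growth rate from ~0.5 PB (2000) to ~1000 PB (2015).
start_pb, end_pb, years = 0.5, 1000.0, 15

annual_factor = (end_pb / start_pb) ** (1.0 / years)
print(f"overall growth: {end_pb / start_pb:.0f}x over {years} years")
print(f"implied annual growth: {100 * (annual_factor - 1):.0f}% per year")
print(f"i.e. roughly x{annual_factor ** 5:.0f} every 5 years")
```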

Page 10

Data Intensive Physical Sciences

High energy & nuclear physics
  Including new experiments at CERN’s Large Hadron Collider

Astronomy
  Digital sky surveys: SDSS, VISTA, other Gigapixel arrays
  VLBI arrays: multiple-Gbps data streams
  “Virtual” Observatories (multi-wavelength astronomy)

Gravity wave searches
  LIGO, GEO, VIRGO, TAMA

Time-dependent 3-D systems (simulation & data)
  Earth Observation, climate modeling
  Geophysics, earthquake modeling
  Fluids, aerodynamic design
  Dispersal of pollutants in atmosphere

Page 11

Data Intensive Biology and Medicine

Medical data
  X-Ray, mammography data, etc. (many petabytes)
  Radiation Oncology (real-time display of 3-D images)

X-ray crystallography
  Bright X-Ray sources, e.g. Argonne Advanced Photon Source

Molecular genomics and related disciplines
  Human Genome, other genome databases
  Proteomics (protein structure, activities, …)
  Protein interactions, drug delivery

Brain scans (1-10m, time dependent)

Page 12

1800 Physicists, 150 Institutes, 32 Countries

Driven by LHC Computing Challenges

Complexity: Millions of individual detector channels
Scale: PetaOps (CPU), Petabytes (Data)
Distribution: Global distribution of people & resources

Page 13

CMS Experiment at LHC

“Compact” Muon Solenoid at the LHC (CERN)

[Figure: detector drawing; Smithsonian standard man shown for scale]

Page 14

LHC Data Rates: Detector to Storage

  Detector output: 40 MHz, ~1000 TB/sec
  Level 1 Trigger (special hardware) → 75 kHz, 75 GB/sec
  Level 2 Trigger (commodity CPUs) → 5 kHz, 5 GB/sec
  Level 3 Trigger (commodity CPUs, physics filtering) → 100 Hz, 100-1500 MB/sec
  Raw data to storage
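Dividing each bandwidth above by its event rate gives the implied per-event data size and the rejection factor at each stage; a rough sketch (Python, numbers taken from this slide, sizes approximate):

```python
# Per-event sizes and rejection factors implied by the trigger chain above
# (rates and bandwidths from the slide; size = bandwidth / rate).
stages = [
    # (name,            rate_hz, bandwidth_bytes_per_s)
    ("Detector output", 40e6,    1000e12),   # 40 MHz, ~1000 TB/s
    ("After Level 1",   75e3,    75e9),      # 75 kHz, 75 GB/s
    ("After Level 2",   5e3,     5e9),       # 5 kHz, 5 GB/s
    ("After Level 3",   100.0,   1500e6),    # 100 Hz, up to ~1500 MB/s
]

prev_rate = None
for name, rate, bw in stages:
    size_mb = bw / rate / 1e6
    rejection = f"(x{prev_rate / rate:,.0f} rejection)" if prev_rate else ""
    print(f"{name:16s} ~{size_mb:5.1f} MB/event {rejection}")
    prev_rate = rate
```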

Page 15

All charged tracks with pt > 2 GeV

Reconstructed tracks with pt > 25 GeV

(+30 minimum bias events)

10^9 events/sec, selectivity: 1 in 10^13

LHC: Higgs Decay into 4 muons
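The selectivity quoted above corresponds to a tiny absolute rate of selected events; a one-line check (Python):

```python
# At 10^9 collisions/sec with a selectivity of 1 in 10^13 (figures above),
# how often does a selected event of this type occur?
collision_rate = 1e9     # events per second
selectivity = 1e-13      # fraction of events selected

selected_per_sec = collision_rate * selectivity
print(f"~{selected_per_sec:.0e} selected events/sec "
      f"= about {selected_per_sec * 86400:.1f} per day")
```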

Page 16

CMS Experiment

Hierarchy of LHC Data Grid Resources

[Diagram: tiered distribution of CMS data and computing]
  Tier 0: Online System → CERN Computer Center (> 20 TIPS), fed at 100-1500 MBytes/s
  Tier 1: national centers (USA, UK, Russia, Korea, …)
  Tier 2: regional Tier2 Centers
  Tier 3: institute servers
  Tier 4: physics caches on individual PCs
  Inter-tier links: 10-40 Gbps, 2.5-10 Gbps, 1-10 Gbps, 1-2.5 Gbps
  Tier0 / (Σ Tier1) / (Σ Tier2) ~ 1:1:1

~10s of Petabytes by 2007-8
~1000 Petabytes in 5-7 years
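Given the inter-tier link speeds above, moving even a single petabyte is a multi-day operation; a rough estimate (Python, assuming fully utilized links at the quoted speeds):

```python
# Time to move 1 Petabyte over the tier-to-tier link speeds quoted above.
petabyte_bits = 1e15 * 8           # 1 PB expressed in bits

for gbps in (1, 2.5, 10, 40):      # representative link speeds from the slide
    seconds = petabyte_bits / (gbps * 1e9)
    print(f"{gbps:4.1f} Gbps: {seconds / 86400:6.1f} days per PB "
          f"(sustained, fully utilized link)")
```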

Page 17

Digital Astronomy

Future dominated by detector improvements
  Moore’s Law growth in CCDs
  Gigapixel arrays on horizon
  Growth in CPU/storage tracking data volumes

[Chart: glass vs. MPixels over time]
  Total area of 3m+ telescopes in the world, in m²
  Total number of CCD pixels, in Mpixels
  25-year growth: 30x in glass, 3000x in pixels
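The 25-year factors above translate into very different annual growth rates for glass and for pixels; a quick conversion (Python):

```python
# Convert the 25-year growth factors quoted above into annual growth rates.
import math

years = 25
for name, factor in (("glass (telescope area)", 30), ("CCD pixels", 3000)):
    annual = factor ** (1 / years)
    doubling = math.log(2) / math.log(annual)
    print(f"{name:22s} {factor:5d}x in {years} yr -> "
          f"{100 * (annual - 1):4.1f}%/yr, doubling every {doubling:.1f} yr")
```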

Page 18

The Age of Astronomical Mega-Surveys

Next generation mega-surveys will change astronomy

  Large sky coverage
  Sound statistical plans, uniform systematics

The technology to store and access the data is here

Following Moore’s law

Integrating these archives for the whole community

Astronomical data mining will lead to stunning new discoveries

“Virtual Observatory” (next slides)

Page 19

Virtual Observatories

[Diagram: Virtual Observatory components]
  Source Catalogs, Image Data
  Specialized Data: Spectroscopy, Time Series, Polarization
  Information Archives: derived & legacy data (NED, Simbad, ADS, etc.)
  Discovery Tools: Visualization, Statistics
  Standards
  Multi-wavelength astronomy, multiple surveys

Page 20

Virtual Observatory Data Challenge

Digital representation of the sky
  All-sky + deep fields
  Integrated catalog and image databases
  Spectra of selected samples

Size of the archived data
  40,000 square degrees
  Resolution < 0.1 arcsec → > 50 trillion pixels
  One band (2 bytes/pixel): 100 Terabytes
  Multi-wavelength: 500-1000 Terabytes
  Time dimension: Many Petabytes

Large, globally distributed database engines
  Multi-Petabyte data size, distributed widely
  Thousands of queries per day, GByte/s I/O speed per site
  Data Grid computing infrastructure
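The pixel count and single-band volume above follow directly from the quoted sky area and resolution; a quick check (Python; the 5-10 band range is simply the multi-wavelength figure divided by the single-band figure):

```python
# Archive size implied by the figures above:
# 40,000 deg^2 at ~0.1 arcsec/pixel, 2 bytes per pixel per band.
sky_deg2 = 40_000
arcsec_per_deg = 3600
pixel_arcsec = 0.1
bytes_per_pixel = 2

pixels = sky_deg2 * (arcsec_per_deg / pixel_arcsec) ** 2
one_band_tb = pixels * bytes_per_pixel / 1e12
print(f"{pixels:.1e} pixels (~{pixels / 1e12:.0f} trillion)")
print(f"one band: ~{one_band_tb:.0f} TB; "
      f"5-10 bands: ~{5 * one_band_tb:.0f}-{10 * one_band_tb:.0f} TB")
```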

Page 21

Sloan Sky Survey Data Grid

Page 22

International Grid/Networking Projects

US, EU, E. Europe, Asia, S. America, …

Page 23

Global Context: Data Grid Projects

U.S. projects
  Particle Physics Data Grid (PPDG) [DOE]
  GriPhyN [NSF]
  International Virtual Data Grid Laboratory (iVDGL) [NSF]
  TeraGrid [NSF]
  DOE Science Grid [DOE]
  NSF Middleware Initiative (NMI) [NSF]

EU, Asia major projects
  European Data Grid (EU, EC)
  LHC Computing Grid (LCG) (CERN)
  EU national projects (UK, Italy, France, …)
  CrossGrid (EU, EC)
  DataTAG (EU, EC)
  Japanese project
  Korea project

Page 24

Particle Physics Data Grid

Funded 2001-2004 @ US$9.5M (DOE)
Driven by HENP experiments: D0, BaBar, STAR, CMS, ATLAS

Page 25

PPDG Goals

Serve high energy & nuclear physics (HENP) experiments
  Unique challenges, diverse test environments

Develop advanced Grid technologies
  Focus on end-to-end integration

Maintain practical orientation
  Networks, instrumentation, monitoring
  DB file/object replication, caching, catalogs, end-to-end movement

Make tools general enough for a wide community
  Collaboration with GriPhyN, iVDGL, EDG, LCG
  ESNet Certificate Authority work, security

Page 26

GriPhyN and iVDGL

Both funded through NSF ITR program
  GriPhyN: $11.9M (NSF) + $1.6M (matching) (2000-2005)
  iVDGL: $13.7M (NSF) + $2M (matching) (2001-2006)

Basic composition
  GriPhyN: 12 funded universities, SDSC, 3 labs (~80 people)
  iVDGL: 16 funded institutions, SDSC, 3 labs (~80 people)
  Experiments: US-CMS, US-ATLAS, LIGO, SDSS/NVO
  Large overlap of people, institutions, management

Grid research vs. Grid deployment
  GriPhyN: CS research, Virtual Data Toolkit (VDT) development
  iVDGL: Grid laboratory deployment
  4 physics experiments provide frontier challenges
  VDT in common

Page 27

GriPhyN Computer Science Challenges

Virtual data (more later)
  Data + programs (content) + programs (executions)
  Representation, discovery, & manipulation of workflows and associated data & programs

Planning (see the sketch after this list)
  Mapping workflows in an efficient, policy-aware manner to distributed resources

Execution
  Executing workflows, including data movements, reliably and efficiently

Performance
  Monitoring system performance for scheduling & troubleshooting
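To make the planning and execution items above concrete, here is a minimal sketch (Python; illustrative only, not GriPhyN or Chimera code) that represents a workflow as a DAG of steps and orders it for execution:

```python
# Minimal illustration of workflow planning: represent a workflow as a DAG
# of steps with data dependencies, then order it for execution.
from graphlib import TopologicalSorter   # Python 3.9+

# Each step lists the steps whose outputs it consumes (hypothetical names).
workflow = {
    "simulate":    set(),
    "reconstruct": {"simulate"},
    "calibrate":   {"reconstruct"},
    "analyze":     {"reconstruct", "calibrate"},
}

# A trivial "planner": run steps in an order that respects dependencies.
# A real Grid planner would also weigh site policies, load, and data location.
for step in TopologicalSorter(workflow).static_order():
    print(f"submit {step} to an available Grid site")
```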

Page 28

Goal: “PetaScale” Virtual-Data Grids

[Diagram: virtual data grid architecture]
  Users: single researcher, workgroups, production team (via interactive user tools)
  Virtual Data Tools; Request Planning & Scheduling Tools; Request Execution & Management Tools
  Supporting services: Resource Management Services, Security and Policy Services, Other Grid Services
  Transforms; raw data source
  Distributed resources (code, storage, CPUs, networks)
  Scale: PetaOps, Petabytes, Performance

Page 29

GriPhyN/iVDGL Science Drivers

US-CMS & US-ATLAS
  HEP experiments at LHC/CERN
  100s of Petabytes

LIGO
  Gravity wave experiment
  100s of Terabytes

Sloan Digital Sky Survey
  Digital astronomy (1/4 sky)
  10s of Terabytes

[Chart: data growth and community growth, 2001 → 2002 → 2007]

Massive CPU; large, distributed datasets; large, distributed communities

Page 30

Virtual Data: Derivation and Provenance

Most scientific data are not simple “measurements”
  They are computationally corrected/reconstructed
  They can be produced by numerical simulation

Science & eng. projects are more CPU and data intensive

Programs are significant community resources (transformations)

So are the executions of those programs (derivations)

Management of dataset transformations important!
  Derivation: Instantiation of a potential data product
  Provenance: Exact history of any existing data product

We already do this, but manually!
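As a purely illustrative rendering of the transformation/derivation/provenance distinction above (not the Chimera virtual data language; all names are hypothetical), the sketch below records which program and inputs produced a data product so that it can be audited or re-derived:

```python
# Illustrative provenance record: a derivation is a recorded execution of a
# transformation on specific inputs, so the product can be audited or rebuilt.
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformation:
    name: str       # registered program, e.g. "reconstruct"
    version: str    # exact code version used


@dataclass(frozen=True)
class Derivation:
    transformation: Transformation
    inputs: tuple      # logical names of input datasets
    parameters: tuple  # (key, value) pairs used in this execution
    output: str        # logical name of the produced dataset

    def provenance(self) -> str:
        """Exact recipe for the output: what ran, on what, with what settings."""
        params = ", ".join(f"{k}={v}" for k, v in self.parameters)
        return (f"{self.output} = {self.transformation.name}-"
                f"{self.transformation.version}({', '.join(self.inputs)}; {params})")


if __name__ == "__main__":
    reco = Transformation("reconstruct", "v3.2")
    d = Derivation(reco, inputs=("raw_run_1001",),
                   parameters=(("calibration", "2003-04"),),
                   output="reco_run_1001")
    print(d.provenance())
    # reco_run_1001 = reconstruct-v3.2(raw_run_1001; calibration=2003-04)
```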

Page 31

[Diagram: virtual data schema linking Transformation, Derivation, and Data through “product-of”, “execution-of”, and “consumed-by/generated-by” relations]

“I’ve detected a muon calibration error and want to know which derived data products need to be recomputed.”

“I’ve found some interesting data, but I need to know exactly what corrections were applied before I can trust it.”

“I want to search a database for 3 muon SUSY events. If a program that does this analysis exists, I won’t have to write one from scratch.”

“I want to apply a forward jet analysis to 100M events. If the results already exist, I’ll save weeks of computation.”

Virtual Data Motivations (1)

Page 32

Virtual Data Motivations (2)

Data track-ability and result audit-ability
  Universally sought by scientific applications

Facilitates tool and data sharing and collaboration
  Data can be sent along with its recipe

Repair and correction of data
  Rebuild data products, cf. “make”

Workflow management
  Organizing, locating, specifying, and requesting data products

Performance optimizations
  Ability to re-create data rather than move it

Manual / error-prone → Automated / robust

Page 33

“Chimera” Virtual Data System

Virtual Data API
  A Java class hierarchy to represent transformations & derivations

Virtual Data Language
  Textual for people & illustrative examples
  XML for machine-to-machine interfaces

Virtual Data Database
  Makes the objects of a virtual data definition persistent

Virtual Data Service (future)
  Provides a service interface (e.g., OGSA) to persistent objects

Version 1.0 available
  To be put into VDT 1.1.7

Page 34

Chimera Application: SDSS Analysis

Chimera Virtual Data System
  + GriPhyN Virtual Data Toolkit
  + iVDGL Data Grid (many CPUs)

Galaxy cluster data

[Plot: cluster size distribution; Number of Clusters (1-100,000) vs. Number of Galaxies (1-100)]

Page 35

Virtual Data and LHC Computing

US-CMS
  Chimera prototype tested with CMS MC (~200K events)
  Currently integrating Chimera into standard CMS production tools
  Integrating virtual data into Grid-enabled analysis tools

US-ATLAS
  Integrating Chimera into ATLAS software

HEPCAL document includes first virtual data use cases
  Very basic cases, need elaboration
  Discuss with LHC experiments: requirements, scope, technologies

New ITR proposal to NSF ($15M)
  Dynamic Workspaces for Scientific Analysis Communities

Continued progress requires collaboration with CS groups
  Distributed scheduling, workflow optimization, …
  Need collaboration with CS to develop robust tools

Page 36

iVDGL Goals and Context

International Virtual-Data Grid Laboratory
  A global Grid laboratory (US, EU, E. Europe, Asia, S. America, …)
  A place to conduct Data Grid tests “at scale”
  A mechanism to create common Grid infrastructure
  A laboratory for other disciplines to perform Data Grid tests
  A focus of outreach efforts to small institutions

Context of iVDGL in US-LHC computing program
  Develop and operate proto-Tier2 centers
  Learn how to do Grid operations (GOC)

International participation
  DataTag
  UK e-Science programme: supports 6 CS Fellows per year in U.S.

Page 37

US-iVDGL Sites (Spring 2003)

[Map: Tier1, Tier2, and Tier3 sites across the U.S.]

Sites: UF, Wisconsin, BNL, Indiana, Boston U, SKC, Brownsville, Hampton, PSU, J. Hopkins, Caltech, FIU, FSU, Arlington, Michigan, LBL, Oklahoma, Argonne, Vanderbilt, UCSD/SDSC, NCSA, Fermilab

Partners? EU, CERN, Brazil, Australia, Korea, Japan

Page 38

US-CMS Grid Testbed

[Map: US-CMS Grid Testbed sites]
UCSD, Florida, Wisconsin, Caltech, Fermilab, FIU, FSU, Brazil, Korea, CERN

Page 39

Production run for Monte Carlo data production
  Assigned 1.5 million events for “eGamma Bigjets”
  ~500 sec per event on a 750 MHz processor; all production stages from simulation to ntuple
  2 months continuous running across 5 testbed sites
  Demonstrated at Supercomputing 2002

US-CMS Testbed Success Story
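The production figures above imply a substantial amount of sustained computing; a quick estimate (Python, using only the numbers on this slide and approximating 2 months as 60 days):

```python
# CPU needed for the production run described above:
# 1.5 million events at ~500 sec/event, completed in ~2 months of running.
events = 1_500_000
sec_per_event = 500
wall_seconds = 60 * 86400        # ~2 months of continuous running

cpu_seconds = events * sec_per_event
print(f"total work: ~{cpu_seconds / 3.15e7:.0f} CPU-years "
      f"(on the 750 MHz processors quoted)")
print(f"average concurrency: ~{cpu_seconds / wall_seconds:.0f} CPUs "
      f"across the 5 testbed sites")
```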

Page 40

Creation of WorldGrid

Joint iVDGL/DataTag/EDG effort
  Resources from both sides (15 sites)
  Monitoring tools (Ganglia, MDS, NetSaint, …)
  Visualization tools (Nagios, MapCenter, Ganglia)

Applications: ScienceGrid
  CMS: CMKIN, CMSIM
  ATLAS: ATLSIM

Submit jobs from US or EU
  Jobs can run on any cluster

Demonstrated at IST2002 (Copenhagen)
Demonstrated at SC2002 (Baltimore)

Page 41

WorldGrid Sites

Page 42

Grid Coordination

Page 43

U.S. Project Coordination: Trillium

Trillium = GriPhyN + iVDGL + PPDG
  Large overlap in leadership, people, experiments
  Driven primarily by HENP, particularly LHC experiments

Benefits of coordination
  Common software base + packaging: VDT + PACMAN
  Collaborative / joint projects: monitoring, demos, security, …
  Wide deployment of new technologies, e.g. Virtual Data
  Stronger, broader outreach effort

Forum for US Grid projects
  Joint view, strategies, meetings and work
  Unified entity to deal with EU & other Grid projects

Page 44

International Grid Coordination

Global Grid Forum (GGF)
  International forum for general Grid efforts
  Many working groups, standards definitions

Close collaboration with EU DataGrid (EDG)
  Many connections with EDG activities

HICB: HEP Inter-Grid Coordination Board
  Non-competitive forum, strategic issues, consensus
  Cross-project policies, procedures and technology, joint projects

HICB-JTB Joint Technical Board
  Definition, oversight and tracking of joint projects
  GLUE interoperability group

Participation in LHC Computing Grid (LCG)
  Software Computing Committee (SC2)
  Project Execution Board (PEB)
  Grid Deployment Board (GDB)

Page 45

HEP and International Grid Projects

HEP continues to be the strongest science driver
  (In collaboration with computer scientists)
  Many national and international initiatives
  LHC a particularly strong driving function

US-HEP committed to working with international partners
  Many networking initiatives with EU colleagues
  Collaboration on LHC Grid Project

Grid projects driving & linked to network developments
  DataTag, SCIC, US-CERN link, Internet2

New partners being actively sought
  Korea, Russia, China, Japan, Brazil, Romania, …
  Participate in US-CMS and US-ATLAS Grid testbeds
  Link to WorldGrid, once some software is fixed

Page 46

New Grid Efforts

Page 47

An Inter-Regional Center for High Energy Physics Research and Educational Outreach (CHEPREO) at Florida International University
  E/O Center in Miami area
  iVDGL Grid Activities
  CMS Research
  AMPATH network (S. America)
  Int’l Activities (Brazil, etc.)

Status:
  Proposal submitted Dec. 2002
  Presented to NSF review panel
  Project Execution Plan submitted
  Funding in June?

Page 48

A Global Grid Enabled Collaboratory for Scientific Research (GECSR)

$4M ITR proposal from
  Caltech (HN PI, JB CoPI)
  Michigan (CoPI, CoPI)
  Maryland (CoPI)

Plus senior personnel from
  Lawrence Berkeley Lab, Oklahoma, Fermilab, Arlington (U. Texas), Iowa, Florida State

First Grid-enabled Collaboratory
  Tight integration between
    Science of Collaboratories
    Globally scalable work environment
    Sophisticated collaborative tools (VRVS, VNC; next-gen)
    Agent-based monitoring & decision support system (MonALISA)

Initial targets are the global HENP collaborations, but GECSR is expected to be widely applicable to other large-scale collaborative scientific endeavors

“Giving scientists from all world regions the means to function as full partners in the process of search and discovery”

Page 49

Large ITR Proposal: $15M

Dynamic Workspaces: Enabling Global Analysis Communities

Page 50

UltraLight Proposal to NSF

10 Gb/s+ network
  Caltech, UF, FIU, UM, MIT
  SLAC, FNAL
  Int’l partners
  Cisco

Applications
  HEP
  VLBI
  Radiation Oncology
  Grid Projects

Page 51

GLORIAD

New 10 Gb/s network linking US-Russia-China
  Plus Grid component linking science projects
  H. Newman, P. Avery participating

Meeting at NSF April 14 with US-Russia-China reps
  HEP people (Hesheng, et al.)
  Broad agreement that HEP can drive the Grid portion
  More meetings planned

Page 52

Summary

Progress on many fronts in PPDG/GriPhyN/iVDGL
  Packaging: Pacman + VDT
  Testbeds (development and production)
  Major demonstration projects
  Productions based on Grid tools using iVDGL resources

WorldGrid providing excellent experience
  Excellent collaboration with EU partners
  Building links to our Asian and other partners
  Excellent opportunity to build lasting infrastructure

Looking to collaborate with more international partners
  Testbeds, monitoring, deploying VDT more widely

New directions
  Virtual data a powerful paradigm for LHC computing
  Emphasis on Grid-enabled analysis

Page 53

Grid References

  Grid Book: www.mkp.com/grids
  Globus: www.globus.org
  Global Grid Forum: www.gridforum.org
  PPDG: www.ppdg.net
  GriPhyN: www.griphyn.org
  iVDGL: www.ivdgl.org
  TeraGrid: www.teragrid.org
  EU DataGrid: www.eu-datagrid.org