EDG Application The European DataGrid Project Team http://www.eu-datagrid.org

Jan 11, 2016

Kathlyn Brooks
Page 1: EDG Application The European DataGrid Project Team .

EDG Application

The European DataGrid Project Team

http://www.eu-datagrid.org

Page 2: EDG Application The European DataGrid Project Team .

EDG Applications Tutorial – n° 2

EDG Application Areas

High Energy Physics

Biomedical Applications

Earth Observation Science Applications

Page 3: EDG Application The European DataGrid Project Team .

High Energy Physics

4 Experiments on LHC: CMS, ATLAS, LHCb

~6-8 PetaBytes / year

~10^8 events/year

~10^3 batch and interactive users

Page 4: EDG Application The European DataGrid Project Team .

CERN’s Network in the World

Europe: 267 institutes, 4603 users

Elsewhere: 208 institutes, 1632 users

Page 5: EDG Application The European DataGrid Project Team .

Data Flow in LHC

[Diagram: two parallel processing chains]

Real data: DAQ, Trigger, RAW Data, Reconstruction, Event Summary Data (ESD); with RAW Tags, Reconstruction Tags, Conditions / Calibration Data

Monte Carlo: Physics Generator, Generator Data, Detector Simulation, RAWmc Data, Reconstruction, Event Summary Data (ESD); with RAWmc Tags, Reconstruction Tags, Conditions / Calibration Data

Page 6: EDG Application The European DataGrid Project Team .

Example: CMS Monte Carlo Production

Page 7: EDG Application The European DataGrid Project Team .

CMS jobs description

CMKIN: MC generation of the proton-proton interaction for a physics channel (dataset)

CMSIM: Detailed simulation of the CMS detector, processing the data produced during the CMKIN step

[Diagram: the CMKIN job writes its output data to a Grid Storage Element; the CMSIM job reads from a Grid Storage Element and writes its output data to a Grid Storage Element]

        size/event    time*/event
CMKIN   ~ 0.05 MB     ~ 0.4-0.5 sec
CMSIM   ~ 1.8 MB      ~ 6 min

* PIII 1GHz 512MB 46.8 SI95
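
The per-event figures above lend themselves to a quick resource estimate. The following is a minimal sketch only: the batch size of 1000 events is a hypothetical value, the CMKIN time is taken as the midpoint of the quoted 0.4-0.5 sec range, and the function name `estimate` is made up for illustration. Times refer to the reference CPU quoted on the slide.

```python
# Per-event figures copied from the slide (reference CPU: PIII 1 GHz).
# Assumption: CMKIN time is the midpoint of the quoted 0.4-0.5 s range.
PER_EVENT = {
    "CMKIN": {"size_mb": 0.05, "time_s": 0.45},
    "CMSIM": {"size_mb": 1.8, "time_s": 6 * 60},
}


def estimate(step, n_events):
    """Return output volume (GB, 1 GB = 1000 MB) and CPU time (hours) for one step."""
    figures = PER_EVENT[step]
    return {
        "output_gb": figures["size_mb"] * n_events / 1000.0,
        "cpu_hours": figures["time_s"] * n_events / 3600.0,
    }


# Hypothetical batch size, not taken from the slides.
N_EVENTS = 1000
for step_name in PER_EVENT:
    print(step_name, estimate(step_name, N_EVENTS))
```

The estimate shows why the two steps are handled differently: CMSIM dominates both the CPU time and the data volume by a wide margin.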

Page 8: EDG Application The European DataGrid Project Team .

CMS production components interfaced to EDG middleware

[Diagram. CMS side: RefDB (parameters), BOSS DB, UI with IMPALA/BOSS. EDG side: Workload Management System (JDL), CEs with CMS software, SEs. Arrows: push data or info; pull info]

Production is managed from the EDG User Interface with IMPALA/BOSS

CMS Virtual Organization server at NIKHEF (Amsterdam)

Page 9: EDG Application The European DataGrid Project Team .

CMS production components interfaced to EDG middleware

[Diagram: as on the previous page, with a Replica Manager, data registration, input data location, and a Worker Node (WN) added]

CMKIN jobs running on all EDG Testbed sites with CMS software installed

CMSIM jobs running on CE close to the input data

Produced data: scripts for batch replication to a dedicated SE
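
The two placement rules on this page (CMKIN on any site with CMS software, CMSIM on a CE close to the input data) can be sketched as a simple filter. This is an illustration under stated assumptions, not the EDG Workload Management System's matchmaking: the class `ComputingElement`, the function `eligible_ces`, the software tag "CMS" and the site names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ComputingElement:
    name: str
    software: set = field(default_factory=set)   # installed software tags
    close_ses: set = field(default_factory=set)  # SEs considered "close" to this CE


def eligible_ces(ces, required_software, input_se=None):
    """Names of CEs that have the software and, if given, are close to the input SE."""
    matches = []
    for ce in ces:
        if required_software not in ce.software:
            continue  # software not installed at this site
        if input_se is not None and input_se not in ce.close_ses:
            continue  # input data is not close to this CE
        matches.append(ce.name)
    return matches


# Hypothetical testbed description.
testbed = [
    ComputingElement("ce-a", {"CMS"}, {"se-a"}),
    ComputingElement("ce-b", {"CMS"}, {"se-b"}),
    ComputingElement("ce-c", set(), {"se-a"}),
]
print(eligible_ces(testbed, "CMS"))                   # CMKIN-style job: no input data
print(eligible_ces(testbed, "CMS", input_se="se-a"))  # CMSIM-style job: input on se-a
```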

Page 10: EDG Application The European DataGrid Project Team .

CMS production components interfaced to EDG middleware

[Diagram: as on the previous page, with job output filtering and runtime monitoring added]

Job monitoring and bookkeeping: BOSS DBs, EDG Logging & Bookkeeping service

Page 11: EDG Application The European DataGrid Project Team .

CMS use of the system (Statistics)

[Plot: number of events versus time; labels: CEs, SEs]

Events production within EDG is part of the official CMS production

http://cmsdoc.cern.ch/cms/production/www/html/general/index.html

Page 12: EDG Application The European DataGrid Project Team .

Summary of CMS work and the planning for use of EDG middleware

RESULTS

We can distribute and run CMS s/w in the EDG environment

We have generated ~250K events for physics with ~10000 jobs in a 3-week period

OBSERVATIONS and PLANNING for the future

We were able to quickly add new sites to provide extra resources

There was a fast turnaround in bug fixing and installing new software

The stress test was labor intensive (since software was developing and th

Release EDG 2.0 should fix the major problems and allow for enhanced scalability, and we look forward to evaluating it and using it in our Data Challenge work
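
The production figures quoted under RESULTS (~250K events, ~10000 jobs, 3 weeks) imply some average rates. The arithmetic below is a rough sketch that treats the approximate figures as exact and assumes an even spread over 21 days, which the slides do not state.

```python
# Figures quoted on the slide (all approximate).
EVENTS = 250_000
JOBS = 10_000
DAYS = 3 * 7  # "3 week period", assumed to mean 21 calendar days


def production_rates(events, jobs, days):
    """Average rates, assuming production was spread evenly over the period."""
    return {
        "events_per_job": events / jobs,
        "jobs_per_day": jobs / days,
        "events_per_day": events / days,
    }


rates = production_rates(EVENTS, JOBS, DAYS)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```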

Page 13: EDG Application The European DataGrid Project Team .

EDG EO challenge: Processing / validation of 1y of GOME data

Level 1: Raw satellite data from the GOME instrument (~75 GB, ~5000 orbits/y)

ESA (IT), KNMI (NL): Processing of raw GOME data to ozone profiles. 2 alternative algorithms, ~28000 profiles/day

Level 2 (example of 1 day total O3)

IPSL (FR): Validate some of the GOME ozone profiles (~10^6/y) coincident in space and time with ground-based measurements

LIDAR data (7 stations, 2.5 MB per month)

Visualization & analysis

DataGrid environment

Page 14: EDG Application The European DataGrid Project Team .

EO WebMap Portal

Page 15: EDG Application The European DataGrid Project Team .

[Diagram components: Web Portal, EO Product Catalogue, EO Product Archive, EO Grid Engine, EO Replica Catalogue, EDG User Interface, EDG Resource Broker, EDG Computing Element, EDG Storage Element]

Processing Sequence

1. Search Level-1 catalogue

2. Retrieve Level-2 products

3. Level-2 products already registered in RC?

Yes? 4. Return available Level-2 products

No? 5. Perform GRID processing on-the-fly

6. Transfer Level-1 data from Archive to the Grid

7. Register Level-1 data

8. Submit jobs to process Level-1 data

9. Process Level-1 data

10. Transfer Level-2 data to SE

11. Register Level-2 data

12. Return new Level-2 products
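
The processing sequence is essentially a cache lookup with on-the-fly processing as the fallback. The sketch below mirrors the numbered steps using plain dictionaries as stand-ins for the catalogue, the Replica Catalogue and the archive. The function name, the "l1/" and "l2/" naming scheme and the `process` callback are hypothetical; job submission and file transfer are reduced to direct calls.

```python
def get_level2_products(query, l1_catalogue, replica_catalogue, archive, process):
    """Return Level-2 products for a query, processing Level-1 data only when needed."""
    results = []
    for l1_id in l1_catalogue.get(query, []):        # step 1: search Level-1 catalogue
        l2_name = "l2/" + l1_id                      # hypothetical naming scheme
        if l2_name not in replica_catalogue:         # step 3: already registered in RC?
            l1_data = archive[l1_id]                 # step 6: fetch Level-1 from archive
            replica_catalogue["l1/" + l1_id] = l1_data    # step 7: register Level-1
            l2_data = process(l1_data)               # steps 8-9: submit and run the job
            replica_catalogue[l2_name] = l2_data     # steps 10-11: store and register
        results.append(replica_catalogue[l2_name])   # steps 4 / 12: return products
    return results


# Stand-in data: one orbit already processed, one not.
catalogue = {"2000-10-02": ["orbit-1", "orbit-2"]}
rc = {"l2/orbit-1": "PROFILES-1"}
raw_archive = {"orbit-1": "raw-1", "orbit-2": "raw-2"}
print(get_level2_products("2000-10-02", catalogue, rc, raw_archive, str.upper))
```

A second call with the same query would find both products registered and skip processing entirely, which is the point of step 3.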

Page 16: EDG Application The European DataGrid Project Team .

GOME Ozone Profile Validation

Goals of the DataGrid application: validate satellite data with all ground-based data available in an easy way:

Comparison of ozone profiles provided by satellite with lidar data in different locations and times (see the web portal)

Statistical comparison and analysis in order to improve algorithms.

[Figure: ERS/GOME satellite and the lidar at the Haute Provence Observatory; labels: OZONE LAYER, 50 km, 10 km]

Page 17: EDG Application The European DataGrid Project Team .

Validation Processing Sequence

[Diagram: GRID Portal at the satellite data validation lidar site; Level 2 Catalogue; Lidar data catalogue; GRID Computing Element; Storage Elements with GOME L2 data; Storage Elements with lidar data; steps numbered 1 to 4]

Queries and data information retrieval from the lidar metadata catalogue

Queries and data information retrieval from the GOME Level 2 orbit or pixel metadata catalogues

Submission of the job in the GRID

When completed, comparison between lidar and satellite ozone profiles
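
Selecting satellite profiles that are coincident in space and time with lidar measurements can be sketched as a join over the two metadata catalogues. This is a minimal illustration: the 500 km and 12 hour windows, the record layout, and the function names are assumptions, not values from the slides, and the station coordinates are approximate.

```python
import math
from datetime import datetime, timedelta


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km on a spherical Earth of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def coincident(sat_records, lidar_records, max_km=500.0, max_dt=timedelta(hours=12)):
    """Pairs (satellite id, lidar id) that fall within the distance and time windows."""
    pairs = []
    for s in sat_records:
        for m in lidar_records:
            close_in_time = abs(s["time"] - m["time"]) <= max_dt
            close_in_space = great_circle_km(s["lat"], s["lon"], m["lat"], m["lon"]) <= max_km
            if close_in_time and close_in_space:
                pairs.append((s["id"], m["id"]))
    return pairs


# Hypothetical catalogue entries; the lidar position is roughly that of Haute Provence.
lidar = [{"id": "ohp-1", "lat": 43.93, "lon": 5.71, "time": datetime(2000, 10, 2, 22, 0)}]
satellite = [
    {"id": "p1", "lat": 44.5, "lon": 6.0, "time": datetime(2000, 10, 2, 10, 30)},
    {"id": "p2", "lat": 60.0, "lon": 20.0, "time": datetime(2000, 10, 2, 10, 35)},
]
print(coincident(satellite, lidar))
```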

Page 18: EDG Application The European DataGrid Project Team .

Validation Output

Figure 1: Estimation of the bias between GOME and lidar using one month of data.

Figure 2: Example of 2 profiles: comparison between the GOME profile and the lidar profile for 2 October 2000.
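
One common way to express a bias between two sets of paired measurements is the mean relative difference. The slides do not say how the bias in Figure 1 was computed, so the definition below is an assumption, and the sample values are made up purely to exercise the function.

```python
def mean_relative_bias(gome_values, lidar_values):
    """Mean of (gome - lidar) / lidar over paired values, in percent.

    Assumed definition for illustration; the lidar value is taken as the reference.
    """
    if len(gome_values) != len(lidar_values) or not gome_values:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum((g - ref) / ref for g, ref in zip(gome_values, lidar_values))
    return 100.0 * total / len(gome_values)


# Made-up ozone values at three altitudes (arbitrary units), not real data.
gome = [110.0, 90.0, 105.0]
lidar = [100.0, 100.0, 100.0]
print(f"bias: {mean_relative_bias(gome, lidar):.2f} %")
```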

Page 19: EDG Application The European DataGrid Project Team .

Perspectives for Biomedical Applications

Grids open new perspectives in large scale genomics analysis

Complete genome annotation

Cross-genomes analysis

Data mining on distributed databases

Pipelining of huge automatic bio-informatics analysis

Medical image processing

Large databases processing

Anatomy and physiology modeling

Epidemiological studies

Page 20: EDG Application The European DataGrid Project Team .

Biomedical Applications

Bio-informatics

Phylogenetics: BBE Lyon (T. Sylvestre)

Search for primers: Centrale Paris (K. Kurata)

Statistical genetics: CNG Evry (N. Margetic)

Bio-informatics web portal: IBCP (C. Blanchet)

Parasitology: LBP Clermont, Univ B. Pascal (N. Jacq)

Data-mining on DNA chips: Karolinska (R. Médina, R. Martinez)

Geometrical protein comparison: Univ. Padova (C. Ferrari)

Medical imaging

MR image simulation: CREATIS (H. Benoit-Cattin)

Medical data and metadata management: CREATIS (J. Montagnat)

Mammographies analysis: ERIC/Lyon 2 (S. Miguet, T. Tweed)

Simulation platform for PET/SPECT based on Geant4: GATE collaboration (L. Maigne)

Legend: Applications deployed; Applications tested on EDG; Applications under preparation

Page 21: EDG Application The European DataGrid Project Team .

Medical Imaging

[Diagram: medical images and metadata (LFN, image, patient, hospital, ...)]

1. query

2. visualisation

3. similarity search

4. scores

5. best results visualisation

Page 22: EDG Application The European DataGrid Project Team .

Graphic layer

Job Monitoring

Grid File Browsing

File registration and retrieval

Page 23: EDG Application The European DataGrid Project Team .

Graphical Interfaces

Image registration: local files, Grid files, metadata

Image retrieval: query over metadata, query result

Page 24: EDG Application The European DataGrid Project Team .

Image Registration

LFN image patient hospital ...

Imager

SE

Page 25: EDG Application The European DataGrid Project Team .

Similarity search

Similarity computation

Results visualization

Job monitoring; ranked list of images

Source image; most similar images; low score images
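
The similarity search amounts to scoring every candidate image against the source image and sorting by score. The sketch below uses a score derived from the mean squared pixel difference, which is an assumption chosen for simplicity; the slides do not name the similarity measure. The LFN strings and pixel values are hypothetical, and images are flattened to equal-length lists.

```python
def similarity(source, candidate):
    """Score in (0, 1]; 1.0 means identical. Assumed metric: 1 / (1 + mean squared error)."""
    mse = sum((a - b) ** 2 for a, b in zip(source, candidate)) / len(source)
    return 1.0 / (1.0 + mse)


def rank_images(source, candidates):
    """Return (lfn, score) pairs, most similar first."""
    scores = {lfn: similarity(source, pixels) for lfn, pixels in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Tiny made-up "images" of four pixels each.
source_image = [0, 10, 20, 30]
grid_images = {
    "lfn:img-a": [0, 10, 20, 30],
    "lfn:img-b": [1, 11, 21, 31],
    "lfn:img-c": [30, 20, 10, 0],
}
for lfn, score in rank_images(source_image, grid_images):
    print(lfn, round(score, 4))
```

Each call to `similarity` is independent of the others, which is what makes it natural to run one comparison per grid job and collect the scores afterwards.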

Page 26: EDG Application The European DataGrid Project Team .

Future: Interfacing medical data with the Grid

[Diagram labels]

Medical (trusted) site: Imager; Medical server (core, Client 1 interface, Client 2 interface, grid-server interface, RS interface, RC interface, Metadata interface); header blanking; encryption

Grid middleware: Storage Element; Storage Element / MSS; Replica Catalog; Replication Service; Master File; Replica

File metadata: ACL, size, checksum, ...

Application metadata: ACL, encryption key, sensitive metadata, ...
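
Header blanking and the split between file metadata and sensitive application metadata can be illustrated as follows. This is a sketch under loud assumptions: the field names in `SENSITIVE_FIELDS` are hypothetical, SHA-256 is an arbitrary choice of checksum, and encryption of the image payload is deliberately left out.

```python
import hashlib

# Hypothetical names of header fields that must not leave the trusted site.
SENSITIVE_FIELDS = {"patient_name", "patient_id", "birth_date"}


def blank_header(header):
    """Split a header into a blanked public copy and the sensitive values kept apart."""
    public = {key: ("" if key in SENSITIVE_FIELDS else value) for key, value in header.items()}
    sensitive = {key: value for key, value in header.items() if key in SENSITIVE_FIELDS}
    return public, sensitive


def file_metadata(payload):
    """Size and checksum of the stored bytes (SHA-256 is an assumed choice)."""
    return {"size": len(payload), "checksum": hashlib.sha256(payload).hexdigest()}


# Made-up header and image bytes.
image_header = {"patient_name": "DOE^JANE", "modality": "MR", "hospital": "site-1"}
public_part, sensitive_part = blank_header(image_header)
print(public_part)
print(sorted(sensitive_part))
print(file_metadata(b"\x00\x01\x02\x03"))
```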

Page 27: EDG Application The European DataGrid Project Team .

Parallel Processing

Magnetic Resonance Images simulation using the grid

3 levels of parallelism:

Parallel isochromat computations

Multi-slice MRI computation

Parallel magnetization kernel

[Diagram: Virtual object, MRI sequence, Magnetisation computation kernel, Reconstruction algorithm, MRI image]
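
The multi-slice level of parallelism can be sketched with a worker pool in which each slice is computed independently. This is only an illustration of the pattern: local threads stand in for grid jobs, and `simulate_slice` is a placeholder (a plain sum), not the magnetisation computation kernel.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_slice(slice_values):
    """Placeholder for the per-slice computation; here simply the sum of the values."""
    return sum(slice_values)


def simulate_volume(slices, workers=4):
    """Compute all slices in parallel; results keep the order of the input slices."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_slice, slices))


# Made-up virtual object of three slices.
virtual_object = [[1, 2], [3, 4], [5]]
print(simulate_volume(virtual_object))
```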

Page 28: EDG Application The European DataGrid Project Team .

Summary

Use Cases

High Energy Physics

Earth Observation

Biomedical Applications

Page 29: EDG Application The European DataGrid Project Team .

Further Information

High Energy Physics

http://datagrid-wp8.web.cern.ch/DataGrid-WP8/

Bio-Informatics

http://marianne.in2p3.fr/datagrid/wp10/index.html

Earth Observation

http://styx.esrin.esa.it/grid/