
INFSO-RI-508833

Enabling Grids for E-sciencE

www.eu-egee.org

NA4/Biomed Demonstration
Medical Data Management and processing

EGEE 3rd review rehearsal, May 4th, 2006

Johan Montagnat

Tristan Glatard

Demonstration content

• Medical Data Manager
  – Interface to clinical data storage (DICOM)
  – Integrated into the gLite 1.5 middleware
  – Tackles data security and privacy needs
  – Result of the MDM TCG Working Group
• Application to the assessment of medical image registration
  – Data-intensive, workflow-based application
  – Scientific results in medical image processing, with consequences for clinical use
• Immediate scheduling of jobs submitted for the demonstration
  – Torque+MAUI configuration for efficient handling of short jobs
  – Result of the SDJ TCG Working Group

Medical Data Manager

• Objectives
  – Expose a standard grid interface (SRM) for medical image servers (DICOM)
  – Fulfill application security requirements without interfering with clinical practice

[Figure: The MDM components. DICOM clients, DICOM server, SRM-DICOM interface, gLiteIO server, Fireman File Catalog, AMGA Metadata, Hydra Key store, User Interface, Worker Node]

Interfacing sensitive medical data

• Computing interface to medical DICOM storage
  – Data are acquired from the hospital imagers in native DICOM format
  – Standard SRM interface exposed to the grid
  – DICOM slices are assembled into 3D images
• Privacy
  – Fireman provides file-level ACLs (gLite 1.5 service)
  – gLiteIO provides transparent access control (gLite 1.5 service)
  – AMGA provides secured metadata communication and ACLs (ARDA service)
  – SRM-DICOM provides on-the-fly data anonymization; it is based on the dCache implementation (SRM v1.1)
• Data protection
  – Hydra provides transparent encryption/decryption (gLite 1.5 service)

Medical Data Registration

[Figure: registration data flow between the DICOM server, the gLiteIO server, the Fireman File Catalog, the Hydra Key store and the AMGA Metadata service]

1. Image is acquired
2. Image is stored in the DICOM server
3. glite-eds-put
   3a. Image is registered (a GUID is associated)
   3b. Image key is produced and registered
4. Image metadata are registered
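The registration steps can be mimicked in a few lines of Python. This is only a sketch under loud assumptions: plain dictionaries stand in for the Fireman catalog, the Hydra key store and the AMGA metadata service, and `register_image`, the logical file name and the header fields are all hypothetical. It does not reproduce what glite-eds-put does internally.

```python
import secrets
import uuid


def register_image(lfn, dicom_headers, catalog, key_store, metadata_db):
    """Mimic steps 3a to 4 for an image already stored on the DICOM server."""
    guid = str(uuid.uuid4())                    # 3a. a GUID is associated with the file
    catalog[guid] = lfn                         #     stand-in for the file catalog entry
    key_store[guid] = secrets.token_bytes(16)   # 3b. a per-file key is produced and registered
    metadata_db[guid] = dict(dicom_headers)     # 4.  image metadata are registered
    return guid


# Dictionaries standing in for the three services (assumption).
catalog, key_store, metadata_db = {}, {}, {}
guid = register_image(
    "/biomed/demo/image001",                    # hypothetical logical file name
    {"Modality": "MR", "PatientID": "P001"},    # hypothetical header subset
    catalog, key_store, metadata_db,
)
```

The GUID is the only handle shared by the three stores, which is what lets each service enforce its own ACL independently.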

Medical Data Retrieval

[Figure: retrieval data flow between the User Interface, a Worker Node running the gLiteIO client (in-memory decryption), the gLiteIO server, the Fireman File Catalog (file ACL control), the AMGA Metadata service (metadata ACL control), the Hydra Key store (key ACL control) and the DICOM server behind its SRM-DICOM interface (anonymization & encryption)]

1. Get GUID from metadata
2. glite-eds-get
3. Get SURL from GUID
4. Request file
5. Get file key
6. On-the-fly encryption and anonymization; return encrypted file
7. Get file key and decrypt file locally
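A rough Python model of the retrieval path, with several assumptions stated up front: the XOR keystream below is a toy cipher chosen so the example runs with the standard library only (it is not the cipher used by the real services), the list of identifying header fields is invented, and only the file and key ACL checks are modeled. All function names are hypothetical.

```python
import hashlib
import json


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 keystream (illustration only)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def check_acl(acl, service, user, guid):
    """Raise if the given service's ACL does not list the user for this GUID."""
    if user not in acl[service].get(guid, set()):
        raise PermissionError(f"{service} ACL denies {user}")


def serve_file(user, guid, acl, key_store, dicom_store):
    """Server side: file ACL check, then on-the-fly anonymization and encryption."""
    check_acl(acl, "file", user, guid)
    headers, pixels = dicom_store[guid]
    # Assumed list of identifying fields to strip.
    public = {k: v for k, v in headers.items() if k not in ("PatientName", "PatientID")}
    payload = json.dumps(public, sort_keys=True).encode() + b"\n" + pixels
    return keystream_xor(payload, key_store[guid])


def fetch_and_decrypt(user, guid, acl, key_store, dicom_store):
    """Client side: key ACL check, then decryption in memory."""
    encrypted = serve_file(user, guid, acl, key_store, dicom_store)
    check_acl(acl, "key", user, guid)
    return keystream_xor(encrypted, key_store[guid])


guid = "guid-0001"  # hypothetical identifier
dicom_store = {guid: ({"PatientName": "Doe^Jane", "Modality": "MR"}, b"PIXELS")}
key_store = {guid: b"0123456789abcdef"}
acl = {"file": {guid: {"alice"}}, "key": {guid: {"alice"}}}
plain = fetch_and_decrypt("alice", guid, acl, key_store, dicom_store)
```

Because the file only ever leaves the server encrypted, a user who passes the file ACL but not the key ACL obtains nothing readable.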

Data replication and retrieval use case

[Figure: replication and retrieval data flow between the User Interface, the Worker Node, two gLiteIO clients, two gLiteIO servers, the Fireman File Catalog (file ACL control), the Hydra Key store, the DICOM server behind its SRM-DICOM interface, and any SE holding any file behind an SRM interface]

Replication:
1. Request replication
2. Get SURL
3. Get file
4. Get file key
5. (Encrypted) file replication

Retrieval:
6. Request file
7. Get SURL
8. Get file
9. Return (encrypted) file

Bronze Standard application

• Assessment of medical image registration algorithms
  – Registration is needed in many clinical procedures
  – Real clinical impact
• Interfaced to the Medical Data Manager
  – To retrieve suitable input images
• Compute intensive
  – Medical image registration algorithms: minutes to hours of computation on PCs
• Data intensive
  – Hundreds to thousands of image pairs
• Workflow-based
  – Uses the MOTEUR service-based workflow manager
  – Developed in the French ACI “Masse de données” AGIR project

No execution infrastructure: gLite 1.5 phased out

May 4, 2:30pm update: B-plan should work
  - pre-install gLite 1.5 DMS on prod.
  - use the production infrastructure for the demo

Image Registration

[Figure: images before registration and after registration]

Variability of a registration algorithm

[Figure: external and internal parameters feed the registration algorithm, which outputs the final transformation]

External parameters
• Data (image) 1
• Data (image) 2
• Acquisition noise
• Patient effects

Varying internal parameters
• Initial transformation
• (…)

Fixed internal parameters
• Multiscale resolution
• (Typical variance…)

• Robustness: ability to find the right transformation (success/failure)
• Repeatability: w.r.t. some parameters (e.g. initialization)
• Accuracy: variability w.r.t. the ground truth for typical data

Performance Evaluation without Gold Standard

• Bronze standard: the exact result is an unknown variable
• Unbiased estimation: use redundant information
  – Use many different registration algorithms (average biases, so that precision ~ accuracy)
  – Use many different data (redundant information to ensure precision)
  – Average transformations (maximal consistency)
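The averaging idea can be illustrated on translations only. The sketch below is a deliberate simplification, not the estimator used in the study: it takes the plain arithmetic mean of the translation vectors returned by each algorithm as the reference, and reports each algorithm's RMS distance to that mean. Rotations need a proper average on the rotation group and are left out. Algorithm names, pair names and values are made up.

```python
import math


def bronze_standard(estimates):
    """estimates: {algorithm: {pair: (tx, ty, tz)}} -> {pair: mean translation}."""
    pairs = next(iter(estimates.values())).keys()
    n = len(estimates)
    return {
        pair: tuple(
            sum(est[pair][axis] for est in estimates.values()) / n
            for axis in range(3)
        )
        for pair in pairs
    }


def rms_deviation(estimates, bronze):
    """Per-algorithm RMS distance to the bronze standard (same unit as the input)."""
    result = {}
    for algo, est in estimates.items():
        squared = [
            sum((a - b) ** 2 for a, b in zip(est[pair], bronze[pair]))
            for pair in bronze
        ]
        result[algo] = math.sqrt(sum(squared) / len(squared))
    return result


# Made-up translations (mm) for two hypothetical algorithms on two image pairs.
estimates = {
    "algoA": {"pair0": (1.0, 0.0, 0.0), "pair1": (0.0, 2.0, 0.0)},
    "algoB": {"pair0": (3.0, 0.0, 0.0), "pair1": (0.0, 4.0, 0.0)},
}
bronze = bronze_standard(estimates)
deviation = rms_deviation(estimates, bronze)
```

With more algorithms and more image pairs the mean becomes a steadier reference, which is why the approach needs so many registrations.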

Bronze Standard workflow

[Figure: Bronze Standard workflow diagram; recoverable labels: T i,j, rot, trans]

Data-intensive service-based applications

• Service-based approach versus task-based approach

[Figure: graph of services versus DAG of tasks. In the graph of services, input0 (4 instances) feeds Service0, Service1, Service2 and Service3. In the DAG of tasks, the Job0, Job1, Job2, Job3 graph is repeated once per data instance]
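The contrast between the two approaches can be sketched as an expansion step: one graph of services, described once, becomes a task DAG with one copy per input item. The diamond shape of the service graph and the input names below are assumptions made for the example, and `expand_to_tasks` is a hypothetical helper, not part of any workflow manager.

```python
# Service -> services consuming its output (assumed diamond-shaped graph).
SERVICE_GRAPH = {
    "Service0": ["Service1", "Service2"],
    "Service1": ["Service3"],
    "Service2": ["Service3"],
    "Service3": [],
}


def expand_to_tasks(service_graph, inputs):
    """Return the task DAG edges: one copy of the service graph per input item."""
    edges = []
    for item in inputs:
        for service, successors in service_graph.items():
            for successor in successors:
                edges.append(((service, item), (successor, item)))
    return edges


# Four hypothetical input instances.
tasks = expand_to_tasks(SERVICE_GRAPH, ["img0", "img1", "img2", "img3"])
```

In the task-based approach this expanded DAG must be written out before submission; in the service-based approach the workflow manager derives it at run time from the data it receives.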

Data composition strategies

• Data composition patterns: data-intensive applications
  – One-to-one
  – All-to-all
  – In our case: register all images of
      the same patient
      the same modality
      a different exam date

[Figure: the two patterns applied to Set 0 (I0, I1, I2) and Set 1 (J0, J1, J2)]
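In Python terms, the one-to-one pattern behaves like `zip` and the all-to-all pattern like `itertools.product`. The sketch below also shows the selection rule stated above; the record fields (`name`, `patient`, `modality`, `date`) are a hypothetical schema, and keeping only pairs with an earlier first date is a choice made here to list each pair once.

```python
from itertools import product


def one_to_one(set0, set1):
    """Pair items by position: (I0, J0), (I1, J1), ..."""
    return list(zip(set0, set1))


def all_to_all(set0, set1):
    """Pair every item of set0 with every item of set1."""
    return list(product(set0, set1))


def registration_pairs(images):
    """Same patient, same modality, different exam date (each pair listed once)."""
    return [
        (a["name"], b["name"])
        for a, b in product(images, images)
        if a["patient"] == b["patient"]
        and a["modality"] == b["modality"]
        and a["date"] < b["date"]   # ISO dates compare correctly as strings
    ]


images = [
    {"name": "img_a", "patient": "P1", "modality": "MR", "date": "2005-01-10"},
    {"name": "img_b", "patient": "P1", "modality": "MR", "date": "2005-06-02"},
    {"name": "img_c", "patient": "P1", "modality": "CT", "date": "2005-06-02"},
    {"name": "img_d", "patient": "P2", "modality": "MR", "date": "2005-03-15"},
]
pairs = registration_pairs(images)
```

All-to-all grows as the product of the set sizes, which is what turns a modest image database into hundreds or thousands of registrations.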

MOTEUR services orchestration

[Figure: the workflow manager, the EGEE User Interface and the EGEE resources. The workflow manager orchestrates Service A (Input 0, Input 1, Output 0) and Service B (Input 0, Output 0). The data items Data 0, Data 1 and Data 2 circulate between the services as image references (Img Ref 0, Img Ref 1, Img Ref 2)]

Legacy code encapsulation

[Figure: the workflow manager invokes two instances of a generic service (Input 0, Input 1, Output 0). Each instance wraps one legacy code (Legacy code 1, Legacy code 2) using an algorithm parameters description]

<description>
  <executable name="CrestLines.pl">
    <access type="URL">
      <path value="http://colors.unice.fr:80/"/>
    </access>
    <value value="CrestLines.pl"/>
    <input name="image" option="-im1">
      <access type="LFN" />
    </input>
    <input name="scale" option="-s"/>
    <output name="crest_lines" option="-c2">
      <access type="LFN" />
    </output>
    <sandbox name="convert8bits">
      <access type="URL">
        <path value="http://colors.unice.fr:80/"/>
      </access>
      <value value="Convert8bits.pl"/>
    </sandbox>
  </executable>
</description>
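To show how such a description could drive a generic wrapper, the sketch below parses the descriptor with Python's `xml.etree.ElementTree` and assembles a command line from the declared options. How the actual generic service composes its command is not stated on the slide, so `build_command`, the argument order and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET

DESCRIPTOR = """
<description>
  <executable name="CrestLines.pl">
    <access type="URL"><path value="http://colors.unice.fr:80/"/></access>
    <value value="CrestLines.pl"/>
    <input name="image" option="-im1"><access type="LFN" /></input>
    <input name="scale" option="-s"/>
    <output name="crest_lines" option="-c2"><access type="LFN" /></output>
    <sandbox name="convert8bits">
      <access type="URL"><path value="http://colors.unice.fr:80/"/></access>
      <value value="Convert8bits.pl"/>
    </sandbox>
  </executable>
</description>
"""


def build_command(descriptor_xml, values):
    """Build an argument list from the declared inputs and outputs (assumed order)."""
    executable = ET.fromstring(descriptor_xml).find("executable")
    argv = [executable.get("name")]
    for arg in executable.findall("input") + executable.findall("output"):
        argv += [arg.get("option"), values[arg.get("name")]]
    return argv


# Hypothetical values supplied by the workflow manager at invocation time.
command = build_command(
    DESCRIPTOR,
    {"image": "lfn:/demo/in.img", "scale": "2", "crest_lines": "lfn:/demo/out.crest"},
)
```

Keeping the option names in the descriptor means a new legacy code can be wrapped by writing XML only, without touching the generic service.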

Short Deadline Jobs

• Torque + MAUI specific configuration
  – Virtual processor allocation
  – Does not interfere with normal batch scheduling (shared processor time)
  – Enables efficient processing of short tasks on the production infrastructure
  – Substitute for the lack of job prioritization
• Special submission queues
  – Three SDJ queues deployed on biomed-compliant sites
  – Time-limited queues
• Submit-or-reject paradigm
  – Jobs are executed immediately, or rejected if too many short jobs are already running
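The submit-or-reject rule can be modeled as an admission check with no waiting queue. This is a Python model of the rule only, not a Torque or MAUI configuration; the class name `ShortJobQueue` and the `max_running` threshold are hypothetical.

```python
class ShortJobQueue:
    """Admit a short job only if it can start immediately; never queue it."""

    def __init__(self, max_running):
        self.max_running = max_running   # assumed cap on concurrent short jobs
        self.running = set()

    def submit(self, job_id):
        """Return True if the job starts now, False if it is rejected."""
        if len(self.running) >= self.max_running:
            return False                 # rejected immediately, caller may try another site
        self.running.add(job_id)
        return True

    def finish(self, job_id):
        """Free the slot held by a completed job."""
        self.running.discard(job_id)


queue = ShortJobQueue(max_running=2)
results = [queue.submit(job) for job in ("j1", "j2", "j3")]
queue.finish("j1")
retry = queue.submit("j3")
```

An immediate rejection lets the submitter redirect the job at once instead of waiting behind long batch jobs.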

Workflow execution

Post-mortem trace

Scientific production

• Precision of 4 rigid registration algorithms estimated on a brain image database
• To be published in [HealthGrid’06]

Algorithm     Rotation (deg)   Translation (mm)
CrestMatch    0.150            0.424
PFRegister    0.180            0.416
Baladin       0.139            0.395
Yasmina       0.137            0.445

Why grids?

• From days to hours
  – 10s to 100s of algorithms, to adapt to many clinical cases
  – Virtually unlimited parameterization
  – Virtually unlimited number of image databases: different modalities, different body regions
• Complex computation procedure
  – Difficult experimental setup
  – Future plan: application portal
• Data federation
  – Obtain the data sources needed for validation
• Algorithm sharing
  – Use registration services developed in different research groups
  – Reproducible results

Conclusions

• Medical data management
  – Advanced data management functionalities
  – Application area-level layer on top of foundation middleware
  – Dependent on the deployment of gLite 1.5 services
• Bronze Standard application
  – Complex, workflow-based application
  – Data intensive
      Non-trivial parallel computations
      Data federation using grid data management services
  – Production of scientific results
• Short deadline jobs
  – Immediate scheduling of short tasks
  – Submit-or-reject paradigm