1 WP2: Data Management Gavin McCance RAL Middleware Workshop 24 February 2003

Jan 03, 2016


Randolph Bryant
WP2: Data Management

Gavin McCance

RAL Middleware Workshop

24 February 2003

RAL Workshop 24-25.Feb.2003 2

Outline

WP2 Tasks

Review of TB1 Components

Changes and review of current components

Plans for final year

WP2 Tasks

Replication Services

• Keep track of all the files and their copies

• Copy them about (to order and automatically)

Optimization of replication

• Give me the ‘best replica’ for my job

• Simulate the grid to tune the algorithms needed for this

Meta-data

• Where will the replication stuff keep its meta-data?

• Where will the applications keep their meta-data?

Security

• Authenticate with grid certificates

• Authorize users appropriately (better than just a grid-mapfile)

TB1 Replication: Replica Catalogue

Edg-replica-catalogue

• The repackaging of the much-loved Globus replica catalogue

• Based on LDAP

LFN -> PFN (1:many)

• One logical file name mapping to many physical instances of the file

• With appropriate utility functions, applications might never need to know the PFN. Use the LFN, and the middleware does the mapping for you in the background.
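The 1:many mapping can be pictured as a dictionary from logical to physical names. The sketch below is illustrative only: the class, its method names and the file URLs are hypothetical, and it does not reflect the LDAP schema or the actual edg-replica-catalogue interface.

```python
from collections import defaultdict


class ToyReplicaCatalogue:
    """Illustrative 1:many LFN -> PFN mapping (not the real EDG catalogue)."""

    def __init__(self):
        # Each logical file name maps to the list of its physical copies.
        self._replicas = defaultdict(list)

    def add_replica(self, lfn, pfn):
        # Register one more physical copy, ignoring exact duplicates.
        if pfn not in self._replicas[lfn]:
            self._replicas[lfn].append(pfn)

    def list_replicas(self, lfn):
        # Return a copy so callers cannot modify the catalogue by accident.
        return list(self._replicas.get(lfn, []))


toy_catalogue = ToyReplicaCatalogue()
toy_catalogue.add_replica("lfn:run42.root", "gsiftp://se.site-a.example/data/run42.root")
toy_catalogue.add_replica("lfn:run42.root", "gsiftp://se.site-b.example/data/run42.root")
print(toy_catalogue.list_replicas("lfn:run42.root"))
```

An application would only ever hold the string "lfn:run42.root"; which of the two physical copies it ends up reading is a decision left to the middleware.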

TB1 Replication: Copying to order

Edg-replica-manager

• Initially, it was a repackaging of Globus replica manager

Rewritten for TB1+ with better client interfaces

• Both command-line and C++

• copyAndRegister: ‘brings your new file to the grid’

• replicateFile: makes a new replica of a file
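To make the difference between the two operations concrete, here is a toy model in which storage elements and the catalogue are plain dictionaries. The class, the snake_case method names and their parameters are assumptions made for illustration; they are not the edg-replica-manager command-line or C++ interface.

```python
class ToyReplicaManager:
    """Toy model of the two operations; storage elements are plain dicts."""

    def __init__(self, storage_elements):
        # SE name -> {path: file contents}
        self.storage = {se: {} for se in storage_elements}
        # LFN -> list of (SE name, path) pairs
        self.catalogue = {}

    def copy_and_register(self, data, lfn, se, path):
        """Bring a new file to the grid: store it, then register a new LFN."""
        if lfn in self.catalogue:
            raise ValueError(f"{lfn} is already registered")
        self.storage[se][path] = data
        self.catalogue[lfn] = [(se, path)]

    def replicate_file(self, lfn, to_se, path):
        """Copy an existing replica to another SE and register the new copy."""
        src_se, src_path = self.catalogue[lfn][0]
        self.storage[to_se][path] = self.storage[src_se][src_path]
        self.catalogue[lfn].append((to_se, path))


rm = ToyReplicaManager(["se-a", "se-b"])
rm.copy_and_register(b"event data", "lfn:run1", "se-a", "/data/run1")
rm.replicate_file("lfn:run1", "se-b", "/data/run1")
print(rm.catalogue["lfn:run1"])
```

The first call creates the logical name, the second only adds a physical copy under a name that already exists.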

TB1 Replication: Copying ~automatically

GDMP: Grid Data Mirroring Package

• Born in CMS

Implements subscription-based replication

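[Diagram: Site B subscribes to Site A ("Subscribe me!"). Furious Monte Carlo generation at Site A produces lots of new files, and Site A notifies Site B ("I've got some new files!"). Site B replies "Send me them", the new files are transferred by GridFTP, and the new replicas at Site B appear in the replica catalogue.]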
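The subscribe / notify / transfer exchange can be sketched as a small publish-subscribe model. Everything here is hypothetical (the Site class, its methods, the file name); the in-memory copy merely stands in for a GridFTP transfer and the shared dictionary for the replica catalogue. It is not GDMP code.

```python
class Site:
    """Toy site holding files and a list of subscribed sites."""

    def __init__(self, name, replica_sites):
        self.name = name
        self.files = {}                     # file name -> contents
        self.subscribers = []               # sites that asked to be notified
        self.replica_sites = replica_sites  # shared: file name -> set of site names

    def subscribe(self, producer):
        # "Subscribe me!"
        producer.subscribers.append(self)

    def publish(self, name, data):
        # A new file appears locally; record it and notify every subscriber.
        self.files[name] = data
        self.replica_sites.setdefault(name, set()).add(self.name)
        for site in self.subscribers:
            site.on_new_file(self, name)

    def on_new_file(self, producer, name):
        # "Send me them": pull the file (standing in for GridFTP) and
        # record the new replica in the shared catalogue.
        self.files[name] = producer.files[name]
        self.replica_sites[name].add(self.name)


replica_sites = {}
site_a = Site("A", replica_sites)
site_b = Site("B", replica_sites)
site_b.subscribe(site_a)
site_a.publish("mc_0001.dat", b"simulated events")
print(sorted(replica_sites["mc_0001.dat"]))
```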

Replication Optimization

Most research-oriented task

getBestFile was absent in early TB1

RB matches LFNs against local storage elements

• Jobs only go where their data already is

• No clever data movement

OptorSim developed to test replica optimization ideas

• Data-centric grid simulation

• Simulates job times as a function of replication mechanism and job data access patterns

• UK JANET and EU GÉANT networks modelled

Meta-data storage

Spitfire meta-data storage

Two faces

• Spitfire browser

• Spitfire client API

Spitfire-browser allows a client to use a web browser to view the results of canned queries from a database or to make canned inserts into the database.

Client uses their grid cert embedded in their web browser to authenticate (and then authorize) to the service.

[Diagram: the client fills in a web-page form in a Netscape web browser; the Spitfire browser service passes the request to the DB, and the result comes back to the client.]

Meta-data storage 2

Spitfire client API

Imagine where you would use ODBC / JDBC in an application

• To do something with a database from inside your application

That’s where you use this API, except…

• Accesses DB over WAN

• Grid security (both authentication and authz)

• You shouldn’t have to know what the DB backend is

• NB. The API is not the same as ODBC!

Security

WP2 task feeding into EDG security group

Server side:

• Mostly JAVA

• Proper certificate trust-manager for java server applications (special plug-in for Tomcat)

• Flexible authorization manager to define whatever authz policies you like upon the server.

Client side

• Proper JAVA trust-manager for certificate checking

• Web services GSI-enabled for Java and C++

Changes: Web services

Most software has been redesigned to use web services

Much of the server-side stuff now written in Java

Retain security: GSI-enabled web-services

Services have been modified to expose an API in WSDL

For client programming, the client API libraries are auto-generated from the WSDL

For command-line, the tools are still there, but now talk to the server using web services.

What the application user sees should not have changed as a result of adopting web services!

Changes: Replica Catalogue to RLS

edg-replica-catalogue is being phased out in favour of the Replica Location Service (RLS)

• collaboration with Globus

Local Replica Catalogs (LRCs) on the SEs hold the actual GUID -> PFN mappings [GUID is what used to be LFN]

Replica Location Indices (RLIs) redirect inquiries to LRCs actually having the file

LRCs are configured to send index updates to any number of RLIs

Much more scalable architecture

• The lookup time for an entry is independent of the number of catalogs. Tested for up to 10^8 entries.

• The catalog withstands a simultaneous load of over 1000 user queries or inserts per second.
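The division of labour between LRCs and RLIs can be sketched as a two-level lookup. The class and method names below are hypothetical and the URLs are placeholders. As a simplifying assumption the toy LRC pushes an index update immediately on every insert; how and when the real service sends updates is not modelled.

```python
class ReplicaLocationIndex:
    """Knows which LRCs hold a GUID, but not the PFNs themselves."""

    def __init__(self):
        self.index = {}  # GUID -> list of LRCs holding that GUID

    def update(self, guid, lrc):
        holders = self.index.setdefault(guid, [])
        if lrc not in holders:
            holders.append(lrc)

    def lookup(self, guid):
        # Redirect the query to each LRC that actually has the file.
        pfns = []
        for lrc in self.index.get(guid, []):
            pfns.extend(lrc.mappings[guid])
        return pfns


class LocalReplicaCatalog:
    """Holds the actual GUID -> PFN mappings for one storage element."""

    def __init__(self, name):
        self.name = name
        self.mappings = {}  # GUID -> list of PFNs
        self.indices = []   # RLIs that receive index updates from this LRC

    def register_index(self, rli):
        self.indices.append(rli)

    def add(self, guid, pfn):
        self.mappings.setdefault(guid, []).append(pfn)
        for rli in self.indices:
            rli.update(guid, self)


rli = ReplicaLocationIndex()
lrc_one = LocalReplicaCatalog("site-one")
lrc_two = LocalReplicaCatalog("site-two")
for lrc in (lrc_one, lrc_two):
    lrc.register_index(rli)

lrc_one.add("guid-1", "gsiftp://se.site-one.example/f1")
lrc_two.add("guid-1", "gsiftp://se.site-two.example/f1")
print(rli.lookup("guid-1"))
```

Because the index only records which catalogues to ask, adding more LRCs adds entries to the index without changing how a single lookup proceeds.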

RLS Demo at SC2002

[Diagram: RLS demo testbed at SC2002, showing Replica Location Index nodes (RLIs) and Local Replica Catalogs (LRCs) at the Australia sites (Melbourne), the United States sites (ANL Chicago, ISI Los Angeles, SC2002 Baltimore, SLAC Palo Alto) and the Europe sites (CERN Geneva, Glasgow, INFN Pisa, INFN Milan).]

Changes: unified interface for replication

Many services, some the same, some new, with a bewildering array of acronyms…

All these services have their own APIs, and are individually accessible on the grid.

From the application's point of view, it is more appropriate to have a single client-facing interface (both programming and command line) that you can use to talk to all these services.

• Simpler… you only need to read one document ;-)

• Allows this single client to take care of transactional issues

• This is the new EDG Replica Manager (ERM) for TB2

TB2 Replica Manager Components and name changes…

ERM: EDG Replica Manager client interface and API

• Entry point for all clients

ROS: Replication Optimization Service

• Replica selection based on network metrics (WP7)

RSH: Replication Storage Handler (what was GDMP)

• Subscription-based replication

RLS: Replica Location Service (replacing replica catalogue)

• Local Replica Catalog (LRC) services: logical-to-physical file mappings

• Replica Location Index (RLI) services: index on logical names

RMC: Replica Metadata Catalogue

• Similar to Spitfire with RDBMS backend and specialized schema

(Callout on slide: NEW!)

TB2: Replica Management Services

[Diagram: "Reptor" Replica Management Services and the TB2 components behind them: Client (ERM), Replica Metadata (RMC), Optimization (ROS), Subscription (RSH), Replica Location (RLS) and File Transfer (GridFTP).]

LFNs, PFNs, GUIDs

Due to an application requirement from LCG, a couple of changes:

[Diagram: LFN1, LFN2 and LFN3 map to a single GUID (1223423-ASSDF4-11223-35465464) in the Replica Meta-data Catalogue; that GUID maps to PFN1 (Glasgow), PFN2 (CERN) and PFN3 (Lyon) in the Replica Location Service.]
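The two-step resolution on this slide (several LFNs to one GUID, one GUID to several PFNs) can be sketched with two dictionaries. The variable and function names are hypothetical, and the dictionaries simply stand in for the two catalogues using the labels from the slide.

```python
# Stand-in for the Replica Meta-data Catalogue: many LFNs -> one GUID.
GUID = "1223423-ASSDF4-11223-35465464"
lfn_to_guid = {"LFN1": GUID, "LFN2": GUID, "LFN3": GUID}

# Stand-in for the Replica Location Service: one GUID -> many PFNs.
guid_to_pfns = {GUID: ["PFN1 (Glasgow)", "PFN2 (CERN)", "PFN3 (Lyon)"]}


def resolve(lfn):
    """Resolve a logical name to its physical replicas via the GUID."""
    guid = lfn_to_guid[lfn]            # step 1: LFN -> GUID
    return guid_to_pfns.get(guid, [])  # step 2: GUID -> PFNs


print(resolve("LFN2"))
```

All three logical names are aliases for the same file, so each of them resolves to the same three physical replicas.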

TB2: RMC and Spitfire

Replication meta-data considered sufficiently ‘specialized’ and vital to the replica management service that it has been split off from Spitfire

• Now called Replica Metadata Catalogue (RMC)

• Resolves LFNs to GUIDs

• Underlying technology is identical to Spitfire

• Exposed API is different: more tailored to the specific things you'd like to do with replication meta-data.

• Application specific section for application meta-data that is keyed on LFNs or GUIDs.

Spitfire is still available for other meta-data

• e.g. storing calibration constants, etc.

Replica Optimization Service: ROS

Provides getAccessCosts( LFN[] , CE[] , … ) method to RB

• Allows RB to take into account the distribution of a job’s files when deciding where to run it

Provides listBestFile ( LFN , toSE ) [in the ERM interface]

• Uses networking bandwidth + storage cost measurements (WP7 and WP5) to determine the best replica to get.

Provides getBestFile ( LFN , toSE, … ) [in the ERM interface]

• The same, except it actually does this replication, if needed.

For TB2, simple replication algorithms will be deployed initially.

• More adventurous ones can be added without impacting the interface, since the replication algorithm is internal to the RMS
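A minimal sketch of how the three calls relate, assuming a single made-up cost per pair of storage elements. The snake_case functions mirror the method names on the slide but take only the parameters shown there; the elided "…" arguments are not filled in. The tables, site names and cost values are invented for illustration and are not WP5 or WP7 measurements.

```python
# Hypothetical state: where each file currently lives, which SE is close to
# each CE, and an invented transfer cost between storage elements.
REPLICAS = {"lfn:a": ["se-glasgow", "se-lyon"], "lfn:b": ["se-cern"]}
CLOSE_SE = {"ce-cern": "se-cern", "ce-glasgow": "se-glasgow"}
COST = {
    ("se-glasgow", "se-cern"): 40.0,
    ("se-lyon", "se-cern"): 15.0,
    ("se-cern", "se-glasgow"): 40.0,
    ("se-lyon", "se-glasgow"): 55.0,
}


def transfer_cost(from_se, to_se):
    # A replica already on the target SE costs nothing to access.
    return 0.0 if from_se == to_se else COST[(from_se, to_se)]


def list_best_file(lfn, to_se):
    """Return the SE holding the cheapest replica to reach from to_se."""
    return min(REPLICAS[lfn], key=lambda se: transfer_cost(se, to_se))


def get_best_file(lfn, to_se):
    """Same choice, but also replicate to to_se if the file is not there yet."""
    best = list_best_file(lfn, to_se)
    if best != to_se:
        REPLICAS[lfn].append(to_se)  # stands in for the actual file transfer
    return to_se


def get_access_costs(lfns, ces):
    """Total cost of accessing all files from each candidate CE's close SE."""
    costs = {}
    for ce in ces:
        se = CLOSE_SE[ce]
        costs[ce] = sum(transfer_cost(list_best_file(lfn, se), se) for lfn in lfns)
    return costs


costs = get_access_costs(["lfn:a", "lfn:b"], ["ce-cern", "ce-glasgow"])
best_before = list_best_file("lfn:a", "se-cern")
get_best_file("lfn:a", "se-cern")
best_after = list_best_file("lfn:a", "se-cern")
print(costs, best_before, best_after)
```

Swapping in a different selection rule only changes the body of list_best_file, which is the sense in which a more adventurous algorithm need not affect callers.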

Current OptorSim status

OptorSim used to simulate possible algorithms for ROS

Simulation now includes sampled network background (UCL)

Live network simulator GUI

• Or offline in a compute farm to get useful results..!

Current OptorSim results

Initial results from simulation show that including network background increases job times by ~10%.

Further study underway…

Study of different replication algorithms and access patterns

Data access pattern has a large effect

• Further study here

Economic models do well for sequential data access

- 6 experiments, 22 sites

- predicted available CPUs & storage

- realistic file sizes (1GB) and dataset sizes (1TB)

- realistic number of jobs (~60 users)

- inclusion of background network traffic

Plans for final year: Meta-data

RMC is now ~fixed in functionality

Spitfire will evolve a bit more

• To allow authorized users to hot-deploy their own interfaces onto the service to do something useful.

• e.g. you, as an analysis-group hardware person, can ‘invent’ a method call (an interface) to extract some data from an obscure calibration constant table.

• Spitfire (which sits in front of DB containing these tables) will then expose your newly invented interface so that people can use it by standard web-services remote procedure call

And web-services will write the client stub for you automatically…

Keep working on OGSA (and GGF DAIS standard)

Plans for final year: RMS

RMS architecture is now defined

Consolidate and concentrate on quality

• Few new features

Support LCG

• Software was developed alongside LCG requirements

Work will continue on improving the algorithms used internally by the ROS (replica optimization)

Work towards EGEE…