EGEE is a project funded by the European Union under contract IST-2003-508833 EGEE Middleware Requirements Are we happy with HEPCAL? Stephen Burke, CCLRC Loose Cannon GridPP EB-TB Open Meeting, 13th May 2004 www.eu-egee.org

EGEE is a project funded by the European Union under contract IST-2003-508833

EGEE Middleware Requirements

Are we happy with HEPCAL?

Stephen Burke, CCLRC

Loose Cannon

GridPP EB-TB Open Meeting, 13th May 2004

www.eu-egee.org


EB-TB Open Meeting, 13/5/04 - 2/17

Contents

• HEPCAL History

• HEPCAL Use Cases

• HEPCAL II

• EGEE/NA4

• Summary


HEPCAL History

• In early 2002 the Loose Cannons in EDG WP8 started interviewing representatives of the LHC experiments to collect common Use Cases, and produced a document called HEPCAL (HEP Common Application Layer).

• An LCG RTAG was then formed to extend the document. The extended document was published in May 2002.

• The members of the HEPCAL RTAG formed a permanent LCG committee called GAG (Grid Application Group) at the beginning of 2003 to consider requirements and experiment needs, and give feedback to LCG and other Grid projects.


HEPCAL History - 2

• In October 2003 the GAG published the HEPCAL II document discussing requirements for analysis as opposed to production use.

• In March 2004 the original HEPCAL document was updated, including more information on priorities and quantitative requirements (known as HEPCAL-prime).

http://project-lcg-gag.web.cern.ch/project-lcg-gag/LCG_GAG_Docs/HEPCAL-prime.doc

http://lcg.web.cern.ch/LCG/sc2/GAG/HEPCAL-II.doc


HEPCAL Use Cases

• There are 43 Use Cases. They are generally intended to cover basic operations, and are not in any sense complete. The documents also have some implications for general requirements, but this is not the main focus.

• In practice only about half of the Use Cases were implemented by EDG middleware, and there was fairly little progress between EDG 1.x and 2.x.


USE CASE: DATASET BROWSING

Identifier: UC#dsbrowse

Goals in Context: Browse the LDNs

Actors: User

Triggers: Need to consult the DS list

Included Use Cases: Grid login

Specialised Use Cases:

Pre-conditions: User has a valid Grid login. A VO DMS is accessible by the user and contains the files to be browsed.

Post-conditions:

Basic Flow: The user connects to her VO DMS, via Web or command line interface. The user browses the available DS.

Devious Flow(s):

1. User has no right to browse the DMS database. Operation is aborted.

2. DMS database is not accessible. Operation is aborted.

Importance and Frequency: As important and probably frequently used as the ls command.

Additional Requirements:

Example:

$ dsbrowse [parameters to be defined such as date created etc] <SQL query>

int dsbrowse(char* SQL_query, char* option, char*[] LDNs);

Call returns the number of LDNs that satisfy the search options.
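
To make the flows of this Use Case concrete, here is a minimal Python sketch. It is not the HEPCAL interface: it assumes the DMS can be stood in for by an in-memory SQLite database, and the table name `datasets`, the columns `ldn` and `created`, the `user_may_browse` flag, and the `DMSAccessError` exception are all hypothetical names introduced for illustration. Unlike the C prototype on the slide, it returns the matching LDNs as a list, so the count is simply the length of the result.

```python
import sqlite3


class DMSAccessError(Exception):
    """Raised when the browse is aborted (the Use Case's devious flows)."""


def dsbrowse(conn, sql_where="", params=(), user_may_browse=True):
    """Return the LDNs in the hypothetical `datasets` table that match.

    `sql_where` is an optional SQL condition; values should be passed
    separately in `params` rather than pasted into the condition.
    """
    if not user_may_browse:
        # Devious flow 1: the user has no right to browse the DMS database.
        raise DMSAccessError("user is not allowed to browse the DMS")
    query = "SELECT ldn FROM datasets"
    if sql_where:
        query += " WHERE " + sql_where
    query += " ORDER BY ldn"
    try:
        rows = conn.execute(query, params).fetchall()
    except sqlite3.Error as exc:
        # Devious flow 2: the DMS database is not accessible.
        raise DMSAccessError(f"DMS database not accessible: {exc}") from exc
    return [ldn for (ldn,) in rows]


def make_demo_catalogue():
    """Build a tiny in-memory catalogue with invented entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE datasets (ldn TEXT PRIMARY KEY, created TEXT)")
    conn.executemany(
        "INSERT INTO datasets VALUES (?, ?)",
        [
            ("demo/run001", "2004-03-01"),
            ("demo/run002", "2004-04-15"),
            ("demo/run003", "2004-05-10"),
        ],
    )
    return conn
```

A call such as `dsbrowse(make_demo_catalogue(), "created >= ?", ("2004-04-01",))` plays the role of the "date created" parameter mentioned in the example; both devious flows surface as a single exception type so that a caller can abort cleanly.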


Basic

• There are 19 Use Cases covering basic concepts. These relate to fundamental Grid operations like submitting and controlling jobs, registering and replicating files, and querying the state of the system.

• Of these, 15 are implemented by the EDG middleware, although in some cases there are minor areas where the implementation is not ideal, in particular concerning the detection and treatment of errors and support for file metadata.

• Missing Use Cases relate to querying the state of jobs, detailed job control, and to a specific method of file registration (the latter has now been implemented by LCG).


Security

• Security issues were not considered in detail, but there are 5 security-related Use Cases.

• Two concern the joining and leaving of a VO, and a third specifies single sign-on. These are implemented in EDG/LCG and will be enhanced with the use of VOMS.

• The two other security Use Cases concern the advance reservation of resources and the allocation of resources between VO members. These are not addressed in the current system.


Metadata and Virtual Data

• Metadata is relevant to several Use Cases, but two of them specifically involve the modification of file-related metadata and performing queries to select files based on the metadata. The EDG Replica Metadata Catalogue offers a prototype with partial support for these Use Cases, but more work is needed by both application and middleware developers in this area.

• Two Use Cases are associated with the concept of Virtual Data. This was out of the scope of EDG, and would be likely to require substantial further work to implement. In general it is not a high priority.
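
The two metadata Use Cases (modifying file-related metadata, and selecting files by a metadata query) can be illustrated with a small Python sketch. This is an in-memory stand-in, not the EDG Replica Metadata Catalogue API; the class name, method names, and attribute names are hypothetical.

```python
class MetadataCatalogue:
    """Toy catalogue mapping logical file names to attribute dictionaries."""

    def __init__(self):
        self._entries = {}  # logical file name -> {attribute: value}

    def set_attributes(self, lfn, **attrs):
        """Add or modify metadata attributes for one logical file name."""
        self._entries.setdefault(lfn, {}).update(attrs)

    def select(self, predicate):
        """Return the sorted logical file names whose attributes satisfy
        `predicate`, a function taking the attribute dictionary."""
        return sorted(
            lfn for lfn, attrs in self._entries.items() if predicate(attrs)
        )
```

For example, after `cat.set_attributes("demo/file1", run=1, stream="muon")`, a query such as `cat.select(lambda a: a.get("stream") == "muon")` returns the files tagged with that (invented) stream name.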


Optimisation - Data

• There are four Use Cases related to optimisation. One concerns the evaluation of cost functions for data access to allow the most efficient access method to be chosen. The EDG middleware has a substantial amount of support for this concept, but testing has been limited, and the ROS is not deployed in LCG.

• Another case relates to the possibility of using remote access to a small part of a file to avoid the overhead of complete replication. This has not been considered up to now, although GridFTP does support partial file access.
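
As an illustration of what a data-access cost function might look like, the Python sketch below picks the cheapest replica and compares a partial remote read with full replication. The cost model (latency plus size divided by bandwidth) is an assumption made for this example, not the model used by the EDG middleware, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str
    bandwidth_mb_s: float  # assumed sustained bandwidth to the job
    latency_s: float       # assumed fixed overhead per request


def transfer_cost(replica, size_mb):
    """Estimated seconds to fetch `size_mb` from this replica in one request."""
    return replica.latency_s + size_mb / replica.bandwidth_mb_s


def cheapest_replica(replicas, size_mb):
    """Pick the replica with the lowest estimated transfer cost."""
    return min(replicas, key=lambda r: transfer_cost(r, size_mb))


def prefer_remote_read(replica, file_size_mb, needed_mb, requests=1):
    """True if reading only the needed part remotely is estimated to be
    cheaper than replicating the whole file first."""
    remote = requests * replica.latency_s + needed_mb / replica.bandwidth_mb_s
    return remote < transfer_cost(replica, file_size_mb)
```

With such a model, a high-latency but high-bandwidth site wins for large files while a nearby slow site wins for small ones, which is the kind of trade-off the Use Case asks the middleware to evaluate.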


Optimisation – Job Submission

• The other optimisation Use Cases relate to job submission. One concerns the specification of hints, e.g. for CPU time consumption, memory usage or disk space needed, to allow jobs to be scheduled efficiently. This is supported to the extent that jobs can apply their own constraints and ranking criteria based on information stored in the information system, but any optimisation is provided by the user rather than the WMS.

• The final Use Case concerns the automatic splitting of jobs into subjobs. This was one of the goals for the EDG WMS, adapting the Condor DAGMan software, but the functionality is not fully integrated in the deployed system.
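
The two ideas on this slide, user-supplied constraints and ranking, and splitting a job into subjobs, can be sketched in Python as follows. This is not the WMS matchmaking algorithm or its job description language; the resource attribute names (`max_cpu_time_min`, `free_cpus`) and the fixed-size splitting policy are assumptions chosen for illustration.

```python
def match_resources(resources, requirements, rank):
    """Keep the resources that satisfy the user's `requirements` predicate
    and order them by the user's `rank` function, best first."""
    candidates = [r for r in resources if requirements(r)]
    return sorted(candidates, key=rank, reverse=True)


def split_job(input_files, files_per_subjob):
    """Split one job's input file list into subjobs of at most
    `files_per_subjob` files each, preserving order."""
    if files_per_subjob < 1:
        raise ValueError("files_per_subjob must be at least 1")
    return [
        input_files[i:i + files_per_subjob]
        for i in range(0, len(input_files), files_per_subjob)
    ]
```

Here both the constraint and the ranking come from the user, which mirrors the point made above: the system only filters and sorts, and any real optimisation is left to whoever writes those two functions.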


Application Databases

• Four Use Cases relate to databases (referred to as Catalogues in the documents), i.e. read-write entities as opposed to read-only datasets. So far this is not addressed by the middleware. Even the middleware’s own databases (broker LB, R-GMA registry, LRC and RMC) are not distributed or replicated.

• R-GMA provides a different model for a distributed database which may be suitable for some Use Cases.


Application Interfaces

• The final set of seven Use Cases are at a higher level, and relate to interactions between middleware and application software. These can generally be achieved by implementing the functionality at the application level, but have no specific support in the middleware.

• Two relate to the submission and control of large sets of jobs treated as a single production, e.g. to process a large number of files, and a third relates to storing user-defined metadata about jobs in the WMS job database.

• Three concern specialised kinds of jobs: specification of input data via a metadata query, verification of the functionality of application software, and validation of the content of a dataset, either in a standalone job or as the final stage of a data production job.

• Finally, there is the question of the installation and publication of application software. This is a long-standing problem, although LCG has made some progress.
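
Since these Use Cases can generally be implemented at the application level, the following Python sketch shows one way an application might treat a large set of jobs as a single production and attach user-defined metadata to each job. The class, the status names, and the tag fields are all hypothetical; a real implementation would submit and query jobs through the WMS rather than keep state in memory.

```python
from collections import Counter


class Production:
    """Tracks a set of jobs that are controlled together as one production."""

    FINISHED = {"Done", "Cancelled"}

    def __init__(self, name):
        self.name = name
        self._jobs = {}  # job id -> {"status": str, "tags": dict}

    def add_job(self, job_id, **tags):
        """Register a job, with optional user-defined metadata tags."""
        self._jobs[job_id] = {"status": "Submitted", "tags": dict(tags)}

    def update(self, job_id, status):
        """Record a new status for one job."""
        self._jobs[job_id]["status"] = status

    def summary(self):
        """Count the jobs of the production in each status."""
        return Counter(job["status"] for job in self._jobs.values())

    def cancel_all(self):
        """Cancel every unfinished job; return the ids that were cancelled."""
        cancelled = []
        for job_id, job in self._jobs.items():
            if job["status"] not in self.FINISHED:
                job["status"] = "Cancelled"
                cancelled.append(job_id)
        return cancelled
```

The point of the sketch is only that control and status queries act on the production as a whole, so that a user processing a large number of files does not have to manage each job individually.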


HEPCAL II

• The original HEPCAL document was largely aimed at managed production-style jobs. HEPCAL II was an attempt to consider the needs of chaotic analysis jobs.

• The document has a fairly extensive description of models for analysis jobs, but does not have specifically identified requirements.

• There are also no detailed Use Cases, just three general analysis scenarios (user-level, group-level, and managed production).

• Analysis models could benefit from the experience of running experiments.


Security Requirements

• Always difficult to get particle physicists to care about security! No comprehensive requirements yet, although some documents exist.

• The main HEP requirements are likely to be in the areas of VO management, authorisation, accounting and quotas. Also there is the never-ending battle over outbound IP access from worker nodes.

• The security model places a lot of weight on checking by the VOs – CAs only check identity and will issue certificates to ~anyone. Experiments may not yet have taken this on board.

• The EDG security group said that accounting and quotas weren’t in its area – but someone needs to consider them.


EGEE NA4

• The EGEE NA4 Activity represents all application groups: HEP, biomed, …

• There is an NA4 HEP sub-group, currently led by Frank Harris. So far this is strongly coupled to LCG/GAG/ARDA. It’s not entirely clear how non-LHC HEP experiments participate, and the UK is not a partner for NA4.

• NA4 has to produce a requirements document by May/June. In practice, for HEP this is likely to be based on HEPCAL - if non-LHC HEP experiments want to give any input they need to do it quickly.

• Timescales are short; this may be the only opportunity to influence the direction of the EGEE middleware in a significant way. Need to identify major missing items and prioritise.


Summary

• EDG and LCG have developed requirements and Use Cases over several years, but largely with input from the LHC experiments.

• The HEPCAL Use Cases are fairly basic, but even so many are not implemented.

• EGEE is collecting requirements now; this is an opportunity to influence the direction of development. GridPP, particularly the non-LHC experiments, should consider whether it wants to add anything to HEPCAL.

• There is an open NA4 meeting in Catania on July 14-16:

http://egee-na4.ct.infn.it/na4_open_meeting/