Setting up a Pan-European Datagrid using
QCDgrid technology
Chris Johnson, James Perry, Lorna Smith and Jean-Christophe Desplat
EPCC, The University Of Edinburgh
The ENACTS “Demonstrator”
This Talk
• A summary of the title:
  – ENACTS Demonstrator
  – Pan-European Datagrid
  – QCDgrid.
ENACTS
• European Network for Advanced Computing Technology for Science.
• EC-funded project with 14 members.
• Started in 2000.
• Attempt to ensure that Europe did not lag behind the US in grid technology.
• ENACTS originally consisted of many reports with little technical work.
14 ENACTS partners
Please see: www.enacts.org
ENACTS Demonstrator: Partners involved
• EPCC, Edinburgh, UK
  – Chris Johnson
  – Jean-Christophe Desplat
  – James Perry
• Parallab, Bergen, Norway
  – Jacko Koster
  – Jan-Frode Myklebust
  – Csaba Anderlik
• TCD, Dublin, Ireland
  – Geoff Bradley
  – Bob Crosbie.
ENACTS Demonstrator…
• Objective
  – “To enable the formation of a pan-European HPC metacentre…”.
• The Demonstrator is part of Phase II of the activity and its specific objective is
  – “to draw together the results from all of the Phase I technology studies and evaluate their practical consequences for operating a pan-European metacentre and constructing a best-practice model for collaborative working amongst individual facilities”.
ENACTS Demonstrator
• ENACTS Phase I
  – consisted mainly of reports
• ENACTS Phase II
  – contained the Demonstrator activity
• Phase I identified technologies such as
  – Globus, replica management, LDAP database and XML metadata.
• All these technologies are inherent in the QCDgrid system.
Metacentre
• A “virtual organisation” with data described by metadata.
  – Users submit data from any site
  – The data is stored on “the grid”
  – All data is stored reliably
  – All data is easy to retrieve.
Our Demonstrator
• Set up QCDgrid across the 3 sites to create our metacentre.
• Use a genuine scientific scenario.
• Use an XML schema for metadata.
• Ensure the data is portable between the systems involved.
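The talk does not say how portability was achieved. As one hedged illustration of the idea, the sketch below writes floating-point values in a fixed byte order so that a file produced on one architecture reads back identically on another. The function names and the choice of big-endian 64-bit doubles are assumptions made for this example, not details of QCDgrid or the MILC code.

```python
import struct


def pack_doubles(values):
    """Serialise floats as big-endian IEEE 754 doubles (assumed format)."""
    # ">" forces big-endian regardless of the host machine's native order.
    return struct.pack(f">{len(values)}d", *values)


def unpack_doubles(blob):
    """Read back a byte string produced by pack_doubles."""
    count = len(blob) // 8  # each double occupies 8 bytes
    return list(struct.unpack(f">{count}d", blob))
```

Because the byte order is stated explicitly, the same bytes decode to the same numbers on any of the systems involved.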
Summary so far…
• The ENACTS Demonstrator is an EC-funded project involving 3 partners
• attempting to set up a pan-European metacentre
• using QCDgrid technology.
What is QCDgrid?
• It’s not QCD-specific!!
• QCDgrid was written to manage the QCD data belonging to the UK QCD community (UKQCD)
  – 2 previous All-Hands talks (James Perry, EPCC).
• The original grid consisted of 6 geographically dispersed sites.
• Around 5 terabytes of data.
• The amount of data is expected to grow dramatically when QCDOC comes online later in 2004.
What is QCDgrid?
• QCDgrid is a layer of software written on top of the Globus Toolkit.
  – Uses security infrastructure and basic grid operations such as data transfer
  – also uses more advanced features such as the replica catalogue.
How does QCDgrid work?
• A control thread runs on one storage element
  – constantly scans grid
  – ensures all storage elements are working
  – ensures all files are stored in at least 2 suitable locations.
• When a new file is added it is rapidly replicated across the grid onto 2 or more geographically separate sites.
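The replication rule described above can be sketched in a few lines. This is a minimal illustration, not QCDgrid's implementation: the catalogue layout, the function name and the rule of counting distinct sites are assumptions for the example. Only the threshold of 2 copies comes from the talk.

```python
MIN_SITES = 2  # the talk states files are kept in at least 2 locations


def plan_replications(catalogue, element_site, alive):
    """Return the (file, target_element) copies needed to meet MIN_SITES.

    catalogue:    dict of file name -> set of storage elements with a copy
    element_site: dict of storage element -> site name
    alive:        set of storage elements that are currently responding
    """
    plan = []
    for name in sorted(catalogue):
        holders = {e for e in catalogue[name] if e in alive}
        if not holders:
            continue  # no live copy to replicate from
        used_sites = {element_site[e] for e in holders}
        # Only live elements that do not already hold the file are targets.
        for target in sorted(alive - holders):
            if len(used_sites) >= MIN_SITES:
                break
            if element_site[target] in used_sites:
                continue  # prefer a geographically separate site
            plan.append((name, target))
            used_sites.add(element_site[target])
    return plan
```

A control thread in this style would call `plan_replications` on each scan and carry out the returned copies; a newly added file simply appears in the catalogue with one holder and is picked up on the next pass.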
QCDgrid: Dealing with node loss
• If a storage element is lost unexpectedly, all files that were held on the failed system are replicated elsewhere.
• QCDgrid can cope with the loss of an entire site.
• If the control node is lost
  – control reverts to a secondary node.
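The two reactions to node loss can be pictured as follows. This is a sketch under stated assumptions: the ordered list of control candidates and the function name are hypothetical, and the talk does not describe how QCDgrid actually selects the secondary node.

```python
def handle_element_loss(lost, catalogue, control_order, alive):
    """Work out what must happen after one storage element disappears.

    lost:          name of the storage element that failed
    catalogue:     dict of file name -> set of storage elements with a copy
    control_order: control-node candidates, primary first (assumed ordering)
    alive:         set of storage elements responding before the failure
    """
    remaining = alive - {lost}
    # Every file that had a copy on the failed element needs a new replica.
    affected = sorted(name for name, holders in catalogue.items()
                      if lost in holders)
    # Control passes to the first candidate that is still responding.
    control = next((node for node in control_order if node in remaining), None)
    return affected, control
```

If the failed element was not the control node, the primary is still the first live candidate and control does not move.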
How is QCDgrid used?
• Assuming QCDgrid is set up and the control thread and metadata database are running…
• and that each user has a valid certificate…
• User submits the usual initialisation commands
  – grid-proxy-init
  – source the correct set-up files
  – set a few paths, classpaths, etc.
Submitting files
Command line
• User submits a file (datafile.dat)
  – put-file-on-qcdgrid datafile.dat
• AND an accompanying metadata file which describes the above data file
  – put-file-on-qcdgrid datafile.xml
  – exist:/db>put datafile.xml
Submitting files
(better still…) Using the GUI
• User runs the Java GUI and submits both data file and metadata file at the same time.
• metadata file has a tag for the name of the corresponding data file
  – marries the two
  – every data file should have an associated metadata file.
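The pairing of data files and metadata files can be checked mechanically. The sketch below assumes a tag called `data_file_name`; that name is hypothetical, since the talk does not give the schema, and the helper functions are illustrative rather than part of QCDgrid.

```python
import xml.etree.ElementTree as ET

DATA_FILE_TAG = "data_file_name"  # hypothetical tag name, not from the schema


def data_file_for(metadata_xml):
    """Return the data file named inside a metadata document, or None."""
    root = ET.fromstring(metadata_xml)
    element = root.find(f".//{DATA_FILE_TAG}")
    if element is None or not (element.text or "").strip():
        return None
    return element.text.strip()


def unpaired(data_files, metadata_docs):
    """List data files that no metadata document refers to."""
    described = {data_file_for(doc) for doc in metadata_docs}
    return sorted(set(data_files) - described)
```

Running `unpaired` over a submission batch would flag any data file that arrived without its metadata partner.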
Metadata Browser
• Can submit, search and retrieve data using this Java browser.
Metadata/Datagrid Integration
• QCDgrid software deals with storage and replication of data.
• eXist database deals with cataloging of data using metadata.
• All can be controlled using command line or GUI.
More on QCDgrid
• Other commands available with QCDgrid
  – qcdgrid-list lists all files on grid.
  – get-file-from-qcdgrid retrieves files from grid.
  – i-like-this-file attempts to store a file local to the user.
• There are also several commands for administering nodes, etc.
QCDgrid Sites
QCDgrid ENACTS ?
Our deployment of QCDgrid
• UK (all using e-science CA) -> Europe (all using different CAs).
• Moving from a homogeneous Linux environment to a mixed one (Linux/Solaris).
• Moving from Globus Toolkit (GT) 2.0 -> GT2.4.
How difficult was it?
• Certificates
  – Some certificate issuers took several weeks to issue certificates.
  – Different policies on issuing certificates, e.g. non-human users (project accounts).
  – Not too many difficulties using multiple certificates.
How difficult was it?
• Moving to a heterogeneous environment.
  – Installing Globus 2.x is difficult on Solaris; this led to the Solaris node being unable to submit data.
• A few minor problems getting system-specific functions to work (e.g. df command).
• Usual minor compilation issues; the gcc compiler was required.
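The talk does not say what the df command was used for or how the problem was solved. Assuming the purpose was to query free disk space on a storage element, the sketch below shows why a library call is less fragile than parsing `df` output, whose columns differ between Linux and Solaris. The function names are illustrative and this is not QCDgrid code.

```python
import shutil


def free_gigabytes(path):
    """Free space on the filesystem holding path, in GiB."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 1024 ** 3


def has_room_for(path, file_size_bytes):
    """True if a file of the given size would fit under path."""
    return shutil.disk_usage(path).free >= file_size_bytes
```

The same call works unchanged on both operating systems mentioned in the talk, so no per-platform parsing is needed.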
How difficult was it?
• Globus
  – This presented the biggest difficulty!
  – Installation difficulties and firewall issues
    • several months before a “helloworld” job would run from any site to any other.
  – Migrating from GT 2.0 -> GT 2.4
    • Major difficulties!
    • Had to re-write the replica schema.
    • Remove some error-handling functionality.
Users – Scientific scenario
• Appealed to QCD
  – Given more time we could have found a different discipline.
• Two users from TCD as well as those involved in the project itself.
• Code used was MILC code.
• Monte-Carlo simulation to investigate string-breaking.
• Three Monte-Carlo chains, one on each node.
User feedback
• Generally impressed with functionality.
• Some frustration in getting certificates.
• Difficulty persuading users to use metadata, although agreed it was useful.
• Would make more use of file-sharing using such a system.
• Liked machine-independent data.
• Wanted grid to do job submission.
Conclusions
• We have created a pan-European datagrid (metacentre) using QCDgrid technology.
• The system works well…
• …Globus is the limiting factor.
• Users were impressed with the system in use.
• The system was not tested with many users but we can see no reason why it would not scale to many users/nodes if Globus allows.
Acknowledgements
• James Perry.
• Jean-Christophe Desplat, Jacko Koster, Jan-Frode Myklebust, Csaba Anderlik, Geoff Bradley, Bob Crosbie.
• Craig McNeile and Bálint Joó.
• Mike Peardon and Jimmy Juge.
References
• ENACTS
  – http://www.enacts.org
• QCDgrid
  – http://www.gridpp.ac.uk/qcdgrid
  – code: http://forge.nesc.ac.uk/projects/qcdgrid
• MILC code
  – http://physics.indiana.edu/~sg/milc.html