
SUMS ( STAR Unified Meta Scheduler )

Jan 07, 2016


Page 1: SUMS ( STAR Unified Meta Scheduler )
Page 2: SUMS ( STAR Unified Meta Scheduler )

SUMS (STAR Unified Meta Scheduler)

• SUMS is a highly modular meta-scheduler currently in use by STAR at its large data processing sites (e.g. RCF / PDSF). It is also used by other organizations such as Stony Brook University and as a back end to some PHENIX GUI applications.

• STAR has been using SUMS for 3.5 years now for both production and simulation jobs, but more importantly as a tool for users to submit requests (jobs).

• The function of SUMS:
– Run processes on large datasets (many files) that may be distributed across many nodes, clusters, sites and batch systems.
– Resolve the user's abstract requests into an actual set of jobs that can be run on one or more farms.
– Resolve requests for datasets. This is done using a catalog plug-in which resolves the user's request for a dataset into LFNs or PFNs.
– Write scripts and submit them to the batch system(s).
– Embed resource handling information for the batch system to use.
– Group and split work in the most efficient way possible.

Page 3: SUMS ( STAR Unified Meta Scheduler )

Who contributes to SUMS research, development and administration

• PPDG – funding
• Jerome Lauret and Levente Hajdu – development and administration of SUMS at BNL
• Lidia Didenko – testing for grid readiness
• David Alexander and Paul Hamill (Tech-X Corp) – RDL deployment and prototype client and web service
• Eric Hjort, Iwona Sakrejda, Doug Olson – administration of SUMS at PDSF
• Valeri Fine – job tracking
• Andrey Y. Shevel – administration of SUMS at Stony Brook University
• Gabriele Carcassi – development and administration of SUMS
• Efstratios Efstathiadis – queue monitoring, research
• And others

Page 4: SUMS ( STAR Unified Meta Scheduler )

Benefits of using SUMS over submitting directly

• No knowledge of scripting is required for splitting and submitting jobs.

• No knowledge of how to use the batch system is needed.

• Datasets are resolved and chopped for the user.

• The user is totally shielded from the complications of using the distributed file system.

• There are safety measures in place to prevent users from bringing down the batch system by overusing resources.

Page 5: SUMS ( STAR Unified Meta Scheduler )
Page 6: SUMS ( STAR Unified Meta Scheduler )
Page 7: SUMS ( STAR Unified Meta Scheduler )

[Figure: RCF diagram with the labels QUEUE, NODES, JOBS, STARDATA02, STARDATA05 and STARDATA24]

Page 8: SUMS ( STAR Unified Meta Scheduler )


Page 9: SUMS ( STAR Unified Meta Scheduler )

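[Figure: the disks STARDATA24, STARDATA02 and STARDATA05 modeled as virtual resources, with queued and running jobs]

Virtual resources:
• SD24 virtual resource (102 units total)
• SD02 virtual resource (800 units total)
• SD05 virtual resource (2040 units total)

Resource units requested by the queued and running jobs shown in the figure:
– SD05 = 500, SD24 = 50
– SD05 = 500, SD24 = 10
– SD05 = 1000, SD24 = 2
– SD05 = 30, SD24 = 3
– SD05 = 30, SD02 = 600
– SD05 = 800, SD24 = 50
– SD05 = 700, SD02 = 450
– SD05 = 750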
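One way to read the virtual resource figure: each disk is a pool with a fixed number of units, every job declares how many units it needs, and a job can only start while its request still fits. The Python sketch below illustrates that accounting with the numbers from the slide. The first-come, first-served admission rule and the names `CAPACITY` and `admit` are assumptions made for illustration; the slides do not say how the batch system actually decides which jobs run.

```python
# Capacities are taken from the slide; the admission rule is an assumption.
CAPACITY = {"SD24": 102, "SD02": 800, "SD05": 2040}

def admit(jobs, capacity):
    """Return (running, queued, free) under a first-come, first-served rule.

    Each job is a dict mapping a virtual resource name to requested units.
    A job starts only if every resource it asks for has enough units left.
    """
    free = dict(capacity)
    running, queued = [], []
    for job in jobs:
        if all(free.get(name, 0) >= units for name, units in job.items()):
            for name, units in job.items():
                free[name] -= units
            running.append(job)
        else:
            queued.append(job)
    return running, queued, free

jobs = [
    {"SD05": 500, "SD24": 50},
    {"SD05": 500, "SD24": 10},
    {"SD05": 1000, "SD24": 2},
    {"SD05": 30, "SD24": 3},
    {"SD05": 30, "SD02": 600},
    {"SD05": 800, "SD24": 50},
    {"SD05": 700, "SD02": 450},
    {"SD05": 750},
]
running, queued, free = admit(jobs, CAPACITY)
print(len(running), "running,", len(queued), "queued, free units:", free)
```

Under this rule the first four jobs use up almost all of SD05, so the remaining four wait even though SD02 is untouched. A real scheduler would also release units when a job finishes and re-examine the queue.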

Page 10: SUMS ( STAR Unified Meta Scheduler )

Variables generated on the fly for users

• $JOBID – A unique ID given to every job that SUMS will ever run.
– Example: 62338C856E6B2B0ABF0344116F94CEA3_0

• $PROCESSID – The number of the job within the request, numbered 0, 1, 2, … n.

• $SCRATCH – An area on the local system that users can use for temporary files (temp space).
– Example: /tmp/$USER/$JOBID

• $FILELIST – The location of the subset of the dataset that SUMS has chopped off for processing by a given job.

• Others
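The following Python sketch shows how such per-job variables could be filled in for a command line. It is an illustration only: the slides do not say whether SUMS substitutes the text or exports environment variables, the composition of `$JOBID` as request ID plus `_` plus process number is inferred from the single example above, and the user name and file list path are made up.

```python
from string import Template

def job_environment(request_id, process_id, user, filelist_path):
    """Build the per-job variables described above (naming scheme assumed)."""
    job_id = f"{request_id}_{process_id}"  # assumption: request ID + "_" + process number
    return {
        "JOBID": job_id,
        "PROCESSID": str(process_id),
        "SCRATCH": f"/tmp/{user}/{job_id}",
        "FILELIST": filelist_path,  # hypothetical location of this job's file list
    }

def expand(command, env):
    """Substitute $NAME variables; names that are not defined are left untouched."""
    return Template(command).safe_substitute(env)

env = job_environment("62338C856E6B2B0ABF0344116F94CEA3", 0, "alice", "/tmp/alice/job_0.list")
print(expand("myprog --in $FILELIST --out $SCRATCH/out.root", env))
```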

Page 11: SUMS ( STAR Unified Meta Scheduler )

JDL job XSD tree View

Page 12: SUMS ( STAR Unified Meta Scheduler )

Job Parameters

• Required
– Command – The command(s) to be run on the files
– stdout

• Optional
– Name
– stderr
– maxFilesPerProcess (max files per job)
– minFilesPerProcess (min files per job)
– minMemory
– maxMemory
– simulateSubmission
– filesPerHour
– minWallTime
– maxWallTime
– fileListSyntax
– fromScratch

Page 13: SUMS ( STAR Unified Meta Scheduler )

Sample job

<?xml version="1.0" encoding="utf-8" ?>
<job maxFilesPerProcess="10" fileListSyntax="rootd" minMemory="15">
  <command>root4star -q -b /star/macro/runMuHeavyMaker.C\(\"$SCRATCH/heavy.MuDst.root\",\"$FILELIST\"\)</command>
  <stdout URL="file:/star/u/lbhajdu/temp/heavy.$JOBID.out" />
  <input URL="catalog:star.bnl.gov?production=P04ik,trgsetupname=proHigh,filetype=daq_reco_MuDst,tpc=1,ftpc=1,sanity=1" nFiles="all" />
  <output fromScratch="*.MuDst.root" toURL="file:/star/data02/heavy.$JOBID.root" />
</job>
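As a quick illustration of the job description format, the Python sketch below parses a cleaned-up copy of the sample job (straight quotes, no stray punctuation between attributes, XML declaration omitted) and checks that the two required parameters from the Job Parameters slide, the command and stdout, are present. The function name `check_job` is hypothetical, and this is not how SUMS itself validates a request.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the sample job from the slide.
JOB_XML = r"""
<job maxFilesPerProcess="10" fileListSyntax="rootd" minMemory="15">
  <command>root4star -q -b /star/macro/runMuHeavyMaker.C\(\"$SCRATCH/heavy.MuDst.root\",\"$FILELIST\"\)</command>
  <stdout URL="file:/star/u/lbhajdu/temp/heavy.$JOBID.out" />
  <input URL="catalog:star.bnl.gov?production=P04ik,trgsetupname=proHigh,filetype=daq_reco_MuDst,tpc=1,ftpc=1,sanity=1" nFiles="all" />
  <output fromScratch="*.MuDst.root" toURL="file:/star/data02/heavy.$JOBID.root" />
</job>
"""

REQUIRED = ("command", "stdout")  # the two required job parameters

def check_job(xml_text):
    """Return (job attributes, list of missing required elements)."""
    job = ET.fromstring(xml_text)
    missing = [tag for tag in REQUIRED if job.find(tag) is None]
    return dict(job.attrib), missing

attrs, missing = check_job(JOB_XML)
print(attrs["maxFilesPerProcess"], missing)
```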

Page 14: SUMS ( STAR Unified Meta Scheduler )
Page 15: SUMS ( STAR Unified Meta Scheduler )

Configuring SUMS

• SUMS uses Java's standard XML de-serialization for its configuration. Over the years we have found this to be the ideal balance between ease of use and the power to define complex systems abstractly.
• Pre-initialized scheduler objects are defined by the administrator.
• One configuration file can hold many different instances of the same object.
• By default the user is given the default objects, but they can specify other objects that have been customized for the special needs of their jobs.
• Objects include:
– JobInitializer
– Policy
– Queue
– Dispatcher
– Application
– Statistics recorder
– Others

Page 16: SUMS ( STAR Unified Meta Scheduler )

JobInitializer

• The job initializer is the module through which the user submits a job.

• JobInitializers currently available:
– Local command line
– Command line (web service): tested, still in beta
– GUI (web service): tested, still in beta

Page 17: SUMS ( STAR Unified Meta Scheduler )
Page 18: SUMS ( STAR Unified Meta Scheduler )

Dispatchers

• A dispatcher is a scheduler plug-in module that implements the dispatcher interface and converts job objects into “real” jobs that are actually submitted to the batch system.

• Currently available dispatchers:
– Boss
– Condor
– CondorG
– Local (new)
– LSF
– PBS
– SGE (new but heavily tested by PDSF)

Page 19: SUMS ( STAR Unified Meta Scheduler )

Virtual Queues

• A virtual queue defines a “place” (queue, pool, meta queue, service, etc.) that a job can be submitted to.

• It defines the properties of that place.

• Each virtual queue points to one dispatcher object.

Page 20: SUMS ( STAR Unified Meta Scheduler )

Virtual Queues

Page 21: SUMS ( STAR Unified Meta Scheduler )

Virtual Queues

A typical queue configuration:

<object id="NSFlocalQueueObj" class="gov.bnl.star.offline.scheduler.Queue">
  <void method="setID"> <string>localQueue</string> </void>
  <void method="setName"> <string>star_cas_dd</string> </void>
  <void method="setAssociatedDispatcher"> <object idref="RCASDispatcher"/> </void>
  <void method="setCluster"> <string>rcas.rcf.bnl.gov</string> </void>
  <void method="setTimeLimit"> <int>90</int> </void>
  <void method="setMaxMemory"> <int>440</int> </void>
  <void method="setSearchOrderPriority"> <int>1</int> </void>
  <void method="setType"> <string>LSF</string> </void>
  <void method="setImplementation"> <string>local</string> </void>
</object>
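The queue configuration is a list of setter calls on a pre-initialized object, in the style that Java's `java.beans.XMLDecoder` reads. To make that structure concrete, the Python sketch below walks the same XML and collects each setter with its argument. It is only a reading aid under that assumption: SUMS itself is a Java application, and the function name `read_setters` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The queue configuration from the slide, reflowed one setter per line.
QUEUE_XML = """
<object id="NSFlocalQueueObj" class="gov.bnl.star.offline.scheduler.Queue">
  <void method="setID"><string>localQueue</string></void>
  <void method="setName"><string>star_cas_dd</string></void>
  <void method="setAssociatedDispatcher"><object idref="RCASDispatcher"/></void>
  <void method="setCluster"><string>rcas.rcf.bnl.gov</string></void>
  <void method="setTimeLimit"><int>90</int></void>
  <void method="setMaxMemory"><int>440</int></void>
  <void method="setSearchOrderPriority"><int>1</int></void>
  <void method="setType"><string>LSF</string></void>
  <void method="setImplementation"><string>local</string></void>
</object>
"""

def read_setters(xml_text):
    """Return (object id, {setter name: argument}) for one configured object."""
    obj = ET.fromstring(xml_text)
    settings = {}
    for call in obj.findall("void"):
        arg = call[0]  # every setter in this example takes exactly one argument
        if arg.tag == "int":
            value = int(arg.text)
        elif arg.tag == "object":
            value = arg.get("idref")  # reference to an object defined elsewhere
        else:
            value = arg.text
        settings[call.get("method")] = value
    return obj.get("id"), settings

queue_id, settings = read_setters(QUEUE_XML)
print(queue_id, settings["setTimeLimit"], settings["setAssociatedDispatcher"])
```

The `idref` on `setAssociatedDispatcher` is what ties this virtual queue to its one dispatcher object.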

Page 22: SUMS ( STAR Unified Meta Scheduler )

Policies

• Resolves requests for datasets.
• Chops the dataset and creates jobs to work on each piece.
– Tries to split in the most optimal way.
– Groups files based on where they have to be processed, in the case of files on distributed disk.
– The size of each sub-dataset is based on the user's min and max dataset size requirements and on the time limit of the queue, converted into a file count using files per hour if the user supplies this parameter.
• Breaks requests into jobs.
• Assigns job objects to queue objects using an algorithm unique to each policy class.
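The chopping step can be sketched as follows in Python: group files by the node where they must be processed, then cut each group into pieces no larger than the user's maximum and no larger than what the queue's time limit allows at the stated files per hour. This is a simplified illustration, not the SUMS algorithm. It ignores minFilesPerProcess, assumes the queue time limit is expressed in minutes, and the function names are hypothetical.

```python
import math
from collections import defaultdict

def files_per_job(max_files, files_per_hour=None, time_limit_minutes=None):
    """Largest piece allowed by the user's max and, if known, the queue's time limit."""
    limit = max_files
    if files_per_hour is not None and time_limit_minutes is not None:
        by_time = math.floor(files_per_hour * time_limit_minutes / 60)
        limit = min(limit, max(by_time, 1))  # never go below one file per job
    return limit

def split(files, max_files, files_per_hour=None, time_limit_minutes=None):
    """Group (node, path) pairs by node, then chop each group into jobs."""
    size = files_per_job(max_files, files_per_hour, time_limit_minutes)
    by_node = defaultdict(list)
    for node, path in files:
        by_node[node].append(path)
    jobs = []
    for node, paths in by_node.items():
        for start in range(0, len(paths), size):
            jobs.append((node, paths[start:start + size]))
    return jobs

files = [("node1", f"a{i}.root") for i in range(5)] + [("node2", f"b{i}.root") for i in range(3)]
jobs = split(files, max_files=10, files_per_hour=2, time_limit_minutes=90)
for node, piece in jobs:
    print(node, piece)
```

Here the user would accept 10 files per job, but at 2 files per hour a 90-minute queue only leaves room for 3, so the time limit decides the piece size.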

Page 23: SUMS ( STAR Unified Meta Scheduler )

Policies

Page 24: SUMS ( STAR Unified Meta Scheduler )

Policies

Example of a custom policy used by the STAR resonance group. The algorithm for deciding where jobs go is “PassivePolicy”; the queues used are NSFlocalQueueObj, NFSQueueObj and HBT_group_Queue.

<object class="gov.bnl.star.offline.scheduler.policy.PassivePolicy">
  <void method="addQueue"> <object idref="NSFlocalQueueObj"/> </void>
  <void method="addQueue"> <object idref="NFSQueueObj"/> </void>
  <void method="addQueue"> <object idref="HBT_group_Queue"/> </void>
</object>

Page 25: SUMS ( STAR Unified Meta Scheduler )
Page 26: SUMS ( STAR Unified Meta Scheduler )

Policies

• PassivePolicy – A simplistic policy that allows the administrator to set the order in which queues will be tried. The order is set by a property of the queue called “search order priority”. If two or more queues have the same search order priority, they are tried in a round-robin fashion.

• ClusterAssignmentByMonitorPolicy – The first “monitoring policy” ever tested. It detects the load of each cluster and then uses an equation to determine what percentage of jobs should go to that cluster.

• AssignmentByQueueMonitorPolicy – A “monitoring policy” that works at the queue level. Its performance is better than that of ClusterAssignmentByMonitorPolicy. It monitors the waiting time and throughput of each queue, using a plug-in developed for MonALISA, to determine the best (fastest) queue to submit to. Unlike other schedulers that attempt to model every single variable, this policy uses only a handful of variables that reflect the state of possibly hundreds or thousands of factors.
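The PassivePolicy ordering rule can be sketched in a few lines of Python: sort queues by search order priority and rotate queues that share a priority so successive jobs start with a different one. Several details are assumptions, since the slides do not state them: that a lower priority value means "try first", that the rotation advances once per job, and the priority values given to NFSQueueObj and HBT_group_Queue (only NSFlocalQueueObj's value of 1 appears in the queue configuration slide).

```python
from itertools import groupby

def try_order(queues, job_index):
    """Return queue names in the order a passive policy might try them.

    queues: list of (name, search_order_priority). Queues sharing a priority
    are rotated by job_index, which gives round-robin behavior across jobs.
    """
    ordered = sorted(queues, key=lambda q: q[1])  # stable sort keeps ties in input order
    result = []
    for _, group in groupby(ordered, key=lambda q: q[1]):
        names = [name for name, _ in group]
        shift = job_index % len(names)
        result.extend(names[shift:] + names[:shift])
    return result

# Priorities 2 and 2 are hypothetical values chosen to show a tie.
queues = [("NSFlocalQueueObj", 1), ("NFSQueueObj", 2), ("HBT_group_Queue", 2)]
for i in range(3):
    print(i, try_order(queues, i))
```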

Page 27: SUMS ( STAR Unified Meta Scheduler )
Page 28: SUMS ( STAR Unified Meta Scheduler )
Page 29: SUMS ( STAR Unified Meta Scheduler )
Page 30: SUMS ( STAR Unified Meta Scheduler )

[Figure with panels labeled “Passive Policy” and “Monitoring Policy”]

Page 31: SUMS ( STAR Unified Meta Scheduler )
Page 32: SUMS ( STAR Unified Meta Scheduler )
Page 33: SUMS ( STAR Unified Meta Scheduler )

Reports, Logs and Statistics

• Collection of logs and statistics is optional; the user's report file is always generated.

• Reports
– Reports are put in the user's directory. They give the user information about the internal workings of SUMS.
– They report information about every job that was processed.
– The user decides when to delete these.

• Logs
– Hold information in a central area that is more relevant to the administrator, for diagnosing problems.
– The administrator decides when to delete these.

• Statistics
– General information about how many people are using SUMS and what options they are using.

Page 34: SUMS ( STAR Unified Meta Scheduler )
Page 35: SUMS ( STAR Unified Meta Scheduler )
Page 36: SUMS ( STAR Unified Meta Scheduler )
Page 37: SUMS ( STAR Unified Meta Scheduler )
Page 38: SUMS ( STAR Unified Meta Scheduler )

Job tracking / monitoring /crash recovery

• Dispatchers in SUMS currently provide 3 functions:
– Submit job(s)
– Get status of job(s)
– Kill job(s)
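The three dispatcher functions map naturally onto a small interface. The Python sketch below is a hypothetical rendering of that idea: the real dispatcher interface is part of the Java code base and its method names and signatures are not shown in the slides. The `InMemoryDispatcher` is a toy stand-in that only records state and talks to no batch system.

```python
from abc import ABC, abstractmethod

class Dispatcher(ABC):
    """Hypothetical interface with the three functions listed above."""

    @abstractmethod
    def submit(self, job_ids):
        """Hand jobs to the batch system."""

    @abstractmethod
    def status(self, job_ids):
        """Return a mapping of job ID to state."""

    @abstractmethod
    def kill(self, job_ids):
        """Remove jobs from the batch system."""

class InMemoryDispatcher(Dispatcher):
    """Toy dispatcher used only to exercise the interface."""

    def __init__(self):
        self._state = {}

    def submit(self, job_ids):
        for job_id in job_ids:
            self._state[job_id] = "QUEUED"
        return list(job_ids)

    def status(self, job_ids):
        return {job_id: self._state.get(job_id, "UNKNOWN") for job_id in job_ids}

    def kill(self, job_ids):
        for job_id in job_ids:
            if job_id in self._state:
                self._state[job_id] = "KILLED"

dispatcher = InMemoryDispatcher()
dispatcher.submit(["req_0", "req_1"])
dispatcher.kill(["req_1"])
print(dispatcher.status(["req_0", "req_1", "req_2"]))
```

Keeping the interface this narrow is what lets one policy and queue configuration sit on top of different batch systems.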

Page 39: SUMS ( STAR Unified Meta Scheduler )

Job tracking / monitoring /crash recovery

• To implement this in the simplest, most carefree way possible, it was decided that no central database should be used to store this information. The information is given to the users directly.

• The benefits are:
– No databases need to be set up at sites running SUMS. This automatically eliminates all the associated security and administration considerations.
– The user decides when they no longer need this data, since the data now lives in the user's file system as a file generated by SUMS.

Page 40: SUMS ( STAR Unified Meta Scheduler )
Page 41: SUMS ( STAR Unified Meta Scheduler )

RDL

Request Definition Language – An XML-based language under development by STAR, in collaboration with other scientific groups and private industry, for describing not just one job but many jobs and the relationships between them. It is geared towards web services with advanced GUI clients.

Page 42: SUMS ( STAR Unified Meta Scheduler )

RDL

The terminology for the layers of abstraction is not very clear, and all-inclusive definitions are hard to come by. Note: these are only guidelines.

Abstract / Meta / Composite request – Defines a group of requests performing a common task. The order in which they run may be important. The output of one request may be the input to another request in the same meta request. Example: make a new dataset by running a program; when it is done, sum the output and render a histogram.

Request or meta job – Defines a group of [0 to many] jobs that have a common function and can be run simultaneously. Example: take a dataset and run an application on it.

Physical job – The unit of work the batch system deals with. Example: take a dataset and run an application on it.
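The three layers nest: a composite request holds ordered requests, and each request holds zero or more physical jobs. The Python sketch below models only that nesting. The class and field names are hypothetical and do not come from the RDL schema, and the example commands are made up.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PhysicalJob:
    command: str  # the unit of work handed to the batch system

@dataclass
class Request:
    name: str
    jobs: List[PhysicalJob] = field(default_factory=list)  # zero to many, may run simultaneously

@dataclass
class CompositeRequest:
    name: str
    requests: List[Request] = field(default_factory=list)  # list order stands in for run order

    def job_count(self):
        """Total number of physical jobs across all requests."""
        return sum(len(request.jobs) for request in self.requests)

# The slide's example: make a new dataset, then sum the output and render a histogram.
make = Request("make-dataset", [PhysicalJob(f"makeData part{i}") for i in range(3)])
plot = Request("histogram", [PhysicalJob("sumAndPlot")])
composite = CompositeRequest("new-dataset-with-histogram", [make, plot])
print(composite.job_count())
```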

Page 43: SUMS ( STAR Unified Meta Scheduler )

RDL vs. JDL

Features compared for RDL and JDL:

• Submitting on a grid landscape
• Supports submission of multiple jobs
• Supports submission of multiple requests
• Separates task and application
• Supports work flow
• XML format