Page 1: NOVA: Networked Object-based EnVironment for Analysis

P. Nevski, A. Vaniachine, T. Wenaus

NOVA is a project to develop distributed, object-oriented physics analysis components, adaptable to different experiments. A job configuration manager uses a scripting interface to provide web-based editing, submission and cataloguing of analysis jobs, both user-level and experiment-wide, centrally managed in a database. A client/server system distributed over compute nodes provides job submission and monitoring across facilities, which may span several sites. A file catalog records the production relationships of data files generated by an experiment. NOVA provides database tools for geometry and parameter object storage. A NOVA web-based browser navigates a relational database storing hierarchically structured dataObjects. Clients may access database information from the code or through a CORBA-specified interface. NOVA components have been tested and deployed in the STAR and ATLAS environments.

February 7, 2000

Page 2: Outline

February 7, 2000 CHEP in Padova

• Goals
• Requirements
• Architecture
• Components
• Details
• Summary

Page 3: Motivations

• Unprecedented data volume and software complexity in new large High Energy and Nuclear Physics experiments at RHIC (BNL) and LHC (CERN)
  – New approaches to analysis and data handling software
  – Distributed computing environment (DCE) is vital and increasingly powerful
  – Experience in developing DCE solutions for STAR
  – Build on experience to develop DCE tools for use in similarly challenging environments

Page 4: Goals

• Develop software tools for
  – coordination and control of widely distributed analysis development and physics analysis activity
  – distributed management and analysis of very large datasets
  – enhanced robustness, reusability and maintainability of analysis software
• For application in many global computing environments (ATLAS, STAR, …)
  – generic tools not tied to specific implementation choices
  – select, templatable implementations provided such that NOVA components can be used in a baseline framework

Page 5: Requirements

• Support wide area data intensive analysis
• Define the middleware services required to permit analysis applications to run effectively over wide area networks
• Provide a rich set of features that applications can select and use to obtain the level of service they need to operate
• Define the features and the APIs necessary to allow the application and middleware to communicate
• Integrate the middleware APIs with the applications

Page 6: Design Approach

• Small, modular components; application-neutral interfaces
  – Can be used as a coherent framework or in isolation to extend existing analysis systems
• Focused on support for C++ based analysis
  – Used for all RHIC, LHC and other large experiments
• Emphasis on user participation in iterative development; real-world prototyping and testing (STAR, ATLAS)
• Extensive use of existing tools and technologies
  – Must be readily available, true or de facto standards, well supported, widely used or showing good growth

Page 7: Component-based Architecture

[Figure: NOVA Architecture diagram]
Domains: Regional Center, Remote Clients, Middleware Components, Data Management. Further labels: Analysis Server, Remote Analysis.
Legend: application specific (sample implementation provided); NOVA component; third party tool customized for and integrated into NOVA; existing third party tool employed by NOVA. Status: prototyped, planned, implemented.
Components shown: Offline Control Framework; CVS Code Repository; Analysis Daemon; Dynamically loaded apps; MySQL Analysis Catalogue; Monitoring Module; HyperNews Bug system; State Server; Mobile Analysis Client; Web browser; Visualisation; GCA Query; nanoDST; Data Repository; Grand Challenge Architecture (GCA); MySQL Data Catalogue; Catalog Interface; Client Data Binder Module; Server Data Binder Module; Parameters Repository; MySQL Client State DB; Web Server Database Navigator.

Page 8: Tools and Technologies

• Third party tools and technologies used in NOVA:
  – MySQL: relational database for catalogs, state information and simple objects (C-structs)
  – Perl: Unix scripting and web development tool
  – Apache: customizable (Perl & PHP) web server for communication and monitoring
  – CORBA: low-volume interprocess data exchange
  – ROOT: visualization and analysis tools

Page 9: Components

NOVA components fall into four domains
– Regional Center
  • Central management and execution of analysis
– Remote Client
  • Mobile Analysis
– Middleware Components
  • Data exchange and navigation tools
  • Client/Server object request brokerage
– Data Management
  • Data repository, catalogue, and interface
  • Data model for simple objects (C-structs)

Page 10: Dynamic Binding

• Problem:
  – A user has a new idea that was not foreseen at the beginning. The user modifies the structure of one object in their application, and the application stores the new objects in the database.
  – Remote applications unaware of the new functionality may request objects in the old format.
• Solution:
  – Application: provides the metadata request (name, time, selectors...) and the application dataObject dictionary
  – Database server: provides the dataObject and the dictionary
  – Object Request Broker module: converts the dataObject according to the application dictionary
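
The conversion step in the solution above can be illustrated with a minimal Python sketch. This is not NOVA's broker code: the dictionary layout (ordered field name and type pairs), the `convert` function, the default values and the sample geometry fields are all assumptions made for illustration.

```python
# Hypothetical sketch of dictionary-driven conversion; not NOVA's actual broker.
from typing import Any, Dict, List, Tuple

# Assumption: a dictionary is an ordered list of (field name, type name) pairs.
Dictionary = List[Tuple[str, str]]

# Assumed defaults for fields that the stored object does not provide.
DEFAULTS: Dict[str, Any] = {"int": 0, "float": 0.0, "char": ""}
PY_TYPES = {"int": int, "float": float, "char": str}


def convert(data_object: Dict[str, Any],
            database_dict: Dictionary,
            application_dict: Dictionary) -> Dict[str, Any]:
    """Reshape a stored dataObject into the layout the application expects."""
    stored_types = dict(database_dict)
    result: Dict[str, Any] = {}
    for name, wanted_type in application_dict:
        if name not in stored_types:
            # Field unknown to the stored version: fill with a default.
            result[name] = DEFAULTS[wanted_type]
        elif stored_types[name] != wanted_type:
            # Same name, different type: cast to the application's type.
            result[name] = PY_TYPES[wanted_type](data_object[name])
        else:
            result[name] = data_object[name]
    # Fields that exist only in the stored version are dropped.
    return result


# New-format object stored by an updated application.
database_dict = [("layer", "int"), ("radius", "float"), ("material", "char")]
stored = {"layer": 3, "radius": 12.5, "material": "Si"}

# Older executable that only knows the original two-field layout.
old_app_dict = [("layer", "int"), ("radius", "float")]
old_view = convert(stored, database_dict, old_app_dict)
```

In this sketch the application never sees the stored layout directly: it receives only the fields named in its own dictionary, which is the separation the broker is meant to provide.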

Page 11: Dynamic Object Broker

[Figure: Dynamic Object Broker diagram]
Tiers: Remote Application Clients, Middleware Services, Central Database Server.
Labels: Application DataObject, Application Dictionary, Object Request Broker, Database DataObject, Database Dictionary, Parameters Repository.

Page 12: Forward Compatibility

• Benefits:
  – Separation of database and analysis applications
  – Robust interface (via built-in type checking)
  – Dictionary built from C header files or IDL files
  – Database access is independent of application code version: user can read new dataObjects with an old executable
• Usage:
  – Parameters data management (versioned geometry and reconstruction constants support)
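
As a rough illustration of building a dictionary from a C header and using it for type checking, the Python sketch below parses a simple struct with a regular expression. The parser, the `build_dictionary` and `type_check` names, and the `TrackerLayer` struct are hypothetical. The sketch handles only plain scalar and fixed-size array members, and it does not cover IDL files.

```python
import re
from typing import Any, Dict, List, Tuple

# Matches one member declaration such as "float radius;" or "int hits[4];".
MEMBER = re.compile(r"^\s*(\w+)\s+(\w+)\s*(?:\[(\d+)\])?\s*;", re.MULTILINE)

# Assumed mapping from C scalar types to Python types.
C_TO_PY = {"int": int, "float": float, "double": float}


def build_dictionary(header_text: str) -> List[Tuple[str, str, int]]:
    """Return (field name, C type, array length) for each struct member."""
    body = re.search(r"struct\s+\w+\s*\{(.*?)\}", header_text, re.DOTALL)
    if body is None:
        raise ValueError("no struct definition found")
    return [(name, ctype, int(length) if length else 1)
            for ctype, name, length in MEMBER.findall(body.group(1))]


def type_check(dictionary: List[Tuple[str, str, int]],
               values: Dict[str, Any]) -> bool:
    """Check that every field has the declared type and array length."""
    for name, ctype, length in dictionary:
        items = values[name] if length > 1 else [values[name]]
        if len(items) != length:
            return False
        if not all(isinstance(item, C_TO_PY[ctype]) for item in items):
            return False
    return True


# Hypothetical header content, used only for this example.
HEADER = """
struct TrackerLayer {
    int layer;
    float radius;
    int hits[4];
};
"""

layer_dict = build_dictionary(HEADER)
```

Because the dictionary is derived from the header rather than written by hand, the same declaration can drive both the application code and the check applied to objects going into or out of the database.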

Page 13: Static Binding

• Problem:
  – Remote application (web browser) navigates current database hierarchy.
• Solution:
  – Object Request Broker at the Regional Center serves dynamic HTML dataObjects in a format tailored to the application ID: Netscape or MS Internet Explorer
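
A minimal sketch of tailoring the HTML to the application ID follows. It assumes the application ID is the browser's User-Agent string and that Internet Explorer is recognised by the "MSIE" token in that string. The `render` function and both HTML layouts are illustrative choices, not the NOVA browser's actual output.

```python
from html import escape
from typing import Dict


def render(data_object: Dict[str, object], application_id: str) -> str:
    """Serve one dataObject as HTML in a layout chosen by application ID."""
    rows = [(escape(str(key)), escape(str(value)))
            for key, value in data_object.items()]
    if "MSIE" in application_id:
        # Assumed layout for Internet Explorer clients: a table.
        cells = "".join(f"<tr><td>{k}</td><td>{v}</td></tr>" for k, v in rows)
        return f"<table>{cells}</table>"
    # Assumed layout for Netscape and any other client: a definition list.
    items = "".join(f"<dt>{k}</dt><dd>{v}</dd>" for k, v in rows)
    return f"<dl>{items}</dl>"


ie_page = render({"radius": 12.5},
                 "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)")
netscape_page = render({"radius": 12.5}, "Mozilla/4.7 [en] (X11; U; Linux)")
```

The binding is static in the sense that the set of output formats is fixed on the server side. The client only identifies itself and receives the matching format.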

Page 14: Static Object Broker

[Figure: Static Object Broker diagram]
Tiers: Remote Application Client, Middleware Services, Regional Center Database Server.
Labels: NOVA Browser, Application ID, Application DataObject, Apache Web Server, Database API Module, Database API Call, Database DataObject, Parameters Repository.

Page 15: Layered Interface

Page 16: Data Model

[Figure: data model diagram; labels: structure, relation, parameter, array of structures, array of parameters]
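
One possible way to hold structures, relations and parameters in a relational database is sketched below. The transcript does not preserve the actual NOVA schema, so the table layout, the column names and the sample detector hierarchy are all assumptions. SQLite from the Python standard library stands in for MySQL so that the example is self-contained.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE structure (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE relation  (parent_id INTEGER REFERENCES structure(id),
                        child_id  INTEGER REFERENCES structure(id));
CREATE TABLE parameter (structure_id INTEGER REFERENCES structure(id),
                        name TEXT NOT NULL, value REAL);
""")

# Hypothetical sample data: a detector with two sub-structures.
db.executemany("INSERT INTO structure VALUES (?, ?)",
               [(1, "detector"), (2, "tracker"), (3, "calorimeter")])
db.executemany("INSERT INTO relation VALUES (?, ?)", [(1, 2), (1, 3)])
db.executemany("INSERT INTO parameter VALUES (?, ?, ?)",
               [(2, "radius", 12.5), (2, "length", 200.0)])


def children(parent_name):
    """List the names of structures directly below the given structure."""
    rows = db.execute(
        """SELECT c.name FROM structure p
           JOIN relation r ON r.parent_id = p.id
           JOIN structure c ON c.id = r.child_id
           WHERE p.name = ? ORDER BY c.name""", (parent_name,))
    return [name for (name,) in rows]


def parameters(structure_name):
    """Return the parameters of one structure as a name-to-value mapping."""
    rows = db.execute(
        """SELECT q.name, q.value FROM parameter q
           JOIN structure s ON s.id = q.structure_id
           WHERE s.name = ? ORDER BY q.name""", (structure_name,))
    return dict(rows.fetchall())
```

With this layout, navigating the hierarchy is a join through the relation table, and an array of parameters is simply the set of rows attached to one structure.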

Page 17: Cataloguing Analysis Workflow

[Figure: workflow diagram; labels: Job configuration manager, Job monitoring system, fileCatalog]

Page 18: Grand Challenge Interface

[Figure: Grand Challenge interface diagram]
Groups: GC System, GCA Interface, STAR Components.
Labels: database, StIOMaker, fileCatalog, tagDB, QueryMonitor, CacheManager, QueryEstimator, gcaClient, FileCatalog, IndexFeeder, IndexBuilder.

Page 19: Limiting Dependencies

Experiment-specific

• IndexFeeder server
  – IndexFeeder reads the “tag database” so that the GCA “index builder” can create an index
• FileCatalog server
  – FileCatalog queries the “file catalog” database of the experiment to translate fileID to HPSS & disk path

& GCA-dependent

• gcaClient interface
  – Experiment sends queries and gets back filenames through the gcaClient library calls
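
The fileID translation performed by the FileCatalog server can be pictured with a small lookup sketch. The table name, column names, paths and the `translate` function are hypothetical. SQLite from the Python standard library stands in for the experiment's file catalog database.

```python
import sqlite3
from typing import Optional, Tuple

# In-memory stand-in for the experiment's "file catalog" database.
catalog = sqlite3.connect(":memory:")
catalog.execute(
    """CREATE TABLE file_catalog (
           file_id   INTEGER PRIMARY KEY,
           hpss_path TEXT NOT NULL,
           disk_path TEXT)""")

# Hypothetical entries: file 102 is on tape only, with no disk copy.
catalog.executemany(
    "INSERT INTO file_catalog VALUES (?, ?, ?)",
    [(101, "/hpss/example/run1/file101.root", "/data/example/run1/file101.root"),
     (102, "/hpss/example/run1/file102.root", None)])


def translate(file_id: int) -> Tuple[str, Optional[str]]:
    """Translate a fileID to its (HPSS path, disk path or None)."""
    row = catalog.execute(
        "SELECT hpss_path, disk_path FROM file_catalog WHERE file_id = ?",
        (file_id,)).fetchone()
    if row is None:
        raise KeyError(f"unknown fileID {file_id}")
    return row[0], row[1]
```

Keeping this lookup behind a single call is one way to confine the experiment-specific part: the GCA side only needs fileIDs, and only the FileCatalog server needs to know how the experiment stores its paths.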

Page 20: Summary

What is NOVA?
• Framework components for distributed computing

What are NOVA components?
• Configuration manager for analysis jobs
• Distributed job submission and monitoring system
• Analysis workflow catalog
• Database for versioned dataObjects
• Brokered extraction of dataObjects
• Web-based database navigation tool