China-VO Data Access Service Based on OGSA

Jan 15, 2016

Page 1: China-VO Data Access Service  Based on OGSA

11/27/2003 IVOA Small Projects Meeting 2003

China-VO Data Access Service Based on OGSA

Jian Sang
National Astronomical Observatory of China

Chinese Virtual Observatory

Page 2: China-VO Data Access Service  Based on OGSA

Outline

• VO, Grid and OGSA
• Build the catalog data service
• Build the image mosaic service
• Faced technical difficulties

Page 3: China-VO Data Access Service  Based on OGSA

The Increase Of Astronomical Data

The number of pixels and the data double every year!

The total area of astro telescopes in m**2

The total Gigapixels of CCDs.

Page 4: China-VO Data Access Service  Based on OGSA

Challenges

• The quantity of data is approaching the petabyte (PB) scale.

• The data is distributed and stored in heterogeneous DBMSs in heterogeneous host environments.

Page 5: China-VO Data Access Service  Based on OGSA

The VO’s Goal

• The VO’s initial goal is to federate existing astronomical data archives and provide standard services for manipulating these data.

HOW TO REACH THIS GOAL?

The Grid technology can solve the problem!

Page 6: China-VO Data Access Service  Based on OGSA

What is Grid

• Grid technology has its genesis in metacomputing, but…

• In practice, the Grid is about resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations

• Focus on how to enable, maintain and control the sharing of resources to achieve a common goal

Page 7: China-VO Data Access Service  Based on OGSA

What the “Grid” offers:

Resource management protocols and services that support secure remote access to shared data resources and computing and the co-allocation of multiple resources.

Security solutions that support management of credentials and policies.

Information query protocols and services that provide configuration and status information about resources, organizations and services.

Data Management services that locate and transport datasets between storage systems and applications.

Page 8: China-VO Data Access Service  Based on OGSA

What is OGSA

• The Open Grid Services Architecture (OGSA) represents an evolution towards a Grid system architecture based on Web services concepts and technologies. 

• The OGSA integrates key Grid technologies (including the Globus Toolkit) with Web services mechanisms to create a distributed system framework based around the Open Grid Services Infrastructure (OGSI).

In Grids, everything is a service

Page 9: China-VO Data Access Service  Based on OGSA

The Open Grid Services Architecture

• Service orientation to virtualize resources
• From Web services (everything is a service):
  - Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency
• Building on Globus Toolkit:
  - Grid service: semantics for service interactions
  - Management of transient instances
  - Factory, Registry, Discovery, other services
  - Reliable and secure transport
• Multiple host environments: J2EE, .NET, C, …

Page 10: China-VO Data Access Service  Based on OGSA

The Structure of Grid Service

Page 11: China-VO Data Access Service  Based on OGSA

Grid service interfaces

Page 12: China-VO Data Access Service  Based on OGSA

Construct The Astronomical Data Grid

The astronomical data service is the most fundamental and important component in Virtual Observatory.

In terms of data sharing, the VO can be thought of as an astronomical Data Grid.

VO=Astronomical Data Grid

Page 13: China-VO Data Access Service  Based on OGSA

Outline

• VO, Grid and OGSA
• Build the catalog data access service
• Build the image mosaic service
• Faced difficulties

Page 14: China-VO Data Access Service  Based on OGSA

The Classification of Astronomical Data Service

• Astronomical Catalog Service
• Image Mosaic Service
• Spectrum Data Service
• Simulation Data Service

Page 15: China-VO Data Access Service  Based on OGSA

Existing Astronomical Datasets we have

Class     DataSet Name          Data Amount (zipped)
Catalog   CDS/ADC Catalogs      About 30 GB
          Other Catalogs        About 120 GB
Survey    RealSky               5 GB
          ROSAT X-ray Survey    10 GB
          BATC                  360 GB
          DSS I                 60 GB
          DSS II                About 620 GB
          SDSS EDR              30 GB
          SDSS DR1 (part)       65 GB
          2dF 2003 / 2QZ        7 GB
Archive   ROSAT X-ray Point     28 GB
          Einstein X-ray Data   5 GB
Library   ADS                   350 GB
Total                           >1700 GB

Page 16: China-VO Data Access Service  Based on OGSA

Build Catalog Data Service

How do we federate the catalog data into the VO? That is, how do we build a Data Service using the existing databases and programs?

Page 17: China-VO Data Access Service  Based on OGSA

Define Catalog Service Interface

Some standards we used:

• Input Query Language: SQL (now), ADQL (planned)
• Output Data Format: VOTable 1.0
• Catalog resource metadata registry protocol: VOResource 0.9

Input: ADQL query statement

Output: result in VOTable format

This keeps the service interface/API simple.
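The output side of this interface can be illustrated with a minimal sketch. It assumes the result is already available as field descriptions plus rows, and it emits only the core VOTable nesting (VOTABLE, RESOURCE, TABLE, FIELD, DATA, TABLEDATA, TR, TD). The function name `rows_to_votable` and the sample fields are hypothetical, and a real VOTable 1.0 document carries more metadata than shown here.

```python
import xml.etree.ElementTree as ET


def rows_to_votable(fields, rows):
    """Serialize tabular results as a minimal VOTable-style XML document.

    fields: list of (name, datatype) pairs, e.g. ("ra", "double")
    rows:   list of tuples, one value per field
    """
    votable = ET.Element("VOTABLE", version="1.0")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE")
    # One FIELD element describes each column of the result.
    for name, datatype in fields:
        ET.SubElement(table, "FIELD", name=name, datatype=datatype)
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    # One TR per row, one TD per cell.
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")


# Hypothetical two-row result with two columns.
xml_text = rows_to_votable(
    [("ra", "double"), ("dec", "double")],
    [(10.5, -3.25), (11.0, 4.0)],
)
```

Because every service returns the same container format, a client needs only one parser regardless of which catalog answered the query.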

Page 18: China-VO Data Access Service  Based on OGSA

How to use existing databases and programs to create a catalog data service

How can we create a catalog data service that understands ADQL and generates results in VOTable format?

We adopt two approaches:

• Reconstruct the existing catalog DBMS
• Encapsulate a search program, like pmm

The CDS has offered search programs for big catalogs like USNO A2.0…

Page 19: China-VO Data Access Service  Based on OGSA

Catalog data service based on DB

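[Diagram: an ADQL query enters through the GT3 Interface and is converted by the ADQL/SQL Translator into SQL, which is sent over JDBC to the DBMS holding the catalog and its metadata. The ResultSet comes back through the VOTable Wrapper and is returned as a VOTable.]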
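The translator and DBMS stages of this pipeline can be sketched as follows. This is an illustration under stated assumptions, not the service's actual code: Python's `sqlite3` stands in for JDBC and the production DBMS, `translate` is a placeholder that assumes the incoming query is already plain SQL, and the table `toy_catalog` with its columns is invented for the example.

```python
import sqlite3


def translate(query):
    """Placeholder translator: the query is assumed to be plain SQL already."""
    return query


def run_catalog_query(conn, query):
    """Translate a query, run it on the DBMS, and return (column names, rows)."""
    cursor = conn.execute(translate(query))
    columns = [col[0] for col in cursor.description]
    return columns, cursor.fetchall()


# Hypothetical toy catalog standing in for a real catalog database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy_catalog (id INTEGER, ra_deg REAL, dec_deg REAL)")
conn.executemany(
    "INSERT INTO toy_catalog VALUES (?, ?, ?)",
    [(1, 10.0, 20.0), (2, 10.5, 20.5), (3, 200.0, -5.0)],
)

columns, rows = run_catalog_query(
    conn,
    "SELECT id, ra_deg, dec_deg FROM toy_catalog "
    "WHERE ra_deg BETWEEN 9 AND 11 ORDER BY id",
)
```

The column names and rows returned here are what a wrapper stage would then serialize into the output format.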

Page 20: China-VO Data Access Service  Based on OGSA

Advantage and disadvantage

• Can make full use of the SQL language and implement complex queries.

• DBMSs offer the most powerful functions for data management and maintenance.

• Reconstructing the databases takes a lot of work.
• For big catalogs, like USNO B1.0 and 2MASS PSC, query efficiency is low.

Page 21: China-VO Data Access Service  Based on OGSA

[Entity-relationship diagram of the catalog metadata database. Marks (<pi>, <M>) and type codes are given as they appear in the diagram.]

Relationships: catalog_table, value_option, table_files, table_coordinate, table_field, field_value, UCD_field, field_link, catalog_metadata, catalog_acronym (labelled "has" or "belong")

catalog (<pi>, <M>; Identifier_1 <pi>)
  catalog_id I, path_name VA80, obsolute_by I, title VA200, short_name VA250, identifier VA250

Coordinate (<pi>, <M>; Identifier_1 <pi>)
  coord_id I, RA_id I, Dec_id I, epoch VA20, system VA10, equinox VA20, epoch_RA I, epoch_Dec I

Table (<M>)
  table_id I, table_name VA200, property VA40, description TXT2000

Field_link (<pi>, <M>; Identifier_1 <pi>)
  link_id I, content_role VA40, content_type VA40, title VA200, value VA80, href VA250, gref VA250, action VA250

Field (<pi>, <M>; Identifier_1 <pi>)
  field_id I, column_name VA80, original_position SI, column_type VA80, option_name VA80, datatype VA80, UCD VA80, unit VA80, type VA250, width I, arraysize VA80, precision VA20, description TXT2000, ref VA80

Option (<pi>, <M>; Identifier_1 <pi>)
  option_id I, option_name VA80, value VA80

Mysql_files (<M>; Identifier_1 <pi>)
  file_id I, option_name VA80, db_name VA80, max_Dec LF, min_Dec LF, max_RA LF, min_RA LF

Value (<pi>, <M>; Identifier_1 <pi>)
  value_id I, values_null VA80, values_type VA10, invalid BL, min_value VA80, min_inclusive BL, max_value VA80, max_inclusive BL

metadata
  publisher VA250, publisherID VA250, creator VA250, creater_logo VA250, contributor VA250, version VA80, date D, ref_URL LVA250, contact_name VA250, contact_Email VA250, subject VA250, description TXT2000, source VA40, type VA250, content_level VA40, relationship VA40, relationshipID VA250, facility VA250, instrument VA250, cov_spatial LVA2000, cov_region LA2000, cov_spec VA40, cov_spec_band VA250, cov_spec_min LF, cov_spec_max LF, cov_tem_start D, cov_tem_stop D, cov_depth VA80, cov_obj_den VA80, cov_obj_count LI, cov_sky_frac LF, res_spatial LF, res_spec LF, res_temp LF, UCD VA80, rights VA40, uncer_spatial LF, data_quality LF, uncer_photo LF, uncer_spec LF, uncer_temp LF

UCD
  option_name VA80, description TXT2000

Acronym (<pi>, <M>; Identifier_1 <pi>)
  name VA80, acronym I

Page 22: China-VO Data Access Service  Based on OGSA

Data service based on search program

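[Diagram: an ADQL query enters through the GT3 Interface; the ADQL Translator turns it into parameters for the search program, which is called via JNI or a stream and reads the Data Files. The program's output passes through the VOTable Wrapper and is returned as a VOTable.]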
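A minimal sketch of the encapsulation idea: pass positional parameters to an external program and parse the text it streams back. Everything here is an assumption made for illustration. The stand-in program is a few lines of Python run through the current interpreter rather than a real catalog search tool such as pmm, its "id ra dec" output format is invented, and it filters a fixed list with a simple box test instead of a true cone search.

```python
import subprocess
import sys

# Stand-in for an external positional search program (hypothetical): it prints
# one whitespace-separated "id ra dec" line per matching source.
FAKE_SEARCH_PROGRAM = (
    "import sys\n"
    "ra, dec, radius = map(float, sys.argv[1:4])\n"
    "for i, (r, d) in enumerate([(10.0, 20.0), (10.01, 20.01), (50.0, 5.0)]):\n"
    "    if abs(r - ra) <= radius and abs(d - dec) <= radius:\n"
    "        print(i, r, d)\n"
)


def positional_search(ra, dec, radius):
    """Run the search program with positional parameters and parse its output stream."""
    command = [sys.executable, "-c", FAKE_SEARCH_PROGRAM,
               str(ra), str(dec), str(radius)]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    rows = []
    for line in result.stdout.splitlines():
        source_id, src_ra, src_dec = line.split()
        rows.append((int(source_id), float(src_ra), float(src_dec)))
    return rows


matches = positional_search(10.0, 20.0, 0.1)
```

The wrapper can only expose what the program itself supports, which is why a service built this way is limited to the program's search functions.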

Page 23: China-VO Data Access Service  Based on OGSA

Advantage and disadvantage

• Positional search is faster than with a DBMS.

• The service can offer only the search functions that the program offers. Many programs offer only positional search, with no statistical functions.

Page 24: China-VO Data Access Service  Based on OGSA

Catalog Access Service Provided by us

Band      Name        Num of objects   Amount
X-ray     RASS-BSC    18806            0.03 GB
          RASS-FSC    105924           0.10 GB
optical   USNO B1.0   1045913669       38 GB
          USNO A2.0   526280881        7 GB
          GSC 2.2.1   455851237        40 GB
          GSC 1.2     25241730         1.4 GB
          UCAC 1      27425433         >0.5 GB
          UCAC 2      48330571         4.5 GB
          Tycho2      2539913          0.5 GB
          Hipparcos   118218           0.05 GB
infrared  2MASS PSC   470992970        127 GB
          2MASS ESC   1647599          3 GB
radio     NVSS        1773484          0.44 GB
          FIRST       811117           0.1 GB
Total     About 110 catalogs           About 220 GB

Page 25: China-VO Data Access Service  Based on OGSA

How to call a Catalog data service

[Diagram: Grid Client, Resource Registry, Data Service Factory, Data Service Instance, Database. The factory is registered in the registry and creates the data service instance, which accesses the database.]

1. Find Factory
2. Factory GSH
3. Create data service
4. Data service GSH
5. Data request (ADQL)
6. Result (VOTable)
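The six steps can be mimicked with a small in-memory sketch. It is not GT3 code: the classes `Registry`, `DataServiceFactory` and `DataServiceInstance` are hypothetical stand-ins, a plain string plays the role of a GSH, a Python predicate stands in for an ADQL query, and a list of dictionaries stands in for a VOTable.

```python
class DataServiceInstance:
    """Transient service instance that answers data requests (steps 5 and 6)."""

    def __init__(self, rows):
        self._rows = rows

    def query(self, predicate):
        # The predicate stands in for an ADQL query, the list for a VOTable.
        return [row for row in self._rows if predicate(row)]


class DataServiceFactory:
    """Creates service instances and hands back a handle (steps 3 and 4)."""

    def __init__(self, rows):
        self._rows = rows
        self._instances = {}

    def create_service(self):
        handle = f"instance-{len(self._instances) + 1}"
        self._instances[handle] = DataServiceInstance(self._rows)
        return handle

    def resolve(self, handle):
        return self._instances[handle]


class Registry:
    """Lets a client find a registered factory by name (steps 1 and 2)."""

    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        self._factories[name] = factory

    def find_factory(self, name):
        return self._factories[name]


registry = Registry()
registry.register("toy-catalog", DataServiceFactory(
    [{"name": "A", "mag": 12.1}, {"name": "B", "mag": 15.3}]))

factory = registry.find_factory("toy-catalog")          # steps 1-2
handle = factory.create_service()                       # steps 3-4
service = factory.resolve(handle)
result = service.query(lambda row: row["mag"] < 13)     # steps 5-6
```

The client only ever names the catalog it wants; where the factory and the instance live is resolved through the registry and the returned handle.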

Page 26: China-VO Data Access Service  Based on OGSA

Use Data Service to build www service for end user

[Diagram: End users work in a Web client and reach a Web server over HTTP. The Web server acts as a Grid client and uses the services and resources registers to locate data services (backed by MySQL, Oracle 9i, and files) as well as data mining, data processing, and data visualization services.]

End users do not need to know where the data services are.

Page 27: China-VO Data Access Service  Based on OGSA

Use data service to create other service

Our next task is to build a multi-wavelength cross-identification (MWCI) service based on the catalog data service.

What is multi-wavelength cross-identification?

By cross-identifying datasets on positional coincidence, we can understand objects through their properties at different wavelengths.

Page 28: China-VO Data Access Service  Based on OGSA

The steps of multi-wavelength cross-identification

• Cross-identify datasets from different wavelengths within an error radius.

• Divide the result of cross-identification into three situations: one-to-one, one-to-two, one-to-many.

• Choose the one-to-one entries for data mining.
• The other two situations need statistical analysis to determine which source is the true counterpart.
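The first two steps above can be sketched as a brute-force positional match. The assumptions are loud ones: source lists are small in-memory tuples of (name, RA, Dec) in degrees, a single error radius applies to every source, "one-to-many" is taken to mean three or more candidates, and the names `angular_separation` and `cross_identify` are invented for the example. A production service would need spatial indexing rather than comparing every pair.

```python
import math


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def cross_identify(primary, secondary, radius_deg):
    """Group each primary source with the secondary sources inside the error radius."""
    groups = {"one-to-one": [], "one-to-two": [], "one-to-many": []}
    for name, ra, dec in primary:
        found = [other for other, ra2, dec2 in secondary
                 if angular_separation(ra, dec, ra2, dec2) <= radius_deg]
        if len(found) == 1:
            groups["one-to-one"].append((name, found))
        elif len(found) == 2:
            groups["one-to-two"].append((name, found))
        elif len(found) > 2:
            groups["one-to-many"].append((name, found))
    return groups


# Hypothetical source lists: (name, RA in degrees, Dec in degrees).
radio = [("R1", 10.0, 20.0), ("R2", 30.0, -5.0), ("R3", 100.0, 0.0)]
infrared = [("I1", 10.0005, 20.0), ("I2", 30.0, -5.0005),
            ("I3", 30.0005, -5.0), ("I4", 250.0, 60.0)]

groups = cross_identify(radio, infrared, radius_deg=0.002)
```

Here R1 has a single candidate and would go straight to data mining, R2 has two candidates and would need the statistical analysis mentioned above, and R3 has no counterpart within the radius.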

Page 29: China-VO Data Access Service  Based on OGSA

Requirements

• Locate the datasets that users want to use (dataset discovery).
• Cross-match the datasets in heterogeneous DBMSs at different locations effectively and efficiently.

• Find a storage resource to store the results.

Page 30: China-VO Data Access Service  Based on OGSA

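[Diagram: MWCI architecture. A User Application works with a Registry, an MWCI Service Provider (MWCIFactory, MWCI instance), data services for NVSS, 2MASS, and other catalogs, and a Storage Service Provider (storageFactory, storage instance). Numbered arrows 1 to 7 mark the order of the interactions.]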

Page 31: China-VO Data Access Service  Based on OGSA

Outline

• VO, Grid and OGSA
• Build the catalog data access service
• Build the image mosaic service
• Faced technical difficulties

Page 32: China-VO Data Access Service  Based on OGSA

Build The Image Mosaic Service

• Use DSS-I sky images to build our first image mosaic service.

Page 33: China-VO Data Access Service  Based on OGSA

Definition of the service interface

• Input parameters: 1. RA, 2. Dec, 3. image height, 4. image width
• Transport protocol: GridFTP
• Output data format: FITS

Page 34: China-VO Data Access Service  Based on OGSA

Realization of DSS-I image mosaic service

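[Diagram: request parameters enter through the GT3 Interface, which calls the GetImage program via JNI. GetImage reads the DSS-I image files and produces a FITS file, which is delivered over GridFTP.]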

Page 35: China-VO Data Access Service  Based on OGSA

Outline

• VO, Grid and OGSA
• Build the catalog data access service
• Build the image mosaic service
• Faced technical difficulties

Page 36: China-VO Data Access Service  Based on OGSA

Technical Difficulties

• Service/resource registry and discovery
• ADQL2SQL translator
• Protocol shortcomings

Page 37: China-VO Data Access Service  Based on OGSA

Protocol shortcomings

• Shortcomings of the VOTable 1.0 protocol

1. How should the result of a join query be encapsulated?

2. A standard for encapsulating spectrum data

3. The definition of the FIELD element is neither strict nor complete.

• Shortcomings of UCD

1. It can’t express concrete meaning: for example “ERROR”, but error for what?

2. It is incomplete: for example, HTMID has no UCD.

• Lack of a standard for units

Page 38: China-VO Data Access Service  Based on OGSA

Q & A

Thank You

Page 40: China-VO Data Access Service  Based on OGSA

The Steps of Calling a Data Service

Page 41: China-VO Data Access Service  Based on OGSA

Transparencies for Astro Data Access

• Heterogeneity Transparency
• Name Transparency
• Distribution Transparency

Page 42: China-VO Data Access Service  Based on OGSA

What is Grid Service?

Page 43: China-VO Data Access Service  Based on OGSA

What Is The Data Grid

• DataGrid : A dynamic logical namespace that enables coordinated sharing of heterogeneous distributed storage resources and digital entities based on local and global policies across administrative domains in a virtual enterprise.

• DataGrid

– Logical name space for location independent identifiers

– Abstractions for storage repositories, information repositories, and access APIs

– Latency management

Page 44: China-VO Data Access Service  Based on OGSA

Using a Data Grid – in Abstract

[Diagram: a user sends "Ask for data" to the Data Grid, which answers with "Data delivered".]

• User asks for data from the data grid
• The data is found and returned
• Where & how details are managed by data grid

Page 45: China-VO Data Access Service  Based on OGSA
