Page 1: Metadata and Registries:  Describing and Finding VO Resources

17 Nov 2003, Australia VO - ATNF

Metadata and Registries: Describing and Finding VO Resources

R. Hanisch (1), R. Plante (2), G. Greene (1), A.E. Linde (3), T. McGlynn (4), W. O’Mullane (5), A.M.S. Richards (6), R. Williams (7), R. Williamson (2), E. C. Auden (8), K. T. Noddle (3)

1) Space Telescope Science Institute
2) National Center for Supercomputing Applications
3) University of Leicester
4) NASA Goddard Space Flight Center
5) The Johns Hopkins University
6) Jodrell Bank Observatory
7) California Institute of Technology
8) Mullard Space Science Laboratory

THE US NATIONAL VIRTUAL OBSERVATORY

Page 2: Metadata and Registries:  Describing and Finding VO Resources

Resource Metadata

• A resource is any VO entity that can be described and given a name and unique identifier
  – Data collection (archive)
  – Catalog or collection of catalogs
  – Organization
  – Software packages
  – Bandpass filter functions
  – Services

• Services are VO resources that can be invoked by a user or software agent to perform some action on their behalf

• Metadata describes VO resources.  This metadata generally includes information the user or a computer program needs to determine if a resource is of interest and how a service is invoked.

Page 3: Metadata and Registries:  Describing and Finding VO Resources

Resource Metadata

• Resource metadata is described by
  – A prose document that defines concepts independent of an encoding scheme
  – XML Schemas that encode metadata and metadata relationships

• Draws on Dublin Core metadata
  – An interdisciplinary standard for core resource metadata: http://dublincore.org

• Can be categorized
  – Identity
  – Curation
  – General content
  – Collection/service content
  – Data quality
  – Service invocation

Page 4: Metadata and Registries:  Describing and Finding VO Resources

Resource Metadata

Page 5: Metadata and Registries:  Describing and Finding VO Resources

Resource Metadata Example

Required keywords shown in red

Identity metadata
Title: Sloan Digital Sky Survey
ShortName: SDSS
Identifier: ivo://stsci.edu/mast/sdss

Curation metadata
Publisher: Space Telescope Science Institute/MAST
PublisherID: ivo://stsci.edu/mast
Creator: Sloan Digital Sky Survey Consortium
Creator.Logo: http://archive.stsci.edu/images/sdss_logo.gif
Contributor: Sloan Digital Sky Survey Consortium
Date: 2001-06-15
Version: SDSS EDR
ReferenceURL: http://archive.stsci.edu/sdss/index.html
Contact.Name: Archive Branch, Space Telescope Science Institute
Contact.Address: 3700 San Martin Drive, Baltimore, MD 21218 USA
Contact.Email: [email protected] +1-410-338-4547

General content metadata
Subject: galaxies, quasars, stars, CCD photometry, spectroscopy, redshift, sky surveys
Description: The Sloan Digital Sky Survey is using a dedicated 2.5-m telescope and a large format CCD camera to obtain images of over 10,000 square degrees of high Galactic latitude sky in five broad bands (u', g', r', i' and z', centered at 3540, 4770, 6230, 7630, and 9130 Å, respectively)…
Source: 2002AJ....123..485S
Type: Survey, Catalog, EPOResource
ContentLevel: Research
Relationship: mirror-of
RelationshipID: ivo://sdss.org/sdss/edr

Collection and service content metadata
Facility: Apache Point Observatory, Sloan 2.5-m Telescope
Instrument: Five-band clocked CCD camera
Coverage.Spatial: polygon (FK5, 145.17, 1.25, 235.9, 1.25, 235.9, -1.25, 145.17, 1.25) or polygon (FK5, 250.71, 66.29, 267.0, 66.29, 267.0, 52.15, 250.71, 66.29) or polygon (FK5, 350.43, 1.17, 360.0, 1.17, 360.0, -1.25, 350.43, -1.25) or polygon (FK5, 0.0, 1.17, 56.37, 1.17, 56.37, -1.25, 0.0, -1.25)
Coverage.RegionOfRegard: 0.0001
Coverage.Spectral: Optical
Coverage.Spectral.Bandpass: u’, g’, r’, i’, z’
Coverage.Spectral.MinimumWavelength: 400.e-9
Coverage.Spectral.MaximumWavelength: 850.e-9
Coverage.Temporal.StartTime: 1999-12-25
Coverage.Temporal.StopTime: 2001-07-15
Coverage.Depth: 3.e-6
Coverage.ObjectDensity: 6.e4
Coverage.ObjectCount: 2.e7
Coverage.SkyFraction: 0.01
Resolution.Spatial: 0.00028
Resolution.Spectral: 5000
Resolution.Temporal: 120
UCD: Not Provided
Format: text/xml
Rights: Public

Data quality metadata
DataQuality: A
Uncertainty.Photometric: 3.e-7
Uncertainty.Spatial: 0.00003
Uncertainty.Spectral: 1.e-11
Uncertainty.Temporal: 0.1
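The Identifier and PublisherID values in the example share the ivo:// form. As a minimal sketch, assuming the text after ivo:// can be read as a naming authority followed by a resource path (the slide itself does not spell this out, and the function name is made up for illustration), an identifier could be taken apart like this:

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_ivo_identifier(identifier: str) -> Tuple[str, str]:
    """Split an ivo:// identifier into (authority, resource path).

    Assumption: the part before the first slash names the authority and
    the remainder names the resource within that authority.
    """
    parts = urlsplit(identifier)
    if parts.scheme != "ivo" or not parts.netloc:
        raise ValueError(f"not an ivo:// identifier: {identifier!r}")
    return parts.netloc, parts.path.lstrip("/")


authority, resource_path = split_ivo_identifier("ivo://stsci.edu/mast/sdss")
print(authority, resource_path)
```

Under that reading, the SDSS record and its publisher (ivo://stsci.edu/mast) fall under the same authority, stsci.edu.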

Page 6: Metadata and Registries:  Describing and Finding VO Resources

Resource Metadata Example

Service metadata
Service.InterfaceURL: http://archive.stsci.edu/cgi-bin/sdss/catalog.html
Service.BaseURL: http://archive.stsci.edu/cgi-bin/sdss/catalog
Service.HTTPResults: text/xml
Service.StandardID: ivo://ivoa.net/Services/ConeSearch
Service.StandardURL: ivo://www.ivoa.net/Documents/REC/ConeSearch.html
Service.MaxSearchRadius: 0.2
Service.MaxReturnRecords: 5000
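The service metadata is what lets a program invoke the service without human help. The sketch below shows how a client might turn Service.BaseURL and Service.MaxSearchRadius into a Cone Search request. The RA, DEC and SR query parameters are taken from the Cone Search convention as commonly described, not from this slide, and treating MaxSearchRadius as degrees is an assumption; the function name is hypothetical.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg, max_radius_deg=None):
    """Build a Cone Search style GET URL from registry metadata.

    Assumptions: positions and radius are decimal degrees, and the
    registered maximum search radius uses the same unit.
    """
    if max_radius_deg is not None:
        # Respect the limit advertised in the resource description.
        radius_deg = min(radius_deg, max_radius_deg)
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{query}"


url = cone_search_url(
    "http://archive.stsci.edu/cgi-bin/sdss/catalog",  # Service.BaseURL
    ra_deg=180.0,
    dec_deg=0.5,
    radius_deg=0.5,
    max_radius_deg=0.2,  # Service.MaxSearchRadius
)
print(url)
```

The requested radius of 0.5 is clamped to the registered 0.2 before the URL is built, so the client never asks for more than the service says it will accept.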

Page 7: Metadata and Registries:  Describing and Finding VO Resources

Resource Metadata: XML Schema

• Classes of Resources: Organization, DataCollection, Service, Registry
  – Specific classes inherit from generic <Resource>

• Organized into separate schemas:
  – Core resource metadata: VOResource
  – Various extension schemas containing specific types

• Capable of describing…
  – Data centers, research organizations, missions, observatories
  – Data collections, archives
  – VO standard services: Cone Search, Simple Image Access
  – Existing Browser/CGI-based services
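The inheritance idea can be pictured outside XML as well. The following is only an analogy in Python, not the VOResource schema: the class names follow the slide, while every field name is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Generic resource: metadata every class shares (field names assumed)."""
    title: str
    identifier: str
    short_name: str = ""
    subjects: List[str] = field(default_factory=list)


@dataclass
class Organization(Resource):
    """A data center, project, or observatory."""


@dataclass
class DataCollection(Resource):
    """An archive or catalog; adds collection-level metadata."""
    facility: str = ""


@dataclass
class Service(Resource):
    """Something that can be invoked; adds invocation metadata."""
    base_url: str = ""


@dataclass
class Registry(Resource):
    """A registry is itself a describable resource."""


sdss = DataCollection(
    title="Sloan Digital Sky Survey",
    identifier="ivo://stsci.edu/mast/sdss",
    short_name="SDSS",
    facility="Apache Point Observatory, Sloan 2.5-m Telescope",
)
print(isinstance(sdss, Resource))
```

Code written against the generic Resource (for example, a listing of titles and identifiers) keeps working when a new specific class is added, which is the point of putting extensions in separate schemas.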

Page 8: Metadata and Registries:  Describing and Finding VO Resources

The Role of Resource Registries

• Used to discover and locate resources—data and services—that can be used in a VO application

• Registry: a list of resource descriptions
  – Expressed as structured metadata to enable automated processing and searching

• Registries are themselves VO Resources

Page 9: Metadata and Registries:  Describing and Finding VO Resources

Registry Requirements

• Allow user to select resources that are likely to pertain to a scientific question

• Select resources based on characteristics…
  – Type of resource: catalogs, image archives, EPO, services
  – Coverage in space, time, and frequency
  – Where data comes from, who curates it

• Dynamic: resources will come and go

• Distributed: Should not depend on a single point of failure or single view of the VO.

• Preserve the data providers’ control over their data
  – Curators control what gets registered, content, updates
  – Allow integration with existing resource management

• Allow extension to new types of resources
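Selection by characteristics amounts to filtering structured records. The sketch below assumes a toy in-memory list of resource descriptions; the dictionary keys and the second record are hypothetical and exist only to show the filter rejecting something.

```python
def select_resources(resources, resource_type=None, wavelength_m=None):
    """Return the resources matching the given type and spectral coverage.

    Each criterion is optional; a resource must satisfy every criterion
    that is supplied.
    """
    selected = []
    for res in resources:
        if resource_type is not None and resource_type not in res["types"]:
            continue
        if wavelength_m is not None:
            if not (res["min_wavelength_m"] <= wavelength_m <= res["max_wavelength_m"]):
                continue
        selected.append(res)
    return selected


resources = [
    {   # values taken from the SDSS example record
        "identifier": "ivo://stsci.edu/mast/sdss",
        "types": ["Survey", "Catalog"],
        "min_wavelength_m": 400e-9,
        "max_wavelength_m": 850e-9,
    },
    {   # hypothetical radio catalog, for contrast only
        "identifier": "ivo://example.org/hypothetical-radio-catalog",
        "types": ["Catalog"],
        "min_wavelength_m": 0.03,
        "max_wavelength_m": 0.3,
    },
]

optical_catalogs = select_resources(resources, resource_type="Catalog", wavelength_m=550e-9)
print([res["identifier"] for res in optical_catalogs])
```

A real registry would also need spatial and temporal coverage tests and curation filters; those are left out to keep the sketch short.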

Page 10: Metadata and Registries:  Describing and Finding VO Resources

IVOA Registry Working Group (RWG)

• Common approach to registries

• Work packages
  – Science requirements and use cases
  – Resource metadata
  – Registry interfaces
  – Prototyping

• Distributed model for registries

Page 11: Metadata and Registries:  Describing and Finding VO Resources

Registry Model

[Diagram labels: Local Publishing Registry (×2), Local Searchable Registry, Full Searchable Registry (×2); Data Centers, VO Projects, Specialized Portals & Services]

Page 12: Metadata and Registries:  Describing and Finding VO Resources

Registry Model

[Diagram labels: Local Publishing Registry (×2), Local Searchable Registry, Full Searchable Registry (×2); Data Centers, VO Projects, Specialized Portals & Services; arrows: harvest (pull)]

Page 13: Metadata and Registries:  Describing and Finding VO Resources

Registry Model

[Diagram labels: Local Publishing Registry (×2), Local Searchable Registry, Full Searchable Registry (×2); Data Centers, VO Projects, Specialized Portals & Services; arrows: harvest (pull), replicate]

Page 14: Metadata and Registries:  Describing and Finding VO Resources

Registry Model

[Diagram labels: Local Publishing Registry (×2), Local Searchable Registry, Full Searchable Registry (×2); Data Centers, VO Projects, Specialized Portals & Services; arrows: harvest (pull), replicate, selective harvesting]

Page 15: Metadata and Registries:  Describing and Finding VO Resources

Registry Model

[Diagram labels: Local Publishing Registry (×2), Local Searchable Registry, Full Searchable Registry (×2); Data Centers, VO Projects, Specialized Portals & Services; Client Applications; arrows: search queries]

Page 16: Metadata and Registries:  Describing and Finding VO Resources

Registry Model

[Diagram labels: Local Publishing Registry (×2), Local Searchable Registry, Full Searchable Registry (×2); Data Centers, VO Projects, Specialized Portals & Services; Client Applications; arrows: search queries]

Page 17: Metadata and Registries:  Describing and Finding VO Resources

NVO Prototype Registry

• To support a Data Inventory Service (DIS): What is known about a position in the sky?
  – Use a registry to locate and query standard services:
    • Cone Search Services: querying catalogs
    • Simple Image Access Services: querying image archives and cutout services

• Components
  – Publishing Registries
  – Searchable Registry
  – Resource Metadata
  – Harvesting Protocol
  – Populated with service descriptions

Page 18: Metadata and Registries:  Describing and Finding VO Resources

Publishing Registries: getting information into registries

• Two publishing registries established at Caltech and NCSA.

• Motivation:
  – Register Simple Image Access Services
  – Develop techniques for easy registration

• Resource descriptions stored as XML documents using VOResource schema

Page 19: Metadata and Registries:  Describing and Finding VO Resources

Harvesting Interface

• Adopted Open Archives Initiative (OAI) Protocol for Metadata Harvesting
  – HTTP/CGI-based protocol for exposing metadata to harvesters (e.g. searchable registries)

• Advantages:
  – Existing, field-tested design we didn’t have to re-invent
  – Fairly easy to implement
  – Existing tools for emitting and harvesting metadata
  – Exposes our metadata to larger digital library community
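Part of why the OAI protocol is "fairly easy to implement" is that a harvest request is an ordinary HTTP GET with a verb and a few arguments. The sketch below builds such request URLs. The verb names and the metadataPrefix and from arguments come from the OAI-PMH specification; the base URL is a placeholder, and oai_dc is used only because it is the prefix every OAI repository must support (the prefix the NVO registries used for VOResource records is not given in the slides).

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH GET request URL.

    Keyword names ending in an underscore (such as from_) have it
    stripped, because "from" is a reserved word in Python.
    """
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for name, value in arguments.items():
        params[name.rstrip("_")] = value
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; ask only for records changed since a given date.
url = oai_request_url(
    "http://registry.example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
    from_="2003-11-01",
)
print(url)
```

The from argument is what makes the "harvest (pull)" model incremental: a searchable registry can re-harvest only what a publishing registry has changed since its last visit.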

Page 20: Metadata and Registries:  Describing and Finding VO Resources

Models for Registering Resources

• Curator uses another site’s registry
  – Good for a few resources whose descriptions are fairly static
  – e.g. @NCSA: http://nvo.ncsa.uiuc.edu/nvoregistration.html

• VORegistry-in-a-box:
  – Deployable package that allows a data provider to run own registry “out of the box”: http://nvo.ncsa.uiuc.edu/VO/software
  – Good for larger number of resources that might be updated often

• Curator builds own OAI interface
  – Good for very large number of resources
  – Automate XML generation using site’s existing information management tools

Page 21: Metadata and Registries:  Describing and Finding VO Resources

Searchable Registry

• Searchable Registry was set up at JHU/STScI: http://skyserver.pha.jhu.edu/devel/registry

• OAI harvester collects resource descriptions
  – from Publishing Registries at Caltech & NCSA
  – Loads data into relational database

• SOAP Web Service interface: http://skyserver.pha.jhu.edu/devel/registry/registry.asmx
  – Searching
    • Currently provides specialized querying useful for DIS
  – Re-harvest request
    • To get updated records from publishing registries
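The load-then-search pattern can be sketched with an in-memory SQLite database. This is not the JHU/STScI implementation: the table layout, column names, and the second record are assumptions made for illustration, and SQLite stands in for whatever relational database the prototype used.

```python
import sqlite3


def load_records(conn, records):
    """Store harvested resource descriptions, replacing older copies.

    INSERT OR REPLACE means a re-harvest simply overwrites a record
    that carries the same identifier.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resource ("
        "identifier TEXT PRIMARY KEY, title TEXT, standard_id TEXT, base_url TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO resource "
        "VALUES (:identifier, :title, :standard_id, :base_url)",
        records,
    )
    conn.commit()


def find_services(conn, standard_id):
    """Return (identifier, base_url) for services of one standard type."""
    rows = conn.execute(
        "SELECT identifier, base_url FROM resource "
        "WHERE standard_id = ? ORDER BY identifier",
        (standard_id,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
load_records(conn, [
    {   # values from the SDSS example record
        "identifier": "ivo://stsci.edu/mast/sdss",
        "title": "Sloan Digital Sky Survey",
        "standard_id": "ivo://ivoa.net/Services/ConeSearch",
        "base_url": "http://archive.stsci.edu/cgi-bin/sdss/catalog",
    },
    {   # hypothetical image service, for contrast only
        "identifier": "ivo://example.org/hypothetical-images",
        "title": "Hypothetical image archive",
        "standard_id": "ivo://example.org/hypothetical/ImageAccess",
        "base_url": "http://images.example.org/sia",
    },
])
print(find_services(conn, "ivo://ivoa.net/Services/ConeSearch"))
```

A query of this kind is what a Data Inventory Service needs from the registry: given a standard service type, return the base URLs to call.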

Page 22: Metadata and Registries:  Describing and Finding VO Resources

Registry Model

[Diagram labels: Local Publishing Registry (Caltech), Local Publishing Registry (NCSA), Full Searchable Registry (JHU/STScI), Data Inventory Service (DIS); arrows: harvest (pull), search for services]

Page 23: Metadata and Registries:  Describing and Finding VO Resources

Registry Model

[Diagram labels: Data Providers with Cone Search Services (×3) and Simple Image Access services (×3); Local Publishing Registry (Caltech), Local Publishing Registry (NCSA), Full Searchable Registry (JHU/STScI), Data Inventory Service (DIS); arrows: harvest (pull), search for services]

Page 24: Metadata and Registries:  Describing and Finding VO Resources

Summary

• We built a working prototype registry system to support an end-user VO service
  – Distributed Publishing and Searchable components
  – Encoded descriptions using emerging VO XML standard schemas
  – OAI Harvesting Standard deployed easily
  – Used to discover Cone Search and SIA services

• What’s next: Interoperable registries IVOA-wide
  – Implement newly agreed-upon Resource Metadata standard and VOResource XML schema
  – Demonstrate harvesting and replication
  – Populate registries with broad base of VO resources
  – Standardize registry query interfaces