Unidata THREDDS*: Integrating Environmental Data into Digital Libraries
*THematic Real-time Environmental Distributed Data Services
Sponsored by the National Science Foundation http://www.nsf.gov
Ben Domenico, November 2003
Topics
• Traditional Unidata Approach
– Mainly meteorological data
– Subscription system pushes data to user sites
– Unidata Program Center provides data analysis tools for use on data at user sites
• THREDDS Enhancements
– Broader menu of Earth system data
– Local client access from remote servers
– Less arcane, more general and accessible tools
– Integration of data and analysis tools into educational modules and digital libraries
Unidata Community Today
• More than 160 institutions
– Includes over 100 academic departments plus government agencies and private sector research groups
– Does not count separate installations, e.g. Spanish weather service IDD, US Weather Service radar data system
• Interdisciplinary from the outset: 1996 survey showed over 2/3 of institutions had some uses outside meteorology (oceanography, hydrology, climatology, civil engineering, environmental science…)
Community Impact Survey
• Over 21,000 college students per year use Unidata tools and data in classrooms and labs
• Nearly 4,000 women/minority students
• More than 1,800 faculty and research staff
• Over 55,000 K-12 students involved through Unidata-connected university programs
• Informal education: in excess of 1 million hits per day at Unidata-based university web sites
• 97% of community report being satisfied or very satisfied
Principal Activities of the Unidata Program Center
• Facilitating Data Access to a broad spectrum of observations & forecasts (in near real time)
• Providing Tools to visualize, analyze, organize, receive, & share data at university sites
• Supporting Faculty who use Unidata systems at colleges & universities (most in the U.S.)
• Building and Advocating for a Community where data, tools, & best practices in education/research are shared
Traditional Unidata Data Types
• Individual observations from weather stations around the globe
• Satellite imagery
• Radar data from 160 NEXRAD radars
• Output from weather forecast model runs at the National Centers for Environmental Prediction
• Lightning strike data
• Measurements from sensors on commercial aircraft
IDD: The Community in Action
• The Internet-based system by which universities acquire huge quantities of weather data in near-real time (i.e. ASAP) typifies Unidata’s community orientation.
• The system has no data center -- all tasks are performed on the participants’ own (small) computers.
• Currently the most used “advanced application” on the Abilene network (2-3% in terms of packets and bytes transferred)
Internet Data Distribution (IDD) with Multiple Sources (Injecting 17 Gigabytes per Day)
Using LDM software for instant data relaying, ~160 institutions cooperate to acquire a wide range of real-time, global, atmospheric & oceanic observations, model outputs, remotely sensed images..., in a coordinated community effort.
[Diagram: multiple data sources inject products into a network of LDM relay nodes connected over the Internet.]
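The relaying idea on this slide can be illustrated with a minimal sketch. It is a toy model only: the `RelayNode` class, the `inject` function, and the node names are hypothetical and do not reflect the actual LDM software or its protocol. It shows products pushed from several sources through cooperating relay sites, with duplicates dropped when a site has more than one upstream path.

```python
from collections import deque


class RelayNode:
    """Toy stand-in for a participating site (hypothetical, not the real LDM)."""

    def __init__(self, name):
        self.name = name
        self.downstream = []  # sites this node relays to
        self.received = []    # product ids in arrival order
        self._seen = set()    # ids already handled, for duplicate rejection

    def feed(self, node):
        """Register a downstream site that this node pushes data to."""
        self.downstream.append(node)


def inject(source, product_id):
    """Push one product from a source through the relay graph, breadth-first."""
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if product_id in node._seen:
            continue  # already arrived via another upstream path
        node._seen.add(product_id)
        node.received.append(product_id)
        queue.extend(node.downstream)


def build_demo():
    """Two sources feeding one hub, which feeds two leaf sites."""
    src_a, src_b = RelayNode("source-a"), RelayNode("source-b")
    hub = RelayNode("hub")
    leaf1, leaf2 = RelayNode("leaf-1"), RelayNode("leaf-2")
    src_a.feed(hub)
    src_b.feed(hub)
    hub.feed(leaf1)
    hub.feed(leaf2)
    leaf1.feed(leaf2)  # redundant path, exercises duplicate rejection
    return src_a, src_b, hub, leaf1, leaf2
```

Calling `inject(src_a, "radar-001")` on the demo graph delivers the product once to the hub and to each leaf, even though one leaf is reachable by two paths.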
Typical Data Handling at a Unidata Site
[Diagram: data streams arriving via the IDD (forecast model output; weather station observations; satellite imagery; radar data; lightning, aircraft, GPSmet, etc.) pass through decoders into local data in application-specific formats. Unidata users running local analysis and display tools reach that data through application-specific protocols.]
Thematic Data Servers (combining IDD “push” with several forms of “pull” and DL discovery)
[Diagram: thematic data servers (hydrology data, geophysical data, satellite imagery...) are fed by the IDD. Local user applications (e.g., LAS, McIDAS, IDV, VGEE, IDL, MatLab...) reach them through client/server data access protocols (e.g. OPeNDAP, ADDE, WCS, FTP). DLESE (Digital Library for Earth-System Education) provides discovery through a DL interchange protocol.]
THREDDS THematic Real-time Environmental Distributed Data Services
Connecting people, documents and data
[Diagram: People, Documents, Data]
THREDDS Overview
• National Science Digital Library (NSDL) “collections” project
• Integrating real-time environmental data into
– Online educational materials
– Digital libraries (DLESE, NSDL)
• Two-year grant from NSF Division of Undergraduate Education (DUE)
• Second generation under negotiation
• Led by Unidata Program Center (UPC)
THREDDS Data Providers
• University of Alabama Huntsville (Sara Graves, Rahul Ramachandran, Steve Tanner, Ken Keiser)
• ARM (Atmospheric Radiation Measurement, Chris Klaus)
• CDC, the Climate Diagnostic Center (Roland Schweitzer)
• COLA, Center for Oceans Land Atmosphere (Joe Wielgosz)
• University of Florence (Stefano Nativi)
• GMU, George Mason University (Menas Kafatos and Ruixin Yang)
• IRI/LDEO, International Research Institute/Lamont Doherty Earth Observatory (Benno Blumenthal)
• ESG, the Earth System GRID (Luca Cinquini, NCAR/SCD)
• IRIS DMC, Incorporated Research Institutes for Seismology Data Management Center (Rob Casey)
• NCAR, the National Center for Atmospheric Research (Don Middleton)
• NCDC, the National Climatic Data Center (Ben Watkins)
• NGDC, National Geophysical Data Center (Ted Habermann)
• NOMADS, NOAA Operational Model Archive and Distribution System (Glenn Rutledge, NCDC)
• University of Oklahoma (Kelvin Droegemeier)
• PMEL, the Pacific Marine Environment Laboratory (Steve Hankin)
• FNMOC, Fleet Numerical Meteorological and Oceanographic Center (Phil Sharfstein)
• SSEC, the Space Science and Engineering Center, U. of Wisconsin-Madison (Steve Ackerman, Tom Whittaker)
• Unidata Community ADDE servers (Tom Yoksas, Unidata Program Center)
• CIESIN (Consortium for International Earth Science Information Network, Bob Downs)
• CUAHSI (Consortium of Universities for Advancement of Hydrologic Science, David Maidment)
• ESIG/NCAR (NCAR Environmental Societal Impacts Group, Bob Harriss)
• Earthscope (UCAR UNAVCO, Chuck Meertens)
• GEON (GEOphysical Network, Chaitan Baru, UCSD San Diego Supercomputer Center)
• ESRI GIS Community
THREDDS Analysis/Display Tool Builders
• Data Discovery Toolkit and Foundry based on EDMI (Earth Data Multimedia Instrument, New Media Studio, Bruce Caron).
• GDS, GrADS/DODS Server (COLA, Center for Oceans Land Atmosphere, Joe Wielgosz)
• IDV, Integrated Data Viewer (Unidata Program Center, Don Murray)
• INGRID (IRI/LDEO, International Research Institute/Lamont Doherty Earth Observatory, Benno Blumenthal)
• LAS, Live Access Server (PMEL, the Pacific Marine Environment Laboratory, Steve Hankin)
• VGEE, Virtual Geophysical Exploration Environment (NCAR, DLESE, U. of Illinois, Unidata, many collaborators)
• WXWISE Applets (SSEC, the Space Science and Engineering Center, U. of Wisconsin-Madison, Tom Whittaker)
• ESRI GIS Clients (ESRI, Inc., Jack Dangermond, President)
• OGC Clients (Open GIS Consortium, David Schell, President)
• MyWorld (Northwestern educational GIS Client, Danny Edelson)
THREDDS Interoperability Partners
• ADDE, Abstract Data Distribution Environment (University of Wisconsin-Madison, Tom Yoksas)
• DIMES, DIstributed MEtadata System (George Mason University, Ruixin Yang)
• DODS/OPeNDAP/Aggregation Server, Distributed Oceanographic Data System/Open source Project for a Network Data Access Protocol (University of Rhode Island, Unidata, Ethan Davis)
• DLESE, Digital Library for Earth System Education (Rajul Pandya)
• ESML, Earth System Markup Language (University of Alabama-Huntsville, Rahul Ramachandran)
• ESRI, Environmental Systems Research Institute (various)
• GCMD, Global Change Master Directory (Gene Major)
• OGC and ISO Standards (University of Florence, Stefano Nativi)
• ADL Gazetteer Services (The University of California, Santa Barbara, Linda Hill and Michael Goodchild)
• DLESE Evaluation Services (The University of Colorado CIRES, Susan Buhr)
• DLESE Data Services (Tamara Ledley)
• DLESE Program Center Digital Library for Earth System Education (Mary Marlino)
• ESRI (Jack Dangermond, President)
• OPeNDAP (The University of Rhode Island Open source Project for a Network Data Access Protocol -- formerly DODS, Peter Cornillon)
• LAITS (Laboratory for Advanced Information Technology and Standards, Liping Di, George Mason University)
• NSDL Evaluation Services (University of Colorado, Tamara Sumner)
• OGC (Open GIS Consortium, David Schell, President)
• SWEET (Semantic Web for Earth and Environmental Terminology, Rob Raskin)
Unidata’s Contributions
• A large, (inter)national, active, cooperative academic user community
• Coordination of many disparate contributors (universities, government agencies, digital libraries, commercial vendors, standards bodies…)
• Reliable, automated, real-time data systems
• Platform-independent 5D visualization with HTML document integration
• Basic inventory catalog generator and server software
• Client-side catalog access modules
Funding Sources
• Unidata 2003/2008 (NSF Atmospheric Science Division)
• THREDDS NSDL Collections Grant (NSF Division of Undergraduate Education)
• DODS/OPeNDAP (University of Rhode Island subcontract on Naval Ocean Partnership Program Grant and NASA Earth Science Enterprise)
• NWS/COMET Case Studies (NOAA NWS)
The Web
[Diagram: People, Documents, Data]
• Well-developed connections
– Document references
– Embedded multimedia
– Embedded interactive applets
• Powerful tools
– Google
– Dreamweaver
– Web-site management tools
– Web services
Data Access Technologies
[Diagram: People, Documents, Data]
• Web-based data interactions with passive GIF images -- most analysis work done on remote server
• Traditional Unidata IDD with analysis on local clients
• Combinations with Web browse and FTP delivery for local analysis
• Client/server, e.g., DODS/OPeNDAP
• All lack sophisticated, text-based Web search/discovery tools and coherent integration
THREDDS is the Bottom Line
[Diagram: People, Documents, Data]
• Associate words of the science with available datasets
• Create “compound” documents pointing to datasets
• Connect analysis tools to documents and datasets
• Wide range of compound documents
– Lists of datasets available on server with brief description of dataset classes
– Online publications pointing to datasets illustrating concepts
• Massive arsenal of Web and Digital Library search/discovery tools can be applied to compound documents
THREDDS Middleware
[Diagram: People, Documents, Data, linked by THREDDS middleware: Catalog Generation Tools; Analysis and Visualization Tools; Data Services; Discovery and Publication Tools; Discovery and Publication Services; Data Catalog Services]
Basic Compound Document: THREDDS Server Inventory Catalog
• Inventory list of datasets on server
• Generated automatically with minimal human input
• Viewed from within analysis and display application
• Can be harvested for inclusion in GCMD, DLESE, NSDL for use by module builders
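As a rough illustration of an automatically generated inventory list, the sketch below walks a data directory and emits an XML listing. The element names (`catalog`, `dataset`), the attribute names, and the `build_inventory` function are assumptions made for this example; they are not the actual THREDDS catalog schema or software.

```python
import os
import xml.etree.ElementTree as ET


def build_inventory(root_dir, server_name="example-server"):
    """Walk root_dir and return an XML element listing every file found.

    Element and attribute names are illustrative only, not a real schema.
    """
    catalog = ET.Element("catalog", {"server": server_name})
    for dirpath, dirnames, filenames in os.walk(root_dir):
        dirnames.sort()  # make the traversal order deterministic
        for filename in sorted(filenames):
            full_path = os.path.join(dirpath, filename)
            # Store a server-relative path with forward slashes.
            rel_path = os.path.relpath(full_path, root_dir).replace(os.sep, "/")
            ET.SubElement(catalog, "dataset", {
                "name": filename,
                "path": rel_path,
                "sizeBytes": str(os.path.getsize(full_path)),
            })
    return catalog


def inventory_to_string(catalog):
    """Serialize the inventory element to an XML string."""
    return ET.tostring(catalog, encoding="unicode")
```

Because the listing is derived entirely from what is on disk, it can be regenerated on a schedule with no manual editing, which is the property the slide emphasizes.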
Compound Publication: Educational Module within Interactive Analysis Tool
• Discovery at DLESE
• Module at DPC
• VGEE tool at Unidata
• Datasets at NCAR
• Lends itself well to Web discovery tools, DL integration
• Can be:
– education module
– online scientific publication
Browser-based Thin Client Access
• LDEO/IRI web site publishes catalog of datasets available on server at UCAR
• Catalog resides and is updated at UCAR
• Browsing of datasets on UCAR server from LDEO server
• Also enables analysis and display of datasets on UCAR server using tools on LDEO server
Future Directions
• Standards-based web services approach to providing both data and metadata
• Integrate GIS clients and servers into THREDDS for access to societal impacts, infrastructure, hydrology data, etc.
• Work with OGC and ISO to incorporate emerging standard access protocols into THREDDS
• Actively participate in future DLESE Data Access Working Group and Data Services workshops to create more compound document educational modules.
THREDDS, GIS, DL Interoperability
[Diagram: GIS client applications and THREDDS client applications; OpenGIS protocols (WMS, WFS, WCS); OGC or proprietary GIS protocols to GIS servers (demographic, infrastructure, societal impacts, … datasets); OGC or OPeNDAP, ADDE, FTP… protocols to THREDDS servers (satellite, radar, forecast model output, … datasets); metadata crosswalks and Open Archives Initiative (OAI) metadata harvesting into Digital Library Discovery Systems.]
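A metadata crosswalk is, at its simplest, a field-to-field translation between two record vocabularies. The sketch below assumes a hypothetical source record with fields such as `name` and `accessUrl`, and borrows a few Dublin Core element names for the target side. The mapping table and the `crosswalk` function are illustrative only and are not taken from THREDDS, DLESE, or any harvesting tool.

```python
# Hypothetical mapping from data-server catalog fields to library record fields.
CROSSWALK = {
    "name": "title",
    "summary": "description",
    "accessUrl": "identifier",
    "keywords": "subject",
}


def crosswalk(record, mapping=CROSSWALK):
    """Translate a source record into the target vocabulary.

    Returns (translated, unmapped). The unmapped list names source fields
    with no target equivalent, so they can be reviewed instead of being
    silently dropped.
    """
    translated = {}
    unmapped = []
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            unmapped.append(field)
        else:
            translated[target] = value
    return translated, sorted(unmapped)
```

Reporting the unmapped fields is a deliberate choice: discipline-specific metadata rarely has a one-to-one counterpart in a general library schema, and the gaps are worth seeing.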
Summary
• Universities have used Unidata tools to acquire, analyze, and display real-time atmospheric data for nearly 20 years
• THREDDS, along with related client/server access and display technologies, makes an even broader menu of Earth system data available to a more diverse community of users
• THREDDS technologies enable the creation of compound educational modules and scientific publications with embedded pointers to datasets and tools.
More Information
• http://my.unidata.ucar.edu/
• http://www.unidata.ucar.edu/projects/THREDDS/