HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Jan 30, 2016

Transcript
Page 1: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

04/22/23 Department of Civil, Architectural & Environmental Engineering 1

HYDROSEEK and HYDROTAGGER

A Search Engine for Hydrologists

GIS in Water Resources Lecture

M. Piasecki

November, 2007

Page 2: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Lecture

Demo of HydroSeek
• What are the search criteria?
• Functionality of the Engine Interface

Data Sources
• Common Sources
• Common Problems (Completeness, Syntax, Semantics)

Ontologies
• Ontology details
• Concept-to-data variable tagging

Architecture
• Flow Chart
• Technologies used

Demo of HydroTagger
• Why the Tagging?
• Technologies

Page 3: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

www.HydroSeek.org

Page 4: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

HIS Goals

• Hydrologic Data Access System – better access to a large volume of high quality hydrologic data
• Support for Observatories – synthesizing hydrologic data for a region
• Advancement of Hydrologic Science – data modeling and advanced analysis
• Hydrologic Education – better data in the classroom, basin-focused teaching

Page 5: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Objective

Search multiple heterogeneous data sources simultaneously regardless of semantic or structural differences between them

What we are doing now …

[Diagram: separate request/return exchanges with each data source (NWIS, NARR, NAWQA, NAM-12)]

Page 6: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

What we would like to do …

[Diagram labels: generic request, Semantic Mediator, GetValues calls, NWIS, NAWQA, NARR, HODM]
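The mediator idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SOURCE_VOCABULARY` mapping, the assignment of variable names to sources, and the `fake_get_values` stand-in are all hypothetical and do not describe the real services behind NWIS or NAWQA.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical mapping from one generic concept to each source's own
# variable name. Which source uses which name is invented for illustration.
SOURCE_VOCABULARY: Dict[str, Dict[str, str]] = {
    "NWIS": {"water level": "Elevation, water surface"},
    "NAWQA": {"water level": "stage height"},
}


def fake_get_values(source: str, variable: str) -> List[float]:
    """Stand-in for a per-source GetValues call (no network access)."""
    return [float(len(source) + len(variable))]


def mediate(
    concept: str,
    get_values: Callable[[str, str], List[float]] = fake_get_values,
) -> Dict[str, Tuple[str, List[float]]]:
    """Translate one generic request into one GetValues call per source."""
    results: Dict[str, Tuple[str, List[float]]] = {}
    for source, vocabulary in SOURCE_VOCABULARY.items():
        native_name = vocabulary.get(concept)
        if native_name is None:
            continue  # this source has nothing mapped to the concept
        results[source] = (native_name, get_values(source, native_name))
    return results


if __name__ == "__main__":
    for source, (name, values) in mediate("water level").items():
        print(source, name, values)
```

The caller issues a single generic request; the translation into each repository's own vocabulary happens inside `mediate`.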

Page 7: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Data sources …

USGS

EPA

CIMS

TCEQ

NADP

Page 8: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Spatial Coverage

STORET has 758 sites in Texas, TCEQ has 8,407.

STORET has 47,602 sites in Florida, NWIS has 27,906.

NWIS has 121,545 sites in Minnesota, STORET has 22,260.

Page 9: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Data Availability

Page 10: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Temporal Coverage

[Map panels: Nitrogen, 1957-1977, 1977-2003, 2003-2007]

Page 11: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Interface Problem

NWIS: ~175 form elements on a single page

STORET + NWIS + TCEQ + CIMS = ???

A drop down menu ∞

String search across parameter list? How about synonyms?

‘Elevation, water surface’ vs. ‘stage height’
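The synonym question raised on this slide can be made concrete with a small sketch. It assumes a hand-built `SYNONYMS` table and a made-up `PARAMETERS` list; only the pair ‘Elevation, water surface’ / ‘stage height’ comes from the slide, everything else is hypothetical.

```python
from typing import Dict, List

# Hypothetical synonym table; the one pair shown is taken from the slide.
SYNONYMS: Dict[str, List[str]] = {
    "stage height": ["Elevation, water surface"],
}

# Made-up parameter list standing in for a repository's variable names.
PARAMETERS: List[str] = ["Elevation, water surface", "Stage height", "Discharge"]


def string_search(term: str, parameters: List[str]) -> List[str]:
    """Case-insensitive substring match across a parameter list."""
    needle = term.lower()
    return [p for p in parameters if needle in p.lower()]


def synonym_search(term: str, parameters: List[str]) -> List[str]:
    """Match the term itself plus any registered synonyms."""
    terms = [term] + SYNONYMS.get(term.lower(), [])
    hits: List[str] = []
    for candidate in terms:
        for match in string_search(candidate, parameters):
            if match not in hits:
                hits.append(match)
    return hits


if __name__ == "__main__":
    print(string_search("stage height", PARAMETERS))
    print(synonym_search("stage height", PARAMETERS))
```

A plain string search finds only the literal match, while the synonym-aware search also returns the parameter that one repository files under a different name.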

Page 12: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Completeness Problem: Metadata Catalog

• Better query performance
• Freedom
• Fewer errors

Total Number of Sites 274,918

Sites with geographic coordinates 274,435

Sites with State/County information 273,113

Sites with Hydrologic Unit Codes 128,646

Availability of geographic identifiers for stations in EPA STORET
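To read the counts above as coverage rates, a short calculation helps. The sketch below only divides the numbers from the table; the names `TOTAL_SITES`, `COUNTS` and `coverage_percent` are chosen for illustration.

```python
# Site counts copied from the EPA STORET table above.
TOTAL_SITES = 274_918
COUNTS = {
    "geographic coordinates": 274_435,
    "State/County information": 273_113,
    "Hydrologic Unit Codes": 128_646,
}


def coverage_percent(count: int, total: int = TOTAL_SITES) -> float:
    """Share of sites carrying a given identifier, in percent."""
    return 100.0 * count / total


if __name__ == "__main__":
    for label, count in COUNTS.items():
        print(f"{label}: {coverage_percent(count):.1f}%")
```

Coordinates and State/County information are nearly complete (about 99.8% and 99.3% of sites), whereas fewer than half of the sites (about 46.8%) carry a Hydrologic Unit Code.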

Page 13: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Heterogeneity Problem

Syntax
• E.g. date & time formats, Gregorian versus Julian

Data format/structure
• E.g. XML, HTML, tab/tilde/comma separated text, gzipped tar balls…

Semantics
• more …
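The syntax problem can be illustrated with date handling. The sketch below assumes that "Julian" refers to the year plus day-of-year notation common in data files, and the list of candidate formats is an assumption rather than an inventory of what any repository actually uses.

```python
from datetime import date, datetime

# Candidate formats (an assumed list). "%Y%j" is year plus day-of-year.
FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%j")


def parse_date(text: str) -> date:
    """Try each known format in turn and return a calendar date."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized date format: {text!r}")


if __name__ == "__main__":
    for raw in ("2006-08-13", "08/13/2006", "2006225"):
        print(raw, "->", parse_date(raw).isoformat())
```

All three inputs denote the same day, 13 August 2006, which is the point: a mediator has to normalize such variants before results from different sources can be compared.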

Page 14: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Issues with Semantics

Hyponymy
• Parameter “Groundwater level”, “Stream stage”, “Reservoir level” versus “Water level”

Pseudo hyponymy due to lack of metadata
• Parameter “Manganese, 6N hydrochloric acid extracted, recoverable, dry weight, milligrams per kilogram” versus “Manganese, milligrams per kilogram”

Synonymy
• ‘Total Kjeldahl Nitrogen’ vs. ‘Ammonia+Organic Nitrogen’

Page 15: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Search Strategy

Search → Fine tune → Retrieve

rather than

Search → Retrieve

Avoid ‘high precision, low recall’ and ‘low precision, high recall’ problems.
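For readers unfamiliar with the two terms, precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. A minimal sketch follows; the example sets are invented (the parameter names are borrowed from the hyponymy slide purely as labels).

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query result."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: three parameters are relevant to a "water level" search.
RELEVANT = {"Groundwater level", "Stream stage", "Reservoir level"}

# A narrow query returns one correct hit and nothing else.
NARROW = {"Stream stage"}

# A broad query returns everything relevant plus three unrelated parameters.
BROAD = RELEVANT | {"Discharge", "Copper", "Nitrogen"}

if __name__ == "__main__":
    print("narrow:", precision_recall(NARROW, RELEVANT))
    print("broad: ", precision_recall(BROAD, RELEVANT))
```

The narrow query is the ‘high precision, low recall’ case and the broad query is the ‘low precision, high recall’ case; an intermediate fine-tuning step is meant to avoid both.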

Page 16: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Layered Ontology Model

Page 17: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

[Figure labels: Navigation, Compound, Core]

Page 18: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Knowledge Base

OWL Ontologies

‘Escherichia coli’ = ‘E. coli’

‘E. coli’ is-a ‘Indicator Organism’

‘Copper’ is-a ‘Micronutrient’

‘Copper’ isMeasuredIn ‘Medium’

‘Medium’ = {Water, Soil…}

‘Micronutrient’ is-a ‘Nutrient’

• Supports classification of search results

• Entities in the ontology are associated with measured variables in a relational database

• Helps solve semantic heterogeneity issues between data repositories
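The way such statements support classification can be sketched with plain dictionaries. This is a simplified stand-in, not OWL: it assumes each term has at most one parent, and the names `SAME_AS`, `IS_A`, `ancestors` and `is_a` are illustrative. The facts themselves are transcribed from the slide.

```python
from typing import Dict, Set

# Equivalences and is-a links transcribed from the slide.
SAME_AS: Dict[str, str] = {"Escherichia coli": "E. coli"}
IS_A: Dict[str, str] = {
    "E. coli": "Indicator Organism",
    "Copper": "Micronutrient",
    "Micronutrient": "Nutrient",
}


def canonical(term: str) -> str:
    """Map an equivalent name onto its canonical form."""
    return SAME_AS.get(term, term)


def ancestors(term: str) -> Set[str]:
    """Follow is-a links upward from a term (single-parent simplification)."""
    found: Set[str] = set()
    current = canonical(term)
    while current in IS_A and IS_A[current] not in found:
        current = IS_A[current]
        found.add(current)
    return found


def is_a(term: str, concept: str) -> bool:
    """True if the term falls under the concept, directly or transitively."""
    return concept in ancestors(term)


if __name__ == "__main__":
    print(is_a("Copper", "Nutrient"))
    print(is_a("Escherichia coli", "Indicator Organism"))
```

Because is-a is followed transitively, a search for ‘Nutrient’ can return variables tagged with ‘Copper’ even though no repository labels them that way.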

Page 19: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Page 20: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Point Observations Information Model

Level          Example
Data Source    USGS
Network        Streamflow gages
Sites          Neuse River near Clayton, NC
Variables      Discharge, stage (Daily or instantaneous)
Values         206 cfs, 13 August 2006

Values: {Value, Time, Qualifier, Offset}

• A data source operates an observation network
• A network is a set of observation sites
• A site is a point location where one or more variables are measured
• A variable is a property describing the flow or quality of water
• A value is an observation of a variable at a particular time
• A qualifier is a symbol that provides additional information about the value
• An offset allows specification of measurements at various depths in water

http://www.cuahsi.org/his/webservices.html

GetSites

GetSiteInfo

GetVariables

GetVariableInfo

GetValues
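The hierarchy described on this slide maps naturally onto nested records. The dataclasses below are a sketch of that structure only; they are not the WaterOneFlow API, and the `units` field is an added assumption so that "206 cfs" can be represented.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Value:
    value: float
    time: date
    qualifier: Optional[str] = None  # symbol with extra information
    offset: Optional[float] = None   # e.g. depth of the measurement


@dataclass
class Variable:
    name: str
    units: str  # assumed field, not listed on the slide
    values: List[Value] = field(default_factory=list)


@dataclass
class Site:
    name: str
    variables: List[Variable] = field(default_factory=list)


@dataclass
class Network:
    name: str
    sites: List[Site] = field(default_factory=list)


@dataclass
class DataSource:
    name: str
    networks: List[Network] = field(default_factory=list)


# The example from the slide: USGS, streamflow gages, Neuse River.
discharge = Variable("Discharge", "cfs", [Value(206.0, date(2006, 8, 13))])
usgs = DataSource(
    "USGS",
    [Network("Streamflow gages",
             [Site("Neuse River near Clayton, NC", [discharge])])],
)

if __name__ == "__main__":
    observation = usgs.networks[0].sites[0].variables[0].values[0]
    print(observation.value, discharge.units, observation.time.isoformat())
```

Walking from the data source down to a single value mirrors the order of the listed calls, from GetSites through GetVariables to GetValues.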

Page 21: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Hydroseek Webservices

• Most Hydroseek functions are available as web services (SOAP)
• Support for queries using Global Change Master Directory (GCMD) keywords
• Supports output in Geography Markup Language (GML) as well as WaterML

[Diagram labels: Drexel Server, HydroSeek, Native Services, Microsoft Server, VirtualEarth Map, San Diego Supercomputer Center Server, and one WaterOneFlow service each for USGS Daily, EPA STORET, USGS Realtime, TCEQ and CIMS]

Page 22: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

GetStations

Request

Response

BoundingBox

Page 23: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

GetStationsByHU

HUC_Code

Request

Response

Page 24: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Request

Response

GetStationCatalogueFiltered

Page 25: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Request

Response

GetStationCatalogue

Page 26: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Allows searching multiple heterogeneous data sources simultaneously regardless of semantic or structural differences between them

Modular & extensible

Architecture Outline

Inside the CUAHSI HOD Module

Page 27: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

The Database-Ontology Link

www.HydroTagger.org

Page 28: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

HydroSeek ODM needed an upgrade, i.e. additional tables:

1) MappingsApproved_Table

2) FrequentUpDates_Table

Page 29: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

How does the Tagging work?

Step 1

Users need to register on the web-site first before they can use the HydroTagger.

When registering, select the testbed site you are affiliated with. Each testbed site needs ONE administrator who can then admit additional users for that specific testbed site.

Please send an email to identify the designated tagger site administrator so we can promote that person to the role.

Page 30: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

How does the Tagging work?

WATERS Network Information System

Step 2

The “Sniffer” jumps into action and trawls through the testbed sites to find and identify new variable names (once a week, currently every Sunday night).

It does so by using the regular web services published through the WSDL (no “hacking”!!!).

It returns i) data update information and ii) the variable names used, and compares these to those used by HydroSeek.

Page 31: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

How does the Tagging work?

Step 3

The Tagger now updates the HydroSeek catalogue (an amalgamation of all 10 testbed catalogues) with the newly found data entries.

If it finds a new variable name (introduced during the data loading process using the Data-Loader), it puts it into a table and offers it up to the HydroTagger GUI for semantic tagging.

Test-Bed    VarName      Site exist?  VarName?  Content          Action
CCBay       DOConcSuf    Y            Y         new data         update Cat (Time)
CCBay       DOConcBot    Y            N         new variable     place in TaggerBin => DO
CCBay       DOConcMid    N            Y         new data         update Cat (Site+Time)
SRBHOS      DO_Water     Y            Y         new data         update Cat (Time)
Minnehaha   TempSurf     Y            N         new variable     place in TaggerBin => Temp
Minnehaha   StreamDOCon  Y            N         new variable     place in TaggerBin => DO
SantaFe     WaterDOCon   Y            N         new variable     place in TaggerBin => DO
SantaFe     GoldConc     Y            N         new var/no conc  place in TaggerBin => ??
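The rows of the table above follow a simple rule that depends on two yes/no checks. The function below is a sketch of that rule, not the Tagger's actual code; the name `tagger_action` is hypothetical, and the case where neither the site nor the variable name is known does not appear in the table, so routing it to the TaggerBin is an assumption.

```python
from typing import Tuple


def tagger_action(site_exists: bool, varname_known: bool) -> Tuple[str, str]:
    """Return (content, action) following the rows of the table above."""
    if not varname_known:
        # Unknown variable names are queued for semantic tagging.
        # Assumption: this also applies when the site is new as well.
        return ("new variable", "place in TaggerBin")
    if site_exists:
        # Known site and known variable: only the time range changes.
        return ("new data", "update Cat (Time)")
    # Known variable at a new site: site and time range are updated.
    return ("new data", "update Cat (Site+Time)")


if __name__ == "__main__":
    rows = [
        ("CCBay", "DOConcSuf", True, True),
        ("CCBay", "DOConcBot", True, False),
        ("CCBay", "DOConcMid", False, True),
    ]
    for testbed, varname, site_ok, var_ok in rows:
        print(testbed, varname, *tagger_action(site_ok, var_ok))
```

Choosing the target concept (DO, Temp, or none at all as for GoldConc) is left to the person doing the tagging in the HydroTagger GUI and is deliberately not modeled here.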

Page 32: HYDROSEEK and HYDROTAGGER A Search Engine for Hydrologists GIS in Water Resources Lecture

Thank you… Questions?