B. Brodaric—GIN Cyberra Summit 2010 Banff, 22 Sept. 2010
GIN: A Cyberinfrastructure and GeoPortal for Canadian Groundwater Data
Boyan Brodaric Geological Survey of Canada
Natural Resources Canada
Themes
Data Cyberinfrastructure (CI): web-based resources for data interoperability
Spatial Data (cyber)Infrastructure (SDI): open standards for geographically located features and observations
Groundwater Information Network (GIN): Canadian network for groundwater data
GW data in Canada
Distributed, uncoordinated data: feds (<10), provs & terrs (<50), municipalities (100s?), watershed authorities (100s?)
Heterogeneous data: data use, content, structure, systems (dbs, sensors)
Variable volume
Use (e.g. extraction, vulnerability): ?
Budgets (e.g. regional recharge): 10s?
Reservoirs (e.g. aquifers): 100s
Observations (e.g. wells, monitoring): 1Ms-10Ms
Variable quality: completeness, consistency, location
GW data in Canada
Ontario & Quebec schematic and semantic heterogeneity in water-well data
[Figure: Quebec rock type vs. Ontario rock type]
Recent calls for action
More online access
Consolidate access
Better data quality
More data (use, monitoring)
GW Data Access
GW Data Management
Groundwater Information Network (GIN)
NRCan, 9 prov/terr (YK, BC, AB, SK, MB, ON, QC, NS, NL), USGS
Seamless access to GW information
Start with water well databases, then sensors
GeoConnections seed funding, Jan 2008-Mar 2009
Principles
Distributed: data stays with owners
Seamless: acts as one virtual database
Multi-access: multiple portals, tools
Standards-based: nat’l CGDI & int’l OGC/ISO standards, e.g. Groundwater ML (GWML), WaterML, GeoSciML
Approach
Approach: data interoperability
Overcome levels of data heterogeneity
pragmatic: GW Practices (data usage)
semantic: GW Ontology (data content)
schema: GWML, WaterML (data structure)
syntax: GML (data language)
system: WFS, WMS,… (data systems)
[Figure labels: Groundwater, OGC]
Approach: interop architectures
Catalog: central registry, unconsolidated access, common standards, fragmented data; e.g. US-CUAHSI
Warehouse: central database, consolidated access, common standards, duplicate and delayed data; e.g. AU-AWRIS, EU-WISE
Network: central mediator and registry, consolidated access, common standards, virtual real-time data; e.g. GIN
[Figure: three architecture diagrams, each showing ON and QC sources with OGC interfaces; other labels: registry, mediator]
Approach: design
Groundwater Information Network
[Figure labels: GML, GWML, WaterML; WMS, WFS, SOS; GML-BC, GML-AB, GML-SK, GML-ON; GIN Advanced: 3D, analysis]
Typical mediator architecture
Client: send query, receive results
“find all water wells with unconsolidated materials”
Mediator (global): distribute query, integrate results, distribute results
Wrapper (global, local), one each for ON and QC: translate query (global → local), translate results (local → global)
Registry: metadata
Ontology: reasoner, matcher
Local terms: ON: sand, clay, soil; QC: SABL, ARGL, TERR
<RockMaterial> <geneticCategory> <CGI_TermValue> <value…>Sedimentary</value> </CGI_TermValue> </geneticCategory> <lithology> … <name…>Sand</name> </lithology>
GIN Mediator architecture
Steps: receive & translate query, distribute query, receive results, translate & integrate results, distribute results
Client: send query, receive results
“find all water wells with unconsolidated material”
Mediator (global): W*S, SOS
Sources (local): ON and QC, each behind W*S, SOS
Registry: CSW
Ontology
Local terms: ON: sand, clay, soil; QC: SABL, ARGL, TERR
Formats: GML, O&M, WaterML, GWML, GeoSciML
<RockMaterial> <geneticCategory> <CGI_TermValue> <value…>Sedimentary</value> </CGI_TermValue> </geneticCategory> <lithology> … <name…>Sand</name> </lithology>
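The mediator steps on this slide (translate query, distribute query, translate and integrate results) can be sketched in a few lines of Python. This is only an illustration, not the GIN mediator itself: the in-memory well records, the soil URN and all function names are assumptions made for the example, while the sand and clay URNs and the ON/QC local terms come from the slides.

```python
from typing import Dict, List

# Global concept identifiers. The sand and clay URNs appear in the slides;
# the soil URN is a made-up placeholder.
SAND = "urn:x-ngwd:vocabulary:gin:2c"
CLAY = "urn:x-ngwd:vocabulary:gin:2d-2"
SOIL = "urn:example:soil"  # hypothetical

# Local vocabularies per source (terms taken from the slides).
LOCAL_TERMS: Dict[str, Dict[str, str]] = {
    "ON": {SAND: "sand", CLAY: "clay", SOIL: "soil"},
    "QC": {SAND: "SABL", CLAY: "ARGL", SOIL: "TERR"},
}

# Hypothetical well records standing in for each province's web service.
SOURCES: Dict[str, List[dict]] = {
    "ON": [{"well": "ON-1", "material": "sand"},
           {"well": "ON-2", "material": "clay"}],
    "QC": [{"well": "QC-1", "material": "SABL"},
           {"well": "QC-2", "material": "TERR"}],
}


def translate_query(source: str, concept: str) -> str:
    """Translate a global concept into the source's local term."""
    return LOCAL_TERMS[source][concept]


def translate_results(source: str, rows: List[dict]) -> List[dict]:
    """Translate local result rows back into the global vocabulary."""
    to_global = {local: urn for urn, local in LOCAL_TERMS[source].items()}
    return [
        {"well": r["well"], "source": source, "material": to_global[r["material"]]}
        for r in rows
    ]


def mediate(concept: str) -> List[dict]:
    """Distribute one global query to every source and integrate the results."""
    merged: List[dict] = []
    for source, rows in SOURCES.items():
        local_term = translate_query(source, concept)
        hits = [r for r in rows if r["material"] == local_term]
        merged.extend(translate_results(source, hits))
    return merged
```

Calling `mediate(SAND)` returns one record from each province, both labelled with the same global URN even though the sources store "sand" and "SABL".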
GIN translation of results
Translation levels: syntactic, schematic, semantic
GIN simple lithology ontology
GWML lithology: <lithology> … <name…>Sand</name> </lithology>
ON Sand
QC Sand
GIN Basic Portal
<gsml:lithology>
  <gsml:ControlledConcept gml:id="gin.cc.2d-2">
    <gsml:identifier codeSpace="urn:ietf:rfc:2141">urn:x-ngwd:vocabulary:gin:2d-2</gsml:identifier>
    <gsml:name codeSpace="urn:x-ngwd:classifierScheme:GIN:Lithology:2008" xml:lang="fr">Argile</gsml:name>
    <gsml:name codeSpace="urn:x-ngwd:classifierScheme:GIN:Lithology:2008" xml:lang="eng">Clay</gsml:name>
    <gml:description>A naturally occurring material composed primarily of fine-grained minerals. It is generally plastic at appropriate water contents and will harden when dried or fired (Neuendorf et al. 2005)</gml:description>
  </gsml:ControlledConcept>
</gsml:lithology>
<gsml:material>
  <gsml:UnconsolidatedMaterial>
    <gsml:consolidationDegree>
      <gsml:CGI_TermValue>
        <gsml:value codeSpace="urn:cgi:classifierScheme:BGS:consolidationTerms">UNCONSOLIDATED</gsml:value>
      </gsml:CGI_TermValue>
    </gsml:consolidationDegree>
    <gsml:physicalProperty>
      <gwml:HydrogeologicDescription>
        <gwml:hydraulicConductivity>
          <gsml:CGI_NumericValue>
            <gsml:qualifier>approximate</gsml:qualifier>
            <gsml:principalValue uom="y_K_md-1">0.001</gsml:principalValue>
          </gsml:CGI_NumericValue>
        </gwml:hydraulicConductivity>
      </gwml:HydrogeologicDescription>
    </gsml:physicalProperty>
  </gsml:UnconsolidatedMaterial>
</gsml:material>
GWML
Google Earth
Excel
ESRI Shape, GeoDb XML
GIN Example Performance (2 provs)
50 wells = 2.17 secs, 1.08 Mb
500 wells = 15.01 secs, 7.74 Mb
2500 wells = 69.97 secs, 40.80 Mb
5000 wells = 142.27 secs, 80.41 Mb
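To see how these timings scale, the reported figures can be normalised per well. The short Python sketch below does only that arithmetic on the numbers from the slide; the function name and the choice of units (milliseconds per well, Mb per 1000 wells) are illustrative.

```python
# (wells, seconds, Mb) exactly as reported on the slide
RUNS = [
    (50, 2.17, 1.08),
    (500, 15.01, 7.74),
    (2500, 69.97, 40.80),
    (5000, 142.27, 80.41),
]


def per_well(runs):
    """Return (wells, milliseconds per well, Mb per 1000 wells) for each run."""
    return [
        (wells, round(1000 * secs / wells, 1), round(1000 * mb / wells, 1))
        for wells, secs, mb in runs
    ]


for wells, ms, mb in per_well(RUNS):
    print(f"{wells:>5} wells: {ms} ms/well, {mb} Mb per 1000 wells")
```

The per-well time falls from about 43 ms at 50 wells to about 28 ms at 2500 and 5000 wells, so the response time grows roughly linearly with the number of wells once the fixed overhead is amortised.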
Conclusions
Groundwater data interoperability achieved for water well information and, preliminarily, for sensors
Dynamic mediation is effective and efficient: modest data volumes are realistic within acceptable wait times
Open geospatial standards for schemas and systems are essential
URLs
Groundwater Information Network (GIN) www.gw-info.net
Groundwater Markup Language (GWML) http://ngwd-bdnes.cits.rncan.gc.ca/gwml
GeoSciML www.geosciml.org
WaterML http://external.opengis.org/twiki_public/bin/view/HydrologyDWG
GIN Mediator http://ngwd-bdnes.cits.rncan.gc.ca/service/api_ngwds/en/mediator.html
Thank you!
Semantics: types of ontologies
Global Ontology
Upper-Level ontology (DOLCE ‘amount-of-matter’): general concepts
Domain ontology (GeoSciML ‘lithology’, GeoSciML ‘sand’): public schema, public vocabulary
Application ontology (QC ‘matprim’, QC ‘SABL’): local schema, local vocabulary (SABL, ARGL, TERR)
Application ontology (ON ‘material1’, ON ‘sand’): local schema, local vocabulary (sand, clay, soil)
Schematics: GWML example
standard structure, standard content
<gsml:lithology>
  <gsml:ControlledConcept gml:id="gin.cc.2d-2">
    <gsml:identifier codeSpace="urn:ietf:rfc:2141">urn:x-ngwd:vocabulary:gin:2d-2</gsml:identifier>
    <gsml:name codeSpace="urn:x-ngwd:classifierScheme:GIN:Lithology:2008" xml:lang="fr">Argile</gsml:name>
    <gsml:name codeSpace="urn:x-ngwd:classifierScheme:GIN:Lithology:2008" xml:lang="eng">Clay</gsml:name>
    <gml:description>A naturally occurring material composed primarily of fine-grained minerals. It is generally plastic at appropriate water contents and will harden when dried or fired (Neuendorf et al. 2005)</gml:description>
  </gsml:ControlledConcept>
</gsml:lithology>
<gsml:material>
  <gsml:UnconsolidatedMaterial>
    <gsml:consolidationDegree>
      <gsml:CGI_TermValue>
        <gsml:value codeSpace="urn:cgi:classifierScheme:BGS:consolidationTerms">UNCONSOLIDATED</gsml:value>
      </gsml:CGI_TermValue>
    </gsml:consolidationDegree>
    <gsml:physicalProperty>
      <gwml:HydrogeologicDescription>
        <gwml:hydraulicConductivity>
          <gsml:CGI_NumericValue>
            <gsml:qualifier>approximate</gsml:qualifier>
            <gsml:principalValue uom="y_K_md-1">0.001</gsml:principalValue>
          </gsml:CGI_NumericValue>
        </gwml:hydraulicConductivity>
      </gwml:HydrogeologicDescription>
    </gsml:physicalProperty>
  </gsml:UnconsolidatedMaterial>
</gsml:material>
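A client consuming a GWML fragment like this one might read the multilingual names and the hydraulic conductivity as in the Python sketch below. It is a minimal illustration only: the `<root>` wrapper and the namespace URIs are placeholders invented so the fragment parses (the real GeoSciML and GWML namespace URIs should be substituted), and the fragment is abbreviated to the elements being read.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions, not the real GeoSciML/GWML ones).
NS = {"gsml": "urn:example:gsml", "gwml": "urn:example:gwml"}

# Abbreviated copy of the slide's fragment, wrapped in a hypothetical <root>.
DOC = """<root xmlns:gsml="urn:example:gsml" xmlns:gwml="urn:example:gwml">
  <gsml:lithology>
    <gsml:ControlledConcept>
      <gsml:name xml:lang="fr">Argile</gsml:name>
      <gsml:name xml:lang="eng">Clay</gsml:name>
    </gsml:ControlledConcept>
  </gsml:lithology>
  <gsml:material>
    <gsml:UnconsolidatedMaterial>
      <gsml:physicalProperty>
        <gwml:HydrogeologicDescription>
          <gwml:hydraulicConductivity>
            <gsml:CGI_NumericValue>
              <gsml:principalValue uom="y_K_md-1">0.001</gsml:principalValue>
            </gsml:CGI_NumericValue>
          </gwml:hydraulicConductivity>
        </gwml:HydrogeologicDescription>
      </gsml:physicalProperty>
    </gsml:UnconsolidatedMaterial>
  </gsml:material>
</root>"""

# ElementTree exposes xml:lang under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def lithology_names(root):
    """Map each language code to the lithology name given in that language."""
    return {n.get(XML_LANG): n.text for n in root.iterfind(".//gsml:name", NS)}


def conductivity(root):
    """Return the hydraulic conductivity as (value, unit-of-measure code)."""
    node = root.find(".//gsml:principalValue", NS)
    return float(node.text), node.get("uom")


root = ET.fromstring(DOC)
print(lithology_names(root))
print(conductivity(root))
```

Because structure and vocabulary are both standardised, the same two functions would apply to records from any province once they are delivered in this form.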
Approach: users
1. Portal users: end-users (water managers, scientists, consultants, public)
2. Pipeline users: data processors (portal and tool developers)
[Figure labels: GIN Basic, GIN Advanced, Troo Corp, OGSR Trust, OGSR Library, Atlantic ENV]
Mediator implementation
Open source: Cocoon, Java, SAX, XML, XSLT
Re-usable: customizable, plug and play data sources and mappings
Efficient: multi-threaded, parallel, cached data stream
Tested: GIN, GeoSciML Testbed, OneGeology
Freely available: http://ngwd-bdnes.cits.rncan.gc.ca/service/api_ngwds/en/mediator.html
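The "multi-threaded, parallel" point can be illustrated with a small sketch of querying several sources concurrently. The actual mediator is built on Cocoon and Java; the Python below only shows the general idea, and `query_source` is a hypothetical stand-in for a request to one provincial service.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def query_source(name: str) -> List[str]:
    """Hypothetical stand-in for a web-service request to one data source."""
    return [f"{name}-well-{i}" for i in range(3)]


def distribute(sources: List[str]) -> List[str]:
    """Query all sources concurrently and concatenate their results."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        # pool.map runs the calls in parallel but yields results in input order.
        batches = list(pool.map(query_source, sources))
    return [row for batch in batches for row in batch]


print(distribute(["ON", "QC"]))
```

With real network calls, the total wait is governed by the slowest source rather than by the sum of all sources, which is the main benefit of distributing the query in parallel.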
Semantics
GIN lithology ontology (subset of GeoSciML)
language-neutral concepts (URN), multi-lingual terms, defs
- concept = urn:x-ngwd:vocabulary:gin:2c
- terms = “sand” (English), “sable” (French)
- definition =
enables: multi-lingual query and data download
need to represent definitions in an ontology
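A language-neutral concept with multilingual terms can be modelled as a simple lookup table, as in the Python sketch below. The two URNs and their English and French terms are taken from the slides; the dictionary layout, language codes and function names are assumptions made for illustration.

```python
from typing import Optional

# Concept URN -> term per language (URNs and terms as shown in the slides).
CONCEPTS = {
    "urn:x-ngwd:vocabulary:gin:2c": {"en": "sand", "fr": "sable"},
    "urn:x-ngwd:vocabulary:gin:2d-2": {"en": "clay", "fr": "argile"},
}


def find_concept(term: str) -> Optional[str]:
    """Return the concept URN for a term given in any supported language."""
    needle = term.strip().lower()
    for urn, labels in CONCEPTS.items():
        if needle in (label.lower() for label in labels.values()):
            return urn
    return None


def label(urn: str, lang: str) -> str:
    """Return the term for a concept in the requested language."""
    return CONCEPTS[urn][lang]


# A French query term resolves to the same concept as the English one,
# and results can be labelled in either language.
urn = find_concept("Sable")
print(urn, label(urn, "en"))
```

Queries and downloads are then keyed on the URN, and the language is only a presentation choice at either end.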
Semantic mapping
LAV: local terms mapped to global concepts
mapping specification: XML file (moving to OWL)
e.g. ON ‘sand’ mapping
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="Sand" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="sand" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="sadn" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="sad" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="Fine Sand" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="medium fine sand" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="Medium Sand" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="Coarse Sand" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="Sandy" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="Ssandy" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="sand silt" />
<map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="Quicksand" />
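Rules of this form are straightforward to load into a local-to-global lookup. The Python sketch below shows one way to do so; the `<map:rules>` wrapper element and the namespace URI are assumptions added so the sample parses, and only three of the slide's rules are included.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# The wrapper element and namespace URI are hypothetical; the rules are from the slide.
MAPPING_XML = """<map:rules xmlns:map="urn:example:gin-mapping">
  <map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="Sand" />
  <map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="sadn" />
  <map:rule global="urn:x-ngwd:vocabulary:gin:2c" local="Fine Sand" />
</map:rules>"""

NS = {"map": "urn:example:gin-mapping"}


def load_rules(xml_text: str) -> Dict[str, str]:
    """Build a dictionary from local term to global concept URN."""
    root = ET.fromstring(xml_text)
    return {
        rule.get("local"): rule.get("global")
        for rule in root.findall("map:rule", NS)
    }


def to_global(rules: Dict[str, str], local_term: str) -> Optional[str]:
    """Look up a local term exactly as stored; None if no rule covers it."""
    return rules.get(local_term)


rules = load_rules(MAPPING_XML)
print(to_global(rules, "sadn"))
```

The lookup is an exact match on purpose: the slide lists case variants and misspellings as separate rules, which suggests each observed local spelling gets its own entry.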
Schema mapping
LAV: local schema mapped to global schema
mapping specification: modified GWML data file
<gsml:lithology>
  <gsml:ControlledConcept>
    <gsml:identifier>ont:geostratumlog/ont:GeologyStratum/ont:material_1</gsml:identifier>
    <gsml:identifier>ont:geostratumlog/ont:GeologyStratum/ont:material_2</gsml:identifier>
    <gsml:identifier>ont:geostratumlog/ont:GeologyStratum/ont:material_3</gsml:identifier>
  </gsml:ControlledConcept>
</gsml:lithology>
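One way to read this mapping file is as a GWML template whose element values are paths into the local schema. Under that assumption (the slide does not spell out the mechanism), a wrapper could resolve each path against a local record as in the Python sketch below; the nested-dictionary record and the function names are hypothetical.

```python
from typing import Any, Dict, List

# Paths as they appear in the slide's modified GWML file.
PATHS = [
    "ont:geostratumlog/ont:GeologyStratum/ont:material_1",
    "ont:geostratumlog/ont:GeologyStratum/ont:material_2",
    "ont:geostratumlog/ont:GeologyStratum/ont:material_3",
]

# Hypothetical local record for one stratum of an Ontario well log.
LOCAL_RECORD: Dict[str, Any] = {
    "geostratumlog": {
        "GeologyStratum": {
            "material_1": "sand",
            "material_2": "clay",
            "material_3": None,
        }
    }
}


def resolve(path: str, record: Dict[str, Any]) -> Any:
    """Follow a slash-separated path, ignoring the 'ont:' prefix on each step."""
    node: Any = record
    for step in path.split("/"):
        node = node[step.split(":", 1)[-1]]
    return node


def fill(paths: List[str], record: Dict[str, Any]) -> List[str]:
    """Return the local values for all mapped paths, skipping empty fields."""
    values = (resolve(p, record) for p in paths)
    return [v for v in values if v is not None]


print(fill(PATHS, LOCAL_RECORD))
```

The local values returned here would still need semantic mapping to global concepts before being written into the GWML output.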
GWML scope
water, water properties, water budget, reservoirs, aquifers, wells, observations
GWML lineage
parts of GWML extend GeoSciML, O&M
GeoSciML: GeologicUnit, EarthMaterial, PhysicalDescription
O&M: Observation
GWML/GeoSciML design
Conceptual → Logical → Physical GML schema design
conceptual model: OWL/UML, no GML
<owl:Class rdf:about="#GeologicUnit">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="http://www.loa-cnr.it/ontologies/ExtendedDnS.owl#plays"/>
      <owl:allValuesFrom rdf:resource="#GeologicUnitPart"/>
    </owl:Restriction>
  </rdfs:subClassOf>
  …
concept to GML
logical model: GML-UML
GML-UML to XML
physical model: GML-XML
<LithodemicUnit gml:id="GSV53">
  <gml:description>Granite, syenite, volcanogenic sandstone, conglomerate, minor trachyte lava</gml:description>
  <gml:name>Mount Leinster Igneous Complex</gml:name>
  <purpose>typicalNorm</purpose>
  <age>
    <GeologicAge>
      <value>
        <CGI_TermRange>
          <lower>
            <CGI_TermValue>
              <value codeSpace="http://www.iugs-cgi.org/geologicAgeVocabulary">Triassic</value>
            </CGI_TermValue>
          </lower>
          <upper>
            <CGI_TermValue>
              <value codeSpace="http://www.iugs-cgi.org/geologicAgeVocabulary">Triassic</value>
            </CGI_TermValue>
          </upper>
        </CGI_TermRange>
      </value>
      <event>
        <CGI_TermValue>
          <value codeSpace="http://www.iugs-cgi.org/geologicAgeEventVocabulary">intrusion</value>
        </CGI_TermValue>
      </event>
    </GeologicAge>
  </age>
  <age>
  …
Next Steps
More geographic coverage: other Canadian partners
Higher quality data: time-indexed data (water levels, flow rates, quality…); SOS
More types of data: aquifers, geology, 3D,…; WCS
More tools: 3D modeling,…
More infrastructure: CSW, OWL Reasoner/Service