Constructing Semantic Gazetteers: Managing GeoSpatial Vocabularies Using Open Semantic Web Standards
Stephane Fellah, Barry J. Glick, Yaser Bishr
smartRealm LLC
[email protected]
Association of American Geographers (AAG) Annual Conference, Washington DC, April 15, 2010
Demonstrate the value of geo-enabling a librarian database by:
Geocoding and spatially indexing a complete librarian database, i.e. ASFA
Implementing geographic search of documents, integrated with topic and author search and map-based visualization of results
Assisting users in discovering relevant information by surfacing the controlled vocabularies of ASFA
Testing the prototype with users to assess utility, ease of use, etc.
Demonstrate the value of linked data and semantics by:
Enabling geospatial reasoning
Encoding taxonomies in a machine-processable format
Resolving ambiguity of terms
Making linked data reusable
ASFA: Aquatic Sciences and Fisheries Abstracts
ASFA series is the premier reference in the field of aquatic resources.
Input to ASFA is provided by a growing international network of information centers monitoring over 5,000 serial publications, books, reports, conference proceedings, translations and limited distribution literature.
ASFA is a component of the Aquatic Sciences and Fisheries Information System (ASFIS), formed by four United Nations agency sponsors of ASFA and a network of international and national partners.
1.3 million records encoded in XML.
ASFIS 6
Descriptors used for subject indexing and retrieval of information on all aspects of aquatic sciences and technology
6267 vocabulary terms allowing the
We used existing SKOS encoding of the taxonomy
ASFIS 7
Geographic descriptors used in the ASFA system
Not officially standardized
Inconsistencies due to manual entry
Hardwired in the system
Goals of this project:
Encode the ASFIS 7 taxonomy semantically
Geocode the taxonomy
Enable spatial search in the ASFA database
Support Multiple Use Cases
Researcher has a specific research goal: provide a quicker, simpler way to filter results to get to the relevant documents
"I'm looking for research on coral reef diseases in the western Caribbean region"
Researcher has a specific area of interest: allow the user to use the map or geographic terms to define an area of interest and use it to find relevant research
"I am studying the Danube delta region… what research is available in ASFA for this area? (and what topics does the research address?)"
Geo-exploration of research: the researcher is interested in a specific topic and uses the map to explore relevant documents
"My research interest is oyster farming. Where in the world has research been conducted on this topic?"
Others:
Where does a specific author conduct his/her research?
Which authors have published the most research on a specific area of interest?
What is the geographic distribution of research on a specific topic? (and where are the gaps?)
Open Standards used
RDF: Graph Representation
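[Figure: the minimalist TRIPLE model and its equivalent in the relational model. A triple that points to an Object corresponds to an association; a triple that points to a Literal corresponds to an attribute.]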
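To make the triple model concrete, here is a minimal sketch in plain Python (no RDF library). The class names `Resource`, `Literal`, and `Triple`, the `kind` helper, and the example.org URIs are assumptions introduced for illustration; they are not taken from the slides.

```python
from typing import NamedTuple, Union


class Resource(NamedTuple):
    """A node identified by a URI (an 'Object' in the diagram's terms)."""
    uri: str


class Literal(NamedTuple):
    """A plain value such as a string or a number."""
    value: Union[str, float]


class Triple(NamedTuple):
    subject: Resource
    predicate: str
    obj: Union[Resource, Literal]


def kind(triple: Triple) -> str:
    """Name the relational-model equivalent of a triple."""
    return "association" if isinstance(triple.obj, Resource) else "attribute"


# Hypothetical URIs, for illustration only.
sound = Resource("http://example.org/place/prince-william-sound")
gulf = Resource("http://example.org/place/gulf-of-alaska")

graph = [
    Triple(sound, "prefLabel", Literal("Prince William Sound")),  # attribute
    Triple(sound, "broader", gulf),                               # association
]

kinds = [kind(t) for t in graph]
```

Because every statement has the same three-part shape, new predicates can be added without changing a schema, which is the main contrast with a fixed relational table.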
Linked Open Data
Geospatial Semantic Web Architecture
[Figure: Semantic Web architecture stack with geospatial extensions. Labels: Geospatial Datatypes, Geospatial Functions, Geospatial Ontology, Extensions, Geospatial Logic. Source: Berners-Lee, AAAI, July 2006]
SKOS
SKOS = Simple Knowledge Organization System
A common data model for sharing and linking knowledge organization systems (KOS) via the Semantic Web.
Arrange geographic places in order from most general to most specific, e.g.
World/Continent/Country/State or Province/City
World/Ocean/Ocean Region/Sea/Bay
World/Continent/Country/River or Lake
This allows the user to move up and down the hierarchy in search and to find related, more specific, and more general terms
Also helps in distinguishing geographic place names that are ambiguous, e.g. Mississippi as a river vs. Mississippi as a state
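The hierarchy and the disambiguation idea can be sketched as follows. This is a toy, in-memory stand-in for a SKOS vocabulary; the ids, the `Place` tuple, and the helper names are assumptions made for the example, not the ASFIS encoding.

```python
from typing import Dict, List, NamedTuple, Optional


class Place(NamedTuple):
    label: str
    feature_type: str
    broader: Optional[str]  # id of the parent concept, None for the top concept


# Toy hierarchy; ids and structure are illustrative only.
PLACES: Dict[str, Place] = {
    "world": Place("World", "World", None),
    "north-america": Place("North America", "Continent", "world"),
    "usa": Place("United States", "Country", "north-america"),
    "mississippi-state": Place("Mississippi", "State", "usa"),
    "mississippi-river": Place("Mississippi", "River", "usa"),
}


def path(place_id: str) -> List[str]:
    """Labels from the top concept down to place_id (moving 'up' via broader)."""
    labels = []
    current: Optional[str] = place_id
    while current is not None:
        labels.append(PLACES[current].label)
        current = PLACES[current].broader
    return labels[::-1]


def narrower(place_id: str) -> List[str]:
    """Ids of the concepts directly below place_id (moving 'down')."""
    return sorted(pid for pid, p in PLACES.items() if p.broader == place_id)


def candidates(label: str) -> List[str]:
    """Every concept sharing a label, shown with feature type and path."""
    return sorted(
        f"{p.label} ({p.feature_type}): {'/'.join(path(pid))}"
        for pid, p in PLACES.items()
        if p.label == label
    )
```

Calling `candidates("Mississippi")` returns two entries, one typed as a river and one as a state, so a search interface could ask the user which one is meant.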
Geo-SKOS
Define an extension of SKOS for geospatial concepts.
GeoConcept is a subclass of Concept
GeoConcept has location properties
Specialization of narrower, broader, and related:
Narrower => Narrower-partitive, …
Broader => Broader-partitive, …
Related => Nearby, SW of, west of, …
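One way to picture this extension is as a class hierarchy. The sketch below is an assumption-laden illustration: the field names (`centroid`, `bbox`, `broader_partitive`, `nearby`), the example.org URIs, and the coordinates are made up for the example and do not reproduce the actual Geo-SKOS vocabulary.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Concept:
    """Generic SKOS-style concept with the three standard relations."""
    uri: str
    pref_label: str
    broader: List[str] = field(default_factory=list)
    narrower: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)


@dataclass
class GeoConcept(Concept):
    """Concept with location properties and specialised relations."""
    centroid: Optional[Tuple[float, float]] = None            # (lon, lat)
    bbox: Optional[Tuple[float, float, float, float]] = None  # (west, south, east, north)
    broader_partitive: List[str] = field(default_factory=list)
    narrower_partitive: List[str] = field(default_factory=list)
    nearby: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A specialised relation implies the generic one it refines.
        for special, generic in (
            (self.broader_partitive, self.broader),
            (self.narrower_partitive, self.narrower),
            (self.nearby, self.related),
        ):
            for uri in special:
                if uri not in generic:
                    generic.append(uri)


# Hypothetical URIs and approximate, non-authoritative coordinates.
island = GeoConcept(
    uri="http://example.org/place/naked-island",
    pref_label="Naked Island, Alaska",
    centroid=(-147.4, 60.65),
    broader_partitive=["http://example.org/place/prince-william-sound"],
)
```

Propagating the specialised relation into the generic one means a client that only understands plain broader/narrower/related still sees a usable hierarchy.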
Geocoding process
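[Figure: geocoding pipeline. Stages and artifacts, in the order recoverable from the diagram labels:]
ASFA XML -> Q3 extraction -> Q3 list
Q3 list, Top Concepts (Countries, Sea Zones) -> SKOS Encoding -> ASFIS7 SKOS
ASFIS7 SKOS -> Geocoder -> Geocoded ASFIS7 SKOS
Geocoded ASFIS7 SKOS -> Post Processing (bbox, centroid) -> Post-processed Geocoded ASFIS7 SKOS
Post-processed Geocoded ASFIS7 SKOS -> Reasoning -> Inferred Geocoded ASFIS7 SKOS
Inferred Geocoded ASFIS7 SKOS -> Indexing -> ASFIS7 Index, Oracle Spatial Index
Other labels in the diagram: Indexing Mapping, SmartRealm Gazetteer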
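The post-processing stage derives a bounding box and a centroid for each geocoded place. The slides do not say how these are computed; the sketch below assumes a simple planar calculation (shoelace formula) over a polygon ring of (lon, lat) pairs, which ignores the antimeridian and spherical geometry. Function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (lon, lat)


def bounding_box(ring: List[Point]) -> Tuple[float, float, float, float]:
    """Return (west, south, east, north) for a list of points."""
    lons = [p[0] for p in ring]
    lats = [p[1] for p in ring]
    return (min(lons), min(lats), max(lons), max(lats))


def centroid(ring: List[Point]) -> Point:
    """Planar centroid of a simple polygon via the shoelace formula.

    The ring may be open or closed. Degenerate (zero-area) input falls
    back to the mean of the vertices.
    """
    pts = ring[:-1] if len(ring) > 1 and ring[0] == ring[-1] else ring
    twice_area = 0.0
    cx = cy = 0.0
    for i, (x0, y0) in enumerate(pts):
        x1, y1 = pts[(i + 1) % len(pts)]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        return (sum(p[0] for p in pts) / len(pts),
                sum(p[1] for p in pts) / len(pts))
    return (cx / (3 * twice_area), cy / (3 * twice_area))


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
```

For production data a geometry library or the spatial database itself would normally do this work; the sketch only shows what the two derived values are.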
Approach
Encode legacy data from q3 fields in ASFA
Not using Authoritative list because no direct matching between terms
<ti>Divergence Among Barking Frogs (Eleutherodactylus Augusti) In The Southwestern United States</ti>
<ab>Barking frogs (Eleutherodactylus augusti) are distributed from southern Mexico along the Sierra Madre Occidental into Arizona and the Sierra Madre Oriental into Texas and New Mexico. .... </ab>
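Extracting the legacy geographic terms from records like the one above could look roughly like this. The `<ti>` element comes from the sample; the `<record>` wrapper, the `<q3>` element name, its contents, and the semicolon separator are all assumptions made for this sketch, since the slides do not show the actual q3 field.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical record layout: only <ti> is taken from the sample above.
SAMPLE = """
<record>
  <ti>Divergence Among Barking Frogs (Eleutherodactylus Augusti) In The Southwestern United States</ti>
  <q3>USA, Arizona; USA, Texas; USA, New Mexico</q3>
</record>
"""


def extract_q3_terms(xml_text: str) -> List[str]:
    """Collect distinct geographic terms from every <q3> element, in order."""
    root = ET.fromstring(xml_text.strip())
    terms: List[str] = []
    for element in root.iter("q3"):
        for part in (element.text or "").split(";"):
            term = part.strip()
            if term and term not in terms:
                terms.append(term)
    return terms


terms = extract_q3_terms(SAMPLE)
```

Running this over the whole database and merging the results would give the "Q3 list" that feeds the SKOS encoding step.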
Fields indexed in Lucene/Solr:
Id
Type
Preferred labels, alternate labels
Geometry
Centroid
Bounding box
Narrower, narrower transitive
Broader, broader transitive
Related
Feature types
Equivalent terms
Id, centroid, and geometries are spatially indexed in Oracle Spatial
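A flattened index document carrying both the direct and the transitive relations might be assembled as below. The dictionary keys are hypothetical field names chosen to mirror the list above; they are not the project's actual Solr schema, and the three-place hierarchy is a toy example.

```python
from typing import Dict, List

# Toy "broader" relation: child id -> list of direct parents.
BROADER: Dict[str, List[str]] = {
    "naked-island": ["prince-william-sound"],
    "prince-william-sound": ["gulf-of-alaska"],
    "gulf-of-alaska": [],
}


def invert(relation: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Turn a broader map into the matching narrower map."""
    inverse: Dict[str, List[str]] = {}
    for child, parents in relation.items():
        for parent in parents:
            inverse.setdefault(parent, []).append(child)
    return inverse


def transitive(start: str, relation: Dict[str, List[str]]) -> List[str]:
    """All concepts reachable from start, nearest first (breadth-first)."""
    seen: List[str] = []
    queue = list(relation.get(start, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        queue.extend(relation.get(current, []))
    return seen


def build_index_doc(concept_id: str, pref_label: str) -> dict:
    """Flatten one concept into a document ready for a text index."""
    narrower = invert(BROADER)
    return {
        "id": concept_id,
        "pref_label": pref_label,
        "broader": list(BROADER.get(concept_id, [])),
        "broader_transitive": transitive(concept_id, BROADER),
        "narrower": list(narrower.get(concept_id, [])),
        "narrower_transitive": transitive(concept_id, narrower),
    }


doc = build_index_doc("prince-william-sound", "Prince William Sound")
```

Storing the transitive closure in the document trades index size for query speed: the search engine does not need to walk the hierarchy at query time.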
Prototype Application
Advantages of Faceted Search
Lets the user decide how to start, and how to explore and group
After refinement, categories that are not relevant to the current results disappear
Seamlessly integrates keyword search with the organizational structure.
Very easy to expand out (loosen constraints)
Very easy to build up complex queries
Can't end up with empty result sets (except with keyword search)
Helps avoid feelings of being lost
Easier to explore the collection
Helps users infer what kinds of things are in the collection
Evokes a feeling of "browsing the shelves"
Is preferred over standard search for collection browsing in usability studies (the interface must be designed properly)
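The reason faceted navigation cannot lead to an empty result set is that facet values are counted over the current results only. A minimal sketch with toy records follows; the documents, field names, and helper functions are invented for illustration and are not ASFA data.

```python
from collections import Counter
from typing import Dict, List

# Toy records, not ASFA data.
DOCS: List[Dict[str, str]] = [
    {"id": "d1", "topic": "oyster farming", "region": "Europe"},
    {"id": "d2", "topic": "oyster farming", "region": "Asia"},
    {"id": "d3", "topic": "coral reef diseases", "region": "Caribbean"},
]


def refine(docs: List[Dict[str, str]], **constraints: str) -> List[Dict[str, str]]:
    """Keep only the documents matching every facet constraint."""
    return [d for d in docs if all(d.get(k) == v for k, v in constraints.items())]


def facet_counts(docs: List[Dict[str, str]], facet: str) -> Counter:
    """Count facet values over the current result set only."""
    return Counter(d[facet] for d in docs)


selected = refine(DOCS, topic="oyster farming")
regions = facet_counts(selected, "region")
```

After refining to oyster farming, "Caribbean" no longer appears among the region choices, so every remaining choice is guaranteed to return at least one document.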
Geospatial Hierarchical facet
Benefit of semantic approach
Unique identifier for place
Distinction in search between direct place and indirect place (by transitivity)
Multilingual search
Alternate-name searches still point to the same URI (New York, NYC, Big Apple)
Linkable to other data (reusable for different applications)
Reasoning
Easy integration
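The unique-identifier, multilingual, and alternate-name benefits all come down to many labels resolving to one URI. A small sketch, assuming a hypothetical example.org URI and a made-up label structure (the Spanish label is added only to illustrate multilingual lookup):

```python
from collections import defaultdict
from typing import Dict, Set

NYC = "http://example.org/place/new-york-city"  # hypothetical URI

CONCEPTS = {
    NYC: {
        "pref": {"en": "New York", "es": "Nueva York"},
        "alt": {"en": ["NYC", "Big Apple"]},
    },
}


def build_label_index(concepts: dict) -> Dict[str, Set[str]]:
    """Map every case-folded label, in any language, to its concept URIs."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for uri, labels in concepts.items():
        for label in labels.get("pref", {}).values():
            index[label.casefold()].add(uri)
        for alternates in labels.get("alt", {}).values():
            for label in alternates:
                index[label.casefold()].add(uri)
    return index


INDEX = build_label_index(CONCEPTS)


def resolve(name: str) -> Set[str]:
    """Return the URIs a user-typed name may refer to (empty if unknown)."""
    return set(INDEX.get(name.strip().casefold(), set()))
```

Returning a set rather than a single URI leaves room for ambiguous names, which the interface can then disambiguate using the hierarchy.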
Accomplishments
Geo-semantic enabled ASFA prototype is a breakthrough
Not just pins on a map: fully integrated geospatial and semantic search with GIS display and operations
Uses a geographic knowledge base and map interface to aid search and discovery
Unique aspects:
Tagging research documents not just to points, but to linear features and areal regions on the earth's surface
Allowing for user-defined areas of interest, including polygons
Creating a geo-semantic structure for the locations to enable enhanced search through inheritance and inference
e.g. if something is tagged with "Naked Island, Alaska", we know that it is part of North America and the USA, but also that it is within Prince William Sound, which is within the Gulf of Alaska, which is part of the eastern North Pacific ocean region. Thus a search for research on oil spills in Prince William Sound will also include any documents tagged with Naked Island, Alaska, even without any explicit mention of Prince William Sound in the document.
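The Naked Island example can be sketched as a search that separates direct tags from tags inherited through the hierarchy. The place ids, document ids, and tagging below are invented for illustration; only the containment chain follows the example in the text.

```python
from typing import Dict, List, Set

# Containment chain from the example above; ids are hypothetical.
BROADER: Dict[str, List[str]] = {
    "naked-island": ["prince-william-sound", "usa"],
    "prince-william-sound": ["gulf-of-alaska"],
    "gulf-of-alaska": ["eastern-north-pacific"],
    "usa": ["north-america"],
}

# Toy document tags.
TAGS: Dict[str, Set[str]] = {
    "doc-1": {"naked-island"},
    "doc-2": {"prince-william-sound"},
    "doc-3": {"usa"},
}


def ancestors(place: str) -> Set[str]:
    """Every place that contains the given place, directly or not."""
    seen: Set[str] = set()
    stack = list(BROADER.get(place, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(BROADER.get(current, []))
    return seen


def search(place: str) -> Dict[str, List[str]]:
    """Split matches into directly tagged and inferred (indirect) documents."""
    direct: List[str] = []
    indirect: List[str] = []
    for doc_id in sorted(TAGS):
        tags = TAGS[doc_id]
        if place in tags:
            direct.append(doc_id)
        elif any(place in ancestors(tag) for tag in tags):
            indirect.append(doc_id)
    return {"direct": direct, "indirect": indirect}


result = search("prince-william-sound")
```

Here the document tagged only with Naked Island is returned as an indirect match for Prince William Sound, while the document tagged with the USA is not, because the USA is not contained in the sound.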