Selected Semantic Web Trends, Progress, and Directions Deborah McGuinness Acting Director and Senior Research Scientist Knowledge Systems, AI Laboratory Stanford University http://www.ksl.stanford.edu/people/dlm CEO McGuinness Associates (soon Tetherless World Constellation Chair RPI)
Selected Semantic Web Trends, Progress, and Directions
Deborah McGuinness
Acting Director and Senior Research Scientist, Knowledge Systems, AI Laboratory
• Trust (examples from Wikipedia study for explainable knowledge aggregation with extensions to text analytics, connection to NSF TAMI)
• Semantic Integration of Scientific Data
– Virtual Observatories (e.g., NSF-funded Virtual Solar Terrestrial Observatory)
– Semantically-Enabled Scientific Data Integration (NASA-funded)
• Conclusion / Discussion
June 22, 2007 Deborah L. McGuinness 5
Virtual Observatories
Scientists should be able to access a global, distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available
But… data is obtained by multiple instruments, using various protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) meta-data. It may be inconsistent, incomplete, evolving, and distributed.
Virtual Observatory Defined
• Workshop: A Virtual Observatory (VO) is a suite of software applications on a set of computers that allows users to uniformly find, access, and use resources (data, software, document, and image products and services using these) from a collection of distributed product repositories and service providers. A VO is a service that unites services and/or multiple repositories.
• VxOs: discipline-specific VOs, where x stands for one discipline
Virtual Observatories in Practice
Make data and tools quickly and easily accessible to a wide audience.
Operationally, virtual observatories need to find the right balance of data/model holdings, portals, and client software that researchers can use without effort or interference, as if all the materials were available on their local computer in the user’s preferred language.
They are likely to provide controlled vocabularies that may be used for interoperation in appropriate domains along with database interfaces for access and storage and “smart” search functions and tools for evolution and maintenance.
Virtual Solar Terrestrial Observatory (VSTO)
• a distributed, scalable education and research environment for searching, integrating, and analyzing observational, experimental, and model databases.
• subject matter covers the fields of solar, solar-terrestrial and space physics
• it provides virtual access to specific data, model, tool and material archives containing items from a variety of space- and ground-based instruments and experiments, as well as individual and community modeling and software efforts bridging research and educational use
• 3-year NSF-funded project, just beginning its second year
Content: Coupling Energetics and Dynamics of Atmospheric Regions WEB
Community data archive for observations and models of Earth's upper atmosphere and geophysical indices and parameters needed to interpret them. Includes browsing capabilities by periods, instruments, models, …
Content: Mauna Loa Solar Observatory
Near real-time data from Hawaii from a variety of solar instruments.
Source for space weather, solar variability, and basic solar physics
Other content used too – CISM – Center for Integrated Space Weather Modeling
• Determine the statistical signatures of both volcanic and solar forcings on the height of the tropopause
From paleoclimate researcher Caspar Ammann, Climate and Global Dynamics Division of NCAR (CGD/NCAR)
Layperson perspective:
- look for indicators of acid rain in the part of the atmosphere we experience…
(look at measurements of sulfur dioxide in relation to sulfuric acid after volcanic eruptions at the boundary of the troposphere and the stratosphere)
NASA-funded effort with Fox (NCAR), Sinha (Va. Tech), Raskin (JPL)
Use Case detail: A volcano erupts
• Preferentially it’s a tropical mountain (+/- 30 degrees of the equator) with ‘acidic’ magma (more SiO2), and it erupts with great intensity so that material and large amounts of gas are injected into the stratosphere.
• The SO2 gas converts to H2SO4 (sulfuric acid) + H2O (75% H2SO4 + 25% H2O). The half-life of SO2 is about 30-40 days.
• The sulfuric acid condenses into small super-cooled liquid droplets. These are the volcanic aerosol that will linger for a year or two.
• The Brewer-Dobson circulation of the stratosphere will transport aerosol to higher latitudes. The particles generate great sunsets, most commonly first seen in the fall of the respective hemisphere. The sunlight is partially reflected, and part is scattered in the forward direction.
• The result is that the direct solar beam is reduced, yet diffuse skylight increases. The scattering is responsible for the colorful sunsets as more and more of the blue wavelengths are scattered away.
• In mid-latitudes the volcanic aerosol starts to settle, but the most efficient removal from the stratosphere is through tropopause folds in the vicinity of the storm tracks.
• If particles get over the pole, which happens in spring of the respective hemisphere, then they will settle and fall onto the polar ice caps. It’s from these ice caps that we recover annual records of sulfate flux or deposit.
• We get ice cores that show continuous deposition information. Nowadays we measure sulfate, SO4(2-). Earlier measurements were indirect, putting an electric current through the ice and measuring the delay. With acids present, the electric flow would be faster.
• What we are looking for are pulse-like events with a build-up over a few months (mostly in summer, when the vortex is gone), and then a decay of the peak of about 1/e in 12 months.
• The distribution of these pulses was found to follow an extreme value distribution (Fréchet) with a heavy tail.
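The two decay figures in this use case (an SO2 half-life of about 30-40 days, and a sulfate peak that decays by about 1/e in 12 months) can be turned into numbers with a few lines of Python. This is only a sketch: it assumes simple exponential decay in both cases, reads "1/e in 12 months" as an e-folding time of 12 months, and uses 35 days as an illustrative midpoint of the stated half-life range. The function names are hypothetical.

```python
import math

def so2_fraction_remaining(t_days, half_life_days=35.0):
    """Fraction of injected SO2 left after t_days, assuming exponential decay.

    half_life_days=35 is an assumed midpoint of the 30-40 day range on the slide.
    """
    return 0.5 ** (t_days / half_life_days)

def sulfate_peak_fraction(t_months, e_folding_months=12.0):
    """Fraction of the peak sulfate signal left after t_months.

    Assumes the peak decays to 1/e of its value every e_folding_months.
    """
    return math.exp(-t_months / e_folding_months)

if __name__ == "__main__":
    for days in (35, 70, 105):
        print(days, "days:", round(so2_fraction_remaining(days), 3))
    for months in (12, 24):
        print(months, "months:", round(sulfate_peak_fraction(months), 3))
```

Under these assumptions most of the SO2 is converted within a few months, while the aerosol signal persists for a year or two, which matches the qualitative description in the bullets above.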
Use Case detail: … climate
• So reflection reduces the total amount of energy; forward scattering just changes the beam, path length, but that’s it.
• The dry fogs in the sky (even after a thunderstorm) are still up there, thus in the stratosphere, not the troposphere.
• The tropical reservoir will keep delivering aerosol for about two years after the eruption.
• The particles are excellent scatterers at short wavelengths. They do absorb in NIR and in IR. Because of absorption, there is a local temperature change in the lower stratosphere.
• This temperature change will cause some convective motion to further spread the aerosol, and second: It’s good factual stuff. Once it warms up, it will generate a temperature gradient. Horizontal temperature gradients increase the baroclinicity and thus storms, and they speed up the local zonal winds. This change in zonal wind in high latitudes is particularly large in winter. This increased zonal wind (westerly) will remove all cold air that tries to build up over winter in the high Arctic.
• Therefore, the temperature anomaly in winter time is actually quite okay.
• Impact of volcanoes is to cool the surface through scattering of radiation.
• In winter time over the continents there might be some warming. In the stratosphere, the aerosol warms.
• The amount of GHG emitted is comparatively small relative to the reservoir in the air.
• The hydrologic cycle responds to a volcanic eruption.
Atmosphere (portions from SWEET)
Atmosphere II
A few observations worth noting
• CMAPS have been convenient knowledge capture tools
• We facilitate knowledge acquisition meetings AND provide a starting point
• We are experiencing good reuse of ontologies and infrastructure
• Next – Quick VSTO walk thru
www.vsto.org
Partial exposure of Instrument class hierarchy - users seem to like this
Semantic filtering by domain or instrument hierarchy
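Filtering by an instrument hierarchy can be sketched in a few lines of Python. This is not the VSTO implementation: the class names, the child-to-parent table, and the dataset list below are hypothetical stand-ins for an ontology's subclass relation, used only to show how selecting a general class also selects everything beneath it.

```python
# Hypothetical fragment of an instrument class hierarchy (child -> parent).
SUBCLASS_OF = {
    "OpticalInstrument": "Instrument",
    "Radar": "Instrument",
    "Interferometer": "OpticalInstrument",
    "Photometer": "OpticalInstrument",
    "FabryPerotInterferometer": "Interferometer",
}

def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)  # None once we walk past the root
    return False

def filter_datasets(datasets, instrument_class):
    """Keep the names of datasets whose instrument falls under instrument_class."""
    return [name for name, cls in datasets if is_a(cls, instrument_class)]

# Hypothetical datasets: (dataset name, instrument class that produced it).
DATASETS = [
    ("dataset-a", "FabryPerotInterferometer"),
    ("dataset-b", "Radar"),
    ("dataset-c", "Photometer"),
]

if __name__ == "__main__":
    print(filter_datasets(DATASETS, "OpticalInstrument"))
```

A user who picks the general class sees both the interferometer and photometer data without having to know the specific instrument names; a production system would delegate the subclass test to an OWL reasoner rather than a hand-written table.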
Inferred plot type and return required axes data
VSTO
• Conceptual model and architecture developed by combined team: KR experts, domain experts, and software engineers
• Semantic framework developed and built with a small, cohesive, carefully chosen team in a relatively short time (deployments in 1st year)
• Production portal released, includes security, etc. with community migration (and so far endorsement)
• VSTO ontology version 1.0 (vsto.owl)
• Web Services encapsulation of semantic interfaces
• More Solar Terrestrial use-cases to drive the completion of the ontologies - filling out the instrument ontology
• Using ontologies in other applications (volcanoes, climate, …)
Semantic Web Methodology and Technology Development Process
• Establish and improve a well-defined methodology vision for Semantic Technology-based application development
[Process diagram; stage labels: Use Case; Small Team, mixed skills; Analysis; Adopt Technology Approach; Leverage Technology Infrastructure; Rapid Prototype; Open World: Evolve, Iterate, Redesign, Redeploy; Use Tools; Expert Review & Iteration; Develop model/ontology]
Joint with P. Fox
Benefits
• Unified query workflow
• Decreased input requirements for query: in one case reducing the number of selections from eight to three
• Interface generates only syntactically correct queries, which was not always true in previous implementations without semantics
• Semantic query support: by using background ontologies and a reasoner, our application has the opportunity to only expose coherent queries
• Semantic integration: in the past users had to remember (and maintain codes) to account for numerous different ways to combine and plot the data, whereas now semantic mediation provides the level of sensible data integration required
– understanding of coordinate systems, relationships, data synthesis, transformations, etc.
• A broader range of potential users (PhD scientists, students, professional research associates and those from outside the fields)
Explanation Transition
Interoperability – as systems use varied sources and multiple information manipulation engines, they benefit more from encodings that are shareable & interoperable
Provenance – if users (humans and agents) are to use and integrate data from unknown, unreliable, or evolving sources, they need provenance metadata for evaluation
Explanation/Justification – if information has been manipulated (i.e., by sound deduction or by heuristic processes), information manipulation trace information should be available
Trust – if some sources are more trustworthy than others, representations should be available to encode, propagate, combine, and (appropriately) display trust values
Provide interoperable knowledge provenance infrastructure that supports explanations of
sources, assumptions, learned information, and answers as an enabler for trust.
General Motivation
Requirements gathered from…
DARPA Agent Markup Language (DAML)
– Enable the next generation of the web
DARPA Personal Assistant that Learns (PAL)
– Enable computer systems that can reason, learn, be told what to do, explain actions, reflect on their experience, & respond robustly to surprise
DARPA Integrated Learning (IL)
– Enable learning general plans or processes from human users by being shown one example by opportunistically assembling knowledge from many different sources, including generating it by reasoning, in order to learn.
DARPA Rapid Knowledge Formation (RKF)
– Allow distributed teams of subject matter experts to quickly and easily build, maintain, and use knowledge bases without need for specialized training
DTO Novel Intelligence for Massive Data (NIMD)
– Avoid strategic surprise by helping analysts be more effective (focus attention on critical information and help analyze/prune/refine/explain/reuse/…)
DTO IKRIS – Interoperable knowledge representation for intelligence apps
NSF & NASA Scientific Data Integration (NSF Virtual Observatories (VSTO), NSF GEON, NASA SESDI, NASA SKIF, …)
NSF Cybertrust Transparent Accountable Data Mining (TAMI)
Govt Classified applications that must defend their conclusions
• DARPA’s PAL program – explaining cognitive assistant suggestions.
• Video
Explainer Strategy (for cognitive assistants)
Present
– Query
– Answer
– Abstraction of justification (using PML encodings)
– Provide access to meta information
– Suggests drill down options (also provides feedback options)
Architecture for Explaining Task Processing
[Architecture diagram; component labels: Collaboration Agent; Justification Generator; Task Manager (TM); TM Wrapper; Explanation Dispatcher; Task State Database; TM Explainer; TaskLearner1; TaskLearner2; TaskLearner3]
Task Explanation
Ability to ask “why” at any point…
Context appropriate follow-up questions are presented
IWBrowser - Browse & Debug (TAMI)
Multiple Interfaces… Browsing Proofs
Browsing & Debugging
Example Abstraction (using tactics)
Intelligence Tool Explanation
(similar to other applications that reason with statements that may not be 100% correct)
Follow-up : Metadata
Follow-up: Assumptions
Explaining Extracted Entities (Techies)
Sentences in English
Sentences in annotated English
Sentences in logical format, i.e., KIF
Trustworthiness of Extracted Entities
The combined conclusion is highly trustworthy
A trustworthy conclusion from IBM STAG KDD-model Annotator
A highly trustworthy conclusion from IBM EAnnotator
Estimated trustworthiness of the IBM extraction and integration components
IBM Cross-Annotator Coreference Resolver: 0.82
IBM Cross-Document Coreference Resolver: 0.63
IBM EAnnotator: 0.91
IBM GlossOnt: 0.33
IBM JResporator: 0.31
IBM KANI holdsDuring Relation Detector: 0.20
IBM Knowledge Integrator: 0.88
IBM Knowledge Structures Group's Relation Detector: 0.94
IBM Statistical Text Analytics Group's ACE-model Annotator: 0.80
IBM Statistical Text Analytics Group's KDD-model Annotator: 0.73
IBM TAF/Talent plus a collection of miscellaneous TFST grammars: 0.78
IBM Talent time annotator: 0.83
Trustworthiness of Ramazi report
From CIA
From FBI
From intercepts
Trustworthiness of revised Ramazi report
This text fragment changed from “Neutral” to “Trustworthy” after it was revised by an analyst (the phone number was corrected).
Trust Representation, Calculation, & Propagation
- Begin with simple representation of trust
- Present trust coloring view
- Calculation & propagation research options
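One simple option for the calculation step can be sketched as follows. The talk does not say which combination rule was used, so this is an assumption for illustration only: it treats each component's estimated trustworthiness as an independent probability that its conclusion is right and combines them with a noisy-OR. The two input values are the estimates given earlier for the IBM KDD-model Annotator (0.73) and the IBM EAnnotator (0.91); the function name is hypothetical.

```python
def combine_independent(trust_values):
    """Noisy-OR combination of trust values in [0, 1].

    Assumes the sources support the same conclusion independently; this is an
    illustrative rule, not the method used in the talk.
    """
    disbelief = 1.0
    for value in trust_values:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"trust value out of range: {value}")
        disbelief *= 1.0 - value  # chance that every source so far is wrong
    return 1.0 - disbelief

if __name__ == "__main__":
    # KDD-model Annotator (0.73) and EAnnotator (0.91) agreeing on one entity.
    print(round(combine_independent([0.73, 0.91]), 4))
```

Under this rule two agreeing sources always score at least as high as the better of the two, which is consistent with the earlier example where a "trustworthy" and a "highly trustworthy" conclusion combine into a "highly trustworthy" one. Other rules (weighted averages, minimum along a derivation path) behave differently and would suit other propagation settings.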
A Sample PML encoding
http://inferenceweb.stanford.edu/2006/02/example1-iw-wiki.owl
[Slide callouts: fragment; fragment trust; author trust]
<iw:NodeSet rdf:about="http://foto.stanford.edu/mediawiki-1.4.12/index.php/Natural_number">
  <iw:hasConclusion>In mathematics, a natural number is either a positive integer … </iw:hasConclusion>
  <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/English.owl#English"/>
  <iw:isConsequentOf>
    <iw:InferenceStep>
      <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told"/>
      <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CitationTrust.owl#CitationTrust"/>
      <iw:hasSourceUsage>
        <iw:SourceUsage>
          <iw:hasSource>
            <iw:Source rdf:about="http://inferenceweb.stanford.edu/wp/registry/PER/Alexandrov.owl#Alexandrov"/>
          </iw:hasSource>
        </iw:SourceUsage>
      </iw:hasSourceUsage>
    </iw:InferenceStep>
  </iw:isConsequentOf>
</iw:NodeSet>
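To show how a consumer might read provenance out of an encoding like this one, here is a minimal sketch using Python's standard xml.etree.ElementTree. The slide omits the namespace declarations, so the iw namespace URI below is an assumption (the rdf one is the standard RDF namespace), the conclusion element is written with an explicit opening tag, and the helper name summarize_nodeset is hypothetical.

```python
import xml.etree.ElementTree as ET

# The iw namespace URI is an assumption; the slide does not show xmlns declarations.
IW = "http://inferenceweb.stanford.edu/2004/07/iw.owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {"iw": IW, "rdf": RDF}

PML = f"""<iw:NodeSet xmlns:iw="{IW}" xmlns:rdf="{RDF}"
    rdf:about="http://foto.stanford.edu/mediawiki-1.4.12/index.php/Natural_number">
  <iw:hasConclusion>In mathematics, a natural number is either a positive integer …</iw:hasConclusion>
  <iw:isConsequentOf>
    <iw:InferenceStep>
      <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told"/>
      <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CitationTrust.owl#CitationTrust"/>
      <iw:hasSourceUsage><iw:SourceUsage><iw:hasSource>
        <iw:Source rdf:about="http://inferenceweb.stanford.edu/wp/registry/PER/Alexandrov.owl#Alexandrov"/>
      </iw:hasSource></iw:SourceUsage></iw:hasSourceUsage>
    </iw:InferenceStep>
  </iw:isConsequentOf>
</iw:NodeSet>"""

def summarize_nodeset(xml_text):
    """Extract the conclusion, rule, engine, and sources from one NodeSet."""
    root = ET.fromstring(xml_text)
    resource = f"{{{RDF}}}resource"  # ElementTree's {namespace}name attribute key
    about = f"{{{RDF}}}about"
    step = root.find("iw:isConsequentOf/iw:InferenceStep", NS)
    return {
        "conclusion": root.findtext("iw:hasConclusion", namespaces=NS).strip(),
        "rule": step.find("iw:hasRule", NS).get(resource),
        "engine": step.find("iw:hasInferenceEngine", NS).get(resource),
        "sources": [s.get(about) for s in step.iterfind(".//iw:Source", NS)],
    }

if __name__ == "__main__":
    for key, value in summarize_nodeset(PML).items():
        print(key, "->", value)
```

The same walk (conclusion, then inference step, then rule, engine, and sources) is what an explanation browser needs in order to answer "where did this statement come from, and who asserted it?".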
For more info on talk topics:
- Inference Web - iw.stanford.edu (OWL - www.w3.org/TR/owl-features/)
- Virtual Solar Terrestrial Observatory - www.vsto.org
- Semantic Technology Conference - www.semantic-conference.com/
- AAAI workshops: Explanation-Aware Computing, Semantic eScience, IAAI
- Special Issue for Elsevier Computers and GeoSciences, New Springer Journal – Journal of Earth
http://iw.stanford.edu/2.0/publications.html
- Best Inference Web paper: Deborah L. McGuinness and Paulo Pinheiro da Silva. Explaining Answers from the Semantic Web: The Inference Web Approach. Journal of Web Semantics. Vol. 1, No. 4, pages 397-413, October 2004.
- Best PML paper: Paulo Pinheiro da Silva, Deborah L. McGuinness and Richard Fikes. A Proof Markup Language for Semantic Web Services. Information Systems. Volume 31, Issues 4-5, June-July 2006, pages 381-395.
- Best Requirements paper (from interviewing Intelligence Analysts): Cowell, McGuinness, Varley, & Thurman. Knowledge-Worker Requirements for Next Generation Query Answering and Explanation Systems. Workshop on Intelligent User Interfaces for Intelligence Analysis (IUI 2006), Sydney, Australia.
- Best UI paper: Explanation Interfaces for the Semantic Web. SWUI ’06. http://www.ksl.stanford.edu/KSL_Abstracts/KSL-06-14.html
Wine Agent receives a meal description and retrieves a selection of matching wines available on the Web, using an ensemble of emerging standards and tools:
• OWL for representing a domain ontology of foods, wines, their properties, and relationships between them
• JTP theorem prover for deriving appropriate pairings
• OWL-QL for querying a knowledge base consisting of the above
• Inference Web for explaining and validating the response
• [Web Services for interfacing with vendors]
• Utilities for conducting and caching the above transactions
Processing
• Given a description of a meal,
– Use OWL-QL to state a premise (the meal) and query the knowledge base for a suggestion for a wine description or set of instances
– Use JTP to deduce answers (and proofs)
– Use Inference Web to explain results (descriptions, instances, provenance, reasoning engines, etc.)
– Access relevant web sites (wine.com, …) to access current information
– Use OWL-S for markup and protocol
*This general scheme used in other projects like TAMI, IL, etc.
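The shape of this pipeline (state a premise, deduce an answer, keep the proof so it can be explained) can be mimicked in plain Python. The sketch below does not use OWL-QL, JTP, or Inference Web; the pairing rules, wine names, and the suggest function are hypothetical placeholders standing in for the ontology, the theorem prover, and the explanation component.

```python
# Hypothetical pairing rules: (rule id, condition on the meal, conclusion about the wine).
RULES = [
    ("rule-seafood-white", {"course": "seafood"}, {"color": "white"}),
    ("rule-redmeat-red", {"course": "red meat"}, {"color": "red"}),
]

# Hypothetical wine instances, standing in for what a vendor site would return.
WINES = [
    {"name": "ExampleChardonnay", "color": "white"},
    {"name": "ExampleCabernet", "color": "red"},
]

def suggest(meal):
    """Return (matching wine names, justification trace) for a meal description."""
    trace = [f"premise: {meal}"]
    constraints = {}
    for rule_id, condition, conclusion in RULES:
        # A rule fires when every property it requires matches the meal.
        if all(meal.get(key) == value for key, value in condition.items()):
            constraints.update(conclusion)
            trace.append(f"{rule_id}: {condition} => {conclusion}")
    if not constraints:
        trace.append("no rule applied")
        return [], trace
    wines = [w["name"] for w in WINES
             if all(w.get(key) == value for key, value in constraints.items())]
    trace.append(f"instances matching {constraints}: {wines}")
    return wines, trace

if __name__ == "__main__":
    wines, trace = suggest({"course": "seafood"})
    print(wines)
    for line in trace:
        print("  ", line)
```

Returning the trace alongside the answer is the point of the exercise: it is the toy counterpart of the proofs that JTP produces and that Inference Web presents to the user.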
Originally from AAAI 1999- Ontologies Panel – updated by McGuinness
Markup such as DAML+OIL, OWL can be used to encode the spectrum
• Scaling to large numbers of data providers
• Crossing disciplines
• Security, access to resources, policies
• Branding and attribution (where did this data come from and who gets the credit, is it the correct version, is this an authoritative source?)
• Provenance/derivation (propagating key information as it passes through a variety of services, copies of processing algorithms, …)
• Data quality, preservation, stewardship, rescue
• Interoperability at a variety of levels (~3)
Issues for Virtual Observatories
Semantics can help with many of these
Impact: Virtual Observatories Changing Science
Scientists: What if you…
- could not only use your data and tools but remote colleague’s data and tools?
- understood their assumptions, constraints, etc and could evaluate applicability?
- knew whose research currently (or in the future) would benefit from your results?
- knew whose results were consistent (or inconsistent) with yours?
…
Funders/Managers: What if you …
- could identify how one research effort would support other efforts?
- (and your fundees/employees) could reuse previous results?
- (and your fundees/employees) could really interoperate?
CS: What if you…
- could apply your techniques across very large distributed teams of people with related but different apps?
- could compare your techniques with colleagues trying to solve similar