Streaming Knowledge Bases. Onkar Walavalkar, Anupam Joshi, Tim Finin and Yelena Yesha. University of Maryland, Baltimore County. 27 October 2008.

Jan 03, 2016

Page 2:

Streaming Knowledge Bases

Onkar Walavalkar, Anupam Joshi, Tim Finin and Yelena Yesha

University of Maryland, Baltimore County
27 October 2008

Page 5:

Overview

• Motivation
• Streaming databases
• Streaming knowledge bases
• Experiments and results
• Conclusions

Motivation Stream DBs Stream KBs Experiments Conclusions

Page 6:

Operating Room of the Future

• ORs will be awash in low-level data, much of it noisy or incomplete
• Challenges include coping with the noise and interpreting the low-level data to recognize high-level events and activities

[Figure: ORF diagram. Labels: drugs, patient monitors, staff, tools, devices; RFID, AwarePoint, Bluetooth, WiFi]

Page 7:

Initial work in OR training

• UMD Mastri Center is experimenting with OR technologies and training environments
• The Human Patient Simulator from METI
  – Designed to react like a human
  – Responds to medical treatment
• Generates continuous streams of data, moderated by
  – Initial conditions (e.g., a blunt trauma, multiple injuries scenario)
  – Human interactions

Page 8:

Efficient Data Stream Management

Traditional DBMS
• Data is stored/indexed in system
• Queries applied to stored data as they “stream through”

Stream Management System
• Queries stored/indexed in system
• Data applied to stored queries as they “stream through”

[Figure: two diagrams, one per system, each labeled with Queries, Index, Data, and Results]

Several efforts: Tapestry, Aurora, TelegraphCQ
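The inversion described on this slide (store the queries, stream the data) can be sketched in a few lines of Python. This is only an illustration under assumed names, not how Tapestry, Aurora, or TelegraphCQ are implemented: the `ContinuousQueryEngine` class, its methods, and the heart-rate readings are all hypothetical.

```python
from typing import Callable, Dict, List

Reading = Dict[str, float]


class ContinuousQueryEngine:
    """Toy stream manager: queries are stored, data flows past them."""

    def __init__(self) -> None:
        # Hypothetical registry: query name -> predicate over one reading.
        self.queries: Dict[str, Callable[[Reading], bool]] = {}
        self.results: Dict[str, List[Reading]] = {}

    def register(self, name: str, predicate: Callable[[Reading], bool]) -> None:
        """Store a standing query before any data arrives."""
        self.queries[name] = predicate
        self.results[name] = []

    def push(self, reading: Reading) -> None:
        """Test one arriving item against every stored query.

        The reading itself is not kept unless some query matches it.
        """
        for name, predicate in self.queries.items():
            if predicate(reading):
                self.results[name].append(reading)


engine = ContinuousQueryEngine()
engine.register("high_heart_rate", lambda r: r["heart_rate"] > 120)

# Made-up sample values, pushed one at a time as a stream would deliver them.
for hr in (80.0, 130.0, 95.0, 142.0):
    engine.push({"heart_rate": hr})

matched = [r["heart_rate"] for r in engine.results["high_heart_rate"]]
print(matched)
```

A traditional DBMS would do the opposite: insert all four readings into a table first, then run the query once against the stored rows.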

Page 9:

[Figure: system architecture diagram. Labels: Stream Processor (TelegraphCQ); Continuous Queries; Patient Monitor; RFID System (Medicines, Tools, Staff); Trend Analyzer; Physiological Data; Low-Level Event Processor; Database (Patient History, Medical Supplies, Staff); Rule Base; Assert facts; Medical Encounter Record; Video Clipper; Event Detection Levels 1, 2, and 3; Events]

Page 10:

What’s wrong with this picture?

• We need to enhance this to support semantic interoperability for medical data & knowledge
• The medical community has a long history of developing & using standard ontologies & metadata
• Incoming streams of data can be in RDF
• And reference terms in appropriate ontologies

Page 11:

What’s wrong with this picture?

• Streaming database systems use continuous queries specified over a sliding time window
  – e.g., [range by ‘30 seconds’ slide by ‘10 seconds’]
• Issues:
  – Where do we do reasoning?
  – How do we answer queries against a sliding window of data?
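To make the window clause concrete, here is a minimal sketch of a "range by 30 seconds, slide by 10 seconds" window over timestamped events. The function name, the event format, and the exact boundary rule (each window covers the half-open interval from `end - range` to `end`) are assumptions for illustration; TelegraphCQ's own window semantics may differ in detail.

```python
from typing import Iterator, List, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, payload)


def sliding_windows(
    events: List[Event], range_s: int, slide_s: int
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (window_end, payloads) every slide_s seconds.

    Each window covers timestamps in (window_end - range_s, window_end].
    Assumes events are sorted by timestamp.
    """
    if not events:
        return
    last_ts = events[-1][0]
    window_end = slide_s
    while True:
        window_start = window_end - range_s
        payloads = [p for ts, p in events if window_start < ts <= window_end]
        yield window_end, payloads
        if window_end >= last_ts:
            break  # the last event has been covered by at least one window
        window_end += slide_s


# Made-up events at 5 s, 12 s, 28 s, and 41 s.
events = [(5, "a"), (12, "b"), (28, "c"), (41, "d")]
windows = list(sliding_windows(events, range_s=30, slide_s=10))
for end, payloads in windows:
    print(end, payloads)
```

Because the range is longer than the slide, consecutive windows overlap and the same event is seen by several query evaluations. That overlap is one reason the second issue above matters: any fact inferred from an event has to expire from the window along with the event.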

Page 12:

RDF Stream Processing

[Figure: RDF stream processing pipeline. Labels: Input Triple Stream; input stream handler; Static Data Store (RangeInfo, PropertyTree, DomainInfo, InverseInfo, ClassTree); special domain rules & queries; Enhanced Stream; Query for Class of Concern; Detected Instances]
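One way to read the pipeline on this slide is as a handler that consults static lookup tables and adds entailed triples to the stream before the class-of-concern query runs. The sketch below follows that reading; it is a guess at the data flow, not the authors' implementation. The table contents, the `ex:` terms, and the function names are all hypothetical, and the property tree is omitted for brevity.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Triple = Tuple[str, str, str]
TYPE = "rdf:type"

# Hypothetical static data store, loaded once before streaming starts.
CLASS_TREE: Dict[str, str] = {  # class -> direct superclass
    "ex:ZebraMussel": "ex:InvasiveSpecies",
    "ex:InvasiveSpecies": "ex:Species",
}
DOMAIN_INFO: Dict[str, str] = {"ex:observedSpecies": "ex:Observation"}
RANGE_INFO: Dict[str, str] = {"ex:observedSpecies": "ex:Species"}
INVERSE_INFO: Dict[str, str] = {"ex:observedSpecies": "ex:observedIn"}


def superclasses(cls: str) -> List[str]:
    """Walk the class tree upward from cls (cls itself is not included)."""
    found = []
    while cls in CLASS_TREE:
        cls = CLASS_TREE[cls]
        found.append(cls)
    return found


def enhance(stream: Iterable[Triple]) -> Iterator[Triple]:
    """Yield each input triple followed by the triples the tables entail."""
    for s, p, o in stream:
        yield (s, p, o)
        if p == TYPE:
            for sup in superclasses(o):
                yield (s, TYPE, sup)
        if p in DOMAIN_INFO:
            yield (s, TYPE, DOMAIN_INFO[p])
        if p in RANGE_INFO:
            yield (o, TYPE, RANGE_INFO[p])
        if p in INVERSE_INFO:
            yield (o, INVERSE_INFO[p], s)


def instances_of(stream: Iterable[Triple], concern: str) -> Set[str]:
    """Query the enhanced stream for subjects typed as the class of concern."""
    return {s for s, p, o in enhance(stream) if p == TYPE and o == concern}


incoming = [
    ("ex:obs1", "ex:observedSpecies", "ex:specimen7"),
    ("ex:specimen7", TYPE, "ex:ZebraMussel"),
    ("ex:specimen9", TYPE, "ex:Species"),
]
detected = instances_of(incoming, "ex:InvasiveSpecies")
print(detected)
```

Only `ex:specimen7` is detected: its asserted class is a subclass of the class of concern, whereas `ex:specimen9` is typed with a superclass of it.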

Page 13:

Experiments and results

• Three simple reasoners
  – Jena, in core
  – Pre-computed custom hash tables
  – Using tables in TelegraphCQ
• Various scenarios
  – Ontology size: 118 - 23.1 MB
  – Number of subclasses: 49 - 57,000
  – Subclass depth: 2 - 9
  – Data rate: 1 - 50 triples per second
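The "pre-computed custom hash tables" approach is not detailed on the slide. A plausible minimal version, shown here as an assumption rather than the authors' code, computes the full set of superclasses for every class once, so that each incoming triple needs only a dictionary lookup. The function names and the `ex:` class names are hypothetical, and the hierarchy is assumed to be acyclic.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def build_ancestor_table(
    subclass_of: List[Tuple[str, str]]
) -> Dict[str, FrozenSet[str]]:
    """Pre-compute, for every class, the set of all its superclasses.

    subclass_of holds (child, parent) pairs. Assumes no cycles.
    """
    parents: Dict[str, Set[str]] = {}
    for child, parent in subclass_of:
        parents.setdefault(child, set()).add(parent)
        parents.setdefault(parent, set())

    table: Dict[str, FrozenSet[str]] = {}

    def ancestors(cls: str) -> FrozenSet[str]:
        # Memoized: each class is expanded only once.
        if cls not in table:
            found: Set[str] = set()
            for parent in parents[cls]:
                found.add(parent)
                found |= ancestors(parent)
            table[cls] = frozenset(found)
        return table[cls]

    for cls in parents:
        ancestors(cls)
    return table


edges = [("ex:C", "ex:B"), ("ex:B", "ex:A"), ("ex:D", "ex:A")]
table = build_ancestor_table(edges)


def is_instance_of(asserted_class: str, target_class: str) -> bool:
    """Constant-time check done per streamed triple, using the table."""
    return asserted_class == target_class or target_class in table.get(
        asserted_class, frozenset()
    )


print(is_instance_of("ex:C", "ex:A"), is_instance_of("ex:D", "ex:B"))
```

The trade-off is the usual one for pre-computation: the table costs time and memory up front, growing with the number of subclasses and their depth, in exchange for cheap per-triple checks at stream time.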

Page 14:

Domain Example

• Monitor Bioblitz and eco-blogging data streams for observations of invasive species
• Uses our Ethan ontologies for ecoinformatics
• Tree of life (~340K taxa from ITIS and other sources)
• Species profiles
• Invasive species definitions
• Observation

Page 15:

Reasoning delay comparison for all approaches

Page 16:

Reasoning delay comparison for all approaches

Page 17:

Reasoning delay comparison for all approaches

Page 18:

Reasoning delay comparison for all approaches

Page 19:

VM usage comparison of all 3 approaches

Page 20:

VM usage for Jena for different classes

Page 21:

VM usage comparison for Hashtable and TCQ

Page 22:

Conclusions

• If the incoming triple data rate goes beyond a certain limit, the reasoning speed starts to lag and tends to slow down the incoming stream.
• The speedup achieved by using TelegraphCQ (TCQ) and a hash table proves the value of pre-processing an ontology, particularly for fast-streaming facts.
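The first conclusion follows from simple arithmetic: once triples arrive faster than the reasoner can process them, unprocessed triples pile up at a rate equal to the difference. The sketch below shows that relationship only. The function name is hypothetical and the rates are made-up inputs, not measurements from the experiments.

```python
def backlog_after(arrival_rate: float, service_rate: float, seconds: float) -> float:
    """Triples waiting after `seconds`, for constant rates in triples per second.

    Assumes an initially empty queue and no back-pressure on the source.
    """
    return max(0.0, (arrival_rate - service_rate) * seconds)


# Illustrative numbers only: a reasoner that handles 40 triples per second.
print(backlog_after(arrival_rate=50, service_rate=40, seconds=60))
print(backlog_after(arrival_rate=10, service_rate=40, seconds=60))
```

A faster reasoner raises the rate at which this backlog starts to appear, which is consistent with the second conclusion about the value of pre-processing.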

Page 23:

http://ebiquity.umbc.edu/