Geospatial Stream Query Processing using Microsoft SQL Server StreamInsight

Seyed Jalal Kazemitabar 1, Ugur Demiryurek 1, Mohamed Ali 2, Afsin Akdogan 1, Cyrus Shahabi 1
1 Integrated Media Systems Center, University of Southern California
2 Microsoft SQL Server, Microsoft Corporation

Introduction
• A real-world data-driven framework which enables:
– Fast query processing over stream data using Microsoft StreamInsight™
– Running spatial queries over geospatial data
– Online analysis and prediction based on historic data using our in-memory sketching technique
• StreamInsight Architecture
[Figure: StreamInsight architecture. Labels: ICampus, IWatch, CT, Streaming Engine, GeoInsight]
• Stream flow in demo
[Figure: stream flow in the demo. Labels: Input Adapter; queries Q1 to Q7 (Value Filter, Spatial Filter, Average, PCA, Predict, Refine); Output Adapter]

Application
Hybrid queries over spatio-temporal windows provide great analysis functionality including:
• Refinement functions
– Smoothing noisy input data according to previously observed patterns
– Detection of anomalies characterized by sensor readings that are highly deviated from historical mean values
• Prediction functions
– Predicting near future trends based on previously observed patterns
– Responding to anomalies and deliberately attempting to change future conditions

Challenges
Large Datasets and Spatial Queries
• Large response time caused by disk I/O limits the availability of hybrid queries in real-time streaming applications
• Example query: “What was the average speed in I-10 in LA county during summer 2009 from 4:00-5:00 pm?”
• Database response time for the indexed table containing data of one year (150 GB): 58 seconds!

Approach
Online Analytical Refinement and Prediction (OARP) Using In-memory Sketches
• Instead of storing the whole data in DB, store the sketches in memory
• Principal Component Analysis (PCA): a mathematical approach for analyzing correlated data
• A number of components with great influence selected as coordinates
• Improving PCA performance for aggregate queries by calculating the query result in transformed space

Contribution/Experiments
PCA for Traffic Data
• High data compression rate
– 98% for highway data
• Extra short response time
– 2 milliseconds (compared to 58 seconds)
• Highly accurate for traffic data
– MSE for same query: 10^-4 mph
[Figure: Real Data vs. Transformed Data (Speed over Time); % of variance in data vs. number of components, annotated at 98%]
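
The idea of answering an aggregate query in the PCA-transformed space, rather than on the raw readings, can be illustrated with a small sketch. The poster does not give the OARP implementation, so everything below is an assumption made for illustration: the function names build_sketch and average_in_sketch, the toy data layout (one row per day, one column per time slot), and the use of power iteration with deflation to find the leading components. Because averaging is linear, the average of the stored component scores, mapped back through the components and added to the column means, equals the average of the reconstructed readings.

```python
import math

def build_sketch(rows, k, iters=100):
    """Return (mean, components, scores): a rank-k PCA sketch of `rows`.

    Hypothetical helper: components come from power iteration with
    deflation, which assumes the start vector is not orthogonal to the
    leading eigenvector of the covariance matrix.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(x[a] * x[b] for x in centered) / n for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(k):
        v = [1.0 / math.sqrt(d)] * d            # unit-length start vector
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:                    # no variance left to explain
                break
            v = [x / norm for x in w]
        eigval = sum(v[a] * cov[a][b] * v[b]
                     for a in range(d) for b in range(d))
        # Deflate: remove the component just found from the covariance.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
        components.append(v)
    # Each row is kept only as k scores instead of d raw readings.
    scores = [[sum(x[j] * c[j] for j in range(d)) for c in components]
              for x in centered]
    return mean, components, scores

def average_in_sketch(sketch, row_ids, col):
    """Average of column `col` over `row_ids`, computed from the scores."""
    mean, components, scores = sketch
    m = len(row_ids)
    avg_scores = [sum(scores[i][c] for i in row_ids) / m
                  for c in range(len(components))]
    return mean[col] + sum(a * comp[col]
                           for a, comp in zip(avg_scores, components))

# Toy data (invented): 8 days x 6 time slots of speeds in mph, where each
# day scales one shared congestion pattern, so one component suffices.
profile = [65.0, 60.0, 55.0, 58.0, 62.0, 66.0]
congestion = [0.0, -1.0, -5.0, -6.0, -2.0, 0.0]
rows = [[p + day * c for p, c in zip(profile, congestion)]
        for day in range(8)]

sketch = build_sketch(rows, k=1)
days, slot = [2, 3, 4], 2
approx = average_in_sketch(sketch, days, slot)
exact = sum(rows[i][slot] for i in days) / len(days)
print(approx, exact)
```

In this toy case the readings are exactly rank one after centering, so the sketch answer matches the raw average up to floating-point error. Real sensor data would not be exactly low rank, and the error would depend on how much variance the retained components capture.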
Conclusion and Future Work
• Limited support for geostreaming (continuous spatial queries) in current database technologies
• Demo application as a proof of concept for a system which runs spatial queries over real-time data
• Implementing the fundamentals of Clever Transportation (CT) project as a platform for monitoring, querying, and analyzing real-time Los Angeles traffic data
• Devising a scalable spatial alarm continuous query suitable for location-based services