I1.1 Fundamentals for Context-aware Real-time Data Fusion. Lead: Roth (UIUC), Abdelzaher (UIUC), Huang (UIUC), Lei (IBM). Presented by: Tarek Abdelzaher. Task Goal and Overview. Goal: Foundations for utilizing context and prior knowledge in fusion
Fundamentals of Context-aware Real-time Data Fusion
Lead: Roth (UIUC)
Prior Knowledge
?
Latency
Foundations for analysis of fusion latency.
Data Fusion Threads
Thread 1: Enable exploitation of prior knowledge and information
network links in the design of algorithms for data fusion (Dan
Roth: UIUC)
Thread 2: Enhance ability to uncover links between heterogeneous
content items, such as text and video (Huang, UIUC)
Thread 3: Advance latency analysis of distributed data fusion
algorithms (Abdelzaher, UIUC)
Thread 4: Validate the results on viable platforms and
crowd-sourcing applications (Lei, IBM Research)
Advances in Learning & Inference of Constrained Conditional
Models
CCM: A computational framework for learning and inference with
interdependent variables in constrained settings
Formulating Information Fusion as CCMs.
Preliminary theoretical and experimental work on Information
Fusion
Key Publications:
R. Samdani and D. Roth, Efficient Learning for Constrained
Structured Prediction, submitted.
M. Chang, M. Connor and D. Roth, The Necessity of Combining
Adaptation Methods, EMNLP’10.
M. Chang, V. Srikumar, D. Goldwasser and D. Roth, Structured Output
Learning with Indirect Supervision, ICML’10.
M. Chang, D. Goldwasser, D. Roth and V. Srikumar, Discriminative
Learning over Constrained Latent Representations, NAACL’10
G. Kundu, D. Roth and R. Samdani, Constrained Conditional Models
for Information Fusion, submitted.
Predict values of multiple, interdependent labels (in contexts as
diverse as information extraction, information trustworthiness,
information fusion, etc.)
Modeling complex dependencies leads to intractability of learning
& inference (decision making)
Leads to over-simplification & unjustified independence
assumptions
Constrained conditional models (CCMs) pair relatively simple
learning models with expressive prior knowledge in the form of
declarative constraints in supporting global decisions.
Learn models for sub-problems; incorporate models’ information,
along with prior knowledge/constraints, in making globally coherent
decisions
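A minimal sketch of the CCM idea described above: simple local models score each variable independently, and declarative constraints are applied only at decision time to pick a globally coherent assignment. The label set, the scores, the penalty weight, and the `at_most_one_person` constraint are all hypothetical illustrations, and exhaustive search stands in for whatever inference procedure is actually used.

```python
from itertools import product

# Hypothetical label set for two interdependent variables.
LABELS = ("PER", "ORG", "NONE")


def ccm_inference(local_scores, constraints, penalty=10.0):
    """Pick the joint assignment maximizing the sum of local scores
    minus a fixed penalty for every violated constraint.

    Exhaustive search is used only because this toy example is tiny.
    """
    best, best_score = None, float("-inf")
    for assignment in product(LABELS, repeat=len(local_scores)):
        score = sum(local_scores[i][y] for i, y in enumerate(assignment))
        score -= penalty * sum(1 for c in constraints if not c(assignment))
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score


# Made-up outputs of two independently learned local models.
local_scores = [
    {"PER": 2.0, "ORG": 1.5, "NONE": 0.1},
    {"PER": 1.8, "ORG": 0.2, "NONE": 1.0},
]


def at_most_one_person(assignment):
    """Hypothetical prior knowledge: at most one variable is labeled PER."""
    return assignment.count("PER") <= 1


unconstrained, _ = ccm_inference(local_scores, [])
constrained, _ = ccm_inference(local_scores, [at_most_one_person])
print(unconstrained)  # each variable takes its locally best label
print(constrained)    # the constraint changes the first variable's label
```

The local models are never retrained here; only the decision step changes, which is the division of labor the slide describes.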
Fusion as a Decision Problem
Learn models; Acquire knowledge/constraints; Make decisions.
Recent Progress: LoCL (Locally Consistent Learning): a scheme which
is consistent with Global Learning under certain conditions while
being efficient.
Theoretical contribution and experimental confirmation on info
extraction tasks.
Illustrative Example (figure): variables A, B, C with local optima ABopt and BCopt
LoCL: Using local models + constraints to find the global optimum (figure labels: Global Optimum, Feedback)
Predict output states of different locations over consecutive time
steps
Output space is spatially and temporally structured
Expressing this structure using constraints can help make coherent
predictions and boost accuracy.
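A minimal sketch of how a temporal constraint can make predictions coherent, reduced to a single location over three time steps. The states, the scores, and the rule that a state may change by at most one level per step are hypothetical; a Viterbi-style dynamic program is one simple way to decode under such a constraint and is not claimed to be the method used in this work.

```python
# Hypothetical ordered states for one location.
LEVEL = {"low": 0, "medium": 1, "high": 2}


def allowed(prev, cur):
    """Hypothetical constraint: the state changes by at most one level per step."""
    return abs(LEVEL[prev] - LEVEL[cur]) <= 1


def constrained_decode(scores, allowed):
    """Return the highest-scoring state sequence that satisfies the
    transition constraint, using a Viterbi-style dynamic program."""
    best = {s: (scores[0][s], [s]) for s in scores[0]}
    for step in scores[1:]:
        new_best = {}
        for cur in step:
            candidates = [
                (best[p][0] + step[cur], best[p][1] + [cur])
                for p in best
                if allowed(p, cur)
            ]
            if candidates:
                new_best[cur] = max(candidates, key=lambda c: c[0])
        best = new_best
    return max(best.values(), key=lambda c: c[0])[1]


# Made-up per-step scores from a local model.
scores = [
    {"low": 0.9, "medium": 0.1, "high": 0.0},
    {"low": 0.2, "medium": 0.3, "high": 0.5},
    {"low": 0.1, "medium": 0.2, "high": 0.7},
]

independent = [max(step, key=step.get) for step in scores]
coherent = constrained_decode(scores, allowed)
print(independent)  # per-step argmax; jumps from low straight to high
print(coherent)     # constrained path passes through medium
```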
Inner product in latent space as translator
Two different domains are mapped into a common intermediate space
with the same dimension.
The inner product in this intermediate representation space is used to
bridge the two heterogeneous spaces.
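A minimal sketch of the "inner product as translator" idea: a text feature vector and an image feature vector of different dimensions are each projected into a latent space of the same dimension, and their inner product there serves as a cross-domain similarity. The matrices `W_text` and `W_image` are hand-set, hypothetical values; in practice they would be learned, and how they are learned is not shown here.

```python
def project(matrix, vector):
    """Multiply a (latent_dim x input_dim) matrix by an input vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cross_domain_similarity(text_vec, image_vec, W_text, W_image):
    """Map both inputs into the shared latent space and return the
    inner product of the two latent representations."""
    z_text = project(W_text, text_vec)
    z_image = project(W_image, image_vec)
    return sum(a * b for a, b in zip(z_text, z_image))


# Hypothetical projections: text features are 3-d, image features are 2-d,
# and both are mapped into a 2-d latent space.
W_text = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 1.0]]
W_image = [[1.0, 0.0],
           [0.0, 2.0]]

text_vec = [1.0, 2.0, 3.0]   # latent representation: [1.0, 5.0]
image_vec = [4.0, 0.5]       # latent representation: [4.0, 1.0]

similarity = cross_domain_similarity(text_vec, image_vec, W_text, W_image)
print(similarity)  # 1.0 * 4.0 + 5.0 * 1.0 = 9.0
```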
Published in WWW’11, collaboration with Dr. Charu Aggarwal,
IBM
Cross-Category Knowledge Sharing
Exploring the concept correlations to enhance the inference
accuracy
To appear in CVPR’11, collaboration with Dr. Charu Aggarwal,
IBM
Modeling Context-Aware Image Similarity
KDD’11, submitted
Thread 3: Latency Analysis
In collaboration with Aylin Yener, CNARC
Goal:
Answer the question: How much work can be done “on time” (given
different data fusion workflows and different end-to-end
deadlines)?
Derive the real-time capacity region (load region where deadlines
are met)
Model:
Different flows have different end-to-end deadlines (worst-case
allowable end-to-end latency)
Results:
A real-time capacity region for the canonical system
A Reduction Theory for Distributed Systems
In collaboration with CNARC (OICC)
Based on reduction of distributed systems to an “equivalent
uniprocessor”
Example values: C1,1 = 2, C1,2 = 1.1, C2,1 = 1, C2,2 = 1.8
Maxima: C1,max = 2, C2,max = 1.8; Cmax,1 = 2, Cmax,2 = 1.8
Figure: (a) Original Pipeline Execution of flows F1, F2, F3 across Stage 1 and Stage 2; (b) Equivalent
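A minimal sketch that recomputes the maxima listed with the example values C1,1 = 2, C1,2 = 1.1, C2,1 = 1, C2,2 = 1.8. It assumes that the first index of Ci,j names a flow and the second names a stage; the slides do not state the index order, and with these particular numbers either reading gives the same maxima. How the maxima enter the reduction to an equivalent uniprocessor is not shown here.

```python
# Example values from the slide, keyed by (first index, second index).
# Assumption: first index = flow, second index = stage.
C = {
    (1, 1): 2.0,
    (1, 2): 1.1,
    (2, 1): 1.0,
    (2, 2): 1.8,
}


def max_over_second_index(values):
    """For each first index i, the largest C[i, j] over all j (Ci,max)."""
    result = {}
    for (i, _j), c in values.items():
        result[i] = max(result.get(i, c), c)
    return result


def max_over_first_index(values):
    """For each second index j, the largest C[i, j] over all i (Cmax,j)."""
    result = {}
    for (_i, j), c in values.items():
        result[j] = max(result.get(j, c), c)
    return result


c_i_max = max_over_second_index(C)
c_max_j = max_over_first_index(C)
print(c_i_max)  # C1,max and C2,max
print(c_max_j)  # Cmax,1 and Cmax,2
```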
The Real-time Capacity Region
The real-time capacity theorem:
In a system with a set, S, of processing workflows, where each
workflow F_i in S incurs an effective utilization u_i^effect on an
equivalent uniprocessor and has a job rate R_i and a per-job
end-to-end maximum latency constraint, D_i, all jobs meet their
end-to-end deadlines if:
where:
Performance Evaluation
Theoretically predicted real-time capacity bound is very close to
empirical onset of deadline misses
Thread 4: Validation IBM, UIUC
Develop a general platform reusable for different mobile
crowd-sensing applications to experiment with data fusion
applications
Mobile Sensing
Analysis of trade-offs between timeliness and fusion quality
Investigation of the dependency of fusion quality and timeliness on
distributed resource allocation.
Integration of prior knowledge, constraints, and resource
distribution issues into future data fusion algorithms.
Improving quality/cost trade-offs via link discovery (between text
and video)
Information-network-aware real-time capacity of data fusion.
Validation, documentation and publications.
Capacity Task
Improved network QoI optimization for fusion systems
Improved effective operational capacity
(UIUC): Gourab Kundu, Rajhans Samdani, Dan Roth, “Constrained
Conditional Models for Information Fusion,” submitted to Fusion
2011
(UIUC): Dan Roth et al., “Efficient Learning for Constrained
Structured Prediction,” submitted to ICML 2011
Thread 2 (Q2):
More Papers
Thread 3 (I1.1-I1.2 Collaboration/Multi-institution):
(UIUC+IBM) G. Qi, C. Aggarwal, T. Huang, “Towards Semantic
Knowledge Propagation between text and web images,” WWW Conference,
2011.
(UIUC+IBM) Guo-Jun Qi, Charu Aggarwal, Yong Rui, Qi Tian,
Shiyu Chang and Thomas Huang, “Towards Cross-Category Knowledge
Propagation for Learning Cross-domain Concepts,” IEEE Conference on
Computer Vision and Pattern Recognition (CVPR 2011), Colorado
Springs, Colorado, June 21-23, 2011
(IBM+UIUC) C. Aggarwal, Y. Zhao, P. Yu. On Wavelet Decomposition of
Uncertain Text Streams, CIKM Conference, 2011.
(UIUC+IBM) G. Qi, C. Aggarwal, T. Huang, “Transfer learning with
distance functions between text and web images,” Submitted to the
ACM KDD Conference, 2011.
(UIUC+IBM) G. Qi, C. Aggarwal, H. Ji, T. Huang, “Exploring Content
and Context-based Links in Social Media: A Latent Space Method,”
Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Thread 4 (Q3/Q4)
Raghu Ganti, Fan Ye, Hui Lei, “Mobile Crowdsensing: Current State
and Future Challenges,” in submission to IEEE Comm. Magazine
Military Relevance
Enhanced warfighter’s ability to interpret reports, sensory data,
and soft information sources for making the right decisions
Enhanced exploitation of semantic links between information items
to improve data fusion accuracy
Improved ability to utilize context and background knowledge in
interpreting data
Significantly improved situation assessment in the presence of
heterogeneous content