A Distributed Multimedia Data Management over the Grid

Kasturi Chatterjee

Distributed Multimedia Information System Laboratory

School of Computing and Information Sciences

Florida International University, Miami, FL 33199, USA

Global Cyberbridges 2008 Proposal

Outline

• Motivation
– Why multimedia data?
– Why is handling and representing multimedia data challenging?
– Why a distributed environment?
– Why content-based image/video retrieval?

• Multimedia data management
– Representation
– Storage and Indexing
– Popular retrieval strategies

• Proposed Work Outline
– Issues to be addressed
– Components and Related Work

• Conclusion

Motivation

Why multimedia data?
– Attractive
– Informative
– Compact
– Cheap memory makes storage easy

Why is handling and representing multimedia data challenging?
– Huge size (a typical 10-second MPEG video is ~4 MB)
– Temporal and spatial information
– High-level meaning and the semantic gap
– Multidimensional representation
– Traditional databases cannot accommodate the above characteristics

Motivation

Why a distributed environment?
– Share storage
– Share computing power
– No single point of failure

Why content-based image/video retrieval?
– Unlike traditional data, the temporal, spatial and semantic content should be considered when querying multimedia data
– Can queries be issued textually for image/video databases? Maybe not!
  • Metadata
  • Keywords (in Google Images: "sunset")
– Query by example, similarity measurement, content interpretation, user feedback, etc. need to be considered

Multimedia data management

Representation
– Multidimensional: Unlike traditional data, which is unidimensional, multimedia data in the form of images or video is multidimensional.
– Semantic Interpretation: Multimedia data can have varied semantic interpretations.
– Feature Selection: Identifying the feature space used to represent the multimedia data is a crucial step in an MDBMS. Features can be color, texture, temporal information, etc.

The atypical nature of multimedia data calls for a special representation in the form of multidimensional feature vectors.
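To make the idea of a multidimensional feature vector concrete, here is a minimal sketch that turns an image into a normalized color histogram. The slides name color only as one possible feature; the function name `color_histogram`, the choice of 4 bins per channel, and the tiny four-pixel image are assumptions made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: List[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized color histogram as a flat feature vector."""
    if not pixels:
        raise ValueError("image has no pixels")
    bin_width = 256 / bins_per_channel
    counts = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantize each channel, clamping so that 255 falls in the last bin.
        ri = min(int(r / bin_width), bins_per_channel - 1)
        gi = min(int(g / bin_width), bins_per_channel - 1)
        bi = min(int(b / bin_width), bins_per_channel - 1)
        # Flatten the 3-D bin index (ri, gi, bi) into one position.
        counts[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
    total = len(pixels)
    return [c / total for c in counts]


# Hypothetical image: two reddish pixels, one green, one blue.
image = [(255, 0, 0), (250, 10, 5), (0, 255, 0), (0, 0, 255)]
vector = color_histogram(image)
print(len(vector))  # dimensionality of the feature space
```

Even this coarse quantization yields a 64-dimensional vector, which hints at why high-dimensional indexing becomes a concern.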

Multimedia data management

Storage and Indexing
– Indexing is an integral part of designing a database system; it reduces computation overhead and optimizes retrieval.

Multimedia Data Indexing Requirements
• Multimedia data is stored as multidimensional feature vectors.
• A high-dimensional feature space needs to be indexed.
• The index structure should map the low-level representation to the high-level semantic relationships.
• The index structure should handle popular multimedia data retrieval strategies such as content-based image retrieval (CBIR), relevance feedback (RF), and video event retrieval.

Existing multidimensional indexing strategies fail to fulfill the above requirements efficiently!

Multimedia data management

• Popular Retrieval Strategies (Content-Based Image/Video Retrieval)

[Figure: content-based retrieval pipeline. Labeled components: Image Database, Feature Descriptor Extraction, Image Descriptor Space, Similarity Measurement, Retrieval Results]
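The components named in the figure can be sketched as a query-by-example search: every stored image is represented by a descriptor, the query descriptor is compared against each one, and the closest matches are returned. This is only an illustrative sketch. Euclidean distance is an assumed similarity measure, and the database contents and names (`query_by_example`, `sunset_1`, and so on) are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def euclidean(a: List[float], b: List[float]) -> float:
    """Distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_example(
    query: List[float], database: Dict[str, List[float]], k: int
) -> List[Tuple[str, float]]:
    """Rank all stored descriptors by distance to the query and keep the k best."""
    scored = [(name, euclidean(query, vec)) for name, vec in database.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:k]


# Hypothetical image descriptor space (3-dimensional for readability).
database = {
    "sunset_1": [0.9, 0.1, 0.0],
    "sunset_2": [0.8, 0.2, 0.0],
    "forest": [0.1, 0.8, 0.1],
    "ocean": [0.0, 0.2, 0.8],
}
results = query_by_example([0.88, 0.12, 0.0], database, k=2)
print([name for name, _ in results])
```

Note that this linear scan computes one distance per stored image, which is the cost an index structure is meant to avoid.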

Proposed Work Outline

A typical Grid Architecture

Source: http://gridcafe.web.cern.ch/gridcafe/gridatwork/architecture.html

Proposed Work Outline

Research Issues
– Development of a technique to enable uniform representation of the multimedia data
– Development of an efficient index structure that can handle multimedia data and support applications like CBIR/CBVR while spanning multiple storage nodes over a Grid/distributed environment
– Devising a mechanism by which users' concept of similarity across multiple network domains can be taken into account when producing query results

In short, we envision a distributed multimedia storage and management system capable of supporting popular retrieval applications like CBIR/CBVR.

Proposed Work Outline

The design and development of multimedia data management over the Grid has two critical components:

– Proper storage, which requires a distributed multidimensional index structure and distributed retrieval algorithms (distributed k-NN or range queries) supported by that index structure

– Efficient retrieval, which requires techniques to map low-level features to high-level semantic concepts over a distributed environment in order to provide relevant query results
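One common way to realize a distributed k-NN query is scatter-gather: each storage node returns its own k nearest objects and a coordinator merges the partial lists. The sketch below illustrates that generic pattern only; it is not the algorithm proposed in the slides, it replaces the per-node index with a plain scan, and all node names and data are hypothetical.

```python
import heapq
import math
from typing import Dict, List, Tuple

Node = Dict[str, List[float]]  # object name -> feature vector


def local_knn(node: Node, query: List[float], k: int) -> List[Tuple[float, str]]:
    """Run on one storage node: return its k nearest objects, sorted by distance."""
    scored = ((math.dist(query, vec), name) for name, vec in node.items())
    return heapq.nsmallest(k, scored)


def distributed_knn(
    nodes: Dict[str, Node], query: List[float], k: int
) -> List[Tuple[float, str]]:
    """Coordinator: gather each node's local top-k and merge into a global top-k."""
    partials = [local_knn(node, query, k) for node in nodes.values()]
    # Each partial list is already sorted, so a k-way merge is sufficient.
    return heapq.nsmallest(k, heapq.merge(*partials))


# Hypothetical grid with three storage nodes.
nodes = {
    "node_a": {"img1": [0.10, 0.10], "img2": [0.90, 0.90]},
    "node_b": {"img3": [0.20, 0.10], "img4": [0.50, 0.50]},
    "node_c": {"img5": [0.15, 0.12]},
}
top = distributed_knn(nodes, [0.10, 0.10], k=3)
print([name for _, name in top])
```

Because every node returns at most k candidates, the coordinator handles at most k times the number of nodes, independent of how many objects each node stores.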

Proposed Work Outline

Concepts to be Utilized and Related Work

– We have developed an index structure called the Affinity Hybrid Tree [1] for single-node or stand-alone applications; it can index multidimensional images/videos and supports CBIR/CBVR
  • We plan to extend it as the basic indexing and storage framework, since it has proved very efficient in stand-alone environments

– To capture the high-level similarity concepts among users in a distributed environment, we will develop a novel architecture called the Distributed Affinity Capture Model (DACM), based on the Hierarchical Markov Model Mediator [2].

Proposed Work Outline Components

• Affinity Hybrid Tree
– The feature-based index mechanism filters the feature space and reduces the number of distance computations to be performed
– The distance-based index mechanism incorporates the high-level image relationships as they are, without translating them into their low-level equivalents
– Reduces computational overhead
– Increases the relevance of retrieved images by capturing the user concept as it is
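The effect of feature-space filtering on the number of distance computations can be illustrated with a simple filter-then-refine search. This is a generic sketch, not the AH-Tree algorithm from [1]: it partitions vectors by their nearest pivot (an assumed scheme), searches only the query's partition, and leaves out the distance-based, affinity-aware stage entirely. The pivots and data are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def nearest_pivot(vec: Vector, pivots: List[Vector]) -> int:
    """Index of the pivot closest to vec."""
    return min(range(len(pivots)), key=lambda i: math.dist(vec, pivots[i]))


def build_index(
    vectors: Dict[str, Vector], pivots: List[Vector]
) -> Dict[int, List[Tuple[str, Vector]]]:
    """Assign every vector to the subspace of its nearest pivot."""
    subspaces: Dict[int, List[Tuple[str, Vector]]] = {i: [] for i in range(len(pivots))}
    for name, vec in vectors.items():
        subspaces[nearest_pivot(vec, pivots)].append((name, vec))
    return subspaces


def filtered_search(
    query: Vector,
    pivots: List[Vector],
    subspaces: Dict[int, List[Tuple[str, Vector]]],
    k: int,
) -> Tuple[List[str], int]:
    """Search only the query's subspace; also report how many objects were examined."""
    candidates = subspaces[nearest_pivot(query, pivots)]
    ranked = sorted(candidates, key=lambda item: math.dist(query, item[1]))
    return [name for name, _ in ranked[:k]], len(candidates)


pivots = [[0.0, 0.0], [1.0, 1.0]]
vectors = {
    "a": [0.1, 0.1], "b": [0.3, 0.0],
    "c": [0.9, 0.8], "d": [1.0, 0.9], "e": [0.8, 1.0],
}
index = build_index(vectors, pivots)
names, examined = filtered_search([0.15, 0.05], pivots, index, k=2)
print(names, examined, "of", len(vectors))
```

Searching a single partition is approximate, since a true neighbor may sit just across a partition boundary; a real index would also have to visit adjacent subspaces when needed.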

Proposed Work Outline Components

• Building AH-Tree

[Figure: building the AH-Tree. Labels: feature vectors feed the root; feature-space filtering (Space Index, indexed subspaces); semantic relationship introduction (distance-based indexing, indexed data)]

Proposed Work Outline Components

Sample Results

• Computation Cost
– Feature-space filtering reduces the number of image objects to be examined and hence reduces the number of distance computations manifold.

• Accuracy
– AH-Tree: 80%
– M-Tree: 10-20%

Proposed Work Outline Components

Hierarchical Markov Model Mediator (HMMM) [2]
– An HMMM is represented by an 8-tuple $(d, S, F, A, B, \Pi, O, L)$, where
  • $d$: number of levels in the HMMM
  • $S$: multimedia objects in different levels
  • $F$: distinctive features or semantic concepts (depending upon the level)
  • $A$: affinity relationship between multimedia objects
  • $B$: features/concepts at each level
  • $\Pi$: initial state probability distribution
  • $O$: weights of importance for the lower-level features and higher-level concepts
  • $L$: link condition between higher-level and lower-level states
– The model has been used successfully for several applications like CBIR and web document clustering
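As a rough illustration of two of the tuple's components, the sketch below builds an affinity matrix A and an initial state probability distribution from user feedback. The slides do not give the construction, so everything here is an assumption: affinities are taken to be row-normalized counts of how often two objects were marked relevant in the same query session, and the initial distribution is the share of sessions that start with each object. The function names and session data are hypothetical.

```python
from typing import List


def affinity_matrix(sessions: List[List[int]], num_objects: int) -> List[List[float]]:
    """Row-stochastic matrix built from co-occurrence counts within sessions."""
    counts = [[0] * num_objects for _ in range(num_objects)]
    for session in sessions:
        for m in session:
            for n in session:
                counts[m][n] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # An object never seen in feedback gets a uniform row (assumption).
        matrix.append([c / total if total else 1.0 / num_objects for c in row])
    return matrix


def initial_distribution(sessions: List[List[int]], num_objects: int) -> List[float]:
    """Fraction of sessions whose first accessed object is each object."""
    if not sessions:
        raise ValueError("at least one session is required")
    first = [0] * num_objects
    for session in sessions:
        first[session[0]] += 1
    return [c / len(sessions) for c in first]


# Hypothetical feedback: object ids marked relevant together in each query.
sessions = [[0, 1], [0, 1], [1, 2]]
A = affinity_matrix(sessions, num_objects=3)
pi = initial_distribution(sessions, num_objects=3)
print(A[0], pi)
```

Each row of A sums to 1, so it can be read as the transition probabilities of a Markov chain over the multimedia objects.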

Tentative Road Map

• Detailed literature review of the following concepts:
– available data management tools and techniques in Grid computing
– peer-to-peer file sharing systems

• Development of the following algorithms and models:
– devise a distributed k-NN search supporting CBIR/CBVR from within an index structure
– develop the Distributed Affinity Capture Model (DACM) to capture users' concept of high-level similarity

• Implementation of the entire system

Conclusion

We propose to
– Develop an efficient multimedia data management framework over a distributed environment like the Grid
– Develop distributed content-based retrieval algorithms that span the Grid to provide
  • semantically close query results
  • quickly and efficiently
– Devise a way to capture users' concept of similarity across the Grid (bridging the gap between low-level features and high-level semantics is a challenge) with
  • an architecture called the Distributed Affinity Capture Model (DACM)

Questions

Selected References

[1] Kasturi Chatterjee and Shu-Ching Chen, "A Novel Indexing and Access Mechanism using Affinity Hybrid Tree for Content-Based Image Retrieval in Multimedia Databases," International Journal of Semantic Computing (IJSC), Vol. 1, Issue 2, pp. 147-170, June 2007.

[2] Mei-Ling Shyu, Shu-Ching Chen, Min Chen, Chengcui Zhang, and Chi-Min Shu, "MMM: A Stochastic Mechanism for Image Database Queries," Proceedings of the IEEE Fifth International Symposium on Multimedia Software Engineering (MSE2003), pp. 188-195, December 10-12, 2003, Taichung, Taiwan, ROC.

[3] M.-L. Shyu, S.-C. Chen, C. Haruechaiyasak, C.-M. Shu, and S.-T. Li, "Disjoint Web Document Clustering and Management in Electronic Commerce," Proceedings of the Seventh International Conference on Distributed Multimedia Systems (DMS'2001), pp. 494-497, 2001.

[4] Mei-Ling Shyu, Shu-Ching Chen, Min Chen, Chengcui Zhang, and Kanoksri Sarinnapakorn, "Image Database Retrieval Utilizing Affinity Relationships," accepted for publication, the First ACM International Workshop on Multimedia Databases (ACM MMDB'03), November 7, 2003, New Orleans, Louisiana, USA.