LEARNING SEMANTICS OF WORDS AND PICTURES TEJASWI DEVARAPALLI

CONTENT

• INTRODUCTION

• MODELING IMAGE DATASET STATISTICS

• HIERARCHICAL MODEL

• TESTING AND USING BASIC MODEL

• AUTO ILLUSTRATION

• AUTO ANNOTATION

• RESULTS

• DISCUSSIONS

SEMANTICS

• LANGUAGE USES A SYSTEM OF LINGUISTIC SIGNS, EACH OF WHICH IS A COMBINATION OF MEANING AND PHONOLOGICAL AND/OR ORTHOGRAPHIC FORMS.

• SEMANTICS IS TRADITIONALLY DEFINED AS THE STUDY OF MEANING IN LANGUAGE.

ABSTRACT

• A STATISTICAL MODEL FOR ORGANIZING IMAGE COLLECTIONS.

• INTEGRATES SEMANTIC INFORMATION PROVIDED BY ASSOCIATED TEXT AND VISUAL INFORMATION PROVIDED BY IMAGE FEATURES.

• PROMISING MODEL FOR INFORMATION RETRIEVAL TASKS SUCH AS DATABASE BROWSING AND IMAGE SEARCH.

• ALSO SUPPORTS NOVEL APPLICATIONS SUCH AS AUTO ILLUSTRATION AND AUTO ANNOTATION.

INTRODUCTION

• METHOD FOR ORGANIZING IMAGE DATABASES.

• INTEGRATES TWO KINDS OF INFORMATION (TEXT AND IMAGE FEATURES) DURING MODEL CONSTRUCTION.

• LEARNS LINKS BETWEEN IMAGE FEATURES AND SEMANTICS.

• THE LEARNED LINKS ARE USEFUL FOR

    • BETTER BROWSING

    • BETTER SEARCH

    • NOVEL APPLICATIONS

INTRODUCTION (CONTINUED)

• MODELS STATISTICS ABOUT THE OCCURRENCE AND CO-OCCURRENCE OF WORDS AND FEATURES.

• HIERARCHICAL STRUCTURE.

• GENERATIVE MODEL: IMPLICITLY CONTAINS PROCESSES FOR PREDICTING IMAGE COMPONENTS (WORDS AND FEATURES).

COMPARISON

• THIS MODEL SUPPORTS BROWSING FOR IMAGE RETRIEVAL PURPOSES.

• SYSTEMS FOR SEARCHING IMAGE DATABASES INCLUDE SEARCH BY QUERY, USING

    • TEXT

    • IMAGE FEATURE SIMILARITY

    • SEGMENT FEATURES

    • IMAGE SKETCH

MODELING IMAGE DATASET STATISTICS

• GENERATIVE HIERARCHICAL MODEL

• COMBINATION OF

    • ASYMMETRIC CLUSTERING MODEL (MAPS DOCUMENTS INTO CLUSTERS)

    • SYMMETRIC CLUSTERING MODEL (MODELS THE JOINT DISTRIBUTION OF DOCUMENTS AND FEATURES)

• DATA MODELED AS A FIXED HIERARCHY OF NODES.

• NODES GENERATE

    • WORDS

    • IMAGE SEGMENTS

ILLUSTRATION

• DOCUMENTS ARE MODELED AS A SEQUENCE OF WORDS AND A SEQUENCE OF SEGMENTS USING THE BLOBWORLD REPRESENTATION.

• THE "BLOBWORLD" REPRESENTATION IS CREATED BY CLUSTERING PIXELS IN A JOINT COLOR-TEXTURE-POSITION FEATURE SPACE.

• THE DOCUMENT IS MODELED BY A SUM OVER THE CLUSTERS, TAKING ALL CLUSTERS INTO CONSIDERATION.

HIERARCHICAL MODEL

• EACH NODE HAS A PROBABILITY OF GENERATING A WORD OR IMAGE SEGMENT WITH RESPECT TO THE DOCUMENT UNDER CONSIDERATION.

• A CLUSTER DEFINES THE PATH.

• A (CLUSTER, LEVEL) PAIR IDENTIFIES THE NODE.

Figure: hierarchy of nodes, with example labels Sky, Sun, Sea and Waves.

Higher level nodes emit more general words and blobs (e.g. sky).

Intermediate nodes emit moderately general words and blobs (e.g. sun, sea).

Lower level nodes emit more specific words and blobs (e.g. waves).

The mathematical process for generating the set of observations D associated with a document d is described by an equation in which c indexes clusters, i indexes items and l indexes levels.
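The equation itself did not survive in this transcript. A reconstruction consistent with the legend (c, i, l) and with the model described on the preceding slides (a sum over clusters, with each item emitted by one of the nodes on the cluster's path) is sketched below. Treat it as a reading of the model, not a copy of the slide.

```latex
P(D \mid d) = \sum_{c} P(c) \prod_{i \in D} \left[ \sum_{l} P(i \mid l, c)\, P(l \mid c, d) \right]
```

Here P(c) is the cluster prior, P(i | l, c) is the probability that the node at level l on the path of cluster c emits item i, and P(l | c, d) weights the levels for document d.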

GAUSSIAN DISTRIBUTIONS

• A NUMBER OF FEATURES, COVERING ASPECTS OF SIZE, POSITION, COLOR, TEXTURE AND SHAPE, TOGETHER FORM THE FEATURE VECTOR X.

• THE PROBABILITY DISTRIBUTION OVER IMAGE SEGMENTS IS GIVEN BY THE USUAL GAUSSIAN FORMULA.
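The formula image is missing from the transcript. As a minimal sketch, the standard multivariate Gaussian log density for a feature vector X can be evaluated as below. The function name and the use of a full covariance matrix are illustrative assumptions; the slides do not say how the covariance is parameterized.

```python
import numpy as np

def gaussian_log_density(x, mean, cov):
    """Log of the multivariate Gaussian density N(x; mean, cov)."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    k = x.shape[0]                      # dimension of the feature vector
    diff = x - mean
    # Solve cov @ y = diff instead of inverting cov explicitly.
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)  # log |cov|, stable for small determinants
    return -0.5 * (k * np.log(2.0 * np.pi) + logdet + mahalanobis)
```

Working with the log density avoids underflow when the per-segment probabilities are later multiplied together.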

MODELING IMAGE DATASET STATISTICS

• A HIERARCHICAL MODEL IS USED BECAUSE IT BEST SUPPORTS

    • BROWSING OF LARGE COLLECTIONS OF IMAGES

    • COMPACT REPRESENTATION

• THE PAPER PROVIDES IMPLEMENTATION DETAILS FOR AVOIDING OVER-TRAINING.

• THE TRAINING PROCEDURE CLUSTERS A FEW THOUSAND IMAGES IN A FEW HOURS ON A STATE-OF-THE-ART PC.

MODELING IMAGE DATASET STATISTICS

• RESOURCE REQUIREMENTS SUCH AS MEMORY INCREASE RAPIDLY WITH THE NUMBER OF IMAGES, SO EXTRA CARE IS NEEDED.

• THERE ARE DIFFERENT APPROACHES FOR AVOIDING OVER-TRAINING AND REDUCING RESOURCE USAGE.

FIRST APPROACH

• WE TRAIN ON A RANDOMLY SELECTED SUBSET OF IMAGES UNTIL THE LOG LIKELIHOOD FOR HELD-OUT DATA, RANDOMLY SELECTED FROM THE REMAINING DATA, BEGINS TO DROP.

• THE MODEL SO FOUND IS USED AS THE STARTING POINT FOR THE NEXT TRAINING ROUND, USING A SECOND RANDOM SET OF IMAGES.
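The two steps above can be sketched as a generic training loop. This is only an outline under stated assumptions: `em_step` and `log_likelihood` are hypothetical callables standing in for the model's EM update and held-out likelihood, and the subset sizes and number of rounds are placeholders, not values from the slides.

```python
import random

def train_in_rounds(params, images, em_step, log_likelihood,
                    subset_size, heldout_size, rounds=2, max_iters=100, seed=0):
    """Train on successive random subsets, ending each round when the
    held-out log likelihood begins to drop."""
    rng = random.Random(seed)
    for _ in range(rounds):
        shuffled = list(images)
        rng.shuffle(shuffled)
        train = shuffled[:subset_size]
        # Held-out data is drawn from the images not used for training.
        heldout = shuffled[subset_size:subset_size + heldout_size]
        best_params = params
        best_ll = log_likelihood(params, heldout)
        for _ in range(max_iters):
            candidate = em_step(best_params, train)
            ll = log_likelihood(candidate, heldout)
            if ll < best_ll:          # held-out likelihood began to drop
                break
            best_params, best_ll = candidate, ll
        params = best_params          # starting point for the next round
    return params
```

Keeping the last parameters before the drop, and not the ones that caused it, is a design choice of this sketch.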

SECOND APPROACH

• THE SECOND METHOD FOR REDUCING RESOURCE USAGE IS TO LIMIT CLUSTER MEMBERSHIP.

    • FIRST COMPUTE AN APPROXIMATE CLUSTERING BY TRAINING ON A SUBSET.

    • THEN CLUSTER THE ENTIRE DATASET, MAINTAINING THE PROBABILITY THAT A POINT IS IN A CLUSTER ONLY FOR ITS TOP TWENTY CLUSTERS.

    • THE REST OF THE MEMBERSHIP PROBABILITIES ARE ASSUMED TO BE ZERO FOR THE NEXT FEW ITERATIONS.
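The membership truncation can be illustrated with a small NumPy sketch. The function name is hypothetical, and renormalizing the surviving probabilities is an assumption: the slide only states that the remaining memberships are treated as zero.

```python
import numpy as np

def keep_top_memberships(resp, top_k=20):
    """Zero all but the top_k largest cluster memberships in each row.

    resp: array of shape (n_points, n_clusters) of membership probabilities.
    """
    resp = np.asarray(resp, dtype=float)
    n_clusters = resp.shape[1]
    if top_k >= n_clusters:
        return resp.copy()
    # Column indices of the top_k largest entries in each row (unordered).
    cut = n_clusters - top_k
    top_idx = np.argpartition(resp, cut, axis=1)[:, cut:]
    truncated = np.zeros_like(resp)
    rows = np.arange(resp.shape[0])[:, None]
    truncated[rows, top_idx] = resp[rows, top_idx]
    # Assumption: renormalize so that each row still sums to one.
    return truncated / truncated.sum(axis=1, keepdims=True)
```

Storing only the retained entries in a sparse structure is where the memory saving would come from; the dense array here is kept for clarity.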

TESTING AND USING THE BASIC MODEL

• METHOD STABILITY IS TESTED BY RUNNING THE FITTING PROCESS ON THE SAME DATA SEVERAL TIMES WITH DIFFERENT INITIAL CONDITIONS, BECAUSE THE EXPECTATION MAXIMIZATION (EM) PROCESS IS SENSITIVE TO THE STARTING POINT.

• THE CLUSTERING FOUND DEPENDS MORE ON THE STARTING POINT THAN ON THE EXACT IMAGES CHOSEN FOR TRAINING.

• THE SECOND TEST IS TO VERIFY WHETHER CLUSTERING ON BOTH IMAGE AND TEXT HAS AN ADVANTAGE.

TESTING AND USING THE BASIC MODEL

THIS FIGURE SHOWS 16 IMAGES FROM A CLUSTER FOUND USING TEXT ONLY.

TESTING AND USING THE BASIC MODEL

THIS FIGURE SHOWS 16 IMAGES FROM A CLUSTER FOUND USING ONLY IMAGE FEATURES.

BROWSING

• MOST IMAGE RETRIEVAL SYSTEMS DO NOT SUPPORT BROWSING.

• THEY FORCE THE USER TO SPECIFY A QUERY.

• THE ISSUE IS WHETHER THE CLUSTERS FOUND THROUGH BROWSING MAKE SENSE TO THE USER.

• IF THE USER FINDS THE CLUSTERS COHERENT, THEY CAN BEGIN TO INTERNALIZE THE KIND OF STRUCTURE THE CLUSTERS REPRESENT.

BROWSING

• USER STUDY

    • GENERATE 64 CLUSTERS FOR 3000 IMAGES.

    • GENERATE 64 RANDOM CLUSTERS FROM THE SAME IMAGES.

    • PRESENT A RANDOMLY CHOSEN CLUSTER TO THE USER AND ASK THEM TO RATE ITS COHERENCE (YES/NO).

    • 94% ACCURACY.

IMAGE SEARCH

• SUPPLY A COMBINATION OF TEXT AND IMAGE FEATURES.

• APPROACH: COMPUTE, FOR EACH CANDIDATE IMAGE, THE PROBABILITY OF EMITTING THE QUERY ITEMS.

• Q = SET OF QUERY ITEMS, D = CANDIDATE DOCUMENT.
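A hedged sketch of this scoring step, assuming the hierarchical model's quantities are already available as arrays. The array layout and the name `query_log_score` are illustrative; `cluster_weight` stands for whatever weight the model assigns to each cluster for document D, which the transcript does not spell out.

```python
import numpy as np

def query_log_score(cluster_weight, level_weight, item_prob):
    """log of sum_c w_c * prod_q sum_l P(q | l, c) * P(l | c, d).

    cluster_weight: shape (C,), weight of each cluster for document d
    level_weight:   shape (C, L), P(l | c, d)
    item_prob:      shape (C, L, Q), P(q | l, c) for each query item q
    """
    cluster_weight = np.asarray(cluster_weight, dtype=float)
    level_weight = np.asarray(level_weight, dtype=float)
    item_prob = np.asarray(item_prob, dtype=float)
    # Mix over levels for every (cluster, query item) pair: shape (C, Q).
    per_item = np.einsum("cl,clq->cq", level_weight, item_prob)
    # Work in log space so long queries do not underflow.
    log_per_cluster = np.log(cluster_weight) + np.log(per_item).sum(axis=1)
    m = log_per_cluster.max()
    return m + np.log(np.exp(log_per_cluster - m).sum())
```

Candidate images would then be ranked by this score, highest first.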

IMAGE SEARCH

THE FIGURE SHOWS THE RESULTS OF THE “RIVER” AND “TIGER” QUERY.

IMAGE SEARCH

• SECOND APPROACH

    • FIND THE PROBABILITY THAT EACH CLUSTER GENERATES THE QUERY, THEN SAMPLE ACCORDING TO THE WEIGHTED CLUSTERS.

    • BECAUSE CLUSTER MEMBERSHIP PLAYS AN IMPORTANT ROLE IN GENERATING DOCUMENTS, THIS IS REASONABLE TO THE EXTENT THAT THE CLUSTERS ARE COHERENT.
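One way to read the second approach is sketched below. Weighting each cluster by its prior times the query likelihood is an assumption, as are the function and argument names; the slide only says that clusters are weighted by the probability of generating the query.

```python
import numpy as np

def sample_clusters_for_query(cluster_prior, query_lik, n_samples, seed=0):
    """Sample cluster indices in proportion to P(c) * P(Q | c).

    cluster_prior: shape (C,), P(c)
    query_lik:     shape (C,), P(Q | c) for the current query
    """
    weights = np.asarray(cluster_prior, dtype=float) * np.asarray(query_lik, dtype=float)
    weights = weights / weights.sum()       # normalize to a distribution over clusters
    rng = np.random.default_rng(seed)
    samples = rng.choice(len(weights), size=n_samples, p=weights)
    return weights, samples
```

Result images would then be drawn from the sampled clusters, so clusters that explain the query well contribute more of the results.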

IMAGE SEARCH

• PROVIDING A MORE FLEXIBLE METHOD OF SPECIFYING IMAGE FEATURES IS AN IMPORTANT NEXT STEP.

• THIS HAS BEEN EXPLORED IN MANY “QUERY BY EXAMPLE” IMAGE RETRIEVAL SYSTEMS.

• EXAMPLE: WE CAN QUERY FOR A DOG WITH THE WORD “DOG”, AND IF WE ALSO WANT BLUE SKY WE CAN GET IT BY ADDING AN IMAGE SEGMENT FEATURE TO THE QUERY.

PICTURES FROM WORDS AND WORDS FROM PICTURES

• THERE ARE TWO APPROACHES FOR LINKING WORDS TO PICTURES AND PICTURES TO WORDS:

    • AUTO ILLUSTRATION

    • AUTO ANNOTATION

AUTO ILLUSTRATION

• “AUTO ILLUSTRATION” IS THE PROCESS OF LINKING PICTURES TO WORDS.

• GIVEN A SET OF QUERY ITEMS Q AND A CANDIDATE DOCUMENT D, WE CAN EXPRESS THE PROBABILITY THAT THE DOCUMENT PRODUCES THE QUERY AS P(Q | D).
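The formula for this probability is missing from the transcript. One form that follows from the law of total probability over clusters, together with the model's assumption that items are emitted independently given the cluster, is sketched below. Whether the slide wrote the cluster weight as P(c | d) or in another form cannot be recovered from the transcript, so that term is an assumption.

```latex
P(Q \mid d) = \sum_{c} P(c \mid d) \prod_{q \in Q} \left[ \sum_{l} P(q \mid l, c)\, P(l \mid c, d) \right]
```

For auto illustration, Q would hold the words of the text to be illustrated, and the candidate images are ranked by this probability.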

AUTO ANNOTATION

• GENERATE WORDS FOR A GIVEN IMAGE

    • CONSIDER THE PROBABILITY OF THE IMAGE BELONGING TO THE CURRENT CLUSTER.

    • CONSIDER THE PROBABILITY OF THE ITEMS IN THE IMAGE BEING GENERATED BY THE NODES AT VARIOUS LEVELS IN THE PATH ASSOCIATED WITH THE CLUSTER.

    • WORK THE ABOVE OUT FOR ALL CLUSTERS.

AUTO ANNOTATION

WE COMPUTE P(W | B), THE PROBABILITY THAT AN IMAGE EMITS A PROPOSED WORD W GIVEN THE OBSERVED SEGMENTS B.
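The slide's formula is not in the transcript. A minimal sketch of the computation, using Bayes' rule and the assumption that words and segments are independent given the cluster, is shown below. The array layout and the function name are illustrative, and level weights that depend only on the cluster (not on a training document) are assumed for a new image.

```python
import numpy as np

def word_posterior(cluster_prior, level_weight, segment_prob, word_prob):
    """P(w | B), proportional to sum_c P(c) * P(B | c) * P(w | c).

    cluster_prior: shape (C,), P(c)
    level_weight:  shape (C, L), P(l | c)
    segment_prob:  shape (C, L, S), P(b | l, c) for each observed segment b
    word_prob:     shape (C, L, W), P(w | l, c) for each vocabulary word w
    """
    cluster_prior = np.asarray(cluster_prior, dtype=float)
    level_weight = np.asarray(level_weight, dtype=float)
    segment_prob = np.asarray(segment_prob, dtype=float)
    word_prob = np.asarray(word_prob, dtype=float)
    # P(B | c): each segment mixes over levels, then segments multiply.
    seg_given_c = np.einsum("cl,cls->cs", level_weight, segment_prob).prod(axis=1)
    # P(w | c): mix the word emission over levels.
    word_given_c = np.einsum("cl,clw->cw", level_weight, word_prob)
    scores = (cluster_prior * seg_given_c) @ word_given_c
    return scores / scores.sum()
```

Sorting the vocabulary by the returned probabilities gives predicted words in rank order.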

AUTO ANNOTATION

THE FIGURE SHOWS SOME ANNOTATION RESULTS: THE ORIGINAL IMAGE, THE BLOBWORLD SEGMENTATION, THE COREL KEYWORDS, AND THE PREDICTED WORDS IN RANK ORDER.

AUTO ANNOTATION

• THE TEST IMAGES WERE NOT IN THE TRAINING SET, BUT THEY COME FROM THE SAME SET OF CDS USED FOR TRAINING.

• THE KEYWORDS IN UPPER-CASE ARE IN THE VOCABULARY.

AUTO ANNOTATION

• TESTING THE ANNOTATION PROCEDURE:

    • WE USE THE MODEL TO PREDICT THE IMAGE WORDS BASED ONLY ON THE SEGMENTS, THEN COMPARE THE PREDICTED WORDS WITH THE ACTUAL KEYWORDS.

    • WE PERFORM THE TEST ON THE TRAINING DATA AND ON TWO DIFFERENT TEST SETS:

        • 1ST SET: A RANDOMLY SELECTED HELD-OUT SET FROM THE PROPOSED TRAINING DATA, COMING FROM THE COREL CDS.

        • 2ND SET: IMAGES FROM OTHER CDS.

AUTO ANNOTATION

• QUANTITATIVE PERFORMANCE

    • USE 160 COREL CDS, EACH WITH 100 IMAGES (GROUPED BY THEME).

    • SELECT 80 OF THE CDS AND SPLIT THEM INTO TRAINING (75%) AND TEST (25%) SETS.

    • THE REMAINING 80 CDS ARE A “HARDER” TEST SET.

    • MODEL SCORING: N = NUMBER OF WORDS FOR THE IMAGE, R = NUMBER OF WORDS PREDICTED CORRECTLY.
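The slide defines only N and R, so the exact scoring formulas are not recoverable here. As one simple illustration, the sketch below predicts as many words as the image has keywords and reports the fraction R / N that are correct. Both that ratio and the function name are assumptions, and this covers only one plausible measure, not the three scoring methods mentioned in the results.

```python
def annotation_score(predicted_ranked, keywords):
    """Fraction r / n of an image's n keywords found among the top n predictions.

    predicted_ranked: predicted words, best first
    keywords:         the image's actual keywords
    """
    actual = set(keywords)
    n = len(actual)
    if n == 0:
        raise ValueError("image has no keywords to score against")
    # r counts the actual keywords that appear in the top n predictions.
    r = len(actual.intersection(predicted_ranked[:n]))
    return r / n
```

For example, with keywords tiger, grass and cat and top three predictions sky, water and tiger, one of three keywords is recovered.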

RESULTS

ANNOTATION RESULTS ON THREE KINDS OF TEST DATA, WITH THREE DIFFERENT SCORING METHODS.

RESULTS

• THE ABOVE TABLE SUMMARIZES THE ANNOTATION RESULTS USING THE THREE SCORING METHODS AND THE THREE HELD-OUT SETS.

• WE AVERAGE THE RESULTS OF 5 SEPARATE RUNS WITH DIFFERENT HELD-OUT SETS.

• COMPARED WITH SAMPLING FROM THE WORD PRIOR, WE SCORE 3.14 ON THE TRAINING DATA, 2.70 ON NON-TRAINING DATA FROM THE SAME CD SET AS THE TRAINING DATA, AND 1.65 ON TEST DATA TAKEN FROM A COMPLETELY DIFFERENT SET OF CDS.

DISCUSSION

• THE PERFORMANCE OF THE SYSTEM CAN BE MEASURED BY TAKING ADVANTAGE OF ITS PREDICTIVE CAPABILITIES.

• WORDS WITH NO RELEVANCE TO VISUAL CONTENT CAUSE RANDOM NOISE BY TAKING PROBABILITY AWAY FROM MORE RELEVANT WORDS.

• SUCH WORDS CAN BE IDENTIFIED, AND THEN REMOVED, BY OBSERVING THAT THEIR EMISSION PROBABILITIES ARE SPREAD OUT OVER THE NODES.

• HOW WELL THIS AUTOMATIC VOCABULARY REDUCTION METHOD WORKS DEPENDS ON THE NATURE OF THE DATA SET.
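One way to make "spread out over the nodes" concrete is sketched below. Using entropy as the measure of spread, the 0.9 threshold, and the function name are all assumptions for illustration; the slides do not specify a criterion.

```python
import numpy as np

def spread_out_words(emission, vocabulary, threshold=0.9):
    """Flag words whose emission probability is spread evenly over the nodes.

    emission:  shape (n_nodes, n_words), P(word | node)
    threshold: fraction of the maximum entropy log(n_nodes) at or above
               which a word is flagged
    """
    emission = np.asarray(emission, dtype=float)
    # Normalize each word's column into a distribution over nodes.
    over_nodes = emission / emission.sum(axis=0, keepdims=True)
    safe = np.where(over_nodes > 0, over_nodes, 1.0)   # log(1) = 0 for empty cells
    entropy = -(over_nodes * np.log(safe)).sum(axis=0)
    max_entropy = np.log(emission.shape[0])
    return [w for w, h in zip(vocabulary, entropy) if h >= threshold * max_entropy]
```

A word emitted equally by every node reaches the maximum entropy and is flagged, while a word concentrated in a few nodes is kept.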

REFERENCES

• KOBUS BARNARD AND DAVID FORSYTH, “LEARNING SEMANTICS OF WORDS AND PICTURES”, COMPUTER SCIENCE DIVISION, UNIVERSITY OF CALIFORNIA, BERKELEY.

HTTP://WWW.WISDOM.WEIZMANN.AC.IL/~VISION/COURSES/2003_2/BARNARD00LEARNING.PDF

• C. CARSON, S. BELONGIE, H. GREENSPAN AND J. MALIK, “BLOBWORLD: IMAGE SEGMENTATION USING EXPECTATION MAXIMIZATION AND ITS APPLICATION TO IMAGE QUERYING”, IN REVIEW.

HTTP://WWW.CS.BERKELEY.EDU/~MALIK/PAPERS/CBGM-BLOBWORLD.PDF

QUERIES

THANK YOU