Exploiting User Interaction and Object Candidates for Instance Retrieval and Object Segmentation

1. Exploiting User Interaction and Object Candidates for Instance Retrieval and Object Segmentation. Amaia Salvador Aguilera. Advisors: Kevin McGuinness, Xavier Giró

2. Motivation: Object segmentation; Instance search
3. Motivation: From rectangles to regions. Exhaustive search; Regions; Object proposals
4. Motivation: From rectangles to regions. Exhaustive search; Regions; Object proposals. Arbeláez, P., Pont-Tuset, J., Barron, J. T., Marques, F., & Malik, J. (2014). Multiscale Combinatorial Grouping. CVPR.
5. Motivation
6. Motivation
7. Motivation
8. Outline: Motivation; Interactive Object Segmentation (Segmentation algorithm, User Interface, Experiments and Results, Conclusions); Instance Retrieval
9. Interactive Object Segmentation. Amaia Salvador, Xavier Giró, Axel Carlier, Vincent Charvillat, Oge Marques. Collaboration sponsored by a scholarship from the Spanish Ministry
10. Outline: Motivation; Interactive Object Segmentation (Segmentation algorithm, User Interface, Experiments and Results, Conclusions); Instance Retrieval
11. Object Candidates. Algorithm 1: Finding the best mask. Limitation: sometimes there is no mask with no errors!
12. Object Candidates. Algorithm 2: Combination of masks. Union of masks containing at least one foreground point and no conflicting background points
13. Outline: Motivation; Interactive Object Segmentation (Segmentation algorithm, User Interface, Experiments and Results, Conclusions); Instance Retrieval
14. Click'n'Cut
15. Outline: Motivation; Interactive Object Segmentation (Segmentation algorithm, User Interface, Experiments and Results, Conclusions); Instance Retrieval
16. Dataset: 100 objects from the BSDS500 dataset with their corresponding binary masks; 5 control images from Pascal VOC2012 (for gold standard test)
17. Click'n'Cut and Object Candidates: Crowd users; Expert volunteers
18. Jaccard Index vs. Time. (*) BPT and GrabCut curves show the results obtained by McGuinness, K., & O'Connor, N. E. in "A comparative evaluation of interactive segmentation algorithms", Pattern Recognition, 43(2), 434-444. Compared against: McGuinness (BPT), McGuinness (GrabCut)
19. Outline: Motivation; Interactive Object Segmentation (Segmentation algorithm, User Interface, Experiments and Results, Conclusions); Instance Retrieval
20. Conclusions
21. Conclusions: e.g. GrabCut
22. Conclusions
23. Outline: Motivation; Interactive Object Segmentation; Instance Retrieval (Framework, User interface and Relevance Feedback, Experiments and Results, Conclusions)
24. TRECVID Instance Search 2014. Amaia Salvador, Xavier Giró, Carles Ventura, Eva Mohedano, Kevin McGuinness, Noel O'Connor. Contribution sponsored by the Catalan Government
25. Outline: Motivation; Interactive Object Segmentation; Instance Retrieval (Framework, User interface, Experiments and Results, Conclusions)
26. Framework: query / query set (query images, masks); Representation; target database; Representation; Feature Matching; Rank List (ID, score)
27. Framework: query / query set (query images, masks); Representation; target database; Representation; Feature Matching; Rank List (ID, score)
28. Query Set
29. Framework: query / query set (query images, masks); Representation; target database; Representation; Feature Matching; Rank List (ID, score)
30. Target database: Collection of 244 videos from BBC EastEnders
31. Target database: Full dataset, 647,628 keyframes; Ground Truth Subset, 23,614 keyframes
32. Framework: query / query set (query images, masks); Representation; target database; Representation; Feature Matching; Rank List (ID, score)
33. Convolutional Neural Networks. Caffe: a CNN implementation. http://caffe.berkeleyvision.org/ Figure source: Babenko, A., Slesarev, A., Chigorin, A., & Lempitsky, V. (2014). Neural Codes for Image Retrieval. arXiv preprint arXiv:1404.1777.
34. Framework: query / query set (query images, masks); Representation; target database; Representation; Feature Matching; Rank List (ID, score)
35. Outline: Motivation; Interactive Object Segmentation; Instance Retrieval (Framework, User interface, Experiments and Results, Conclusions)
36. User Interface
37. Outline: Motivation; Interactive Object Segmentation; Instance Retrieval (Framework, User interface, Experiments and Results, Conclusions)
38. 1. Local Features: Global CNN features are nice, but... what if we use binary masks to compute local CNN features? Problem: we don't have binary masks for the target keyframes!
39. Local Features: Object Candidates. Arbeláez, P., Pont-Tuset, J., Barron, J. T., Marques, F., & Malik, J. (2014). Multiscale Combinatorial Grouping. CVPR.
40. Local features: Global only; Local only (crop); Local only (square); Local only (square + padding). Results on toy subset
41. 2. Combination of features: Results on Ground Truth Dataset with N = 20
42. 2. Combination of features. Problems: ~100 object candidates/frame; ~600,000 keyframes
43. 2. Combination of features. Solution:
44. 3. Relevance Feedback: Simulation of users' interaction with the UI using the Ground Truth Subset. Annotations from the ranking are used with Relevance Feedback techniques, taking different percentages of the ranking.
45. Ranking from annotations: Positive annotations are used to create the new ranking
46. Ranking from annotations: Add results to the ranking by re-querying using positive annotations
47. Ranking from annotations: Linear SVM with positive/negative annotations
48. 4. Performance differences. 9074: a cigarette; 9075: a Vodka bottle; 9096: this woman. Bad performance for topics containing small objects or people
49. 4. Performance differences. 9090: this wooden bench; 9097: these spheres; 9081: a black taxi. Good performance for topics for which context is important
50. Outline: Motivation; Interactive Object Segmentation; Instance Retrieval (Framework, User interface, Experiments and Results, Conclusions)
51. Conclusions
52. Conclusions
53. General Conclusions: Object Segmentation; Instance Search
54. Contributions: Object Segmentation; Instance Search
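
The mask-combination rule on slide 12 (Algorithm 2) and the Jaccard index plotted on slide 18 can be illustrated with a minimal sketch. It assumes each object candidate is a set of (row, col) pixels and that user clicks are pixel coordinates; the names `combine_masks` and `jaccard` are hypothetical and are not taken from the Click'n'Cut code.

```python
from typing import Iterable, Set, Tuple

Pixel = Tuple[int, int]  # (row, col); an assumed representation


def combine_masks(candidates: Iterable[Set[Pixel]],
                  foreground: Iterable[Pixel],
                  background: Iterable[Pixel]) -> Set[Pixel]:
    """Union of candidates that contain at least one foreground click
    and no background click (the rule stated on slide 12)."""
    fg, bg = set(foreground), set(background)
    combined: Set[Pixel] = set()
    for mask in candidates:
        if mask & fg and not (mask & bg):
            combined |= mask
    return combined


def jaccard(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection over union of two pixel sets."""
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty masks match
    return len(a & b) / len(union)


# Toy example with four hypothetical candidates.
candidates = [
    {(0, 0), (0, 1)},  # has a foreground click, no background click: kept
    {(0, 1), (1, 1)},  # has a foreground click, no background click: kept
    {(1, 1), (2, 2)},  # has a background click: rejected
    {(3, 3)},          # has no foreground click: rejected
]
result = combine_masks(candidates, [(0, 0), (1, 1)], [(2, 2)])
ground_truth = {(0, 0), (0, 1), (1, 1), (1, 0)}
score = jaccard(result, ground_truth)  # 3 shared pixels / 4 in the union = 0.75
```

Sets keep the sketch short; an implementation working on real images would more likely use boolean arrays, but the selection rule would be the same.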
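
The retrieval framework on slide 26 (feature matching producing a ranked list of ID and score) and the re-querying step on slide 46 can be sketched as follows. The slides do not state the similarity measure or how a query set is fused, so cosine similarity and a maximum over query images are assumptions made for this example, and `cosine`, `rank` and `requery_with_positives` are hypothetical names.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(queries: List[Vector],
         database: Dict[str, Vector]) -> List[Tuple[str, float]]:
    """Score every keyframe by its best match over the query set
    (assumed fusion rule) and sort by descending score."""
    scored = [(kf_id, max(cosine(q, feat) for q in queries))
              for kf_id, feat in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def requery_with_positives(queries: List[Vector],
                           database: Dict[str, Vector],
                           positive_ids: List[str]) -> List[Tuple[str, float]]:
    """Re-rank after adding the features of positively annotated
    keyframes to the query set (one reading of slide 46)."""
    expanded = list(queries) + [database[i] for i in positive_ids]
    return rank(expanded, database)


# Toy database of 2-D features standing in for CNN descriptors.
database = {
    "kf1": [1.0, 0.0],
    "kf2": [0.0, 1.0],
    "kf3": [3.0, 4.0],
    "kf4": [-1.0, 0.0],
}
initial = rank([[1.0, 0.0]], database)
refined = requery_with_positives([[1.0, 0.0]], database, ["kf3"])
```

In this toy example, marking `kf3` as positive raises the score of `kf2`, which resembles `kf3` but not the original query. That is the intended effect of re-querying: annotated positives pull in results the original query images miss.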
