Katerina Fragkiadaki, Jianbo Shi
Figure-Ground Image Segmentation Helps Weakly-Supervised Learning of Objects

Problem
Input: an image collection containing a common object.
Output: models for segmenting the common object and its background.
Main challenge: large variations of the common object — features do not repeat consistently!

Previous Work
Generative models:
• Topic models
• Hierarchical representations: suspicious coincidence
• Co-segmentation
Discriminative models:
• MIL-Boost / MIL-SVM on segments or patches (require a negative image collection)
• Recently: discriminative clustering

Our Approach
Co-occurrences alone are not sufficient — add image saliency. Use both the figure-ground saliency of each single image and feature co-occurrence across the image set to discover the common object.

Figure-ground saliency of an image set from figure-ground saliency of single images
The algorithm alternates between a model update and a figure-ground update, which samples segment figure-ground labels FG for the input image collection. Saliency values sal are computed given the segmentation of each map.

This is a flexible representation of image figure-ground: each map captures a different object, and co-occurrence chooses the right one. Segment figure-ground labels constrain the co-occurrence model. Multiple segmentations are kept per image.

Model parameters:
• φ1: foreground bag-of-words model
• φ2 … φT: background bag-of-words models
• M: foreground shape model
Observed variables:
• w: visual words
• sal: single-image figure-ground saliency
Latent variables:
• ρ ∈ [0,1]: figure-ground map score given the image set
• FG ∈ {0,1}: segment figure-ground label given the image set
• z: segment topic

• Encode figure-ground given the image set as multiple figure-ground maps with a probability distribution ρ over them. ρ depends both on single-image figure-ground saliency and on the co-occurrence model.
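The figure-ground update above can be sketched in code. This is a minimal illustration, not the authors' implementation: `maps` is a hypothetical per-image list of soft figure-ground maps (each giving P(FG=1) per segment under one candidate segmentation), and `rho` holds the map scores. Segment labels FG are drawn from FGsoft, the highest-scoring map.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_fg_labels(maps, rho):
    """Sample binary segment figure-ground labels FG for one image.

    maps : list of (n_segments,) arrays; each array is a soft
           figure-ground map giving P(FG=1) per segment under one
           candidate segmentation of the image (hypothetical format)
    rho  : (n_maps,) scores of the maps given the image set
    """
    fg_soft = maps[int(np.argmax(rho))]  # FGsoft: the top-scoring map
    # Bernoulli draw per segment from the soft map.
    return (rng.random(fg_soft.shape) < fg_soft).astype(int)
```

With `rho = [0.3, 0.7]` the second map is chosen as FGsoft, so its segment probabilities drive the sampled labels.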
• Segment figure-ground labels FG are sampled from FGsoft, the map with the highest score ρ. Irrelevant foreground objects are suppressed by the co-occurrence model.

Figure-ground shape-aware model
Image representation: bag of segments. The image figure-ground changes through the update of the scores ρ of the figure-ground maps.

[Graphical models: a standard topic model (words w, segment topics z, topic proportions θ with prior λ, word distributions φ1 … φT with prior β, plates over N_I images, N_S segments, N_W words) shown next to the figure-ground aware model, which adds the map scores ρ, segment labels FG, the shape model M, and the observed single-image saliency sal. Figure-ground given a single image is observed; figure-ground given the image set is latent.]

We maximize a conditional likelihood: discrimination without a negative image set. The foreground model consists of the word model φ1 and the shape model M; the background models are the word models φ2 … φT.

Figure-ground organization: multiple figure-ground maps
[Illustration: initially each map score equals its single-image saliency, e.g. ρ = sal for maps with sal = 0.8, 0.2, 0.6, 0.4. The co-occurrence model then rescores the figure-ground maps, e.g. 0.6 → 0.3 and 0.4 → 0.7 — the maps switched score order!]

Algorithm
Assumption: salient image regions often capture the common object.
Learning: Gibbs sampling — sample segment figure-ground labels FG and segment topics z ∈ t2 … tT.

Experiments
Datasets used:
• Caltech 101: 81 images of Airplanes
• MSRC: 70 images of Cars, 84 images of Cows
• ETH: 48 images of Bottles, 29 images of Swans, 85 images of Giraffes
• Weizmann Horses: 80 images
In each dataset, 2/3 of the images are used for training and 1/3 for testing.

Two representations tested:
• sFGmodel: shape + bag-of-words model (full model)
• bagFGmodel: bag-of-words model (no shape at test time, only during learning)

[Results figure: test-time segmentations, ours vs. baseline.]
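The map-rescoring step can be sketched as follows. This is a simplified stand-in, not the paper's exact update: single-image saliency and a co-occurrence term (here a hypothetical `fg_loglik`, the log-likelihood of each map's foreground segments under the shared foreground word model φ1) are combined multiplicatively and normalized into ρ, so a map whose foreground matches the common object can overtake a more salient map.

```python
import numpy as np

def rescore_maps(sal, fg_loglik):
    """Rescore one image's figure-ground maps.

    sal       : (n_maps,) single-image saliency of each map
    fg_loglik : (n_maps,) log-likelihood of each map's foreground
                segments under the shared foreground model phi_1
                (stand-in for the co-occurrence model)
    Returns rho, a normalized distribution over the maps.
    """
    # Combine the two cues multiplicatively, i.e. additively in log space.
    score = np.log(sal) + fg_loglik
    rho = np.exp(score - score.max())  # subtract max for stability
    return rho / rho.sum()
```

For example, a map with saliency 0.6 whose foreground fits φ1 poorly can end up with lower ρ than a map with saliency 0.4 whose foreground fits φ1 well — reproducing the score-order switch illustrated on the poster.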