Adaptive Confidence Smoothing for Generalized Zero-Shot Learning
Yuval Atzmon 1,2, Gal Chechik 1,2
1 Bar-Ilan University, Israel; 2 NVIDIA Research
Paper, code and video: http://bit.ly/COSMO123

Zero-Shot Learning (ZSL)
In ZSL we learn new (unseen) classes from a description, without any visual examples.
Could you recognize a Jackalope? “A Jackalope is a rabbit with horns.”
Train: Seen classes
• Rabbit: rodent-shape with long ears
• Puku: antelope with ridge-structured horns
Test: Unseen classes
• Jackalope: rabbit with horns
• Saiga: antelope with bloated nostrils

Generalized Zero-Shot Learning (GZSL)
In GZSL, at test time, we can see an image either from a seen class or from a new unseen class.
Test: Unseen or Seen classes (Jackalope, Saiga, Rabbit, Puku)

Datasets (S/U = seen/unseen classes)
• CUB: Fine-grained bird recognition, ~12K images, 150/50 S/U classes
• AWA: Animal recognition, ~37K images, 40/10 S/U classes
• SUN: Visual scenes, ~14K images, 645/72 S/U classes

Idea 1: Soft combination of experts
Break the model into domain experts:
(1) Seen-classes expert
(2) Unseen-classes expert (a ZSL model)
(3) Gater: S/U classifier
Notation: y = class, S = Seen, U = Unseen.
Inspired by the dual-route reading model in cognitive psychology: in reading, once we encounter an unfamiliar word, we compose it from syllables. Example: “I’m cooking tonight and you can rely on me to absquatulate the moment it’s done.”

The full model: adaptive COnfidence SMOothing (COSMO)
Step 1: Experts
Step 2: Gater
Step 3: Smoothing
Step 4: Combine
The gater is “aware” of the experts’ responses.
[Diagram labels: Image, Seen expert]

Results
Our approach (COSMO, blue) outperforms previous non-GAN approaches (triangles) and generative approaches (crosses).
[Figure: Seen-Unseen accuracy curve, obtained by sweeping the gate decision boundary. Axes: Accuracy Seen [%], Accuracy Unseen [%]. Legend: Ideal, Generative, COSMO+LAGO (ours), LAGO (baseline), CS+LAGO, COSMO.]
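The Seen-Unseen accuracy curve is described as being produced by sweeping the gate decision boundary. A minimal sketch of that idea follows, under loudly stated assumptions: the function name `seen_unseen_curve`, the hard thresholding of the gate score, and the use of plain per-sample accuracy are all illustrative choices made here, not details taken from the poster.

```python
import numpy as np

def seen_unseen_curve(p_seen, pred_seen_expert, pred_unseen_expert,
                      labels, is_seen, thresholds):
    """Trace (threshold, acc_seen, acc_unseen) points for a hard gate.

    Assumption (not from the poster): a sample is routed to the seen
    expert when its gate score p_seen >= threshold, otherwise to the
    unseen expert. Accuracy is plain per-sample accuracy.
    """
    p_seen = np.asarray(p_seen, dtype=float)
    pred_seen_expert = np.asarray(pred_seen_expert)
    pred_unseen_expert = np.asarray(pred_unseen_expert)
    labels = np.asarray(labels)
    is_seen = np.asarray(is_seen, dtype=bool)

    curve = []
    for t in thresholds:
        # Pick, per sample, which expert's prediction to trust.
        use_seen = p_seen >= t
        pred = np.where(use_seen, pred_seen_expert, pred_unseen_expert)
        correct = pred == labels
        acc_seen = float(correct[is_seen].mean())      # accuracy on seen-class samples
        acc_unseen = float(correct[~is_seen].mean())   # accuracy on unseen-class samples
        curve.append((float(t), acc_seen, acc_unseen))
    return curve

if __name__ == "__main__":
    # Toy data: classes 0 and 1 are seen, classes 2 and 3 are unseen.
    points = seen_unseen_curve(
        p_seen=[0.9, 0.6, 0.4, 0.1],
        pred_seen_expert=[0, 1, 0, 1],
        pred_unseen_expert=[2, 3, 2, 3],
        labels=[0, 1, 2, 3],
        is_seen=[True, True, False, False],
        thresholds=[0.0, 0.5, 1.0],
    )
    for t, acc_s, acc_u in points:
        print(f"threshold={t:.1f}  acc_seen={acc_s:.2f}  acc_unseen={acc_u:.2f}")
```

A very low threshold sends every image to the seen expert and a very high one sends every image to the unseen expert, so the sweep moves between the two ends of the curve.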
Standard ZSL models fail in GZSL due to
• Spurious correlations
• Domain adaptation
• Extremely imbalanced data

Three benchmark datasets: CUB, AWA and SUN.

Idea 2: Smooth over-confident experts
Images outside the domain of an expert usually produce over-confident predictions. Instead, all classes should have uniformly low probabilities, since they are all “equally wrong”.
Our solution: use a uniform prior during inference, with adaptive weight λ = p(S) = 1 - p(U), set by the gater's belief.
[Figure labels: Over-confident prediction, uniform prior, Desired smooth prediction]

Takeaways
• GZSL requires robustness across Seen/Unseen domains.
• COSMO softly combines domain experts and smooths their predictions to address over-confident experts.
• With COSMO, standard ZSL classifiers can outperform generative classifiers.

Participation in CVPR is supported by the Israeli Ministry of Science.

LAGO zero-shot expert (Atzmon, 2018)
Learns new attribute compositions with a differentiable AND-OR architecture.
[Diagram label: Unseen (ZSL) expert]

Confidence-Based Gating (Gater: S/U Classifier)
The gater is trained to discriminate the response of experts to seen and unseen images. It uses Top-K pooling to achieve invariance to the order and identity of input classes.
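The smoothing, combination and Top-K pooling ideas on this poster can be sketched in a few lines of NumPy. This is an illustrative sketch under assumptions, not the authors' implementation: the helper names (`smooth`, `cosmo_combine`, `top_k_features`) are hypothetical, the gate belief p(S) is passed in as a given number rather than learned, and weighting the unseen expert by p(U) is assumed here as the mirror image of the stated weight λ = p(S) = 1 - p(U).

```python
import numpy as np

def smooth(p_expert, weight):
    """Mix an expert's class distribution with a uniform prior.

    weight = 1 keeps the expert's prediction as is; weight = 0 replaces
    it with uniformly low probabilities ("equally wrong" classes).
    """
    p_expert = np.asarray(p_expert, dtype=float)
    uniform = np.full_like(p_expert, 1.0 / p_expert.size)
    return weight * p_expert + (1.0 - weight) * uniform

def cosmo_combine(p_seen_expert, p_unseen_expert, p_s):
    """Soft combination of two smoothed experts (hypothetical helper).

    Assumption: the seen expert is smoothed with weight p(S) and the
    unseen expert with weight p(U) = 1 - p(S); each expert is then
    weighted by the gater's belief in its domain.
    """
    p_u = 1.0 - p_s
    seen = smooth(p_seen_expert, p_s)
    unseen = smooth(p_unseen_expert, p_u)
    # Joint distribution over [seen classes, unseen classes]; sums to 1.
    return np.concatenate([p_s * seen, p_u * unseen])

def top_k_features(p_expert, k):
    """Top-K pooling: the k largest confidences, sorted in descending order.

    Sorting discards which class produced each score, so the features do
    not depend on the order or identity of the classes.
    """
    p_sorted = np.sort(np.asarray(p_expert, dtype=float))[::-1]
    return p_sorted[:k]

if __name__ == "__main__":
    p_joint = cosmo_combine([0.9, 0.1], [0.2, 0.8], p_s=0.5)
    print(p_joint, p_joint.sum())
    print(top_k_features([0.1, 0.6, 0.3], k=2))
```

In a full system the gater would presumably be a small classifier trained on such pooled features from both experts; here its output is simply supplied as `p_s`.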