
Message from the General and Program Chairs


Welcome to Santa Rosa, CA, and the 17th edition of the Winter Conference on Applications of Computer Vision (WACV), jointly sponsored by the IEEE Computer Society and the IEEE Biometrics Council. WACV is the premier outlet for research advances in applications of computer vision technology.

WACV 2017 spans four days, with a three-day, two-track, core program in which authors will present each accepted paper as a short oral and a poster. In addition, we have keynote talks and social functions, as well as several co-located events, including three workshops, two tutorials, a Ph.D. forum, and demo sessions. Following last year’s conference, WACV 2017 adopted a two-track core program, with two parallel oral sessions, each with 5-minute talks.

We used the Conference Management Toolkit (CMT) provided by Microsoft Research to manage the submission and selection of papers. To select papers for the program, we invited 27 researchers to act as Area Chairs (ACs) and recruited 275 experienced reviewers from the broader computer vision community. We received 320 original, unpublished full-paper submissions to the main conference. The Program Chairs (PCs) assigned the papers to the ACs, who made recommendations for reviewers. All papers were reviewed by a minimum of three reviewers. Papers by PCs and General Chairs (GCs) were handled so as to avoid conflicts of interest, and the ACs were excluded from any decisions associated with papers from their research groups, affiliated institutions, or collaborators. After the reviews were received, authors were offered an opportunity to rebut. The ACs made initial recommendations based on the reviews, rebuttals, and reviewer discussions. In a few cases, the PCs discussed papers with the ACs to arrive at a final decision. Of the 320 full papers submitted, 144 high-quality papers were accepted to be part of the final program (a 45% acceptance rate).

The proceedings of WACV 2017 are provided online before, during, and after the conference to all registered attendees. Like last year, there will not be USB proceedings, so participants are encouraged to download the proceedings before the conference. All papers in the main conference and associated workshops will be made available through the IEEE Computer Society Digital Library and IEEE Xplore.

The main conference also includes three keynote speakers: Dr. Richard Szeliski from Facebook & Univ. of Washington, Prof. Marc Pollefeys from Microsoft Research & ETH Zurich, and Prof. Tamara Berg from Shopagon Inc. & UNC-Chapel Hill.

We wish to thank all members of the Organizing Committee, the Area Chairs, reviewers, authors, and the CMT for the immense amount of hard work and professionalism that went into making WACV 2017 a first-rate conference on the applications of computer vision. Our thanks also go to the organizers of past WACV meetings and the steering committee for their helpful advice and support.

We are grateful to our Silver Sponsors, Cognex and Kitware, and our Bronze Sponsors, Adobe, Disney Research, Amazon, Verisk Analytics, and Google, for their generous support.

Finally, we invite the attendees to be Sonomads for a few days and enjoy Sonoma County’s art, wine, and coffee.

Gérard Medioni, David Michael, Sudeep Sarkar (General Co-Chairs)

Michael S. Brown, Rogério Feris, Conrad Sanderson, Matthew Turk (Program Co-Chairs)

Organizing Committee & Area Chairs


WACV 2017 Organizing Committee

General Chairs: Gérard Medioni, Sudeep Sarkar, David Michael

Program Chairs: Michael S. Brown, Conrad Sanderson, Matthew Turk, Rogério Feris

Steering Committee: Anthony Hoogs, Bryan Morse, Terrance Boult, Bir Bhanu, Fatih Porikli

Workshops Chair: Jiwen Lu

Tutorials Chair: Xiaoming Liu

Finance Chair: Terrance Boult

Publications Chairs: Eric Mortensen, Revathy Narasimhan

Web Chair: Fillipe Souza

Demos Chair: Tal Hassner

PhD Forum Chair: Song Wang

Publicity Chair: Ajay Kumar

WACV 2017 Area Chairs

Teofilo de Campos, Liangliang Cao, Peter Carr, Kristin Dana, Victor Fragoso, Danna Gurari, Bohyung Han, Mehrtash Harandi, Tal Hassner, Wong Yong Kang, Seon Joo Kim, Adriana Kovashka, Laura Leal-Taixé, Mohammad Mahoor, Scott McCloskey, Chris McCool, Vlad Morariu, Fatih Porikli, Andrea Prati, Brian Price, Behjat Siddiquie, Kevin Smith, Matt Turek, Xiaoyu Wang, Arnold Wiliem, Guoying Zhao, Wenyi Zhao

Monday, March 27 Program


Sunday, March 26

1900–2100 Registration (Alexander Valley Foyer)

Monday, March 27


0850–0900 Welcome by the General Chairs (Dry Creek Valley & Russian River Valley)

0900–1000 Oral 1A: Segmentation, Tracking (Dry Creek Valley)

Chair: Larry Davis (Univ. of Maryland)

Format (5 min. short presentation)

1. Deep Salient Object Detection by Integrating Multi-Level Cues, Jing Zhang, Yuchao Dai, Fatih Porikli

2. Multi-Planar Fitting in an Indoor Manhattan World, Seongdo Kim, Roberto Manduchi

3. Universal Skin Detection Without Color Information, Abhijit Sarkar, Amos Lynn Abbott, Zachary Doerzaph

4. Recurrent Fully Convolutional Networks for Video Segmentation, Sepehr Valipour, Mennatullah Siam, Martin Jagersand, Nilanjan Ray

5. Learning Spatial Transforms for Refining Object Segment Proposals, Haoyang Zhang, Xuming He, Fatih Porikli

6. Repeated Pattern Detection Using CNN Activations, Louis Lettry, Michal Perdoch, Kenneth Vanhoey, Luc Van Gool

7. Deep Context Modeling for Semantic Segmentation, Kien Thanh, Clinton Fookes, Sridha Sridharan

8. 3D Semantic Segmentation of Modular Furniture Using rjMCMC, Ishrat Badami, Manu Tom, Markus Mathias, Bastian Leibe

9. PASCAL Boundaries: A Semantic Boundary Dataset With a Deep Semantic Boundary Detector, Vittal Premachandran, Boyan Bonev, Xiaochen Lian, Alan Yuille

10. Can Affordances Guide Object Decomposition Into Semantically Meaningful Parts?, Safoura Rezapour Lakani, Antonio J. Rodriguez-Sanchez, Justus Piater

11. Solving Occlusion Problem in Pedestrian Detection by Constructing Discriminative Part Layers, Cong Cao, Yu Wang, Jien Kato, Guanwen Zhang, Kenji Mase

12. Unifying Registration Based Tracking: A Case Study With Structural Similarity, Abhineet Singh, Mennatullah Siam, Martin Jagersand

0900–1000 Oral 1B: Action Recognition (Russian River Valley)

Chair: François Brémond (INRIA Sophia Antipolis)


1. Deep Moving Poselets for Video Based Action Recognition, Effrosyni Mavroudi, Lingling Tao, René Vidal

2. First-Person Action Decomposition and Zero-Shot Learning, Yun C. Zhang, Yin Li, James M. Rehg

3. Higher-Order Pooling of CNN Features via Kernel Linearization for Action Recognition, Anoop Cherian, Piotr Koniusz, Stephen Gould

4. Semi-Coupled Two-Stream Fusion ConvNets for Action Recognition at Extremely Low Resolutions, Jiawei Chen, Jonathan Wu, Janusz Konrad, Prakash Ishwar

5. On Geometric Features for Skeleton-Based Action Recognition Using Multilayer LSTM Networks, Songyang Zhang, Xiaoming Liu, Jun Xiao

6. Real-Time Online Action Detection Forests Using Spatio-Temporal Contexts, Seungryul Baek, Kwang In Kim, Tae-Kyun Kim

7. Ordered Pooling of Optical Flow Sequences for Action Recognition, Jue Wang, Anoop Cherian, Fatih Porikli

8. Two Stream LSTM: A Deep Fusion Framework for Human Action Recognition, Harshala Gammulle, Simon Denman, Sridha Sridharan, Clinton Fookes

9. Multi-Camera Action Dataset for Cross-Camera Action Recognition Benchmarking, Wenhui Li, Yongkang Wong, An-An Liu, Yang Li, Yu-Ting Su, Mohan Kankanhalli

10. Efficient Action Detection in Untrimmed Videos via Multi-Task Learning, Yi Zhu, Shawn Newsam

11. Learning Discriminative Features via Label Consistent Neural Network, Zhuolin Jiang, Yaming Wang, Larry Davis, Walter Andrews, Viktor Rozgic

12. Recognition of Group Activities in Videos Based on Single- and Two-Person Descriptors, Stéphane Lathuilière, Georgios Evangelidis, Radu Horaud

1000–1045 Morning Break (Alexander Valley)

1045–1150 Oral 2A: Computational Photography, 3D Modeling, Remote Sensing, Gesture (Dry Creek Valley)

Chair: Larry Davis (Univ. of Maryland)


1. Quantitative Analysis of Automatic Image Cropping Algorithms: A Dataset and Comparative Study, Yi-Ling Chen, Tzu-Wei Huang, Kai-Han Chang, Yu-Chen Tsai, Hwann-Tzong Chen, Bing-Yu Chen

2. Joint Regression and Ranking for Image Enhancement, Parag Shridhar Chandakkar, Baoxin Li

3. Material Classification Under Natural Illumination Using Reflectance Maps, Stamatios Georgoulis, Vincent Vanweddingen, Marc Proesmans, Luc Van Gool

4. Dense Batch Non-Rigid Structure From Motion in a Second, Vladislav Golyanik, Didier Stricker

5. Global Model With Local Interpretation for Dynamic Shape Reconstruction, Antonio Agudo, Francesc Moreno-Noguer

6. Occlusions Are Fleeting - Texture Is Forever: Moving Past Brightness Constancy, Christopher Ham, Surya Singh, Simon Lucey

7. Accurate 3D Reconstruction of Dynamic Scenes From Monocular Image Sequences With Severe Occlusions, Vladislav Golyanik, Torben Fetzer, Didier Stricker

8. Patchwork Stereo: Scalable, Structure-Aware 3D Reconstruction in Man-Made Environments, Amine Bourki, Martin de La Gorce, Renaud Marlet, Nikos Komodakis

9. Calibration Technique for Underwater Active Oneshot Scanning System With Static Pattern Projector and Multiple Cameras, Hiroshi Kawasaki, Hideaki Nakai, Hirohisa Baba, Ryusuke Sagawa, Ryo Furukawa

10. Fast Deep Vehicle Detection in Aerial Images, Lars Wilko Sommer, Tobias Schuchert, Jürgen Beyerer

11. Beyond Spatial Auto-Regressive Models: Predicting Housing Prices With Satellite Imagery, Archith J. Bency, Swati Rallapalli, Raghu K. Ganti, Mudhakar Srivatsa, B. S. Manjunath

12. Robust Hand Gestural Interaction for Smartphone Based AR/VR Applications, Shreyash Mohatta, Ramakrishna Perla, Gaurav Gupta, Ehtesham Hassan, Ramya Hebbalaguppe

13. Spatial-Temporal Motion Field Analysis for Pixelwise Crack Detection on Concrete Surfaces, Subhajit Chaudhury, Gaku Nakano, Jun Takada, Akihiko Iketani

1045–1150 Oral 2B: Scene Understanding, Motion Processing (Russian River Valley)

Chair: François Brémond (INRIA Sophia Antipolis)


1. 2-Line Exhaustive Searching for Real-Time Vanishing Point Estimation in Manhattan World, Xiaohu Lu, Jian Yao, Haoang Li, Yahui Liu, Xiaofeng Zhang

2. Pano2CAD: Room Layout From a Single Panorama Image, Jiu Xu, Björn Stenger, Tommi Kerola, Tony Tung

3. A Multi-View RGB-D Approach for Human Pose Estimation in Operating Rooms, Abdolrahim Kadkhodamohammadi, Afshin Gangi, Michel de Mathelin, Nicolas Padoy

4. Real Estate Image Classification, Jawadul Hasan Bappy, Joseph R. Barr, Nani Narayanan Srinivasan, Amit K. Roy-Chowdhury

5. Learn How to Choose: Independent Detectors Versus Composite Visual Phrase, Guy Rosenthal, Ariel Shamir, Leonid Sigal

6. Temporal Robust Features for Violence Detection, Daniel Moreira, Sandra Avila, Mauricio Perez, Daniel Moraes, Vanessa Testoni, Eduardo Valle, Siome Goldenstein, Anderson Rocha

7. SAMP: Shape and Motion Priors for 4D Vehicle Reconstruction, Francis Engelmann, Jörg Stückler, Bastian Leibe

8. Predicting the Perceptual Demands of Urban Driving With Video Regression, Luke Palmer, Alina Bialkowski, Gabriel J. Brostow, Jonas Ambeck-Madsen, Nilli Lavie

9. Optimal Threshold and LoG Based Feature Identification and Tracking of Bat Flapping Flight, Yousi Lin, Yang Xu, Hui Chen, Matthew J. Bender, Amos Lynn Abbott, Rolf Müller

10. Fast Semi Dense Epipolar Flow Estimation, Matthieu Garrigues, Antoine Manzanera

11. Global Consistency Priors for Joint Part-Based Object Tracking and Image Segmentation, Oliver Müller, Bodo Rosenhahn

12. Joint Epipolar Tracking (JET): Simultaneous Optimization of Epipolar Geometry and Feature Correspondences, Henry Bradler, Matthias Ochs, Nolang Fanani, Rudolf Mester

13. Computing Egomotion With Local Loop Closures for Egocentric Videos, Suvam Patra, Himanshu Aggarwal, Himani Arora, Subhashis Banerjee, Chetan Arora

1150–1300 Lunch (On your own)


1300–1500 Tutorial (details on next page)

1700–1730 Afternoon Break (Alexander Valley)

1730–1830 Keynote Session (Dry Creek Valley)

Keynote Talk: 3D Reconstruction for Image-Based Rendering, Richard Szeliski (Facebook & Univ. of Washington)

Abstract: The reconstruction of 3D scenes and their appearance from imagery is one of the longest-standing problems in computer vision. Originally developed to support robotics and artificial intelligence applications, it has found some of its most widespread use in the support of interactive 3D scene visualization. One of the keys to this success has been the melding of 3D geometric and photometric reconstruction with a heavy re-use of the original imagery, which produces more realistic rendering than a pure 3D model-driven approach. In this talk, I give a retrospective of two decades of research in this area, touching on topics such as sparse and dense 3D reconstruction, the fundamental concepts in image-based rendering and computational photography, applications to virtual reality, as well as ongoing research in the areas of layered decompositions and 3D-enabled video stabilization.

1830–1930 Dinner (Alexander Valley)

1930–2130 Demos (Alexander Valley)

Instant Immersion of Brands in Videos, Brunno Attorre, Bill Marino (Uru, Inc.)

Visual Intelligence for Fashion, Jayaguru Panda, Naveen Sinha, Labhesh Patel (Abzooba India InfoTech)

Regression of 3D Morphable Face Models Using a Deep CNN, Anh Tuan Tran (Univ. of Southern California)

1930–2130 Exhibits (Alexander Valley)

Amazon • Zillow

1930–2130 Poster Session 1 (Alexander Valley)

Posters for Oral Sessions 1A, 1B, 2A, and 2B.

1930–2130 PhD Forum 1 (Alexander Valley)

Ph.D. Forum Presenters:

1. Unaiza Ahsan

2. Daniel Hernández

3. Yanyang Gu

4. Arun CS Kumar

5. Julius Schöning

6. Tomas Hodan

7. Chi-Hao Wu

8. Jiaping Zhao



Tutorial: Local 3D Vision, Multiview Geometry, Video Tracking and Visual Servoing for Robot Arm and Hand Motion Control

Organizers: Martin Jagersand, Mona Gridseth, Abhineet Singh, Mennatullah Siam, Camilo Perez, Oscar Ramirez

Time: 1300–1500

Location: Russian River Valley

Description: Robot vision is significantly different from general computer vision. While general vision often aims to reconstruct the whole 3D model or identify all objects, guiding robot motion towards a target typically requires tracking only one or a few specific features. Research on human hand-eye coordination shows that when solving arm and hand manipulation tasks we acquire minimal representations of very specific information rather than a global scene model. For robot vision, minimal but provably sufficient representations can be built from individual projective geometry constraints. Computationally, these constraints are formulated on tracked geometric features (points, lines, regions, etc.) and solved through visual servoing. The tutorial covers the uncalibrated formulation of multi-view geometry, video tracking, and visual servoing, and shows how to use several ROS software packages to design robot vision systems.

Notes:

Tuesday, March 28 Program


Tuesday, March 28


0850–0900 Announcements (Dry Creek Valley & Russian River Valley)

0900–1000 Oral 3A: Statistical Methods, Object Recognition (Dry Creek Valley)

Chair: Scott McCloskey (Honeywell)


1. Cyclical Learning Rates for Training Neural Networks, Leslie N. Smith

2. Guaranteed Parameter Estimation for Discrete Energy Minimization, Mengtian Li, Daniel Huber

3. Solving Robust Regularization Problems Using Iteratively Re-Weighted Least Squares, Khurrum Aftab Kiani, Tom Drummond

4. Detecting Social Insects in Videos With Spatiotemporal Regularization, N. Rich Nguyen, Min C. Shin

5. From Affine Rank Minimization Solution to Sparse Modeling, Iman Abbasnejad, Sridha Sridharan, Simon Denman, Clinton Fookes, Simon Lucey

6. Learning Attributes From Human Gaze, Nils Murrugarra-Llerena, Adriana Kovashka

7. Multi-Task Curriculum Transfer Deep Learning of Clothing Attributes, Qi Dong, Shaogang Gong, Xiatian Zhu

8. Deep Learning Logo Detection With Data Expansion by Synthesising Context, Hang Su, Xiatian Zhu, Shaogang Gong

9. Boosted Convolutional Neural Networks (BCNN) for Pedestrian Detection, Chi-Hao Wu, Weihao Gan, De Lan, C.-C. Jay Kuo

10. Improved Deep Learning of Object Category Using Pose Information, Jiaping Zhao, Laurent Itti

11. Learning to Recognize Objects by Retaining Other Factors of Variation, Jiaping Zhao, Chin-kai Chang, Laurent Itti

12. Artistic Movement Recognition by Boosted Fusion of Color Structure and Topographic Description, Corneliu Florea, Cosmin Ţoca, Fabian Gieske

0900–1000 Oral 3B: Security, Vision for Aerial, Multimedia (Russian River Valley)

Chair: Nicolas Padoy (Univ. of Strasbourg)


1. Plug-And-Play CNN for Crowd Motion Analysis: An Application to Abnormal Event Detection, Mahdyar Ravanbakhsh, Moin Nabi, Hossein Mousavi, Enver Sangineto, Nicu Sebe

2. Deep Heterogeneous Feature Fusion for Template-Based Face Recognition, Navaneeth Bodla, Jingxiao Zheng, Hongyu Xu, Jun-Cheng Chen, Carlos Castillo, Rama Chellappa

3. Integrated Global-Local Metric Learning for Person Re-Identification, Jing Zhang, Xu Zhao

4. Multi-Shot Person Re-Identification Using Part Appearance Mixture, Furqan M. Khan, François Brémond

5. Active Online Anomaly Detection Using Dirichlet Process Mixture Model and Gaussian Process Classification, Jagannadan Varadarajan, Ramanathan Subramanian, Narendra Ahuja, Pierre Moulin, Jean-Marc Odobez

6. Flowdometry: An Optical Flow and Deep Learning Based Approach to Visual Odometry, Peter Muller, Andreas Savakis

7. PCA Based Computation of Illumination-Invariant Space for Road Detection, Taeyoung Kim, Yu-Wing Tai, Sung-Eui Yoon

8. Road Detection Using Convolutional Neural Networks, Aparajit Narayan, Elio Tuci, Frédéric Labrosse, Muhanad H. Mohammed Alkilabi

9. Providing Video Annotations in Multimedia Containers for Visualization and Research, Julius Schöning, Patrick Faion, Gunther Heidemann, Ulf Krumnack

10. Detecting Sexually Provocative Images, Debashis Ganguly, Mohammad Hasanzadeh Mofrad, Adriana Kovashka

11. Complex Event Recognition From Images With Few Training Examples, Unaiza Ahsan, Chen Sun, James Hays, Irfan Essa

12. High Level Concepts for Affective Understanding of Images, Afsheen Rafaqat Ali, Usman Shahid, Mohsen Ali, Jeffrey Ho




1045–1145 Oral 4A: Vision Systems (Dry Creek Valley)

Chair: Scott McCloskey (Honeywell)


1. Assessment of Peanut Pod Maturity, Ekta Bindlish, Amos Lynn Abbott, Maria Balota

2. X-Ray Scattering Image Classification Using Deep Learning, Boyu Wang, Kevin Yager, Dantong Yu, Minh Nguyen

3. A Deep Learning Frame-Work for Recognizing Developmental Disorders, Pushkar Shukla, Tanu Gupta, Aradhya Saini, Priyanka Singh, Raman Balasubramanian

4. When Was That Made?, Sirion Vittayakorn, Alexander C. Berg, Tamara L. Berg

5. Telecom Inventory Management via Object Recognition and Localisation on Google Street View Images, Ramya Hebbalaguppe, Gaurav Garg, Ehtesham Hassan, Hiranmay Ghosh, Ankit Verma

6. Deep Object Ranking for Template Matching, Jean-Philippe Mercier, Ludovic Trottier, Philippe Giguère, Brahim Chaib-draa

7. A Deep Learning Paradigm for Detection of Harmful Algal Blooms, Arun CS Kumar, Suchendra M. Bhandarkar

8. Crime Mapping From Satellite Imagery via Deep Learning, Alameen Najjar, Shun’ichi Kaneko, Yoshikazu Miyanaga

9. Robust Road Marking Detection and Recognition Using Density-Based Grouping and Machine Learning Techniques, Oleksandr Bailo, Seokju Lee, Francois Rameau, Jae Shin Yoon, In So Kweon

10. Beacon-Guided Structure From Motion for Smartphone-Based Navigation, Tatsuya Ishihara, Jayakorn Vongkulbhisal, Kris M. Kitani, Chieko Asakawa

11. Hardware-Centric Vision Processing for Mobile IoT Environment Exploiting Approximate Graph Cut in Resistor Grid, Yeongjae Choi, Jun-Seok Park, Lee-Sup Kim

12. Exploring Local Context for Multi-Target Tracking in Wide Area Aerial Surveillance, Bor-Jeng Chen, Gérard Medioni

1045–1145 Oral 4B: Medical, Vision for Graphics & Robotics, Open Source API (Russian River Valley)

Chair: Nicolas Padoy (Univ. of Strasbourg)


1. Melanoma Detection Based on Mahalanobis Distance Learning and Constrained Graph Regularized Nonnegative Matrix Factorization, Yanyang Gu, Jun Zhou, Bin Qian

2. Size and Texture-Based Classification of Lung Tumors With 3D CNNs, ZhiHao Luo, Marcus A. Brubaker, Michael Brudno

3. 3D-Brain Segmentation Using Deep Neural Network and Gaussian Mixture Model, Duy M. Nguyen, Huy T. Vu, Huy Q. Ung, Binh T. Nguyen

4. Ultrasound Tracking Using ProbeSight: Camera Pose Estimation Relative to External Anatomy by Inverse Rendering of a Prior High-Resolution 3D Surface Map, Jihang Wang, Chengqian Che, John Galeotti, Samantha Horvath, Vijay Gorantla, George Stetten

5. Center-Focusing Multi-Task CNN With Injected Features for Classification of Glioma Nuclear Images, Veda Murthy, Le Hou, Dimitris Samaras, Tahsin M. Kurc, Joel H. Saltz

6. Densification of Semi-Dense Reconstructions for Novel View Generation of Live Scenes, Domagoj Baričević, Tobias Höllerer, Matthew Turk

7. Texture Attribute Synthesis and Transfer Using Feed-Forward CNNs, Thomas Irmer, Tobias Glasmachers, Subhransu Maji

8. A Statistical Approach to Continuous Self-Calibrating Eye Gaze Tracking for Head-Mounted Virtual Reality Systems, Subarna Tripathi, Brian Guenter

9. Sparse Dictionary Learning for Identifying Grasp Locations, Ludovic Trottier, Philippe Giguère, Brahim Chaib-draa

10. T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-Less Objects, Tomáš Hodaň, Pavel Haluza, Štěpán Obdržálek, Jiri Matas, Manolis Lourakis, Xenophon Zabulis

11. Gaussian Mixture Models for Temporal Depth Fusion, Cevahir Cigla, Roland Brockers, Larry Matthies

12. An Open-Source Platform for Underwater Image and Video Analytics, Matthew Dawkins, Linus Sherrill, Keith Fieldhouse, Anthony Hoogs, Benjamin Richards, David Zhang, Lakshman Prasad, Kresimir Williams, Nathan Lauffenburger, Gaoang Wang



1300–1700 Tutorial (details on next page)



Keynote Talk: Computer Vision for Mixed Reality, Marc Pollefeys (Microsoft Research & ETH Zurich)

Abstract: This is a golden age for computer vision. Research breakthroughs are leaving the lab and getting into users’ hands in record time. Computer vision now plays a pivotal role in many advances benefitting society, such as autonomous vehicles, improved biometric security, and medical imaging. But out of all these innovations, one stands out as having the potential to completely upend how we access information and communicate with each other: mixed reality. Spurred by recent developments in SLAM, 3D reconstruction, gesture recognition, scene understanding, and power-efficient embedded computing, we’re already experiencing it in the form of groundbreaking products like Microsoft HoloLens. In this talk I will present some of the key computer vision components that are essential for enabling compelling mixed reality experiences on HoloLens and also discuss some of the unique features that HoloLens offers as an experimental platform for computer vision researchers.

1830–1845 Best Paper Awards (Dry Creek Valley)







Amazon • Zillow



1930–2130 PhD Forum 2 (Alexander Valley)

Ph.D. Forum Presenters:

1. Haoang Li

2. Yi Zhu

3. Archith John Bency

4. Nguyen Van Dinh

5. Shagan Sah

6. Jiawei Chen

7. Abhijit Sarkar

Notes:



Tutorial: Understanding the In-Camera Image Processing Pipeline for Computer Vision

Organizer: Michael S. Brown

Time: 1300–1700


Description: Image processing and computer vision algorithms often treat a camera as a light measurement device, where pixel intensities represent meaningful physical measurements of the imaged scene. However, modern digital cameras are anything but light measuring devices, with a wide range of on-board processing, including scene relighting (dynamic light optimization), white balance, and various color rendering options (e.g., landscape, portrait, vivid). This on-board processing is often how camera manufacturers distinguish themselves from competitors, resulting in two different cameras producing noticeably different output images (sRGB) for the same scene. This raises the question of whether meaningful values can be obtained from camera objects. This tutorial will give an overview of the basics of color theory and the camera imaging pipeline, discussing various methods that have addressed how to reverse this processing to obtain meaningful physical values from digital photographs.

Notes:

Wednesday, March 29 Program


Wednesday, March 29


0850–0900 Announcements (Dry Creek Valley & Russian River Valley)

0900–1000 Oral 5A: Object Recognition 2, Large Scale Systems (Dry Creek Valley)

Chair: Terry Boult (Univ. of Colorado Colorado Springs)


1. Describing Unseen Classes by Exemplars: Zero-Shot Learning Using Grouped Simile Ensemble, Yang Long, Ling Shao

2. Deep Multi-Modal Vehicle Detection in Aerial ISR Imagery, Wesam Sakla, Goran Konjevod, T. Nathan Mundhenk

3. Subcategory-Aware Convolutional Neural Networks for Object Proposals and Detection, Yu Xiang, Wongun Choi, Yuanqing Lin, Silvio Savarese

4. StuffNet: Using ‘Stuff’ to Improve Object Detection, Samarth Manoj Brahmbhatt, Henrik I. Christensen, James Hays

5. Towards Fine-Grained Open Zero-Shot Learning: Inferring Unseen Visual Features From Attributes, Yang Long, Li Liu, Ling Shao

6. Fused DNN: A Deep Neural Network Fusion Approach to Fast and Robust Pedestrian Detection, Xianzhi Du, Mostafa El-Khamy, Jungwon Lee, Larry Davis

7. Fast Pedestrian Detection via Random Projection Features With Shape Prior, Yun Zhao, Zejian Yuan, Dapeng Chen, Jie Lyu, Tie Liu

8. Enriched Deep Recurrent Visual Attention Model for Multiple Object Recognition, Artsiom Ablavatski, Shijian Lu, Jianfei Cai

9. Box Refinement: Object Proposal Enhancement and Pruning, Siyang Li, Heming Zhang, Junting Zhang, Yuzhuo Ren, C.-C. Jay Kuo

10. Semantic Text Summarization of Long Videos, Shagan Sah, Sourabh Kulhare, Allison Gray, Subhashini Venugopalan, Emily Prud'hommeaux, Raymond Ptucha

11. Unsupervised Joint Mining of Deep Features and Image Labels for Large-Scale Radiology Image Annotation and Scene Recognition, Xiaosong Wang, Le Lu, Hoo-chang Shin, Lauren Kim, Mohammadhadi Bagheri, Isabella Nogues, Jianhua Yao, Ronald M. Summers

0900–0955 Oral 5B: Industrial Inspection, VR & AR, Stereo, Evaluation (Russian River Valley)

Chair: Tom Drummond (Monash Univ.)


1. Probabilistic Surface Inference for Industrial Inspection Planning, Mahsa Mohammadikaji, Stephan Bergmann, Stephan Irgenfried, Jürgen Beyerer, Carsten Dachsbacher, Heinz Wörn

2. Spatio-Temporal Anomaly Detection for Industrial Robots Through Prediction in Unsupervised Feature Space, Asim Munawar, Phongtharin Vinayavekhin, Giovanni De Magistris

3. Automatic Defect Recognition in X-Ray Testing Using Computer Vision, Domingo Mery, Carlos Arteta

4. X-Ray PoseNet: 6 DoF Pose Estimation for Mobile X-Ray Devices, Mai Bui, Shadi Albarqouni, Michael Schrapp, Nassir Navab, Slobodan Ilic

5. Crack Segmentation by Leveraging Multiple Frames of Varying Illumination, Stephen J. Schmugge, Lance Rice, John Lindberg, Robert Grizzi, Chris Joffe, Min C. Shin

6. GPU-Accelerated Real-Time Stixel Computation, Daniel Hernandez-Juarez, Antonio Espinosa, Juan Carlos Moure, David Vázquez, Antonio López

7. Model-Driven Simulations for Computer Vision, VSR Veeravasarapu, Constantin Rothkopf, Ramesh Visvanathan

8. Automatic Calibration of a Multiple-Projector Spherical Fish Tank VR Display, Qian Zhou, Gregor Miller, Kai Wu, Daniela Correa, Sidney Fels

9. Transfer Learning and Deep Feature Extraction for Planktonic Image Data Sets, Eric C. Orenstein, Oscar Beijbom

10. Fast and Robust Eyelid Outline and Aperture Detection in Real-World Scenarios, Wolfgang Fuhl, Thiago Santini, Enkelejda Kasneci

11. On Crater Verification Using Mislocalized Crater Regions, Ebrahim Emami, George Bebis, Ara Nefian, Terry Fong


1045–1145 Oral 6A: Face Processing, Biometrics, Image Compression, HCI (Dry Creek Valley)

Chair: Terry Boult (Univ. of Colorado Colorado Springs)


1. Robust 3D Patch-Based Face Hallucination, Chengchao Qu, Christian Herrmann, Eduardo Monari, Tobias Schuchert, Jürgen Beyerer

2. Dictionary Alignment for Low-Resolution and Heterogeneous Face Recognition, Sivaram Prasad Mudunuri, Soma Biswas

3. Pose-Robust Face Verification by Exploiting Competing Tasks, Boyu Lu, Jingxiao Zheng, Jun-Cheng Chen, Rama Chellappa

4. Deep Feature Consistent Variational Autoencoder, Xianxu Hou, Linlin Shen, Ke Sun, Guoping Qiu

5. Egocentric Height Estimation, Jessica Finocchiaro, Aisha Urooj Khan, Ali Borji

6. Gender-From-Iris or Gender-From-Mascara?, Andrey Kuehlkamp, Benedict Becker, Kevin Bowyer

7. ContlensNet: Robust Iris Contact Lens Detection Using Deep Convolutional Neural Networks, Raghavendra Ramachandra, Kiran B. Raja, Christoph Busch

8. Breathing Rate Monitoring During Sleep From a Depth Camera Under Real-Life Conditions, Manuel Martinez, Rainer Stiefelhagen

9. Writer Identification in Noisy Handwritten Documents, Karl Ni, Patrick Callier, Bradley Hatch

10. Image Set Classification Using Sparse Bayesian Regression, Mohammed E. Fathy, Rama Chellappa

11. Bandwidth Limited Object Recognition in High Resolution Imagery, Laura Lopez-Fuentes, Andrew D. Bagdanov, Joost van de Weijer, Harald Skinnemoen

12. Personalized Image Aesthetic Quality Assessment by Joint Regression and Ranking, Kayoung Park, Seunghoon Hong, Mooyeol Baek, Bohyung Han

1045–1140 Oral 6B: Human Motion, Image Indexing, Vision Systems (Russian River Valley)

Chair: Tom Drummond (Monash Univ.)


1. Deep Spatio-Temporal Features for Multimodal Emotion Recognition, Dung Nguyen, Kien Thanh, Sridha Sridharan, Afsane Ghasemi, David Dean, Clinton Fookes

2. Human Pose Estimation Using Deep Structure Guided Learning, Baole Ai, Yu Zhou, Yao Yu, Sidan Du

3. Switching Linear Inverse-Regression Model for Tracking Head Pose, Vincent Drouard, Silèye Ba, Radu Horaud

4. Deep Image Set Hashing, Jie Feng, Svebor Karaman, Shih-Fu Chang

5. Learning Effective Binary Descriptors via Cross Entropy, Liu Liu, Hairong Qi

6. Convolutional Sparse and Low-Rank Coding-Based Rain Streak Removal, He Zhang, Vishal M. Patel

7. Fast, Accurate, Small-Scale 3D Scene Capture Using a Low-Cost Depth Sensor, Nicole Carey, Justin Werfel, Radhika Nagpal

8. Who Moved My Cheese? Automatic Annotation of Rodent Behaviors With Convolutional Neural Networks, Zhongzheng Ren, Adriana Noronha, Annie Vogel Ciernia, Yong Jae Lee

9. Temporally Coded Illumination for Rolling Shutter Motion De-Blurring, Scott McCloskey, Sharath Venkatesha

10. Text-Edge-Box: An Object Proposal Approach for Scene Texts Localization, Dinh Nguyen, Lu Shijian, Nizar Ouarti, Mounir Mokhtari

11. Distance Penalization and Fusion for Person Re-Identification, Behzad Mirmahboub, Mohamed Lamine Mekhalfi, Vittorio Murino







Keynote Talk: Image Description & Beyond..., Tamara Berg (Shopagon Inc. & UNC-Chapel Hill)

Abstract: Much of everyday language and discourse concerns the visual world around us, making understanding the relationship between the physical world and language describing that world an important challenge problem for AI. Comprehending the complex and subtle interplay between the visual and linguistic domains will have broad applicability toward inferring human-like understanding of images, producing natural human-robot interactions, and grounding natural language. In computer vision, along with improvements in deep learning based visual recognition, there has been an explosion of recent interest in methods to automatically generate natural language outputs for images and videos. In this talk I will describe our group's efforts to understand and produce relevant natural language about images, from developing early methods to generate complete and human-like image descriptions, to modeling how people interpret and describe image content, to moving beyond general image descriptions toward more focused natural language, such as referring expressions and question-answering.

1830–1845 Closing Remarks (Dry Creek Valley)







Amazon • Zillow




Thursday, March 30 Workshops





Automated Analysis of Video Data for Wildlife Surveillance

Organizers: Benjamin Richards, Anthony Hoogs, David Kriegman


Schedule: Full Day

0900 Overview of NOAA Fisheries Strategic Initiative on Automated Image Analysis, Benjamin Richards

1000 VIAME: Open-Source Software for Video and Image Analysis in the Marine Environment, Anthony Hoogs

1030 Morning Break

1045 Deep Learning for All: Managing and Analyzing Underwater and Remote Sensing Imagery on the Web Using BisQue, Dmitry V. Fedorov, Kristian G. Kvilekval, B. S. Manjunath, Brandon M. Doheny, Sarah R. Sampson, Robert J. Miller

1115 TBA

1145 TBA

1215 Lunch (On your own)

1315 TBA

1345 TBA

1415 TBA

1445 TBA

Large-Scale Soft Biometrics

Organizers: Yongxin Ge, Xin Feng, Xiuzhuang Zhou, Li Geng

Location: Dry Creek Valley I

Schedule: Half Day (Morning)

0900 Opening Remarks

0910 Soft Biometrics in Online Social Networks: A Case Study on Twitter User Gender Recognition, Li Geng, Ke Zhang, Xinzhou Wei, Xin Feng

0930 Online Cost Efficient Customer Recognition System for Retail Analytics, Yilin Song, Yuanyi Xue, Chenge Li, Xuan Zhao, Sixuan Liu, Xiaona Zhuo, Kangjin Zhang, Bo Yan, Xiaoran Ning, Yao Wang, Xin Feng

0950 An Intelligent Building Occupancy Detection System Based on Sparse Auto-Encoder, Zhi Liu, Jie Zhang, Li Geng

1010 Morning Break

1030 Automatic Video Annotation System for Archival Sports Video, Yuanyi Xue, Yilin Song, Chenge Li, An-ti Chiang, Xiaoran Ning

1050 A Phase Field Variational Model With Arctangent Regularization for Saliency Detection, Meng Li, Xing Liu, Liming Tang

1110 A Variable Exponent p-Laplace Variational Model Preserving Texture for Image Interpolation, Zhan Yi, Yongxin Ge

1130 Conclusions & Future Work


Human Activity Analysis With Highly Diverse Cameras

Organizers: Hideo Saito, Yoichi Sato, Ryo Yonetani, Yuko Ozasa, Kris M. Kitani

Location: Dry Creek Valley II

Schedule: Half Day (Morning)

0900 Opening Remarks

S1: Keynote Session 1 (0910-0940)

0910 Keynote Talk: Activity Recognition From Persons' Viewpoint and Robots' Viewpoint, Michael S. Ryoo (Indiana Univ. Bloomington)

S2: Paper Session 1 (0940-1025)

0940 Measuring Grasp Posture Using an Embedded Camera, Naoaki Kashiwagi, Yuta Sugiura, Natsuki Miyata, Mitsunori Tada, Maki Sugimoto, Hideo Saito

0955 Gaze Estimation Based on Eyeball-Head Dynamics, Ikuhisa Mitsugami, Yamato Okinaka, Yasushi Yagi

1010 Speaker Identification Based on Integrated Face Direction in a Group Conversation, Naoto Ienaga, Yuko Ozasa, Hideo Saito

1025 Break


1040 Keynote Talk: Advances in Automating Analysis of Highway and Driver Video Image Data: Managing Low and Variable Image Quality, David Kuehn, Charles Fay (FHWA Exploratory Advanced Research (EAR) Program)

S4: Paper Session 2 (1110-1155)

1110 Action Recognition in Still Images Using Word Embeddings From Natural Language Descriptions, Karan Sharma, Arun CS Kumar, Suchendra M. Bhandarkar

1125 Investigation of Customer Behavior Analysis Based on Top-View Depth Camera, Junpei Yamamoto, Katsufumi Inoue, Michifumi Yoshioka

1140 Measurement of Eyeball Rotational Movements in the Dark Environment, Kiyoshi Hoshino, Nayuta Ono


1155 Keynote Talk: Weakly-Supervised Activity Localization, Yong-Jae Lee (Univ. of California, Davis)

1225 Closing Remarks
