@DocXavi Module 3 - Lecture 10 Deep Convnets for Video Processing 28 January 2016 Xavier Giró-i-Nieto [http://pagines.uab.cat/mcv/ ]
Jan 12, 2017
DocXavi
Module 3 - Lecture 10
Deep Convnets for Video Processing
28 January 2016
Xavier Giró-i-Nieto
[http://pagines.uab.cat/mcv/]
Acknowledgments
Linked slides
Motivation
Motivation
[Website]
Outline
1. Recognition
2. Optical Flow
3. Object Tracking
4. Learn more
Recognition
Demo: Clarifai
MIT Technology Review, "A start-up's Neural Network Can Understand Video" (3/2/2015)
Figure: Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014, June). Large-scale video classification with convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on (pp. 1725-1732). IEEE.
Recognition
Recognition
Figure: Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Recognition
Previous lectures with Jose M. Álvarez
Recognition
Slides extracted from the ReadCV seminar by Victor Campos
Recognition: DeepVideo
Recognition: DeepVideo: Demo
Recognition: DeepVideo: Architectures
Unsupervised learning [Le et al. '11], supervised learning [Karpathy et al. '14]
Recognition: DeepVideo: Features
Recognition: DeepVideo: Multiscale
Recognition: DeepVideo: Results
Recognition
Recognition: C3D
Recognition: C3D: Demo
Recognition: C3D: Spatial dimension
Spatial dimensions (XY) of the kernels are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015).
K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015.
Recognition: C3D: Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets.
Figure labels: temporal depth, 2D ConvNets
A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets.
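To make the 3x3x3 kernel concrete, here is a minimal pure-Python sketch of a single-channel 3D convolution (computed as a cross-correlation, as is usual in ConvNets) with stride 1 and no padding. The function name `conv3d_valid`, the toy clip and the temporal-difference kernel are illustrative assumptions and are not taken from the C3D paper or its code.

```python
def conv3d_valid(volume, kernel):
    """Naive single-channel 3D cross-correlation, stride 1, no padding.

    volume: nested list indexed [t][y][x]; kernel: nested list [dt][dy][dx].
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # The kernel spans time as well as space.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip (assumption): 5 frames of 4x4 pixels, every pixel equal to the frame index.
clip = [[[float(t)] * 4 for _ in range(4)] for t in range(5)]

# Hypothetical 3x3x3 kernel: spatial average of the last frame minus the first one.
kernel = [
    [[-1 / 9] * 3 for _ in range(3)],
    [[0.0] * 3 for _ in range(3)],
    [[1 / 9] * 3 for _ in range(3)],
]

response = conv3d_valid(clip, kernel)
print(len(response), len(response[0]), len(response[0][0]))  # output size along t, y, x
```

Because the kernel also slides along the time axis, the output keeps a temporal dimension (3 time steps here), whereas a 2D convolution applied to the stacked frames would collapse it after the first layer.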
Recognition: C3D: Temporal dimension
No gain when varying the temporal depth across layers.
Recognition: C3D: Architecture
Feature vector
Recognition: C3D: Feature vector
The video sequence is split into 16-frame clips with an 8-frame overlap.
Recognition: C3D: Feature vector
Figure: the features of the 16-frame clips are averaged into a 4096-dim video descriptor, which is then L2-normalized into the final 4096-dim video descriptor.
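The clip pooling described above can be sketched in a few lines of pure Python. The helper names `clip_starts` and `video_descriptor` are hypothetical, the per-clip features are assumed to be already extracted (4096 values per clip in C3D, 2 values in the toy example), and dropping trailing frames that do not fill a whole clip is an assumption of this sketch.

```python
import math


def clip_starts(num_frames, clip_len=16, overlap=8):
    """First-frame index of every full clip; consecutive clips share `overlap` frames."""
    stride = clip_len - overlap
    return list(range(0, num_frames - clip_len + 1, stride))


def video_descriptor(clip_features):
    """Average the per-clip feature vectors, then L2-normalize the mean."""
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / len(clip_features) for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean


starts = clip_starts(40)  # a 40-frame video
descriptor = video_descriptor([[3.0, 0.0], [3.0, 8.0]])  # two toy 2-dim clip features
print(starts, descriptor)
```

The resulting descriptor has unit L2 norm regardless of how many clips the video contains, which keeps videos of different lengths comparable for a linear classifier.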
Recognition: C3D: Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition: C3D: Compactness
Recognition: C3D: Performance
Convolutional 3D (C3D) features combined with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with state-of-the-art methods on the other 2 benchmarks.
Recognition: C3D: Software
Implementation by Michael Gygli (GitHub)
Recognition: ImageNet Video
[ILSVRC 2015 Slides and videos]
Recognition: ImageNet Video
Kai Kang et al., "Object Detection in Videos with TubeLets and Multi-Context Cues" (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow: Small vs Large
Optical Flow
Classic approach: rigid matching of HoG or SIFT descriptors.
Deep Matching: allow each subpatch to move independently in a limited range depending on its size.
Optical Flow: Deep Matching
Optical Flow: 2D correlation
Source: Matlab R2015b documentation for normxcorr2 by MathWorks
Figure labels: image, sub-image, offset of the sub-image with respect to the image [0,0]
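The idea of locating a sub-image by 2D correlation can be sketched in pure Python as follows. This is not MATLAB's `normxcorr2`: it only scores positions where the template fits entirely inside the image, and the function names `ncc_map` and `best_offset` as well as the toy image are assumptions made for illustration.

```python
import math


def ncc_map(image, template):
    """Normalized cross-correlation of `template` at every fully contained offset."""
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    scores = []
    for y in range(H - h + 1):
        row = []
        for x in range(W - w + 1):
            patch = [image[y + dy][x + dx] for dy in range(h) for dx in range(w)]
            p_mean = sum(patch) / len(patch)
            p_dev = [v - p_mean for v in patch]
            p_norm = math.sqrt(sum(d * d for d in p_dev))
            denom = p_norm * t_norm
            # Flat patches carry no structure, so they get a neutral score.
            row.append(sum(a * b for a, b in zip(p_dev, t_dev)) / denom if denom > 0 else 0.0)
        scores.append(row)
    return scores


def best_offset(scores):
    """(row, column) of the highest correlation score."""
    positions = ((y, x) for y in range(len(scores)) for x in range(len(scores[0])))
    return max(positions, key=lambda p: scores[p[0]][p[1]])


# Toy image (assumption): zeros with two bright pixels; the template is cut from it.
image = [[0.0] * 5 for _ in range(5)]
image[1][2] = 1.0
image[2][3] = 2.0
template = [row[2:4] for row in image[1:3]]

offset = best_offset(ncc_map(image, template))
print(offset)  # offset of the sub-image with respect to the image origin
```

Subtracting the means and dividing by the norms makes the score insensitive to brightness and contrast changes, so the peak marks where the pattern matches rather than where the image is simply brightest.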
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
Optical Flow: Deep Matching
Figure labels: the most discriminative response map, the least discriminative response map
Key idea: build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Optical Flow: Deep Matching
Figure: pyramid of 4x4, 8x8, 16x16 and 32x32 patches, with bottom-up extraction (BU) and top-down matching (TD).
Optical Flow: Deep Matching (BU)
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shift of the sub-patches that generated it.
Optical Flow: Deep Matching
Figure labels: ground truth, dense HOG [Brox & Malik 2011], Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., & Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow: FlowNet
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: stack both input images together and feed them through a generic network.
Option B: create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: convolution of data patches from the layers to combine.
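A minimal sketch of what such a correlation layer computes is given below, in pure Python. It is a simplification under stated assumptions: each "patch" is a single feature vector (patch size 1x1), displacements outside the map score 0, and the function name `correlation_layer` and the toy feature maps are hypothetical rather than taken from the FlowNet code.

```python
def correlation_layer(f1, f2, max_disp):
    """Correlate each feature vector of f1 with displaced feature vectors of f2.

    f1, f2: feature maps as nested lists indexed [y][x][channel].
    Returns a map [y][x][d] with (2 * max_disp + 1) ** 2 scores per position,
    ordered row-major over the displacements (dy, dx).
    """
    H, W = len(f1), len(f1[0])
    out = []
    for y in range(H):
        row = []
        for x in range(W):
            scores = []
            for dy in range(-max_disp, max_disp + 1):
                for dx in range(-max_disp, max_disp + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W:
                        # Dot product between the two feature vectors.
                        scores.append(sum(a * b for a, b in zip(f1[y][x], f2[yy][xx])))
                    else:
                        scores.append(0.0)
            row.append(scores)
        out.append(row)
    return out


# Toy feature maps (assumption): one active feature that moves one pixel to the right.
f1 = [[[0.0] for _ in range(3)] for _ in range(3)]
f2 = [[[0.0] for _ in range(3)] for _ in range(3)]
f1[1][1] = [1.0]
f2[1][2] = [1.0]

corr = correlation_layer(f1, f2, max_disp=1)
best = corr[1][1].index(max(corr[1][1]))
displacement = (best // 3 - 1, best % 3 - 1)  # (dy, dx) of the strongest match
print(displacement)
```

Unlike a regular convolution, the "filter" here is data from the second stream, so the layer has no trainable weights; limiting `max_disp` keeps the number of output channels manageable.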
Optical Flow: FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps followed by a convolution. Upconvolved feature maps are concatenated with the corresponding map from the contracting part.
Optical Flow: FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated and augmented.
ConvNets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI.
Data augmentation: translation, rotation and scaling transformations, additive Gaussian noise, and changes in brightness, contrast, gamma and color.
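The photometric part of such an augmentation can be sketched as below, for a flat list of grayscale intensities in [0, 1]. The function name `photometric_augment` and all parameter ranges are assumptions chosen for illustration, not the values used for FlowNet; geometric transformations are left out because they would also have to be applied to the ground-truth flow.

```python
import random


def photometric_augment(pixels, rng, noise_sigma=0.02, brightness=0.1,
                        contrast=(0.8, 1.2), gamma=(0.7, 1.5)):
    """Randomly perturb contrast, brightness and gamma, then add Gaussian noise."""
    b = rng.uniform(-brightness, brightness)  # additive brightness change
    c = rng.uniform(*contrast)                # multiplicative contrast change
    g = rng.uniform(*gamma)                   # gamma exponent
    out = []
    for v in pixels:
        v = (v - 0.5) * c + 0.5 + b           # contrast around mid-gray, then brightness
        v = min(max(v, 0.0), 1.0) ** g        # clamp before the power to stay real-valued
        v += rng.gauss(0.0, noise_sigma)      # additive Gaussian noise
        out.append(min(max(v, 0.0), 1.0))
    return out


rng = random.Random(0)  # fixed seed so the sketch is repeatable
augmented = photometric_augment([0.0, 0.5, 1.0], rng)
print(augmented)
```

One set of brightness, contrast and gamma parameters is drawn per call while the noise is drawn per pixel, which mimics a global change in illumination plus sensor noise.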
Optical Flow: FlowNet
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet: Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking: MDNet: Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score.
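The selection step of hard negative mining can be sketched as follows. The function name `hard_negatives` and the toy scores are hypothetical; the sketch assumes that every candidate passed in is already known to be a negative sample and that the network has assigned it a positive-class score.

```python
def hard_negatives(positive_scores, k):
    """Indices of the k negative samples that the network scores as most positive."""
    order = sorted(range(len(positive_scores)),
                   key=lambda i: positive_scores[i], reverse=True)
    return order[:k]


# Positive-class scores of four negative candidates (toy values).
scores = [0.1, 0.9, 0.4, 0.7]
selected = hard_negatives(scores, k=2)
print(selected)  # indices of the hardest negatives
```

Training on these high-scoring negatives concentrates the update on the distractors the tracker currently confuses with the target, instead of on easy background samples.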
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification.
Object tracking: FCNT: Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT: Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking: FCNT: Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Object tracking: FCNT: Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT: Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning, Taking machine learning to the next level
ReadCV seminar: friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company     Avg. salary / hour    Avg. salary / month
Yahoo       $43                   ($43 x 160 = $6,880)
Apple       $37                   ($37 x 160 = $5,920)
Google      $29.54-$31.32         $7,151
Facebook    $22.92                $6,150-$7,378
Microsoft   $22.63                $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
ConvNets: Discussion
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (formerly UPC)
Oriol Vinyals, Google (formerly UPC)
Jose M. Álvarez, NICTA (formerly URL & UAB)
Joan Bruna, Berkeley (formerly UPC)
and MANY MORE I am missing on this page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding is available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification (ChaLearn Workshop)
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
Our current research
Saliency Prediction (LSUN Challenge)
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. 'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. 'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Acknowledgments
2
Linked slides
Motivation
Motivation
[Website]
Outline
1 Recognition2 Optical Flow3 Object Tracking4 Learn more
6
Recognition
Demo Clarifai
MIT Technology Review ldquoA start-uprsquos Neural Network Can Understand Videordquo (322015)7
Figure Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE
8
Recognition
9
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
10
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Previous lectures with Jose M Aacutelvarez
11
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE
Slides extracted from ReadCV seminar by Victor Campos 12
Recognition DeepVideo
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 13
Recognition DeepVideo Demo
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 14
Recognition DeepVideo Architectures
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 15
Unsupervised learning [Le at alrsquo11] Supervised learning [Karpathy et alrsquo14]
Recognition DeepVideo Features
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 16
Recognition DeepVideo Multiscale
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 17
Recognition DeepVideo Results
18
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
19
Recognition C3D
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
20Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Demo
21K Simonyan A Zisserman Very Deep Convolutional Networks for Large-Scale Image Recognition ICLR 2015
Recognition C3D Spatial dimensionSpatial dimensions (XY) of the used kernels are fixed to 3x3 following Symonian amp Zisserman (ICLR 2015)
22Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Temporal dimension3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets
Temporal depth
2D ConvNets
23Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
A homogeneous architecture with small 3 times 3 times 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
24Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
29Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Compactness
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
Recognition C3D Software
Implementation by Michael Gygli (GitHub)
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size
Optical Flow Deep Matching
Source: Matlab R2015b documentation for normxcorr2 by MathWorks
Optical Flow 2D correlation
[Figure: an image, a sub-image, and the offset of the sub-image with respect to the image origin [0,0]]
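The 2D correlation idea can be illustrated with a brute-force normalized cross-correlation in NumPy. This is a sketch under assumptions: it only scans offsets where the sub-image fits entirely inside the image (MATLAB's normxcorr2 also returns partially overlapping offsets), and the function name is hypothetical.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Normalized 2D cross-correlation over all fully overlapping offsets."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            window = image[y:y + th, x:x + tw]
            window = window - window.mean()
            denom = np.sqrt((window ** 2).sum()) * t_norm
            out[y, x] = (window * t).sum() / denom if denom > 0 else 0.0
    return out

# Cut a sub-image out of a random image and recover its offset.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
template = image[10:18, 5:13].copy()
scores = normalized_cross_correlation(image, template)
offset = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
```

The peak of `scores` sits at the (row, column) offset of the sub-image with respect to the image origin.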
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image.
As a result, a correlation map is generated for each reference patch.
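A minimal sketch of this step, assuming raw pixel intensities as patch descriptors (Deep Matching itself works on richer pixel descriptors) and non-overlapping reference patches. The function name `patch_correlation_maps` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def patch_correlation_maps(reference, target, patch=4):
    """Correlate every non-overlapping reference patch with the target image.

    Returns an array of shape (rows, cols, H - patch + 1, W - patch + 1):
    one correlation map per reference patch.
    """
    h, w = reference.shape
    rows, cols = h // patch, w // patch
    # Non-overlapping reference patches, indexed [row, col, i, j], L2-normalized.
    ref = reference[:rows * patch, :cols * patch]
    ref = ref.reshape(rows, patch, cols, patch).transpose(0, 2, 1, 3)
    ref = ref / (np.linalg.norm(ref, axis=(2, 3), keepdims=True) + 1e-12)
    # Every patch-sized window of the target, also L2-normalized.
    win = sliding_window_view(target, (patch, patch))
    win = win / (np.linalg.norm(win, axis=(2, 3), keepdims=True) + 1e-12)
    # Dot product between each reference patch and each target window.
    return np.einsum('rcij,yxij->rcyx', ref, win)

# The target is the reference shifted by 2 rows and 3 columns.
rng = np.random.default_rng(0)
reference = rng.random((16, 16))
target = np.roll(reference, shift=(2, 3), axis=(0, 1))
maps = patch_correlation_maps(reference, target)
peak = np.unravel_index(np.argmax(maps[1, 1]), maps[1, 1].shape)
best = tuple(int(i) for i in peak)
```

Reference patch (1, 1) starts at pixel (4, 4), so the peak of its correlation map reveals the displacement applied to the target.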
Optical Flow Deep Matching
Optical Flow Deep Matching
The most discriminative response map
The less discriminative response map
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching (TD)
Bottom-up extraction (BU)
Optical Flow Deep Matching (BU)
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches.
If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
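The first step of the top-down search, finding the local maxima of a top-level correlation map, could look like the sketch below. The 3x3 neighbourhood and the threshold are illustrative assumptions, and the function name is hypothetical.

```python
import numpy as np

def local_maxima(corr_map, threshold=0.0):
    """Return the (row, col) positions that are maxima of their 3x3 neighbourhood."""
    h, w = corr_map.shape
    padded = np.pad(corr_map, 1, constant_values=-np.inf)
    # Stack the 9 shifted copies of the map to compare each pixel with its neighbours.
    neighbours = np.stack([padded[dy:dy + h, dx:dx + w]
                           for dy in range(3) for dx in range(3)])
    is_max = (corr_map >= neighbours.max(axis=0)) & (corr_map > threshold)
    return [(int(r), int(c)) for r, c in np.argwhere(is_max)]

corr = np.zeros((5, 5))
corr[1, 1] = 0.9
corr[3, 4] = 0.7
peaks = local_maxima(corr)
```

Each returned position would then be traced back to the responses of its sub-patches one scale below.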
Optical Flow Deep Matching (TD)
Optical Flow Deep Matching
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow Deep Matching
Optical Flow Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., & Brox, T. (2015). FlowNet: Learning Optical Flow with Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
Optical Flow FlowNet
End-to-end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Correlation layer: Convolution of data patches from the layers to combine
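The correlation layer can be sketched as a dot product between the feature vectors of the two streams over a limited range of displacements. This is a simplified illustration with 1x1 patches and no strides, not the FlowNet implementation; `correlation_layer` and `max_disp` are hypothetical names.

```python
import numpy as np

def correlation_layer(f1, f2, max_disp=2):
    """Correlate two (C, H, W) feature maps over a window of displacements.

    Returns an array of shape ((2 * max_disp + 1) ** 2, H, W). Channel k holds,
    for every pixel, the dot product between the f1 feature vector at that pixel
    and the f2 feature vector displaced by the k-th (dy, dx).
    """
    _, h, w = f1.shape
    d = max_disp
    padded = np.pad(f2, ((0, 0), (d, d), (d, d)))  # zero padding at the borders
    out = []
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            shifted = padded[:, d + dy:d + dy + h, d + dx:d + dx + w]
            out.append((f1 * shifted).sum(axis=0))
    return np.stack(out)

# Unit-norm random features; the second map is the first one moved by (1, 2).
rng = np.random.default_rng(0)
f1 = rng.standard_normal((8, 10, 10))
f1 /= np.linalg.norm(f1, axis=0, keepdims=True)
f2 = np.roll(f1, shift=(1, 2), axis=(1, 2))
corr = correlation_layer(f1, f2, max_disp=2)
best = int(np.argmax(corr[:, 5, 5]))
dy, dx = best // 5 - 2, best % 5 - 2               # displacement at pixel (5, 5)
```

The strongest response at an interior pixel recovers the displacement between the two feature maps.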
Optical Flow FlowNet (expanding)
Upconvolutional layers: Unpooling feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding map from the contractive part.
Optical Flow FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated… and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI.
Data augmentation
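A minimal sketch of the photometric part of such an augmentation, for a float image in [0, 1]. All parameter ranges below are illustrative assumptions, not the values used for FlowNet. Geometric transformations (translation, rotation, scaling) are left out because they also require transforming the ground-truth flow.

```python
import numpy as np

def photometric_augment(image, rng, noise_sigma=0.04, brightness_sigma=0.2,
                        contrast_range=(0.8, 1.4), gamma_range=(0.7, 1.5)):
    """Apply random noise, brightness, contrast and gamma changes to an image."""
    out = image + rng.normal(0.0, noise_sigma, size=image.shape)  # additive Gaussian noise
    out = out + rng.normal(0.0, brightness_sigma)                  # global brightness shift
    contrast = rng.uniform(*contrast_range)
    mean = out.mean()
    out = (out - mean) * contrast + mean                           # contrast around the mean
    out = np.clip(out, 0.0, 1.0)
    out = out ** rng.uniform(*gamma_range)                         # gamma correction
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))
augmented = photometric_augment(image, rng)
```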
Optical Flow FlowNet
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking MDNet
Object tracking MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but are replaced by a single one at test time.
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive scores.
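Hard negative mining as described here reduces to ranking the negative samples by their positive score and keeping the top ones. A minimal sketch, with made-up scores and a hypothetical function name:

```python
import numpy as np

def hard_negative_mining(positive_scores, num_hard):
    """Return the indices of the negatives that score highest as positives."""
    order = np.argsort(positive_scores)[::-1]   # descending positive score
    return order[:num_hard]

# Assumption: positive scores of 5 negative samples, e.g. from a classifier.
scores = np.array([0.05, 0.90, 0.30, 0.75, 0.10])
hard = hard_negative_mining(scores, num_hard=2)
```

The selected samples are the ones the current model confuses most with the target, so they are the most informative for the online update.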
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Object tracking FCNT
Focus on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization… but are not discriminative enough to distinguish different objects of the same category.
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation…
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe http://caffe.berkeleyvision.org
Torch (Overfeat) http://torch.ch
Theano http://deeplearning.net/software/theano
TensorFlow https://www.tensorflow.org
MatConvNet (VLFeat) http://www.vlfeat.org/matconvnet
CNTK (Microsoft) http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets Learn more
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets Learn more
ConvNets Learn more
Online course
Deep Learning: Taking machine learning to the next level
ReadCV seminar
Friendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
Barcelona Convolucionada
Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
ConvNets Learn more
Grup d'estudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
ConvNets Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
ConvNets Learn more
ConvNets Learn more
"Machine learning" sub-Reddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary / hour    Avg. salary / month
Yahoo        $43                   ($43 x 160 = $6880)
Apple        $37                   ($37 x 160 = $5920)
Google       $29.54-$31.32         $7151
Facebook     $22.92                $6150-$7378
Microsoft    $22.63                $6506-$7171
Source: Glassdoor.com (internships in California. No stipends included)
ConvNets Learn more
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
ConvNets Learn more
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets Learn more
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets Discussion
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
and MANY MORE I am missing in the page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Image Classification
Our past research
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
LSUN Challenge
Sentiment Analysis
Our current research
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. 'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. 'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Linked slides
Motivation
Motivation
[Website]
Outline
1 Recognition2 Optical Flow3 Object Tracking4 Learn more
6
Recognition
Demo Clarifai
MIT Technology Review ldquoA start-uprsquos Neural Network Can Understand Videordquo (322015)7
Figure Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE
8
Recognition
9
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
10
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Previous lectures with Jose M Aacutelvarez
11
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE
Slides extracted from ReadCV seminar by Victor Campos 12
Recognition DeepVideo
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 13
Recognition DeepVideo Demo
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 14
Recognition DeepVideo Architectures
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 15
Unsupervised learning [Le at alrsquo11] Supervised learning [Karpathy et alrsquo14]
Recognition DeepVideo Features
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 16
Recognition DeepVideo Multiscale
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 17
Recognition DeepVideo Results
18
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
19
Recognition C3D
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
20Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Demo
21K Simonyan A Zisserman Very Deep Convolutional Networks for Large-Scale Image Recognition ICLR 2015
Recognition C3D Spatial dimensionSpatial dimensions (XY) of the used kernels are fixed to 3x3 following Symonian amp Zisserman (ICLR 2015)
22Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Temporal dimension3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets
Temporal depth
2D ConvNets
23Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
A homogeneous architecture with small 3 times 3 times 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
24Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
29Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Compactness
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
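Correlation layer: the two streams are combined by convolving data patches from one stream with data patches from the other (data with data, instead of data with a learned filter).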
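As a rough illustration of the correlation layer, the sketch below compares the patch around one position of a first feature map with displaced patches of a second feature map, using a sum of products as the matching score. It is a minimal single-channel sketch in plain Python, not the FlowNet implementation: the function name `correlation`, the patch radius `k`, the maximum displacement `d` and the zero padding are assumptions made for the example.

```python
def correlation(f1, f2, x, y, k=1, d=1):
    """Correlate the patch of f1 centred at (x, y) with displaced patches of f2.

    f1, f2: single-channel feature maps as lists of rows (same size).
    k: patch radius, so patches are (2k+1) x (2k+1).
    d: maximum displacement searched in each direction.
    Returns a dict mapping each displacement (dx, dy) to a correlation score.
    Positions outside the maps are treated as zeros (assumed padding).
    """
    h, w = len(f1), len(f1[0])

    def at(f, i, j):
        # Zero padding outside the feature map.
        return f[i][j] if 0 <= i < h and 0 <= j < w else 0.0

    scores = {}
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            total = 0.0
            for oy in range(-k, k + 1):
                for ox in range(-k, k + 1):
                    total += at(f1, y + oy, x + ox) * at(f2, y + dy + oy, x + dx + ox)
            scores[(dx, dy)] = total
    return scores


# Toy example: a single bright pixel moves one position to the right.
frame1 = [[0, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
frame2 = [[0, 0, 0, 0],
          [0, 0, 1, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
scores = correlation(frame1, frame2, x=1, y=1)
best = max(scores, key=scores.get)  # displacement with the strongest response
```

In this toy case the strongest response is at displacement (1, 0), matching the one-pixel shift to the right. No weights are learned in this step: the "filter" is simply the patch taken from the other stream.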
Optical Flow FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding map from the contracting part.
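One expanding step can be sketched as follows: unpool the coarse map, convolve it, and stack the result with the map of the same resolution from the contracting part. This is a hedged toy version in plain Python. The helper names (`unpool2x`, `conv3x3`, `expand_step`), the nearest-neighbour style unpooling, the 3x3 kernel size and the identity kernel are all assumptions for illustration; in a trained network the kernel weights would be learned.

```python
def unpool2x(fmap):
    """Double the resolution by repeating each value in a 2x2 block (assumed unpooling)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def conv3x3(fmap, kernel):
    """'Same' 3x3 convolution (cross-correlation form) with zero padding."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-1, 2):
                for dj in range(-1, 2):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += fmap[ii][jj] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


def expand_step(coarse, skip, kernel):
    """Upconvolve a coarse map and stack it with the matching contracting-part map."""
    up = conv3x3(unpool2x(coarse), kernel)
    assert len(up) == len(skip) and len(up[0]) == len(skip[0])
    return [up, skip]  # two channels: upconvolved map + map from the contracting part


identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # kernel that leaves the map unchanged
coarse = [[1.0, 2.0], [3.0, 4.0]]              # 2x2 map from the coarse level
skip = [[0.0] * 4 for _ in range(4)]           # 4x4 map from the contracting part
channels = expand_step(coarse, skip, identity)
```

The concatenation lets the next layer see both the coarse, upsampled information and the finer detail kept from the contracting part.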
Optical Flow FlowNet
Since existing ground-truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
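The photometric part of such an augmentation (brightness, gamma, additive Gaussian noise) could look roughly like the sketch below. It is an illustration in plain Python, not the authors' pipeline: the function name `augment_photometric` and all parameter ranges are assumptions. Geometric transformations (translation, rotation, scaling) are left out because they would also have to be applied consistently to the ground-truth flow.

```python
import random


def augment_photometric(image, rng, noise_sigma=0.02, max_brightness=0.2,
                        gamma_range=(0.7, 1.5)):
    """Apply one random brightness shift, gamma change and Gaussian noise to an image.

    image: grayscale image as a list of rows with values in [0, 1].
    rng: random.Random instance, so results are reproducible.
    The default ranges are illustrative assumptions.
    """
    brightness = rng.uniform(-max_brightness, max_brightness)
    gamma = rng.uniform(*gamma_range)
    out = []
    for row in image:
        new_row = []
        for v in row:
            v = min(1.0, max(0.0, v + brightness))   # brightness change
            v = v ** gamma                            # gamma change
            v = v + rng.gauss(0.0, noise_sigma)       # additive Gaussian noise
            new_row.append(min(1.0, max(0.0, v)))     # keep values in [0, 1]
        out.append(new_row)
    return out


rng = random.Random(0)
image = [[0.1, 0.5], [0.9, 0.3]]
augmented = augment_photometric(image, rng)
```

The input image is left untouched and a new image of the same size is returned, so several augmented variants can be drawn from a single training sample.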
Data augmentation
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
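The split between shared and domain-specific layers can be pictured with the toy structure below. It is only a structural sketch in plain Python under loud assumptions: the class `MultiDomainNet`, its linear heads and the per-feature "shared" weights are hypothetical stand-ins and do not reproduce the MDNet architecture or its training.

```python
import random


class MultiDomainNet:
    """Toy multi-domain model: shared weights plus one binary head per domain."""

    def __init__(self, num_features, num_domains, seed=0):
        self.rng = random.Random(seed)
        self.num_features = num_features
        # Shared part (a per-feature scaling, standing in for the shared layers).
        self.shared = [1.0] * num_features
        # One domain-specific head per training sequence.
        self.heads = [self._new_head() for _ in range(num_domains)]

    def _new_head(self):
        return [self.rng.gauss(0.0, 0.01) for _ in range(self.num_features)]

    def score(self, features, domain):
        """Target-vs-background score of a sample under the given domain's head."""
        shared_out = [max(0.0, w * f) for w, f in zip(self.shared, features)]
        return sum(h * s for h, s in zip(self.heads[domain], shared_out))

    def prepare_for_test(self):
        """Keep the shared part and replace all training heads by a single new head."""
        self.heads = [self._new_head()]


net = MultiDomainNet(num_features=4, num_domains=3)
training_heads = len(net.heads)   # one head per training sequence
net.prepare_for_test()
test_heads = len(net.heads)       # a single head for the new test sequence
```

The shared part carries over from training to tracking, while the single new head is the part that has to adapt to the test sequence.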
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive scores.
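The selection step of hard negative mining can be sketched in a few lines: among the candidate negatives, keep the ones the current model scores as most target-like. The function name `hard_negatives`, the sample labels and the scores below are made up for illustration.

```python
def hard_negatives(samples, scores, num_keep):
    """Return the num_keep negative samples with the highest positive scores."""
    ranked = sorted(zip(scores, samples), key=lambda pair: pair[0], reverse=True)
    return [sample for _, sample in ranked[:num_keep]]


negatives = ["box_a", "box_b", "box_c", "box_d"]   # background candidates
positive_scores = [0.10, 0.85, 0.40, 0.70]         # model's target score for each one
hard = hard_negatives(negatives, positive_scores, num_keep=2)
```

Here `box_b` and `box_d` are kept: they are background samples that the model currently confuses most with the target, so they are the most informative ones for the next update.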
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on layers conv4-3 and conv5-3 of a VGG-16 network pre-trained for ImageNet image classification.
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
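One simple way to act on this observation is to rank feature maps by how much of their activation falls on the target and keep only the top ones. The sketch below uses that heuristic as a simplified stand-in; it is not the selection procedure of the FCNT paper. The function names, the box convention and the toy maps are assumptions, and activations are assumed to be non-negative (as after a ReLU).

```python
def foreground_ratio(fmap, box):
    """Fraction of a feature map's total activation that falls inside the box.

    box = (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    total = sum(sum(row) for row in fmap)
    if total == 0:
        return 0.0
    inside = sum(sum(row[left:right]) for row in fmap[top:bottom])
    return inside / total


def select_feature_maps(fmaps, box, num_keep):
    """Indices of the num_keep maps most concentrated on the foreground box."""
    order = sorted(range(len(fmaps)),
                   key=lambda i: foreground_ratio(fmaps[i], box),
                   reverse=True)
    return order[:num_keep]


fmaps = [
    [[0, 0, 0], [0, 5, 0], [0, 0, 0]],   # fires on the target (centre)
    [[2, 2, 2], [2, 0, 2], [2, 2, 2]],   # fires only on the background
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],   # fires everywhere
]
target_box = (1, 1, 2, 2)  # the centre cell
kept = select_feature_maps(fmaps, target_box, num_keep=1)
```

Only the first map is kept in this toy case; the other two respond mostly or entirely to the background and would add noise to the tracker.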
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to tell apart different objects of the same category.
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets Learn more
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course
Deep Learning: Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
Barcelona Convolucionada
Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company     Avg. salary/hour    Avg. salary/month
Yahoo       $43                 ($43 x 160 = $6,880)
Apple       $37                 ($37 x 160 = $5,920)
Google      $29.54-$31.32       $7,151
Facebook    $22.92              $6,150-$7,378
Microsoft   $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets Discussion
Sports Do you know them?
ConvNets Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
... and MANY MORE I am missing in the page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Sentiment Analysis
[Slides]
CNN
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015.
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S., and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
Neil Lawrence OpenAI wonrsquot benefit humanity without open data sharing (The Guardian 14122015)
Is Computer Vision solved
ConvNets Discussion
92
Sports Do you know them
93
ConvNets Do you know them
94
Antonio Torralba MIT(former UPC)
and MANY MORE I am missing in the page (apologies)
Oriol Vinyals Google(former UPC)
Jose M Aacutelvarez NICTA(former URL amp UAB)
Joan Bruna Berkeley(former UPC)
95
ConvNets Where you are studyingVisioCat dinner CVPR 2015
Considering a Phd at GPI-UPC Currently no direct funding available (check in the future)We can support your application to scholarships
External grant listings UPC UPF
Funding institution Last deadlines (on 2812016)
FI (Catalonia) 22092015
FPU (Spain) 15012016
Check our activity at httpsimatgeupceduweb 96
Image Classification
Our past research
A. Salvador, M. Zeppelzauer, D. Manchon-Vizuete, A. Calafell-Orós, and X. Giró-i-Nieto, "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
Our current research
J. Pan and X. Giró-i-Nieto, "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Sentiment Analysis
Our current research
V. Campos, A. Salvador, B. Jou, and X. Giró-i-Nieto, "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
CNN
Instance Search in Video
Our current research
V.-T. Nguyen, D.-D. Le, A. Salvador, C. Zhu, D.-L. Nguyen, M.-T. Tran, T. Ngo Duc, D. Anh Duong, S. Satoh, and X. Giró-i-Nieto, "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, E. Mohedano, A. Salvador, Z. X. Zhang, M. Marsden, P. Wang, I. Jargalsaikhan, J. Antony, X. Giró-i-Nieto, S. Satoh, N. O'Connor, and A. F. Smeaton, "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Motivation
[Website]
Outline
1. Recognition
2. Optical Flow
3. Object Tracking
4. Learn more
Recognition
Demo Clarifai
MIT Technology Review, "A start-up's Neural Network Can Understand Video" (3/2/2015)
Figure: Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014, June). Large-scale video classification with convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on (pp. 1725-1732). IEEE.
Figure: Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Previous lectures with Jose M. Álvarez
Slides extracted from ReadCV seminar by Victor Campos
Recognition: DeepVideo
Recognition: DeepVideo Demo
Recognition: DeepVideo Architectures
Unsupervised learning [Le et al. '11], supervised learning [Karpathy et al. '14]
Recognition: DeepVideo Features
Recognition: DeepVideo Multiscale
Recognition: DeepVideo Results
Recognition: C3D
Recognition: C3D Demo
K. Simonyan, A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition", ICLR 2015
Recognition: C3D Spatial dimension
Spatial dimensions (XY) of the kernels are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015).
Recognition: C3D Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets.
[Figure labels: Temporal depth; 2D ConvNets]
A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets.
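To make the 3x3x3 kernel concrete, here is a minimal pure-Python sketch of a single-channel 3D convolution without padding. It is only an illustration and not the C3D implementation: the function name `conv3d_valid`, the toy volume and the averaging kernel are all assumptions made for this example.

```python
def conv3d_valid(volume, kernel):
    """Slide a 3D kernel over a T x H x W volume (no padding, stride 1).

    As in most ConvNet libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip (hypothetical data): 5 frames of 4x4 pixels, every pixel equal to the frame index.
T, H, W = 5, 4, 4
volume = [[[float(t)] * W for _ in range(H)] for t in range(T)]

# 3x3x3 averaging kernel: each output mixes 3 consecutive frames.
kernel = [[[1.0 / 27.0] * 3 for _ in range(3)] for _ in range(3)]

out = conv3d_valid(volume, kernel)
shape = (len(out), len(out[0]), len(out[0][0]))
```

The output keeps a temporal axis (3 time steps here), so further 3D layers can still see motion. A 2D convolution applied to the stacked frames would instead produce a single map and discard the temporal ordering after the first layer.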
No gain when varying the temporal depth across layers.
Recognition: C3D Architecture
Feature vector
Recognition: C3D Feature vector
Video sequence
16-frame clips
8-frame overlap
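The clip layout above (16-frame clips with an 8-frame overlap) can be sketched as a small index computation. The function name `split_into_clips` and the choice to drop a trailing incomplete clip are assumptions for this example, not details taken from the slides.

```python
def split_into_clips(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame indices (end exclusive) of overlapping clips.

    Assumption: a trailing group of frames shorter than clip_len is dropped.
    """
    if not 0 <= overlap < clip_len:
        raise ValueError("overlap must satisfy 0 <= overlap < clip_len")
    stride = clip_len - overlap
    clips = []
    start = 0
    while start + clip_len <= num_frames:
        clips.append((start, start + clip_len))
        start += stride
    return clips


clips = split_into_clips(40)
print(clips)  # [(0, 16), (8, 24), (16, 32), (24, 40)]
```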
[Diagram: four 16-frame clips -> Average -> 4096-dim video descriptor -> L2 norm -> 4096-dim video descriptor]
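The averaging and L2 normalization steps in the diagram can be sketched as follows. This is a minimal illustration with 2-dimensional toy vectors standing in for the 4096-dimensional clip features; `video_descriptor` is a hypothetical helper name.

```python
import math


def video_descriptor(clip_features):
    """Average per-clip feature vectors, then L2-normalize the result."""
    if not clip_features:
        raise ValueError("at least one clip feature vector is required")
    dim = len(clip_features[0])
    count = len(clip_features)
    mean = [sum(f[i] for f in clip_features) / count for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    if norm == 0.0:
        return mean  # all-zero descriptor: nothing to normalize
    return [v / norm for v in mean]


# Toy example: two clips with 2-dim features (real C3D features would be 4096-dim).
descriptor = video_descriptor([[3.0, 0.0], [3.0, 8.0]])
```

In this toy case the mean is [3, 4], whose L2 norm is 5, so the descriptor has unit length.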
Recognition: C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition: C3D Compactness
Convolutional 3D (C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and is comparable with state-of-the-art methods on 2 other benchmarks.
Recognition: C3D Performance
Recognition: C3D Software
Implementation by Michael Gygli (GitHub)
Recognition: ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al., "Object Detection in Videos with TubeLets and Multi-Context Cues" (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow: Small vs Large
Classic approach: rigid matching of HoG or SIFT descriptors.
Deep Matching: allow each subpatch to move independently in a limited range depending on its size.
Optical Flow: Deep Matching
Source: Matlab R2015b documentation for normxcorr2 by MathWorks
Optical Flow: 2D correlation
Image
Sub-image
Offset of the sub-image with respect to the image: [0,0]
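As an illustration of the 2D correlation idea, the sketch below slides a sub-image over an image and reports the offset with the highest normalized cross-correlation. It is a simplified stand-in rather than a port of normxcorr2: it only evaluates positions where the sub-image fits entirely inside the image, and the function name and toy data are assumptions.

```python
import math


def normxcorr_offset(image, template):
    """Return ((row, col), score) of the best match of template inside image."""
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    best_offset, best_score = None, -2.0
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            p_vals = [image[y + dy][x + dx] for dy in range(h) for dx in range(w)]
            p_mean = sum(p_vals) / len(p_vals)
            p_dev = [v - p_mean for v in p_vals]
            p_norm = math.sqrt(sum(d * d for d in p_dev))
            denom = t_norm * p_norm
            if denom == 0.0:
                continue  # flat patch or flat template: correlation undefined
            score = sum(a * b for a, b in zip(t_dev, p_dev)) / denom
            if score > best_score:
                best_offset, best_score = (y, x), score
    return best_offset, best_score


image = [
    [1, 2, 3, 4, 5],
    [5, 9, 1, 7, 2],
    [3, 4, 8, 2, 6],
    [7, 1, 5, 9, 3],
    [2, 6, 4, 1, 8],
]
# Sub-image cut from rows 1-2, columns 2-3 of the image above.
template = [[1, 7], [8, 2]]
offset, score = normxcorr_offset(image, template)
```

Here the best offset is (1, 2), the position the sub-image was cut from, with a score of 1 up to floating-point rounding.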
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
Optical Flow: Deep Matching
The most discriminative response map
The least discriminative response map
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Optical Flow: Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-up extraction (BU)
Top-down matching (TD)
Optical Flow: Deep Matching (BU)
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
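The step from a parent patch down to its sub-patches can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: each of the four children may shift by at most `radius` pixels around its expected position, the parent score is the plain average of the best child responses, and other steps of the real method (such as subsampling between levels) are omitted. All names and the toy maps are hypothetical.

```python
def best_child_shift(child_map, center, radius=1):
    """Return (score, (row, col)) of the maximum of child_map near center."""
    cy, cx = center
    H, W = len(child_map), len(child_map[0])
    best = None
    for y in range(max(0, cy - radius), min(H, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(W, cx + radius + 1)):
            if best is None or child_map[y][x] > best[0]:
                best = (child_map[y][x], (y, x))
    if best is None:
        raise ValueError("search window lies outside the correlation map")
    return best


def parent_response(child_maps, child_offsets, position, radius=1):
    """Average the best response of each child; also return where each child landed."""
    py, px = position
    picks = [
        best_child_shift(cmap, (py + oy, px + ox), radius)
        for cmap, (oy, ox) in zip(child_maps, child_offsets)
    ]
    score = sum(s for s, _ in picks) / len(picks)
    return score, [pos for _, pos in picks]


def peak_map(size, peak):
    """Toy correlation map: zero everywhere except a single peak of 1.0."""
    m = [[0.0] * size for _ in range(size)]
    m[peak[0]][peak[1]] = 1.0
    return m


# Expected child positions around the parent at (3, 3) are (2,2), (2,4), (4,2), (4,4).
# Two of the children are displaced by one pixel from their expected position.
offsets = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
maps = [peak_map(6, (2, 2)), peak_map(6, (1, 4)), peak_map(6, (4, 2)), peak_map(6, (4, 5))]
score, child_positions = parent_response(maps, offsets, (3, 3))
```

The returned child positions are what a top-down pass would follow: starting from a maximum in the parent map, it recovers where each sub-patch actually matched and can repeat the same lookup one scale below.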
Optical Flow: Deep Matching
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow: FlowNet
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
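Option A amounts to concatenating the two frames along the channel axis before the first layer. A minimal sketch, assuming frames stored as nested lists of shape H x W x C (the helper name `stack_frames` and the toy frames are made up for this example):

```python
def stack_frames(frame_a, frame_b):
    """Concatenate two H x W x C frames along the channel axis -> H x W x 2C."""
    if len(frame_a) != len(frame_b) or len(frame_a[0]) != len(frame_b[0]):
        raise ValueError("both frames must have the same height and width")
    return [
        [pixel_a + pixel_b for pixel_a, pixel_b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


# Two toy 2x2 RGB frames (hypothetical values).
frame_t0 = [[[0, 0, 0], [1, 1, 1]], [[2, 2, 2], [3, 3, 3]]]
frame_t1 = [[[9, 9, 9], [8, 8, 8]], [[7, 7, 7], [6, 6, 6]]]
stacked = stack_frames(frame_t0, frame_t1)
```

Each pixel of `stacked` now carries 6 values (3 from each frame), so the first convolution sees both images at once and is left to discover on its own how to compare them.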
Option B: Create two separate, yet identical, processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution of data patches from the layers to combine.
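The idea of a correlation layer can be sketched as a dot product between the feature vector at each location of the first map and the feature vectors at nearby locations of the second map. This is a simplified illustration, assuming 1x1 patches, a small maximum displacement and zero scores outside the map; the function name, layout and parameters are assumptions and do not reproduce the FlowNet configuration.

```python
def correlation_layer(feat1, feat2, max_disp=1):
    """Compare H x W x C feature maps; return H x W x D*D scores, D = 2*max_disp + 1.

    Channel k of the output corresponds to displacement (dy, dx), enumerated
    row by row from (-max_disp, -max_disp) to (+max_disp, +max_disp).
    """
    H, W = len(feat1), len(feat1[0])
    out = []
    for y in range(H):
        row = []
        for x in range(W):
            scores = []
            for dy in range(-max_disp, max_disp + 1):
                for dx in range(-max_disp, max_disp + 1):
                    y2, x2 = y + dy, x + dx
                    if 0 <= y2 < H and 0 <= x2 < W:
                        scores.append(sum(a * b for a, b in zip(feat1[y][x], feat2[y2][x2])))
                    else:
                        scores.append(0.0)  # assumption: zero outside the map
            row.append(scores)
        out.append(row)
    return out


def blob_map(size, position):
    """Toy 2-channel feature map with a single active feature vector."""
    m = [[[0.0, 0.0] for _ in range(size)] for _ in range(size)]
    m[position[0]][position[1]] = [1.0, 0.0]
    return m


# The "blob" moves one pixel to the right between the two feature maps.
corr = correlation_layer(blob_map(3, (1, 1)), blob_map(3, (1, 2)))
best_channel = corr[1][1].index(max(corr[1][1]))
```

With `max_disp=1` there are 9 displacement channels, and channel 5 corresponds to (dy, dx) = (0, +1), which is the motion of the blob in this toy case.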
Optical Flow: FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding map from the contractive part.
Optical Flow: FlowNet
Data augmentation
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations, additive Gaussian noise, and changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
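The photometric part of such an augmentation (noise, brightness, contrast, gamma) can be sketched for a grayscale image with values in [0, 1]. The parameter ranges below are illustrative assumptions, not the values used for FlowNet, and geometric transformations are left out because they would also have to be applied to the ground-truth flow.

```python
import random


def augment_photometric(image, rng, noise_sigma=0.02, brightness=0.1,
                        contrast=(0.8, 1.2), gamma=(0.7, 1.5)):
    """Apply random contrast, brightness, Gaussian noise and gamma to an H x W image.

    All ranges are hypothetical defaults chosen for this example.
    """
    b = rng.uniform(-brightness, brightness)   # additive brightness shift
    c = rng.uniform(contrast[0], contrast[1])  # contrast factor around mid-gray
    g = rng.uniform(gamma[0], gamma[1])        # gamma exponent
    out = []
    for row in image:
        new_row = []
        for v in row:
            v = (v - 0.5) * c + 0.5 + b
            v += rng.gauss(0.0, noise_sigma)   # additive Gaussian noise per pixel
            v = min(1.0, max(0.0, v))          # clamp before the power to stay in [0, 1]
            new_row.append(v ** g)
        out.append(new_row)
    return out


image = [[0.0, 0.25, 0.5], [0.5, 0.75, 1.0]]
augmented = augment_photometric(image, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which helps when the same photometric parameters must be reused for related inputs.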
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting negative samples with the highest positive score.
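The selection rule described above can be sketched in a few lines: rank the negative samples by the positive score the current model assigns to them and keep the top ones. The function name, sample labels and scores are hypothetical, and this is only the selection step, not the MDNet training loop.

```python
def mine_hard_negatives(negative_samples, positive_scores, num_keep):
    """Keep the negatives that the current model scores most like positives."""
    if len(negative_samples) != len(positive_scores):
        raise ValueError("one score per negative sample is required")
    ranked = sorted(
        zip(positive_scores, negative_samples),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [sample for _, sample in ranked[:num_keep]]


# Hypothetical candidate boxes labelled as negatives, with their positive scores.
negatives = ["box_a", "box_b", "box_c", "box_d"]
scores = [0.1, 0.9, 0.4, 0.7]
hard = mine_hard_negatives(negatives, scores, num_keep=2)
print(hard)  # ['box_b', 'box_d']
```

These are the negatives the model currently confuses with the target, so they carry the most useful gradient for the online update.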
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification.
conv4-3, conv5-3
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking: FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3, conv5-3
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets: Learn more
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom
Monday, February 1, 7pm, FIB, Campus Nord UPC
Outline
1 Recognition2 Optical Flow3 Object Tracking4 Learn more
6
Recognition
Demo Clarifai
MIT Technology Review ldquoA start-uprsquos Neural Network Can Understand Videordquo (322015)7
Figure Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE
8
Recognition
9
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
10
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Previous lectures with Jose M Aacutelvarez
11
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE
Slides extracted from ReadCV seminar by Victor Campos 12
Recognition DeepVideo
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 13
Recognition DeepVideo Demo
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 14
Recognition DeepVideo Architectures
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 15
Unsupervised learning [Le at alrsquo11] Supervised learning [Karpathy et alrsquo14]
Recognition DeepVideo Features
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 16
Recognition DeepVideo Multiscale
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 17
Recognition DeepVideo Results
18
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
19
Recognition C3D
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
20Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Demo
21K Simonyan A Zisserman Very Deep Convolutional Networks for Large-Scale Image Recognition ICLR 2015
Recognition C3D Spatial dimensionSpatial dimensions (XY) of the used kernels are fixed to 3x3 following Symonian amp Zisserman (ICLR 2015)
22Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Temporal dimension3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets
Temporal depth
2D ConvNets
23Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
A homogeneous architecture with small 3 times 3 times 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
24Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Figure: 16-frame clips -> Average -> 4096-dim video descriptor -> L2 norm -> 4096-dim video descriptor
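The clip sampling and pooling described on these slides can be sketched as follows: split the video into 16-frame clips with an 8-frame overlap, average the per-clip feature vectors, and L2-normalize the result. The function names and the 2-dim toy features (standing in for the 4096-dim ones) are assumptions made for illustration.

```python
import math


def clip_starts(num_frames, clip_len=16, overlap=8):
    """Start indices of clip_len-frame clips; neighbouring clips share `overlap` frames."""
    stride = clip_len - overlap
    return list(range(0, num_frames - clip_len + 1, stride))


def video_descriptor(clip_features):
    """Average the per-clip feature vectors, then L2-normalize the average."""
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / len(clip_features) for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean


starts = clip_starts(40)
# Hypothetical 2-dim clip features; their mean is (3, 4), whose L2 norm is 5.
descriptor = video_descriptor([[3.0, 0.0], [3.0, 8.0]])
```

For a 40-frame video this yields clips starting at frames 0, 8, 16 and 24; trailing frames that do not fill a whole clip are simply dropped in this sketch.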
Recognition C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
Recognition C3D Compactness
Convolutional 3D (C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and is comparable with state-of-the-art methods on the other 2 benchmarks
Recognition C3D Performance
Recognition C3D Software
Implementation by Michael Gygli (GitHub)
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size
Optical Flow Deep Matching
Source: Matlab R2015b documentation for normxcorr2 by MathWorks
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [0,0]
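The 2D correlation idea (locating a sub-image inside an image) can be illustrated with a minimal normalized cross-correlation sketch. It only scores offsets where the sub-image fits entirely inside the image, which is a simplification with respect to normxcorr2; the names `ncc_map` and `best_offset` and the toy data are assumptions.

```python
import math


def ncc_map(image, template):
    """Normalized cross-correlation of `template` at every fully-contained offset."""
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    scores = []
    for y in range(H - h + 1):
        row = []
        for x in range(W - w + 1):
            p_vals = [image[y + dy][x + dx] for dy in range(h) for dx in range(w)]
            p_mean = sum(p_vals) / len(p_vals)
            p_dev = [v - p_mean for v in p_vals]
            p_norm = math.sqrt(sum(d * d for d in p_dev))
            denom = p_norm * t_norm
            num = sum(a * b for a, b in zip(p_dev, t_dev))
            # Flat patches have zero variance; score them 0 instead of dividing by zero.
            row.append(num / denom if denom > 0 else 0.0)
        scores.append(row)
    return scores


def best_offset(scores):
    """(row, col) offset with the highest correlation score."""
    positions = ((y, x) for y in range(len(scores)) for x in range(len(scores[0])))
    return max(positions, key=lambda p: scores[p[0]][p[1]])


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 4, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]
scores = ncc_map(image, template)
offset = best_offset(scores)
```

The score map plays the role of the correlation (response) map in the following slides: its peak gives the offset of the sub-image with respect to the image origin.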
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
Optical Flow Deep Matching
Optical Flow Deep Matching
The most discriminative response map
The least discriminative response map
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching (TD)
Bottom-up extraction (BU)
Optical Flow Deep Matching (BU)
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches.
If we focus on one local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
Optical Flow Deep Matching (TD)
Optical Flow Deep Matching
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow Deep Matching
Optical Flow Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
Optical Flow FlowNet
End-to-end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Correlation layer: Convolution between data patches of the two feature maps to combine (data is convolved with other data instead of with learned filters)
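A minimal sketch of such a correlation layer follows, assuming 1x1 patches (a plain dot product between feature vectors) and a small maximum displacement. Feature maps are nested lists indexed [H][W][C]; `correlation_layer` and the toy feature maps are illustrative names, not the FlowNet code.

```python
def correlation_layer(f1, f2, max_disp=1):
    """Compare each position of f1 with f2 at every displacement in [-max_disp, max_disp]^2.

    Returns one output channel per displacement. Patch size is 1x1 here (an assumption
    made for brevity); displacements that fall outside the image score 0.
    """
    H, W = len(f1), len(f1[0])
    disps = [(dy, dx)
             for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = [[[0.0] * len(disps) for _ in range(W)] for _ in range(H)]
    for y in range(H):
        for x in range(W):
            for k, (dy, dx) in enumerate(disps):
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < H and 0 <= x2 < W:
                    out[y][x][k] = sum(a * b for a, b in zip(f1[y][x], f2[y2][x2]))
    return out, disps


# Toy feature maps: the same feature vector sits at (1, 1) in f1 and at (1, 2) in f2,
# i.e. the content moved one position to the right between the two images.
f1 = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
f2 = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
f1[1][1] = [1.0, 2.0]
f2[1][2] = [1.0, 2.0]
corr, disps = correlation_layer(f1, f2)
best = disps[max(range(len(disps)), key=lambda k: corr[1][1][k])]
```

There are no trainable weights in this layer: the number of output channels is fixed by the displacement range, here (2 x 1 + 1)^2 = 9.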
Optical Flow FlowNet (expanding)
Upconvolutional layers: Unpooling feature maps + convolution
Upconvolved feature maps are concatenated with the corresponding map from the contracting part
Optical Flow FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated… and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color)
ConvNets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Optical Flow FlowNet
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive scores
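Hard negative mining as described above can be sketched in a few lines: score every negative candidate with the current model and keep only those it most confidently (and wrongly) rates as positive. The candidate list, the scores and the helper name `hard_negatives` are hypothetical.

```python
def hard_negatives(neg_samples, positive_score, num_hard):
    """Keep the num_hard negatives that the current model scores as most positive."""
    ranked = sorted(neg_samples, key=positive_score, reverse=True)
    return ranked[:num_hard]


# Hypothetical negative candidates: (box id, positive score given by the current network).
candidates = [("a", 0.10), ("b", 0.85), ("c", 0.40), ("d", 0.92), ("e", 0.05)]
hard = hard_negatives(candidates, positive_score=lambda s: s[1], num_hard=2)
hard_ids = [box_id for box_id, _ in hard]
```

Only the selected samples (here "d" and "b") would be used as negatives in the next online update, which concentrates training on the distractors the tracker currently confuses with the target.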
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization… but are not discriminative enough to distinguish different objects of the same category
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Object detectors emerge in deep scene CNNs. ICLR 2015.
[Zhou et al., ICLR 2015] "Object detectors emerge in deep scene CNNs" [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation…
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets Learn more
ConvNets Learn more
Online course
Deep Learning: Taking machine learning to the next level
ReadCV seminar
Friendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
Barcelona Convolucionada
Deep Learning a l'abast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
Grup d'estudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8 3-7pm
ConvNets Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
ConvNets Learn more
"Machine learning" sub-Reddit
ConvNets Learn more
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company     Avg. salary/hour    Avg. salary/month
Yahoo       $43                 ($43 x 160 = $6,880)
Apple       $37                 ($37 x 160 = $5,920)
Google      $29.54-$31.32       $7,151
Facebook    $22.92              $6,150-$7,378
Microsoft   $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
ConvNets Learn more
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
ConvNets Learn more
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets Learn more
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets Discussion
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
and MANY MORE I am missing on this page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Image Classification
Our past research
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
LSUN Challenge
Sentiment Analysis
Our current research
[Slides]
CNN
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment Understanding: Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015.
Our current research
Instance Search in Video
V.-T. Nguyen, Le, D.-D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S., and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
22Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Temporal dimension3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets
Temporal depth
2D ConvNets
23Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
A homogeneous architecture with small 3 times 3 times 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
24Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
29Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Compactness
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow: FlowNet
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., & Brox, T. (2015). FlowNet: Learning Optical Flow with Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: the two streams are combined by convolving patches of one feature map with patches of the other, instead of with learned filters.
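A minimal sketch of the correlation idea in Option B, assuming NumPy and the simplest case of 1x1 patches (each "patch" is a single feature vector). The function name `correlation_layer` and the parameter `max_disp` are illustrative assumptions, not names from the FlowNet code.

```python
import numpy as np

def correlation_layer(f1, f2, max_disp=2):
    """Correlate two feature maps of shape (C, H, W).

    Returns an array of shape (D, D, H, W) with D = 2 * max_disp + 1, where
    out[dy + max_disp, dx + max_disp, y, x] is the dot product between
    f1[:, y, x] and f2[:, y + dy, x + dx]. Positions outside f2 count as zero.
    """
    c, h, w = f1.shape
    d = 2 * max_disp + 1
    # Zero-pad f2 so that every displaced window stays inside the array.
    f2_pad = np.pad(f2, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    out = np.zeros((d, d, h, w), dtype=np.float64)
    for i in range(d):        # i = dy + max_disp
        for j in range(d):    # j = dx + max_disp
            shifted = f2_pad[:, i:i + h, j:j + w]
            out[i, j] = (f1 * shifted).sum(axis=0)
    return out
```

Unlike a regular convolution, this operation has no trainable weights: it only compares the two streams over a limited range of displacements.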
Optical Flow: FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution. The upconvolved feature maps are concatenated with the corresponding feature maps from the contracting part.
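One expanding step can be sketched as follows, assuming NumPy, nearest-neighbour 2x unpooling and a 3x3 convolution with zero padding. These choices, and the names `unpool2x`, `conv3x3` and `expanding_step`, are assumptions made for illustration only.

```python
import numpy as np

def unpool2x(x):
    """Nearest-neighbour 2x unpooling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv3x3(x, weights):
    """'Same' 3x3 convolution with zero padding.

    x: (C_in, H, W); weights: (C_out, C_in, 3, 3).
    """
    _, h, w = x.shape
    x_pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weights.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            window = x_pad[:, dy:dy + h, dx:dx + w]
            # Mix input channels with the kernel slice at offset (dy, dx).
            out += np.einsum('oc,chw->ohw', weights[:, :, dy, dx], window)
    return out

def expanding_step(coarse, skip, weights):
    """Upconvolve the coarse map and concatenate the contracting-part map."""
    up = conv3x3(unpool2x(coarse), weights)
    return np.concatenate([up, skip], axis=0)
```

The concatenation requires `skip` to have the spatial size of the unpooled map, which is why each expanding step is paired with the contracting layer of matching resolution.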
Optical Flow: FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented.
Data augmentation: translation, rotation and scaling transformations, additive Gaussian noise, and changes in brightness, contrast, gamma and color.
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
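The photometric part of such an augmentation could look like the sketch below, assuming NumPy and float images in [0, 1]. The function name and all parameter ranges are illustrative assumptions, not the values used by the FlowNet authors. Geometric transformations are left out because they also require transforming the ground truth flow consistently.

```python
import numpy as np

def photometric_augment(img1, img2, rng):
    """Randomly change brightness, contrast and gamma of an image pair.

    The same brightness/contrast/gamma is applied to both images, while the
    additive Gaussian noise is drawn independently for each image.
    """
    brightness = rng.uniform(-0.2, 0.2)   # assumed range
    contrast = rng.uniform(0.8, 1.2)      # assumed range
    gamma = rng.uniform(0.7, 1.5)         # assumed range
    out = []
    for img in (img1, img2):
        x = (img - 0.5) * contrast + 0.5 + brightness
        x = np.clip(x, 0.0, 1.0) ** gamma
        x = x + rng.normal(0.0, 0.02, size=img.shape)
        out.append(np.clip(x, 0.0, 1.0))
    return out[0], out[1]
```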
Object Tracking: MDNet
Nam, H., & Han, B. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object Tracking: MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but are replaced by a single one at test time.
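The multi-domain structure can be sketched with a toy model: one shared layer plus one binary (target vs background) branch per training sequence. This assumes NumPy; the class `MultiDomainNet`, its single linear shared layer and the random initialisation are hypothetical simplifications, not the MDNet implementation.

```python
import numpy as np

class MultiDomainNet:
    """Toy network: shared features + one classifier branch per domain."""

    def __init__(self, in_dim, feat_dim, num_domains, rng):
        self.rng = rng
        self.shared = rng.standard_normal((feat_dim, in_dim)) * 0.1
        # One 2-way classifier per training sequence (domain).
        self.branches = [rng.standard_normal((2, feat_dim)) * 0.1
                         for _ in range(num_domains)]

    def features(self, x):
        # Shared layers (here a single linear layer followed by ReLU).
        return np.maximum(self.shared @ x, 0.0)

    def scores(self, x, domain):
        # Only the branch of the given domain is evaluated.
        return self.branches[domain] @ self.features(x)

    def reset_for_test(self):
        # At test time all training branches are dropped and replaced by a
        # single new branch for the test sequence; shared weights are kept.
        feat_dim = self.shared.shape[0]
        self.branches = [self.rng.standard_normal((2, feat_dim)) * 0.1]
```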
Object Tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score.
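The selection step of hard negative mining can be sketched in a few lines of plain Python. The function name `mine_hard_negatives` and the parameter `k` are illustrative assumptions; the scores are assumed to come from the current classifier.

```python
def mine_hard_negatives(negatives, positive_scores, k):
    """Return the k negative samples that the classifier scores highest.

    negatives: list of samples known to be background.
    positive_scores: classifier score of the 'target' class for each sample.
    """
    order = sorted(range(len(negatives)),
                   key=lambda i: positive_scores[i],
                   reverse=True)
    return [negatives[i] for i in order[:k]]
```

For example, with negatives `["a", "b", "c", "d"]` and scores `[0.1, 0.9, 0.4, 0.7]`, choosing `k=2` keeps `"b"` and `"d"`, the two background samples most easily confused with the target.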
Object Tracking: FCNT
Wang, L., Ouyang, W., Wang, X., & Lu, H. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127, 2015. [code]
Focus on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification.
Object Tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object Tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish between different objects of the same category.
Object Tracking: Localization
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. Object detectors emerge in deep scene CNNs. ICLR 2015. [Slides from ReadCV]
Object Tracking: FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Object Tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object Tracking: FCNT Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End-to-End Learning. Jose M. Álvarez. Tuesday, February 2, 4pm, D5-010, Campus Nord.
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition.
Online course: Deep Learning. Taking machine learning to the next level.
ReadCV seminar: Friendly reviews of SoA papers. Spring 2016, Tuesdays at 11am.
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach). Monday, February 1, 7pm, FIB, Campus Nord UPC.
Grup d'estudi de machine learning Barcelona.
Summer course: Deep Learning for Computer Vision (2.5 ECTS for MSc & PhD). July 4-8, 3-7pm.
Deep learning methods for vision (CVPR 2012).
Tutorial on deep learning for vision (CVPR 2014).
Kyunghyun Cho, "Deep Learning: Past, Present & Future".
"Machine learning" sub-Reddit.
Check profile requirements for summer internships (disclaimer: offered to PhD students by default).
Company     Avg. salary/hour    Avg. salary/month
Yahoo       $43                 ($43 x 160 = $6,880)
Apple       $37                 ($37 x 160 = $5,920)
Google      $29.54-$31.32       $7,151
Facebook    $22.92              $6,150-$7,378
Microsoft   $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California. No stipends included.)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC.
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014.
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014.
ConvNets: Discussion
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015).
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing in the page (apologies).
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015.
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF.
Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research: Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research: Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Our current research: Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research: Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S., and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Recognition: DeepVideo Demo
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014, June). Large-scale video classification with convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on (pp. 1725-1732). IEEE.
Recognition: DeepVideo Architectures
Unsupervised learning [Le et al. '11], supervised learning [Karpathy et al. '14]
Recognition: DeepVideo Features
Recognition: DeepVideo Multiscale
Recognition: DeepVideo Results
Recognition: C3D
Tran, D., Bourdev, L., Fergus, R., Torresani, L., & Paluri, M. Learning spatiotemporal features with 3D convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497, 2015.
Recognition: C3D Demo
Recognition: C3D Spatial dimension
Spatial dimensions (XY) of the kernels are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015).
K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015.
Recognition: C3D Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets.
A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets.
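To make the temporal dimension concrete, here is a minimal single-channel 3D convolution, assuming NumPy and no padding ("valid" output). The function name `conv3d_valid` is an assumption for illustration; real 3D ConvNets also sum over input channels and learn many kernels per layer.

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Slide a (kt, kh, kw) kernel over a (T, H, W) clip without padding.

    Implemented as cross-correlation, as is usual in ConvNet libraries.
    The output keeps a temporal axis of length T - kt + 1.
    """
    t, h, w = clip.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((t - kt + 1, h - kh + 1, w - kw + 1))
    ot, oh, ow = out.shape
    for dt in range(kt):
        for dy in range(kh):
            for dx in range(kw):
                out += kernel[dt, dy, dx] * clip[dt:dt + ot,
                                                 dy:dy + oh,
                                                 dx:dx + ow]
    return out
```

With a 3x3x3 kernel, a 16-frame clip yields 14 output time steps, so temporal information survives the layer. A kernel that subtracts the first frame of its window from the last one responds only where the clip changes over time, something a per-frame 2D kernel cannot express.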
No gain when varying the temporal depth across layers.
Recognition: C3D Architecture
Feature vector
Recognition: C3D Feature vector
The video sequence is split into 16-frame clips with an 8-frame overlap between consecutive clips.
The per-clip features are averaged into a 4096-dim video descriptor, which is then L2-normalized to give the final 4096-dim video descriptor.
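The clip splitting and pooling described above can be sketched as follows, assuming NumPy and that the per-clip 4096-dim features have already been extracted by the network. The function names are illustrative assumptions.

```python
import numpy as np

def split_into_clips(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame indices of overlapping clips."""
    stride = clip_len - overlap
    return [(s, s + clip_len)
            for s in range(0, num_frames - clip_len + 1, stride)]

def video_descriptor(clip_features):
    """Average per-clip features (N, 4096) and L2-normalize the result."""
    mean = np.mean(clip_features, axis=0)
    norm = np.linalg.norm(mean)
    # Guard against an all-zero average to avoid dividing by zero.
    return mean / norm if norm > 0 else mean
```

For a 40-frame video, `split_into_clips(40)` gives the four clips (0, 16), (8, 24), (16, 32) and (24, 40); trailing frames that do not fill a whole clip are ignored in this sketch.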
Recognition: C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition: C3D Compactness
Convolutional 3D (C3D) features combined with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with state-of-the-art methods on 2 other benchmarks.
Recognition: C3D Performance
Recognition: C3D Software
Implementation by Michael Gygli (GitHub)
Recognition: ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al. Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow: Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors.
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size.
Optical Flow: 2D correlation
[Figure labels: Image, Sub-Image, Offset of the sub-image with respect to the image [0,0]]
Source: Matlab R2015b documentation for normxcorr2 by Mathworks
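A minimal sketch of 2D normalized cross-correlation for locating a sub-image inside an image, assuming NumPy. Unlike Matlab's normxcorr2, which returns the full correlation, this version only evaluates positions where the sub-image fits entirely inside the image; the function name is an assumption.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Return a map of shape (H - h + 1, W - w + 1) with values in [-1, 1].

    out[y, x] is the normalized correlation between the template and the
    image window whose top-left corner is at (y, x).
    """
    big_h, big_w = image.shape
    h, w = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out = np.zeros((big_h - h + 1, big_w - w + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            win = image[y:y + h, x:x + w]
            win = win - win.mean()
            denom = np.sqrt((win ** 2).sum()) * t_norm
            # Flat windows or templates have zero variance: score them as 0.
            out[y, x] = (win * t).sum() / denom if denom > 0 else 0.0
    return out
```

The offset of the sub-image is then the position of the maximum, for instance `np.unravel_index(out.argmax(), out.shape)`.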
Optical Flow: Deep Matching
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
[Figure labels: the most discriminative response map, the least discriminative response map]
Optical Flow: Deep Matching
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Pyramid levels: 4x4 patches, 8x8 patches, 16x16 patches, 32x32 patches.
Optical Flow: Deep Matching (BU)
Bottom-up extraction (BU)
Optical Flow: Deep Matching (TD)
Top-down matching (TD)
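One bottom-up step of the pyramid can be sketched as below: the correlation map of an 8x8 patch is built from the maps of its four 4x4 sub-patches, each allowed to move by one pixel. This is a simplified illustration assuming NumPy; it omits details of the published algorithm (such as subsampling between levels), and all function names are assumptions.

```python
import numpy as np

def ncc_map(target, patch):
    """Normalized correlation of a patch at every valid target position."""
    h, w = patch.shape
    p = patch - patch.mean()
    out = np.zeros((target.shape[0] - h + 1, target.shape[1] - w + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            win = target[y:y + h, x:x + w]
            win = win - win.mean()
            denom = np.linalg.norm(win) * np.linalg.norm(p)
            out[y, x] = (win * p).sum() / denom if denom > 0 else 0.0
    return out

def max_pool3(m):
    """3x3 max filter: lets a sub-patch move by +/-1 pixel."""
    h, w = m.shape
    padded = np.pad(m, 1, constant_values=-np.inf)
    return np.max([padded[dy:dy + h, dx:dx + w]
                   for dy in range(3) for dx in range(3)], axis=0)

def parent_map(target, patch, child_size=4):
    """Correlation map of a (2s x 2s) patch from its four (s x s) children."""
    s = child_size
    full = ncc_map(target, patch[:s, :s]).shape
    ph, pw = full[0] - s, full[1] - s
    total = np.zeros((ph, pw))
    for i in range(2):
        for j in range(2):
            child = ncc_map(target, patch[i * s:(i + 1) * s, j * s:(j + 1) * s])
            pooled = max_pool3(child)
            # Shift by the child's offset inside the parent, then accumulate.
            total += pooled[i * s:i * s + ph, j * s:j * s + pw]
    return total / 4.0
```

Repeating this step gives the 16x16 and 32x32 levels. Because of the max filter, the peak of a parent map is only accurate to about one pixel, which is why the top-down pass revisits the children to refine each match.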
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets: Discussion
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
and MANY MORE I am missing in the page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution   Last deadlines (on 28/1/2016)
FI (Catalonia)        22/09/2015
FPU (Spain)           15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research: Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research: Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Our current research: Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research: Instance Search in Video
V.-T. Nguyen, -Dinh-Le, D., Salvador, A., -Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Recognition
Figure: Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Previous lectures with Jose M. Álvarez
Recognition: DeepVideo
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014, June). Large-scale video classification with convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on (pp. 1725-1732). IEEE.
Slides extracted from ReadCV seminar by Victor Campos
Recognition: DeepVideo: Demo
Recognition: DeepVideo: Architectures
Unsupervised learning [Le et al. '11], supervised learning [Karpathy et al. '14]
Recognition: DeepVideo: Features
Recognition: DeepVideo: Multiscale
Recognition: DeepVideo: Results
Recognition: C3D
Recognition: C3D: Demo
Recognition: C3D: Spatial dimension
Spatial dimensions (XY) of the kernels are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015).
K. Simonyan, A. Zisserman. "Very Deep Convolutional Networks for Large-Scale Image Recognition." ICLR 2015.
Recognition: C3D: Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets.
Figure labels: Temporal depth, 2D ConvNets
A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets.
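To make the 3x3x3 kernel concrete, here is a minimal sketch of a single-channel 3D convolution (implemented as cross-correlation, without padding or stride) over a toy clip. The function name `conv3d_valid`, the toy clip and the averaging kernel are illustrative assumptions, not taken from the C3D code.

```python
def conv3d_valid(volume, kernel):
    """Slide a kt x kh x kw kernel over a T x H x W volume (no padding, stride 1)."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # The kernel spans time (dt) as well as space (dy, dx)
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

# Hypothetical toy clip: 4 frames of 5x5 pixels, all ones
clip = [[[1.0] * 5 for _ in range(5)] for _ in range(4)]
# Hypothetical 3x3x3 averaging kernel
kernel = [[[1.0 / 27] * 3 for _ in range(3)] for _ in range(3)]
features = conv3d_valid(clip, kernel)
shape = (len(features), len(features[0]), len(features[0][0]))
```

Unlike a 2D convolution applied frame by frame, the output keeps a temporal axis (here 2 time steps of 3x3 responses), so later layers can keep modelling motion.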
No gain when varying the temporal depth across layers.
Recognition: C3D: Architecture
Feature vector
Recognition: C3D: Feature vector
Video sequence
16-frame clips
8-frame overlap
Diagram: 16-frame clips -> Average -> 4096-dim video descriptor -> L2 norm -> 4096-dim video descriptor
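The clip-based descriptor described on these slides (16-frame clips with an 8-frame overlap, averaging of the clip features, then L2 normalization) can be sketched as follows. This is only an illustration: the function names are hypothetical, and the toy features have 2 dimensions instead of 4096.

```python
import math

def split_into_clips(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame indices of overlapping clips."""
    stride = clip_len - overlap
    return [(s, s + clip_len) for s in range(0, num_frames - clip_len + 1, stride)]

def video_descriptor(clip_features):
    """Average the per-clip feature vectors, then L2-normalize the result."""
    n = len(clip_features)
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / n for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean

# A 40-frame video yields four 16-frame clips with an 8-frame overlap
clips = split_into_clips(40)

# Hypothetical 2-dim clip features standing in for the 4096-dim ones
descriptor = video_descriptor([[6.0, 0.0], [0.0, 8.0]])
```

In the toy case the averaged vector is [3, 4], so the normalized descriptor has unit length.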
Recognition: C3D: Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition: C3D: Compactness
Recognition: C3D: Performance
Convolutional 3D (C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and is comparable with state-of-the-art methods on the other 2 benchmarks.
Recognition: C3D: Software
Implementation by Michael Gygli (GitHub)
Recognition: ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al., "Object Detection in Videos with TubeLets and Multi-Context Cues" (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow: Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size
Optical Flow: Deep Matching
Optical Flow: 2D correlation
Source: Matlab R2015b documentation for normxcorr2 by Mathworks
Image
Sub-Image
Offset of the sub-image with respect to the image: [0,0]
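As an illustration of 2D correlation, the sketch below slides a sub-image over an image, scores every offset with normalized cross-correlation and returns the best one. It is a plain-Python simplification (only offsets where the sub-image fits entirely inside the image), not a reimplementation of normxcorr2; the function names and the toy data are assumptions.

```python
import math

def ncc_map(image, template):
    """Normalized cross-correlation score for every valid offset of the template."""
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_zero = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(v * v for v in t_zero))
    scores = []
    for y in range(H - h + 1):
        row = []
        for x in range(W - w + 1):
            p_vals = [image[y + dy][x + dx] for dy in range(h) for dx in range(w)]
            p_mean = sum(p_vals) / len(p_vals)
            p_zero = [v - p_mean for v in p_vals]
            p_norm = math.sqrt(sum(v * v for v in p_zero))
            denom = p_norm * t_norm
            # Flat patches have zero variance; give them a neutral score
            score = sum(a * b for a, b in zip(p_zero, t_zero)) / denom if denom > 0 else 0.0
            row.append(score)
        scores.append(row)
    return scores

def best_offset(scores):
    """(row, column) offset with the highest correlation score."""
    return max((s, (y, x)) for y, row in enumerate(scores) for x, s in enumerate(row))[1]

# Hypothetical toy image with the sub-image embedded at row 2, column 2
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 4, 0],
    [0, 0, 0, 0, 0],
]
sub_image = [[1, 2], [3, 4]]
offset = best_offset(ncc_map(image, sub_image))
```

The peak of the correlation map gives the offset of the sub-image with respect to the image, here (2, 2) in (row, column) order.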
Optical Flow: Deep Matching
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
The most discriminative response map
The least discriminative response map
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Pyramid levels: 4x4, 8x8, 16x16 and 32x32 patches, with bottom-up extraction (BU) and top-down matching (TD).
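One bottom-up step of the pyramid can be sketched as follows: the correlation map of a larger patch is approximated by averaging the maps of its four sub-patches, each of which is allowed to move by a small slack around its expected position. This is a heavily simplified sketch under stated assumptions; the published method includes further steps (such as subsampling and a non-linear rectification of the maps) that are omitted here, and all names and toy values are hypothetical.

```python
def aggregate_children(child_maps, child_size, slack=1):
    """Combine four child correlation maps (TL, TR, BL, BR) into a parent map.

    Each child map is indexed by the target position of the child's top-left corner.
    """
    H, W = len(child_maps[0]), len(child_maps[0][0])
    offsets = [(0, 0), (0, child_size), (child_size, 0), (child_size, child_size)]
    parent = []
    for y in range(H - child_size):
        row = []
        for x in range(W - child_size):
            total = 0.0
            for cmap, (oy, ox) in zip(child_maps, offsets):
                # Best response of this child within +/- slack of its expected position
                total += max(
                    cmap[yy][xx]
                    for yy in range(max(0, y + oy - slack), min(H, y + oy + slack + 1))
                    for xx in range(max(0, x + ox - slack), min(W, x + ox + slack + 1))
                )
            row.append(total / 4.0)
        parent.append(row)
    return parent

def peak_map(size, peak):
    """Hypothetical child map: zero everywhere except a single response of 1.0."""
    return [[1.0 if (y, x) == peak else 0.0 for x in range(size)] for y in range(size)]

# Children of a 4x4 patch (2x2 each); the bottom-right child is shifted by one pixel
children = [peak_map(5, (1, 1)), peak_map(5, (1, 3)), peak_map(5, (3, 1)), peak_map(5, (3, 4))]
parent = aggregate_children(children, child_size=2)
```

With a slack of one pixel the parent still responds fully at position (1, 1) although one sub-patch moved, which is the non-rigid behaviour that distinguishes deep matching from rigid descriptor matching.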
Optical Flow: Deep Matching (BU)
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
Optical Flow: Deep Matching
Figure labels: Ground truth, Dense HOG [Brox & Malik 2011], Deep Matching
Optical Flow: FlowNet
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T., 2015. FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
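Option A amounts to concatenating the two frames along the channel axis before the first layer. A minimal sketch with nested lists (the function name and the toy 2x2 RGB frames are assumptions):

```python
def stack_images(img1, img2):
    """Concatenate two H x W x C images along the channel axis (H x W x 2C)."""
    return [[p1 + p2 for p1, p2 in zip(r1, r2)] for r1, r2 in zip(img1, img2)]

# Hypothetical 2x2 RGB frames: the first is black, the second is white
frame_a = [[[0.0, 0.0, 0.0] for _ in range(2)] for _ in range(2)]
frame_b = [[[1.0, 1.0, 1.0] for _ in range(2)] for _ in range(2)]
stacked = stack_images(frame_a, frame_b)
```

Each pixel of the stacked input carries 6 values, and the network is left to discover by itself how the two frames relate.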
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution of data patches from the layers to combine.
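A correlation layer can be sketched as follows: for every position in the first feature map, the feature vector is compared (dot product) with the feature vectors of the second map within a limited displacement range. This sketch assumes single-pixel patches and zero scores outside the borders; `correlation_layer`, `max_disp` and the toy feature maps are illustrative names and values, not the paper's implementation.

```python
def correlation_layer(f1, f2, max_disp=1):
    """For each position of f1, correlate its feature vector with f2 at nearby displacements.

    f1, f2: H x W x C feature maps. Returns an H x W x D map (D = number of
    displacements) together with the list of displacements.
    """
    H, W = len(f1), len(f1[0])
    disps = [(dy, dx) for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = []
    for y in range(H):
        row = []
        for x in range(W):
            scores = []
            for dy, dx in disps:
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    scores.append(sum(a * b for a, b in zip(f1[y][x], f2[yy][xx])))
                else:
                    scores.append(0.0)  # displacement falls outside the map
            row.append(scores)
        out.append(row)
    return out, disps

def empty_map():
    return [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]

# Hypothetical 3x3 maps with 2 channels: one feature moves one pixel to the right
f1, f2 = empty_map(), empty_map()
f1[1][1] = [1.0, 0.0]
f2[1][2] = [1.0, 0.0]
corr, disps = correlation_layer(f1, f2)
best_disp = disps[max(range(len(disps)), key=lambda i: corr[1][1][i])]
```

The strongest response at position (1, 1) is found for the displacement (0, 1), i.e. one pixel to the right, which is the kind of evidence the following layers can turn into a flow estimate.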
Optical Flow: FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution. Upconvolved feature maps are concatenated with the corresponding map from the contracting part.
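The data flow of one expanding step can be sketched as below. Nearest-neighbour upsampling stands in for the unpooling, the learned convolution is omitted, and the result is concatenated with a same-resolution map from the contracting part; all names and toy values are assumptions.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of an H x W x C map (stand-in for unpooling)."""
    out = []
    for row in fmap:
        wide = [list(p) for p in row for _ in range(2)]  # repeat each pixel horizontally
        out.append(wide)
        out.append([list(p) for p in wide])              # repeat the row vertically
    return out

def concat_channels(a, b):
    """Concatenate two maps of equal spatial size along the channel axis."""
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical coarse map (1x2, 1 channel) and finer map from the contracting part (2x4, 2 channels)
coarse = [[[5.0], [7.0]]]
skip = [[[0.0, 1.0] for _ in range(4)] for _ in range(2)]
fused = concat_channels(upsample2x(coarse), skip)
```

The concatenation lets the expanding part combine coarse, high-level information with the finer local detail kept in the contracting part.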
Optical Flow: FlowNet: Data augmentation
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic dataset (Flying Chairs) is generated... and augmented (translation, rotation and scaling transformations, additive Gaussian noise, changes in brightness, contrast, gamma and color).
ConvNets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI.
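The photometric part of such an augmentation can be sketched as follows for a pair of grayscale frames with values in [0, 1]. The parameter ranges are made-up placeholders, not the values used for FlowNet, and the geometric transformations are left out because they would also require transforming the ground truth flow.

```python
import random

def photometric_augment(img1, img2, rng):
    """Apply the same random brightness, contrast and gamma change plus noise to both frames."""
    # Hypothetical parameter ranges, for illustration only
    brightness = rng.gauss(0.0, 0.2)
    contrast = rng.uniform(0.8, 1.2)
    gamma = rng.uniform(0.7, 1.5)
    noise_sigma = rng.uniform(0.0, 0.04)

    def apply(img):
        out = []
        for row in img:
            new_row = []
            for v in row:
                v = (v - 0.5) * contrast + 0.5 + brightness  # contrast and brightness
                v = min(max(v, 0.0), 1.0) ** gamma           # gamma on the clipped value
                v += rng.gauss(0.0, noise_sigma)             # additive Gaussian noise
                new_row.append(min(max(v, 0.0), 1.0))
            out.append(new_row)
        return out

    return apply(img1), apply(img2)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
frame1 = [[0.1, 0.5], [0.9, 0.3]]
frame2 = [[0.2, 0.4], [0.8, 0.6]]
aug1, aug2 = photometric_augment(frame1, frame2, rng)
```

Since only pixel intensities change, the ground truth flow between the two frames stays valid for the augmented pair.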
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet: Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking: MDNet: Online update
MDNet is updated online at test time with hard negative mining, that is, selecting negative samples with the highest positive score.
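Hard negative mining, as described above, can be sketched in a few lines: among the candidate negative samples, keep the ones that the current model scores highest as positives. The function name, the sample labels and the scores below are hypothetical.

```python
def hard_negatives(negative_samples, positive_scores, k):
    """Return the k negative samples with the highest positive score."""
    ranked = sorted(zip(positive_scores, negative_samples), key=lambda p: p[0], reverse=True)
    return [sample for _, sample in ranked[:k]]

# Hypothetical negative boxes and the positive scores given by the current model
boxes = ["box_a", "box_b", "box_c", "box_d"]
scores = [0.1, 0.9, 0.4, 0.7]
hardest = hard_negatives(boxes, scores, k=2)
```

Here "box_b" and "box_d" are selected: they are the negatives that the tracker currently confuses most with the target, so they are the most informative for the online update.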
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification.
Figure labels: conv4-3, conv5-3
Object tracking: FCNT: Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT: Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
[Zhou et al., ICLR 2015] "Object detectors emerge in deep scene CNNs" [Slides from ReadCV]
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015.
Object tracking: FCNT: Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Figure labels: conv4-3, conv5-3
Object tracking: FCNT: Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT: Results
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
Neil Lawrence OpenAI wonrsquot benefit humanity without open data sharing (The Guardian 14122015)
Is Computer Vision solved
ConvNets Discussion
92
Sports Do you know them
93
ConvNets Do you know them
94
Antonio Torralba MIT(former UPC)
and MANY MORE I am missing in the page (apologies)
Oriol Vinyals Google(former UPC)
Jose M Aacutelvarez NICTA(former URL amp UAB)
Joan Bruna Berkeley(former UPC)
95
ConvNets Where you are studyingVisioCat dinner CVPR 2015
Considering a Phd at GPI-UPC Currently no direct funding available (check in the future)We can support your application to scholarships
External grant listings UPC UPF
Funding institution Last deadlines (on 2812016)
FI (Catalonia) 22092015
FPU (Spain) 15012016
Check our activity at httpsimatgeupceduweb 96
Image Classification
97
Our past research
A Salvador Zeppelzauer M Manchon-Vizuete D Calafell-Oroacutes A and Giroacute-i-Nieto X ldquoCultural Event Recognition with Visual ConvNets and Temporal Modelsrdquo in CVPR ChaLearn Looking at People Workshop 2015 2015 [slides]
ChaLearn Worshop
Saliency Prediction
J Pan and Giroacute-i-Nieto X ldquoEnd-to-end Convolutional Network for Saliency Predictionrdquo in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops Boston MA (USA) 2015 [Slides] 98
Our current research
LSUN Challenge
Sentiment Analysis
99
Our current research
[Slides]
CNN
V Campos Salvador A Jou B and Giroacute-i-Nieto X ldquoDiving Deep into Sentiment Understanding Fine-tuned CNNs for Visual Sentiment Predictionrdquo in 1st International Workshop on Affect and Sentiment in Multimedia Brisbane Australia 2015
Our current research
Instance Search in Video
100
V - T Nguyen -Dinh-Le D Salvador A -Zhu C Nguyen D - L Tran M - T Duc T Ngo Duong D Anh Satoh S ichi and Giroacute-i-Nieto X ldquoNII-HITACHI-UIT at TRECVID 2015 Instance Searchrdquo in TRECVID 2015 Workshop Gaithersburg MD USA 2015
K McGuinness Mohedano E Salvador A Zhang Z X Marsden M Wang P Jargalsaikhan I Antony J Giroacute-i-Nieto X Satoh S ichi OConnor N and Smeaton A F ldquoInsight DCU at TRECVID 2015rdquo in TRECVID 2015 Workshop Gaithersburg MD USA 2015
Thank you
Slides available on and
httpsimatgeupceduwebpeoplexavier-giro
httpbitsearchblogspotcom
httpstwittercomDocXavi
httpswwwfacebookcomProfessorXavi
xaviergiroupcedu
101
10
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Previous lectures with Jose M Aacutelvarez
11
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE
Slides extracted from ReadCV seminar by Victor Campos 12
Recognition DeepVideo
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 13
Recognition DeepVideo Demo
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 14
Recognition DeepVideo Architectures
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 15
Unsupervised learning [Le at alrsquo11] Supervised learning [Karpathy et alrsquo14]
Recognition DeepVideo Features
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 16
Recognition DeepVideo Multiscale
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 17
Recognition DeepVideo Results
18
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
19
Recognition C3D
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
20Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Demo
21K Simonyan A Zisserman Very Deep Convolutional Networks for Large-Scale Image Recognition ICLR 2015
Recognition C3D Spatial dimensionSpatial dimensions (XY) of the used kernels are fixed to 3x3 following Symonian amp Zisserman (ICLR 2015)
22Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Temporal dimension3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets
Temporal depth
2D ConvNets
23Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
A homogeneous architecture with small 3 times 3 times 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
24Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
Recognition C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition C3D Compactness
Convolutional 3D (C3D) features combined with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with state-of-the-art methods on the other 2 benchmarks.
Recognition C3D Performance
Recognition C3D Software
Implementation by Michael Gygli (GitHub)
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size
Optical Flow Deep Matching
Source: Matlab R2015b documentation for normxcorr2 by MathWorks
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
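As a rough illustration of how 2D correlation recovers the offset of a sub-image, the NumPy sketch below computes a normalized cross-correlation score at every position where the sub-image fits entirely inside the image. This is a simplified stand-in written for this note, not MATLAB's normxcorr2 (which also scores partially overlapping positions); the function name and test data are assumptions.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Score every fully-contained placement of template inside image (values in [-1, 1])."""
    th, tw = template.shape
    H, W = image.shape
    t = template - template.mean()
    t_norm = np.sqrt(np.sum(t ** 2))
    scores = np.zeros((H - th + 1, W - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            window = image[y:y + th, x:x + tw]
            w = window - window.mean()
            denom = t_norm * np.sqrt(np.sum(w ** 2))
            scores[y, x] = np.sum(w * t) / denom if denom > 0 else 0.0
    return scores

rng = np.random.default_rng(0)
image = rng.random((20, 20))
sub_image = image[5:11, 7:13].copy()   # sub-image cut out at row 5, column 7

scores = normalized_cross_correlation(image, sub_image)
offset = np.unravel_index(np.argmax(scores), scores.shape)
print(tuple(int(v) for v in offset))   # (5, 7)
```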
Instead of using pre-trained filters, a convolution is defined between each patch of the reference image and the target image.
As a result, a correlation map is generated for each reference patch.
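A minimal sketch of this step, under simplifying assumptions: raw pixel patches with cosine similarity are used here in place of the descriptors of the actual method, and the reference image is cut into non-overlapping 4x4 patches. The function name `patch_correlation_maps` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def patch_correlation_maps(reference, target, patch=4):
    """Return {(y, x): map}: one correlation map over the target per reference patch."""
    H, W = reference.shape
    # All patch-sized windows of the target, flattened and unit-normalized.
    windows = sliding_window_view(target, (patch, patch))
    windows = windows.reshape(windows.shape[0], windows.shape[1], -1)
    windows = windows / (np.linalg.norm(windows, axis=-1, keepdims=True) + 1e-12)
    maps = {}
    for y in range(0, H - patch + 1, patch):
        for x in range(0, W - patch + 1, patch):
            p = reference[y:y + patch, x:x + patch].ravel()
            p = p / (np.linalg.norm(p) + 1e-12)
            # The reference patch acts as the "filter" applied to the target image.
            maps[(y, x)] = windows @ p
    return maps

rng = np.random.default_rng(0)
reference = rng.random((16, 16))
target = np.roll(reference, shift=(2, 3), axis=(0, 1))  # toy target: reference shifted by (2, 3)

maps = patch_correlation_maps(reference, target)
peak = np.unravel_index(np.argmax(maps[(4, 4)]), maps[(4, 4)].shape)
print(len(maps), tuple(int(v) for v in peak))  # 16 (6, 7)
```

The patch taken at (4, 4) in the reference peaks at (6, 7) in the target, which is exactly the applied shift.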
Optical Flow Deep Matching
The most discriminative response map
The least discriminative response map
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
Pyramid levels: 4x4 patches, 8x8 patches, 16x16 patches, 32x32 patches
Bottom-up extraction (BU), then top-down matching (TD)
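One bottom-up step of such a pyramid can be sketched as follows: the correlation map of an 8x8 patch is obtained by averaging the max-filtered maps of its four 4x4 sub-patches, each read at its quadrant offset. This is a simplified illustration under stated assumptions (raw pixels, cosine similarity, a 3x3 max filter without subsampling, no non-linearity), not the DeepFlow implementation; all function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def correlation_map(patch, target):
    """Cosine similarity between one square patch and every same-size target window."""
    n = patch.shape[0]
    windows = sliding_window_view(target, (n, n))
    windows = windows.reshape(windows.shape[0], windows.shape[1], -1)
    windows = windows / (np.linalg.norm(windows, axis=-1, keepdims=True) + 1e-12)
    p = patch.ravel() / (np.linalg.norm(patch) + 1e-12)
    return windows @ p

def max_filter3(cmap):
    """3x3 max filter: lets each sub-patch move by one pixel independently."""
    padded = np.pad(cmap, 1, mode="edge")
    return sliding_window_view(padded, (3, 3)).max(axis=(2, 3))

def aggregate_children(children, child_size):
    """Average the max-filtered child maps, each read at its quadrant offset (dy, dx)."""
    h, w = next(iter(children.values())).shape
    out_h, out_w = h - child_size, w - child_size
    parent = np.zeros((out_h, out_w))
    for (dy, dx), cmap in children.items():
        pooled = max_filter3(cmap)
        parent += pooled[dy:dy + out_h, dx:dx + out_w]
    return parent / len(children)

rng = np.random.default_rng(0)
reference = rng.random((16, 16))
target = np.roll(reference, shift=(2, 3), axis=(0, 1))  # toy target: shifted reference

y0, x0, n = 4, 4, 4  # the 8x8 parent patch at (4, 4) is made of four 4x4 children
children = {
    (dy, dx): correlation_map(reference[y0 + dy:y0 + dy + n, x0 + dx:x0 + dx + n], target)
    for dy in (0, n) for dx in (0, n)
}
parent_map = aggregate_children(children, child_size=n)
print(parent_map.shape)                       # (9, 9)
print(round(float(parent_map[6, 7]), 3))      # 1.0
```

Because of the max filter, positions one pixel around the true shift (6, 7) also score highly; that tolerance is what allows sub-patches to move slightly with respect to their parent.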
Optical Flow Deep Matching (BU)
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches.
If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shift of the sub-patches that generated it.
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
End-to-end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network
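Stacking here simply means concatenating the two frames along the channel axis, so the first convolution sees a 6-channel input. A minimal sketch, assuming RGB frames stored as (H, W, 3) arrays; the frame size and the helper name `stack_frames` are illustrative only.

```python
import numpy as np

def stack_frames(frame1, frame2):
    """Concatenate two (H, W, 3) frames into a single (H, W, 6) network input."""
    return np.concatenate([frame1, frame2], axis=-1)

frame1 = np.zeros((384, 512, 3), dtype=np.float32)  # placeholder first frame
frame2 = np.ones((384, 512, 3), dtype=np.float32)   # placeholder second frame
stacked = stack_frames(frame1, frame2)
print(stacked.shape)  # (384, 512, 6)
```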
Optical Flow FlowNet (contracting)
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer: Convolution of data patches from the layers to combine
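A simplified sketch of such a correlation layer: for every position in the first feature map and every displacement up to a maximum, take the dot product with the feature vector at the displaced position in the second map. The patch size is fixed to a single feature vector and the maximum displacement to 2 purely for illustration; the function name and all parameter values are assumptions, not the FlowNet code.

```python
import numpy as np

def correlation_layer(f1, f2, max_disp=2):
    """f1, f2: (H, W, C) feature maps. Returns (H, W, (2*max_disp+1)**2) correlations."""
    H, W, _ = f1.shape
    d = max_disp
    f2_padded = np.pad(f2, ((d, d), (d, d), (0, 0)))  # zero padding at the borders
    out = np.zeros((H, W, (2 * d + 1) ** 2), dtype=f1.dtype)
    idx = 0
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            # shifted[y, x] holds f2[y + dy, x + dx]
            shifted = f2_padded[d + dy:d + dy + H, d + dx:d + dx + W, :]
            out[:, :, idx] = np.sum(f1 * shifted, axis=-1)
            idx += 1
    return out

rng = np.random.default_rng(0)
f1 = rng.normal(size=(8, 8, 64))
f2 = np.roll(f1, shift=(1, 2), axis=(0, 1))   # second map: first map moved by (1, 2)

corr = correlation_layer(f1, f2)
best = int(np.argmax(corr[3, 3]))
print(corr.shape, best)  # (8, 8, 25) 19
```

Channel 19 corresponds to the displacement (dy, dx) = (1, 2), i.e. the layer exposes the motion between the two maps to the rest of the network without any learned weights.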
Optical Flow FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution
Upconvolved feature maps are concatenated with the corresponding map from the contractive part
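The expanding step can be sketched as unpooling, a convolution, and a concatenation with the feature map of matching resolution from the contracting part. Nearest-neighbour unpooling, the 3x3 kernel, the random weights and all shapes below are assumptions made for the sake of a runnable example.

```python
import numpy as np

def unpool2x(fmap):
    """Nearest-neighbour unpooling: (H, W, C) -> (2H, 2W, C)."""
    return fmap.repeat(2, axis=0).repeat(2, axis=1)

def conv2d_same(fmap, weights):
    """Stride-1 convolution with zero padding; weights has shape (kh, kw, C_in, C_out)."""
    kh, kw, _, cout = weights.shape
    H, W, _ = fmap.shape
    padded = np.pad(fmap, ((kh // 2, kh // 2), (kw // 2, kw // 2), (0, 0)))
    out = np.zeros((H, W, cout))
    for i in range(kh):
        for j in range(kw):
            out += padded[i:i + H, j:j + W, :] @ weights[i, j]
    return out

rng = np.random.default_rng(0)
coarse = rng.normal(size=(4, 4, 8))         # coarse map from the end of the contracting part
skip = rng.normal(size=(8, 8, 4))           # higher-resolution map from the contracting part
weights = rng.normal(size=(3, 3, 8, 6))     # hypothetical upconvolution weights

upconvolved = conv2d_same(unpool2x(coarse), weights)
merged = np.concatenate([upconvolved, skip], axis=-1)
print(upconvolved.shape, merged.shape)      # (8, 8, 6) (8, 8, 10)
```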
Optical Flow FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
Data augmentation
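A minimal sketch of a few of these augmentations applied to one training sample (two frames plus the ground-truth flow). Translation is implemented here as a random crop shared by both frames and the flow field, which keeps the flow vectors valid; rotation, scaling and color changes are left out. All parameter ranges and the function name `augment_pair` are assumptions, not the values used for FlowNet.

```python
import numpy as np

def augment_pair(img1, img2, flow, rng, crop=(96, 128)):
    """Apply a shared random crop and shared photometric changes to an image pair."""
    H, W, _ = img1.shape
    ch, cw = crop
    top = rng.integers(0, H - ch + 1)
    left = rng.integers(0, W - cw + 1)
    window = (slice(top, top + ch), slice(left, left + cw))
    img1, img2, flow = img1[window], img2[window], flow[window]

    gain = rng.uniform(0.8, 1.2)    # contrast (assumed range)
    bias = rng.uniform(-0.1, 0.1)   # brightness (assumed range)
    gamma = rng.uniform(0.7, 1.5)   # gamma (assumed range)

    def photometric(img):
        img = np.clip(gain * img + bias, 0.0, 1.0) ** gamma
        img = img + rng.normal(0.0, 0.02, size=img.shape)  # additive Gaussian noise
        return np.clip(img, 0.0, 1.0)

    return photometric(img1), photometric(img2), flow

rng = np.random.default_rng(0)
frame1 = rng.random((128, 160, 3))           # placeholder frames with values in [0, 1)
frame2 = rng.random((128, 160, 3))
gt_flow = rng.normal(size=(128, 160, 2))     # placeholder ground-truth flow (dx, dy)

aug1, aug2, aug_flow = augment_pair(frame1, frame2, gt_flow, rng)
print(aug1.shape, aug2.shape, aug_flow.shape)  # (96, 128, 3) (96, 128, 3) (96, 128, 2)
```

Geometric transformations such as rotation or scaling would also require transforming the flow vectors themselves, which is why they are omitted from this sketch.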
Optical Flow FlowNet
Object tracking MDNet
Nam, H., and Han, B. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
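The structure described above can be sketched as shared layers followed by one binary (target vs. background) branch per training sequence, with the branches swapped for a single new one at test time. This is only a toy illustration with one shared fully connected layer and random weights; the class name, layer sizes and initialization are assumptions, not the MDNet architecture.

```python
import numpy as np

class MultiDomainNet:
    """Toy multi-domain network: shared layer + one 2-way branch per domain (sequence)."""

    def __init__(self, in_dim, feat_dim, num_domains, rng):
        self.shared = rng.normal(0.0, 0.01, size=(in_dim, feat_dim))
        self.branches = [rng.normal(0.0, 0.01, size=(feat_dim, 2)) for _ in range(num_domains)]

    def forward(self, x, domain=0):
        hidden = np.maximum(x @ self.shared, 0.0)   # shared features (ReLU)
        return hidden @ self.branches[domain]       # domain-specific target/background scores

    def reset_for_test(self, rng):
        """Keep the shared layer, replace all training branches by a single new branch."""
        feat_dim = self.shared.shape[1]
        self.branches = [rng.normal(0.0, 0.01, size=(feat_dim, 2))]

rng = np.random.default_rng(0)
net = MultiDomainNet(in_dim=32, feat_dim=16, num_domains=3, rng=rng)
samples = rng.normal(size=(5, 32))                  # placeholder candidate features

train_scores = net.forward(samples, domain=2)       # training: pick the sequence's branch
shared_before = net.shared.copy()
net.reset_for_test(rng)
test_scores = net.forward(samples)                  # test: the single new branch
print(train_scores.shape, len(net.branches), test_scores.shape)  # (5, 2) 1 (5, 2)
```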
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive scores.
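The selection rule can be sketched in a few lines: among the negative samples, keep those that the current network scores highest as positives. The scores below are made-up numbers and the function name is hypothetical.

```python
import numpy as np

def select_hard_negatives(positive_scores, num_keep):
    """Return indices of the negative samples with the highest positive scores."""
    order = np.argsort(positive_scores)[::-1]   # highest score first
    return order[:num_keep]

# Positive-class scores assigned by the network to five negative samples (made up).
scores = np.array([0.10, 0.90, 0.30, 0.70, 0.05])
hard = select_hard_negatives(scores, num_keep=2)
print(hard.tolist())  # [1, 3]
```

These are the negatives the tracker currently confuses with the target, so they are the most informative ones for the online update.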
Object tracking FCNT
Wang, L., Ouyang, W., Wang, X., and Lu, H. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127, 2015. [code]
Focus on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category
Object tracking Localization
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., and Torralba, A. "Object detectors emerge in deep scene CNNs". ICLR 2015. [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets Learn more
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets Learn more
ConvNets Learn more
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
ConvNets Learn more
Barcelona Convolucionada
"Deep Learning a l'abast de tothom" (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
ConvNets Learn more
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course
Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
ConvNets Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
ConvNets Learn more
"Machine learning" sub-Reddit
ConvNets Learn more
Check profile requirements for Summer internship (disclaimer: offered to PhD students by default)
Company     Avg. salary / hour    Avg. salary / month
Yahoo       $43                   ($43 x 160 = $6,880)
Apple       $37                   ($37 x 160 = $5,920)
Google      $29.54-$31.32         $7,151
Facebook    $22.92                $6,150-$7,378
Microsoft   $22.63                $6,506-$7,171
Source: Glassdoor.com (internships in California. No stipends included)
ConvNets Learn more
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
ConvNets Learn more
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets Learn more
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets Discussion
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
... and MANY MORE I am missing on the page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Image Classification
Our past research
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
LSUN Challenge
Sentiment Analysis
Our current research
[Slides]
CNN
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015.
Our current research
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. 'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. 'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
11
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE
Slides extracted from ReadCV seminar by Victor Campos 12
Recognition DeepVideo
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 13
Recognition DeepVideo Demo
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 14
Recognition DeepVideo Architectures
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 15
Unsupervised learning [Le at alrsquo11] Supervised learning [Karpathy et alrsquo14]
Recognition DeepVideo Features
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 16
Recognition DeepVideo Multiscale
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 17
Recognition DeepVideo Results
18
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
19
Recognition C3D
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
20Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Demo
21K Simonyan A Zisserman Very Deep Convolutional Networks for Large-Scale Image Recognition ICLR 2015
Recognition C3D Spatial dimensionSpatial dimensions (XY) of the used kernels are fixed to 3x3 following Symonian amp Zisserman (ICLR 2015)
22Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Temporal dimension3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets
Temporal depth
2D ConvNets
23Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
A homogeneous architecture with small 3 times 3 times 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
24Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
29Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Compactness
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow: FlowNet (contracting)
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution of data patches from the layers to combine.
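A rough sketch of the correlation idea, under simplifying assumptions: patches are reduced to a single feature vector (1x1 patches), feature maps are nested lists indexed [row][col][channel], and `max_disp` is a hypothetical parameter bounding the displacements that are compared.

```python
def correlation_layer(feat_a, feat_b, max_disp=1):
    """Correlate each feature vector of feat_a with displaced ones of feat_b.

    Returns (out, disps): out[row][col] holds one dot product per
    displacement (dy, dx) listed in disps. Neighbours that fall outside
    the map are treated as zero vectors (zero padding).
    """
    height, width = len(feat_a), len(feat_a[0])
    disps = [(dy, dx)
             for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            scores = []
            for dy, dx in disps:
                yy, xx = y + dy, x + dx
                if 0 <= yy < height and 0 <= xx < width:
                    scores.append(sum(a * b for a, b in
                                      zip(feat_a[y][x], feat_b[yy][xx])))
                else:
                    scores.append(0.0)
            row.append(scores)
        out.append(row)
    return out, disps


# Toy 1x3 maps with 2 channels: feat_b is feat_a shifted one pixel right.
feat_a = [[[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]]
feat_b = [[[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]]
out, disps = correlation_layer(feat_a, feat_b)
best = max(range(len(disps)), key=lambda i: out[0][0][i])
print(disps[best])
```

Unlike a learned convolution, this layer has no trainable weights: one stream plays the role of the filter bank for the other.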
Optical Flow: FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution. Upconvolved feature maps are concatenated with the corresponding map from the contracting part.
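One expanding step could look like the sketch below. It assumes single-channel maps, nearest-neighbour 2x unpooling and a fixed 3x3 kernel; all three are simplifications for illustration, and the function names are hypothetical.

```python
def unpool2x(fmap):
    """Nearest-neighbour 2x unpooling of a 2D map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def conv3x3_same(fmap, kernel):
    """3x3 cross-correlation with zero padding (output keeps the input size)."""
    height, width = len(fmap), len(fmap[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += kernel[ky][kx] * fmap[yy][xx]
            out[y][x] = acc
    return out


def expand_step(coarse, skip, kernel):
    """Upconvolve `coarse`, then stack it with the same-size `skip` map."""
    up = conv3x3_same(unpool2x(coarse), kernel)
    if (len(up), len(up[0])) != (len(skip), len(skip[0])):
        raise ValueError("skip map must match the upconvolved size")
    return [up, skip]  # channel-first result with 2 channels


identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
coarse = [[1.0, 2.0]]                      # 1x2 coarse map
skip = [[9.0] * 4 for _ in range(2)]       # 2x4 map from the contracting part
merged = expand_step(coarse, skip, identity)
print(merged[0])
```

The concatenation lets the finer layers reuse high-resolution detail that was lost while contracting.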
Optical Flow: FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI.
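The photometric part of such an augmentation can be sketched as follows. The parameter ranges are assumptions chosen for illustration, not the values used by the authors, and geometric transformations are left out because they would also require transforming the ground-truth flow.

```python
import random


def augment_pixel(value, brightness, contrast, gamma, noise):
    """Apply photometric changes to one intensity in [0, 1]."""
    value = (value - 0.5) * contrast + 0.5   # contrast around mid-grey
    value = value + brightness               # additive brightness shift
    value = min(max(value, 0.0), 1.0)        # clip before the power law
    value = value ** gamma                   # gamma change
    value = value + noise                    # additive Gaussian noise
    return min(max(value, 0.0), 1.0)


def augment_image(image, rng):
    """Draw one parameter set per image; noise is drawn per pixel."""
    brightness = rng.uniform(-0.2, 0.2)      # assumed range
    contrast = rng.uniform(0.8, 1.2)         # assumed range
    gamma = rng.uniform(0.7, 1.5)            # assumed range
    return [[augment_pixel(v, brightness, contrast, gamma,
                           rng.gauss(0.0, 0.02))
             for v in row] for row in image]


rng = random.Random(0)
image = [[0.1, 0.5], [0.9, 0.3]]
augmented = augment_image(image, rng)
print(augmented)
```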
Data augmentation
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking: MDNet: Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking: MDNet: Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score.
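Hard negative mining reduces to a ranking step. In this minimal sketch the samples are toy (name, score) pairs and `positive_score` is a hypothetical callable standing in for the network's target-class output.

```python
def mine_hard_negatives(negative_samples, positive_score, num_hard):
    """Keep the negatives that the current model finds most target-like."""
    ranked = sorted(negative_samples, key=positive_score, reverse=True)
    return ranked[:num_hard]


# Toy example: the "model" just reads a stored score. A real tracker
# would score image patches with the network instead.
candidates = [("bg_sky", 0.05), ("bg_similar_object", 0.80),
              ("bg_road", 0.10), ("bg_partial_target", 0.65)]
hard = mine_hard_negatives(candidates, positive_score=lambda s: s[1],
                           num_hard=2)
print([name for name, _ in hard])
```

Training on these confusing samples, rather than on easy background, is what keeps the online update focused on distractors.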
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of a VGG-16 network pre-trained for ImageNet image classification.
[Figure panels: conv4-3, conv5-3]
Object tracking: FCNT: Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT: Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Object detectors emerge in deep scene CNNs. ICLR 2015. [Slides from ReadCV]
Object tracking: FCNT: Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
[Figure panels: conv4-3, conv5-3]
Object tracking: FCNT: Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT: Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets: Learn more
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets: Learn more
Online course: Deep Learning. Taking machine learning to the next level
ConvNets: Learn more
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
ConvNets: Learn more
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona
ConvNets: Learn more
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
ConvNets: Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
ConvNets: Learn more
"Machine learning" sub-Reddit
ConvNets: Learn more
Check profile requirements for Summer internship (disclaimer: offered to PhD students by default)
Company | Avg. salary/hour | Avg. salary/month
Yahoo | $43 | ($43 x 160 = $6,880)
Apple | $37 | ($37 x 160 = $5,920)
Google | $29.54-$31.32 | $7,151
Facebook | $22.92 | $6,150-$7,378
Microsoft | $22.63 | $6,506-$7,171
Source: Glassdoor.com (internships in California. No stipends included)
ConvNets: Learn more
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
ConvNets: Learn more
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
ConvNets: Learn more
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets: Learn more
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
ConvNets: Discussion
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing in the page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution | Last deadlines (on 28/1/2016)
FI (Catalonia) | 22/09/2015
FPU (Spain) | 15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification (ChaLearn Workshop)
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
Our current research
Saliency Prediction (LSUN Challenge)
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
Sentiment Analysis (CNN)
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S.'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S.'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you!
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014, June). Large-scale video classification with convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on (pp. 1725-1732). IEEE.
Slides extracted from ReadCV seminar by Victor Campos
Recognition: DeepVideo
Recognition: DeepVideo: Demo
Recognition: DeepVideo: Architectures
Unsupervised learning [Le et al. '11] | Supervised learning [Karpathy et al. '14]
Recognition: DeepVideo: Features
Recognition: DeepVideo: Multiscale
Recognition: DeepVideo: Results
Recognition
Figure: Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. Learning spatiotemporal features with 3D convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Recognition: C3D
Recognition: C3D: Demo
Recognition: C3D: Spatial dimension
Spatial dimensions (XY) of the kernels used are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015).
K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015.
Recognition: C3D: Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets.
[Figure labels: temporal depth; 2D ConvNets]
A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets.
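To make the 3x3x3 kernel concrete, here is a minimal single-channel sketch of a 3D convolution over a clip stored as volume[t][y][x]. It assumes no padding and stride 1, and uses an all-ones kernel purely as a toy example.

```python
def conv3d_valid(volume, kernel):
    """Valid 3D cross-correlation of volume[t][y][x] with kernel[dt][dy][dx]."""
    frames, height, width = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(frames - kt + 1):
        plane = []
        for y in range(height - kh + 1):
            row = []
            for x in range(width - kw + 1):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += (kernel[dt][dy][dx]
                                    * volume[t + dt][y + dy][x + dx])
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip: 4 frames of 3x3 pixels, frame t filled with the value t.
clip = [[[float(t)] * 3 for _ in range(3)] for t in range(4)]
ones = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
response = conv3d_valid(clip, ones)
print(len(response), response[0][0][0], response[1][0][0])
```

The output keeps a temporal axis (2 steps here), whereas a 2D convolution applied to stacked frames would collapse time after the first layer.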
Recognition: C3D: Temporal dimension
No gain when varying the temporal depth across layers.
Recognition: C3D: Architecture
Feature vector
Recognition: C3D: Feature vector
Video sequence
16-frame-long clips
8-frame-long overlap
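The clip layout can be sketched as a sliding window with stride 16 - 8 = 8 frames. Dropping trailing frames that do not fill a whole clip is an assumption of this sketch, and `split_into_clips` is a hypothetical helper.

```python
def split_into_clips(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame indices per clip, with end exclusive."""
    stride = clip_len - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than clip_len")
    return [(start, start + clip_len)
            for start in range(0, num_frames - clip_len + 1, stride)]


clips = split_into_clips(40)
print(clips)
```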
Recognition: C3D: Feature vector
[Figure: four 16-frame clips -> Average -> 4096-dim video descriptor -> L2 norm -> 4096-dim video descriptor]
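The pooling step amounts to averaging the per-clip feature vectors and L2-normalising the result. The sketch below uses toy 3-dimensional vectors in place of the 4096-dimensional ones, and `video_descriptor` is a hypothetical name.

```python
import math


def video_descriptor(clip_features):
    """Average per-clip feature vectors, then L2-normalise the mean."""
    count = len(clip_features)
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / count for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    if norm == 0.0:
        return mean  # nothing to normalise
    return [v / norm for v in mean]


clip_features = [[2.0, 0.0, 4.0], [4.0, 0.0, 4.0]]  # toy 3-dim features
descriptor = video_descriptor(clip_features)
print(descriptor)
```

After normalisation every video descriptor has unit length, so videos with different numbers of clips remain comparable.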
Recognition: C3D: Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition: C3D: Compactness
Recognition: C3D: Performance
Convolutional 3D (C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and is comparable with state-of-the-art methods on the other 2 benchmarks.
Recognition: C3D: Software
Implementation by Michael Gygli (GitHub)
Recognition: ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al. Object Detection in Videos with Tubelets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Optical Flow: Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors.
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size.
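The difference between rigid and relaxed matching can be illustrated with a deliberately simplified 1D sketch: a patch is split into two halves instead of four quadrants, and `slack` is a hypothetical parameter for the allowed shift of each half. It is not the authors' implementation.

```python
def rigid_match_score(left_resp, right_resp, position, half):
    """Both halves must sit exactly at their nominal offsets."""
    return 0.5 * (left_resp[position] + right_resp[position + half])


def relaxed_match_score(left_resp, right_resp, position, half, slack=1):
    """Each half may shift independently by up to `slack` positions.

    left_resp[i] / right_resp[i] is the similarity of the left / right
    half when placed at offset i in the target signal.
    """
    def best_near(resp, centre):
        lo = max(0, centre - slack)
        hi = min(len(resp) - 1, centre + slack)
        return max(resp[lo:hi + 1])

    return 0.5 * (best_near(left_resp, position)
                  + best_near(right_resp, position + half))


# The right half matches best one position further than a rigid patch allows.
left_resp = [0.1, 0.9, 0.1, 0.1, 0.1, 0.1]
right_resp = [0.1, 0.1, 0.1, 0.1, 0.9, 0.1]
rigid = rigid_match_score(left_resp, right_resp, position=1, half=2)
relaxed = relaxed_match_score(left_resp, right_resp, position=1, half=2)
print(rigid, relaxed)
```

The relaxed score stays high under a small non-rigid deformation, while the rigid score drops because one half no longer lines up.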
Optical Flow: Deep Matching
Optical Flow: 2D correlation
Source: Matlab R2015b documentation for normxcorr2, by MathWorks.
Image
Sub-Image
Offset of the sub-image with respect to the image: [0,0]
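A plain-Python sketch of normalized 2D cross-correlation follows. Unlike MATLAB's normxcorr2, which returns a padded full-size map, this simplified version only scores offsets where the sub-image lies entirely inside the image.

```python
import math


def normalized_cross_correlation(image, template):
    """Return NCC scores for every valid offset of `template` in `image`."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    scores = []
    for y in range(ih - th + 1):
        row = []
        for x in range(iw - tw + 1):
            w_vals = [image[y + dy][x + dx]
                      for dy in range(th) for dx in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            denom = t_norm * w_norm
            dot = sum(a * b for a, b in zip(t_dev, w_dev))
            row.append(dot / denom if denom else 0.0)  # flat window -> 0
        scores.append(row)
    return scores


image = [[0, 0, 0, 0],
         [0, 1, 2, 0],
         [0, 3, 4, 0],
         [0, 0, 0, 0]]
template = [[1, 2], [3, 4]]
scores = normalized_cross_correlation(image, template)
best_offset = max(((y, x) for y in range(3) for x in range(3)),
                  key=lambda p: scores[p[0]][p[1]])
print(best_offset)
```

The peak of the score map gives the offset of the sub-image with respect to the image.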
Optical Flow: Deep Matching
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image; as a result, a correlation map is generated for each reference patch.
Optical Flow: Deep Matching
The most discriminative response map
The least discriminative response map
Optical Flow: Deep Matching
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Pyramid levels: 4x4 patches, 8x8 patches, 16x16 patches, 32x32 patches
Bottom-up extraction (BU)
Top-down matching (TD)
Optical Flow: Deep Matching (BU)
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shift of the sub-patches that generated it.
Optical Flow: Deep Matching (TD)
Optical Flow: Deep Matching
[Figure panels: Ground truth; Dense HOG [Brox & Malik 2011]; Deep Matching]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
Neil Lawrence OpenAI wonrsquot benefit humanity without open data sharing (The Guardian 14122015)
Is Computer Vision solved
ConvNets Discussion
92
Sports Do you know them
93
ConvNets Do you know them
94
Antonio Torralba MIT(former UPC)
and MANY MORE I am missing in the page (apologies)
Oriol Vinyals Google(former UPC)
Jose M Aacutelvarez NICTA(former URL amp UAB)
Joan Bruna Berkeley(former UPC)
95
ConvNets Where you are studyingVisioCat dinner CVPR 2015
Considering a Phd at GPI-UPC Currently no direct funding available (check in the future)We can support your application to scholarships
External grant listings UPC UPF
Funding institution Last deadlines (on 2812016)
FI (Catalonia) 22092015
FPU (Spain) 15012016
Check our activity at httpsimatgeupceduweb 96
Image Classification
97
Our past research
A Salvador Zeppelzauer M Manchon-Vizuete D Calafell-Oroacutes A and Giroacute-i-Nieto X ldquoCultural Event Recognition with Visual ConvNets and Temporal Modelsrdquo in CVPR ChaLearn Looking at People Workshop 2015 2015 [slides]
ChaLearn Worshop
Saliency Prediction
J Pan and Giroacute-i-Nieto X ldquoEnd-to-end Convolutional Network for Saliency Predictionrdquo in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops Boston MA (USA) 2015 [Slides] 98
Our current research
LSUN Challenge
Sentiment Analysis
99
Our current research
[Slides]
CNN
V Campos Salvador A Jou B and Giroacute-i-Nieto X ldquoDiving Deep into Sentiment Understanding Fine-tuned CNNs for Visual Sentiment Predictionrdquo in 1st International Workshop on Affect and Sentiment in Multimedia Brisbane Australia 2015
Our current research
Instance Search in Video
100
V - T Nguyen -Dinh-Le D Salvador A -Zhu C Nguyen D - L Tran M - T Duc T Ngo Duong D Anh Satoh S ichi and Giroacute-i-Nieto X ldquoNII-HITACHI-UIT at TRECVID 2015 Instance Searchrdquo in TRECVID 2015 Workshop Gaithersburg MD USA 2015
K McGuinness Mohedano E Salvador A Zhang Z X Marsden M Wang P Jargalsaikhan I Antony J Giroacute-i-Nieto X Satoh S ichi OConnor N and Smeaton A F ldquoInsight DCU at TRECVID 2015rdquo in TRECVID 2015 Workshop Gaithersburg MD USA 2015
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014, June). Large-scale video classification with convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on (pp. 1725-1732). IEEE.
Recognition: DeepVideo: Demo
Recognition: DeepVideo: Architectures
Recognition: DeepVideo: Features
Unsupervised learning [Le et al. '11]; supervised learning [Karpathy et al. '14]
Recognition: DeepVideo: Multiscale
Recognition: DeepVideo: Results
Recognition
Figure: Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Recognition: C3D
Recognition: C3D: Demo
Recognition: C3D: Spatial dimension
Spatial dimensions (XY) of the kernels are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015).
K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015.
Recognition: C3D: Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets.
Figure labels: temporal depth; 2D ConvNets.
A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best-performing architectures for 3D ConvNets.
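To make the 3x3x3 kernel concrete, here is a minimal sketch of a single 3D convolution over a short clip. It is an illustration only, not the C3D implementation: the function name `conv3d_valid`, the toy clip and the averaging kernel are all assumptions, and, as in most deep learning frameworks, the kernel is not flipped (strictly a cross-correlation).

```python
def conv3d_valid(clip, kernel):
    """Slide one kernel over a clip (valid mode, stride 1, no flipping).

    clip:   nested list indexed [t][y][x] (frames, rows, columns)
    kernel: nested list indexed [dt][dy][dx]
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal axis as well as the two spatial axes.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip (assumption): 4 frames of 5x5 pixels whose value equals the frame index.
clip = [[[float(t)] * 5 for _ in range(5)] for t in range(4)]
# 3x3x3 averaging kernel (all weights 1/27), chosen only for illustration.
kernel = [[[1.0 / 27] * 3 for _ in range(3)] for _ in range(3)]
out = conv3d_valid(clip, kernel)
```

Unlike a 2D convolution, the output keeps a temporal axis: here 4 input frames give 2 output frames, each mixing 3 consecutive input frames.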
No gain when varying the temporal depth across layers.
Recognition: C3D: Architecture
Feature vector
Recognition: C3D: Feature vector
A video sequence is split into 16-frame clips with an 8-frame overlap between consecutive clips.
Figure: the features of each 16-frame clip are averaged into a 4096-dim video descriptor, which is then L2-normalized into the final 4096-dim video descriptor.
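The clip splitting, averaging and L2 normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `split_into_clips` and `video_descriptor` are hypothetical, and the per-clip features are 4-dim stand-ins rather than real 4096-dim network activations.

```python
import math


def split_into_clips(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame indices of overlapping clips."""
    stride = clip_len - overlap
    return [(s, s + clip_len) for s in range(0, num_frames - clip_len + 1, stride)]


def video_descriptor(clip_features):
    """Average per-clip feature vectors, then L2-normalize the result."""
    n = len(clip_features)
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / n for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean


clips = split_into_clips(40)
# Stand-in features (assumption): 4-dim instead of 4096-dim to keep the example small.
features = [[3.0, 0.0, 4.0, 0.0] for _ in clips]
descriptor = video_descriptor(features)
```

A 40-frame video yields four clips starting at frames 0, 8, 16 and 24, and the resulting descriptor has unit L2 norm regardless of the number of clips.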
Recognition: C3D: Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition: C3D: Compactness
Recognition: C3D: Performance
Convolutional 3D (C3D) features combined with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with state-of-the-art methods on the other 2 benchmarks.
Recognition: C3D: Software
Implementation by Michael Gygli (GitHub)
Recognition: ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al., Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow: Small vs Large
Optical Flow
Classic approach: rigid matching of HoG or SIFT descriptors.
Deep Matching: allow each subpatch to move independently in a limited range depending on its size.
Optical Flow: Deep Matching
Optical Flow: 2D correlation
Figure: an image, a sub-image, and the offset of the sub-image with respect to the image, [0, 0].
Source: Matlab R2015b documentation for normxcorr2, by Mathworks.
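As a rough illustration of 2D correlation between a sub-image and an image, the sketch below computes a normalized cross-correlation map in plain Python. It is not Matlab's normxcorr2: the function name `normxcorr2_valid` is an assumption, and it only evaluates offsets where the sub-image fits entirely inside the image, whereas normxcorr2 also returns the partially overlapping borders.

```python
import math


def normxcorr2_valid(template, image):
    """Normalized cross-correlation at every offset where the template fits.

    Returns a map indexed [dy][dx]; values lie in [-1, 1].
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_energy = sum(d * d for d in t_dev)
    result = []
    for dy in range(ih - th + 1):
        row = []
        for dx in range(iw - tw + 1):
            # Image window under the template at this offset, flattened row by row.
            w_vals = [image[dy + y][dx + x] for y in range(th) for x in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_energy = sum(d * d for d in w_dev)
            denom = math.sqrt(t_energy * w_energy)
            num = sum(a * b for a, b in zip(t_dev, w_dev))
            # Flat windows have zero variance; report 0 instead of dividing by zero.
            row.append(num / denom if denom > 0 else 0.0)
        result.append(row)
    return result


image = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],
]
sub_image = [[1, 2], [3, 4]]
corr = normxcorr2_valid(sub_image, image)
# Offset (row, column) of the strongest response.
best = max(
    ((dy, dx) for dy in range(len(corr)) for dx in range(len(corr[0]))),
    key=lambda p: corr[p[0]][p[1]],
)
```

The peak of the correlation map gives the offset of the sub-image inside the image, which is the quantity the slide's figure annotates.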
Optical Flow: Deep Matching
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
Optical Flow: Deep Matching
Figure labels: the most discriminative response map; the less discriminative response map.
Key idea: build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Pyramid levels: 4x4 patches, 8x8 patches, 16x16 patches, 32x32 patches.
Bottom-up extraction (BU); top-down matching (TD).
Optical Flow: Deep Matching (BU)
Bottom-up extraction (BU): 4x4 patches, 8x8 patches, 16x16 patches, 32x32 patches.
Optical Flow: Deep Matching (TD)
Top-down matching (TD): 4x4 patches, 8x8 patches, 16x16 patches, 32x32 patches.
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
Optical Flow: Deep Matching
Figure panels: ground truth; Dense HOG [Brox & Malik 2011]; Deep Matching.
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow: FlowNet
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: convolution of data patches from the layers to combine.
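A minimal sketch of what such a correlation layer computes, under simplifying assumptions: patches are reduced to a single position (1x1), displacements are limited to `max_disp`, and positions outside the second map count as zeros. The function name `correlation_layer` and the toy feature maps are hypothetical, and the real layer operates on learned feature maps from the two streams.

```python
def correlation_layer(feat_a, feat_b, max_disp=1):
    """Compare every position of feat_a with nearby positions of feat_b.

    feat_a, feat_b: feature maps indexed [y][x] -> list of channel values.
    Returns (out, disps) where out[y][x] is a list of dot products, one per
    displacement (dy, dx) in disps. Positions outside feat_b contribute 0.
    """
    H, W = len(feat_a), len(feat_a[0])
    disps = [(dy, dx)
             for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = []
    for y in range(H):
        row = []
        for x in range(W):
            scores = []
            for dy, dx in disps:
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    scores.append(sum(a * b for a, b in zip(feat_a[y][x], feat_b[yy][xx])))
                else:
                    scores.append(0.0)
            row.append(scores)
        out.append(row)
    return out, disps


# Toy 3x3 maps with 2 channels (assumption): the only non-zero feature
# moves one pixel to the right between the first and the second map.
feat_a = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
feat_b = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
feat_a[1][0] = [1.0, 2.0]
feat_b[1][1] = [1.0, 2.0]
corr, disps = correlation_layer(feat_a, feat_b)
# Displacement with the strongest response at position (1, 0).
best_disp = disps[max(range(len(disps)), key=lambda i: corr[1][0][i])]
```

Each output position holds one score per candidate displacement, so later layers can read off which shift best explains the motion at that location.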
Optical Flow: FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution. Upconvolved feature maps are concatenated with the corresponding map from the contractive part.
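The unpooling and concatenation steps can be sketched as below. This is a simplified illustration, not the FlowNet code: unpooling is assumed to be nearest-neighbour repetition, the convolution that follows it is omitted, and the names `unpool2x` and `concat_channels` are hypothetical.

```python
def unpool2x(fmap):
    """Nearest-neighbour unpooling: repeat every value in a 2x2 block.

    fmap is indexed [c][y][x].
    """
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]
            rows.append(wide)
            rows.append(list(wide))
        out.append(rows)
    return out


def concat_channels(a, b):
    """Concatenate two [c][y][x] maps along the channel axis."""
    assert len(a[0]) == len(b[0]) and len(a[0][0]) == len(b[0][0])
    return a + b


# Coarse map from the expanding part: 1 channel, 2x2 (toy values).
coarse = [[[1, 2], [3, 4]]]
# Map from the contractive part at the finer resolution: 2 channels, 4x4.
skip = [[[0] * 4 for _ in range(4)] for _ in range(2)]
merged = concat_channels(unpool2x(coarse), skip)
```

The concatenation lets the expanding part combine coarse, high-level information with the finer spatial detail kept in the contractive part.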
Optical Flow: FlowNet
Data augmentation
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation, and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma, and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet: Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking: MDNet: Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score.
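Hard negative mining as described above amounts to ranking the negative samples by the positive score the current model gives them and keeping the top ones. The sketch below is illustrative only: `hardest_negatives`, the sample names and the scores are hypothetical, and in a tracker the scores would come from the network itself.

```python
def hardest_negatives(candidates, positive_score, num_keep):
    """Keep the negatives that the current model scores as most positive.

    candidates:     list of negative samples (any objects)
    positive_score: function mapping a sample to its positive-class score
    num_keep:       how many hard negatives to return
    """
    ranked = sorted(candidates, key=positive_score, reverse=True)
    return ranked[:num_keep]


# Hypothetical negatives, each a (name, positive score) pair.
negatives = [
    ("sky", 0.05),
    ("distractor", 0.80),
    ("ground", 0.10),
    ("similar_object", 0.65),
]
hard = hardest_negatives(negatives, positive_score=lambda s: s[1], num_keep=2)
```

Training on these samples concentrates the update on the background regions the model currently confuses with the target.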
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification.
Object tracking: FCNT: Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT: Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking: FCNT: Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Object tracking: FCNT: Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT: Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona
Summer course: Deep Learning for Computer Vision (2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, “Deep Learning: Past, Present & Future”
“Machine learning” sub-Reddit
Check profile requirements for Summer internships (disclaimer: offered to PhD students by default)
Company | Avg. salary / hour | Avg. salary / month
Yahoo | $43 | ($43 x 160 = $6,880)
Apple | $37 | ($37 x 160 = $5,920)
Google | $29.54-$31.32 | $7,151
Facebook | $22.92 | $6,150-$7,378
Microsoft | $22.63 | $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
Video: Cristian Canton's talk, “From Catalonia to America: notes on how to achieve a successful post-PhD career”, ACMCV 2015 & UPC
Li Fei-Fei, “How we're teaching computers to understand pictures”, TEDTalks 2014
Jeremy Howard, “The wonderful and terrifying implications of computers that can learn”, TEDTalks 2014
Neil Lawrence, “OpenAI won't benefit humanity without open data sharing” (The Guardian, 14/12/2015)
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 14
Recognition DeepVideo Architectures
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 15
Unsupervised learning [Le at alrsquo11] Supervised learning [Karpathy et alrsquo14]
Recognition DeepVideo Features
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 16
Recognition DeepVideo Multiscale
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 17
Recognition DeepVideo Results
18
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
19
Recognition C3D
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
20Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Demo
21K Simonyan A Zisserman Very Deep Convolutional Networks for Large-Scale Image Recognition ICLR 2015
Recognition C3D Spatial dimensionSpatial dimensions (XY) of the used kernels are fixed to 3x3 following Symonian amp Zisserman (ICLR 2015)
22Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Temporal dimension3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets
Temporal depth
2D ConvNets
23Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
A homogeneous architecture with small 3 times 3 times 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
24Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
29Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Compactness
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Optical Flow Deep Matching (BU)
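The bottom-up step can be illustrated with a simplified sketch: the correlation map of a larger patch is obtained by max-pooling the maps of its four sub-patches (so each sub-patch may move slightly) and averaging them at shifted positions. The names `max_pool3`, `aggregate` and `peak_map`, the stride-1 pooling and the border clamping are assumptions made for this illustration. The published method has further steps, such as subsampling, that are left out here.

```python
def max_pool3(corr):
    """3x3 max-pooling with stride 1; borders use the available neighbours."""
    H, W = len(corr), len(corr[0])
    return [[max(corr[j][i]
                 for j in range(max(0, y - 1), min(H, y + 2))
                 for i in range(max(0, x - 1), min(W, x + 2)))
             for x in range(W)]
            for y in range(H)]

def aggregate(children, offsets):
    """Average the pooled child maps, each read at its own shifted position."""
    H, W = len(children[0]), len(children[0][0])
    pooled = [max_pool3(c) for c in children]
    parent = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            total = 0.0
            for p, (dy, dx) in zip(pooled, offsets):
                yy = min(max(y + dy, 0), H - 1)  # clamp at the border
                xx = min(max(x + dx, 0), W - 1)
                total += p[yy][xx]
            parent[y][x] = total / len(children)
    return parent

def peak_map(y, x, size=5):
    """Toy correlation map with a single perfect response at (y, x)."""
    m = [[0.0] * size for _ in range(size)]
    m[y][x] = 1.0
    return m

# Positions of the four sub-patches relative to the centre of the parent patch.
offsets = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
# The last sub-patch matches one pixel away from its rigid position.
children = [peak_map(1, 1), peak_map(1, 3), peak_map(3, 1), peak_map(3, 4)]
parent = aggregate(children, offsets)
```

In this toy case `parent[2][2]` still reaches 1.0 although one sub-patch is displaced by a pixel, which is the tolerance that rigid descriptor matching lacks.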
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches.
If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
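The top-down step can be sketched for a single level of the pyramid: starting from a maximum of a parent map, look one scale below and find where each sub-patch responded best within a small window around its expected position. The names `local_argmax`, `backtrack` and `peak_map`, the 3x3 window and the toy maps are assumptions made for illustration. A full implementation would repeat this recursively down to the smallest patches.

```python
def local_argmax(corr, y, x):
    """Position of the best response in the 3x3 window centred on (y, x)."""
    H, W = len(corr), len(corr[0])
    candidates = [(corr[j][i], j, i)
                  for j in range(max(0, y - 1), min(H, y + 2))
                  for i in range(max(0, x - 1), min(W, x + 2))]
    _, best_y, best_x = max(candidates)
    return best_y, best_x

def backtrack(children, offsets, y, x):
    """For a parent maximum at (y, x), recover where each sub-patch matched."""
    H, W = len(children[0]), len(children[0][0])
    matches = []
    for corr, (dy, dx) in zip(children, offsets):
        yy = min(max(y + dy, 0), H - 1)  # expected position of this sub-patch
        xx = min(max(x + dx, 0), W - 1)
        matches.append(local_argmax(corr, yy, xx))
    return matches

def peak_map(y, x, size=5):
    """Toy correlation map with a single perfect response at (y, x)."""
    m = [[0.0] * size for _ in range(size)]
    m[y][x] = 1.0
    return m

offsets = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
children = [peak_map(1, 1), peak_map(1, 3), peak_map(3, 1), peak_map(3, 4)]
matches = backtrack(children, offsets, 2, 2)
# matches == [(1, 1), (1, 3), (3, 1), (3, 4)]
```

The last sub-patch is recovered at (3, 4) rather than at its rigid position (3, 3), so each sub-patch ends up with its own shift.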
Optical Flow Deep Matching
Ground truth
Dense HOG [Brox & Malik, 2011]
Deep Matching
Optical Flow Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., & Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
End-to-end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Option B: Create two separate, yet identical, processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Correlation layer: Convolution of data patches from one stream with data patches from the other stream, which is how the two streams are combined
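A minimal sketch of a correlation layer of this kind, assuming feature maps stored as nested lists of shape height x width x channels. For brevity each "patch" is a single feature vector, and the name `correlation_layer` and the `max_disp` parameter are illustrative assumptions rather than the interface of the paper's code.

```python
def correlation_layer(f1, f2, max_disp=1):
    """Dot products between f1 at each position and f2 at nearby displacements."""
    H, W = len(f1), len(f1[0])
    disps = [(dy, dx)
             for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = [[[0.0] * len(disps) for _ in range(W)] for _ in range(H)]
    for y in range(H):
        for x in range(W):
            for k, (dy, dx) in enumerate(disps):
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:  # outside the map counts as zero
                    out[y][x][k] = sum(a * b for a, b in zip(f1[y][x], f2[yy][xx]))
    return out, disps

# Toy feature maps: the same feature vector sits one pixel to the right in f2.
f1 = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
f2 = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
f1[1][1] = [1.0, 2.0]
f2[1][2] = [1.0, 2.0]
out, disps = correlation_layer(f1, f2)
best = disps[out[1][1].index(max(out[1][1]))]  # (0, 1)
```

The output has one channel per candidate displacement and involves no learned weights, since data is compared with data rather than with a filter.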
Optical Flow FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution
Upconvolved feature maps are concatenated with the corresponding feature map from the contracting part
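One way to picture an upconvolutional layer with a skip connection, as a sketch only: feature maps are unpooled to twice the resolution, convolved, and concatenated channel-wise with the map of the same resolution from the contracting part. Here a feature map is a list of 2D channels, unpooling simply repeats each value in a 2x2 block, and one kernel is shared by all channels. These choices, like the function names, are simplifying assumptions for illustration.

```python
def unpool2x(channel):
    """Double the resolution by repeating every value in a 2x2 block."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def conv3x3(channel, kernel):
    """3x3 filtering with zero padding (cross-correlation, as in most ConvNet code)."""
    H, W = len(channel), len(channel[0])
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < H and 0 <= xx < W:
                        acc += channel[yy][xx] * kernel[ky][kx]
            out[y][x] = acc
    return out

def upconv_and_concat(coarse, skip, kernel):
    """Upsample and filter the coarse channels, then append the skip channels."""
    up = [conv3x3(unpool2x(c), kernel) for c in coarse]
    return up + skip  # channel-wise concatenation

identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
coarse = [[[1.0, 2.0], [3.0, 4.0]]]      # one 2x2 channel from the expanding part
skip = [[[0.5] * 4 for _ in range(4)]]   # one 4x4 channel from the contracting part
merged = upconv_and_concat(coarse, skip, identity)
```

`merged` holds two 4x4 channels: the upsampled coarse map followed by the skip map, so later layers see both coarse context and fine detail.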
Optical Flow FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated… and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
Data augmentation
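A hedged sketch of the photometric part of such an augmentation (brightness, contrast, gamma and additive Gaussian noise) on a grayscale image with values in [0, 1]. The parameter ranges and the function name `augment` are assumptions chosen for illustration, not the values used by the authors. Geometric transformations are left out because they would also require transforming the ground-truth flow field.

```python
import random

def augment(image, rng, noise_sigma=0.02):
    """Apply random contrast, brightness, Gaussian noise and gamma to an image."""
    brightness = rng.uniform(-0.1, 0.1)   # assumed range
    contrast = rng.uniform(0.8, 1.2)      # assumed range
    gamma = rng.uniform(0.7, 1.5)         # assumed range
    out = []
    for row in image:
        new_row = []
        for v in row:
            v = (v - 0.5) * contrast + 0.5 + brightness
            v += rng.gauss(0.0, noise_sigma)
            v = min(max(v, 0.0), 1.0)     # keep the value in [0, 1] before gamma
            new_row.append(v ** gamma)
        out.append(new_row)
    return out

rng = random.Random(0)
image = [[0.0, 0.25, 0.5], [0.75, 1.0, 0.5]]
augmented = augment(image, rng)
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible across runs.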
Optical Flow FlowNet
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but they are replaced by a single new one at test time
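The multi-domain idea can be sketched with a toy model: one set of shared layers and one small output branch per training sequence, with the branches swapped for a single fresh one when tracking starts. The class `MultiDomainNet`, its tiny fully connected layers and the two-score output (target, background) are assumptions made for illustration, not the MDNet code.

```python
import random

class MultiDomainNet:
    """Shared layers plus one small output branch per training sequence (domain)."""

    def __init__(self, in_dim, feat_dim, num_domains, seed=0):
        self._rng = random.Random(seed)
        self.feat_dim = feat_dim
        self.shared = self._random_matrix(feat_dim, in_dim)
        self.branches = [self._random_matrix(2, feat_dim) for _ in range(num_domains)]

    def _random_matrix(self, rows, cols):
        return [[self._rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    def features(self, x):
        """Shared representation (one linear layer followed by ReLU)."""
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.shared]

    def scores(self, x, domain=0):
        """Two scores (target, background) from the branch of the given domain."""
        feats = self.features(x)
        return [sum(w * f for w, f in zip(row, feats)) for row in self.branches[domain]]

    def start_tracking(self):
        """Keep the shared layers, replace all branches by a single new one."""
        self.branches = [self._random_matrix(2, self.feat_dim)]

net = MultiDomainNet(in_dim=4, feat_dim=3, num_domains=5)
shared_before = [row[:] for row in net.shared]
net.start_tracking()
target_score, background_score = net.scores([0.2, 0.1, 0.4, 0.3])
```

The shared layers carry what is common to all sequences, while each branch only has to separate target from background in its own sequence.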
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score
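Hard negative mining as described here reduces to a ranking step. A minimal sketch, assuming each negative sample already carries the positive (target) score assigned by the current model; the function name, the tuple layout and the numbers are hypothetical.

```python
def hard_negatives(negative_samples, positive_score, keep):
    """Keep the negatives that the current model scores most like the target."""
    ranked = sorted(negative_samples, key=positive_score, reverse=True)
    return ranked[:keep]

# Each candidate is (name, positive score given by the current model).
candidates = [("box_a", 0.10), ("box_b", 0.85), ("box_c", 0.40), ("box_d", 0.70)]
hard = hard_negatives(candidates, positive_score=lambda s: s[1], keep=2)
# hard == [("box_b", 0.85), ("box_d", 0.70)]
```

Training on these samples concentrates each update on the background regions that are most easily confused with the target.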
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization… but they are not discriminative enough to distinguish different objects of the same category
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Object detectors emerge in deep scene CNNs. ICLR 2015.
[Zhou et al., ICLR 2015] “Object detectors emerge in deep scene CNNs” [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation…
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets Learn more
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets Learn more
ConvNets Learn more
Online course: Deep Learning
Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
Barcelona Convolucionada
Deep Learning a l’abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
ConvNets Learn more
Grup d’estudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
ConvNets Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, “Deep Learning: Past, Present & Future”
ConvNets Learn more
ConvNets Learn more
“Machine learning” sub-Reddit
ConvNets Learn more
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary/hour    Avg. salary/month
Yahoo        $43                 ($43 x 160 = $6,880)
Apple        $37                 ($37 x 160 = $5,920)
Google       $29.54-$31.32       $7,151
Facebook     $22.92              $6,150-$7,378
Microsoft    $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California. No stipends included.)
ConvNets Learn more
Video: Cristian Canton’s talk “From Catalonia to America: notes on how to achieve a successful post-PhD career”, ACMCV 2015 & UPC
Li Fei-Fei, “How we’re teaching computers to understand pictures”, TEDTalks 2014
ConvNets Learn more
Jeremy Howard, “The wonderful and terrifying implications of computers that can learn”, TEDTalks 2014
ConvNets Learn more
Neil Lawrence, “OpenAI won’t benefit humanity without open data sharing” (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets Discussion
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
…and MANY MORE I am missing on this page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future).
We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Image Classification
Our past research
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., “Cultural Event Recognition with Visual ConvNets and Temporal Models”, in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
J. Pan and Giró-i-Nieto, X., “End-to-end Convolutional Network for Saliency Prediction”, in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
LSUN Challenge
Sentiment Analysis
Our current research
[Slides]
CNN
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., “Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction”, in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015.
Our current research
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S., and Giró-i-Nieto, X., “NII-HITACHI-UIT at TRECVID 2015 Instance Search”, in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., “Insight DCU at TRECVID 2015”, in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Unsupervised learning [Le et al. ’11]    Supervised learning [Karpathy et al. ’14]
Recognition DeepVideo Features
Recognition DeepVideo Multiscale
Recognition DeepVideo Results
Recognition
Recognition C3D
Recognition C3D Demo
K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015.
Recognition C3D Spatial dimension
Spatial dimensions (XY) of the kernels used are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015)
Recognition C3D Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets
Temporal depth
2D ConvNets
A homogeneous architecture with small 3 × 3 × 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
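To make the 3 × 3 × 3 kernel concrete, here is a minimal sketch of a single-channel 3D convolution over a (time, height, width) volume without padding. The function name `conv3d_valid` and the toy volume are assumptions for illustration. Real networks add multiple channels, padding, bias terms and non-linearities.

```python
def conv3d_valid(volume, kernel):
    """Slide a 3D kernel over a (time, height, width) volume, no padding."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

# A 4-frame clip of 4x4 frames filled with ones, and an all-ones 3x3x3 kernel.
volume = [[[1.0] * 4 for _ in range(4)] for _ in range(4)]
kernel = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
response = conv3d_valid(volume, kernel)  # shape 2x2x2, every value 27.0
```

Unlike a 2D convolution applied to stacked frames, the output keeps a time axis, so temporal information survives into the next layer.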
Recognition C3D Temporal dimension
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
Recognition C3D Architecture
Feature vector
Recognition C3D Feature vector
Video sequence
16-frame-long clips
8-frame-long overlap between consecutive clips
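The clip layout can be written down in a few lines. A minimal sketch, assuming frames are indexed from 0 and that trailing frames which do not fill a whole clip are dropped (the slide does not say how the end of the video is handled); the function name is hypothetical.

```python
def split_into_clips(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame indices of overlapping clips, end exclusive."""
    stride = clip_len - overlap
    return [(start, start + clip_len)
            for start in range(0, num_frames - clip_len + 1, stride)]

clips = split_into_clips(40)
# clips == [(0, 16), (8, 24), (16, 32), (24, 40)]
```

With an 8-frame overlap, every frame except the first and last 8 belongs to two clips.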
Recognition C3D Feature vector
[Figure: features of the 16-frame clips → Average → 4096-dim video descriptor → L2 norm → 4096-dim video descriptor]
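The pooling shown in the figure, averaging the per-clip features and then L2-normalizing the result, can be sketched as follows. The function name `video_descriptor` is hypothetical and the 2-dimensional toy vectors stand in for the 4096-dimensional clip features.

```python
import math

def video_descriptor(clip_features):
    """Average the clip features, then scale the result to unit L2 norm."""
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / len(clip_features) for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean

# Two toy clip features; their average is [3, 4], whose L2 norm is 5.
descriptor = video_descriptor([[3.0, 0.0], [3.0, 8.0]])
# descriptor is approximately [0.6, 0.8]
```

The result has the same dimension whatever the number of clips, so videos of different lengths can be fed to the same classifier.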
Recognition C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details
Recognition C3D Compactness
Convolutional 3D (C3D) features combined with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with state-of-the-art methods on the other 2 benchmarks
Recognition C3D Performance
Recognition C3D Software
Implementation by Michael Gygli (GitHub)
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
ConvNets Learn more
Seminar Series: "Compacting ConvNets for End to End Learning"
Jose M. Álvarez
Tuesday, February 2, 4pm, D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: "Deep Learning: Taking machine learning to the next level"
ReadCV seminar: Friendly reviews of SoA papers. Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: "Deep Learning a l'abast de tothom" (Deep Learning within everyone's reach). Monday, February 1, 7pm, FIB, Campus Nord, UPC
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course: Deep Learning for Computer Vision (2.5 ECTS for MSc & PhD). July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for Summer internship (disclaimer: offered to PhD students by default)
Company      Avg. salary/hour    Avg. salary/month
Yahoo        $43                 ($43 x 160 = $6,880)
Apple        $37                 ($37 x 160 = $5,920)
Google       $29.54-$31.32       $7,151
Facebook     $22.92              $6,150-$7,378
Microsoft    $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California. No stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
ConvNets Discussion
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
... and MANY MORE I am missing in the page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification (ChaLearn Workshop)
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
Our current research
Saliency Prediction (LSUN Challenge)
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S., and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available online
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014, June). Large-scale video classification with convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on (pp. 1725-1732). IEEE.
Recognition DeepVideo Multiscale
Recognition DeepVideo Results
Recognition
Figure: Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Recognition C3D
Recognition C3D Demo
Recognition C3D Spatial dimension
Spatial dimensions (XY) of the kernels used are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015).
K. Simonyan, A. Zisserman. "Very Deep Convolutional Networks for Large-Scale Image Recognition." ICLR 2015.
Recognition C3D Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets.
(Figure labels: temporal depth, 2D ConvNets)
A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets.
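To make the 3x3x3 kernel choice more concrete, the following minimal sketch counts the weights of a single 3D convolution layer and computes its output size. The function names, the 16x112x112 clip size and the 64 filters are illustrative assumptions, not values taken from the slides.

```python
def conv3d_param_count(in_channels, out_channels, kernel=(3, 3, 3), bias=True):
    """Number of learnable parameters in one 3D convolution layer."""
    kd, kh, kw = kernel
    weights = in_channels * out_channels * kd * kh * kw
    return weights + (out_channels if bias else 0)


def conv3d_output_shape(in_shape, kernel=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1)):
    """Output (frames, height, width) for an input of shape (frames, height, width)."""
    return tuple(
        (size + 2 * p - k) // s + 1
        for size, k, s, p in zip(in_shape, kernel, stride, padding)
    )


# Hypothetical example: RGB input (3 channels), 64 output filters.
print(conv3d_param_count(3, 64))                     # 3*64*27 + 64 = 5248
print(conv3d_param_count(3, 64, kernel=(1, 3, 3)))   # temporal depth 1: 3*64*9 + 64 = 1792
print(conv3d_output_shape((16, 112, 112)))           # (16, 112, 112) with padding 1
```

With padding 1 and stride 1, a 3x3x3 kernel preserves the temporal and spatial size, which is one reason such kernels can be stacked homogeneously across layers.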
No gain when varying the temporal depth across layers.
Recognition C3D Architecture
(Figure label: feature vector)
Recognition C3D Feature vector
Video sequence: split into 16-frame-long clips with an 8-frame-long overlap.
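The clip extraction described above can be sketched as a simple sliding window. This is a minimal illustration: the function name is hypothetical, and dropping trailing frames that do not fill a whole clip is an assumption, since the slides do not say how the end of the video is handled.

```python
def split_into_clips(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame index pairs, end exclusive, for overlapping clips."""
    if not 0 <= overlap < clip_len:
        raise ValueError("overlap must be in [0, clip_len)")
    stride = clip_len - overlap  # 16-frame clips with 8-frame overlap -> stride of 8
    return [
        (start, start + clip_len)
        for start in range(0, num_frames - clip_len + 1, stride)
    ]


print(split_into_clips(40))  # [(0, 16), (8, 24), (16, 32), (24, 40)]
```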
Pipeline (figure): 16-frame clips -> average -> 4096-dim video descriptor -> L2 norm -> 4096-dim video descriptor
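The averaging and L2 normalization step can be sketched as follows. This is a minimal pure-Python illustration with toy 2-dimensional features standing in for the 4096-dimensional clip features; the function name is hypothetical.

```python
import math


def video_descriptor(clip_features):
    """Average per-clip feature vectors, then L2-normalize the result."""
    if not clip_features:
        raise ValueError("need at least one clip feature vector")
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / len(clip_features) for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    if norm == 0.0:
        return mean  # all-zero vector: leave it unnormalized
    return [v / norm for v in mean]


# Toy example: two clips with 2-dim features (real features would be 4096-dim).
clips = [[3.0, 0.0], [3.0, 8.0]]
print(video_descriptor(clips))  # [0.6, 0.8]
```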
Recognition C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition C3D Compactness
Recognition C3D Performance
Convolutional 3D (C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and is comparable with state-of-the-art methods on the other 2 benchmarks.
Recognition C3D Software
Implementation by Michael Gygli (GitHub)
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al. "Object Detection in Videos with TubeLets and Multi-Context Cues" (ILSVRC 2015). [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors.
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size.
Optical Flow Deep Matching
Optical Flow 2D correlation
(Figure: image, sub-image, and offset of the sub-image with respect to the image [0,0])
Source: Matlab R2015b documentation for normxcorr2, by MathWorks.
Optical Flow Deep Matching
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
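As a rough illustration of how one correlation map arises, the sketch below slides a reference patch over a target image and records the normalized cross-correlation at every valid offset. It is a simplified stand-in, not the DeepMatching implementation: it works on raw intensities in nested lists, only evaluates offsets where the patch fits entirely inside the image, and the function name is hypothetical.

```python
import math


def correlation_map(image, patch):
    """Normalized cross-correlation of `patch` at every valid offset in `image`.

    Both arguments are 2D lists of numbers; scores lie in [-1, 1].
    """
    ph, pw = len(patch), len(patch[0])
    ih, iw = len(image), len(image[0])
    p_vals = [v for row in patch for v in row]
    p_mean = sum(p_vals) / len(p_vals)
    p_dev = [v - p_mean for v in p_vals]
    p_norm = math.sqrt(sum(d * d for d in p_dev))

    result = []
    for y in range(ih - ph + 1):
        row = []
        for x in range(iw - pw + 1):
            w_vals = [image[y + dy][x + dx] for dy in range(ph) for dx in range(pw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            denom = p_norm * w_norm
            # Flat windows (zero variance) get a score of 0.
            score = sum(a * b for a, b in zip(p_dev, w_dev)) / denom if denom else 0.0
            row.append(score)
        result.append(row)
    return result


# Toy example: the patch is embedded in the image at offset (row 1, col 1).
image = [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]
patch = [[1, 2], [3, 4]]
cmap = correlation_map(image, patch)
best = max(
    ((y, x) for y in range(len(cmap)) for x in range(len(cmap[0]))),
    key=lambda p: cmap[p[0]][p[1]],
)
print(best)  # (1, 1)
```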
(Figure: the most discriminative response map and the least discriminative response map)
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Pyramid levels: 4x4 patches, 8x8 patches, 16x16 patches, 32x32 patches. Bottom-up extraction (BU); top-down matching (TD).
Optical Flow Deep Matching (BU)
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
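The first step of the top-down search, finding candidate local maxima in a top-level response map, might look like the sketch below. It only covers peak detection; the descent to the sub-patch responses one scale below is not shown. The function name, the 8-neighbour definition of a peak and the threshold parameter are assumptions made for illustration.

```python
def local_maxima(response, threshold=0.0):
    """Return (row, col, value) for cells strictly greater than all their neighbours."""
    h, w = len(response), len(response[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            value = response[y][x]
            if value < threshold:
                continue
            neighbours = [
                response[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(value > n for n in neighbours):
                peaks.append((y, x, value))
    # Strongest candidates first, so the search can start from the best match.
    return sorted(peaks, key=lambda p: p[2], reverse=True)


response = [
    [0.1, 0.2, 0.1],
    [0.2, 0.9, 0.3],
    [0.1, 0.3, 0.2],
]
print(local_maxima(response))  # [(1, 1, 0.9)]
```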
Optical Flow Deep Matching
(Figure: ground truth, Dense HOG [Brox & Malik 2011], and Deep Matching)
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
End-to-end supervised learning of optical flow.
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate, yet identical, processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution of data patches from the layers to combine.
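A correlation layer of this kind can be sketched as a dot product between feature patches of the two streams, evaluated over a limited set of displacements. The version below is a simplified illustration on nested lists: the function names, the zero treatment of out-of-bounds positions and the default displacement range are assumptions, and details such as strides are left out.

```python
def displacements(max_disp):
    """All (dy, dx) displacements in [-max_disp, max_disp]^2, row-major order."""
    return [
        (dy, dx)
        for dy in range(-max_disp, max_disp + 1)
        for dx in range(-max_disp, max_disp + 1)
    ]


def correlation_layer(f1, f2, max_disp=1, k=0):
    """Correlate two feature maps of shape [H][W][C] given as nested lists.

    out[y][x] holds one score per displacement: the sum of dot products between
    the (2k+1)x(2k+1) patch of f1 centred at (y, x) and the patch of f2 centred
    at (y + dy, x + dx). Out-of-bounds positions contribute zero.
    """
    h, w = len(f1), len(f1[0])

    def feat(fmap, y, x):
        return fmap[y][x] if 0 <= y < h and 0 <= x < w else None

    def patch_score(y1, x1, y2, x2):
        total = 0.0
        for oy in range(-k, k + 1):
            for ox in range(-k, k + 1):
                a = feat(f1, y1 + oy, x1 + ox)
                b = feat(f2, y2 + oy, x2 + ox)
                if a is not None and b is not None:
                    total += sum(p * q for p, q in zip(a, b))
        return total

    disps = displacements(max_disp)
    return [
        [[patch_score(y, x, y + dy, x + dx) for dy, dx in disps] for x in range(w)]
        for y in range(h)
    ]


# Toy example: a single active feature moves one pixel to the right.
f1 = [[[1.0], [0.0], [0.0]]]
f2 = [[[0.0], [1.0], [0.0]]]
scores = correlation_layer(f1, f2, max_disp=1)[0][0]
print(displacements(1)[scores.index(max(scores))])  # (0, 1)
```

The output has one channel per displacement, so the following layers can read off which shift best aligns the two feature maps at each location.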
Optical Flow FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution. Upconvolved feature maps are concatenated with the corresponding maps from the contracting part.
Optical Flow FlowNet
Data augmentation
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
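The photometric part of such an augmentation could be sketched as below. All parameter ranges and function names are hypothetical, chosen only for illustration. Geometric transformations (translation, rotation, scaling) are not covered, since they would also have to be applied to the ground-truth flow field.

```python
import random


def sample_photometric_params(rng):
    """Draw one set of augmentation parameters (ranges are illustrative assumptions)."""
    return {
        "brightness": rng.uniform(-0.1, 0.1),   # additive shift
        "contrast": rng.uniform(0.8, 1.2),      # scaling around mid-gray
        "gamma": rng.uniform(0.7, 1.5),         # exponent
        "noise_sigma": rng.uniform(0.0, 0.04),  # std. dev. of additive Gaussian noise
    }


def apply_photometric(image, params, rng):
    """Apply contrast, brightness, gamma and Gaussian noise to a 2D list in [0, 1]."""
    def transform(v):
        v = (v - 0.5) * params["contrast"] + 0.5 + params["brightness"]
        v = min(1.0, max(0.0, v)) ** params["gamma"]
        v += rng.gauss(0.0, params["noise_sigma"])
        return min(1.0, max(0.0, v))

    return [[transform(v) for v in row] for row in image]


rng = random.Random(0)
params = sample_photometric_params(rng)
frame1 = [[0.2, 0.5], [0.7, 0.9]]
frame2 = [[0.3, 0.5], [0.6, 0.9]]
# Reusing one parameter set for both frames keeps the image pair consistent.
aug1 = apply_photometric(frame1, params, rng)
aug2 = apply_photometric(frame2, params, rng)
print(aug1, aug2)
```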
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score.
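The selection step of hard negative mining can be sketched as a simple top-k ranking. The function name, the box format and the scores below are hypothetical; in practice the scores would come from the network's positive output for each negative candidate.

```python
def hard_negatives(candidates, scores, num_keep):
    """Keep the `num_keep` negative candidates with the highest positive score."""
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in ranked[:num_keep]]


# Hypothetical negative boxes (x, y, w, h) with their positive scores.
boxes = [(0, 0, 10, 10), (5, 5, 10, 10), (20, 20, 10, 10), (40, 0, 10, 10)]
scores = [0.10, 0.85, 0.40, 0.05]
print(hard_negatives(boxes, scores, 2))  # [(5, 5, 10, 10), (20, 20, 10, 10)]
```

The negatives that the current model confuses most with the target are the ones kept, so each online update concentrates on the most informative mistakes.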
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification.
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking FCNT Localization
Karpathy A Toderici G Shetty S Leung T Sukthankar R amp Fei-Fei L (2014 June) Large-scale video classification with convolutional neural networks In Computer Vision and Pattern Recognition (CVPR) 2014 IEEE Conference on (pp 1725-1732) IEEE 17
Recognition DeepVideo Results
18
Recognition
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
19
Recognition C3D
Figure Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
20Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Demo
21K Simonyan A Zisserman Very Deep Convolutional Networks for Large-Scale Image Recognition ICLR 2015
Recognition C3D Spatial dimensionSpatial dimensions (XY) of the used kernels are fixed to 3x3 following Symonian amp Zisserman (ICLR 2015)
22Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Temporal dimension3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets
Temporal depth
2D ConvNets
23Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
A homogeneous architecture with small 3 times 3 times 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
24Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
29Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Compactness
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow: Deep Matching
Optical Flow: 2D correlation
Figure labels: Image; Sub-Image; Offset of the sub-image with respect to the image [0,0].
Source: Matlab R2015b documentation for normxcorr2 by Mathworks.
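The 2D correlation illustrated above can be sketched in a few lines of plain Python. This is a minimal sketch, not MATLAB's normxcorr2: it only scores offsets where the sub-image fits entirely inside the image, and the names `ncc_map` and `best_offset` are hypothetical helpers introduced here for illustration.

```python
import math

def ncc_map(template, image):
    """Normalized cross-correlation of `template` at every offset where it
    fits entirely inside `image` (both are 2D lists of numbers)."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_zero = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(v * v for v in t_zero))

    scores = []
    for y in range(ih - th + 1):
        row_scores = []
        for x in range(iw - tw + 1):
            window = [image[y + dy][x + dx] for dy in range(th) for dx in range(tw)]
            w_mean = sum(window) / len(window)
            w_zero = [v - w_mean for v in window]
            w_norm = math.sqrt(sum(v * v for v in w_zero))
            denom = t_norm * w_norm
            dot = sum(a * b for a, b in zip(t_zero, w_zero))
            # Flat windows have zero variance; score them as 0 by convention.
            row_scores.append(dot / denom if denom > 0 else 0.0)
        scores.append(row_scores)
    return scores

def best_offset(scores):
    """(row, column) of the highest correlation score."""
    best = max((s, y, x) for y, row in enumerate(scores) for x, s in enumerate(row))
    return best[1], best[2]

image = [[1, 2, 3, 4, 5],
         [6, 7, 9, 1, 3],
         [2, 8, 4, 7, 6],
         [5, 3, 1, 2, 9]]
sub_image = [[9, 1],
             [4, 7]]          # cropped from rows 1-2, columns 2-3 of `image`
scores = ncc_map(sub_image, image)
offset = best_offset(scores)
```

The peak of the score map gives the offset of the sub-image with respect to the image, which is the quantity the slide highlights.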
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
Optical Flow: Deep Matching
Figure labels: the most discriminative response map; the least discriminative response map.
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Optical Flow: Deep Matching
Pyramid levels: 4x4 patches, 8x8 patches, 16x16 patches, 32x32 patches.
Bottom-up extraction (BU) and top-down matching (TD).
Optical Flow: Deep Matching (BU)
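One bottom-up step of the pyramid can be sketched as follows. This is a simplified illustration under stated assumptions, not the DeepFlow implementation: the four child correlation maps are assumed to be already shifted so that the same index refers to the same parent position, pooling is a 3x3 max with stride 2, aggregation is a plain average, and any further non-linearity is omitted. The function names are hypothetical.

```python
def max_pool(corr_map, size=3, stride=2):
    """Max-pool a 2D correlation map; windows are clipped at the borders."""
    h, w = len(corr_map), len(corr_map[0])
    r = size // 2
    pooled = []
    for cy in range(0, h, stride):
        row = []
        for cx in range(0, w, stride):
            row.append(max(
                corr_map[y][x]
                for y in range(max(0, cy - r), min(h, cy + r + 1))
                for x in range(max(0, cx - r), min(w, cx + r + 1))
            ))
        pooled.append(row)
    return pooled

def parent_map(child_maps):
    """Correlation map of a larger patch from the maps of its sub-patches.

    Assumption: the child maps are pre-aligned and share the same size.
    """
    pooled = [max_pool(m) for m in child_maps]
    n = len(pooled)
    h, w = len(pooled[0]), len(pooled[0][0])
    return [[sum(p[y][x] for p in pooled) / n for x in range(w)] for y in range(h)]

child = [[0, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 5, 0],
         [0, 0, 0, 2]]
level_up = parent_map([child, child, child, child])
```

The max-pooling is what lets each sub-patch move a little independently before its response is merged into the larger patch.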
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
Optical Flow: Deep Matching
Figure panels: Ground truth; Dense HOG [Brox & Malik 2011]; Deep Matching.
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow: FlowNet
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
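Stacking the two input images simply means concatenating them along the channel axis, so two RGB frames become one 6-channel input. A minimal sketch, assuming frames stored as height x width x channels nested lists; `stack_frames` is a hypothetical helper name.

```python
def stack_frames(frame_a, frame_b):
    """Concatenate two H x W x C frames along the channel axis."""
    same_height = len(frame_a) == len(frame_b)
    same_width = all(len(ra) == len(rb) for ra, rb in zip(frame_a, frame_b))
    if not (same_height and same_width):
        raise ValueError("frames must have the same height and width")
    return [
        [list(pa) + list(pb) for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]

# Two tiny 1 x 2 RGB frames become one 1 x 2 frame with 6 channels.
first = [[[1, 2, 3], [4, 5, 6]]]
second = [[[7, 8, 9], [10, 11, 12]]]
stacked = stack_frames(first, second)
```

With this option the network itself has to learn how to compare the two frames, since nothing in the input layout tells it which channels belong together.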
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: convolves data patches from one stream with data patches from the other (rather than with learned filters) to combine the two streams.
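A correlation layer of this kind can be sketched as a dot product between the feature vector at each position of the first stream and the feature vectors at nearby positions of the second stream. This is a minimal sketch under simplifying assumptions: patches are a single pixel, displacements are limited to `max_disp`, and positions outside the map score 0. The function name and parameters are hypothetical.

```python
def correlation_layer(feat_a, feat_b, max_disp=1):
    """Correlate two H x W x C feature maps over a small displacement range.

    Returns (scores, disps): scores[y][x][k] is the dot product between
    feat_a at (y, x) and feat_b at (y + dy, x + dx), with (dy, dx) = disps[k].
    """
    h, w = len(feat_a), len(feat_a[0])
    disps = [(dy, dx)
             for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    scores = []
    for y in range(h):
        row = []
        for x in range(w):
            cell = []
            for dy, dx in disps:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    cell.append(sum(a * b for a, b in zip(feat_a[y][x], feat_b[yy][xx])))
                else:
                    cell.append(0.0)  # displacement falls outside the map
            row.append(cell)
        scores.append(row)
    return scores, disps

# The second map is the first one shifted one pixel to the right.
feat_a = [[[1, 0], [0, 1], [0, 0]]]
feat_b = [[[0, 0], [1, 0], [0, 1]]]
scores, disps = correlation_layer(feat_a, feat_b)
best = disps[scores[0][0].index(max(scores[0][0]))]
```

The output has one channel per candidate displacement, so later layers can read the most likely motion of each position directly from the strongest response.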
Optical Flow: FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution. The upconvolved feature maps are concatenated with the corresponding feature maps from the contracting part.
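The expanding step described above (unpool, convolve, then concatenate with the matching contracting-part features) can be sketched as below. This is an illustrative sketch only: unpooling is nearest-neighbour 2x upsampling, the convolution is reduced to a 1x1 kernel for brevity (the real layers use learned spatial kernels), and all function names are hypothetical.

```python
def unpool2x(fmap):
    """Nearest-neighbour 2x unpooling of an H x W x C feature map."""
    out = []
    for row in fmap:
        wide = [list(px) for px in row for _ in range(2)]
        out.append(wide)
        out.append([list(px) for px in wide])
    return out

def conv1x1(fmap, weights):
    """Per-pixel linear map; `weights` has one row of C_in values per output channel."""
    return [[[sum(w * v for w, v in zip(w_row, px)) for w_row in weights]
             for px in row] for row in fmap]

def concat_channels(a, b):
    """Concatenate two feature maps of equal height and width along channels."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same height and width")
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

def expand_step(coarse, skip, weights):
    """Upconvolve `coarse` and concatenate it with the contracting-part map `skip`."""
    return concat_channels(conv1x1(unpool2x(coarse), weights), skip)

coarse = [[[1.0, 2.0]]]                        # 1 x 1 map with 2 channels
skip = [[[10.0], [11.0]], [[12.0], [13.0]]]    # 2 x 2 map with 1 channel
refined = expand_step(coarse, skip, weights=[[1.0, 1.0]])
```

The concatenation is what brings fine spatial detail from the contracting part back into the coarse, high-level prediction.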
Optical Flow: FlowNet (data augmentation)
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
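The photometric part of such an augmentation (brightness, contrast, gamma and additive Gaussian noise) can be sketched as below for a pair of grayscale frames with values in [0, 1]. The parameter ranges and the name `augment_pair` are assumptions made for illustration, not the values used by the authors. Geometric transformations are left out on purpose, because they would also require transforming the ground-truth flow.

```python
import random

def augment_pair(img_a, img_b, rng):
    """Apply the same random photometric change to both frames of a pair.

    Images are 2D lists of floats in [0, 1]. Ranges below are illustrative only.
    """
    gain = rng.uniform(0.8, 1.2)     # contrast, modelled as a multiplicative gain
    bias = rng.uniform(-0.1, 0.1)    # brightness offset
    gamma = rng.uniform(0.7, 1.5)    # gamma correction exponent
    sigma = rng.uniform(0.0, 0.04)   # standard deviation of the Gaussian noise

    def clip(v):
        return min(1.0, max(0.0, v))

    def apply(img):
        out = []
        for row in img:
            new_row = []
            for v in row:
                v = clip(gain * v + bias)
                v = v ** gamma
                v = clip(v + rng.gauss(0.0, sigma))
                new_row.append(v)
            out.append(new_row)
        return out

    return apply(img_a), apply(img_b)

frame_a = [[0.0, 0.5], [1.0, 0.25]]
frame_b = [[0.1, 0.6], [0.9, 0.3]]
aug_a, aug_b = augment_pair(frame_a, frame_b, random.Random(0))
```

Because these changes do not move any pixel, the ground-truth flow of the pair stays valid after the augmentation.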
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting negative samples with the highest positive score.
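Hard negative mining as described here amounts to ranking the negative samples by the positive score the current network gives them and keeping the top ones. A minimal sketch, where `hard_negatives` and its arguments are hypothetical names and the scores are made-up numbers:

```python
def hard_negatives(positive_scores, k):
    """Indices of the k negative samples with the highest positive score."""
    order = sorted(range(len(positive_scores)),
                   key=lambda i: positive_scores[i],
                   reverse=True)
    return order[:k]

# Positive-class scores assigned by the tracker to four negative samples.
neg_scores = [0.1, 0.9, 0.4, 0.7]
selected = hard_negatives(neg_scores, k=2)
```

The selected samples are the ones the network currently confuses with the target, so training on them is the most informative use of each online update.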
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of a VGG-16 network pre-trained for ImageNet image classification.
Figure panels: conv4-3; conv5-3.
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking: FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Figure panels: conv4-3; conv5-3.
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: "Deep Learning a l'abast de tothom" (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for Summer internship (disclaimer: offered to PhD students by default)
Company      Avg. salary/hour    Avg. salary/month
Yahoo        $43                 ($43 x 160 = $6,880)
Apple        $37                 ($37 x 160 = $5,920)
Google       $29.54-$31.32       $7,151
Facebook     $22.92              $6,150-$7,378
Microsoft    $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California. No stipends included.)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
ConvNets: Discussion
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing in the page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research: Image Classification (ChaLearn Workshop)
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
Our current research: Saliency Prediction (LSUN Challenge)
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research: Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research: Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S.'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S.'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Recognition
Figure: Tran et al., "Learning spatiotemporal features with 3D convolutional networks" (ICCV 2015).
Recognition: C3D
Recognition: C3D Demo
Recognition: C3D Spatial dimension
Spatial dimensions (XY) of the used kernels are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015).
K. Simonyan, A. Zisserman. "Very Deep Convolutional Networks for Large-Scale Image Recognition." ICLR 2015.
Recognition: C3D Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets.
Figure labels: Temporal depth; 2D ConvNets.
A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets.
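What a 3x3x3 kernel computes can be made concrete with a small sketch of a single-channel 3D convolution over a stack of frames. As in most deep learning frameworks, this is implemented as a cross-correlation (no kernel flip), with no padding and stride 1. The function name `conv3d_valid` is hypothetical, and a real C3D layer would additionally sum over input channels and add a bias.

```python
def conv3d_valid(volume, kernel):
    """Slide a kt x kh x kw kernel over a T x H x W volume (no padding, stride 1)."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                for dt in range(kt):          # temporal extent of the kernel
                    for dy in range(kh):      # spatial extent (rows)
                        for dx in range(kw):  # spatial extent (columns)
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

# Four 3x3 frames; every pixel of frame t has value t.
frames = [[[float(t)] * 3 for _ in range(3)] for t in range(4)]
ones_kernel = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
response = conv3d_valid(frames, ones_kernel)
```

The output keeps a temporal axis (two time steps here), whereas a 2D convolution applied to the same stack would collapse the temporal information after the first layer.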
No gain when varying the temporal depth across layers.
Recognition: C3D Architecture
Figure label: Feature vector.
Recognition: C3D Feature vector
The video sequence is split into 16-frame-long clips with an 8-frame-long overlap.
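The clip sampling described on this slide (16-frame clips, 8-frame overlap) can be sketched as a sliding window over frame indices. The function name and the choice to drop a trailing partial clip are assumptions made for illustration.

```python
def split_into_clips(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame index pairs, end exclusive, for overlapping clips.

    A trailing group of frames shorter than `clip_len` is dropped (an assumption).
    """
    if not 0 <= overlap < clip_len:
        raise ValueError("overlap must be non-negative and smaller than clip_len")
    stride = clip_len - overlap
    return [(start, start + clip_len)
            for start in range(0, num_frames - clip_len + 1, stride)]

clips = split_into_clips(40)
```

For a 40-frame video this yields four clips, each sharing its last 8 frames with the first 8 frames of the next one.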
Recognition: C3D Feature vector
Figure: the features of the 16-frame clips are averaged into a 4096-dim video descriptor, which is then L2-normalized into the final 4096-dim video descriptor.
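The averaging and L2 normalization of clip features can be sketched as follows. The example uses 2-dimensional toy vectors in place of the 4096-dimensional clip features, and `video_descriptor` is a hypothetical name.

```python
import math

def video_descriptor(clip_features):
    """Average per-clip feature vectors, then L2-normalize the result."""
    n = len(clip_features)
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / n for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    # An all-zero mean cannot be normalized; return it unchanged.
    return [v / norm for v in mean] if norm > 0 else mean

# Two toy clip features; their mean is [3, 4], whose L2 norm is 5.
descriptor = video_descriptor([[3.0, 0.0], [3.0, 8.0]])
```

After normalization every video descriptor has unit length, so videos with different numbers of clips or different activation scales remain comparable.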
Recognition: C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition: C3D Compactness
Recognition: C3D Performance
Convolutional 3D (C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and is comparable with state-of-the-art methods on 2 other benchmarks.
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets Learn more
Seminar series: Compacting ConvNets for End-to-End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm, D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning, Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord, UPC
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary / hour    Avg. salary / month
Yahoo        $43                   ($43 x 160 = $6,880)
Apple        $37                   ($37 x 160 = $5,920)
Google       $29.54-$31.32         $7,151
Facebook     $22.92                $6,150-$7,378
Microsoft    $22.63                $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
ConvNets Discussion
Is Computer Vision solved?
Sports Do you know them?
ConvNets Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on this page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future).
We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification
A. Salvador, M. Zeppelzauer, D. Manchon-Vizuete, A. Calafell-Orós, and X. Giró-i-Nieto, "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research
Saliency Prediction
J. Pan and X. Giró-i-Nieto, "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Sentiment Analysis
V. Campos, A. Salvador, B. Jou, and X. Giró-i-Nieto, "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Instance Search in Video
V.-T. Nguyen, D.-D. Le, A. Salvador, C. Zhu, D.-L. Nguyen, M.-T. Tran, T. Ngo Duc, D. Anh Duong, S. Satoh, and X. Giró-i-Nieto, "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, E. Mohedano, A. Salvador, Z. X. Zhang, M. Marsden, P. Wang, I. Jargalsaikhan, J. Antony, X. Giró-i-Nieto, S. Satoh, N. O'Connor, and A. F. Smeaton, "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Recognition C3D
Figure: Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Recognition C3D Demo
Recognition C3D Spatial dimension
Spatial dimensions (XY) of the kernels are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015).
K. Simonyan, A. Zisserman. "Very Deep Convolutional Networks for Large-Scale Image Recognition." ICLR 2015.
Recognition C3D Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets.
Temporal depth
2D ConvNets
A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets.
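To make the 3x3x3 kernel concrete, here is a minimal sketch in plain Python of a single-channel 3D convolution without padding (strictly a cross-correlation, as in most ConvNet libraries) over a clip indexed as time x height x width. The function name `conv3d_valid`, the toy clip and the averaging kernel are illustrative assumptions; they do not come from the C3D code.

```python
def conv3d_valid(clip, kernel):
    """Slide a 3D kernel over a clip (time x height x width), no padding."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Accumulate over the temporal AND the two spatial axes.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip: 4 frames of 5x5 pixels, all ones; averaging 3x3x3 kernel.
clip = [[[1.0] * 5 for _ in range(5)] for _ in range(4)]
kernel = [[[1.0 / 27] * 3 for _ in range(3)] for _ in range(3)]
features = conv3d_valid(clip, kernel)
shape = (len(features), len(features[0]), len(features[0][0]))
```

Unlike a 2D convolution applied frame by frame, the output keeps a temporal axis (here 4 frames shrink to 2), so later layers can still model motion.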
Recognition C3D Temporal dimension
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Feature vector
Recognition C3D Feature vector
Video sequence
16-frame clips
8-frame overlap
[Figure: the 4096-dim descriptors of the 16-frame clips are averaged and L2-normalized into a single 4096-dim video descriptor]
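The clip-level pipeline on these two slides can be sketched as follows: split the video into 16-frame clips with an 8-frame overlap, average the per-clip feature vectors, and L2-normalize the result. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and toy 4-dim vectors stand in for the 4096-dim network activations.

```python
import math


def split_into_clips(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame indices of overlapping fixed-length clips."""
    stride = clip_len - overlap
    return [(s, s + clip_len) for s in range(0, num_frames - clip_len + 1, stride)]


def video_descriptor(clip_features):
    """Average per-clip feature vectors, then L2-normalize the result."""
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / len(clip_features) for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean


clips = split_into_clips(num_frames=40)
# Stand-in for the network: toy 4-dim vectors instead of 4096-dim activations.
toy_features = [[float(start), 1.0, 0.0, 2.0] for start, _ in clips]
descriptor = video_descriptor(toy_features)
```

Frames left over at the end of the video (fewer than a full clip) are simply dropped in this sketch; how the original implementation handles them is not stated on the slides.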
Recognition C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition C3D Compactness
Recognition C3D Performance
Convolutional 3D (C3D) features combined with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with state-of-the-art methods on the other 2 benchmarks.
Recognition C3D Software
Implementation by Michael Gygli (GitHub)
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al., "Object Detection in Videos with TubeLets and Multi-Context Cues" (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow Small vs Large
Optical Flow
Classic approach: rigid matching of HoG or SIFT descriptors.
Deep Matching: allow each subpatch to move independently in a limited range depending on its size.
Optical Flow Deep Matching
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image: [0,0]
Source: Matlab R2015b documentation for normxcorr2 by MathWorks
Instead of using pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
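A correlation map for one reference patch can be sketched with normalized cross-correlation, the operation that normxcorr2 illustrates on the previous slide. This is a minimal pure-Python sketch; using raw intensities instead of patch descriptors is a simplifying assumption, and `ncc_map` and `best_offset` are hypothetical helper names.

```python
import math


def ncc_map(image, patch):
    """Normalized cross-correlation of a patch at every valid offset of an image."""
    ph, pw = len(patch), len(patch[0])
    p_vals = [v for row in patch for v in row]
    p_mean = sum(p_vals) / len(p_vals)
    p_dev = [v - p_mean for v in p_vals]
    p_norm = math.sqrt(sum(d * d for d in p_dev))
    out = []
    for y in range(len(image) - ph + 1):
        row = []
        for x in range(len(image[0]) - pw + 1):
            w_vals = [image[y + dy][x + dx] for dy in range(ph) for dx in range(pw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            denom = p_norm * w_norm
            # Flat windows have zero variance; score them as 0 to avoid dividing by 0.
            score = sum(a * b for a, b in zip(p_dev, w_dev)) / denom if denom > 0 else 0.0
            row.append(score)
        out.append(row)
    return out


def best_offset(corr):
    """Return the (row, col) offset with the highest correlation score."""
    return max(((y, x) for y in range(len(corr)) for x in range(len(corr[0]))),
               key=lambda p: corr[p[0]][p[1]])


image = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
patch = [[1, 2], [3, 9]]
corr = ncc_map(image, patch)
offset = best_offset(corr)
```

In Deep Matching one such map is computed for every small patch of the reference image, which is why the next slides aggregate them into a pyramid rather than searching each map independently.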
Optical Flow Deep Matching
The most discriminative response map
The least discriminative response map
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching (TD)
Bottom-up extraction (BU)
Optical Flow Deep Matching (BU)
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
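The first step of the top-down search, picking local maxima in a top-level response map, can be sketched as below. This is only an illustration: the 8-neighbour test, the threshold and the toy map are assumptions, and the backtracking into the sub-patch responses one scale below is not shown.

```python
def local_maxima(response, threshold=0.0):
    """Return (row, col, value) for cells strictly greater than all 8 neighbours."""
    h, w = len(response), len(response[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = response[y][x]
            if v <= threshold:
                continue
            neighbours = [response[ny][nx]
                          for ny in range(max(0, y - 1), min(h, y + 2))
                          for nx in range(max(0, x - 1), min(w, x + 2))
                          if (ny, nx) != (y, x)]
            if all(v > n for n in neighbours):
                peaks.append((y, x, v))
    # Strongest candidates first, so the search can start from them.
    return sorted(peaks, key=lambda p: -p[2])


top_level = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.1, 0.4, 0.5],
]
peaks = local_maxima(top_level, threshold=0.3)
```

Each returned position would then be traced down the pyramid, one scale at a time, to recover the shifts of the sub-patches that produced it.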
Optical Flow Deep Matching
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
End-to-end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution of data patches from the layers to combine.
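The correlation layer of Option B can be sketched as a dot product between feature vectors of the two streams over a small window of displacements. This is a minimal sketch with simplifying assumptions: a patch size of one feature vector, no stride, zero scores outside the map, and a hypothetical function name `correlation_layer`.

```python
def correlation_layer(f1, f2, max_disp=1):
    """Correlate two feature maps (H x W x C) over a window of displacements.

    Returns an H x W x D volume with D = (2 * max_disp + 1) ** 2, where each
    entry is the dot product between a feature vector of f1 and the feature
    vector of f2 displaced by (dy, dx). Out-of-bounds positions count as zero.
    """
    h, w = len(f1), len(f1[0])
    disps = [(dy, dx) for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            scores = []
            for dy, dx in disps:
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < h and 0 <= x2 < w:
                    scores.append(sum(a * b for a, b in zip(f1[y][x], f2[y2][x2])))
                else:
                    scores.append(0.0)
            row.append(scores)
        out.append(row)
    return out, disps


# Toy 3x3 maps with 2 channels: f2 is f1 shifted one pixel to the right.
zero = [0.0, 0.0]
f1 = [[zero, zero, zero], [zero, [1.0, 2.0], zero], [zero, zero, zero]]
f2 = [[zero, zero, zero], [zero, zero, [1.0, 2.0]], [zero, zero, zero]]
volume, disps = correlation_layer(f1, f2)
center = volume[1][1]
best = disps[center.index(max(center))]
```

The layer has no trainable weights: it only compares the two streams, and the layers that follow it learn to turn the resulting displacement scores into flow.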
Optical Flow FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding feature maps from the contractive part.
Optical Flow FlowNet
Data augmentation
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated… and augmented (translation, rotation and scaling transformations, additive Gaussian noise, and changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
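The photometric part of this augmentation can be sketched in a few lines. This is a minimal sketch, assuming a grayscale image stored as a 2D list of intensities in [0, 1]; the parameter ranges are illustrative guesses, not the values used to train FlowNet, and the geometric transformations are omitted because they would also require transforming the ground-truth flow.

```python
import random


def augment(image, rng):
    """Apply random brightness, contrast, gamma and additive Gaussian noise."""
    brightness = rng.uniform(-0.1, 0.1)   # assumed range
    contrast = rng.uniform(0.8, 1.2)      # assumed range
    gamma = rng.uniform(0.7, 1.5)         # assumed range
    sigma = rng.uniform(0.0, 0.04)        # assumed noise level
    out = []
    for row in image:
        new_row = []
        for v in row:
            v = (v - 0.5) * contrast + 0.5 + brightness
            v = min(1.0, max(0.0, v)) ** gamma
            v += rng.gauss(0.0, sigma)
            new_row.append(min(1.0, max(0.0, v)))
        out.append(new_row)
    return out


rng = random.Random(0)
frame = [[0.0, 0.25, 0.5], [0.5, 0.75, 1.0]]
augmented = augment(frame, rng)
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which helps when debugging a training pipeline.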
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but are replaced by a single one at test time.
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive scores.
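Hard negative mining as described above reduces to ranking the negative candidates by their positive score and keeping the top ones. This is a minimal sketch; the function name, the candidate tuples and the scores are hypothetical and only illustrate the selection step, not MDNet's training loop.

```python
def hard_negatives(neg_samples, positive_score, k):
    """Keep the k negative samples that the current model scores as most positive."""
    ranked = sorted(neg_samples, key=positive_score, reverse=True)
    return ranked[:k]


# Hypothetical candidates: (box_id, positive score from the current network).
candidates = [("bg_a", 0.05), ("bg_b", 0.80), ("bg_c", 0.40), ("bg_d", 0.95)]
mined = hard_negatives(candidates, positive_score=lambda s: s[1], k=2)
```

The selected samples are the background regions the tracker currently confuses with the target, so training on them sharpens the decision boundary more than easy negatives would.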
Object tracking FCNT
20Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Demo
21K Simonyan A Zisserman Very Deep Convolutional Networks for Large-Scale Image Recognition ICLR 2015
Recognition C3D Spatial dimensionSpatial dimensions (XY) of the used kernels are fixed to 3x3 following Symonian amp Zisserman (ICLR 2015)
22Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Temporal dimension3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets
Temporal depth
2D ConvNets
23Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
A homogeneous architecture with small 3 times 3 times 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
24Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
29Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Compactness
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow: Deep Matching
The most discriminative response map
The least discriminative response map
Optical Flow: Deep Matching
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-up extraction (BU)
Top-down matching (TD)
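The bottom-up step can be illustrated with a heavily simplified sketch: the correlation map of a parent patch is approximated by max-pooling the maps of its four child patches and averaging them after shifting each by its position inside the parent. This omits the subsampling and the non-linear rectification used in the DeepFlow paper, and it wraps around at the borders; `max_pool3`, `aggregate` and the toy maps are hypothetical names and data.

```python
import numpy as np

def max_pool3(corr):
    """3x3 max-pooling with stride 1, so each child may move by one pixel."""
    h, w = corr.shape
    padded = np.pad(corr, 1, mode="constant", constant_values=-np.inf)
    windows = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.max(windows, axis=0)

def aggregate(child_maps, child_offsets):
    """Parent map = mean of the pooled child maps, each read at its offset."""
    shifted = []
    for corr, (dy, dx) in zip(child_maps, child_offsets):
        pooled = max_pool3(corr)
        # After the roll, shifted[y, x] == pooled[y + dy, x + dx] (with wrap-around).
        shifted.append(np.roll(pooled, shift=(-dy, -dx), axis=(0, 1)))
    return np.mean(shifted, axis=0)

# Toy example: four 2x2 children of a 4x4 parent that matches at (2, 2).
offsets = [(0, 0), (0, 2), (2, 0), (2, 2)]
children = []
for dy, dx in offsets:
    m = np.zeros((8, 8))
    m[2 + dy, 2 + dx] = 1.0
    children.append(m)
parent = aggregate(children, offsets)
```

Repeating this aggregation level by level yields the 8x8, 16x16 and 32x32 maps from the 4x4 ones.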
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches.
If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
Optical Flow: Deep Matching
Figure panels: Ground truth; Dense HOG [Brox & Malik 2011]; Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., & Brox, T. (2015). FlowNet: Learning Optical Flow with Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow: FlowNet
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
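As a rough illustration of Option A, stacking can be read as concatenating the two RGB frames along the channel axis so that the network sees a single 6-channel input. The frame size and the helper name `stack_pair` below are assumptions made for the example.

```python
import numpy as np

def stack_pair(frame_a, frame_b):
    """Concatenate two HxWx3 frames along the channel axis into one HxWx6 input."""
    if frame_a.shape != frame_b.shape:
        raise ValueError("both frames must have the same shape")
    return np.concatenate([frame_a, frame_b], axis=-1)

# Placeholder frames; real inputs would be consecutive video frames.
frame_a = np.zeros((384, 512, 3), dtype=np.float32)
frame_b = np.ones((384, 512, 3), dtype=np.float32)
stacked = stack_pair(frame_a, frame_b)
```

With this layout the first convolution is free to learn how to compare the two frames on its own.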
Optical Flow: FlowNet (contracting)
Option B: Create two separate, yet identical, processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution of data patches from the layers to combine.
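A minimal NumPy sketch of the correlation idea, assuming 1x1 patches (a dot product between feature vectors) and a small displacement range. The function name `correlation_layer`, the parameter `max_disp` and the toy features are hypothetical; the FlowNet paper defines the layer more generally.

```python
import numpy as np

def correlation_layer(feat_a, feat_b, max_disp=2):
    """Correlate HxWxC maps; output is HxWxD^2 with D = 2 * max_disp + 1."""
    h, w, _ = feat_a.shape
    d = 2 * max_disp + 1
    # Zero-pad the second map so every displacement stays in bounds.
    padded = np.pad(feat_b, ((max_disp, max_disp), (max_disp, max_disp), (0, 0)))
    out = np.zeros((h, w, d * d), dtype=feat_a.dtype)
    for dy in range(d):
        for dx in range(d):
            shifted = padded[dy:dy + h, dx:dx + w, :]
            # Channel dy * d + dx holds displacement (dy - max_disp, dx - max_disp).
            out[:, :, dy * d + dx] = (feat_a * shifted).sum(axis=-1)
    return out

# Toy example: the second map is the first one moved down by one row.
rng = np.random.default_rng(0)
feat_a = rng.normal(size=(6, 6, 4))
feat_a /= np.linalg.norm(feat_a, axis=-1, keepdims=True)
feat_b = np.roll(feat_a, shift=1, axis=0)
corr = correlation_layer(feat_a, feat_b, max_disp=2)
```

Unlike a convolution with learned filters, this layer has no trainable weights: it only compares the two streams.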
Optical Flow: FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding map from the contractive part.
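The expanding step can be sketched as follows, under simplifying assumptions: nearest-neighbour unpooling stands in for the unpooling, and a 1x1 convolution (a channel-mixing matrix) stands in for the learned spatial convolution. All names, shapes and the random weights are hypothetical.

```python
import numpy as np

def unpool2x(feat):
    """Nearest-neighbour unpooling: HxWxC -> 2Hx2WxC."""
    return feat.repeat(2, axis=0).repeat(2, axis=1)

def conv1x1(feat, weights):
    """1x1 convolution: mixes channels with a C_in x C_out weight matrix."""
    return feat @ weights

def upconv_block(coarse, skip, weights):
    """Upsample the coarse map, convolve it, and concatenate the skip features."""
    up = conv1x1(unpool2x(coarse), weights)
    return np.concatenate([up, skip], axis=-1)

rng = np.random.default_rng(0)
coarse = rng.random((4, 6, 8))    # low-resolution map from the expanding part
skip = rng.random((8, 12, 3))     # matching map from the contracting part
weights = rng.random((8, 5))      # stand-in for learned filter weights
refined = upconv_block(coarse, skip, weights)
```

The concatenation keeps both the coarse, high-level information and the fine local detail from the contracting part.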
Optical Flow: FlowNet
Data augmentation
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
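A small sketch of the photometric part of such an augmentation (additive Gaussian noise, brightness, contrast and gamma). The parameter ranges are illustrative assumptions, not the values used for FlowNet, and the geometric transformations are left out because they would also have to be applied to the ground-truth flow.

```python
import numpy as np

def augment(image, rng):
    """Photometric augmentation of a float image with values in [0, 1]."""
    out = image + rng.normal(0.0, 0.02, size=image.shape)   # additive Gaussian noise
    out = out + rng.uniform(-0.1, 0.1)                      # brightness shift
    out = (out - 0.5) * rng.uniform(0.8, 1.2) + 0.5         # contrast change
    out = np.clip(out, 0.0, 1.0) ** rng.uniform(0.7, 1.5)   # gamma change
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))
augmented = augment(image, rng)
```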
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score.
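The selection rule can be sketched in a few lines, assuming each negative candidate already has a positive-class score from the current network. The function name `mine_hard_negatives` and the example scores are hypothetical.

```python
import numpy as np

def mine_hard_negatives(positive_scores, num_hard):
    """Return indices of the negatives that the classifier scores highest as positive."""
    order = np.argsort(positive_scores)[::-1]   # descending positive score
    return order[:num_hard]

# Positive-class scores of five negative candidate boxes (made-up values).
scores = np.array([0.05, 0.90, 0.30, 0.75, 0.10])
hard = mine_hard_negatives(scores, num_hard=2)
```

Only the selected samples would then be used as negatives in the next online update, so that training focuses on the distractors the tracker currently confuses with the target.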
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification.
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking: FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets: Learn more
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets: Learn more
Online course: Deep Learning. Taking machine learning to the next level
ConvNets: Learn more
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
ConvNets: Learn more
Barcelona Convolucionada: Deep Learning within everyone's reach ("Deep Learning a l'abast de tothom")
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona
ConvNets: Learn more
Summer course: Deep Learning for Computer Vision (2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
ConvNets: Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
ConvNets: Learn more
"Machine learning" subreddit
ConvNets: Learn more
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company     Avg. salary/hour   Avg. salary/month
Yahoo       $43                $6,880 ($43 x 160)
Apple       $37                $5,920 ($37 x 160)
Google      $29.54-$31.32      $7,151
Facebook    $22.92             $6,150-$7,378
Microsoft   $22.63             $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
ConvNets: Learn more
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
ConvNets: Learn more
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets: Discussion
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing in the page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future).
We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution   Last deadlines (on 28/1/2016)
FI (Catalonia)        22/09/2015
FPU (Spain)           15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research: Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research: Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Our current research: Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research: Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. 'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. 'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Recognition: C3D Spatial dimension
Spatial dimensions (XY) of the used kernels are fixed to 3x3, following Simonyan & Zisserman (ICLR 2015).
K. Simonyan, A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition", ICLR 2015.
Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Recognition: C3D Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets.
Figure labels: Temporal depth; 2D ConvNets
Recognition: C3D Temporal dimension
A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets.
Recognition: C3D Temporal dimension
No gain when varying the temporal depth across layers.
Recognition: C3D Architecture
Feature vector
Recognition: C3D Feature vector
The video sequence is split into 16-frame-long clips with an 8-frame-long overlap between consecutive clips.
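The clip layout can be made concrete with a short sketch that lists the frame ranges of the clips. Dropping trailing frames that do not fill a whole clip is an assumption of this example, and the function name `clip_ranges` is hypothetical.

```python
def clip_ranges(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame indices of overlapping clips, end exclusive."""
    stride = clip_len - overlap
    return [(start, start + clip_len)
            for start in range(0, num_frames - clip_len + 1, stride)]

# A 40-frame video yields four clips, each sharing 8 frames with the next.
ranges = clip_ranges(40)
```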
Recognition: C3D Feature vector
16-frame clips -> Average -> 4096-dim video descriptor -> L2 norm -> 4096-dim video descriptor
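A minimal sketch of this pooling step, assuming the per-clip activations are already available as rows of an array. The random features stand in for real network outputs and the function name `video_descriptor` is hypothetical.

```python
import numpy as np

def video_descriptor(clip_features):
    """Average per-clip features and L2-normalize the result."""
    mean = clip_features.mean(axis=0)
    norm = np.linalg.norm(mean)
    return mean / norm if norm > 0 else mean

# Four clips, each described by a 4096-dim activation vector (random stand-ins).
rng = np.random.default_rng(0)
clip_features = rng.random((4, 4096))
descriptor = video_descriptor(clip_features)
```

The result has unit length, which makes descriptors of videos with different numbers of clips directly comparable.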
Recognition: C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition: C3D Compactness
Recognition: C3D Performance
Convolutional 3D (C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and is comparable with state-of-the-art methods on the other 2 benchmarks.
Recognition: C3D Software
Implementation by Michael Gygli (GitHub)
Recognition: ImageNet Video
[ILSVRC 2015 Slides and videos]
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company     Avg. salary / hour    Avg. salary / month
Yahoo       $43                   $6,880 ($43 x 160)
Apple       $37                   $5,920 ($37 x 160)
Google      $29.54-$31.32         $7,151
Facebook    $22.92                $6,150-$7,378
Microsoft   $22.63                $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
ConvNets Learn more
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
ConvNets Learn more
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets Learn more
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets Discussion
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on this page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding is available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Image Classification
Our past research
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
LSUN Challenge
Sentiment Analysis
Our current research
[Slides]
CNN
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015.
Our current research
Instance Search in Video
V.-T. Nguyen, -Dinh-Le, D., Salvador, A., -Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. 'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. 'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Recognition C3D Temporal dimension
3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets
Temporal depth
2D ConvNets
A homogeneous architecture with small 3×3×3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
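To make the 3×3×3 kernel concrete, the following is a minimal pure-Python sketch of a single-channel "valid" 3D convolution (implemented as cross-correlation, as is common in ConvNet libraries) over a clip stored as nested lists indexed [time][row][col]. The function name `conv3d_valid`, the toy clip size and the averaging kernel are assumptions made for illustration; this is not the C3D implementation.

```python
def conv3d_valid(clip, kernel):
    """Cross-correlate a [T][H][W] clip with a [kt][kh][kw] kernel (no padding)."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal AND the two spatial kernel axes.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

# Toy clip (assumption): 16 frames of 8x8 pixels, all ones; averaging 3x3x3 kernel.
clip = [[[1.0] * 8 for _ in range(8)] for _ in range(16)]
kernel = [[[1.0 / 27] * 3 for _ in range(3)] for _ in range(3)]
features = conv3d_valid(clip, kernel)
out_shape = (len(features), len(features[0]), len(features[0][0]))
print(out_shape)  # (14, 6, 6)
```

The output still has a temporal axis (14 steps here), which is the property that distinguishes a 3D convolution from a 2D convolution applied to stacked frames.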
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
Recognition C3D Architecture
Feature vector
Recognition C3D Feature vector
Video sequence
16-frame clips
8-frame overlap
Recognition C3D Feature vector
[Figure: 16-frame clips -> Average -> 4096-dim video descriptor -> L2 norm -> 4096-dim video descriptor]
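The clip-to-video pipeline on these two slides can be sketched in a few lines of Python: split the sequence into 16-frame clips with an 8-frame overlap, average the per-clip feature vectors, and L2-normalize the result. The function names and the tiny 4-dimensional toy features (standing in for the 4096-dimensional ones) are assumptions for illustration; the feature extractor itself is not modelled here.

```python
import math

def split_into_clips(num_frames, clip_len=16, overlap=8):
    """Return (start, end) frame index pairs for overlapping clips."""
    stride = clip_len - overlap
    return [(s, s + clip_len) for s in range(0, num_frames - clip_len + 1, stride)]

def video_descriptor(clip_features):
    """Average per-clip feature vectors, then L2-normalize the average."""
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / len(clip_features) for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean

clips = split_into_clips(num_frames=40)
print(clips)  # [(0, 16), (8, 24), (16, 32), (24, 40)]

# Hypothetical 4-dim clip features stand in for the 4096-dim activations.
toy_features = [[3.0, 0.0, 0.0, 4.0], [3.0, 0.0, 0.0, 4.0]]
descriptor = video_descriptor(toy_features)
print(descriptor)  # [0.6, 0.0, 0.0, 0.8]
```

Trailing frames that do not fill a whole clip are simply dropped in this sketch; how the original work handles them is not stated on the slides.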
Recognition C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details
Recognition C3D Compactness
Convolutional 3D (C3D) features combined with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with state-of-the-art methods on the other 2 benchmarks
Recognition C3D Performance
Recognition C3D Software
Implementation by Michael Gygli (GitHub)
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al., "Object Detection in Videos with TubeLets and Multi-Context Cues" (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow Small vs Large
Optical Flow
Classic approach: rigid matching of HoG or SIFT descriptors
Deep Matching: allow each subpatch to move independently within a limited range that depends on its size
Optical Flow Deep Matching
Source: Matlab R2015b documentation for normxcorr2 by MathWorks
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
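The idea of locating a sub-image by 2D correlation can be illustrated with a small pure-Python sketch of normalized cross-correlation. It is only an approximation of what a routine such as normxcorr2 does: it evaluates fully overlapping positions only, and the function name `ncc_offset` and the toy data are assumptions.

```python
import math

def ncc_offset(image, template):
    """Return ((row, col), score) of the best normalized cross-correlation match."""
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_zero = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(v * v for v in t_zero))
    best_pos, best_score = None, -2.0
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            win = [image[r + i][c + j] for i in range(h) for j in range(w)]
            w_mean = sum(win) / len(win)
            w_zero = [v - w_mean for v in win]
            w_norm = math.sqrt(sum(v * v for v in w_zero))
            if t_norm == 0 or w_norm == 0:
                continue  # flat window or template: correlation undefined
            score = sum(a * b for a, b in zip(w_zero, t_zero)) / (w_norm * t_norm)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score

image = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]
offset, score = ncc_offset(image, template)
print(offset)  # (1, 1)
```

The peak of the correlation surface gives the offset of the sub-image with respect to the image, which is the quantity the slide highlights.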
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
Optical Flow Deep Matching
Optical Flow Deep Matching
The most discriminative response map
The least discriminative response map
Key idea: build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching (TD)
Bottom-up extraction (BU)
Optical Flow Deep Matching (BU)
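One level of the bottom-up step can be sketched as follows: the correlation map of a larger patch is approximated by averaging the max-pooled maps of its four sub-patches, each read at its quadrant offset. This is a simplified illustration under stated assumptions (3x3 max-pooling with stride 1, a quadrant offset `d` of one cell, hypothetical helper names); it deliberately leaves out further details of the published method, such as any subsampling between pyramid levels.

```python
def max_pool3(m):
    """3x3 max filter with stride 1 (borders use the available neighbours)."""
    H, W = len(m), len(m[0])
    return [[max(m[i][j]
                 for i in range(max(0, r - 1), min(H, r + 2))
                 for j in range(max(0, c - 1), min(W, c + 2)))
             for c in range(W)] for r in range(H)]

def aggregate(children, d):
    """Average four pooled child maps, each read at its quadrant offset.

    children maps are keyed by the quadrant sign (sy, sx), each in {-1, +1}.
    """
    pooled = {k: max_pool3(m) for k, m in children.items()}
    first = next(iter(children.values()))
    H, W = len(first), len(first[0])
    parent = [[0.0] * W for _ in range(H)]
    for r in range(H):
        for c in range(W):
            total = 0.0
            for (sy, sx), m in pooled.items():
                rr, cc = r + sy * d, c + sx * d
                if 0 <= rr < H and 0 <= cc < W:
                    total += m[rr][cc]
            parent[r][c] = total / len(pooled)
    return parent

def peak_map(size, pos):
    """Toy correlation map: zero everywhere except a single peak of 1.0."""
    m = [[0.0] * size for _ in range(size)]
    m[pos[0]][pos[1]] = 1.0
    return m

# Three sub-patches match exactly where expected around (3, 3); the fourth
# is displaced by one cell, mimicking a small non-rigid deformation.
children = {
    (-1, -1): peak_map(7, (2, 2)),
    (-1, +1): peak_map(7, (2, 4)),
    (+1, -1): peak_map(7, (4, 2)),
    (+1, +1): peak_map(7, (5, 5)),
}
parent = aggregate(children, d=1)
print(parent[3][3])  # 1.0
```

Thanks to the max-pooling, the displaced sub-patch still contributes its full response, so the larger patch keeps a perfect score at its expected position; this is the mechanism that lets each sub-patch move independently within a limited range.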
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on one local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
Optical Flow Deep Matching
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. 2015. FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
End-to-end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Option A: stack both input images together and feed them through a generic network
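Option A amounts to a channel-wise concatenation of the two frames before the first layer. A minimal sketch, assuming images stored as nested lists indexed [channel][row][col] and a hypothetical helper name `stack_channels`:

```python
def stack_channels(img_a, img_b):
    """Concatenate two [C][H][W] images along the channel axis."""
    same_height = len(img_a[0]) == len(img_b[0])
    same_width = len(img_a[0][0]) == len(img_b[0][0])
    if not (same_height and same_width):
        raise ValueError("both frames must share the same spatial size")
    return img_a + img_b

# Two toy RGB frames of 3x4 pixels (values are placeholders).
frame1 = [[[0.1] * 4 for _ in range(3)] for _ in range(3)]
frame2 = [[[0.2] * 4 for _ in range(3)] for _ in range(3)]
stacked = stack_channels(frame1, frame2)
print(len(stacked), len(stacked[0]), len(stacked[0][0]))  # 6 3 4
```

The network then sees a single 6-channel input and has to discover by itself how to relate the two frames.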
Optical Flow FlowNet (contracting)
Option B: create two separate, yet identical, processing streams for the two images and combine them at a later stage
Correlation layer: convolution between data patches of the two feature maps being combined
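A correlation layer of this kind can be sketched as a comparison of every position in the first feature map with displaced positions in the second one. The version below is a simplified illustration: patches are single pixels, the maximum displacement is 1, out-of-range positions count as zero, and the function name and toy features are assumptions rather than the FlowNet code.

```python
def correlation_layer(f1, f2, max_disp=1):
    """Correlate [C][H][W] feature maps f1 and f2 over small displacements.

    Returns (out, disps): out[d][y][x] is the dot product between the feature
    vector of f1 at (y, x) and the one of f2 at (y + dy, x + dx), where
    (dy, dx) = disps[d].
    """
    C, H, W = len(f1), len(f1[0]), len(f1[0][0])
    disps = [(dy, dx) for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = []
    for dy, dx in disps:
        plane = [[0.0] * W for _ in range(H)]
        for y in range(H):
            for x in range(W):
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < H and 0 <= x2 < W:
                    plane[y][x] = sum(f1[c][y][x] * f2[c][y2][x2] for c in range(C))
        out.append(plane)
    return out, disps

def zeros(C, H, W):
    return [[[0.0] * W for _ in range(H)] for _ in range(C)]

# A 2-channel feature at (1, 1) in the first map moves to (1, 2) in the second.
f1, f2 = zeros(2, 4, 4), zeros(2, 4, 4)
f1[0][1][1], f1[1][1][1] = 1.0, 2.0
f2[0][1][2], f2[1][1][2] = 1.0, 2.0
out, disps = correlation_layer(f1, f2)
best = max(range(len(disps)), key=lambda d: out[d][1][1])
print(disps[best])  # (0, 1)
```

The layer has no trainable weights in this sketch: the output channel with the strongest response directly indicates the displacement of the feature between the two maps.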
Optical Flow FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution
Upconvolved feature maps are concatenated with the corresponding feature maps from the contracting part
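The expanding step can be sketched as "unpool, convolve, concatenate". The code below assumes a 2x unpooling that simply repeats each value, a 3x3 zero-padded convolution with an identity kernel standing in for learned weights, and a made-up skip feature map from the contracting part; all names and values are illustrative assumptions.

```python
def unpool2x(fmap):
    """Double the resolution of an [H][W] map by repeating each value 2x2."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def conv2d_same(fmap, kernel):
    """Odd-size 2D cross-correlation with zero padding (output size = input size)."""
    H, W = len(fmap), len(fmap[0])
    k = len(kernel) // 2
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W:
                        acc += fmap[yy][xx] * kernel[dy + k][dx + k]
            out[y][x] = acc
    return out

coarse = [[1.0, 2.0], [3.0, 4.0]]                  # 2x2 map from the coarse end
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]       # stand-in for learned weights
upconv = conv2d_same(unpool2x(coarse), identity)   # 4x4 after unpooling
skip = [[0.5] * 4 for _ in range(4)]               # hypothetical contracting-part map
decoder_input = [upconv, skip]                     # channel-wise concatenation
print(len(decoder_input), len(upconv), len(upconv[0]))  # 2 4 4
```

Concatenating the skip map gives the next layer access to both the upsampled coarse information and the finer detail kept from the contracting part.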
Optical Flow FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color)
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
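The photometric part of such an augmentation can be sketched with the standard library alone. The parameter ranges, the function name and the toy frames below are assumptions chosen for illustration, not the values used to train FlowNet.

```python
import random

def photometric_augment(img, brightness, contrast, gamma, noise_sigma, rng):
    """Apply gamma, contrast, brightness and Gaussian noise to an [H][W] image in [0, 1]."""
    out = []
    for row in img:
        new_row = []
        for v in row:
            v = v ** gamma                        # gamma change
            v = (v - 0.5) * contrast + 0.5        # contrast around mid-grey
            v = v + brightness                    # additive brightness
            v = v + rng.gauss(0.0, noise_sigma)   # additive Gaussian noise
            new_row.append(min(1.0, max(0.0, v))) # clip back to [0, 1]
        out.append(new_row)
    return out

rng = random.Random(0)
# One set of (assumed) parameters is drawn per image pair and shared by both frames.
params = dict(
    brightness=rng.uniform(-0.2, 0.2),
    contrast=rng.uniform(0.8, 1.2),
    gamma=rng.uniform(0.7, 1.5),
    noise_sigma=0.02,
)
frame1 = [[0.2, 0.4], [0.6, 0.8]]
frame2 = [[0.4, 0.6], [0.8, 0.2]]
aug1 = photometric_augment(frame1, rng=rng, **params)
aug2 = photometric_augment(frame2, rng=rng, **params)
```

Because photometric changes move no pixels, the ground-truth flow of the pair stays valid. Geometric transformations (translation, rotation, scaling) would also have to be applied to the flow field, which this sketch leaves out.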
Optical Flow FlowNet
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time
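The split between shared and domain-specific layers can be pictured with a toy model: one set of shared weights, one binary (target vs. background) branch per training sequence, and a reset that keeps the shared part while attaching a single fresh branch for a new test sequence. The class below is a hypothetical sketch with random linear layers; its names, sizes and initialisation are assumptions and it does not reproduce the MDNet architecture.

```python
import random

class MultiDomainNet:
    """Shared feature weights plus one binary branch per domain (sequence)."""

    def __init__(self, in_dim, feat_dim, num_domains, seed=0):
        self._rng = random.Random(seed)
        self.shared = [[self._rng.uniform(-1, 1) for _ in range(in_dim)]
                       for _ in range(feat_dim)]
        self.branches = [self._new_branch(feat_dim) for _ in range(num_domains)]

    def _new_branch(self, feat_dim):
        # Two output units: background score and target score.
        return [[self._rng.uniform(-1, 1) for _ in range(feat_dim)] for _ in range(2)]

    def features(self, x):
        """Shared layer: linear map followed by ReLU."""
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.shared]

    def scores(self, x, domain):
        """Scores from the branch that belongs to the given domain."""
        f = self.features(x)
        return [sum(w * v for w, v in zip(row, f)) for row in self.branches[domain]]

    def reset_for_test(self):
        """Drop all training branches and attach one fresh branch."""
        self.branches = [self._new_branch(len(self.shared))]

net = MultiDomainNet(in_dim=4, feat_dim=3, num_domains=5)
shared_before = [row[:] for row in net.shared]
net.reset_for_test()
test_scores = net.scores([1.0, 0.0, 0.0, 1.0], domain=0)
print(len(net.branches), len(test_scores))  # 1 2
```

During training, each mini-batch would update the shared weights together with the branch of its own sequence only; that training loop is omitted here.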
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score
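Hard negative mining as described here reduces to ranking the negative samples by the positive score that the current model gives them and keeping the top ones. A minimal sketch, with made-up candidate boxes and scores and a hypothetical function name:

```python
def hard_negatives(neg_samples, positive_score, k):
    """Keep the k negative samples that the current model scores as most positive."""
    ranked = sorted(neg_samples, key=positive_score, reverse=True)
    return ranked[:k]

# Hypothetical background boxes (x, y, w, h) with the model's positive score.
candidates = [
    {"box": (10, 10, 40, 40), "score": 0.05},
    {"box": (12, 80, 40, 40), "score": 0.70},
    {"box": (90, 15, 40, 40), "score": 0.30},
    {"box": (60, 60, 40, 40), "score": 0.92},
]
hard = hard_negatives(candidates, positive_score=lambda s: s["score"], k=2)
print([s["score"] for s in hard])  # [0.92, 0.7]
```

The selected samples are the background regions the tracker is most likely to confuse with the target, so they are the most informative ones for the online update.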
Object tracking FCNT
Focus on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3 conv5-3
Object tracking FCNT Architecture
23Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
A homogeneous architecture with small 3 times 3 times 3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets
Recognition C3D Temporal dimension
24Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Temporal dimension
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
29Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Compactness
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
End-to-end supervised learning of optical flow.
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate, yet identical, processing streams for the two images and combine them at a later stage.
Correlation layer: convolution of data patches from the two layers to combine.
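The correlation layer can be illustrated with a minimal sketch. It assumes 1x1 patches (so each "patch" is a single feature vector), zero padding outside the map, and feature maps stored as nested lists of shape [H][W][C]. The function name `correlation_layer` and the `max_disp` parameter are hypothetical and are not taken from the FlowNet code.

```python
def correlation_layer(feat1, feat2, max_disp=1):
    """Correlate each location of feat1 with nearby locations of feat2.

    feat1, feat2: feature maps as nested lists of shape [H][W][C].
    Returns (out, shifts), where out[y][x][k] is the dot product between
    feat1[y][x] and feat2 at the location displaced by shifts[k] = (dy, dx).
    Patches are 1x1 here, which is a simplifying assumption.
    """
    height, width = len(feat1), len(feat1[0])
    shifts = [(dy, dx)
              for dy in range(-max_disp, max_disp + 1)
              for dx in range(-max_disp, max_disp + 1)]
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            scores = []
            for dy, dx in shifts:
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < height and 0 <= x2 < width:
                    # Dot product of the two feature vectors.
                    score = sum(a * b for a, b in zip(feat1[y][x], feat2[y2][x2]))
                else:
                    score = 0.0  # zero padding outside the map
                scores.append(score)
            row.append(scores)
        out.append(row)
    return out, shifts


# Toy example: one active feature that moves one pixel to the right.
feat1 = [[[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]]
feat2 = [[[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]]]
out, shifts = correlation_layer(feat1, feat2, max_disp=1)
best = shifts[max(range(len(shifts)), key=lambda k: out[0][0][k])]
print(best)
```

Unlike a regular convolution, there are no learned weights here: one stream's features act as the "filter" applied to the other stream.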
Optical Flow FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding map from the contracting part.
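A rough sketch of the expanding step, under two assumptions: unpooling is approximated by nearest-neighbour 2x upsampling, and the learned convolution that follows is omitted. The helper names `unpool2x` and `concat_channels` are hypothetical.

```python
def unpool2x(feat):
    """Nearest-neighbour 2x unpooling of a [H][W][C] feature map."""
    out = []
    for row in feat:
        wide = []
        for vec in row:
            # Repeat every feature vector twice along the width.
            wide.append(list(vec))
            wide.append(list(vec))
        # Repeat every row twice along the height.
        out.append(wide)
        out.append([list(v) for v in wide])
    return out


def concat_channels(a, b):
    """Concatenate two maps of equal spatial size along the channel axis."""
    assert len(a) == len(b) and len(a[0]) == len(b[0])
    return [[va + vb for va, vb in zip(ra, rb)] for ra, rb in zip(a, b)]


coarse = [[[1.0]]]                          # 1x1 map, 1 channel (expanding part)
skip = [[[0.1, 0.2], [0.3, 0.4]],
        [[0.5, 0.6], [0.7, 0.8]]]           # 2x2 map, 2 channels (contracting part)
merged = concat_channels(unpool2x(coarse), skip)
print(len(merged), len(merged[0]), len(merged[0][0]))
```

The concatenation lets the finer layers reuse both the coarse, high-level information and the local detail kept in the contracting part.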
Optical Flow FlowNet
Since existing ground-truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
Data augmentation
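A minimal sketch of the photometric part of such an augmentation for an image pair. The parameter ranges and the function name `augment_pair` are hypothetical, images are grayscale nested lists with values in [0, 1], and geometric transformations are left out because they would also require transforming the ground-truth flow.

```python
import random


def augment_pair(img1, img2, rng, noise_sigma=0.02):
    """Apply one shared brightness/gamma change to both images, plus noise.

    img1, img2: nested lists [H][W] of floats in [0, 1].
    rng: a random.Random instance, so that results are reproducible.
    """
    brightness = rng.uniform(-0.1, 0.1)   # additive change, hypothetical range
    gamma = rng.uniform(0.8, 1.25)        # hypothetical range

    def transform(img):
        out = []
        for row in img:
            new_row = []
            for v in row:
                # Brightness and gamma are shared by both images of the pair.
                v = min(max(v + brightness, 0.0), 1.0) ** gamma
                # Additive Gaussian noise is drawn independently per pixel.
                v += rng.gauss(0.0, noise_sigma)
                new_row.append(min(max(v, 0.0), 1.0))
            out.append(new_row)
        return out

    return transform(img1), transform(img2)


rng = random.Random(0)
aug1, aug2 = augment_pair([[0.0, 0.5], [1.0, 0.25]],
                          [[0.2, 0.4], [0.6, 0.8]], rng)
```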
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score.
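Hard negative mining can be sketched in a few lines. Here `positive_score` stands in for the network's positive-class output and the candidate names are toy data; both are assumptions for illustration, not MDNet's actual interface.

```python
def mine_hard_negatives(negatives, positive_score, num_hard):
    """Return the num_hard negatives that the current model scores highest."""
    ranked = sorted(negatives, key=positive_score, reverse=True)
    return ranked[:num_hard]


# Toy scores: candidate name -> positive score given by the current model.
scores = {"a": 0.1, "b": 0.9, "c": 0.6, "d": 0.3}
hard = mine_hard_negatives(list(scores), scores.get, num_hard=2)
print(hard)
```

Only these confusing samples are fed to the update, so the model spends its effort on the distractors it currently gets wrong.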
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of a VGG-16 network pre-trained for ImageNet image classification.
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but are not discriminative enough to distinguish different objects of the same category.
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Object detectors emerge in deep scene CNNs. ICLR 2015.
[Zhou et al., ICLR 2015] "Object detectors emerge in deep scene CNNs" [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe http://caffe.berkeleyvision.org
Torch (Overfeat) http://torch.ch
Theano http://deeplearning.net/software/theano
TensorFlow https://www.tensorflow.org
MatConvNet (VLFeat) http://www.vlfeat.org/matconvnet
CNTK (Microsoft) http://www.cntk.ai
ConvNets Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Tuesday, February 2, 4pm
D5-010, Campus Nord
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
Barcelona Convolucionada
Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for Summer internship (disclaimer: offered to PhD students by default)
Company     Avg. salary / hour   Avg. salary / month
Yahoo       $43                  ($43 x 160 = $6880)
Apple       $37                  ($37 x 160 = $5920)
Google      $29.54-$31.32        $7151
Facebook    $22.92               $6150-$7378
Microsoft   $22.63               $6506-$7171
Source: Glassdoor.com (internships in California. No stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
ConvNets Discussion
Is Computer Vision solved?
Sports Do you know them?
ConvNets Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
... and MANY MORE I am missing in the page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future).
We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution   Last deadline (as of 28/1/2016)
FI (Catalonia)        22/09/2015
FPU (Spain)           15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Our current research
Sentiment Analysis
CNN
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. 'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. 'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. Learning spatiotemporal features with 3D convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Recognition C3D Temporal dimension
No gain when varying the temporal depth across layers.
Recognition C3D Architecture
Feature vector
Recognition C3D Feature vector
Video sequence: 16-frame-long clips with an 8-frame-long overlap
[Figure: four 16-frame clips -> Average -> 4096-dim video descriptor -> L2 norm -> 4096-dim video descriptor]
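The pipeline on this slide (overlapping 16-frame clips, averaging of clip features, L2 normalization) can be sketched as follows. The per-clip features would come from the C3D network; here they are toy 2-dimensional vectors instead of 4096-dimensional ones, and the function names `clip_starts` and `video_descriptor` are hypothetical.

```python
import math


def clip_starts(num_frames, clip_len=16, overlap=8):
    """Start indices of clip_len-frame clips with the given overlap."""
    stride = clip_len - overlap
    return list(range(0, num_frames - clip_len + 1, stride))


def video_descriptor(clip_features):
    """Average the per-clip feature vectors and L2-normalize the result."""
    n = len(clip_features)
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / n for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean


starts = clip_starts(40)                                 # a 40-frame video
descriptor = video_descriptor([[3.0, 0.0], [3.0, 8.0]])  # two toy clip features
print(starts, descriptor)
```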
Recognition C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition C3D Compactness
Recognition C3D Performance
Convolutional 3D (C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and is comparable with state-of-the-art methods on 2 other benchmarks.
Recognition C3D Software
Implementation by Michael Gygli (GitHub)
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al., Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Optical Flow Small vs Large
Optical Flow
Classic approach: rigid matching of HoG or SIFT descriptors.
Deep Matching: allow each subpatch to move independently in a limited range depending on its size.
Optical Flow Deep Matching
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image: [0,0]
Source: Matlab R2015b documentation for normxcorr2, by MathWorks
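As a rough illustration of 2D correlation, the sketch below computes a normalized cross-correlation score for every offset at which the sub-image fits fully inside the image. This is a simplification: normxcorr2 also returns the partially overlapping offsets. The function name `normalized_xcorr` and the toy data are assumptions.

```python
import math


def normalized_xcorr(image, template):
    """Normalized cross-correlation at every fully overlapping offset.

    image, template: nested lists [H][W] of numbers.
    Returns scores[y][x] in [-1, 1] for the template placed at offset (y, x).
    """
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    scores = []
    for y in range(H - h + 1):
        row = []
        for x in range(W - w + 1):
            p_vals = [image[y + i][x + j] for i in range(h) for j in range(w)]
            p_mean = sum(p_vals) / len(p_vals)
            p_dev = [v - p_mean for v in p_vals]
            p_norm = math.sqrt(sum(d * d for d in p_dev))
            denom = p_norm * t_norm
            num = sum(a * b for a, b in zip(p_dev, t_dev))
            # Flat patches have zero variance; score them as 0.
            row.append(num / denom if denom > 0 else 0.0)
        scores.append(row)
    return scores


image = [[0, 0, 0, 0],
         [0, 0, 1, 2],
         [0, 0, 3, 4],
         [0, 0, 0, 0]]
template = [[1, 2],
            [3, 4]]
scores = normalized_xcorr(image, template)
best_offset = max(((y, x) for y in range(len(scores)) for x in range(len(scores[0]))),
                  key=lambda p: scores[p[0]][p[1]])
print(best_offset)
```

The offset with the highest score is where the sub-image best matches the image, which is the quantity a matching algorithm is after.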
Optical Flow Deep Matching
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image; as a result, a correlation map is generated for each reference patch.
The most discriminative response map
The least discriminative response map
Key idea: build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching (TD)
Bottom-up extraction (BU)
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
Neil Lawrence OpenAI wonrsquot benefit humanity without open data sharing (The Guardian 14122015)
Is Computer Vision solved
ConvNets Discussion
92
Sports Do you know them
93
ConvNets Do you know them
94
Antonio Torralba MIT(former UPC)
and MANY MORE I am missing in the page (apologies)
Oriol Vinyals Google(former UPC)
Jose M Aacutelvarez NICTA(former URL amp UAB)
Joan Bruna Berkeley(former UPC)
95
ConvNets Where you are studyingVisioCat dinner CVPR 2015
Considering a Phd at GPI-UPC Currently no direct funding available (check in the future)We can support your application to scholarships
External grant listings UPC UPF
Funding institution Last deadlines (on 2812016)
FI (Catalonia) 22092015
FPU (Spain) 15012016
Check our activity at httpsimatgeupceduweb 96
Image Classification
97
Our past research
A Salvador Zeppelzauer M Manchon-Vizuete D Calafell-Oroacutes A and Giroacute-i-Nieto X ldquoCultural Event Recognition with Visual ConvNets and Temporal Modelsrdquo in CVPR ChaLearn Looking at People Workshop 2015 2015 [slides]
ChaLearn Worshop
Saliency Prediction
J Pan and Giroacute-i-Nieto X ldquoEnd-to-end Convolutional Network for Saliency Predictionrdquo in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops Boston MA (USA) 2015 [Slides] 98
Our current research
LSUN Challenge
Sentiment Analysis
99
Our current research
[Slides]
CNN
V Campos Salvador A Jou B and Giroacute-i-Nieto X ldquoDiving Deep into Sentiment Understanding Fine-tuned CNNs for Visual Sentiment Predictionrdquo in 1st International Workshop on Affect and Sentiment in Multimedia Brisbane Australia 2015
Our current research
Instance Search in Video
100
V - T Nguyen -Dinh-Le D Salvador A -Zhu C Nguyen D - L Tran M - T Duc T Ngo Duong D Anh Satoh S ichi and Giroacute-i-Nieto X ldquoNII-HITACHI-UIT at TRECVID 2015 Instance Searchrdquo in TRECVID 2015 Workshop Gaithersburg MD USA 2015
K McGuinness Mohedano E Salvador A Zhang Z X Marsden M Wang P Jargalsaikhan I Antony J Giroacute-i-Nieto X Satoh S ichi OConnor N and Smeaton A F ldquoInsight DCU at TRECVID 2015rdquo in TRECVID 2015 Workshop Gaithersburg MD USA 2015
Thank you
Slides available on and
httpsimatgeupceduwebpeoplexavier-giro
httpbitsearchblogspotcom
httpstwittercomDocXavi
httpswwwfacebookcomProfessorXavi
xaviergiroupcedu
101
25Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
No gain when varying the temporal depth across layers
Recognition C3D Architecture
Featurevector
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
Recognition C3D Compactness
Recognition C3D Performance
Convolutional 3D (C3D) features combined with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with state-of-the-art methods on the other 2 benchmarks
Recognition C3D Software
Implementation by Michael Gygli (GitHub)
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size
Optical Flow Deep Matching
Optical Flow 2D correlation
Figure: an image, a sub-image, and the offset of the sub-image with respect to the image origin [0,0]
Source: Matlab R2015b documentation for normxcorr2 by Mathworks
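The idea of locating a sub-image by 2D correlation can be sketched in plain Python. This is an illustrative simplification, not MATLAB's normxcorr2: it only evaluates offsets where the template fits entirely inside the image, and the function name `normxcorr_offset` is an assumption.

```python
import math


def normxcorr_offset(image, template):
    """Return ((row, col), score) of the best normalized-correlation match."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    best, best_score = None, -2.0  # correlation scores lie in [-1, 1]
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w_vals = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            if t_norm == 0 or w_norm == 0:
                continue  # flat patch: correlation is undefined, skip it
            score = sum(a * b for a, b in zip(w_dev, t_dev)) / (w_norm * t_norm)
            if score > best_score:
                best, best_score = (r, c), score
    return best, best_score


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 4, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]
offset, score = normxcorr_offset(image, template)
print(offset)  # (2, 2)
```

The returned offset is the position of the sub-image's top-left corner measured from the image origin.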
Optical Flow Deep Matching
Instead of using pre-trained filters, a convolution is defined between each patch of the reference image and the target image; as a result, a correlation map is generated for each reference patch
Optical Flow Deep Matching
The most discriminative response map
The less discriminative response map
Optical Flow Deep Matching
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Pyramid levels: 4x4 patches, 8x8 patches, 16x16 patches, 32x32 patches
Bottom-up extraction (BU), top-down matching (TD)
Optical Flow Deep Matching (BU)
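One bottom-up step of such a pyramid can be sketched roughly as follows: the correlation maps of four sub-patches are max-pooled, subsampled, shifted toward their quadrant and averaged to give the map of the patch twice their size. This is a loose simplification for illustration only. The 3x3 pooling window, the shift convention, the zero padding and the omission of any non-linearity are assumptions, and the function names are hypothetical; it is not the DeepFlow authors' implementation.

```python
def max_pool_stride2(cmap):
    """3x3 max-pooling followed by subsampling by a factor of 2."""
    h, w = len(cmap), len(cmap[0])
    pooled = []
    for y in range(0, h, 2):
        row = []
        for x in range(0, w, 2):
            window = [cmap[yy][xx]
                      for yy in range(max(0, y - 1), min(h, y + 2))
                      for xx in range(max(0, x - 1), min(w, x + 2))]
            row.append(max(window))
        pooled.append(row)
    return pooled


def aggregate_children(children):
    """Average four pooled child maps, each shifted toward its quadrant.

    children: maps of the (top-left, top-right, bottom-left, bottom-right)
    sub-patches, all with the same size.
    """
    shifts = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # (dy, dx) per quadrant, assumed
    pooled = [max_pool_stride2(c) for c in children]
    h, w = len(pooled[0]), len(pooled[0][0])
    parent = [[0.0] * w for _ in range(h)]
    for cmap, (dy, dx) in zip(pooled, shifts):
        for y in range(h):
            for x in range(w):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:  # responses outside the map count as 0
                    parent[y][x] += cmap[yy][xx] / 4.0
    return parent


def peak_map(y, x, size=6):
    """Toy correlation map with a single response of 1.0 at (y, x)."""
    m = [[0.0] * size for _ in range(size)]
    m[y][x] = 1.0
    return m


children = [peak_map(0, 0), peak_map(0, 4), peak_map(4, 0), peak_map(4, 4)]
parent = aggregate_children(children)
print(parent[1][1])  # 1.0
```

In the toy example each sub-patch responds in its own corner, so the larger patch gets its full response at the centre, where all four shifted maps agree. The max-pooling is what lets each sub-patch move a little on its own.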
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches
If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it
Optical Flow Deep Matching
Figure: comparison of ground truth, Dense HOG [Brox & Malik 2011] and Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
End-to-end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer: combines the two streams by convolving data patches from one stream with data patches from the other, instead of with learned filters
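A correlation layer of this kind can be sketched in plain Python as a dot product between feature vectors of the two streams over a small range of displacements. This is a simplified illustration, assuming 1x1 patches, zero padding and a hypothetical function name `correlation_layer`; it is not the FlowNet implementation.

```python
def correlation_layer(f1, f2, max_disp=1):
    """Correlate two [H][W][C] feature maps over displacements up to max_disp.

    Returns (out, disps), where out[y][x][i] is the dot product between
    f1 at (y, x) and f2 at (y + dy, x + dx) for disps[i] == (dy, dx).
    """
    h, w = len(f1), len(f1[0])
    disps = [(dy, dx)
             for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            responses = []
            for dy, dx in disps:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    responses.append(sum(a * b for a, b in zip(f1[y][x], f2[yy][xx])))
                else:
                    responses.append(0.0)  # zero padding outside the map
            row.append(responses)
        out.append(row)
    return out, disps


# Toy example: a single active feature that moves one pixel to the right.
f1 = [[[0.0] for _ in range(3)] for _ in range(3)]
f2 = [[[0.0] for _ in range(3)] for _ in range(3)]
f1[1][1] = [1.0]
f2[1][2] = [1.0]
out, disps = correlation_layer(f1, f2)
best = max(range(len(disps)), key=lambda i: out[1][1][i])
print(disps[best])  # (0, 1)
```

The output has one channel per candidate displacement, so later layers can read the likely motion at each position directly from the strongest channel.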
Optical Flow FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution
Upconvolved feature maps are concatenated with the corresponding feature maps from the contracting part
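The data flow of the expanding part can be sketched as follows. The example shows only the unpooling and the concatenation with a feature map from the contracting part; the learned convolution that follows the unpooling is left out. Nearest-neighbour unpooling and the function names are assumptions made for illustration.

```python
def unpool2x(fmap):
    """Nearest-neighbour 2x unpooling of a [H][W][C] feature map."""
    out = []
    for row in fmap:
        wide = [list(px) for px in row for _ in range(2)]  # repeat each pixel horizontally
        out.append(wide)
        out.append([list(px) for px in wide])  # repeat each row vertically
    return out


def concat_channels(a, b):
    """Concatenate two [H][W][C] maps of equal spatial size along the channel axis."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        raise ValueError("spatial sizes must match")
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


coarse = [[[5.0]]]  # 1x1 map with 1 channel from the deeper layer
skip = [[[1.0, 2.0], [1.0, 2.0]],
        [[1.0, 2.0], [1.0, 2.0]]]  # 2x2 map with 2 channels from the contracting part
merged = concat_channels(unpool2x(coarse), skip)
print(merged[0][0])  # [5.0, 1.0, 2.0]
```

The concatenation gives the finer layers access to both the coarse, high-level features and the fine local detail kept in the contracting part.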
Optical Flow FlowNet
Since existing ground truth datasets are not sufficiently large to train a Convnet, a synthetic Flying Chairs dataset is generated… and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color)
Convnets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
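The photometric part of such an augmentation can be sketched in plain Python. The parameter ranges, the grayscale image format and the function name `augment_pair` are hypothetical choices for illustration, not the values used for FlowNet. Geometric transformations (translation, rotation, scaling) are left out because they would also have to be applied to the ground truth flow.

```python
import random


def augment_pair(img1, img2, rng):
    """Apply the same random brightness, contrast and gamma to both frames.

    Images are [H][W] lists of intensities in [0, 1]. Gaussian noise is
    drawn independently for every pixel.
    """
    brightness = rng.uniform(-0.1, 0.1)  # hypothetical ranges
    contrast = rng.uniform(0.8, 1.2)
    gamma = rng.uniform(0.7, 1.5)
    sigma = 0.02

    def transform(img):
        out = []
        for row in img:
            new_row = []
            for v in row:
                v = (v - 0.5) * contrast + 0.5 + brightness
                v = min(max(v, 0.0), 1.0) ** gamma
                v += rng.gauss(0.0, sigma)  # additive Gaussian noise
                new_row.append(min(max(v, 0.0), 1.0))
            out.append(new_row)
        return out

    return transform(img1), transform(img2)


rng = random.Random(0)  # fixed seed so the example is repeatable
frame1 = [[0.2, 0.4], [0.6, 0.8]]
frame2 = [[0.4, 0.2], [0.8, 0.6]]
aug1, aug2 = augment_pair(frame1, frame2, rng)
```

Sharing brightness, contrast and gamma across the two frames keeps the pair consistent, so the ground truth flow remains valid after augmentation.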
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score
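Hard negative mining as described above amounts to ranking the negative candidates by the positive score the current model gives them and keeping the top ones. A minimal sketch, with hypothetical names and toy scores:

```python
def hard_negatives(candidates, scores, k):
    """Pick the k negative candidates that the current model scores as most positive."""
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in ranked[:k]]


boxes = ["bg_a", "bg_b", "bg_c", "bg_d"]  # background windows (negatives)
pos_scores = [0.10, 0.85, 0.40, 0.70]     # positive score from the current model
print(hard_negatives(boxes, pos_scores, 2))  # ['bg_b', 'bg_d']
```

These are the negatives the tracker is most likely to confuse with the target, so training on them is more informative than training on random background windows.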
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization… but are not discriminative enough to tell apart different objects of the same category
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. “Object detectors emerge in deep scene CNNs.” ICLR 2015. [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation…
Object tracking FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Tuesday, February 2, 4pm
D5-010 Campus Nord
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning - Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l’abast de tothom [Deep Learning within everyone's reach]
Monday, February 1, 7pm, FIB Campus Nord UPC
Grup d’estudi de machine learning Barcelona
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, “Deep Learning: Past, Present & Future”
“Machine learning” sub-Reddit
Check profile requirements for Summer internship (disclaimer: offered to PhD students by default)
Company      Avg. salary / hour    Avg. salary / month
Yahoo        $43                   ($43 x 160 = $6,880)
Apple        $37                   ($37 x 160 = $5,920)
Google       $29.54-$31.32         $7,151
Facebook     $22.92                $6,150-$7,378
Microsoft    $22.63                $6,506-$7,171
Source: Glassdoor.com (internships in California. No stipends included)
Video: Cristian Canton’s talk “From Catalonia to America: notes on how to achieve a successful post-PhD career”, ACMCV 2015 & UPC
Li Fei-Fei, “How we’re teaching computers to understand pictures”, TEDTalks 2014
Jeremy Howard, “The wonderful and terrifying implications of computers that can learn”, TEDTalks 2014
Neil Lawrence, “OpenAI won’t benefit humanity without open data sharing” (The Guardian, 14/12/2015)
ConvNets Discussion
Is Computer Vision solved?
Sports Do you know them?
ConvNets Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
and MANY MORE I am missing in the page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Image Classification
Our past research
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., “Cultural Event Recognition with Visual ConvNets and Temporal Models”, in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
J. Pan and Giró-i-Nieto, X., “End-to-end Convolutional Network for Saliency Prediction”, in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
LSUN Challenge
Sentiment Analysis
Our current research
[Slides]
CNN
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., “Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction”, in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015.
Our current research
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. ’ichi, and Giró-i-Nieto, X., “NII-HITACHI-UIT at TRECVID 2015 Instance Search”, in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. ’ichi, O’Connor, N., and Smeaton, A. F., “Insight DCU at TRECVID 2015”, in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
26Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
Video sequence
16 frames-long clips
8 frames-long overlap
27Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Feature vector
16-frame clip
16-frame clip
16-frame clip
16-frame clip
Average
4096
-dim
vid
eo d
escr
ipto
r
4096
-dim
vid
eo d
escr
ipto
r
L2 norm
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
29Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Compactness
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification.
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to tell apart different objects of the same category.
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm, D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning within everyone's reach
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary/hour    Avg. salary/month
Yahoo        $43                 ($43 x 160 = $6880)
Apple        $37                 ($37 x 160 = $5920)
Google       $29.54-$31.32       $7151
Facebook     $22.92              $6150-$7378
Microsoft    $22.63              $6506-$7171
Source: Glassdoor.com (internships in California. No stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets Discussion
Is Computer Vision solved?
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on this page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding is available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification (ChaLearn Workshop)
A. Salvador, M. Zeppelzauer, D. Manchon-Vizuete, A. Calafell-Orós, and X. Giró-i-Nieto, "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
Our current research
Saliency Prediction (LSUN Challenge)
J. Pan and X. Giró-i-Nieto, "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Sentiment Analysis
V. Campos, A. Salvador, B. Jou, and X. Giró-i-Nieto, "Diving Deep into Sentiment Understanding: Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Instance Search in Video
V.-T. Nguyen, D.-D. Le, A. Salvador, C. Zhu, D.-L. Nguyen, M.-T. Tran, T. Ngo Duc, D. A. Duong, S. Satoh, and X. Giró-i-Nieto, "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, E. Mohedano, A. Salvador, Z. X. Zhang, M. Marsden, P. Wang, I. Jargalsaikhan, J. Antony, X. Giró-i-Nieto, S. Satoh, N. O'Connor, and A. F. Smeaton, "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Recognition C3D Feature vector
[Figure: several 16-frame clips -> Average -> 4096-dim video descriptor -> L2 norm -> 4096-dim video descriptor]
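The averaging and L2 normalization shown on this slide can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' code: `clip_features` is a hypothetical array standing in for the 4096-dim activations that C3D would produce for each 16-frame clip, and it is filled with random numbers here.

```python
import numpy as np

def video_descriptor(clip_features):
    """Average per-clip features and L2-normalize the result.

    clip_features: array of shape (num_clips, 4096), one row per 16-frame clip
    (hypothetical input; in practice it would come from a C3D forward pass).
    """
    clip_features = np.asarray(clip_features, dtype=np.float64)
    mean_feature = clip_features.mean(axis=0)   # average over clips
    norm = np.linalg.norm(mean_feature)         # L2 norm of the averaged vector
    if norm == 0.0:
        return mean_feature                     # avoid dividing by zero
    return mean_feature / norm

rng = np.random.default_rng(0)
clips = rng.random((4, 4096))                   # 4 clips, random stand-in features
descriptor = video_descriptor(clips)
print(descriptor.shape)
```

The resulting unit-length vector is the kind of input that a simple linear classifier could take, one vector per video regardless of the number of clips.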
Recognition C3D Visualization
Based on Deconvnets by Zeiler and Fergus [ECCV 2014]. See [ReadCV Slides] for more details.
Recognition C3D Compactness
Convolutional 3D (C3D) features combined with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with state-of-the-art methods on the other 2 benchmarks.
Recognition C3D Performance
Recognition C3D Software
Implementation by Michael Gygli (GitHub)
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al., "Object Detection in Videos with TubeLets and Multi-Context Cues" (ILSVRC 2015). [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors.
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size.
Optical Flow Deep Matching
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image: [0, 0]
Source: Matlab R2015b documentation for normxcorr2, by MathWorks
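The 2D correlation on this slide can be illustrated with a small NumPy sketch of normalized cross-correlation. It is a simplified stand-in, not a port of normxcorr2: it only evaluates positions where the sub-image fits entirely inside the image, whereas the Matlab function also returns the partially overlapping positions. The function name and the random test image are assumptions made for the example.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Slide `template` over `image` (fully overlapping positions only)
    and return the map of normalized correlation scores."""
    image = np.asarray(image, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()              # zero-mean template
    t_norm = np.sqrt((t ** 2).sum())
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            window = image[y:y + th, x:x + tw]
            w = window - window.mean()          # zero-mean window
            denom = np.sqrt((w ** 2).sum()) * t_norm
            out[y, x] = (w * t).sum() / denom if denom > 0 else 0.0
    return out

rng = np.random.default_rng(0)
image = rng.random((20, 20))
template = image[5:9, 7:11]                     # sub-image cut out at row 5, column 7
ncc = normalized_cross_correlation(image, template)
offset = np.unravel_index(np.argmax(ncc), ncc.shape)
print(offset)
```

The position of the maximum of the correlation map recovers the offset of the sub-image with respect to the image.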
Instead of using pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
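A minimal sketch of this step, in which every reference patch is used as a filter over the target image. The assumptions are loud ones: raw grayscale pixels are compared instead of the descriptors used in the paper, patches are non-overlapping 4x4 blocks, and similarity is the cosine between unit-normalized patches. The function name `patch_correlation_maps` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def patch_correlation_maps(reference, target, patch=4):
    """Return one correlation map per non-overlapping reference patch.

    Output shape: (rows, cols, H - patch + 1, W - patch + 1), where (rows, cols)
    indexes the reference patches and the last two axes index target positions.
    """
    reference = np.asarray(reference, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    # All target windows, flattened and scaled to unit length.
    windows = sliding_window_view(target, (patch, patch))
    windows = windows.reshape(windows.shape[0], windows.shape[1], -1)
    windows = windows / np.maximum(np.linalg.norm(windows, axis=-1, keepdims=True), 1e-12)
    rows, cols = reference.shape[0] // patch, reference.shape[1] // patch
    maps = np.zeros((rows, cols) + windows.shape[:2])
    for r in range(rows):
        for c in range(cols):
            kernel = reference[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch].ravel()
            kernel = kernel / max(np.linalg.norm(kernel), 1e-12)
            maps[r, c] = windows @ kernel       # the reference patch acts as the filter
    return maps

rng = np.random.default_rng(1)
reference = rng.random((16, 16))
target = np.roll(reference, shift=(2, 3), axis=(0, 1))   # target = shifted reference
maps = patch_correlation_maps(reference, target)
best = np.unravel_index(np.argmax(maps[1, 1]), maps[1, 1].shape)
print(maps.shape, best)
```

In this toy case the reference patch at rows 4-7, columns 4-7 is found again 2 rows down and 3 columns to the right, which is the shift applied to build the target.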
Optical Flow Deep Matching
The most discriminative response map
The least discriminative response map
Optical Flow Deep Matching
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
[Figure: pyramid levels with 4x4, 8x8, 16x16 and 32x32 patches; bottom-up extraction (BU) and top-down matching (TD)]
Optical Flow Deep Matching (BU)
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
Optical Flow Deep Matching
[Figure: Ground truth, Dense HOG [Brox & Malik 2011], Deep Matching]
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
End-to-end supervised learning of optical flow.
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution between data patches taken from the two layers to combine.
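The idea of the correlation layer can be sketched as follows. This is a simplified illustration, not the FlowNet implementation: it assumes 1x1 patches (a plain dot product between feature vectors), no striding, and a small maximum displacement. The names `correlation_layer` and `max_disp` are hypothetical.

```python
import numpy as np

def correlation_layer(f1, f2, max_disp=2):
    """Dot-product correlation between two feature maps of shape (C, H, W).

    Returns an array of shape ((2 * max_disp + 1) ** 2, H, W): one channel per
    displacement (dy, dx), with dy, dx in [-max_disp, max_disp].
    """
    c, h, w = f1.shape
    d = max_disp
    # Zero-pad the second map so that shifted lookups stay inside the array.
    f2_pad = np.pad(f2, ((0, 0), (d, d), (d, d)))
    out = np.zeros(((2 * d + 1) ** 2, h, w))
    k = 0
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            shifted = f2_pad[:, d + dy:d + dy + h, d + dx:d + dx + w]
            out[k] = (f1 * shifted).sum(axis=0)   # compare f1(y, x) with f2(y + dy, x + dx)
            k += 1
    return out

rng = np.random.default_rng(2)
f1 = rng.standard_normal((64, 10, 10))            # features of the first image
f2 = np.roll(f1, shift=(1, -2), axis=(1, 2))      # second stream: same features, displaced
corr = correlation_layer(f1, f2, max_disp=2)
best_channel = int(np.argmax(corr[:, 5, 5]))
dy, dx = divmod(best_channel, 5)
print(corr.shape, (dy - 2, dx - 2))
```

Each output channel answers the question "how well does this location match the other image at a given displacement", which is the information the later layers need to estimate flow.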
Optical Flow FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution. The upconvolved feature maps are concatenated with the corresponding feature map from the contracting part.
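One expanding step can be sketched in NumPy as below. The simplifications are assumptions made for brevity: unpooling is done by nearest-neighbour repetition, the convolution is a 1x1 channel mix with random weights instead of learned filters, and all names (`unpool2x`, `conv1x1`, `expand_step`) are hypothetical.

```python
import numpy as np

def unpool2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv1x1(x, weights):
    """1x1 convolution: mixes channels at every pixel. weights has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weights, x)

def expand_step(coarse, skip, weights):
    """Unpool and convolve the coarse map, then concatenate the matching contracting map."""
    up = conv1x1(unpool2x(coarse), weights)
    return np.concatenate([up, skip], axis=0)

rng = np.random.default_rng(3)
coarse = rng.standard_normal((16, 6, 8))      # deep, low-resolution features
skip = rng.standard_normal((8, 12, 16))       # earlier, higher-resolution features
weights = rng.standard_normal((4, 16))        # random stand-in for learned filters
refined = expand_step(coarse, skip, weights)
print(refined.shape)
```

The concatenation lets the expanding part combine coarse, high-level information with the finer spatial detail kept in the contracting part.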
Optical Flow FlowNet
Data augmentation
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
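The photometric part of this augmentation can be sketched as follows. The parameter ranges are hypothetical and are not the ones used in the paper. Geometric transformations (translation, rotation, scaling) are left out on purpose, because they would also require transforming the ground truth flow.

```python
import numpy as np

def photometric_augment(img1, img2, rng):
    """Apply the same random brightness, contrast and gamma change to both frames,
    then add independent Gaussian noise. Images are floats in [0, 1]."""
    brightness = rng.uniform(-0.1, 0.1)       # hypothetical ranges, not the paper's
    contrast = rng.uniform(0.8, 1.2)
    gamma = rng.uniform(0.7, 1.5)
    out = []
    for img in (img1, img2):
        aug = (img - 0.5) * contrast + 0.5 + brightness
        aug = np.clip(aug, 0.0, 1.0) ** gamma
        aug = aug + rng.normal(0.0, 0.02, size=img.shape)   # additive Gaussian noise
        out.append(np.clip(aug, 0.0, 1.0))
    return out[0], out[1]

rng = np.random.default_rng(4)
frame1 = rng.random((32, 32, 3))
frame2 = rng.random((32, 32, 3))
aug1, aug2 = photometric_augment(frame1, frame2, rng)
print(aug1.shape, float(aug1.min()), float(aug1.max()))
```

Because these changes do not move any pixel, the flow field associated with the image pair stays valid after augmentation.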
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but they are replaced by a single new layer at test time.
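The split between shared and domain-specific layers can be sketched with a toy model. Everything here is an assumption made for illustration: a single shared linear layer with ReLU, random weights, no training loop, and hypothetical names such as `MultiDomainNet` and `reset_for_test_sequence`.

```python
import numpy as np

class MultiDomainNet:
    """Shared feature layer plus one binary (target vs. background) branch per domain."""

    def __init__(self, in_dim, feat_dim, num_domains, rng):
        self.rng = rng
        self.shared = rng.standard_normal((feat_dim, in_dim)) * 0.1
        self.branches = [self._new_branch(feat_dim) for _ in range(num_domains)]

    def _new_branch(self, feat_dim):
        return self.rng.standard_normal((2, feat_dim)) * 0.1

    def features(self, x):
        return np.maximum(self.shared @ x, 0.0)            # shared layer + ReLU

    def scores(self, x, domain):
        return self.branches[domain] @ self.features(x)    # [background, target] scores

    def reset_for_test_sequence(self):
        """Drop all training branches and keep a single, freshly initialised one."""
        self.branches = [self._new_branch(self.shared.shape[0])]

rng = np.random.default_rng(5)
net = MultiDomainNet(in_dim=32, feat_dim=16, num_domains=3, rng=rng)
x = rng.standard_normal(32)
train_scores = [net.scores(x, k) for k in range(3)]   # one branch per training sequence
net.reset_for_test_sequence()
test_scores = net.scores(x, 0)
print(len(net.branches), test_scores.shape)
```

The shared weights carry over from training to tracking, while the final branch starts fresh for each new test sequence.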
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive scores.
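The selection rule can be written in a few lines. This is a sketch of the selection step only, with made-up scores; the function name `hard_negatives` and the value of `k` are hypothetical, and the network update that would follow is not shown.

```python
import numpy as np

def hard_negatives(positive_scores, is_negative, k):
    """Return the indices of the k negative samples with the highest positive score."""
    positive_scores = np.asarray(positive_scores, dtype=np.float64)
    neg_idx = np.flatnonzero(is_negative)                 # indices of negative samples
    order = np.argsort(-positive_scores[neg_idx], kind="stable")
    return neg_idx[order[:k]]

scores = np.array([0.9, 0.2, 0.8, 0.1, 0.6, 0.95])        # classifier's "target" score
is_negative = np.array([False, True, True, True, True, False])
selected = hard_negatives(scores, is_negative, k=2)
print(selected)
```

The selected samples are the background windows that the current model most confuses with the target, which makes them the most informative negatives for the next update.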
Object tracking FCNT
28Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D VisualizationBased on Deconvnets by Zeiler and Fergus [ECCV 2014] - See [ReadCV Slides] for more details
29Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D Compactness
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow: FlowNet
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: Patches of the feature maps from one stream are convolved with patches from the other stream, so the layer has no trainable filters.
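The correlation layer can be illustrated with a minimal NumPy sketch. It assumes single-pixel patches, zero padding outside the image and a small displacement range; the function name `correlation_layer` and the parameter `max_disp` are hypothetical and are not taken from the FlowNet code.

```python
import numpy as np

def correlation_layer(f1, f2, max_disp=1):
    """Correlate two feature maps of shape (C, H, W).

    Output shape is ((2*max_disp+1)**2, H, W): one channel per displacement
    (dy, dx), holding the dot product between the feature vector of f1 at
    (y, x) and the feature vector of f2 at (y+dy, x+dx).
    Assumption of this sketch: 1x1 patches and zeros outside the image.
    """
    _, h, w = f1.shape
    d = max_disp
    f2_pad = np.pad(f2, ((0, 0), (d, d), (d, d)))  # zero padding
    out = np.empty(((2 * d + 1) ** 2, h, w), dtype=f1.dtype)
    k = 0
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            # f2 sampled at (y+dy, x+dx) for every (y, x)
            shifted = f2_pad[:, d + dy:d + dy + h, d + dx:d + dx + w]
            out[k] = (f1 * shifted).sum(axis=0)
            k += 1
    return out

rng = np.random.default_rng(0)
f1 = rng.standard_normal((3, 4, 5))
f2 = np.roll(f1, 1, axis=2)  # same content moved one pixel to the right
corr = correlation_layer(f1, f2, max_disp=1)
```

No weights are learned here: the two feature maps play the role of "data" and "filter" for each other, which is the point made on the slide.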
Optical Flow: FlowNet (expanding)
Upconvolutional layers: unpooling of the feature maps followed by a convolution. The upconvolved feature maps are concatenated with the corresponding feature maps from the contracting part.
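A minimal NumPy sketch of one expanding step, under stated assumptions: nearest-neighbour 2x unpooling, a 3x3 convolution with zero padding, and random illustrative weights. The helper names (`unpool2x`, `conv3x3`, `upconv_block`) are hypothetical.

```python
import numpy as np

def unpool2x(x):
    """Nearest-neighbour 2x unpooling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv3x3(x, weights):
    """3x3 'same' convolution; x: (C_in, H, W), weights: (C_out, C_in, 3, 3)."""
    _, h, w = x.shape
    x_pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weights.shape[0], h, w))
    for i in range(3):
        for j in range(3):
            # contribution of kernel tap (i, j) to every output pixel
            out += np.einsum("oc,chw->ohw", weights[:, :, i, j],
                             x_pad[:, i:i + h, j:j + w])
    return out

def upconv_block(coarse, skip, weights):
    """Unpool + convolve the coarse map, then concatenate the skip features."""
    upsampled = conv3x3(unpool2x(coarse), weights)
    return np.concatenate([upsampled, skip], axis=0)

rng = np.random.default_rng(0)
coarse = rng.standard_normal((4, 3, 3))   # coarse map from the expanding part
skip = rng.standard_normal((2, 6, 6))     # matching map from the contracting part
weights = rng.standard_normal((5, 4, 3, 3)) * 0.1
merged = upconv_block(coarse, skip, weights)
```

The concatenation is what lets the expanding part reuse fine spatial detail from the contracting part while refining the coarse prediction.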
Optical Flow: FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
Data augmentation
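The photometric part of such an augmentation could be sketched as follows. This is only an illustration: the parameter ranges are made up, not the values used by the FlowNet authors, and the geometric transformations are left out because they would also require transforming the ground-truth flow.

```python
import numpy as np

def photometric_augment(img1, img2, rng):
    """Randomly change brightness, contrast and gamma of a frame pair and add
    Gaussian noise. Images are float arrays in [0, 1].

    Assumptions of this sketch: the same brightness/contrast/gamma is applied
    to both frames, noise is drawn independently per frame, and all ranges
    below are illustrative.
    """
    brightness = rng.uniform(-0.2, 0.2)
    contrast = rng.uniform(0.8, 1.2)
    gamma = rng.uniform(0.7, 1.5)
    augmented = []
    for img in (img1, img2):
        x = (img - 0.5) * contrast + 0.5 + brightness
        x = np.clip(x, 0.0, 1.0) ** gamma
        x = x + rng.normal(0.0, 0.02, size=img.shape)  # additive Gaussian noise
        augmented.append(np.clip(x, 0.0, 1.0))
    return augmented[0], augmented[1]

rng = np.random.default_rng(0)
frame1 = rng.random((8, 8, 3))
frame2 = rng.random((8, 8, 3))
aug1, aug2 = photometric_augment(frame1, frame2, rng)
```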
Optical Flow: FlowNet
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
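The split between shared and domain-specific layers can be sketched with a toy NumPy model. Everything here is an assumption made for illustration (a single shared fully connected layer, two-way target/background outputs, the class and method names); it is not the MDNet implementation.

```python
import numpy as np

class MultiDomainNet:
    """Toy model: shared layer + one binary branch per training sequence."""

    def __init__(self, in_dim, hidden_dim, num_domains, rng):
        self.rng = rng
        self.hidden_dim = hidden_dim
        self.shared = rng.standard_normal((in_dim, hidden_dim)) * 0.01
        self.branches = [self._new_branch() for _ in range(num_domains)]

    def _new_branch(self):
        # 2 outputs: target vs. background score
        return self.rng.standard_normal((self.hidden_dim, 2)) * 0.01

    def forward(self, x, domain):
        h = np.maximum(x @ self.shared, 0.0)   # shared layers (ReLU)
        return h @ self.branches[domain]       # domain-specific layer

    def reset_for_test_sequence(self):
        # Keep the shared layers, replace all branches by a single new one.
        self.branches = [self._new_branch()]

rng = np.random.default_rng(0)
net = MultiDomainNet(in_dim=8, hidden_dim=4, num_domains=3, rng=rng)
x = rng.standard_normal((5, 8))
train_scores = net.forward(x, domain=2)
shared_before = net.shared.copy()
net.reset_for_test_sequence()
test_scores = net.forward(x, domain=0)
```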
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score.
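Hard negative mining as described above reduces to a top-k selection over the positive scores that the current model assigns to negative samples. A minimal sketch, with a hypothetical function name and made-up scores:

```python
import numpy as np

def mine_hard_negatives(positive_scores, num_hard):
    """Return the indices of the negative samples with the highest positive score."""
    order = np.argsort(positive_scores)[::-1]  # descending by score
    return order[:num_hard]

# Positive (target) scores predicted for five negative samples.
scores = np.array([0.1, 0.9, 0.3, 0.7, 0.05])
hard = mine_hard_negatives(scores, num_hard=2)
```

The selected samples are the ones the tracker currently confuses with the target, so they are the most informative ones for the next update.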
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of a VGG-16 network pre-trained for ImageNet image classification.
Figure panels: conv4-3, conv5-3
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking: FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Figure panels: conv4-3, conv5-3
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets: Learn more
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers. Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom. Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona
Summer course: Deep Learning for Computer Vision (2.5 ECTS for MSc & PhD). July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for Summer internship (disclaimer: offered to PhD students by default)
Company     Avg. salary / hour    Avg. salary / month
Yahoo       $43                   ($43 x 160 = $6880)
Apple       $37                   ($37 x 160 = $5920)
Google      $29.54-$31.32         $7151
Facebook    $22.92                $6150-$7378
Microsoft   $22.63                $6506-$7171
Source: Glassdoor.com (internships in California; no stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets: Discussion
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on the page (apologies)
ConvNets: Where you are studying. VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Image Classification
Our past research
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
LSUN Challenge
Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Instance Search in Video
V - T Nguyen -Dinh-Le D Salvador A -Zhu C Nguyen D - L Tran M - T Duc T Ngo Duong D Anh Satoh S ichi and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015.
Recognition: C3D Compactness
Convolutional 3D (C3D) features combined with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with state-of-the-art methods on 2 other benchmarks.
Recognition: C3D Performance
Recognition: C3D Software
Implementation by Michael Gygli (GitHub)
Recognition: ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al., "Object Detection in Videos with TubeLets and Multi-Context Cues" (ILSVRC 2015) [video] [poster]
Optical Flow
Optical Flow: Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors.
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size.
Optical Flow: Deep Matching
Optical Flow: 2D correlation
Source: Matlab R2015b documentation for normxcorr2 by MathWorks.
Figure panels: Image, Sub-Image
Offset of the sub-image with respect to the image: [0, 0]
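The idea of locating a sub-image through 2D correlation can be sketched in NumPy as below. This is a brute-force normalized cross-correlation evaluated only where the template fits entirely inside the image, so it is not a reimplementation of MATLAB's normxcorr2 (which also returns the partially overlapping positions); the function name is hypothetical.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Normalized cross-correlation map over all fully overlapping positions."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            window = image[y:y + th, x:x + tw]
            w = window - window.mean()
            denom = np.sqrt((w ** 2).sum()) * t_norm
            out[y, x] = (w * t).sum() / denom if denom > 0 else 0.0
    return out

rng = np.random.default_rng(0)
image = rng.random((12, 16))
template = image[3:8, 5:11]          # sub-image cut out at row 3, column 5
ncc = normalized_cross_correlation(image, template)
offset = np.unravel_index(np.argmax(ncc), ncc.shape)
```

The position of the maximum of the correlation map gives the offset of the sub-image with respect to the image.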
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
Optical Flow: Deep Matching
The most discriminative response map
The least discriminative response map
Optical Flow: Deep Matching
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Pyramid levels: 4x4 patches, 8x8 patches, 16x16 patches, 32x32 patches
Bottom-up extraction (BU)
Top-down matching (TD)
Optical Flow: Deep Matching (BU)
Optical Flow: Deep Matching (TD)
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
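One level of the bottom-up aggregation and of the top-down backtracking can be sketched as follows. This is a simplified illustration, not the DeepFlow implementation: it assumes each child patch may move by at most one pixel, it skips the subsampling and non-linear rectification between levels, and all function names are hypothetical.

```python
import numpy as np

def max_pool3x3(c):
    """Stride-1 3x3 max pooling that keeps the map size."""
    h, w = c.shape
    padded = np.pad(c, 1, constant_values=-np.inf)
    shifts = [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.max(shifts, axis=0)

def aggregate(children, n):
    """Bottom-up: average the max-pooled, shifted maps of the four n x n
    children (top-left, top-right, bottom-left, bottom-right quadrants)."""
    offsets = [(0, 0), (0, n), (n, 0), (n, n)]
    h, w = children[0].shape
    pooled = [max_pool3x3(c) for c in children]
    parent = np.mean([m[oy:oy + h - n, ox:ox + w - n]
                      for m, (oy, ox) in zip(pooled, offsets)], axis=0)
    return parent, offsets

def backtrack(children, offsets, y, x):
    """Top-down: for a parent match at (y, x), find where each child matched."""
    h, w = children[0].shape
    matches = []
    for c, (oy, ox) in zip(children, offsets):
        cy, cx = y + oy, x + ox
        y0, y1 = max(cy - 1, 0), min(cy + 2, h)
        x0, x1 = max(cx - 1, 0), min(cx + 2, w)
        window = c[y0:y1, x0:x1]
        dy, dx = np.unravel_index(np.argmax(window), window.shape)
        matches.append((y0 + int(dy), x0 + int(dx)))
    return matches

# Four 10x10 child correlation maps with one peak each; two children are
# displaced by one pixel from the rigid layout of an 8x8 parent at (2, 3).
n = 4
peaks = [(2, 3), (2, 8), (6, 3), (7, 7)]
children = []
for py, px in peaks:
    c = np.zeros((10, 10))
    c[py, px] = 1.0
    children.append(c)

parent, offsets = aggregate(children, n)
best = np.unravel_index(np.argmax(parent), parent.shape)
matches = backtrack(children, offsets, int(best[0]), int(best[1]))
```

The max pooling is what gives each sub-patch its limited freedom to move, and the backtracking recovers the individual shifts that produced the parent's response.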
Optical Flow: Deep Matching
Figure panels: Ground truth, Dense HOG [Brox & Malik 2011], Deep Matching
30Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Convolutional 3D(C3D) combined with a simple linear classifier outperforms state-of-the-art methods on 4 different benchmarks and are comparable with state of the art methods on other 2 benchmarks
Recognition C3D Performance
31Tran Du Lubomir Bourdev Rob Fergus Lorenzo Torresani and Manohar Paluri Learning spatiotemporal features with 3D convolutional networks In Proceedings of the IEEE International Conference on Computer Vision pp 4489-4497 2015
Recognition C3D SoftwareImplementation by Michael Gygli (GitHub)
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches.
If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
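The bottom-up and top-down passes can be sketched on a one-dimensional toy. This is only an assumption-laden simplification, not the authors' implementation: it uses two children per parent instead of four quadrants, works along a single image row, and leaves out any rectification step. The names `window`, `aggregate` and `backtrack`, and the response values, are hypothetical.

```python
def window(n, center, radius):
    """Clip the index range [center - radius, center + radius] to [0, n)."""
    return max(0, center - radius), min(n, center + radius + 1)

def aggregate(child_maps, child_offsets, radius=1):
    """Bottom-up: the parent score at p is the mean, over children, of the best
    child score found within `radius` of the child's nominal position."""
    n = len(child_maps[0])
    parent = []
    for p in range(n):
        total = 0.0
        for cmap, off in zip(child_maps, child_offsets):
            lo, hi = window(n, p + off, radius)
            if lo < hi:  # an empty window contributes nothing
                total += max(cmap[lo:hi])
        parent.append(total / len(child_maps))
    return parent

def backtrack(child_maps, child_offsets, p, radius=1):
    """Top-down: for parent position p, recover where each child matched best."""
    n = len(child_maps[0])
    positions = []
    for cmap, off in zip(child_maps, child_offsets):
        lo, hi = window(n, p + off, radius)
        positions.append(max(range(lo, hi), key=lambda i: cmap[i]) if lo < hi else None)
    return positions

# Hypothetical response maps of two sub-patches along one image row.
left_child = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
right_child = [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0]
children, offsets = [left_child, right_child], [-1, +1]

parent = aggregate(children, offsets)
best_p = max(range(len(parent)), key=lambda i: parent[i])
print(best_p, backtrack(children, offsets, best_p))  # 3 [1, 5]
```

The parent peaks at position 3, and backtracking shows that each child matched one pixel away from its nominal place (2 and 4), so the sub-patches moved independently within their allowed range.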
Optical Flow: Deep Matching
Figure panels: Ground truth; Dense HOG [Brox & Malik 2011]; Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., & Brox, T. (2015). FlowNet: Learning Optical Flow with Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow: FlowNet
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
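A minimal sketch of Option A, assuming channel-first images stored as nested lists: stacking simply concatenates the two frames along the channel axis, so two RGB frames become one 6-channel input. The helper names `stack_images` and `make_image` are made up for the example.

```python
def stack_images(img_a, img_b):
    """Concatenate two channel-first images (lists of HxW channels) along the channel axis."""
    same_height = len(img_a[0]) == len(img_b[0])
    same_width = len(img_a[0][0]) == len(img_b[0][0])
    if not (same_height and same_width):
        raise ValueError("both images must share the same height and width")
    return img_a + img_b

def make_image(channels, height, width, value):
    """Build a constant image; a stand-in for a real video frame."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]

frame_t = make_image(3, 4, 5, 0.1)    # frame at time t
frame_t1 = make_image(3, 4, 5, 0.2)   # frame at time t+1
net_input = stack_images(frame_t, frame_t1)
print(len(net_input), len(net_input[0]), len(net_input[0][0]))  # 6 4 5
```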
Option B: Create two separate, yet identical, processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution between data patches of the two streams to combine (data is convolved with data rather than with a learned filter).
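The correlation layer can be sketched as follows, under simplifying assumptions: patches are reduced to a single location (1x1), feature maps are nested lists indexed as [row][column][channel], and displacements falling outside the map score zero. The function name `correlation_layer` and the parameter `max_disp` are hypothetical.

```python
def correlation_layer(f1, f2, max_disp=1):
    """For each location in f1, dot its feature vector with the vectors of f2
    at every displacement within max_disp; no weights are learned."""
    h, w = len(f1), len(f1[0])
    disps = [(dy, dx)
             for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = [[[0.0] * len(disps) for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for k, (dy, dx) in enumerate(disps):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    out[y][x][k] = sum(a * b for a, b in zip(f1[y][x], f2[yy][xx]))
    return out, disps

# Toy feature maps: the same feature sits one pixel further right in f2.
h, w = 3, 4
f1 = [[[0.0, 0.0] for _ in range(w)] for _ in range(h)]
f2 = [[[0.0, 0.0] for _ in range(w)] for _ in range(h)]
f1[1][1] = [1.0, 0.0]
f2[1][2] = [1.0, 0.0]

corr, disps = correlation_layer(f1, f2, max_disp=1)
best = disps[max(range(len(disps)), key=lambda k: corr[1][1][k])]
print(best)  # (0, 1)
```

The output has one channel per candidate displacement, so the later layers see, for every location, how well it matches its neighbourhood in the other stream.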
Optical Flow: FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding map from the contracting part.
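A hedged sketch of the expanding step: the coarse feature maps are unpooled to twice their resolution and concatenated with the skip features from the contracting part. Nearest-neighbour replication is an assumption made here for the unpooling, and the learned convolution that follows it is omitted. The names `unpool2x` and `expand_and_concat` are invented for the example.

```python
def unpool2x(channel):
    """Double the resolution of one HxW channel by repeating each value in a 2x2 block."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def expand_and_concat(coarse, skip):
    """Unpool every coarse channel, then append the skip channels (channel-first)."""
    up = [unpool2x(ch) for ch in coarse]
    if len(up[0]) != len(skip[0]) or len(up[0][0]) != len(skip[0][0]):
        raise ValueError("skip features must match the unpooled resolution")
    return up + skip

coarse = [[[1, 2], [3, 4]]]                                   # 1 channel, 2x2
skip = [[[0] * 4 for _ in range(4)] for _ in range(2)]        # 2 channels, 4x4
merged = expand_and_concat(coarse, skip)
print(len(merged), merged[0][0], merged[0][2])  # 3 [1, 1, 2, 2] [3, 3, 4, 4]
```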
Optical Flow: FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI.
Data augmentation
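As an illustration of the photometric part of the augmentation listed above, the sketch below applies one randomly drawn brightness shift and gamma change to both frames of a pair, plus per-pixel Gaussian noise. The parameter ranges are arbitrary values chosen for the example, not the ones used for FlowNet, and the geometric transformations (which would also require transforming the ground truth flow) are not covered. `augment_pair` is a hypothetical helper.

```python
import random

def augment_pair(img_a, img_b, rng):
    """Apply the same brightness and gamma change to both grayscale frames,
    then add independent Gaussian noise; pixel values stay in [0, 1]."""
    brightness = rng.uniform(-0.2, 0.2)   # illustrative range
    gamma = rng.uniform(0.7, 1.5)         # illustrative range

    def apply(img):
        out = []
        for row in img:
            new_row = []
            for v in row:
                v = min(1.0, max(0.0, v + brightness))
                v = v ** gamma
                v += rng.gauss(0.0, 0.02)
                new_row.append(min(1.0, max(0.0, v)))
            out.append(new_row)
        return out

    return apply(img_a), apply(img_b)

rng = random.Random(0)
frame_a = [[0.2, 0.4], [0.6, 0.8]]
frame_b = [[0.25, 0.45], [0.55, 0.75]]
aug_a, aug_b = augment_pair(frame_a, frame_b, rng)
print(len(aug_a), len(aug_a[0]))  # 2 2
```

Sharing the brightness and gamma draw between the two frames keeps their appearance consistent, so the flow between them is unchanged by this kind of augmentation.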
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but are replaced by a single one at test time.
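The multi-domain layout can be sketched with a toy class: shared weights are kept, each training sequence (domain) gets its own binary branch, and at test time the branches are replaced by one new branch. This is an illustrative stand-in with random, untrained weights, not the MDNet architecture itself; `MultiDomainNet` and `start_tracking` are hypothetical names.

```python
import random

class MultiDomainNet:
    """Toy network: one shared layer plus one binary branch per domain."""

    def __init__(self, num_domains, feat_dim, seed=0):
        self.rng = random.Random(seed)
        self.feat_dim = feat_dim
        # Stand-in for the shared layers: one weight per input feature.
        self.shared = [self.rng.uniform(-1, 1) for _ in range(feat_dim)]
        # One domain-specific branch per training sequence.
        self.branches = [self._new_branch() for _ in range(num_domains)]

    def _new_branch(self):
        # Two rows of weights: scores for (target, background).
        return [[self.rng.uniform(-1, 1) for _ in range(self.feat_dim)]
                for _ in range(2)]

    def forward(self, x, domain):
        hidden = [max(0.0, w * v) for w, v in zip(self.shared, x)]  # ReLU
        return [sum(w * h for w, h in zip(row, hidden))
                for row in self.branches[domain]]

    def start_tracking(self):
        """At test time: keep the shared layers, use a single new branch."""
        self.branches = [self._new_branch()]

net = MultiDomainNet(num_domains=3, feat_dim=4)
train_scores = net.forward([0.5, -0.2, 0.1, 0.9], domain=2)
shared_before = list(net.shared)
net.start_tracking()
test_scores = net.forward([0.5, -0.2, 0.1, 0.9], domain=0)
print(len(net.branches), net.shared == shared_before)  # 1 True
```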
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score.
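Hard negative mining as described above amounts to ranking the negative samples by their positive score and keeping the top ones. A minimal sketch, where the sample ids, the scores and the helper name `hard_negatives` are all made up for illustration:

```python
def hard_negatives(candidates, positive_score, k):
    """Return the k negative samples that the model scores as most target-like."""
    return sorted(candidates, key=positive_score, reverse=True)[:k]

# Hypothetical negatives: (box id, positive score given by the current model).
negatives = [("n1", 0.05), ("n2", 0.80), ("n3", 0.30), ("n4", 0.65)]
hard = hard_negatives(negatives, positive_score=lambda s: s[1], k=2)
print([name for name, _ in hard])  # ['n2', 'n4']
```

These are the negatives the current model is most likely to confuse with the target, which is why they are the most informative ones for the online update.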
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127, 2015. [code]
Focus on conv4-3 and conv5-3 of a VGG-16 network pre-trained for ImageNet image classification.
Figure panels: conv4-3; conv5-3
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to tell apart different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Object detectors emerge in deep scene CNNs. ICLR 2015.
[Zhou et al., ICLR 2015] "Object detectors emerge in deep scene CNNs" [Slides from ReadCV]
Object tracking: FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Figure panels: conv4-3; conv5-3
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona (machine learning study group)
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for Summer internship (disclaimer: offered to PhD students by default)
Company     Avg. salary/hour    Avg. salary/month
Yahoo       $43                 ($43 x 160 = $6,880)
Apple       $37                 ($37 x 160 = $5,920)
Google      $29.54-$31.32       $7,151
Facebook    $22.92              $6,150-$7,378
Microsoft   $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets: Discussion
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing in the page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future).
We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Our current research
Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research
Instance Search in Video
V.-T. Nguyen, Le, D.-D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. 'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. 'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. Learning spatiotemporal features with 3D convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497, 2015.
Recognition: C3D
Software: Implementation by Michael Gygli (GitHub)
Recognition: ImageNet Video
[ILSVRC 2015 Slides and videos]
Kai Kang et al., Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Optical Flow: Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors.
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size.
Optical Flow: Deep Matching
Optical Flow: 2D correlation
Figure panels: Image; Sub-Image; offset of the sub-image with respect to the image [0, 0]
Source: Matlab R2015b documentation for normxcorr2 by MathWorks
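To make the 2D correlation slide concrete, here is a small pure-Python sketch of zero-mean normalized cross-correlation between a sub-image (template) and an image. It is not MATLAB's `normxcorr2`: as a simplifying assumption it only evaluates offsets where the template fits entirely inside the image, and it returns 0 for flat windows. The function name `normxcorr` and the toy arrays are hypothetical.

```python
import math

def normxcorr(template, image):
    """Zero-mean normalized cross-correlation at every fully overlapping offset."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    out = []
    for dy in range(ih - th + 1):
        row = []
        for dx in range(iw - tw + 1):
            w_vals = [image[dy + y][dx + x] for y in range(th) for x in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            denom = t_norm * w_norm
            score = sum(a * b for a, b in zip(t_dev, w_dev)) / denom if denom > 0 else 0.0
            row.append(score)
        out.append(row)
    return out

# Toy image containing the sub-image at row 1, column 1.
image = [
    [5, 5, 5, 5],
    [5, 1, 2, 5],
    [5, 3, 4, 5],
]
sub_image = [[1, 2], [3, 4]]

ncc = normxcorr(sub_image, image)
peak = max((v, (r, c)) for r, row in enumerate(ncc) for c, v in enumerate(row))
print(peak[1])  # (1, 1)
```

The peak of the map gives the offset of the sub-image with respect to the image, and the normalization keeps scores comparable across bright and dark regions.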
32
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
33
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image.
As a result, a correlation map is generated for each reference patch.
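A minimal sketch of how one such correlation map could be computed. It assumes grayscale images stored as nested lists and uses a plain (un-normalized) dot product as the similarity, whereas normxcorr2 normalizes the scores and Deep Matching correlates descriptors rather than raw pixels. The names correlation_map and argmax2d are hypothetical.

```python
def correlation_map(patch, target):
    """Slide `patch` over `target` and return one similarity score per offset."""
    ph, pw = len(patch), len(patch[0])
    th, tw = len(target), len(target[0])
    out = []
    for y in range(th - ph + 1):
        row = []
        for x in range(tw - pw + 1):
            # Dot product between the patch and the target window at (y, x).
            score = sum(
                patch[i][j] * target[y + i][x + j]
                for i in range(ph)
                for j in range(pw)
            )
            row.append(score)
        out.append(row)
    return out


def argmax2d(cmap):
    """Return the (row, col) offset with the highest score."""
    positions = ((y, x) for y in range(len(cmap)) for x in range(len(cmap[0])))
    return max(positions, key=lambda p: cmap[p[0]][p[1]])


# Toy data (made up): the 2x2 reference patch appears in the target at offset (1, 2).
reference_patch = [[1, -1],
                   [-1, 1]]
target_image = [[0, 0, 0, 0],
                [0, 0, 1, -1],
                [0, 0, -1, 1]]

cmap = correlation_map(reference_patch, target_image)
best_offset = argmax2d(cmap)
```

Here the map peaks at the offset where the patch was placed, which is the displacement a matcher would report for that reference patch.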
Optical Flow Deep Matching
The most discriminative response map
The least discriminative response map
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching (TD)
Bottom-up extraction (BU)
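The bottom-up step can be sketched as follows, under simplifying assumptions: each parent patch has four children, every child correlation map is max-pooled so that the child may shift by one position, and the parent map is the average of the pooled child maps read at their nominal offsets. Subsampling between levels and any non-linearity are left out, and the names max_pool3, aggregate and peak_map are hypothetical.

```python
def max_pool3(cmap):
    """3x3 max filter (stride 1): tolerates a shift of one position."""
    h, w = len(cmap), len(cmap[0])
    return [
        [
            max(
                cmap[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def aggregate(children, offsets):
    """Parent map = average of pooled child maps, each read at its nominal offset."""
    pooled = [max_pool3(c) for c in children]
    h, w = len(children[0]), len(children[0][0])
    parent = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for p, (dy, dx) in zip(pooled, offsets):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    total += p[yy][xx]
            parent[y][x] = total / len(children)
    return parent


def peak_map(h, w, y, x):
    """Toy correlation map with a single response of 1.0 at (y, x)."""
    m = [[0.0] * w for _ in range(h)]
    m[y][x] = 1.0
    return m


# Toy data (made up): three children match at their nominal positions for a parent
# at (1, 1); the fourth is displaced by one position, from (2, 2) to (3, 3).
quadrant_offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]
children = [peak_map(4, 4, 1, 1), peak_map(4, 4, 1, 2),
            peak_map(4, 4, 2, 1), peak_map(4, 4, 3, 3)]
parent = aggregate(children, quadrant_offsets)
```

Thanks to the pooling, the displaced fourth child still contributes fully to the parent score at (1, 1), which is the non-rigid behaviour that distinguishes this scheme from rigid matching.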
Optical Flow Deep Matching (BU)
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches.
If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
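A minimal sketch of one top-down step, assuming the same simplified setting of four sub-patches per patch and a search radius of one position around each nominal offset. The function names local_argmax and backtrack and the toy maps are hypothetical.

```python
def local_argmax(cmap, center, radius=1):
    """Position of the best response within `radius` of `center`."""
    h, w = len(cmap), len(cmap[0])
    cy, cx = center
    candidates = [
        (y, x)
        for y in range(max(0, cy - radius), min(h, cy + radius + 1))
        for x in range(max(0, cx - radius), min(w, cx + radius + 1))
    ]
    return max(candidates, key=lambda p: cmap[p[0]][p[1]])


def backtrack(parent_pos, child_maps, offsets):
    """For one parent maximum, recover where each sub-patch actually matched."""
    py, px = parent_pos
    return [
        local_argmax(cmap, (py + dy, px + dx))
        for cmap, (dy, dx) in zip(child_maps, offsets)
    ]


def peak_map(h, w, y, x):
    """Toy correlation map with a single response of 1.0 at (y, x)."""
    m = [[0.0] * w for _ in range(h)]
    m[y][x] = 1.0
    return m


# Toy data (made up): the last sub-patch moved one position away from its nominal place.
quadrant_offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]
child_maps = [peak_map(4, 4, 1, 1), peak_map(4, 4, 1, 2),
              peak_map(4, 4, 2, 1), peak_map(4, 4, 3, 3)]
sub_patch_positions = backtrack((1, 1), child_maps, quadrant_offsets)
```

Repeating this step level by level, down to the smallest patches, would yield one displacement per small patch.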
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
End-to-end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution of data patches from the layers to combine.
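A rough sketch of what such a correlation layer computes, assuming patches of a single feature vector and a small maximum displacement. Feature maps are nested lists of shape [H][W][C]; the name correlation_layer and the max_disp parameter are hypothetical and not taken from the FlowNet code.

```python
def correlation_layer(f1, f2, max_disp=1):
    """Compare each feature vector of f1 with those of f2 at nearby displacements.

    Returns (out, displacements), where out[y][x][k] is the dot product between
    f1[y][x] and f2 at displacement displacements[k] (0.0 outside the image).
    """
    h, w = len(f1), len(f1[0])
    displacements = [
        (dy, dx)
        for dy in range(-max_disp, max_disp + 1)
        for dx in range(-max_disp, max_disp + 1)
    ]
    out = [[[0.0] * len(displacements) for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for k, (dy, dx) in enumerate(displacements):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    out[y][x][k] = sum(a * b for a, b in zip(f1[y][x], f2[yy][xx]))
    return out, displacements


# Toy data (made up): one distinctive feature vector moves one position to the right.
zero = [0.0, 0.0]
f1 = [[[1.0, 2.0], zero, zero],
      [zero, zero, zero]]
f2 = [[zero, [1.0, 2.0], zero],
      [zero, zero, zero]]

scores, displacements = correlation_layer(f1, f2)
best_k = max(range(len(displacements)), key=lambda k: scores[0][0][k])
best_displacement = displacements[best_k]
```

Unlike an ordinary convolution, there are no learned weights here: the "filter" applied to one stream is the data of the other stream.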
Optical Flow FlowNet (expanding)
Upconvolutional layers: Unpooling feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding map from the contractive part.
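A minimal sketch of the unpooling and concatenation steps only; the learned convolution that follows the unpooling is left out. It assumes nearest-neighbour 2x unpooling and feature maps stored as nested lists of shape [C][H][W]. The names unpool2x and expand_and_concat are hypothetical.

```python
def unpool2x(fmap):
    """Nearest-neighbour 2x unpooling of one channel: [H][W] -> [2H][2W]."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat every value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat every row vertically
    return out


def expand_and_concat(coarse, skip):
    """Unpool every coarse channel, then stack the skip channels after them.

    coarse: [C1][H][W], skip: [C2][2H][2W] -> result: [C1 + C2][2H][2W].
    """
    up = [unpool2x(channel) for channel in coarse]
    # The skip connection must have the same spatial size as the unpooled maps.
    assert len(up[0]) == len(skip[0]) and len(up[0][0]) == len(skip[0][0])
    return up + skip


# Toy data (made up): one coarse channel of size 1x2 and one skip channel of size 2x4.
coarse = [[[1, 2]]]
skip = [[[5, 6, 7, 8],
         [9, 10, 11, 12]]]
merged = expand_and_concat(coarse, skip)
```

The concatenation lets the expanding part combine coarse, high-level information with the finer spatial detail kept in the contracting part.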
Since existing ground truth datasets are not sufficiently large to train a Convnet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
Data augmentation
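A hedged sketch of the photometric part of such an augmentation (brightness, contrast, gamma and additive Gaussian noise). Geometric transformations are left out because they would also require transforming the ground-truth flow. All parameter ranges below are illustrative guesses, not the values used by the authors, and augment_pair is a hypothetical name.

```python
import random


def augment_pair(img1, img2, rng):
    """Apply the same random photometric changes to both frames, plus per-pixel noise.

    Images are nested lists [H][W] with values in [0, 1].
    """
    brightness = rng.uniform(-0.2, 0.2)   # assumed range
    contrast = rng.uniform(0.8, 1.2)      # assumed range
    gamma = rng.uniform(0.7, 1.5)         # assumed range

    def transform(img):
        out = []
        for row in img:
            new_row = []
            for v in row:
                v = (v - 0.5) * contrast + 0.5 + brightness  # contrast, then brightness
                v = min(1.0, max(0.0, v))                    # clip before gamma
                v = v ** gamma                               # gamma correction
                v += rng.gauss(0.0, 0.02)                    # additive Gaussian noise
                new_row.append(min(1.0, max(0.0, v)))
            out.append(new_row)
        return out

    return transform(img1), transform(img2)


# Toy data (made up): two tiny frames; the flow ground truth is unaffected by these changes.
frame1 = [[0.1, 0.5], [0.9, 0.3]]
frame2 = [[0.5, 0.1], [0.3, 0.9]]
aug1, aug2 = augment_pair(frame1, frame2, random.Random(0))
```

Passing an explicit random.Random instance keeps the augmentation reproducible, which helps when debugging a training pipeline.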
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but are replaced by a single one at test time.
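A toy sketch of this shared-plus-branches layout, with a single linear layer standing in for the shared layers and one weight vector per branch. All names, weights and inputs are made up for illustration and do not reproduce the MDNet architecture.

```python
def shared_features(x, shared_w):
    """Shared layers (here one linear layer followed by ReLU)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in shared_w]


def score(x, shared_w, branch_w):
    """Target-vs-background score given by one domain-specific branch."""
    feats = shared_features(x, shared_w)
    return sum(w * f for w, f in zip(branch_w, feats))


# Shared by all sequences.
shared_w = [[1.0, 0.0],
            [0.0, 1.0]]

# Training: one branch per training sequence (domain).
train_branches = {"seq_a": [1.0, -1.0], "seq_b": [-1.0, 1.0]}
train_score = score([2.0, 1.0], shared_w, train_branches["seq_a"])

# Test time: the training branches are discarded and a single new branch
# is attached on top of the same shared layers.
test_branch = [0.5, 0.5]
test_score = score([2.0, 1.0], shared_w, test_branch)
```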
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score.
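The selection step can be sketched in a few lines, assuming each negative candidate already carries the positive score assigned by the current model. The sample names, scores and the function name hard_negatives are made up.

```python
def hard_negatives(neg_samples, positive_score, k):
    """Keep the k negative samples that the current model scores as most target-like."""
    ranked = sorted(neg_samples, key=positive_score, reverse=True)
    return ranked[:k]


# Toy data (made up): background candidates with their current positive scores.
candidates = [
    {"name": "bg1", "score": 0.1},
    {"name": "bg2", "score": 0.9},
    {"name": "bg3", "score": 0.6},
    {"name": "bg4", "score": 0.3},
]
selected = hard_negatives(candidates, lambda s: s["score"], k=2)
selected_names = [s["name"] for s in selected]
```

Only the selected samples would then be used as negatives in the next online update, so the model concentrates on the distractors it currently confuses with the target.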
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification.
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015.
[Zhou et al., ICLR 2015] "Object detectors emerge in deep scene CNNs" [Slides from ReadCV]
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets Learn more
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets Learn more
Online course: Deep Learning
Taking machine learning to the next level
ReadCV seminar
Friendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
Barcelona Convolucionada
Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
ConvNets Learn more
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
ConvNets Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
ConvNets Learn more
"Machine learning" sub-Reddit
ConvNets Learn more
Check profile requirements for Summer internship (disclaimer: offered to PhD students by default)
Company     Avg. salary/hour    Avg. salary/month
Yahoo       $43                 ($43 x 160 = $6,880)
Apple       $37                 ($37 x 160 = $5,920)
Google      $29.54-$31.32       $7,151
Facebook    $22.92              $6,150-$7,378
Microsoft   $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California. No stipends included)
ConvNets Learn more
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
ConvNets Learn more
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets Learn more
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets Discussion
Sports Do you know them?
ConvNets Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
and MANY MORE I am missing in the page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Image Classification
Our past research
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
LSUN Challenge
Sentiment Analysis
[Slides]
CNN
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015.
Instance Search in Video
V - T Nguyen -Dinh-Le D Salvador A -Zhu C Nguyen D - L Tran M - T Duc T Ngo Duong D Anh Satoh S ichi and Giró-i-Nieto X, "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
34
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow: Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors.
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size.
Optical Flow: Deep Matching
Optical Flow: 2D correlation
Source: Matlab R2015b documentation for normxcorr2, by MathWorks
Image
Sub-Image
Offset of the sub-image with respect to the image: [0,0]
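The 2D correlation pictured on this slide can be sketched in a few lines of NumPy. This is only an illustration, not MATLAB's normxcorr2: the function name normalized_cross_correlation and the toy data are assumptions, and only fully overlapping positions are scored (normxcorr2 also returns the partially overlapping ones).

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Score every fully overlapping placement of `template` inside `image`.

    Each score is the zero-mean normalized correlation, in [-1, 1].
    """
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            window = image[y:y + th, x:x + tw]
            w = window - window.mean()
            denom = np.sqrt((w ** 2).sum()) * t_norm
            out[y, x] = (w * t).sum() / denom if denom > 0 else 0.0
    return out

# Toy data: cut a sub-image out of a random image and find it again.
rng = np.random.default_rng(0)
image = rng.random((20, 20))
template = image[5:9, 7:11]
scores = normalized_cross_correlation(image, template)
best = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
```

The position of the maximum in scores is the offset of the sub-image with respect to the image.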
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
Optical Flow: Deep Matching
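A rough sketch of the idea of one correlation map per reference patch. It makes several assumptions: raw pixel intensities stand in for the descriptors used in the paper, patches are non-overlapping 4x4 blocks, and the name patch_correlation_maps is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def patch_correlation_maps(reference, target, patch=4):
    """Return maps[i, j]: correlation of reference patch (i, j) with every target window."""
    reference = np.asarray(reference, dtype=float)
    target = np.asarray(target, dtype=float)
    # Every patch-sized window of the target, shape (H', W', patch, patch).
    windows = sliding_window_view(target, (patch, patch))
    norms = np.sqrt((windows ** 2).sum(axis=(2, 3), keepdims=True))
    windows = windows / (norms + 1e-12)
    n_rows = reference.shape[0] // patch
    n_cols = reference.shape[1] // patch
    maps = np.empty((n_rows, n_cols) + windows.shape[:2])
    for i in range(n_rows):
        for j in range(n_cols):
            p = reference[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            p = p / (np.linalg.norm(p) + 1e-12)
            # Cosine similarity between this reference patch and each target window.
            maps[i, j] = np.einsum('hwab,ab->hw', windows, p)
    return maps

# Toy data: the target is the reference moved 2 rows down and 3 columns right.
rng = np.random.default_rng(0)
reference = rng.random((16, 16))
target = np.roll(reference, shift=(2, 3), axis=(0, 1))
maps = patch_correlation_maps(reference, target)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(maps[1, 1]), maps[1, 1].shape))
```

Here the reference patch whose top-left corner is at (4, 4) peaks at (6, 7) in its map, which is the imposed shift.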
The most discriminative response map
The least discriminative response map
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Optical Flow: Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-up extraction (BU)
Top-down matching (TD)
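One level of the bottom-up step can be sketched as follows. This is a simplified illustration: the function names are hypothetical, and the subsampling and non-linear rectification of the original method are left out. Each child map is max-pooled so that the child may move by one pixel, then the four children are read at their expected offsets and averaged.

```python
import numpy as np

def max_pool3(m):
    """3x3 max filter with stride 1 (borders padded with -inf)."""
    padded = np.pad(m, 1, mode='constant', constant_values=-np.inf)
    stacked = [padded[dy:dy + m.shape[0], dx:dx + m.shape[1]]
               for dy in range(3) for dx in range(3)]
    return np.max(stacked, axis=0)

def aggregate(children, child_size):
    """Build the correlation map of a parent patch from its four children.

    children[k][y, x] is the score of placing child k (top-left, top-right,
    bottom-left, bottom-right) with its top-left corner at (y, x).
    """
    offsets = [(0, 0), (0, child_size), (child_size, 0), (child_size, child_size)]
    h, w = children[0].shape
    out_h, out_w = h - child_size, w - child_size
    parent = np.zeros((out_h, out_w))
    for child, (dy, dx) in zip(children, offsets):
        pooled = max_pool3(child)  # tolerate a shift of +-1 pixel per child
        parent += pooled[dy:dy + out_h, dx:dx + out_w]
    return parent / 4.0

# Toy data: four 4x4 children that all agree on a parent located at (2, 3),
# with the last child displaced by one pixel.
child_size = 4
children = [np.zeros((10, 10)) for _ in range(4)]
for child, (y, x) in zip(children, [(2, 3), (2, 7), (6, 3), (7, 8)]):
    child[y, x] = 1.0
parent_map = aggregate(children, child_size)
```

Applying aggregate repeatedly (4x4 to 8x8, 8x8 to 16x16, and so on) yields the pyramid of maps listed above.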
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
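The top-down step can be sketched as one level of backtracking. The assumptions are a hypothetical backtrack helper, four children per parent, and a search radius of one pixel. Given where the parent was matched, each child is looked up near its expected position in its own correlation map.

```python
import numpy as np

def backtrack(children, child_size, parent_pos, radius=1):
    """Return the best position of each child for a parent matched at parent_pos."""
    offsets = [(0, 0), (0, child_size), (child_size, 0), (child_size, child_size)]
    py, px = parent_pos
    positions = []
    for child, (dy, dx) in zip(children, offsets):
        cy, cx = py + dy, px + dx  # expected child position
        y0, y1 = max(cy - radius, 0), min(cy + radius + 1, child.shape[0])
        x0, x1 = max(cx - radius, 0), min(cx + radius + 1, child.shape[1])
        window = child[y0:y1, x0:x1]
        iy, ix = np.unravel_index(np.argmax(window), window.shape)
        positions.append((y0 + int(iy), x0 + int(ix)))
    return positions

# Toy data: two of the four children sit one pixel away from where a rigid
# parent at (2, 3) would put them.
children = [np.zeros((10, 10)) for _ in range(4)]
for child, (y, x) in zip(children, [(2, 3), (3, 7), (6, 3), (6, 8)]):
    child[y, x] = 1.0
matches = backtrack(children, child_size=4, parent_pos=(2, 3))
```

Repeating this from the top of the pyramid down to the smallest patches gives one correspondence per small patch, each allowed to deviate slightly from a rigid shift.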
Optical Flow: Deep Matching
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow: FlowNet
End-to-end supervised learning of optical flow
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution of data patches from the layers to combine.
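A minimal NumPy sketch of what such a correlation layer computes. It assumes 1x1 patches and a small displacement range, and the function name correlation_layer is hypothetical. For every location of the first feature map, it takes the dot product with the second feature map at each displacement.

```python
import numpy as np

def correlation_layer(f1, f2, max_disp=2):
    """Correlate two (C, H, W) feature maps over a window of displacements.

    Returns an array of shape (D * D, H, W) with D = 2 * max_disp + 1.
    Channel (dy, dx) holds <f1[:, y, x], f2[:, y + dy, x + dx]>.
    """
    c, h, w = f1.shape
    padded = np.pad(f2, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    out = []
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            shifted = padded[:, max_disp + dy:max_disp + dy + h,
                             max_disp + dx:max_disp + dx + w]
            out.append((f1 * shifted).sum(axis=0))
    return np.stack(out)

# Toy data: f2 is f1 moved by (1, 2), with unit-length features per location.
rng = np.random.default_rng(0)
f1 = rng.standard_normal((8, 6, 6))
f1 /= np.linalg.norm(f1, axis=0, keepdims=True)
f2 = np.roll(f1, shift=(1, 2), axis=(1, 2))
corr = correlation_layer(f1, f2, max_disp=2)
best_channel = int(np.argmax(corr[:, 2, 2]))
```

With max_disp=2 the channels are ordered row by row over the 5x5 displacement grid, so displacement (1, 2) is channel 3 * 5 + 4 = 19.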
Optical Flow: FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding map from the contractive part.
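The expanding step can be sketched in NumPy. The assumptions are nearest-neighbour unpooling, a plain 3x3 convolution, and hypothetical function names; the actual layer sizes and unpooling scheme are not given on the slide.

```python
import numpy as np

def unpool2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv3x3(x, weights):
    """'Same' 3x3 convolution. x: (C_in, H, W), weights: (C_out, C_in, 3, 3)."""
    c_in, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weights.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + w]
            out += np.einsum('oc,chw->ohw', weights[:, :, dy, dx], patch)
    return out

def upconv_block(coarse, skip, weights):
    """Unpool + convolve the coarse map, then concatenate the skip connection."""
    up = conv3x3(unpool2x(coarse), weights)
    return np.concatenate([up, skip], axis=0)

rng = np.random.default_rng(0)
coarse = rng.standard_normal((4, 3, 3))   # deep, low-resolution features
skip = rng.standard_normal((2, 6, 6))     # map from the contractive part
weights = np.zeros((4, 4, 3, 3))
weights[np.arange(4), np.arange(4), 1, 1] = 1.0  # identity kernel for the demo
merged = upconv_block(coarse, skip, weights)
```

The output has the resolution of the skip connection and the channels of both inputs, which is what lets fine detail from the contracting part reach the flow prediction.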
Optical Flow: FlowNet
Since existing ground-truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated… and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
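A small sketch of the photometric part of such an augmentation. The parameter ranges and the name augment_pair are assumptions, not the values used by the authors. Geometric transformations are left out because they would also require transforming the ground-truth flow.

```python
import numpy as np

def augment_pair(img1, img2, rng):
    """Photometric augmentation of a frame pair with values in [0, 1].

    The same gamma, contrast and brightness change is applied to both
    frames; Gaussian noise is drawn independently for each frame. No
    geometric change is made, so the ground-truth flow stays valid.
    """
    gamma = rng.uniform(0.7, 1.5)        # hypothetical ranges
    contrast = rng.uniform(0.8, 1.2)
    brightness = rng.normal(0.0, 0.1)

    def jitter(img):
        out = np.clip(img, 0.0, 1.0) ** gamma
        out = (out - 0.5) * contrast + 0.5 + brightness
        out = out + rng.normal(0.0, 0.02, size=img.shape)
        return np.clip(out, 0.0, 1.0)

    return jitter(img1), jitter(img2)

rng = np.random.default_rng(0)
frame1 = rng.random((8, 8, 3))
frame2 = rng.random((8, 8, 3))
aug1, aug2 = augment_pair(frame1, frame2, rng)
```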
Data augmentation
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but are replaced by a single one at test time.
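The split between shared and domain-specific layers can be sketched with a toy network. It assumes one shared fully connected layer and one binary (target vs. background) head per training sequence; the class and method names are hypothetical.

```python
import numpy as np

class MultiDomainNet:
    """Toy network: shared feature layer plus one output head per domain."""

    def __init__(self, in_dim, hidden_dim, n_domains, rng):
        self.shared = rng.standard_normal((in_dim, hidden_dim)) * 0.1
        self.heads = [rng.standard_normal((hidden_dim, 2)) * 0.1
                      for _ in range(n_domains)]

    def features(self, x):
        return np.maximum(x @ self.shared, 0.0)  # shared layer + ReLU

    def scores(self, x, domain):
        # Only the head of the requested domain (sequence) is used.
        return self.features(x) @ self.heads[domain]

    def reset_for_test(self, rng):
        """Keep the shared layer, drop all training heads, start one new head."""
        self.heads = [rng.standard_normal((self.shared.shape[1], 2)) * 0.1]

rng = np.random.default_rng(0)
net = MultiDomainNet(in_dim=16, hidden_dim=8, n_domains=3, rng=rng)
x = rng.standard_normal((5, 16))
train_scores = net.scores(x, domain=2)   # training: head of sequence 2
shared_before = net.shared.copy()
net.reset_for_test(rng)
test_scores = net.scores(x, domain=0)    # test: the single new head
```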
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score.
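Hard negative mining as described here reduces to a sort. The sketch below uses a hypothetical helper, hard_negatives, and assumes that a positive-class score is already available for every candidate sample.

```python
import numpy as np

def hard_negatives(pos_scores, is_negative, k):
    """Indices of the k negative samples the classifier scores highest as positive."""
    pos_scores = np.asarray(pos_scores, dtype=float)
    neg_idx = np.flatnonzero(is_negative)
    order = np.argsort(pos_scores[neg_idx])[::-1]  # most confusing first
    return neg_idx[order[:k]]

pos_scores = [0.9, 0.1, 0.8, 0.3, 0.7]
is_negative = [False, True, True, True, True]
selected = hard_negatives(pos_scores, is_negative, k=2)
```

Only the selected samples would then be used as negatives in the next online update.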
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification.
conv4-3 conv5-3
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
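One simple way to illustrate this observation is to rank feature maps by how much of their activation falls inside the target region. This heuristic is an assumption made for illustration and is not the selection procedure of the FCNT paper; select_feature_maps is a hypothetical name.

```python
import numpy as np

def select_feature_maps(features, mask, k):
    """Pick the k channels whose activation is most concentrated on the foreground.

    features: (C, H, W) non-negative activations; mask: (H, W) boolean foreground.
    """
    total = features.sum(axis=(1, 2)) + 1e-12
    inside = (features * mask).sum(axis=(1, 2))
    ratio = inside / total
    return np.argsort(ratio)[::-1][:k]

mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                 # toy target region
features = np.zeros((3, 4, 4))
features[0][mask] = 1.0               # fires only on the target
features[1][:] = 1.0                  # fires everywhere
features[2][~mask] = 1.0              # fires only on the background
kept = select_feature_maps(features, mask, k=2)
```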
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization… but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation…
conv4-3 conv5-3
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End-to-End Learning
Tuesday, February 2, 4pm
D5-010, Campus Nord
Jose M. Álvarez
Stanford course: CS231n, Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning, Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom ("Deep Learning within everyone's reach")
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona
Summer course: Deep Learning for Computer Vision (2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for Summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary / hour    Avg. salary / month
Yahoo        $43                   ($43 x 160 = $6880)
Apple        $37                   ($37 x 160 = $5920)
Google       $29.54-$31.32         $7151
Facebook     $22.92                $6150-$7378
Microsoft    $22.63                $6506-$7171
Source: Glassdoor.com (internships in California. No stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
ConvNets: Discussion
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
…and MANY MORE I am missing in the page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Sentiment Analysis
CNN
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. 'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. 'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
35
Recognition ImageNet Video
[ILSVRC 2015 Slides and videos]
36
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
Neil Lawrence OpenAI wonrsquot benefit humanity without open data sharing (The Guardian 14122015)
Is Computer Vision solved
ConvNets Discussion
92
Sports Do you know them
93
ConvNets Do you know them
94
Antonio Torralba MIT(former UPC)
and MANY MORE I am missing in the page (apologies)
Oriol Vinyals Google(former UPC)
Jose M Aacutelvarez NICTA(former URL amp UAB)
Joan Bruna Berkeley(former UPC)
95
ConvNets Where you are studyingVisioCat dinner CVPR 2015
Considering a Phd at GPI-UPC Currently no direct funding available (check in the future)We can support your application to scholarships
External grant listings UPC UPF
Funding institution Last deadlines (on 2812016)
FI (Catalonia) 22092015
FPU (Spain) 15012016
Check our activity at httpsimatgeupceduweb 96
Our past research
Image Classification
ChaLearn Workshop
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
Our current research
Saliency Prediction
LSUN Challenge
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S., and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available at:
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Recognition ImageNet Video
Kai Kang et al., "Object Detection in Videos with Tubelets and Multi-Context Cues" (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow Small vs Large
Classic approach: Rigid matching of HoG or SIFT descriptors
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size
Optical Flow Deep Matching
Source: Matlab R2015b documentation for normxcorr2, by MathWorks
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [0,0]
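The 2D correlation slide can be illustrated with a small sketch. It slides a sub-image over an image, scores every fully overlapping position with normalized cross-correlation, and reads the offset from the best score. This is a plain-Python illustration in the spirit of normxcorr2, not its implementation; the function names `ncc_map` and `best_offset` and the toy data are assumptions made for this example.

```python
import math

def ncc_map(image, template):
    """Normalized cross-correlation of `template` at every position of
    `image` where the template fits entirely (inputs are lists of rows)."""
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_zero = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(v * v for v in t_zero))
    scores = []
    for y in range(H - h + 1):
        row_scores = []
        for x in range(W - w + 1):
            patch = [image[y + j][x + i] for j in range(h) for i in range(w)]
            p_mean = sum(patch) / len(patch)
            p_zero = [v - p_mean for v in patch]
            p_norm = math.sqrt(sum(v * v for v in p_zero))
            denom = p_norm * t_norm
            if denom == 0:
                row_scores.append(0.0)  # flat patch or flat template
            else:
                dot = sum(a * b for a, b in zip(p_zero, t_zero))
                row_scores.append(dot / denom)
        scores.append(row_scores)
    return scores

def best_offset(scores):
    """(row, col) of the highest score in a 2D score map."""
    best = max((s, y, x) for y, row in enumerate(scores) for x, s in enumerate(row))
    return best[1], best[2]

# Toy data (hypothetical): the sub-image sits at row 1, column 2 of the image.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0, 0],
    [0, 0, 3, 9, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
template = [[1, 2], [3, 9]]
scores = ncc_map(image, template)
offset = best_offset(scores)
print(offset)
```

Subtracting the mean and dividing by the norms makes the score insensitive to brightness and contrast changes, which is why the normalized variant is usually preferred over a raw dot product.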
Optical Flow Deep Matching
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
The most discriminative response map
The least discriminative response map
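A minimal sketch of the idea above: every patch of the reference image acts as a filter that is correlated with the target image, giving one response map per reference patch. The sketch assumes raw pixel values and non-overlapping patches for brevity, whereas the slides compare descriptors; all names (`patch_response_maps`, `argmax2d`) and the toy images are hypothetical.

```python
import math

def _zero_mean_unit(values):
    """Subtract the mean and scale to unit norm (all zeros if the input is flat)."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered] if norm > 0 else [0.0] * len(values)

def window(img, y, x, size):
    """Flattened size x size window of img with top-left corner (y, x)."""
    return [img[y + j][x + i] for j in range(size) for i in range(size)]

def patch_response_maps(reference, target, size):
    """One correlation map per non-overlapping size x size reference patch,
    keyed by the patch's top-left corner in the reference image."""
    maps = {}
    rows = len(target) - size + 1
    cols = len(target[0]) - size + 1
    for py in range(0, len(reference) - size + 1, size):
        for px in range(0, len(reference[0]) - size + 1, size):
            kernel = _zero_mean_unit(window(reference, py, px, size))
            maps[(py, px)] = [
                [sum(k * v for k, v in
                     zip(kernel, _zero_mean_unit(window(target, y, x, size))))
                 for x in range(cols)]
                for y in range(rows)
            ]
    return maps

def argmax2d(resp):
    """(row, col) of the strongest response."""
    best = max((v, y, x) for y, row in enumerate(resp) for x, v in enumerate(row))
    return best[1], best[2]

# Toy pair (hypothetical): the target is the reference moved one pixel right.
reference = [
    [1, 2, 0, 0],
    [3, 9, 0, 0],
    [0, 0, 5, 1],
    [0, 0, 2, 8],
]
target = [[0] + row for row in reference]
maps = patch_response_maps(reference, target, 2)
print(argmax2d(maps[(0, 0)]), argmax2d(maps[(2, 2)]))
```

The two textured patches peak one column to the right of their reference position, while the two flat patches produce all-zero maps. That is one way to read the contrast between more and less discriminative response maps.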
Optical Flow Deep Matching
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-up extraction (BU)
Top-down matching (TD)
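The bottom-up step can be sketched as follows, under simplifying assumptions: the response map of a 2n x 2n patch is the average of the response maps of its four n x n children, read at their rigid positions after a 3x3 max filter that lets each child move by one pixel. The published method contains further steps (for example subsampling) that are left out here, and the names `max_pool3`, `aggregate` and `single_peak` are hypothetical.

```python
def max_pool3(resp):
    """3x3 max filter with stride 1: tolerates a one-pixel shift of a child."""
    H, W = len(resp), len(resp[0])
    return [[max(resp[j][i]
                 for j in range(max(0, y - 1), min(H, y + 2))
                 for i in range(max(0, x - 1), min(W, x + 2)))
             for x in range(W)] for y in range(H)]

def aggregate(children, n):
    """children: response maps of the four n x n sub-patches in the order
    top-left, top-right, bottom-left, bottom-right, each indexed by the
    sub-patch's top-left position in the target image.
    Returns the response map of the 2n x 2n parent patch."""
    pooled = [max_pool3(c) for c in children]
    H, W = len(pooled[0]), len(pooled[0][0])
    offsets = [(0, 0), (0, n), (n, 0), (n, n)]
    return [[sum(pooled[k][y + dy][x + dx]
                 for k, (dy, dx) in enumerate(offsets)) / 4.0
             for x in range(W - n)] for y in range(H - n)]

def single_peak(h, w, y, x):
    """Toy response map: zero everywhere except a 1.0 at (y, x)."""
    resp = [[0.0] * w for _ in range(h)]
    resp[y][x] = 1.0
    return resp

# Four 2x2 children of a 4x4 parent located at (1, 1) in the target.
# The bottom-right child is displaced by one pixel: (3, 4) instead of (3, 3).
children = [
    single_peak(5, 5, 1, 1),
    single_peak(5, 5, 1, 3),
    single_peak(5, 5, 3, 1),
    single_peak(5, 5, 3, 4),
]
parent = aggregate(children, 2)
print(parent[1][1], parent[0][0])
```

Repeating the aggregation on the parent maps gives the next level (8x8, 16x16, 32x32), which is the pyramid that the top-down search later walks back through.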
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches.
If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
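A minimal sketch of one top-down step, assuming the simplified pyramid where each child may move by at most one pixel around its rigid position. Starting from a parent match, each sub-patch looks up its own response map near the expected position and keeps the best location. The function name `refine_children` and the toy maps are hypothetical.

```python
def refine_children(children, parent_pos, n):
    """For a parent match at parent_pos = (y, x), return for each child
    (top-left, top-right, bottom-left, bottom-right) the best position
    within one pixel of its rigid location, together with its score."""
    offsets = [(0, 0), (0, n), (n, 0), (n, n)]
    py, px = parent_pos
    result = []
    for resp, (dy, dx) in zip(children, offsets):
        H, W = len(resp), len(resp[0])
        ey, ex = py + dy, px + dx  # expected position under rigid motion
        candidates = [(resp[y][x], y, x)
                      for y in range(max(0, ey - 1), min(H, ey + 2))
                      for x in range(max(0, ex - 1), min(W, ex + 2))]
        score, y, x = max(candidates)
        result.append(((y, x), score))
    return result

def single_peak(h, w, y, x):
    """Toy response map: zero everywhere except a 1.0 at (y, x)."""
    resp = [[0.0] * w for _ in range(h)]
    resp[y][x] = 1.0
    return resp

# Same toy setting as a 4x4 parent at (1, 1) with 2x2 children; the
# bottom-right child actually matches at (3, 4) instead of (3, 3).
children = [
    single_peak(5, 5, 1, 1),
    single_peak(5, 5, 1, 3),
    single_peak(5, 5, 3, 1),
    single_peak(5, 5, 3, 4),
]
matches = refine_children(children, (1, 1), 2)
print([pos for pos, _ in matches])
```

Applying the same step recursively down to the smallest patches yields one correspondence per 4x4 patch, each with its own small deviation from the rigid shift of the 32x32 ancestor.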
Optical Flow Deep Matching
Ground truth
Dense HOG [Brox & Malik, 2011]
Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., & Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
End-to-end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution of data patches from the layers to combine.
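A sketch of what a correlation layer computes in Option B, under simplifying assumptions: each "patch" is a single feature vector, there is no striding, and positions outside the map score zero. For every location in the first feature map, it takes the dot product with the feature vectors of the second map inside a limited displacement range. The function name, the `max_disp` parameter and the toy features are hypothetical and do not reproduce FlowNet's exact layer.

```python
def correlation_layer(f1, f2, max_disp):
    """f1, f2: feature maps as nested lists [row][col][channel].
    Returns (out, shifts), where out[y][x][k] is the dot product between
    f1[y][x] and f2 at the position displaced by shifts[k] = (dy, dx)."""
    H, W = len(f1), len(f1[0])
    shifts = [(dy, dx)
              for dy in range(-max_disp, max_disp + 1)
              for dx in range(-max_disp, max_disp + 1)]
    out = []
    for y in range(H):
        row = []
        for x in range(W):
            scores = []
            for dy, dx in shifts:
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    scores.append(sum(a * b for a, b in zip(f1[y][x], f2[yy][xx])))
                else:
                    scores.append(0.0)  # displaced position falls outside the map
            row.append(scores)
        out.append(row)
    return out, shifts

# Toy features (hypothetical): the second map is the first one moved one
# column to the right, with zeros entering on the left.
f1 = [
    [[1, 0], [0, 1], [1, 1]],
    [[0, 2], [2, 0], [1, -1]],
]
f2 = [[[0, 0]] + row[:-1] for row in f1]
corr, shifts = correlation_layer(f1, f2, 1)
best = shifts[corr[0][0].index(max(corr[0][0]))]
print(len(corr[0][0]), best)
```

The output has one channel per candidate displacement, (2 * max_disp + 1)^2 in total, so later layers can read off which displacement correlates best at each location.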
Optical Flow FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding feature map from the contracting part.
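The expanding step can be sketched as "unpool, convolve, concatenate". The sketch makes several assumptions that are not in the slides: unpooling is nearest-neighbour replication by a factor of 2, the same 3x3 kernel is applied to each channel independently (a real layer mixes channels with learned weights), and feature maps are plain nested lists. All names are hypothetical.

```python
def unpool2x(fmap):
    """Nearest-neighbour unpooling: every value becomes a 2x2 block."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def conv3x3_same(fmap, kernel):
    """3x3 filtering (cross-correlation form) with zero padding."""
    H, W = len(fmap), len(fmap[0])

    def at(y, x):
        return fmap[y][x] if 0 <= y < H and 0 <= x < W else 0.0

    return [[sum(kernel[j][i] * at(y + j - 1, x + i - 1)
                 for j in range(3) for i in range(3))
             for x in range(W)] for y in range(H)]

def upconv_block(coarse, skip, kernel):
    """coarse: list of low-resolution channel maps.
    skip: list of channel maps from the contracting part, at twice the
    resolution. Returns the channel-wise concatenation of both."""
    upsampled = [conv3x3_same(unpool2x(ch), kernel) for ch in coarse]
    return upsampled + skip

# Toy example: one coarse 2x2 channel, one 4x4 skip channel, identity kernel.
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
coarse = [[[1, 2], [3, 4]]]
skip = [[[0] * 4 for _ in range(4)]]
features = upconv_block(coarse, skip, identity)
print(len(features), len(features[0]), len(features[0][0]))
```

The concatenation gives the next layer access to both the coarse, high-level information and the finer detail kept in the contracting part.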
Optical Flow FlowNet
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
Data augmentation
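A minimal sketch of the photometric part of such an augmentation, assuming grayscale images with pixel values in [0, 1]. The parameter ranges and the noise level are illustrative assumptions, not the values used by the FlowNet authors, and the function name `augment_pair` is hypothetical.

```python
import random

def augment_pair(img1, img2, rng, noise_sigma=0.02):
    """Apply the same random brightness, contrast and gamma change to both
    images of a pair, then independent Gaussian noise to every pixel."""
    brightness = rng.uniform(-0.1, 0.1)  # additive shift (assumed range)
    contrast = rng.uniform(0.8, 1.2)     # scaling around mid-gray (assumed range)
    gamma = rng.uniform(0.7, 1.5)        # gamma exponent (assumed range)

    def transform(img):
        out = []
        for row in img:
            new_row = []
            for v in row:
                v = (v - 0.5) * contrast + 0.5 + brightness
                v = min(1.0, max(0.0, v)) ** gamma
                v += rng.gauss(0.0, noise_sigma)
                new_row.append(min(1.0, max(0.0, v)))
            out.append(new_row)
        return out

    return transform(img1), transform(img2)

# Toy image pair (hypothetical values).
img1 = [[0.0, 0.25, 0.5], [0.75, 1.0, 0.5]]
img2 = [[0.25, 0.5, 0.0], [1.0, 0.5, 0.75]]
aug1, aug2 = augment_pair(img1, img2, random.Random(0))
print(aug1[0])
```

Photometric changes leave the ground-truth flow untouched. Geometric changes (translation, rotation, scaling) move pixels, so the flow field has to be transformed consistently with the images, which this sketch does not cover.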
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score.
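Hard negative mining as described above reduces to a ranking step: score every negative sample with the current model and keep those it most confidently mistakes for the target. The sketch below shows only that selection; the scoring function, the sample names and the scores are made up for illustration.

```python
def hard_negatives(samples, positive_score, k):
    """Return the k negative samples with the highest positive score."""
    ranked = sorted(samples, key=positive_score, reverse=True)
    return ranked[:k]

# Hypothetical negative samples and the positive scores a model gave them.
scores = {
    "bg_sky": 0.05,
    "bg_tree": 0.20,
    "distractor_face": 0.85,
    "bg_road": 0.40,
}
negatives = list(scores)
selected = hard_negatives(negatives, scores.get, 2)
print(selected)
```

Training on these samples focuses the update on distractors that look like the target, instead of on background patches the model already rejects.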
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of a VGG-16 network pre-trained for ImageNet image classification.
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to tell apart different objects of the same category.
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary / hour    Avg. salary / month
Yahoo        $43                   ($43 x 160 = $6,880)
Apple        $37                   ($37 x 160 = $5,920)
Google       $29.54-$31.32         $7,151
Facebook     $22.92                $6,150-$7,378
Microsoft    $22.63                $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
37
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
38
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
Neil Lawrence OpenAI wonrsquot benefit humanity without open data sharing (The Guardian 14122015)
Is Computer Vision solved
ConvNets Discussion
92
Sports Do you know them
93
ConvNets Do you know them
94
Antonio Torralba MIT(former UPC)
and MANY MORE I am missing in the page (apologies)
Oriol Vinyals Google(former UPC)
Jose M Aacutelvarez NICTA(former URL amp UAB)
Joan Bruna Berkeley(former UPC)
95
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding is available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF

Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016

Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification
ChaLearn Workshop
Salvador, A., Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
Our current research
Saliency Prediction
LSUN Challenge
Pan, J. and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
Sentiment Analysis
Campos, V., Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research
Instance Search in Video
Nguyen, V.-T., Le, D.-D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Ngo Duc, T., Duong, D. A., Satoh, S., and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
McGuinness, K., Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available online
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Recognition: ImageNet Video
Kai Kang et al., "Object Detection in Videos with TubeLets and Multi-Context Cues" (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow: Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size
Optical Flow: Deep Matching
Optical Flow: 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image: [0,0]
Source: Matlab R2015b documentation for normxcorr2, by MathWorks
Optical Flow: Deep Matching
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
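The correlation map of a single reference patch can be sketched with a plain normalized cross-correlation, in the spirit of the normxcorr2 slide. This is a minimal illustration, not the DeepFlow implementation: the function names `normalized_cross_correlation` and `best_offset` are hypothetical, images are grayscale lists of rows, and only offsets where the patch fits entirely inside the image are scored.

```python
import math


def normalized_cross_correlation(image, patch):
    """Score `patch` at every offset where it fits inside `image`.

    Both arguments are lists of rows of numbers. The result is a 2D list
    (the correlation map) with one score in [-1, 1] per offset.
    """
    ph, pw = len(patch), len(patch[0])
    ih, iw = len(image), len(image[0])

    # Center the patch once: subtract its mean and compute its norm.
    p_vals = [v for row in patch for v in row]
    p_mean = sum(p_vals) / len(p_vals)
    p_centered = [v - p_mean for v in p_vals]
    p_norm = math.sqrt(sum(v * v for v in p_centered))

    scores = []
    for y in range(ih - ph + 1):
        row_scores = []
        for x in range(iw - pw + 1):
            # Extract and center the image window under the patch.
            w_vals = [image[y + dy][x + dx]
                      for dy in range(ph) for dx in range(pw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_centered = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(v * v for v in w_centered))
            denom = p_norm * w_norm
            if denom == 0:
                # Flat windows (or a flat patch) carry no evidence: score 0.
                row_scores.append(0.0)
            else:
                dot = sum(a * b for a, b in zip(p_centered, w_centered))
                row_scores.append(dot / denom)
        scores.append(row_scores)
    return scores


def best_offset(scores):
    """Return the (row, column) offset with the highest correlation score."""
    best = max((s, y, x) for y, row in enumerate(scores)
               for x, s in enumerate(row))
    return best[1], best[2]
```

Running this for every patch of the reference image gives one correlation map per patch, which is the input to the pyramid described on the following slides.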
Optical Flow: Deep Matching
The most discriminative response map
The least discriminative response map
Optical Flow: Deep Matching
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-up extraction (BU)
Top-down matching (TD)
Optical Flow: Deep Matching (BU)
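One bottom-up step of the pyramid can be sketched as follows: the correlation map of a larger patch is obtained from the maps of its four sub-patches, letting each sub-patch move slightly around its expected position. This is a simplified sketch under stated assumptions, not the DeepFlow code: the function name `aggregate_children` is hypothetical, all maps share the same grid, responses are assumed non-negative, and the subsampling and non-linear rectification that a full implementation would typically include are omitted.

```python
def aggregate_children(child_maps, child_offsets, radius=1):
    """Build a parent correlation map from its children's maps.

    child_maps:    list of 2D lists (same size) of non-negative responses.
    child_offsets: list of (dy, dx), the expected position of each child
                   relative to the parent position.
    radius:        how far each child may move from its expected position.
    """
    h, w = len(child_maps[0]), len(child_maps[0][0])
    parent = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for cmap, (oy, ox) in zip(child_maps, child_offsets):
                # Local max-pooling: the child picks its best response in a
                # small window around where the parent expects it to be.
                best = 0.0  # positions outside the map contribute zero
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        yy, xx = y + oy + dy, x + ox + dx
                        if 0 <= yy < h and 0 <= xx < w:
                            best = max(best, cmap[yy][xx])
                total += best
            # The parent response is the average of its children's best scores.
            parent[y][x] = total / len(child_maps)
    return parent
```

Applying this step repeatedly (4x4 to 8x8, 8x8 to 16x16, 16x16 to 32x32) yields the pyramid that the top-down search then traverses.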
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
Optical Flow: Deep Matching
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow: FlowNet
End-to-end supervised learning of optical flow
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: Convolution of data patches from the layers to combine.
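The correlation layer of Option B can be sketched as a comparison of each feature vector of the first stream with the feature vectors of the second stream within a limited displacement range. This is a minimal sketch, not the FlowNet code: `correlation_layer` and `max_disp` are hypothetical names, feature maps are nested lists indexed as `f[y][x][channel]`, patches are reduced to a single position (1x1), and displacements that fall outside the map score zero.

```python
def correlation_layer(f1, f2, max_disp=1):
    """Correlate two feature maps over a neighborhood of displacements.

    f1, f2:   feature maps of identical size, indexed f[y][x][channel].
    max_disp: largest displacement (in positions) tested in each direction.

    Returns (disps, out), where disps lists the tested (dy, dx) pairs and
    out[y][x][i] is the dot product between f1[y][x] and f2 displaced by
    disps[i].
    """
    h, w = len(f1), len(f1[0])
    disps = [(dy, dx)
             for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            scores = []
            for dy, dx in disps:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    # Dot product of the two feature vectors being compared.
                    scores.append(sum(a * b
                                      for a, b in zip(f1[y][x], f2[yy][xx])))
                else:
                    # Displacement leaves the map: no evidence, score zero.
                    scores.append(0.0)
            row.append(scores)
        out.append(row)
    return disps, out
```

Unlike a standard convolution, there are no trainable weights here: one stream plays the role of the filter for the other, which is why the output has one channel per tested displacement.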
Optical Flow: FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution. Upconvolved feature maps are concatenated with the corresponding map from the contractive part.
Optical Flow: FlowNet
Data augmentation
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated… and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
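The photometric part of the augmentation listed above (brightness, contrast, gamma) can be sketched as follows. This is only an illustration under stated assumptions: `photometric_augment` is a hypothetical name, frames are grayscale lists of rows with values in [0, 1], the parameter ranges are invented for the example and are not the ones used by the FlowNet authors, and the same transform is applied to both frames so that the ground-truth flow stays valid.

```python
import random


def photometric_augment(frame_pair, rng):
    """Apply one random brightness/contrast/gamma change to both frames.

    frame_pair: list of two grayscale frames (lists of rows, values in [0, 1]).
    rng:        a random.Random instance, so that results are reproducible.
    """
    # Hypothetical parameter ranges, chosen only for illustration.
    brightness = rng.uniform(-0.2, 0.2)
    contrast = rng.uniform(0.8, 1.2)
    gamma = rng.uniform(0.7, 1.5)

    def transform(v):
        # Contrast scales around mid-gray, brightness shifts the result.
        v = (v - 0.5) * contrast + 0.5 + brightness
        # Clip to the valid range before the gamma curve.
        v = min(1.0, max(0.0, v))
        return v ** gamma

    # The same parameters are used for both frames of the pair.
    return [[[transform(v) for v in row] for row in frame]
            for frame in frame_pair]
```

Geometric augmentations (translation, rotation, scaling) need more care than this sketch shows, because they change the ground-truth flow field as well as the images.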
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score.
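The selection step of hard negative mining can be sketched in a few lines. This is a minimal sketch, not the MDNet code: `mine_hard_negatives` is a hypothetical name, and the scores are assumed to be the positive-class scores that the current network assigns to candidate negative samples.

```python
def mine_hard_negatives(negative_scores, k):
    """Return the indices of the k negatives with the highest positive score.

    negative_scores: positive-class score of each negative sample, as given
                     by the current network.
    k:               number of hard negatives to keep for the next update.
    """
    # Rank the negatives from most to least confusing for the classifier.
    ranked = sorted(range(len(negative_scores)),
                    key=lambda i: negative_scores[i],
                    reverse=True)
    return ranked[:k]


# Usage: samples 1 and 3 look most like the target, so they are kept.
hard = mine_hard_negatives([0.1, 0.9, 0.4, 0.7], k=2)
```

Training on these samples concentrates each online update on the background regions that the tracker currently confuses with the target.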
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification.
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization… but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking: FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation…
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning. Taking machine learning to the next level
39
Recognition ImageNet Video
Kai Kang et al Object Detection in Videos with TubeLets and Multi-Context Cues (ILSVRC 2015) [video] [poster]
Optical Flow
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 40
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
ConvNets: Discussion
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on this page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding is available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution   Last deadline (as of 28/1/2016)
FI (Catalonia)        22/09/2015
FPU (Spain)           15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research: Image Classification (ChaLearn Workshop)
A. Salvador, M. Zeppelzauer, D. Manchon-Vizuete, A. Calafell-Orós, and X. Giró-i-Nieto, "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
Our current research: Saliency Prediction (LSUN Challenge)
J. Pan and X. Giró-i-Nieto, "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research: Sentiment Analysis
V. Campos, A. Salvador, B. Jou, and X. Giró-i-Nieto, "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research: Instance Search in Video
V.-T. Nguyen, D. Dinh-Le, A. Salvador, C. Zhu, D.-L. Nguyen, M.-T. Tran, T. Ngo Duc, D. Anh Duong, S. Satoh, and X. Giró-i-Nieto, "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, E. Mohedano, A. Salvador, Z. X. Zhang, M. Marsden, P. Wang, I. Jargalsaikhan, J. Antony, X. Giró-i-Nieto, S. Satoh, N. O'Connor, and A. F. Smeaton, "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Optical Flow
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow: Small vs Large
Optical Flow
Classic approach: Rigid matching of HoG or SIFT descriptors
Deep Matching: Allow each subpatch to move independently in a limited range depending on its size
Optical Flow: 2D correlation
Image
Sub-image
Offset of the sub-image with respect to the image: [0,0]
Source: Matlab R2015b documentation for normxcorr2, by MathWorks
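The 2D correlation used to locate a sub-image inside an image can be sketched in a few lines. This is a minimal illustration in plain Python lists, not the Matlab routine named in the slide: the function names `normalized_xcorr` and `best_offset` are hypothetical, only offsets where the sub-image fits entirely inside the image are scored, and flat windows are given a score of 0 by assumption.

```python
import math

def normalized_xcorr(template, image):
    """Normalized cross-correlation of `template` at every valid offset.

    Both arguments are 2D lists of numbers. Returns a 2D list of scores in
    [-1, 1], one per (row, col) offset of the template's top-left corner.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_zero = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(v * v for v in t_zero))

    scores = []
    for r in range(ih - th + 1):
        row_scores = []
        for c in range(iw - tw + 1):
            window = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            w_mean = sum(window) / len(window)
            w_zero = [v - w_mean for v in window]
            w_norm = math.sqrt(sum(v * v for v in w_zero))
            denom = t_norm * w_norm
            # Assumption: windows without any variation get a score of 0.
            dot = sum(a * b for a, b in zip(t_zero, w_zero))
            row_scores.append(dot / denom if denom else 0.0)
        scores.append(row_scores)
    return scores

def best_offset(scores):
    """Return the (row, col) offset with the highest correlation score."""
    positions = [(r, c) for r in range(len(scores)) for c in range(len(scores[0]))]
    return max(positions, key=lambda rc: scores[rc[0]][rc[1]])

image = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
sub_image = [[1, 2], [3, 9]]
scores = normalized_xcorr(sub_image, image)
print(best_offset(scores))  # (1, 1)
```

The peak of the score map gives the offset of the sub-image; Deep Matching reuses this idea with small patches of the reference image as templates.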
Optical Flow: Deep Matching
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
The most discriminative response map
The least discriminative response map
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-up extraction (BU)
Top-down matching (TD)
Optical Flow: Deep Matching (BU)
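One step of the bottom-up extraction can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's exact procedure: each quadrant's correlation map is max-pooled with a 3x3 window at stride 1 so that the quadrant may move by one pixel, and the parent map is the average of the four pooled maps read at the quadrant positions. Any subsampling or non-linearity used in the original method is left out, and the names `max_pool3`, `aggregate` and `peak_map` are hypothetical.

```python
def max_pool3(corr):
    """3x3 max-pooling with stride 1; borders use the available neighbours."""
    h, w = len(corr), len(corr[0])
    return [
        [
            max(
                corr[i][j]
                for i in range(max(0, r - 1), min(h, r + 2))
                for j in range(max(0, c - 1), min(w, c + 2))
            )
            for c in range(w)
        ]
        for r in range(h)
    ]

def aggregate(children, half):
    """Build the correlation map of a parent patch from its four quadrants.

    children: dict mapping a quadrant index (dr, dc), with dr, dc in {0, 1},
              to that quadrant's correlation map over the target image.
    half:     side of a child patch in pixels (2 when building 4x4 parents
              from 2x2 children, for example).
    """
    pooled = {quad: max_pool3(corr) for quad, corr in children.items()}
    first = next(iter(pooled.values()))
    h, w = len(first), len(first[0])
    parent = []
    for r in range(h - half):
        row = []
        for c in range(w - half):
            # Quadrant (dr, dc) sits `half` pixels further down / right.
            total = sum(
                corr[r + dr * half][c + dc * half]
                for (dr, dc), corr in pooled.items()
            )
            row.append(total / len(pooled))
        parent.append(row)
    return parent

def peak_map(size, peak):
    """size x size map of zeros with a single 1.0 at `peak`."""
    return [[1.0 if (r, c) == peak else 0.0 for c in range(size)] for r in range(size)]

children = {
    (0, 0): peak_map(6, (1, 2)),
    (0, 1): peak_map(6, (1, 4)),
    (1, 0): peak_map(6, (3, 2)),
    (1, 1): peak_map(6, (4, 4)),  # this quadrant moved one pixel down
}
parent = aggregate(children, half=2)
print(parent[1][2])  # 1.0
```

The parent still reaches the maximum score at (1, 2) although one quadrant is displaced by a pixel, which is the non-rigid tolerance that the pyramid is meant to provide. Repeating the step yields the maps for 8x8, 16x16 and 32x32 patches.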
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
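The top-down step can be sketched in the same simplified setting. Starting from a maximum of a parent map, the sketch looks one scale below and, for each quadrant, picks the best response within a small radius of the position a rigid shift would predict. The function name `backtrack`, the `radius` parameter and the toy data are assumptions made for illustration only.

```python
def backtrack(children, parent_pos, half, radius=1):
    """For a parent match at parent_pos, find where each quadrant matched best.

    children:   dict mapping a quadrant index (dr, dc) to its correlation map.
    parent_pos: (row, col) of the parent patch in the target image.
    half:       side of a child patch in pixels.
    Returns {quadrant: ((row, col), score)}.
    """
    r0, c0 = parent_pos
    result = {}
    for (dr, dc), corr in children.items():
        h, w = len(corr), len(corr[0])
        # Position of this quadrant if the parent moved rigidly.
        cr, cc = r0 + dr * half, c0 + dc * half
        candidates = [
            (i, j)
            for i in range(max(0, cr - radius), min(h, cr + radius + 1))
            for j in range(max(0, cc - radius), min(w, cc + radius + 1))
        ]
        best = max(candidates, key=lambda p: corr[p[0]][p[1]])
        result[(dr, dc)] = (best, corr[best[0]][best[1]])
    return result

def peak_map(size, peak):
    """size x size map of zeros with a single 1.0 at `peak`."""
    return [[1.0 if (r, c) == peak else 0.0 for c in range(size)] for r in range(size)]

children = {
    (0, 0): peak_map(6, (1, 2)),
    (0, 1): peak_map(6, (1, 4)),
    (1, 0): peak_map(6, (3, 2)),
    (1, 1): peak_map(6, (4, 4)),  # this quadrant moved one pixel down
}
matches = backtrack(children, parent_pos=(1, 2), half=2)
print(matches[(1, 1)])  # ((4, 4), 1.0)
```

Applying the same function recursively to each recovered quadrant would descend the pyramid until the smallest patches are reached, giving one correspondence per small patch.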
Optical Flow: Deep Matching
Figure: matching results for ground truth, dense HOG [Brox & Malik 2011] and Deep Matching
Optical Flow: FlowNet
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
End-to-end supervised learning of optical flow
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network
Option B: Create two separate, yet identical, processing streams for the two images and combine them at a later stage
Correlation layer: Convolution of data patches from the layers to combine
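A correlation layer of this kind can be sketched as follows. The sketch assumes single-pixel patches, so each score is the dot product of two feature vectors, and a small displacement window with zero padding at the borders; the function names, the `max_disp` parameter and the toy feature maps are hypothetical and chosen only for illustration.

```python
def correlation_layer(f1, f2, max_disp=1):
    """Correlate two feature maps over a window of displacements.

    f1, f2: feature maps as nested lists indexed [row][col][channel].
    Returns (out, disps): out[row][col][k] is the dot product between the
    feature vector of f1 at (row, col) and the one of f2 at
    (row + dy, col + dx), where (dy, dx) = disps[k]. Positions outside f2
    contribute 0.0 (zero padding).
    """
    h, w = len(f1), len(f1[0])
    disps = [
        (dy, dx)
        for dy in range(-max_disp, max_disp + 1)
        for dx in range(-max_disp, max_disp + 1)
    ]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            scores = []
            for dy, dx in disps:
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < h and 0 <= x2 < w:
                    scores.append(sum(a * b for a, b in zip(f1[y][x], f2[y2][x2])))
                else:
                    scores.append(0.0)
            row.append(scores)
        out.append(row)
    return out, disps

def feature_map(h, w, channels, at, vec):
    """All-zero h x w feature map with the vector `vec` placed at `at`."""
    fm = [[[0.0] * channels for _ in range(w)] for _ in range(h)]
    fm[at[0]][at[1]] = list(vec)
    return fm

# The same feature appears one pixel to the right in the second frame.
f1 = feature_map(3, 3, 2, at=(1, 1), vec=(1.0, 2.0))
f2 = feature_map(3, 3, 2, at=(1, 2), vec=(1.0, 2.0))
out, disps = correlation_layer(f1, f2, max_disp=1)
best = disps[max(range(len(disps)), key=lambda k: out[1][1][k])]
print(best)  # (0, 1)
```

Unlike a regular convolution, this layer has no trainable weights: the "filter" applied to one stream is the data from the other stream, and the output has one channel per candidate displacement.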
Optical Flow: FlowNet (expanding)
Upconvolutional layers: Unpooling of feature maps + convolution. Upconvolved feature maps are concatenated with the corresponding maps from the contracting part.
Optical Flow: FlowNet (data augmentation)
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
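The photometric part of such an augmentation can be sketched as below. It is a minimal illustration for grayscale frames with values in [0, 1]; the parameter ranges and the function name `augment_photometric` are assumptions, not values taken from the paper. Photometric changes leave the ground truth flow untouched, whereas geometric changes (translation, rotation, scaling) would also require transforming the flow field and are not shown.

```python
import random

def augment_photometric(img1, img2, rng):
    """Apply the same random gamma, contrast and brightness change to both
    frames, then additive Gaussian noise drawn independently per pixel.

    img1, img2: 2D lists of grayscale values in [0, 1].
    rng:        a random.Random instance, so that results are reproducible.
    """
    # Illustrative ranges only.
    gamma = rng.uniform(0.7, 1.5)
    contrast = rng.uniform(0.8, 1.2)
    brightness = rng.gauss(0.0, 0.1)
    noise_sigma = rng.uniform(0.0, 0.04)

    def transform(img):
        out = []
        for row in img:
            new_row = []
            for v in row:
                v = v ** gamma                               # gamma change
                v = (v - 0.5) * contrast + 0.5 + brightness  # contrast, brightness
                v += rng.gauss(0.0, noise_sigma)             # additive noise
                new_row.append(min(1.0, max(0.0, v)))        # keep values in [0, 1]
            out.append(new_row)
        return out

    return transform(img1), transform(img2)

rng = random.Random(0)
frame1 = [[0.2, 0.4], [0.6, 0.8]]
frame2 = [[0.4, 0.2], [0.8, 0.6]]
aug1, aug2 = augment_photometric(frame1, frame2, rng)
```

Sharing gamma, contrast and brightness across the two frames keeps the pair consistent, so the network still sees the same motion under a different appearance.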
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
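The split between shared and domain-specific layers can be sketched with a toy model. Everything here is an illustrative assumption: the class name `MultiDomainNet`, the single shared linear layer with ReLU, the layer sizes and the random initialization. Training is omitted; the sketch only shows which parameters are kept and which are replaced when moving to a test sequence.

```python
import random

class MultiDomainNet:
    """Shared layers plus one binary (background vs target) head per domain."""

    def __init__(self, n_inputs, n_features, n_domains, rng):
        def layer(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        self._new_layer = layer
        self.n_features = n_features
        # Shared across all training sequences (domains).
        self.shared = layer(n_features, n_inputs)
        # One domain-specific head per training sequence.
        self.heads = [layer(2, n_features) for _ in range(n_domains)]

    def features(self, x):
        """Shared layer: linear map followed by ReLU."""
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.shared]

    def scores(self, x, domain):
        """Return [background_score, target_score] using the head of `domain`."""
        f = self.features(x)
        return [sum(w * v for w, v in zip(row, f)) for row in self.heads[domain]]

    def reset_for_test_sequence(self):
        """Replace all training heads by a single new one; keep the shared layers."""
        self.heads = [self._new_layer(2, self.n_features)]

net = MultiDomainNet(n_inputs=4, n_features=3, n_domains=5, rng=random.Random(0))
shared_before = net.shared
net.reset_for_test_sequence()
print(len(net.heads))               # 1
print(net.shared is shared_before)  # True
```

The shared layers carry what is common to all sequences, while the new head is the part that has to be fitted to the target of the test sequence.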
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score.
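The selection rule behind hard negative mining fits in a couple of lines. The sample names and scores below are made up for illustration, and `mine_hard_negatives` is a hypothetical helper, not a function from the MDNet code.

```python
def mine_hard_negatives(negatives, positive_score, k):
    """Return the k negative samples that the classifier scores as most positive."""
    return sorted(negatives, key=positive_score, reverse=True)[:k]

# Hypothetical background candidates: (sample id, current positive score).
candidates = [("bg1", 0.05), ("bg2", 0.91), ("bg3", 0.40), ("bg4", 0.77)]
hard = mine_hard_negatives(candidates, positive_score=lambda s: s[1], k=2)
print([name for name, _ in hard])  # ['bg2', 'bg4']
```

Training on these samples focuses each online update on the background regions that are currently most likely to be confused with the target.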
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
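Since most feature maps are unrelated to the foreground, some selection is needed before tracking. The sketch below uses a simple stand-in heuristic, ranking maps by the fraction of their activation that falls inside the target box; this is not the selection procedure of the FCNT paper, and the function names and toy maps are assumptions for illustration.

```python
def foreground_ratio(fmap, box):
    """Fraction of a feature map's total activation that falls inside `box`.

    fmap: 2D list of non-negative activations.
    box:  (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    total = sum(sum(row) for row in fmap)
    inside = sum(sum(row[left:right]) for row in fmap[top:bottom])
    return inside / total if total else 0.0

def select_feature_maps(fmaps, box, k):
    """Indices of the k feature maps most concentrated on the target box."""
    ranked = sorted(
        range(len(fmaps)),
        key=lambda i: foreground_ratio(fmaps[i], box),
        reverse=True,
    )
    return ranked[:k]

target_box = (1, 1, 3, 3)
on_target = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
everywhere = [[1, 1, 1, 1] for _ in range(4)]
off_target = [[1, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 1]]
print(select_feature_maps([everywhere, on_target, off_target], target_box, k=2))  # [1, 0]
```

Only the selected maps would then be passed on to the tracking networks, which reduces both noise and computation.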
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to separate different objects of the same category.
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT Results
Optical Flow Small vs Large
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 41
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 42
Optical FlowClassic approachRigid matching of HoG or SIFT descriptors
Deep MatchingAllow each subpatch to move
independently in a limited range
depending on its size
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
ConvNets: Discussion
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
... and MANY MORE I am missing in the page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research: Image Classification
A. Salvador, M. Zeppelzauer, D. Manchon-Vizuete, A. Calafell-Orós and X. Giró-i-Nieto, "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research: Saliency Prediction
J. Pan and X. Giró-i-Nieto, "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Our current research: Sentiment Analysis
V. Campos, A. Salvador, B. Jou and X. Giró-i-Nieto, "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research: Instance Search in Video
V.-T. Nguyen, D.-D. Le, A. Salvador, C. Zhu, D.-L. Nguyen, M.-T. Tran, T. Ngo Duc, D. Anh Duong, S. Satoh and X. Giró-i-Nieto, "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, E. Mohedano, A. Salvador, Z. X. Zhang, M. Marsden, P. Wang, I. Jargalsaikhan, J. Antony, X. Giró-i-Nieto, S. Satoh, N. O'Connor and A. F. Smeaton, "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Optical Flow: Deep Matching
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Classic approach: rigid matching of HoG or SIFT descriptors.
Deep Matching: allow each subpatch to move independently in a limited range depending on its size.
Optical Flow: 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image: [0,0]
Source: Matlab R2015b documentation for normxcorr2, by MathWorks.
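The 2D correlation on this slide can be illustrated with a minimal sketch in plain Python. It assumes images are lists of rows of numbers and only scores positions where the sub-image fits entirely inside the image, so it is not a reimplementation of MATLAB's normxcorr2 (whose output is padded). The names `ncc_map` and `best_offset` are hypothetical, chosen for this example.

```python
import math

def ncc_map(template, image):
    """Normalized cross-correlation of `template` at every position where
    it fits entirely inside `image` (both are lists of rows)."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    scores = []
    for y in range(ih - th + 1):
        row_scores = []
        for x in range(iw - tw + 1):
            # Window of the image currently under the template.
            w_vals = [image[y + j][x + i] for j in range(th) for i in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            denom = t_norm * w_norm
            num = sum(a * b for a, b in zip(t_dev, w_dev))
            # Flat windows have zero variance; score them as 0.
            row_scores.append(num / denom if denom > 0 else 0.0)
        scores.append(row_scores)
    return scores

def best_offset(scores):
    """(row, column) of the highest score in the correlation map."""
    best = max((s, y, x) for y, row in enumerate(scores) for x, s in enumerate(row))
    return best[1], best[2]
```

The offset of the sub-image is then the position of the peak of the correlation map.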
Optical Flow: Deep Matching
Instead of pre-trained filters, a convolution is defined between each patch of the reference image and the target image. As a result, a correlation map is generated for each reference patch.
Figure panels: the most discriminative response map; the least discriminative response map.
Key idea: build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Pyramid levels: 4x4, 8x8, 16x16 and 32x32 patches.
Bottom-up extraction (BU)
Top-down matching (TD)
Optical Flow: Deep Matching (BU)
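One bottom-up step of the pyramid can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each child response map stores, at (y, x), the score of placing that child's top-left corner at (y, x) in the target image, and it omits the subsampling and non-linear rectification used between levels in the paper. The names `max_in_window` and `aggregate` are hypothetical.

```python
def max_in_window(resp, y, x, radius=1):
    """Maximum of `resp` in a (2*radius+1)^2 window centred at (y, x),
    ignoring positions that fall outside the map."""
    h, w = len(resp), len(resp[0])
    vals = [resp[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]
    return max(vals) if vals else 0.0

def aggregate(children):
    """Response map of a parent patch built from its four children.

    `children` maps each quadrant offset (dy, dx) to that child's response
    map. The parent score at (y, x) is the average, over the children, of
    the best child score near the position the child would occupy if the
    parent moved rigidly, so each child may shift by one pixel on its own.
    """
    any_map = next(iter(children.values()))
    h, w = len(any_map), len(any_map[0])
    parent = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for (dy, dx), resp in children.items():
                total += max_in_window(resp, y + dy, x + dx)
            parent[y][x] = total / len(children)
    return parent
```

Applying `aggregate` repeatedly (4x4 to 8x8, then 16x16, then 32x32) gives one response map per patch at every level of the pyramid.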
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shift of the sub-patches that generated it.
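A minimal sketch of the two ingredients of the top-down search, under stated assumptions: response maps are lists of rows, a local maximum is a value above a threshold that is not exceeded by any of its 8 neighbours, and each sub-patch is refined within a one-pixel radius. The names `local_maxima` and `refine_child` and the default values are hypothetical.

```python
def local_maxima(resp, threshold=0.0):
    """Positions whose score exceeds `threshold` and is not smaller than
    any of the (up to) 8 neighbouring scores."""
    h, w = len(resp), len(resp[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = resp[y][x]
            if v <= threshold:
                continue
            neighbours = [resp[j][i]
                          for j in range(max(0, y - 1), min(h, y + 2))
                          for i in range(max(0, x - 1), min(w, x + 2))
                          if (j, i) != (y, x)]
            if all(v >= n for n in neighbours):
                peaks.append((y, x))
    return peaks

def refine_child(child_resp, y, x, radius=1):
    """Best position of a sub-patch within `radius` of its expected
    position (y, x), read from the response map one scale below."""
    h, w = len(child_resp), len(child_resp[0])
    candidates = [(child_resp[j][i], j, i)
                  for j in range(max(0, y - radius), min(h, y + radius + 1))
                  for i in range(max(0, x - radius), min(w, x + radius + 1))]
    if not candidates:
        return None
    _, best_y, best_x = max(candidates)
    return best_y, best_x
```

Starting from each peak of the top level, `refine_child` would be called once per sub-patch and then recursively on the levels below until the 4x4 patches are reached.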
Figure panels: ground truth; dense HOG [Brox & Malik 2011]; Deep Matching.
Optical Flow: FlowNet
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: stack both input images together and feed them through a generic network.
Option B: create two separate, yet identical, processing streams for the two images and combine them at a later stage.
Correlation layer: convolution of data patches from the layers to combine.
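The correlation layer of Option B can be sketched as below. This is a simplified illustration that assumes feature maps stored as nested lists indexed [row][column][channel], compares single feature vectors (a 1x1 patch) and treats out-of-range neighbours as zero. The name `correlation_layer` and the `max_disp` parameter are hypothetical.

```python
def correlation_layer(f1, f2, max_disp=1):
    """Correlate every feature vector of `f1` with the feature vectors of
    `f2` found within `max_disp` pixels of the same location.

    Returns out[y][x], a list of (2*max_disp+1)**2 dot products ordered by
    displacement (dy, dx), with dy varying slowest.
    """
    h, w = len(f1), len(f1[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            scores = []
            for dy in range(-max_disp, max_disp + 1):
                for dx in range(-max_disp, max_disp + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        scores.append(sum(a * b for a, b in zip(f1[y][x], f2[yy][xx])))
                    else:
                        scores.append(0.0)  # displacement leaves the map
            row.append(scores)
        out.append(row)
    return out
```

Unlike a convolution with learned filters, this layer has no trainable weights: one stream's features act as the filters applied to the other stream.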
Optical Flow: FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution. Upconvolved feature maps are concatenated with the corresponding map from the contractive part.
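A minimal single-channel sketch of one expanding step, under stated assumptions: unpooling is approximated by nearest-neighbour 2x upsampling, the 3x3 kernel is passed in by hand (in the network it would be learned), and concatenation is represented as a list of channels. The names `unpool2x`, `conv3x3` and `upconv_block` are hypothetical.

```python
def unpool2x(fmap):
    """Nearest-neighbour 2x upsampling of a single-channel map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def conv3x3(fmap, kernel):
    """3x3 filtering with zero padding, so the output keeps the input size."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(3):
                for i in range(3):
                    yy, xx = y + j - 1, x + i - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[j][i] * fmap[yy][xx]
            out[y][x] = acc
    return out

def upconv_block(coarse, skip, kernel):
    """Unpool and filter `coarse`, then stack it with the same-resolution
    map `skip` coming from the contractive part."""
    up = conv3x3(unpool2x(coarse), kernel)
    assert len(up) == len(skip) and len(up[0]) == len(skip[0])
    return [up, skip]  # two channels at the finer resolution
```

The concatenation lets the finer layers combine coarse, high-level information with the local detail preserved in the contractive feature maps.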
Optical Flow: FlowNet, data augmentation
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
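The photometric part of the augmentation can be sketched as follows. This is an illustration only: the parameter ranges are assumptions, not the values used by the authors, images are grayscale lists of rows with values in [0, 1], and the geometric transformations are left out because they would also require updating the ground truth flow. The name `augment_pair` is hypothetical.

```python
import random

def augment_pair(img1, img2, rng):
    """Apply the same random contrast, brightness and gamma change to both
    frames of a pair, plus independent additive Gaussian noise per pixel."""
    gain = rng.uniform(0.8, 1.2)     # contrast (assumed range)
    bias = rng.uniform(-0.1, 0.1)    # brightness (assumed range)
    gamma = rng.uniform(0.7, 1.5)    # gamma (assumed range)
    sigma = rng.uniform(0.0, 0.04)   # noise standard deviation (assumed range)

    def transform(img):
        out = []
        for row in img:
            new_row = []
            for v in row:
                v = min(1.0, max(0.0, v * gain + bias))
                v = v ** gamma
                v = v + rng.gauss(0.0, sigma)
                new_row.append(min(1.0, max(0.0, v)))  # keep values in [0, 1]
            out.append(new_row)
        return out

    return transform(img1), transform(img2)
```

Sharing the contrast, brightness and gamma parameters between the two frames keeps their appearance consistent, so the flow between them is unchanged.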
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
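The multi-domain idea can be sketched with a toy model. Assumptions are loud here: the shared layers are replaced by a fixed element-wise weighting (in MDNet they are convolutional and trained too), each domain-specific layer is a single logistic unit, and the class and method names are hypothetical.

```python
import math

class MultiDomainNet:
    """Toy model with shared weights and one binary branch per sequence."""

    def __init__(self, n_features, domains):
        self.shared = [1.0] * n_features  # stand-in for the shared layers
        self.branches = {d: [0.0] * n_features for d in domains}

    def features(self, x):
        return [w * v for w, v in zip(self.shared, x)]

    def score(self, x, domain):
        """Target-versus-background probability given by one branch."""
        feats = self.features(x)
        z = sum(w * f for w, f in zip(self.branches[domain], feats))
        return 1.0 / (1.0 + math.exp(-z))

    def train_step(self, x, label, domain, lr=0.5):
        """Gradient step on the branch of `domain` only."""
        feats = self.features(x)
        err = label - self.score(x, domain)
        branch = self.branches[domain]
        for i, f in enumerate(feats):
            branch[i] += lr * err * f

    def start_tracking(self, name="test"):
        """Drop all training branches and attach a single fresh one."""
        self.branches = {name: [0.0] * len(self.shared)}
```

Only the shared part carries over from training to a new test sequence, which is why the branch created by `start_tracking` starts from scratch.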
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score.
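Hard negative mining as described above reduces to a ranking step. A minimal sketch, assuming the negative samples are any Python objects and `positive_score` is a caller-supplied function returning the current model's target score for a sample (both names are hypothetical):

```python
def hard_negatives(neg_samples, positive_score, k):
    """Return the k negative samples that the current model scores most
    positively, i.e. the ones it is most likely to confuse with the target."""
    ranked = sorted(neg_samples, key=positive_score, reverse=True)
    return ranked[:k]
```

For example, with background boxes scored 0.1, 0.9 and 0.6, asking for the two hardest negatives returns the boxes scored 0.9 and 0.6, and only those are used in the next update.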
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of the VGG-16 network pre-trained for ImageNet image classification.
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
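Since most feature maps do not respond to the target, a tracker can keep only the relevant ones. The sketch below uses a simple stand-in criterion chosen for illustration, not the selection procedure of Wang et al.: each map is ranked by the share of its squared activation that falls inside the target box. The names `foreground_ratio` and `select_maps` and the box convention are hypothetical.

```python
def foreground_ratio(fmap, box):
    """Share of the map's squared activation that falls inside `box`,
    given as (y0, x0, y1, x1) with y1 and x1 exclusive."""
    y0, x0, y1, x1 = box
    inside = 0.0
    total = 0.0
    for y, row in enumerate(fmap):
        for x, v in enumerate(row):
            energy = v * v
            total += energy
            if y0 <= y < y1 and x0 <= x < x1:
                inside += energy
    return inside / total if total > 0 else 0.0

def select_maps(fmaps, box, k):
    """Indices of the k feature maps most concentrated on the target box."""
    ranked = sorted(range(len(fmaps)),
                    key=lambda i: foreground_ratio(fmaps[i], box),
                    reverse=True)
    return ranked[:k]
```

Maps that fire mostly on the background get a ratio close to 0 and are discarded, which reduces both noise and computation in the later stages.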
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to tell apart different objects of the same category.
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
Neil Lawrence OpenAI wonrsquot benefit humanity without open data sharing (The Guardian 14122015)
Is Computer Vision solved
ConvNets Discussion
92
Sports Do you know them
93
ConvNets Do you know them
94
Antonio Torralba MIT(former UPC)
and MANY MORE I am missing in the page (apologies)
Oriol Vinyals Google(former UPC)
Jose M Aacutelvarez NICTA(former URL amp UAB)
Joan Bruna Berkeley(former UPC)
95
ConvNets Where you are studyingVisioCat dinner CVPR 2015
Considering a Phd at GPI-UPC Currently no direct funding available (check in the future)We can support your application to scholarships
External grant listings UPC UPF
Funding institution Last deadlines (on 2812016)
FI (Catalonia) 22092015
FPU (Spain) 15012016
Check our activity at httpsimatgeupceduweb 96
Image Classification
97
Our past research
A Salvador Zeppelzauer M Manchon-Vizuete D Calafell-Oroacutes A and Giroacute-i-Nieto X ldquoCultural Event Recognition with Visual ConvNets and Temporal Modelsrdquo in CVPR ChaLearn Looking at People Workshop 2015 2015 [slides]
ChaLearn Worshop
Saliency Prediction
J Pan and Giroacute-i-Nieto X ldquoEnd-to-end Convolutional Network for Saliency Predictionrdquo in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops Boston MA (USA) 2015 [Slides] 98
Our current research
LSUN Challenge
Sentiment Analysis
99
Our current research
[Slides]
CNN
V Campos Salvador A Jou B and Giroacute-i-Nieto X ldquoDiving Deep into Sentiment Understanding Fine-tuned CNNs for Visual Sentiment Predictionrdquo in 1st International Workshop on Affect and Sentiment in Multimedia Brisbane Australia 2015
Our current research
Instance Search in Video
100
V - T Nguyen -Dinh-Le D Salvador A -Zhu C Nguyen D - L Tran M - T Duc T Ngo Duong D Anh Satoh S ichi and Giroacute-i-Nieto X ldquoNII-HITACHI-UIT at TRECVID 2015 Instance Searchrdquo in TRECVID 2015 Workshop Gaithersburg MD USA 2015
K McGuinness Mohedano E Salvador A Zhang Z X Marsden M Wang P Jargalsaikhan I Antony J Giroacute-i-Nieto X Satoh S ichi OConnor N and Smeaton A F ldquoInsight DCU at TRECVID 2015rdquo in TRECVID 2015 Workshop Gaithersburg MD USA 2015
Thank you
Slides available on and
httpsimatgeupceduwebpeoplexavier-giro
httpbitsearchblogspotcom
httpstwittercomDocXavi
httpswwwfacebookcomProfessorXavi
xaviergiroupcedu
101
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 43
Optical Flow Deep Matching
Source Matlab R2015b documentation for normxcorr2 by Mathworks44
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 45
Instead of pre-trained filters a convolution is defined between each
patch of the reference image target image
as a results a correlation map is generated for each reference patch
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
Neil Lawrence OpenAI wonrsquot benefit humanity without open data sharing (The Guardian 14122015)
Is Computer Vision solved
ConvNets Discussion
92
Sports Do you know them
93
ConvNets Do you know them
94
Antonio Torralba MIT(former UPC)
and MANY MORE I am missing in the page (apologies)
Oriol Vinyals Google(former UPC)
Jose M Aacutelvarez NICTA(former URL amp UAB)
Joan Bruna Berkeley(former UPC)
95
ConvNets Where you are studyingVisioCat dinner CVPR 2015
Considering a Phd at GPI-UPC Currently no direct funding available (check in the future)We can support your application to scholarships
External grant listings UPC UPF
Funding institution Last deadlines (on 2812016)
FI (Catalonia) 22092015
FPU (Spain) 15012016
Check our activity at httpsimatgeupceduweb 96
Image Classification
97
Our past research
A Salvador Zeppelzauer M Manchon-Vizuete D Calafell-Oroacutes A and Giroacute-i-Nieto X ldquoCultural Event Recognition with Visual ConvNets and Temporal Modelsrdquo in CVPR ChaLearn Looking at People Workshop 2015 2015 [slides]
ChaLearn Worshop
Saliency Prediction
J Pan and Giroacute-i-Nieto X ldquoEnd-to-end Convolutional Network for Saliency Predictionrdquo in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops Boston MA (USA) 2015 [Slides] 98
Our current research
LSUN Challenge
Sentiment Analysis
99
Our current research
[Slides]
CNN
V Campos Salvador A Jou B and Giroacute-i-Nieto X ldquoDiving Deep into Sentiment Understanding Fine-tuned CNNs for Visual Sentiment Predictionrdquo in 1st International Workshop on Affect and Sentiment in Multimedia Brisbane Australia 2015
Our current research
Instance Search in Video
V.-T. Nguyen, Le, D.-D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S., and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Source: MATLAB R2015b documentation for normxcorr2 by MathWorks
Optical Flow 2D correlation
Image
Sub-Image
Offset of the sub-image with respect to the image [00]
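The idea on this slide can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `normalized_xcorr` is made up for this example, it scores only fully overlapping positions (MATLAB's normxcorr2 also returns partially overlapping ones), and offsets are 0-based.

```python
import numpy as np

def normalized_xcorr(image, template):
    """Normalized cross-correlation at every fully overlapping offset."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            window = image[y:y + th, x:x + tw]
            w = window - window.mean()
            denom = np.sqrt((w ** 2).sum()) * t_norm
            out[y, x] = (w * t).sum() / denom if denom > 0 else 0.0
    return out

rng = np.random.default_rng(0)
image = rng.random((20, 20))
sub_image = image[5:10, 7:12].copy()      # sub-image cut at row 5, column 7
scores = normalized_xcorr(image, sub_image)
offset = np.unravel_index(np.argmax(scores), scores.shape)
```

The position of the highest score is the estimated offset of the sub-image with respect to the image.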
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Instead of using pre-trained filters, a convolution is defined between each patch of the reference image and the target image.
As a result, a correlation map is generated for each reference patch.
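A minimal sketch of this step, assuming raw pixel intensities and non-overlapping 4x4 patches (Deep Matching itself works on richer descriptors; `patch_correlation_maps` is a hypothetical helper name). Each reference patch acts as the filter, and sliding it over the target image yields one correlation map per patch.

```python
import numpy as np

def patch_correlation_maps(reference, target, patch=4):
    """Return {(row, col): correlation map} for each reference patch."""
    th, tw = target.shape
    maps = {}
    for r in range(0, reference.shape[0] - patch + 1, patch):
        for c in range(0, reference.shape[1] - patch + 1, patch):
            # The reference patch plays the role of the convolution filter.
            kernel = reference[r:r + patch, c:c + patch]
            kernel = kernel / (np.linalg.norm(kernel) + 1e-12)
            cmap = np.zeros((th - patch + 1, tw - patch + 1))
            for y in range(cmap.shape[0]):
                for x in range(cmap.shape[1]):
                    window = target[y:y + patch, x:x + patch]
                    cmap[y, x] = (window * kernel).sum() / (np.linalg.norm(window) + 1e-12)
            maps[(r, c)] = cmap
    return maps

rng = np.random.default_rng(1)
reference = rng.random((8, 8))
# Target: the reference content moved 1 row down and 2 columns right.
target = np.roll(reference, shift=(1, 2), axis=(0, 1))
maps = patch_correlation_maps(reference, target)
best = np.unravel_index(np.argmax(maps[(0, 0)]), maps[(0, 0)].shape)
```

Here the map of the top-left reference patch peaks where that patch reappears in the target image.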
Optical Flow Deep Matching
The most discriminative response map
The least discriminative response map
Key idea: Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching (TD)
Bottom-up extraction (BU)
Optical Flow Deep Matching (BU)
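One level of the bottom-up pass could look roughly like the sketch below. It rests on assumptions not stated on the slides: each larger patch is split into four quadrants, every quadrant map is max-filtered so that a quadrant may move by one pixel, and the four maps are averaged at the positions implied by the quadrant layout. The subsampling and non-linear rectification of the original method are left out, and the function names are hypothetical.

```python
import numpy as np

def max_filter3(cmap):
    """3x3 max filter with edge padding: lets each child shift by one pixel."""
    padded = np.pad(cmap, 1, mode="edge")
    h, w = cmap.shape
    shifted = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.max(shifted, axis=0)

def aggregate(children, child_size):
    """Combine the maps of four quadrant patches into the parent patch map.

    children maps (qy, qx) in {0, 1}^2 to a correlation map indexed by the
    top-left position of that quadrant in the target image.
    """
    pooled = {q: max_filter3(m) for q, m in children.items()}
    h, w = next(iter(pooled.values())).shape
    ph, pw = h - child_size, w - child_size
    parent = np.zeros((ph, pw))
    for (qy, qx), m in pooled.items():
        oy, ox = qy * child_size, qx * child_size
        parent += m[oy:oy + ph, ox:ox + pw]
    return parent / len(pooled)

# Four 4x4 quadrants of one 8x8 patch, all agreeing on position (2, 3).
children = {}
for qy in range(2):
    for qx in range(2):
        m = np.zeros((9, 9))
        m[2 + 4 * qy, 3 + 4 * qx] = 1.0
        children[(qy, qx)] = m
# Move one quadrant's peak by one pixel: the parent response survives.
children[(1, 1)] = np.roll(children[(1, 1)], 1, axis=1)
parent = aggregate(children, child_size=4)
```

Repeating the same aggregation on the resulting maps gives the 16x16 and 32x32 levels of the pyramid.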
Optical Flow Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches.
If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
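The descent from a parent match to its sub-patches might be sketched as follows. The quadrant layout, the search radius of one pixel and the helper name `backtrack` are assumptions made for this illustration; the example uses the 8x8 to 4x4 step for brevity.

```python
import numpy as np

def backtrack(children, parent_pos, child_size, radius=1):
    """For a parent match at parent_pos, locate each quadrant one scale below."""
    py, px = parent_pos
    matches = {}
    for (qy, qx), cmap in children.items():
        cy, cx = py + qy * child_size, px + qx * child_size  # expected position
        y0, x0 = max(cy - radius, 0), max(cx - radius, 0)
        window = cmap[y0:cy + radius + 1, x0:cx + radius + 1]
        dy, dx = np.unravel_index(np.argmax(window), window.shape)
        matches[(qy, qx)] = (int(y0 + dy), int(x0 + dx))
    return matches

# Child correlation maps: three quadrants match rigidly, one is off by a pixel.
children = {}
for qy in range(2):
    for qx in range(2):
        m = np.zeros((9, 9))
        m[2 + 4 * qy, 3 + 4 * qx] = 1.0
        children[(qy, qx)] = m
children[(1, 1)] = np.roll(children[(1, 1)], 1, axis=1)

matches = backtrack(children, parent_pos=(2, 3), child_size=4)
```

Applying the same step recursively down to the 4x4 level gives one correspondence per smallest patch, which is how a single large-patch match turns into a set of locally deformed matches.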
Optical Flow Deep Matching
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Optical Flow FlowNet
End-to-end supervised learning of optical flow.
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate yet identical processing streams for the two images and combine them at a later stage.
Correlation layer: convolution between data patches taken from the two layers being combined.
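The two ways of combining the inputs can be contrasted in a short NumPy sketch. It assumes (C, H, W) arrays, a correlation over single-pixel patches, zero padding at the borders and division by the number of channels; the function names and the displacement range are illustrative choices, not the paper's settings.

```python
import numpy as np

def stack_inputs(img1, img2):
    """Option A: concatenate two (C, H, W) images along the channel axis."""
    return np.concatenate([img1, img2], axis=0)

def correlation_layer(f1, f2, max_disp=2):
    """Option B: correlate (C, H, W) feature maps over a displacement window.

    Output shape is ((2 * max_disp + 1) ** 2, H, W): one channel per displacement.
    """
    c, h, w = f1.shape
    padded = np.pad(f2, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    out = []
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            # shifted[:, y, x] equals f2[:, y + dy, x + dx] (zero outside the map).
            shifted = padded[:, max_disp + dy:max_disp + dy + h,
                             max_disp + dx:max_disp + dx + w]
            out.append((f1 * shifted).sum(axis=0) / c)
    return np.stack(out)

rng = np.random.default_rng(0)
f1 = rng.standard_normal((3, 6, 6))
f1 /= np.linalg.norm(f1, axis=0, keepdims=True)   # unit-norm feature per pixel
f2 = np.roll(f1, shift=(1, -1), axis=(1, 2))      # content moves 1 row down, 1 column left
stacked = stack_inputs(f1, f2)
corr = correlation_layer(f1, f2, max_disp=2)
best = int(np.argmax(corr[:, 2, 3]))
dy, dx = best // 5 - 2, best % 5 - 2
```

With Option A the network has to discover the matching itself from the stacked channels, whereas the correlation layer hands it an explicit matching score for every candidate displacement.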
Optical Flow FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding feature map from the contracting part.
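A rough sketch of one expanding step, assuming nearest-neighbour 2x unpooling, a 3x3 zero-padded convolution and (C, H, W) arrays. The function names and the random weights are placeholders for illustration only.

```python
import numpy as np

def unpool2x(x):
    """Nearest-neighbour 2x unpooling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv3x3(x, weights):
    """Zero-padded 3x3 convolution: (C_in, H, W) -> (C_out, H, W)."""
    c_out = weights.shape[0]
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, w))
    for ky in range(3):
        for kx in range(3):
            patch = padded[:, ky:ky + h, kx:kx + w]  # input seen through one kernel tap
            out += np.einsum("oi,ihw->ohw", weights[:, :, ky, kx], patch)
    return out

def upconv_block(coarse, skip, weights):
    """Unpool + convolve the coarse map, then concatenate the skip connection."""
    up = conv3x3(unpool2x(coarse), weights)
    return np.concatenate([up, skip], axis=0)

rng = np.random.default_rng(0)
coarse = rng.random((8, 4, 4))                    # deep, low-resolution features
skip = rng.random((4, 8, 8))                      # features from the contracting part
weights = rng.normal(0.0, 0.1, size=(6, 8, 3, 3))
merged = upconv_block(coarse, skip, weights)      # shape (6 + 4, 8, 8)
```

The concatenation lets the finer layers reuse high-resolution detail that pooling discarded on the way down.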
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
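As an illustration of the photometric part of this augmentation, the sketch below perturbs a pair of grayscale frames with values in [0, 1]. The parameter ranges are hypothetical and `augment_pair` is a made-up name. Geometric transformations are left out because they would also require transforming the ground-truth flow.

```python
import numpy as np

def augment_pair(img1, img2, rng):
    """Photometric augmentation applied to a pair of frames with values in [0, 1]."""
    # One random draw per pair, so both frames change consistently.
    gamma = rng.uniform(0.7, 1.5)       # hypothetical ranges
    brightness = rng.normal(0.0, 0.2)
    contrast = rng.uniform(0.8, 1.4)

    def transform(img):
        out = np.clip(img, 0.0, 1.0) ** gamma
        out = (out - out.mean()) * contrast + out.mean() + brightness
        out = out + rng.normal(0.0, 0.04, size=img.shape)  # additive Gaussian noise
        return np.clip(out, 0.0, 1.0)

    return transform(img1), transform(img2)

rng = np.random.default_rng(0)
frame1 = rng.random((8, 8))
frame2 = rng.random((8, 8))
aug1, aug2 = augment_pair(frame1, frame2, rng)
```

Because only pixel values change, the flow field that links the two frames stays valid for the augmented pair.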
Data augmentation
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but they are replaced by a single new layer at test time.
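The branching structure can be pictured with a toy model: one shared layer followed by one binary (target vs. background) branch per training sequence. This is a hedged sketch with made-up sizes and a hypothetical class name, and it contains no training loop.

```python
import numpy as np

class MultiDomainNet:
    """Shared feature layer followed by one binary branch per domain (sequence)."""

    def __init__(self, in_dim, feat_dim, num_domains, rng):
        self.rng = rng
        self.feat_dim = feat_dim
        self.shared = rng.normal(0.0, 0.1, size=(in_dim, feat_dim))
        self.branches = [self._new_branch() for _ in range(num_domains)]

    def _new_branch(self):
        return self.rng.normal(0.0, 0.1, size=(self.feat_dim, 2))

    def forward(self, x, domain):
        features = np.maximum(x @ self.shared, 0.0)      # shared layers (ReLU)
        logits = features @ self.branches[domain]        # domain-specific layer
        exp = np.exp(logits - logits.max(axis=1, keepdims=True))
        return exp / exp.sum(axis=1, keepdims=True)      # softmax over 2 classes

    def reset_for_test(self):
        """Drop the training branches and keep a single fresh one for the new sequence."""
        self.branches = [self._new_branch()]

rng = np.random.default_rng(0)
net = MultiDomainNet(in_dim=16, feat_dim=8, num_domains=5, rng=rng)
probs = net.forward(rng.random((4, 16)), domain=3)
shared_before = net.shared.copy()
net.reset_for_test()
```

Only the branches are discarded; the shared weights learned across all training sequences carry over to the test sequence.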
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive scores.
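The selection rule itself is a simple ranking. A minimal sketch, assuming the positive-class scores of the candidate negatives are already available (the scores below are invented for the example):

```python
import numpy as np

def hard_negatives(positive_scores, num_keep):
    """Indices of the negative samples the classifier scores most like the target."""
    order = np.argsort(positive_scores)[::-1]   # highest positive score first
    return order[:num_keep]

scores = np.array([0.05, 0.80, 0.30, 0.65, 0.10])  # positive-class scores of 5 negatives
selected = hard_negatives(scores, num_keep=2)
```

Training on these confusing negatives, rather than on easy background samples, focuses each update on the distractors the tracker is most likely to drift to.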
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification.
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
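To make the observation concrete, the sketch below ranks feature maps by how much of their activation falls inside the target box. This is a stand-in heuristic chosen for illustration, not the selection procedure used in FCNT. It assumes non-negative (post-ReLU) activations of shape (C, H, W), and `select_feature_maps` is a hypothetical name.

```python
import numpy as np

def select_feature_maps(features, box, num_keep):
    """Rank (C, H, W) feature maps by the share of activation inside the target box."""
    y0, x0, y1, x1 = box
    inside = features[:, y0:y1, x0:x1].sum(axis=(1, 2))
    total = features.sum(axis=(1, 2)) + 1e-12
    ratio = inside / total
    return np.argsort(ratio)[::-1][:num_keep]

features = np.zeros((4, 8, 8))
features[0] = 1.0                  # fires everywhere: unrelated to the target
features[1, 0, 0] = 1.0            # fires only in the background
features[2, 2:5, 2:5] = 1.0        # fires only on the target
features[3, 2:5, 2:5] = 1.0        # fires on the target...
features[3, 6, :] = 1.0            # ...and on a background row
kept = select_feature_maps(features, box=(2, 2, 5, 5), num_keep=2)
```

Keeping only the highest-ranked maps discards channels that respond mostly to background.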
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015.
[Zhou et al., ICLR 2015] "Object detectors emerge in deep scene CNNs" [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets Learn more
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets Learn more
ConvNets Learn more
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
ConvNets Learn more
Barcelona Convolucionada
Deep Learning a l'abast de tothom
Monday, February 1, 7pm, FIB, Campus Nord UPC
ConvNets Learn more
Grup d'estudi de machine learning Barcelona
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
ConvNets Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
ConvNets Learn more
"Machine learning" sub-Reddit
ConvNets Learn more
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary/hour    Avg. salary/month
Yahoo        $43                 ($43 x 160 = $6,880)
Apple        $37                 ($37 x 160 = $5,920)
Google       $29.54-$31.32       $7,151
Facebook     $22.92              $6,150-$7,378
Microsoft    $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
ConvNets Learn more
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 46
Optical Flow Deep Matching
The most discriminative response map
The less discriminative
response map
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 47
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)Bottom-upextraction
(BU)
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches.
If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
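The bottom-up and top-down passes can be illustrated with a deliberately tiny sketch. It is not the DeepFlow implementation: it assumes 1-D signals instead of images, only two pyramid levels (two 4-sample atomic patches forming one 8-sample patch), plain averaging as the aggregation rule, and made-up toy data. All function names are hypothetical.

```python
def correlation_map(patch, signal):
    """Dot product of `patch` with every same-length window of `signal`."""
    n = len(patch)
    return [sum(p * s for p, s in zip(patch, signal[i:i + n]))
            for i in range(len(signal) - n + 1)]


def aggregate(left_map, right_map, child_size):
    """Bottom-up step: a parent patch placed at i has its left child at i
    and its right child at i + child_size, so average those two responses."""
    return [(left_map[i] + right_map[i + child_size]) / 2.0
            for i in range(len(right_map) - child_size)]


def argmax(values, lo=0, hi=None):
    """Index of the largest value in values[lo:hi]."""
    hi = len(values) if hi is None else hi
    return max(range(lo, hi), key=lambda i: values[i])


def top_down(parent_map, left_map, right_map, child_size, radius=1):
    """Top-down step: take the best parent shift, then refine each child
    within `radius` positions of where the parent expects it."""
    best = argmax(parent_map)

    def refine(child_map, centre):
        lo = max(0, centre - radius)
        hi = min(len(child_map), centre + radius + 1)
        return argmax(child_map, lo, hi)

    return best, refine(left_map, best), refine(right_map, best + child_size)


# Toy 1-D data (assumed): an 8-sample source patch hidden at offset 3.
source = [1, -1, -1, 1, 1, 1, -1, 1]
target = [-1, -1, 1] + source + [-1, 1, -1]

left_map = correlation_map(source[:4], target)    # first atomic patch
right_map = correlation_map(source[4:], target)   # second atomic patch
parent_map = aggregate(left_map, right_map, child_size=4)
match = top_down(parent_map, left_map, right_map, child_size=4)
print(match)  # (parent shift, left child shift, right child shift)
```

The search only inspects child responses near the position implied by the parent maximum, which is what makes the top-down pass cheap compared with scanning every child map in full.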
Figure: matching results compared for ground truth, dense HOG [Brox & Malik 2011] and Deep Matching.
Optical Flow: FlowNet
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., & Brox, T. (2015). FlowNet: Learning Optical Flow with Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate, yet identical, processing streams for the two images and combine them at a later stage.
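Correlation layer: convolves patches of one stream's feature map with patches of the other's (data with data, rather than data with a learned filter) to combine the two streams.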
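A minimal sketch of such a correlation between two feature maps follows. It assumes 1x1 patches (a single feature vector per position), a small displacement range `max_disp`, zero scores outside the image, and nested Python lists as the data layout; these choices and all names are illustrative, not taken from the FlowNet code.

```python
def correlation_layer(f1, f2, max_disp=1):
    """For every position in f1, score its feature vector against the
    feature vectors of f2 at every displacement up to max_disp.

    f1, f2: feature maps indexed as f[y][x] -> list of channel values.
    Returns (scores, shifts) where scores[y][x][k] is the dot product for
    displacement shifts[k] = (dy, dx). There are no trainable weights.
    """
    height, width = len(f1), len(f1[0])
    shifts = [(dy, dx)
              for dy in range(-max_disp, max_disp + 1)
              for dx in range(-max_disp, max_disp + 1)]
    scores = []
    for y in range(height):
        row = []
        for x in range(width):
            cell = []
            for dy, dx in shifts:
                yy, xx = y + dy, x + dx
                if 0 <= yy < height and 0 <= xx < width:
                    cell.append(sum(a * b for a, b in zip(f1[y][x], f2[yy][xx])))
                else:
                    cell.append(0.0)  # displacement falls outside the map
            row.append(cell)
        scores.append(row)
    return scores, shifts


# Toy feature maps (assumed): f2 is f1 moved one pixel to the right.
f1 = [[[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]]
f2 = [[[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]]

scores, shifts = correlation_layer(f1, f2, max_disp=1)
best_shifts = [shifts[max(range(len(shifts)), key=lambda k: scores[0][x][k])]
               for x in range(3)]
print(best_shifts)  # strongest displacement (dy, dx) for the first three pixels
```

The output has one channel per candidate displacement, so later layers can read the likely motion of each position directly from the channel with the highest response.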
Optical Flow: FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution.
Upconvolved feature maps are concatenated with the corresponding feature map from the contracting part.
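One expanding step can be sketched as follows. This is an illustration under stated assumptions, not FlowNet's layer: unpooling is done by nearest-neighbour repetition, the same 3x3 kernel is applied to each channel independently (a real layer mixes channels with learned weights), and feature maps are lists of 2-D channels. All names are hypothetical.

```python
def unpool2x(channel):
    """Nearest-neighbour unpooling: repeat every value in a 2x2 block."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def conv3x3(channel, kernel):
    """3x3 filtering of one channel with zero padding, output same size
    (cross-correlation, as in most deep learning frameworks)."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        out[y][x] += kernel[ky][kx] * channel[yy][xx]
    return out


def upconv_and_concat(coarse, kernel, skip):
    """Unpool and filter each coarse channel, then append the channels of
    the same-resolution map from the contracting part."""
    upsampled = [conv3x3(unpool2x(c), kernel) for c in coarse]
    return upsampled + skip


identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # toy kernel (assumed)
coarse = [[[1.0, 2.0]]]                        # 1 channel, 1x2
skip = [[[0.5] * 4 for _ in range(2)]]         # 1 channel, 2x4
merged = upconv_and_concat(coarse, identity, skip)
print(len(merged), len(merged[0]), len(merged[0][0]))  # channels, height, width
```

Concatenating the skip channels gives the next layer access to both the coarse, high-level estimate and the finer spatial detail kept by the contracting part.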
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented.
Data augmentation: translation, rotation and scaling transformations, additive Gaussian noise, and changes in brightness, contrast, gamma and color.
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
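The photometric part of such an augmentation can be sketched as below. The parameter ranges, the grayscale images stored as nested lists with values in [0, 1], and the function name are all assumptions made for illustration; they are not the values used to train FlowNet. Photometric changes leave the ground-truth flow untouched, whereas geometric ones (translation, rotation, scaling) would also have to be applied to the flow field, which this sketch does not cover.

```python
import random


def augment_pair(img1, img2, rng):
    """Apply one random brightness/contrast/gamma change to both frames,
    plus independent additive Gaussian noise per pixel."""
    contrast = rng.uniform(0.8, 1.2)     # assumed range
    brightness = rng.uniform(-0.1, 0.1)  # assumed range
    gamma = rng.uniform(0.7, 1.5)        # assumed range
    sigma = rng.uniform(0.0, 0.04)       # assumed noise level

    def clip(v):
        return min(1.0, max(0.0, v))

    def transform(img):
        out = []
        for row in img:
            new_row = []
            for v in row:
                v = clip(contrast * v + brightness)
                v = v ** gamma
                v = clip(v + rng.gauss(0.0, sigma))
                new_row.append(v)
            out.append(new_row)
        return out

    return transform(img1), transform(img2)


rng = random.Random(0)
frame1 = [[0.0, 0.25, 0.5], [0.75, 1.0, 0.5]]
frame2 = [[0.25, 0.5, 0.75], [1.0, 0.5, 0.0]]
aug1, aug2 = augment_pair(frame1, frame2, rng)
```

Sharing the brightness, contrast and gamma parameters across the two frames keeps the pair photometrically consistent, so the network is not asked to explain a global intensity change as motion.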
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but are replaced by a single one at test time.
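The structure described above can be sketched with a toy network: one shared feature layer and one two-output branch per training sequence. Layer sizes, random initialisation and all names are assumptions for illustration; training itself is omitted, and this is not the MDNet code.

```python
import random


class MultiDomainNet:
    """Shared feature layer plus one binary (target vs. background)
    output branch per training sequence (domain)."""

    def __init__(self, num_domains, in_dim, feat_dim, seed=0):
        self.rng = random.Random(seed)
        self.feat_dim = feat_dim
        self.shared = [[self.rng.uniform(-1, 1) for _ in range(in_dim)]
                       for _ in range(feat_dim)]
        self.branches = [self._new_branch() for _ in range(num_domains)]

    def _new_branch(self):
        # Two output units: index 0 = target score, index 1 = background score.
        return [[self.rng.uniform(-1, 1) for _ in range(self.feat_dim)]
                for _ in range(2)]

    def features(self, x):
        # Shared layer with ReLU, used by every domain.
        return [max(0.0, sum(w * v for w, v in zip(row, x)))
                for row in self.shared]

    def scores(self, x, domain):
        feats = self.features(x)
        return [sum(w * f for w, f in zip(row, feats))
                for row in self.branches[domain]]

    def start_tracking(self):
        """Test time: drop the training branches, keep the shared layer,
        and start from a single freshly initialised branch."""
        self.branches = [self._new_branch()]


net = MultiDomainNet(num_domains=3, in_dim=4, feat_dim=5)
shared_before = [row[:] for row in net.shared]
net.start_tracking()
new_scores = net.scores([0.5, -0.2, 0.1, 0.9], domain=0)
```

Only the shared weights carry over to the new sequence, so what transfers is the domain-independent representation rather than any sequence-specific notion of target and background.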
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score.
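The selection rule itself is short enough to sketch. The sample names, their scores and the function name below are made up for illustration; in a tracker the scores would come from the network's target output.

```python
def mine_hard_negatives(negatives, positive_score, num_keep):
    """Keep the num_keep negative samples that the current model scores
    most strongly as positive, i.e. the ones it is most wrong about."""
    return sorted(negatives, key=positive_score, reverse=True)[:num_keep]


# Hypothetical positive scores assigned by the model to four negatives.
toy_scores = {
    "sky": 0.05,
    "similar_face": 0.85,
    "wall": 0.10,
    "half_target": 0.60,
}
hard = mine_hard_negatives(list(toy_scores), toy_scores.get, num_keep=2)
print(hard)
```

Training on these samples concentrates each update on distractors that resemble the target, instead of on background patches the classifier already rejects.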
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015.
Focus on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification.
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
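One simple way to see which feature maps relate to the foreground is to measure how much of each map's activation falls inside the target box. The sketch below does exactly that; it is a stand-in heuristic chosen for illustration and is not the selection procedure used by FCNT. It assumes non-negative (post-ReLU) activations, a box given as (top, left, bottom, right) with exclusive bottom and right, and hypothetical function names.

```python
def foreground_ratio(feature_map, box):
    """Fraction of a map's total activation that lies inside the box."""
    top, left, bottom, right = box
    total = sum(sum(row) for row in feature_map)
    if total == 0:
        return 0.0
    inside = sum(sum(row[left:right]) for row in feature_map[top:bottom])
    return inside / total


def select_maps(feature_maps, box, num_keep):
    """Indices of the num_keep maps most concentrated on the foreground."""
    ranked = sorted(range(len(feature_maps)),
                    key=lambda i: foreground_ratio(feature_maps[i], box),
                    reverse=True)
    return ranked[:num_keep]


# Three toy 3x3 feature maps; the target occupies rows 1-2, columns 1-2.
maps = [
    [[0, 0, 0], [0, 2, 1], [0, 1, 2]],   # fires on the target
    [[3, 1, 0], [1, 0, 0], [0, 0, 0]],   # fires on the background
    [[1, 0, 0], [0, 1, 0], [0, 0, 0]],   # fires on both
]
target_box = (1, 1, 3, 3)
kept = select_maps(maps, target_box, num_keep=2)
print(kept)
```

Discarding maps that respond mostly outside the box removes noisy channels before the tracker builds its target model.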
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm, D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning, taking machine learning to the next level
ReadCV seminar: friendly reviews of SoA papers. Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord, UPC
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course: Deep Learning for Computer Vision (2.5 ECTS for MSc & PhD), July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" subreddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default).
Company   | Avg. salary per hour | Avg. salary per month
Yahoo     | $43                  | ($43 x 160 = $6,880)
Apple     | $37                  | ($37 x 160 = $5,920)
Google    | $29.54-$31.32        | $7,151
Facebook  | $22.92               | $6,150-$7,378
Microsoft | $22.63               | $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included).
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC.
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014.
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014.
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015).
ConvNets: Discussion
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
... and MANY MORE I am missing in the page (apologies).
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution | Last deadlines (on 28/1/2016)
FI (Catalonia)      | 22/09/2015
FPU (Spain)         | 15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research: Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research: Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Our current research: Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research: Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S.'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S.'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you!
Slides available online.
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 48
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Bottom-upextraction
(BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 49
Optical Flow Deep Matching (BU)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow: FlowNet
Since existing ground-truth datasets are not large enough to train a ConvNet, a synthetic Flying Chairs dataset is generated and augmented.
Data augmentation: translation, rotation and scaling transformations, additive Gaussian noise, and changes in brightness, contrast, gamma and color.
ConvNets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI.
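Some of the photometric augmentations listed for FlowNet can be sketched on a grayscale image with values in [0, 1]. The parameter ranges below (`max_brightness`, `gamma_range`, `noise_sigma`) are assumptions chosen for illustration, not the values used by the authors.

```python
import random


def augment(image, rng, max_brightness=0.2, gamma_range=(0.7, 1.5), noise_sigma=0.04):
    """Apply a brightness shift, a gamma change and additive Gaussian noise."""
    brightness = rng.uniform(-max_brightness, max_brightness)
    gamma = rng.uniform(*gamma_range)

    def clip(v):
        return min(1.0, max(0.0, v))

    out = []
    for row in image:
        new_row = []
        for value in row:
            value = clip(value + brightness)                   # brightness change
            value = value ** gamma                             # gamma change
            value = clip(value + rng.gauss(0.0, noise_sigma))  # additive noise
            new_row.append(value)
        out.append(new_row)
    return out


rng = random.Random(0)             # fixed seed so the sketch is repeatable
image = [[0.0, 0.25], [0.5, 1.0]]  # toy 2 x 2 grayscale image
augmented = augment(image, rng)
```

Photometric changes like these leave the ground-truth flow untouched. Geometric ones (translation, rotation, scaling) would also have to be applied to the flow field, which this sketch does not cover.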
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking: MDNet (Architecture)
Domain-specific layers are used during training, one for each sequence, but they are replaced by a single one at test time.
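The split between shared and domain-specific layers can be sketched as a data structure. This is a structural illustration only, assuming one shared linear layer with ReLU and one linear scoring head per training sequence; the class name, layer sizes and random initialization are hypothetical, and no training is performed.

```python
import random


class MultiDomainNet:
    """Shared feature layer plus one target-vs-background head per domain."""

    def __init__(self, n_inputs, n_features, domains, seed=0):
        self.rng = random.Random(seed)
        self.n_features = n_features
        self.shared = [[self.rng.uniform(-1, 1) for _ in range(n_inputs)]
                       for _ in range(n_features)]
        self.heads = {d: self._new_head() for d in domains}

    def _new_head(self):
        return [self.rng.uniform(-1, 1) for _ in range(self.n_features)]

    def features(self, x):
        # Shared layer followed by a ReLU.
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.shared]

    def score(self, x, domain):
        # Only the head of the requested domain is used.
        return sum(w * f for w, f in zip(self.heads[domain], self.features(x)))

    def start_tracking(self, name="test"):
        """Drop the training heads and attach a single fresh head."""
        self.heads = {name: self._new_head()}


net = MultiDomainNet(n_inputs=4, n_features=3, domains=["seq1", "seq2", "seq3"])
shared_before = [row[:] for row in net.shared]
net.start_tracking()
test_score = net.score([1.0, 0.0, 0.0, 1.0], "test")
```

The shared layers survive the switch to test time, while everything specific to a training sequence is discarded.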
Object tracking: MDNet (Online update)
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive scores.
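Hard negative mining reduces to a top-k selection over the negative samples. A minimal sketch, assuming candidate boxes are (x, y, w, h) tuples and that a scoring function returns the positive score assigned by the current model; the boxes and scores below are made up.

```python
def hard_negatives(negatives, positive_score, k):
    """Return the k negative samples that the current model scores highest."""
    return sorted(negatives, key=positive_score, reverse=True)[:k]


# Hypothetical background boxes with made-up positive scores.
scores = {
    (10, 10, 40, 40): 0.05,
    (12, 11, 40, 40): 0.81,
    (90, 30, 40, 40): 0.40,
    (50, 60, 40, 40): 0.93,
}
mined = hard_negatives(list(scores), scores.get, k=2)
```

The selected samples are the background boxes the model currently confuses with the target, so they are the most informative ones for the next update.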
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127, 2015. [code]
FCNT focuses on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification.
[Figure panels: conv4-3, conv5-3]
Object tracking: FCNT (Specialization)
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
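One way to picture why only a few feature maps matter is to rank them by how much of their activation falls on the target. The ratio below is a hypothetical stand-in used for illustration; the slides do not describe the selection procedure of FCNT, and this sketch should not be read as that procedure. The box format (top, left, bottom, right) is also an assumption.

```python
def foreground_ratio(fmap, box):
    """Share of a feature map's activation energy that falls inside the box."""
    top, left, bottom, right = box
    inside = total = 0.0
    for y, row in enumerate(fmap):
        for x, value in enumerate(row):
            energy = value * value
            total += energy
            if top <= y < bottom and left <= x < right:
                inside += energy
    return inside / total if total > 0 else 0.0


def select_maps(fmaps, box, k):
    """Indices of the k maps whose activation is most concentrated on the target."""
    ranked = sorted(range(len(fmaps)),
                    key=lambda i: foreground_ratio(fmaps[i], box),
                    reverse=True)
    return ranked[:k]


# Three toy 3 x 3 maps; the target occupies the centre cell.
on_target = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
everywhere = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
background = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
target_box = (1, 1, 2, 2)

selected = select_maps([everywhere, on_target, background], target_box, k=2)
```

Maps that respond only to background score zero and are ranked last, which is the kind of map the slide describes as unrelated to the foreground.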
Object tracking: FCNT (Localization)
Although trained for image classification, the feature maps in conv5-3 enable object localization, but they are not discriminative enough to separate different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking: FCNT (Localization)
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation.
[Figure panels: conv4-3, conv5-3]
Object tracking: FCNT (Architecture)
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT (Results)
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm, D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning, "Taking machine learning to the next level"
ReadCV seminar: friendly reviews of SoA papers. Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course: Deep Learning for Computer Vision (2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default).
Company      Avg. salary per hour    Avg. salary per month
Yahoo        $43                     ($43 x 160 = $6,880)
Apple        $37                     ($37 x 160 = $5,920)
Google       $29.54-$31.32           $7,151
Facebook     $22.92                  $6,150-$7,378
Microsoft    $22.63                  $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included).
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
ConvNets: Discussion
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on this page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding is available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research: Image Classification (ChaLearn Workshop)
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
Our current research: Saliency Prediction (LSUN Challenge)
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research: Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research: Instance Search in Video
V.-T. Nguyen, Le, D.-D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Ngo Duc, T., Duong, D. A., Satoh, S., and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow: Deep Matching (BU)
Key idea: build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search.
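One bottom-up step of such a pyramid can be sketched in 1D. This is a deliberately reduced illustration, assuming 1D signals instead of images, unnormalized dot products as the correlation, and a parent patch built from two children that may each move by one sample; the function names are hypothetical and the real method works on 2D patches with further processing.

```python
def correlation_map(patch, target):
    """Correlation of a small patch with every position of the target signal."""
    n = len(patch)
    return [sum(a * b for a, b in zip(patch, target[p:p + n]))
            for p in range(len(target) - n + 1)]


def max_pool(values, radius=1):
    """Best response within `radius` of each position (tolerates small shifts)."""
    return [max(values[max(0, p - radius):p + radius + 1])
            for p in range(len(values))]


def parent_map(left, right, child_size, radius=1):
    """Correlation map of a parent patch made of two adjacent child patches."""
    pooled_left = max_pool(left, radius)
    pooled_right = max_pool(right, radius)
    return [0.5 * (pooled_left[p] + pooled_right[p + child_size])
            for p in range(len(left) - child_size)]


reference = [1, -1, -1, 1]                 # patch to find
target = [1, 1, 1, 1, -1, -1, 1, 1, 1]     # contains the patch at offset 3

left = correlation_map(reference[:2], target)    # bottom level, first half
right = correlation_map(reference[2:], target)   # bottom level, second half
parent = parent_map(left, right, child_size=2)   # one level up
```

Repeating the same aggregation on the parent maps would give the next level, each time doubling the patch size while reusing the maps computed below.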
Optical Flow: Deep Matching (TD)
Patch sizes in the pyramid, from bottom to top: 4x4, 8x8, 16x16 and 32x32 patches.
Top-down matching (TD): each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. Starting from a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
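The step from a parent maximum to its sub-patches can be sketched as a local search in the maps one scale below. A minimal 1D sketch with made-up correlation maps for the two halves of a 4-sample patch; the function name `backtrack` and the search radius of one sample are assumptions.

```python
def backtrack(child_map, centre, radius=1):
    """Position of the best child response within `radius` of `centre`."""
    lo = max(0, centre - radius)
    hi = min(len(child_map), centre + radius + 1)
    return max(range(lo, hi), key=lambda p: child_map[p])


# Hypothetical correlation maps of the two child patches (one scale below).
left_map = [0, 0, 0, 2, 0, -2, 0, 0]
right_map = [0, 0, 0, -2, 0, 2, 0, 0]

parent_shift = 3   # a local maximum found in the parent map
child_size = 2     # the second child starts child_size samples later

left_pos = backtrack(left_map, parent_shift)
right_pos = backtrack(right_map, parent_shift + child_size)
```

Applying the same search recursively down to the smallest patches yields one correspondence per atomic patch, without scanning the full maps at every scale.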
Optical Flow: Deep Matching
[Figure panels: Ground truth, Dense HOG [Brox & Malik 2011], Deep Matching]
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 50
Key idea Build (bottom-up) a pyramid of correlation maps to run an efficient (top-down) search
Optical Flow Deep Matching (TD)
4x4 patches
8x8 patches
16x16 patches
32x32 patches
Top-down matching
(TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 51
Optical Flow Deep Matching (TD)Each local maxima in the top layer corresponds to a shift of one of the biggest (32x32) patchesIf we focus on local maximum we can retrieve the corresponding responses one scale below and focus on shift of the sub-patches that generated it
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 52
Optical Flow Deep Matching (TD)
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe httpcaffeberkeleyvisionorg
Torch (Overfeat) httptorchch
Theano httpdeeplearningnetsoftwaretheano
Tensor Flow httpswwwtensorfloworg
MatconvNet (VLFeat) httpwwwvlfeatorgmatconvnet
CNTK (Mcrosoft) httpwwwcntkai 77
Seminar Series Compacting ConvNets
for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
78
Jose M Aacutelvarez
Stanford course CS231n
Convolutional Neural Networks for Visual
Recognition
ConvNets Learn more
79
ConvNets Learn more
Online course
Deep Learning
Taking machine learning to the next
level
80
ReadCV seminarFriendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
81
Barcelona Convolucionada
Deep Learning a lrsquoabast de tothom
Monday February 1 7pm FIB Campus Nord UPC
ConvNets Learn more
82
Grup drsquoestudi de machine learning Barcelona
Summer course
Deep Learning for Computer Vision
(25 ECTS for MSc amp Phd)
July 4-8 3-7pm
ConvNets Learn more
83
Deep learning methos for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho ldquoDeep Learning Past Present amp Futurerdquo
ConvNets Learn more
84
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video: Cristian Canton's talk “From Catalonia to America: notes on how to achieve a successful post-PhD career”, ACMCV 2015 & UPC
Li Fei-Fei, “How we're teaching computers to understand pictures”, TEDTalks 2014
Jeremy Howard, “The wonderful and terrifying implications of computers that can learn”, TEDTalks 2014
ConvNets: Discussion
Neil Lawrence, “OpenAI won't benefit humanity without open data sharing” (The Guardian, 14/12/2015)
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on this page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research: Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., “Cultural Event Recognition with Visual ConvNets and Temporal Models”, in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research: Saliency Prediction
J. Pan and Giró-i-Nieto, X., “End-to-end Convolutional Network for Saliency Prediction”, in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Our current research: Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., “Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction”, in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research: Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S.'ichi, and Giró-i-Nieto, X., “NII-HITACHI-UIT at TRECVID 2015 Instance Search”, in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S.'ichi, O'Connor, N., and Smeaton, A. F., “Insight DCU at TRECVID 2015”, in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Optical Flow: Deep Matching (TD)
Each local maximum in the top layer corresponds to a shift of one of the biggest (32x32) patches. If we focus on a local maximum, we can retrieve the corresponding responses one scale below and focus on the shifts of the sub-patches that generated it.
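As a rough illustration of the first step only, the sketch below finds local maxima in a small 2D response map, the kind of peak the slide associates with the shift of a 32x32 patch. It is not the DeepFlow implementation: the function name `local_maxima`, the 8-neighbourhood test and the `threshold` parameter are assumptions made for this example.

```python
def local_maxima(response, threshold=0.0):
    """Return (row, col, value) for each strict local maximum of a 2D map.

    response: list of lists of floats (a hypothetical top-layer response map).
    A cell is a peak if it exceeds `threshold` and all of its (up to 8)
    neighbours.
    """
    rows, cols = len(response), len(response[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = response[r][c]
            if value <= threshold:
                continue
            neighbours = [
                response[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks
```

Each returned peak would then be the starting point for looking up the responses of its sub-patches one scale below, which this sketch does not attempt.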
Optical Flow: Deep Matching
Figure panels: Ground truth, Dense HOG [Brox & Malik 2011], Deep Matching
Optical Flow: FlowNet
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015). FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
End-to-end supervised learning of optical flow.
Optical Flow: FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
Option B: Create two separate, yet identical, processing streams for the two images and combine them at a later stage.
Correlation layer: combines the two streams by convolving data patches from one feature map with data patches from the other (data against data, rather than data against a learned filter).
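The correlation idea can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the FlowNet code: patches are reduced to single feature vectors (1x1 patches), feature maps are nested lists indexed `[y][x][channel]`, positions outside the second map score 0, and the names `correlation_layer` and `max_disp` are made up for this example.

```python
def correlation_layer(f1, f2, max_disp=1):
    """Compare every location of f1 with nearby locations of f2.

    f1, f2: feature maps as nested lists [H][W][C] of equal size.
    Returns (out, disps) where out[y][x][i] is the dot product between
    f1[y][x] and f2[y + dy][x + dx] for disps[i] == (dy, dx).
    """
    h, w = len(f1), len(f1[0])
    disps = [(dy, dx)
             for dy in range(-max_disp, max_disp + 1)
             for dx in range(-max_disp, max_disp + 1)]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            scores = []
            for dy, dx in disps:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    # Multiplicative comparison: data against data.
                    scores.append(sum(a * b for a, b in zip(f1[y][x], f2[yy][xx])))
                else:
                    scores.append(0.0)  # assumption: zero outside the map
            row.append(scores)
        out.append(row)
    return out, disps
```

The output has one channel per candidate displacement, so a strong response in channel `i` at `(y, x)` suggests that the content at `(y, x)` in the first map moved by `disps[i]` in the second.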
Optical Flow: FlowNet (expanding)
Upconvolutional layers: unpooling of feature maps + convolution. The upconvolved feature maps are concatenated with the corresponding feature map from the contracting part.
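A minimal sketch of one expanding step, assuming nearest-neighbour 2x unpooling and a 1x1 convolution as a stand-in for the learned convolution. The helper names (`unpool2x`, `conv1x1`, `concat_channels`, `refine`) and the `[y][x][channel]` nested-list layout are illustrative assumptions, not taken from the paper.

```python
def unpool2x(fmap):
    """Nearest-neighbour 2x unpooling of a [H][W][C] feature map."""
    out = []
    for row in fmap:
        wide = [list(px) for px in row for _ in range(2)]
        out.append(wide)
        out.append([list(px) for px in wide])
    return out


def conv1x1(fmap, weights):
    """Per-pixel linear map; weights has shape [C_out][C_in]."""
    return [[[sum(w * v for w, v in zip(w_row, px)) for w_row in weights]
             for px in row]
            for row in fmap]


def concat_channels(a, b):
    """Concatenate two maps of equal spatial size along the channel axis."""
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def refine(coarse, skip, weights):
    """Unpool + convolve the coarse map, then append the skip features."""
    upconvolved = conv1x1(unpool2x(coarse), weights)
    return concat_channels(upconvolved, skip)
```

Here `skip` plays the role of the feature map from the contracting part at the same resolution, so the result keeps both coarse context and finer local detail.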
Optical Flow: FlowNet
Data augmentation: since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated… and augmented (translation, rotation and scaling transformations, additive Gaussian noise, and changes in brightness, contrast, gamma and color).
ConvNets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI.
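The augmentations listed for FlowNet training can be sketched for an image pair and its flow field as follows. This is a reduced illustration: it covers only integer translation, brightness, contrast, gamma and additive Gaussian noise on grayscale images in [0, 1]. All parameter ranges and function names are assumptions chosen for the example, not values from the paper.

```python
import random


def clamp01(v):
    return min(1.0, max(0.0, v))


def photometric(img, brightness, contrast, gamma, noise_sigma, rng):
    """Apply contrast, brightness, gamma and Gaussian noise to a 2D image."""
    out = []
    for row in img:
        new_row = []
        for v in row:
            v = (v - 0.5) * contrast + 0.5 + brightness
            v = clamp01(v) ** gamma
            v += rng.gauss(0.0, noise_sigma)
            new_row.append(clamp01(v))
        out.append(new_row)
    return out


def translate(grid, dy, dx, fill):
    """Shift a 2D grid by (dy, dx), filling uncovered cells with `fill`."""
    h, w = len(grid), len(grid[0])
    return [[grid[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else fill
             for x in range(w)]
            for y in range(h)]


def augment_pair(img1, img2, flow, rng):
    """Augment both frames consistently so the flow labels stay valid."""
    dy, dx = rng.randint(-2, 2), rng.randint(-2, 2)
    params = dict(brightness=rng.uniform(-0.2, 0.2),
                  contrast=rng.uniform(0.8, 1.2),
                  gamma=rng.uniform(0.7, 1.5),
                  noise_sigma=0.02)
    img1 = photometric(translate(img1, dy, dx, 0.0), rng=rng, **params)
    img2 = photometric(translate(img2, dy, dx, 0.0), rng=rng, **params)
    # The same shift on both frames moves the flow vectors without changing them.
    flow = translate(flow, dy, dx, (0.0, 0.0))
    return img1, img2, flow
```

Applying the same geometric transform to both frames is the simplest way to keep the ground truth consistent; rotation and scaling would also require transforming the flow vectors themselves, which is left out here.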
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. “Learning multi-domain convolutional neural networks for visual tracking.” ICCV VOT Workshop (2015).
Object tracking: MDNet Architecture
Domain-specific layers are used during training for each sequence, but are replaced by a single one at test time.
Object tracking: MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, selecting the negative samples with the highest positive score.
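Hard negative mining as described here reduces to a ranking step. The sketch below assumes a hypothetical `positive_score` callable that returns the classifier's positive-class score for a candidate; the function name `hard_negatives` and the sample representation are illustrative only.

```python
def hard_negatives(negatives, positive_score, k):
    """Return the k negative samples that the classifier scores most
    strongly as positive, i.e. the ones it currently gets most wrong."""
    ranked = sorted(negatives, key=positive_score, reverse=True)
    return ranked[:k]


# Example: candidates stored as (name, score) pairs, scored by the second field.
candidates = [("bg_1", 0.10), ("bg_2", 0.80), ("bg_3", 0.50)]
hardest = hard_negatives(candidates, lambda s: s[1], k=2)
# hardest == [("bg_2", 0.80), ("bg_3", 0.50)]
```

Only the selected samples would then be used as negatives in the next online update, which focuses training on distractors that resemble the target.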
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. “Visual Tracking with Fully Convolutional Networks.” In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127, 2015. [code]
Focus on conv4-3 and conv5-3 of a VGG-16 network pre-trained for ImageNet image classification.
Object tracking: FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
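One way to picture why a selection step helps is to rank feature maps by how much of their activation falls inside the target box. The sketch below uses that simple heuristic, which is swapped in for illustration and is not the selection procedure of the FCNT paper. The names `foreground_ratio` and `select_feature_maps`, and the end-exclusive `(top, left, bottom, right)` box format, are assumptions.

```python
def foreground_ratio(fmap, box):
    """Fraction of a map's total activation that lies inside the box.

    fmap: 2D list of non-negative activations.
    box: (top, left, bottom, right), end-exclusive, in map coordinates.
    """
    top, left, bottom, right = box
    total = sum(sum(row) for row in fmap)
    if total == 0:
        return 0.0
    inside = sum(fmap[y][x] for y in range(top, bottom) for x in range(left, right))
    return inside / total


def select_feature_maps(fmaps, box, k):
    """Indices of the k maps most concentrated on the target region."""
    scored = [(foreground_ratio(m, box), i) for i, m in enumerate(fmaps)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best ratio first, ties by index
    return [i for _, i in scored[:k]]
```

Maps that respond mostly to background receive a low ratio and are discarded, leaving a smaller set of channels that is more specific to the tracked object.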
Object tracking: FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization… but they are not discriminative enough to distinguish different objects of the same category.
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. “Object detectors emerge in deep scene CNNs.” ICLR 2015. [Slides from ReadCV]
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation…
Object tracking: FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT Results
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 53
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 54
Ground truth
Dense HOG[Brox amp Malik 2011]
Deep Matching
Optical Flow Deep Matching
Weinzaepfel P Revaud J Harchaoui Z amp Schmid C (2013 December) DeepFlow Large displacement optical flow with deep matching In Computer Vision (ICCV) 2013 IEEE International Conference on (pp 1385-1392) IEEE 55
Optical Flow Deep Matching
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive scores.
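Hard negative mining as described above reduces to a top-k selection over the negative candidates. A minimal sketch, assuming the network's positive (target) scores for the negative samples are already available; the function name is hypothetical.

```python
import numpy as np

def hard_negatives(positive_scores, num_keep):
    """Return the indices of the negatives with the highest positive scores."""
    scores = np.asarray(positive_scores, dtype=float)
    order = np.argsort(-scores)  # descending by positive score
    return order[:num_keep]
```

The selected samples are the ones the current model confuses most with the target, so they are the most informative negatives for the next update.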
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification.
conv4-3 conv5-3
Object tracking FCNT Specialization
Most feature maps in the VGG-16 conv4-3 and conv5-3 layers are unrelated to the foreground regions of a tracking sequence.
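One simple way to see this effect is to rank feature maps by how much of their activation falls inside the target box. The sketch below uses that heuristic only as an illustration; it is an assumption made here and not the selection procedure of the FCNT paper. The function names and the box convention are hypothetical.

```python
import numpy as np

def foreground_ratio(fmaps, box):
    """Fraction of each map's activation inside the target box.

    fmaps: (C, H, W) non-negative activations.
    box: (top, left, bottom, right) in feature-map coordinates.
    """
    top, left, bottom, right = box
    inside = fmaps[:, top:bottom, left:right].sum(axis=(1, 2))
    total = fmaps.sum(axis=(1, 2)) + 1e-8  # avoid division by zero
    return inside / total

def select_maps(fmaps, box, num_keep):
    """Indices of the num_keep maps most concentrated on the foreground."""
    return np.argsort(-foreground_ratio(fmaps, box))[:num_keep]
```

Maps with a low ratio respond mostly to the background and can be discarded before tracking.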
Object tracking FCNT Localization
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Object Detectors Emerge in Deep Scene CNNs. ICLR 2015.
[Zhou et al., ICLR 2015] "Object detectors emerge in deep scene CNNs" [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets Learn more
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets Learn more
ConvNets Learn more
Online course: "Deep Learning: Taking machine learning to the next level"
ReadCV seminar: friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
ConvNets Learn more
Barcelona Convolucionada: "Deep Learning a l'abast de tothom" (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
ConvNets Learn more
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
ConvNets Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
ConvNets Learn more
"Machine learning" sub-Reddit
ConvNets Learn more
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary per hour    Avg. salary per month
Yahoo        $43                     ($43 x 160 = $6880)
Apple        $37                     ($37 x 160 = $5920)
Google       $29.54-$31.32           $7151
Facebook     $22.92                  $6150-$7378
Microsoft    $22.63                  $6506-$7171
Source: Glassdoor.com (internships in California; no stipends included)
ConvNets Learn more
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
ConvNets Learn more
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets Learn more
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets Discussion
Sports Do you know them?
ConvNets Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
and MANY MORE I am missing on the page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding is available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Image Classification
Our past research
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
LSUN Challenge
Sentiment Analysis
Our current research
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S., and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013, December). DeepFlow: Large displacement optical flow with deep matching. In Computer Vision (ICCV), 2013 IEEE International Conference on (pp. 1385-1392). IEEE.
Ground truth
Dense HOG [Brox & Malik 2011]
Deep Matching
Optical Flow Deep Matching
Optical Flow
Optical Flow FlowNet
End-to-end supervised learning of optical flow.
Optical Flow FlowNet (contracting)
Option A: Stack both input images together and feed them through a generic network.
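Stacking the two frames amounts to concatenating them along the channel axis, so that a generic network sees one 6-channel input. A minimal sketch, assuming RGB images stored as (H, W, 3) arrays; the function name is hypothetical.

```python
import numpy as np

def stack_pair(img1, img2):
    """Concatenate two (H, W, 3) frames into one (H, W, 6) network input."""
    if img1.shape != img2.shape:
        raise ValueError("both frames must have the same shape")
    return np.concatenate([img1, img2], axis=-1)
```

With this input the network is free to decide by itself how to compare the two frames.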
Optical Flow FlowNet (contracting)
Option B: Create two separate, yet identical, processing streams for the two images and combine them at a later stage.
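Correlation layer: to combine the two streams, patches of data from one feature map are convolved with patches of data from the other feature map, instead of with a learned filter.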
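The correlation between two feature maps can be sketched as a dot product between feature vectors at every position and every displacement within a limited range. This sketch assumes single-pixel patches, zero padding at the borders and no striding; the function name and the `max_disp` parameter are hypothetical.

```python
import numpy as np

def correlation(f1, f2, max_disp):
    """Correlate two (C, H, W) feature maps over a window of displacements.

    Returns an array of shape ((2 * max_disp + 1) ** 2, H, W). Channel k holds,
    for each position x, the dot product <f1(x), f2(x + d_k)>.
    """
    _, h, w = f1.shape
    d = max_disp
    padded = np.pad(f2, ((0, 0), (d, d), (d, d)))
    out = np.zeros(((2 * d + 1) ** 2, h, w))
    k = 0
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            # shifted[:, y, x] equals f2[:, y + dy, x + dx] (zero outside the map)
            shifted = padded[:, d + dy:d + dy + h, d + dx:d + dx + w]
            out[k] = (f1 * shifted).sum(axis=0)
            k += 1
    return out
```

The layer has no trainable weights: it only compares data with data, and the layers that follow learn how to turn these matching scores into a flow estimate.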
Optical Flow FlowNet (expanding)
Optical Flow
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 56
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 57
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 58
End to end supervised learning of optical flow
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers: unpooling of the feature maps followed by a convolution. The upconvolved feature maps are concatenated with the corresponding maps from the contracting part.
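A rough sketch of one expanding step, under assumptions that are not stated on the slide: unpooling is done by nearest-neighbour 2x upsampling, maps are single-channel nested lists [H][W], and the convolution is a 3x3 filter with zero padding. All function names are hypothetical.

```python
def unpool2x(fmap):
    """Nearest-neighbour 2x unpooling of a single-channel map ([H][W])."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def conv3x3(fmap, kernel):
    """3x3 filtering (cross-correlation form) with zero padding, same output size."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[ky][kx] * fmap[yy][xx]
            out[y][x] = acc
    return out


def upconv_and_concat(coarse, skip, kernel):
    """Unpool + convolve the coarse map, then stack it with the skip map."""
    up = conv3x3(unpool2x(coarse), kernel)
    if len(up) != len(skip) or len(up[0]) != len(skip[0]):
        raise ValueError("skip map must match the upconvolved resolution")
    return [up, skip]  # channel-wise concatenation: a list of [H][W] maps


identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
coarse = [[1.0, 2.0]]                    # 1x2 map from deep in the network
skip = [[0.0] * 4 for _ in range(2)]     # 2x4 map from the contracting part
merged = upconv_and_concat(coarse, skip, identity)
print(len(merged), len(merged[0]), len(merged[0][0]))
```

The concatenation is what lets the expanding part reuse fine detail from the contracting part while the upconvolved map carries the coarse, high-level estimate.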
Optical Flow FlowNet
Since existing ground-truth datasets are not large enough to train a ConvNet, a synthetic Flying Chairs dataset is generated... and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color).
ConvNets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI.
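The photometric part of such an augmentation pipeline might look like the sketch below. The parameter ranges, the function name and the grayscale [H][W] image layout are assumptions made for illustration and are not the values used for FlowNet. Geometric transformations are left out because they would also have to be applied consistently to the ground-truth flow.

```python
import random


def augment(image, rng, noise_sigma=0.02):
    """Photometric augmentation of a grayscale image ([H][W], values in [0, 1])."""
    gamma = rng.uniform(0.7, 1.5)      # assumed range
    contrast = rng.uniform(0.8, 1.2)   # assumed range
    brightness = rng.gauss(0.0, 0.1)   # assumed spread
    out = []
    for row in image:
        new_row = []
        for v in row:
            v = v ** gamma                               # gamma change
            v = (v - 0.5) * contrast + 0.5 + brightness  # contrast and brightness
            v += rng.gauss(0.0, noise_sigma)             # additive Gaussian noise
            new_row.append(min(1.0, max(0.0, v)))        # clip to a valid intensity
        out.append(new_row)
    return out


rng = random.Random(0)  # fixed seed so the example is repeatable
image = [[0.0, 0.25, 0.5], [0.75, 1.0, 0.5]]
augmented = augment(image, rng)
print(len(augmented), len(augmented[0]))
```

One set of random parameters is drawn per call, so every pixel of the image is altered in the same way and only the noise differs from pixel to pixel.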
Data augmentation
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but are replaced by a single one at test time.
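The split between shared and domain-specific layers can be illustrated with a toy model. This is only a structural sketch under loud assumptions: the "shared layers" are reduced to one weight vector with a ReLU, each domain branch is a two-row linear classifier (background score, target score), and the class and method names are hypothetical rather than taken from the MDNet code.

```python
import random


class MultiDomainNet:
    """Toy stand-in: shared weights plus one binary classifier per training sequence."""

    def __init__(self, num_domains, feat_dim, seed=0):
        self.rng = random.Random(seed)
        self.feat_dim = feat_dim
        self.shared = [self.rng.uniform(-1, 1) for _ in range(feat_dim)]
        self.branches = [self._new_branch() for _ in range(num_domains)]

    def _new_branch(self):
        # Two rows of weights: one for the background score, one for the target score.
        return [[self.rng.uniform(-1, 1) for _ in range(self.feat_dim)]
                for _ in range(2)]

    def forward(self, x, domain):
        hidden = [max(0.0, w * v) for w, v in zip(self.shared, x)]  # shared part
        branch = self.branches[domain]                              # domain-specific part
        return [sum(w * h for w, h in zip(row, hidden)) for row in branch]

    def prepare_for_test(self):
        """Drop the training branches and start one fresh branch for the new sequence."""
        self.branches = [self._new_branch()]


net = MultiDomainNet(num_domains=3, feat_dim=4)
shared_before = list(net.shared)
train_scores = net.forward([0.5, -0.2, 0.1, 0.9], domain=2)
net.prepare_for_test()
test_scores = net.forward([0.5, -0.2, 0.1, 0.9], domain=0)
print(len(net.branches), net.shared == shared_before)
```

The shared part keeps what was learned across all training sequences, while the single new branch is the only part that has to be learned from scratch for the test sequence.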
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score.
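The selection step of hard negative mining can be written in a few lines. In this sketch the function name, the sample identifiers and the scores are hypothetical; `positive_score` stands for whatever function returns the classifier's target score for a sample.

```python
def mine_hard_negatives(negatives, positive_score, k):
    """Return the k negative samples that the classifier scores most like the target."""
    return sorted(negatives, key=positive_score, reverse=True)[:k]


# Illustrative target scores for four background boxes.
scores = {"box_a": 0.10, "box_b": 0.85, "box_c": 0.40, "box_d": 0.70}
hard = mine_hard_negatives(list(scores), scores.get, k=2)
print(hard)  # ['box_b', 'box_d']
```

Training on these confusing negatives, instead of on randomly chosen background boxes, focuses each online update on the samples the tracker is currently most likely to mistake for the target.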
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification.
Object tracking FCNT Specialization
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
Although trained for image classification, the feature maps in conv5-3 enable object localization... but they are not discriminative enough to tell apart different objects of the same category.
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Object detectors emerge in deep scene CNNs. ICLR 2015.
[Zhou et al., ICLR 2015] "Object detectors emerge in deep scene CNNs" [Slides from ReadCV]
Object tracking FCNT Localization
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
Object tracking FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking FCNT Results
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (OverFeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets Learn more
Seminar Series: Compacting ConvNets for End-to-End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Grup d'estudi de machine learning Barcelona
Monday, February 1, 7pm, FIB, Campus Nord UPC
Summer course: Deep Learning for Computer Vision
(25 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" subreddit
Check the profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary per hour    Avg. salary per month
Yahoo        $43                     ($43 x 160 = $6880)
Apple        $37                     ($37 x 160 = $5920)
Google       $2954-$3132             $7151
Facebook     $2292                   $6150-$7378
Microsoft    $2263                   $6506-$7171
Source: Glassdoor.com (internships in California; no stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets Discussion
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on this page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding is available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadlines (on 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. 'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. 'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 59
Option A Stack both input images together and feed them through a generic network
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 60
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Optical Flow FlowNet (contracting)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 61
Option B Create two separate yet identical processing streams for the two images and combine them at a later stage
Correlation layer Convolution of data patches from the layers to combine
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow: FlowNet
Since existing ground-truth datasets are not large enough to train a ConvNet, a synthetic Flying Chairs dataset is generated and augmented (translation, rotation and scaling transformations, additive Gaussian noise, and changes in brightness, contrast, gamma and color).
ConvNets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI.
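The photometric part of such an augmentation could look like the sketch below. The parameter ranges are illustrative assumptions, not the values used by the FlowNet authors, and the geometric transformations (which would also have to be applied to the ground-truth flow) are left out on purpose.

```python
import numpy as np

def augment_pair(img1, img2, rng):
    """Photometric augmentation of an image pair with values in [0, 1].

    The same brightness, contrast and gamma are applied to both frames,
    while the Gaussian noise is drawn independently per frame. Because no
    pixel moves, the ground-truth flow stays valid.
    """
    brightness = rng.uniform(-0.2, 0.2)   # assumed range
    contrast = rng.uniform(0.8, 1.2)      # assumed range
    gamma = rng.uniform(0.7, 1.5)         # assumed range
    sigma = rng.uniform(0.0, 0.04)        # assumed noise level
    out = []
    for img in (img1, img2):
        x = (img - 0.5) * contrast + 0.5 + brightness
        x = np.clip(x, 0.0, 1.0) ** gamma
        x = x + rng.normal(0.0, sigma, size=img.shape)
        out.append(np.clip(x, 0.0, 1.0))
    return out[0], out[1]

rng = np.random.default_rng(0)
frame1 = rng.uniform(0.0, 1.0, size=(8, 8, 3))
frame2 = rng.uniform(0.0, 1.0, size=(8, 8, 3))
aug1, aug2 = augment_pair(frame1, frame2, rng)
```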
Optical Flow: FlowNet (data augmentation)
Object tracking: MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking: MDNet (architecture)
Domain-specific layers are used during training, one for each sequence, but are replaced by a single one at test time.
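The split between shared and domain-specific layers can be illustrated with a toy model. This is a sketch under strong assumptions (one shared linear layer with a ReLU, linear binary heads, random weights, no training loop); the class and method names are hypothetical and do not come from the MDNet code.

```python
import numpy as np

class MultiDomainNet:
    """Toy network with shared layers and one binary head per domain."""

    def __init__(self, in_dim, feat_dim, num_domains, rng):
        self.rng = rng
        self.feat_dim = feat_dim
        # Shared layers: reused for every training sequence and at test time.
        self.shared = rng.standard_normal((in_dim, feat_dim)) * 0.1
        # Domain-specific layers: one target-vs-background head per sequence.
        self.heads = [rng.standard_normal((feat_dim, 2)) * 0.1
                      for _ in range(num_domains)]

    def features(self, x):
        return np.maximum(x @ self.shared, 0.0)

    def scores(self, x, domain):
        return self.features(x) @ self.heads[domain]

    def reset_for_test(self):
        """Discard the training heads and start a single new one."""
        self.heads = [self.rng.standard_normal((self.feat_dim, 2)) * 0.1]

net = MultiDomainNet(in_dim=6, feat_dim=4, num_domains=3,
                     rng=np.random.default_rng(0))
x = np.ones((5, 6))
train_scores = net.scores(x, domain=2)
shared_before = net.shared.copy()
net.reset_for_test()
test_scores = net.scores(x, domain=0)
```

Only the head changes between training and test; the shared weights carry over what was learned across all training sequences.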
Object tracking: MDNet (online update)
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score.
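The selection step of hard negative mining can be sketched as follows. The function name hard_negatives and the example scores are made up for illustration; in a tracker, the scores would be the network's positive-class outputs for candidate windows labelled as background.

```python
import numpy as np

def hard_negatives(pos_scores, k):
    """Return the indices of the k negatives with the highest positive score.

    pos_scores[i] is the positive-class score the network assigns to the
    i-th negative sample; a high score means the sample is easily confused
    with the target and is therefore a hard negative.
    """
    pos_scores = np.asarray(pos_scores)
    k = min(k, pos_scores.size)
    order = np.argsort(pos_scores)[::-1]  # descending by score
    return order[:k]

scores = np.array([0.05, 0.90, 0.30, 0.75, 0.10])
picked = hard_negatives(scores, k=2)
```

With these scores, the samples at indices 1 and 3 are selected, since they are the negatives the classifier is most inclined to call positive.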
Object tracking: FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. Visual Tracking with Fully Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on the conv4-3 and conv5-3 layers of a VGG-16 network pre-trained for ImageNet image classification.
Object tracking: FCNT (specialization)
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
Object tracking: FCNT (localization)
Although trained for image classification, the feature maps in conv5-3 enable object localization, but they are not discriminative enough to tell apart different objects of the same category.
Object tracking: Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Object detectors emerge in deep scene CNNs. ICLR 2015. [Slides from ReadCV]
Object tracking: FCNT (localization)
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation.
Object tracking: FCNT (architecture)
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking: FCNT (results)
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End-to-End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: "Deep Learning: Taking machine learning to the next level"
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Monday, February 1, 7pm, FIB, Campus Nord, UPC
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" subreddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default).
Company     Avg. salary/hour   Avg. salary/month
Yahoo       $43                ($43 x 160 = $6,880)
Apple       $37                ($37 x 160 = $5,920)
Google      $29.54-$31.32      $7,151
Facebook    $22.92             $6,150-$7,378
Microsoft   $22.63             $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
ConvNets: Discussion
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on this page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding is available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution   Last deadline (as of 28/1/2016)
FI (Catalonia)        22/09/2015
FPU (Spain)           15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research
Saliency Prediction
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Instance Search in Video
V.-T. Nguyen, Dinh-Le, D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S., and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Optical Flow FlowNet (expanding)
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 62
Upconvolutional layers Unpooling features maps + convolutionUpconvolutioned feature maps are concatenated with the corresponding map from the contractive part
Optical Flow FlowNet
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 63
Since existing ground truth datasets are not sufficiently large to train a Convnet a synthetic Flying Dataset is generatedhellip and augmented (translation rotation scaling transformations additive Gaussian noise changes in brightness contrast gamma and color)
Convnets trained on these unrealistic data generalize well to existing datasets such as Sintel and KITTI
Data augmentation
Dosovitskiy A Fischer P Ilg E Hausser P Hazirbas C Golkov V van der Smagt P Cremers D and Brox T 2015 FlowNet Learning Optical Flow With Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision (pp 2758-2766) 64
Optical Flow FlowNet
Object tracking MDNet
65Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
SNet = Specific Network (online update)
GNet = General Network (fixed)
Object tracking FCNT Results
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: Deep Learning (Taking machine learning to the next level)
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, “Deep Learning: Past, Present & Future”
“Machine learning” subreddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary/hour    Avg. salary/month
Yahoo        $43                 ($43 x 160 = $6,880)
Apple        $37                 ($37 x 160 = $5,920)
Google       $29.54-$31.32       $7,151
Facebook     $22.92              $6,150-$7,378
Microsoft    $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
Video: Cristian Canton's talk “From Catalonia to America: notes on how to achieve a successful post-PhD career”, ACMCV 2015 & UPC
Li Fei-Fei, “How we're teaching computers to understand pictures”, TEDTalks 2014
Jeremy Howard, “The wonderful and terrifying implications of computers that can learn”, TEDTalks 2014
Neil Lawrence, “OpenAI won't benefit humanity without open data sharing” (The Guardian, 14/12/2015)
ConvNets: Discussion
Is Computer Vision solved?
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on this page (apologies)
ConvNets: Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding is available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., “Cultural Event Recognition with Visual ConvNets and Temporal Models”, in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Our current research
Saliency Prediction
J. Pan and Giró-i-Nieto, X., “End-to-end Convolutional Network for Saliency Prediction”, in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Our current research
Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., “Diving Deep into Sentiment Understanding: Fine-tuned CNNs for Visual Sentiment Prediction”, in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research
Instance Search in Video
V.-T. Nguyen, -Dinh-Le, D., Salvador, A., -Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. ichi, and Giró-i-Nieto, X., “NII-HITACHI-UIT at TRECVID 2015 Instance Search”, in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. ichi, O'Connor, N., and Smeaton, A. F., “Insight DCU at TRECVID 2015”, in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available online
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Optical Flow FlowNet
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T., 2015. FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
Since existing ground truth datasets are not sufficiently large to train a ConvNet, a synthetic Flying Chairs dataset is generated… and augmented (translation, rotation and scaling transformations; additive Gaussian noise; changes in brightness, contrast, gamma and color)
ConvNets trained on this unrealistic data generalize well to existing datasets such as Sintel and KITTI
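The photometric part of the augmentation listed above can be sketched in a few lines. This is a minimal illustration on grayscale images stored as nested lists with values in [0, 1]. The parameter ranges, the function names and the choice to share one set of parameters between the two frames are assumptions for the example, not the settings of the FlowNet paper. Photometric changes leave the ground-truth flow untouched, whereas geometric transformations (translation, rotation, scaling) would also require transforming the flow field, which is not shown here.

```python
import random


def sample_params(rng):
    """Draw one set of photometric parameters (ranges are illustrative assumptions)."""
    return {
        "brightness": rng.uniform(-0.1, 0.1),  # additive shift
        "contrast": rng.uniform(0.8, 1.2),     # scaling around mid-gray
        "gamma": rng.uniform(0.7, 1.5),        # exponent
    }


def clamp01(p):
    return min(max(p, 0.0), 1.0)


def apply_photometric(image, params, rng, noise_sigma=0.02):
    """Apply contrast, brightness, gamma and additive Gaussian noise to one image."""
    out = []
    for row in image:
        new_row = []
        for p in row:
            p = (p - 0.5) * params["contrast"] + 0.5 + params["brightness"]
            p = clamp01(p) ** params["gamma"]  # clamp first so the power is well defined
            p += rng.gauss(0.0, noise_sigma)   # noise is drawn independently per pixel
            new_row.append(clamp01(p))
        out.append(new_row)
    return out


def augment_pair(frame_a, frame_b, flow, rng):
    """Augment both frames with shared parameters; the flow is returned unchanged."""
    params = sample_params(rng)
    return (apply_photometric(frame_a, params, rng),
            apply_photometric(frame_b, params, rng),
            flow)


rng = random.Random(0)
frame = [[0.2, 0.8], [0.5, 0.5]]
flow = [[(1.0, 0.0), (1.0, 0.0)], [(1.0, 0.0), (1.0, 0.0)]]
aug_a, aug_b, aug_flow = augment_pair(frame, frame, flow, rng)
```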
Data augmentation
Object tracking MDNet
Nam, Hyeonseob, and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. ICCV VOT Workshop (2015).
Object tracking MDNet Architecture
Domain-specific layers are used during training, one for each sequence, but are replaced by a single new one at test time
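The split between shared and domain-specific layers can be illustrated with a toy model. In this minimal sketch a single vector of weights stands in for the shared layers and each domain-specific branch is a small linear scorer. The class name, method names and sizes are hypothetical, and no training step is shown.

```python
import random


class MultiDomainNet:
    """Toy stand-in: shared weights plus one target-vs-background branch per domain."""

    def __init__(self, num_domains, feat_dim, rng):
        self.feat_dim = feat_dim
        # Shared part, reused across all training sequences (domains).
        self.shared = [rng.uniform(-1.0, 1.0) for _ in range(feat_dim)]
        # One domain-specific branch per training sequence.
        self.branches = [self._new_branch(rng) for _ in range(num_domains)]

    def _new_branch(self, rng):
        return [rng.uniform(-1.0, 1.0) for _ in range(self.feat_dim)]

    def score(self, x, domain):
        """Target score of input x under the branch of the given domain."""
        hidden = [max(w * v, 0.0) for w, v in zip(self.shared, x)]  # shared layer + ReLU
        return sum(b * h for b, h in zip(self.branches[domain], hidden))

    def start_test_sequence(self, rng):
        """Drop all training branches and create a single fresh one; shared weights are kept."""
        self.branches = [self._new_branch(rng)]


rng = random.Random(0)
net = MultiDomainNet(num_domains=3, feat_dim=4, rng=rng)
shared_before = list(net.shared)
net.start_test_sequence(rng)
test_score = net.score([1.0, 0.5, -0.5, 2.0], domain=0)
```

Only the branch is re-created for the new sequence, so whatever the shared part learned from the training sequences carries over.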
Object tracking MDNet Online update
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score
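Hard negative mining as described above reduces to a sort: score every negative candidate with the current classifier and keep those it most confidently mistakes for the target. This is a minimal sketch in which the function name, the sample identifiers and the scores are made up for illustration.

```python
def mine_hard_negatives(negatives, positive_score, num_hard):
    """Return the num_hard negative samples with the highest positive (target) score."""
    return sorted(negatives, key=positive_score, reverse=True)[:num_hard]


# Toy example: hypothetical classifier scores for four background patches.
scores = {"patch_a": 0.1, "patch_b": 0.9, "patch_c": 0.4, "patch_d": 0.7}
hard = mine_hard_negatives(list(scores), scores.get, num_hard=2)
```

The selected patches are the ones used as negatives in the next online update, so the update concentrates on the distractors that are currently hardest to reject.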
Object tracking MDNet
66Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Object tracking MDNet Architecture
67Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
Domain-specific layers are used during training for each sequence but are replaced by a single one at test time
Object tracking MDNet Online update
68Nam Hyeonseob and Bohyung Han Learning multi-domain convolutional neural networks for visual tracking ICCV VOT Workshop (2015)
MDNet is updated online at test time with hard negative mining that is selecting negative samples with the highest positive score
Object tracking FCNT
69Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Object tracking FCNT
70Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Focus on conv4-3 and conv5-3 of VGG-16 network pre-trained for ImageNet image classification
conv4-3 conv5-3
Object tracking FCNT Specialization
71Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence
Object tracking FCNT Localization
72Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
Although trained for image classification feature maps in conv5-3 enable object localizationhellipbut is not discriminative enough to different objects of the same category
Object tracking Localization
73Zhou Bolei Aditya Khosla Agata Lapedriza Aude Oliva and Antonio Torralba Object detectors emerge in deep scene cnns ICLR 2015
[Zhou et al ICLR 2015] ldquoObject detectors emerge in deep scene CNNsrdquo [Slides from ReadCV]
Object tracking FCNT Localization
74Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
On the other hand feature maps from conv4-3 are more sensitive to intra-class appearance variationhellip
conv4-3 conv5-3
Object tracking FCNT Architecture
75Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
SNet=Specific Network (online update)
GNet=General Network (fixed)
Object tracking FCNT Results
76Wang Lijun Wanli Ouyang Xiaogang Wang and Huchuan Lu Visual Tracking with Fully Convolutional Networks In Proceedings of the IEEE International Conference on Computer Vision pp 3119-3127 2015 [code]
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
Seminar Series: Compacting ConvNets for End to End Learning
Tuesday February 2 4pm
D5-010 Campus Nord
ConvNets Learn more
Jose M. Álvarez
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets Learn more
ConvNets Learn more
Online course: Deep Learning. Taking machine learning to the next level
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016
Tuesdays at 11am
ConvNets Learn more
Barcelona Convolucionada
Deep Learning a l'abast de tothom (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
ConvNets Learn more
Grup d'estudi de machine learning Barcelona
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
ConvNets Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
ConvNets Learn more
"Machine learning" sub-Reddit
ConvNets Learn more
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company     Avg. salary/hour   Avg. salary/month
Yahoo       $43                ($43 x 160 = $6,880)
Apple       $37                ($37 x 160 = $5,920)
Google      $29.54-$31.32      $7,151
Facebook    $22.92             $6,150-$7,378
Microsoft   $22.63             $6,506-$7,171
Source: Glassdoor.com (internships in California. No stipends included)
ConvNets Learn more
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
ConvNets Learn more
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets Learn more
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Is Computer Vision solved?
ConvNets Discussion
Sports Do you know them?
ConvNets Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
and MANY MORE I am missing in the page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC? Currently no direct funding available (check in the future). We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution   Last deadlines (on 28/1/2016)
FI (Catalonia)        22/09/2015
FPU (Spain)           15/01/2016
Check our activity at https://imatge.upc.edu/web
Image Classification
Our past research
A. Salvador, M. Zeppelzauer, D. Manchon-Vizuete, A. Calafell-Orós, and X. Giró-i-Nieto, "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
ChaLearn Workshop
Saliency Prediction
J. Pan and X. Giró-i-Nieto, "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
LSUN Challenge
Sentiment Analysis
Our current research
[Slides]
CNN
V. Campos, A. Salvador, B. Jou, and X. Giró-i-Nieto, "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015.
Our current research
Instance Search in Video
V.-T. Nguyen, D.-D. Le, A. Salvador, C.-Z. Zhu, D.-L. Nguyen, M.-T. Tran, T. Ngo Duc, D. Anh Duong, S. Satoh, and X. Giró-i-Nieto, "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, E. Mohedano, A. Salvador, Z. X. Zhang, M. Marsden, P. Wang, I. Jargalsaikhan, J. Antony, X. Giró-i-Nieto, S. Satoh, N. O'Connor, and A. F. Smeaton, "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
Object tracking MDNet Architecture
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
Domain-specific layers are used during training, one for each sequence, but they are replaced by a single one at test time.
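The structure described here (shared layers, one head per training sequence, a single fresh head at test time) can be sketched as follows. This is an illustrative toy, not the MDNet implementation: the class `MultiDomainNet`, its element-wise "shared layer", the random initialisation and the domain names are all assumptions.

```python
import random


class MultiDomainNet:
    """Shared feature weights plus one binary-classification head per sequence."""

    def __init__(self, num_features, domains, seed=0):
        self._rng = random.Random(seed)
        self.shared = [self._rng.uniform(-1, 1) for _ in range(num_features)]
        # One domain-specific head per training sequence.
        self.heads = {d: self._new_head() for d in domains}

    def _new_head(self):
        return [self._rng.uniform(-1, 1) for _ in range(len(self.shared))]

    def features(self, x):
        # Shared layers (toy version): element-wise scaling followed by ReLU.
        return [max(0.0, w * v) for w, v in zip(self.shared, x)]

    def score(self, x, domain):
        head = self.heads[domain]
        return sum(h * f for h, f in zip(head, self.features(x)))

    def reset_for_test(self, name="test"):
        """Drop all training heads and attach a single fresh one."""
        self.heads = {name: self._new_head()}
        return name


net = MultiDomainNet(num_features=3, domains=["seq_a", "seq_b"])
train_heads = sorted(net.heads)      # heads available while training
shared_before = list(net.shared)
test_domain = net.reset_for_test()   # single head for the new test sequence
print(train_heads, list(net.heads), net.shared == shared_before)
```

The point of the design is that only the heads are tied to a particular sequence; the shared weights learned across all training sequences are carried over unchanged to the test sequence.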
Object tracking MDNet Online update
Nam, Hyeonseob, and Bohyung Han. "Learning multi-domain convolutional neural networks for visual tracking." ICCV VOT Workshop (2015).
MDNet is updated online at test time with hard negative mining, that is, by selecting the negative samples with the highest positive score.
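The selection rule stated above can be written in a few lines: rank the negative candidates by the positive score the current model assigns them and keep the top few. The sketch below assumes the scores are already computed; the function name `hard_negatives`, the sample ids and the score values are hypothetical.

```python
def hard_negatives(candidates, positive_score, k):
    """Return the k negative samples the classifier most confidently calls positive."""
    ranked = sorted(candidates, key=positive_score, reverse=True)
    return ranked[:k]


# Hypothetical negatives: (sample id, positive score from the current model).
negatives = [("n0", 0.05), ("n1", 0.80), ("n2", 0.30), ("n3", 0.65)]
mined = hard_negatives(negatives, positive_score=lambda s: s[1], k=2)
print([name for name, _ in mined])
```

Here the two mined samples are `n1` and `n3`, the negatives the model is most wrong about, which are the ones that would be fed back into the online update.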
Object tracking FCNT
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Focus on conv4-3 and conv5-3 of a VGG-16 network pre-trained for ImageNet image classification.
Figure panels: conv4-3, conv5-3
Object tracking FCNT Specialization
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Most feature maps in VGG-16 conv4-3 and conv5-3 are not related to the foreground regions in a tracking sequence.
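If most feature maps are unrelated to the foreground, a tracker can discard them and keep only the relevant ones. The sketch below illustrates that idea with a deliberately simple heuristic: rank each map by the fraction of its activation that falls inside a foreground mask. This heuristic is an assumption made for illustration and is not the selection procedure of the FCNT paper; the names `foreground_ratio` and `select_maps` and the tiny 3x3 maps are hypothetical.

```python
from typing import List, Sequence

Map2D = Sequence[Sequence[float]]


def foreground_ratio(fmap: Map2D, mask: Map2D) -> float:
    """Fraction of a map's non-negative activation that lies inside the mask."""
    inside = 0.0
    total = 0.0
    for row_f, row_m in zip(fmap, mask):
        for activation, is_fg in zip(row_f, row_m):
            activation = max(activation, 0.0)  # treat values as post-ReLU
            total += activation
            if is_fg:
                inside += activation
    return inside / total if total > 0 else 0.0


def select_maps(fmaps: List[Map2D], mask: Map2D, k: int) -> List[int]:
    """Indices (ascending) of the k maps with the highest foreground ratio."""
    scores = [foreground_ratio(f, mask) for f in fmaps]
    ranked = sorted(range(len(fmaps)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical data: the target occupies the lower-right 2x2 block.
mask = [[0, 0, 0],
        [0, 1, 1],
        [0, 1, 1]]
fmaps = [
    [[5, 5, 5], [5, 0, 0], [5, 0, 0]],  # fires only on background
    [[0, 0, 0], [0, 3, 2], [0, 1, 4]],  # fires only on the target
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],  # fires everywhere
]
selected = select_maps(fmaps, mask, k=1)
```

With this toy input, only the second map (index 1) is kept when `k=1`. A real tracker would compute the maps from conv4-3 and conv5-3 and would need a more robust relevance measure.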
Object tracking FCNT Localization
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
Although trained for image classification, feature maps in conv5-3 enable object localization... but they are not discriminative enough to distinguish different objects of the same category.
Object tracking Localization
Zhou, Bolei, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. "Object detectors emerge in deep scene CNNs." ICLR 2015. [Slides from ReadCV]
Object tracking FCNT Localization
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
On the other hand, feature maps from conv4-3 are more sensitive to intra-class appearance variation...
conv4-3 conv5-3
Object tracking FCNT Architecture
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
SNet = Specific Network (online update)
GNet = General Network (fixed)
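The split between a fixed general branch and an online-updated specific branch can be illustrated with a toy sketch. This is not the FCNT implementation: the linear scorers, the squared-error update, the learning rate, and all names (`LinearScorer`, `sgd_step`, `gnet`, `snet`) are hypothetical stand-ins for the two convolutional branches. It only shows that one set of weights stays frozen after initialization while the other takes a gradient step on each new frame.

```python
from typing import List


class LinearScorer:
    """Toy stand-in for a heat-map branch: score = w . x."""

    def __init__(self, weights: List[float], trainable: bool) -> None:
        self.weights = list(weights)
        self.trainable = trainable

    def score(self, x: List[float]) -> float:
        return sum(w * v for w, v in zip(self.weights, x))

    def sgd_step(self, x: List[float], target: float, lr: float = 0.1) -> None:
        """One squared-error gradient step; does nothing for a fixed branch."""
        if not self.trainable:
            return
        error = self.score(x) - target
        self.weights = [w - lr * error * v for w, v in zip(self.weights, x)]


# Both branches start from the same (hypothetical) first-frame weights.
init = [0.5, 0.5]
gnet = LinearScorer(init, trainable=False)  # general branch: fixed
snet = LinearScorer(init, trainable=True)   # specific branch: updated online

# Hypothetical per-frame (features, desired score) pairs for the target.
frames = [([1.0, 0.0], 1.0), ([1.0, 0.2], 1.0), ([0.9, 0.1], 1.0)]
for features, target in frames:
    gnet.sgd_step(features, target)  # ignored: weights stay frozen
    snet.sgd_step(features, target)  # adapts to the current sequence
```

After the loop, `gnet.weights` is unchanged while `snet.weights` has moved toward the observed target appearance, which is the trade-off the two branches represent: stability versus adaptation.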
Object tracking FCNT Results
Wang, Lijun, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu. "Visual Tracking with Fully Convolutional Networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 3119-3127. 2015. [code]
ConvNets Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
ConvNets Learn more
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
ConvNets Learn more
Online course: Deep Learning. Taking machine learning to the next level
ConvNets Learn more
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
ConvNets Learn more
Barcelona Convolucionada: Deep Learning a l'abast de tothom ("Deep Learning within everyone's reach")
Grup d'estudi de machine learning Barcelona (Barcelona machine learning study group)
Monday, February 1, 7pm, FIB, Campus Nord UPC
ConvNets Learn more
Summer course: Deep Learning for Computer Vision (2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
ConvNets Learn more
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
ConvNets Learn more
"Machine learning" sub-Reddit
ConvNets Learn more
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary/hour    Avg. salary/month
Yahoo        $43                 $6,880 ($43 x 160)
Apple        $37                 $5,920 ($37 x 160)
Google       $29.54-$31.32       $7,151
Facebook     $22.92              $6,150-$7,378
Microsoft    $22.63              $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
ConvNets Learn more
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
ConvNets Learn more
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
ConvNets Discussion
Is Computer Vision solved?
Sports Do you know them?
ConvNets Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on this page (apologies)
ConvNets Where you are studying
VisioCat dinner, CVPR 2015
Considering a PhD at GPI-UPC?
Currently no direct funding available (check in the future).
We can support your application to scholarships.
External grant listings: UPC, UPF
Funding institution    Last deadline (as of 28/1/2016)
FI (Catalonia)         22/09/2015
FPU (Spain)            15/01/2016
Check our activity at https://imatge.upc.edu/web
Our past research
Image Classification (ChaLearn Workshop)
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., "Cultural Event Recognition with Visual ConvNets and Temporal Models", in CVPR ChaLearn Looking at People Workshop 2015, 2015. [slides]
Our current research
Saliency Prediction (LSUN Challenge)
J. Pan and Giró-i-Nieto, X., "End-to-end Convolutional Network for Saliency Prediction", in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
Our current research
Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., "Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction", in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research
Instance Search in Video
V. - T. Nguyen, -Dinh-Le, D., Salvador, A., -Zhu, C., Nguyen, D. - L., Tran, M. - T., Duc, T. Ngo, Duong, D. Anh, Satoh, S. 'ichi, and Giró-i-Nieto, X., "NII-HITACHI-UIT at TRECVID 2015 Instance Search", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S. 'ichi, O'Connor, N., and Smeaton, A. F., "Insight DCU at TRECVID 2015", in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you
Slides available on and
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
ConvNets: Software
Caffe: http://caffe.berkeleyvision.org
Torch (Overfeat): http://torch.ch
Theano: http://deeplearning.net/software/theano
TensorFlow: https://www.tensorflow.org
MatConvNet (VLFeat): http://www.vlfeat.org/matconvnet
CNTK (Microsoft): http://www.cntk.ai
ConvNets: Learn more
Seminar Series: Compacting ConvNets for End to End Learning
Jose M. Álvarez
Tuesday, February 2, 4pm
D5-010, Campus Nord
Stanford course CS231n: Convolutional Neural Networks for Visual Recognition
Online course: "Deep Learning: Taking machine learning to the next level"
ReadCV seminar: Friendly reviews of SoA papers
Spring 2016, Tuesdays at 11am
Barcelona Convolucionada: "Deep Learning a l'abast de tothom" (Deep Learning within everyone's reach)
Monday, February 1, 7pm, FIB, Campus Nord UPC
Grup d'estudi de machine learning Barcelona
Summer course: Deep Learning for Computer Vision
(2.5 ECTS for MSc & PhD)
July 4-8, 3-7pm
Deep learning methods for vision (CVPR 2012)
Tutorial on deep learning for vision (CVPR 2014)
Kyunghyun Cho, "Deep Learning: Past, Present & Future"
"Machine learning" sub-Reddit
Check profile requirements for summer internships (disclaimer: offered to PhD students by default)
Company      Avg. salary / hour    Avg. salary / month
Yahoo        $43                   ($43 x 160 = $6,880)
Apple        $37                   ($37 x 160 = $5,920)
Google       $29.54-$31.32         $7,151
Facebook     $22.92                $6,150-$7,378
Microsoft    $22.63                $6,506-$7,171
Source: Glassdoor.com (internships in California; no stipends included)
Video: Cristian Canton's talk "From Catalonia to America: notes on how to achieve a successful post-PhD career", ACMCV 2015 & UPC
Li Fei-Fei, "How we're teaching computers to understand pictures", TEDTalks 2014
Jeremy Howard, "The wonderful and terrifying implications of computers that can learn", TEDTalks 2014
ConvNets: Discussion
Is Computer Vision solved?
Neil Lawrence, "OpenAI won't benefit humanity without open data sharing" (The Guardian, 14/12/2015)
Sports: Do you know them?
ConvNets: Do you know them?
Antonio Torralba, MIT (former UPC)
Oriol Vinyals, Google (former UPC)
Jose M. Álvarez, NICTA (former URL & UAB)
Joan Bruna, Berkeley (former UPC)
...and MANY MORE I am missing on the page (apologies)
ConvNets Learn more
85
ldquoMachine learningrdquo sub-Reddit
ConvNets Learn more
86
ConvNets Learn more
87
Check profile requirements for Summer internship (disclaimer offered to Phd students by default)
Company Avg Salary hour Avg Salary month
Yahoo $43 ($43x160=$6880)
Apple $37 ($37x160=$5920)
Google $2954-$3132 $7151
Facebook $2292 $6150-$7378
Microsoft $2263 $6506-$7171
Source Glassdoorcom (internships in California No stipends included)
ConvNets Learn more
88
Video Cristian Cantonrsquos talk ldquoFrom Catalonia to America notes on how to achieve a successful post-Phd career rdquo ACMCV 2015 amp UPC
Li Fei-Fei ldquoHow wersquore teaching computers to understand picturesrdquo TEDTalks 2014
ConvNets Learn more
89
Jeremy Howard ldquoThe wonderful and terrifying implications of computers that can learnrdquo TEDTalks 2014
ConvNets Learn more
90
ConvNets Learn more
91
Neil Lawrence OpenAI wonrsquot benefit humanity without open data sharing (The Guardian 14122015)
Is Computer Vision solved
ConvNets Discussion
92
Sports Do you know them
93
ConvNets Do you know them
94
Antonio Torralba MIT(former UPC)
and MANY MORE I am missing in the page (apologies)
Oriol Vinyals Google(former UPC)
Jose M Aacutelvarez NICTA(former URL amp UAB)
Joan Bruna Berkeley(former UPC)
95
ConvNets Where you are studyingVisioCat dinner CVPR 2015
Considering a Phd at GPI-UPC Currently no direct funding available (check in the future)We can support your application to scholarships
External grant listings UPC UPF
Funding institution Last deadlines (on 2812016)
FI (Catalonia) 22092015
FPU (Spain) 15012016
Check our activity at httpsimatgeupceduweb 96
Our past research: Image Classification
A. Salvador, Zeppelzauer, M., Manchon-Vizuete, D., Calafell-Orós, A., and Giró-i-Nieto, X., “Cultural Event Recognition with Visual ConvNets and Temporal Models”, in CVPR ChaLearn Looking at People Workshop 2015, 2015. [Slides]
ChaLearn Workshop
Our current research: Saliency Prediction
J. Pan and Giró-i-Nieto, X., “End-to-end Convolutional Network for Saliency Prediction”, in Large-scale Scene Understanding Challenge (LSUN) at CVPR Workshops, Boston, MA (USA), 2015. [Slides]
LSUN Challenge
Our current research: Sentiment Analysis
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., “Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction”, in 1st International Workshop on Affect and Sentiment in Multimedia, Brisbane, Australia, 2015. [Slides]
Our current research: Instance Search in Video
V.-T. Nguyen, Le, D.-D., Salvador, A., Zhu, C., Nguyen, D.-L., Tran, M.-T., Duc, T. Ngo, Duong, D. Anh, Satoh, S., and Giró-i-Nieto, X., “NII-HITACHI-UIT at TRECVID 2015 Instance Search”, in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
K. McGuinness, Mohedano, E., Salvador, A., Zhang, Z. X., Marsden, M., Wang, P., Jargalsaikhan, I., Antony, J., Giró-i-Nieto, X., Satoh, S., O’Connor, N., and Smeaton, A. F., “Insight DCU at TRECVID 2015”, in TRECVID 2015 Workshop, Gaithersburg, MD, USA, 2015.
Thank you!
Slides available online.
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu