Computer Vision - now working in over 2 Billion Web Browsers!

Jan 21, 2018

Rob Manson
Transcript
Page 1: Computer Vision - now working in over 2 Billion Web Browsers!

Computer Vision - now working in over 2 Billion Web Browsers!

Rob Manson, CEO & co-founder

Sebastian Montabone, Computer Vision Engineer

Mixed Reality. In the web. On any device. https://try.awe.media

Page 2
Page 3

So what is Mixed Reality?

Here’s a short demo of Milgram’s Mixed Reality Continuum - all running in a browser.

awe.media

Page 4

A brief/biased history of Computer Vision

1957 - Russell A. Kirsch scans first photo with a computer

1960 - Larry Roberts publishes thesis at MIT

1964 - First facial recognition system (unnamed intelligence agency)

1976 - UK Police create first License Plate recognition system

1978 - David Marr proposes edge detection framework at MIT

1985 - Lockheed Martin/Carnegie Mellon create first self-driving land vehicle

1992 - Tom Caudell at Boeing coins the term Augmented Reality

1999 - Billinghurst & Kato publish/demo ARToolkit at IWAR/SIGGRAPH

2000 - Windows-only alpha version of OpenCV launched at CVPR

2007 - OpenCV 1.0 released

2008 - ARToolkit ported to Flash by @saqoosha

2011 - ARToolkit ported to JavaScript by Ilmari Heikkinen

2011 - FastCV/Vuforia 1.0 released

2017 - Facebook adds Computer Vision to their camera app

2017 - OpenCV in the browser demonstrated here awe.media

Page 5

How does Computer Vision work in the browser?

camera -> gUM -> video -> canvas -> pixels -> vision algorithms
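
The last four stages of this pipeline (video -> canvas -> pixels -> vision algorithms) can be sketched for a single frame as below. This is a minimal illustration, not the presenters' code: `VideoLike`, `FrameData`, `Context2DLike`, `processFrame` and `meanBrightness` are hypothetical names introduced here, and the interfaces only model the members that `HTMLVideoElement` and `CanvasRenderingContext2D` expose in a browser, so the sketch compiles without DOM typings. The camera -> gUM -> video stages appear only as a comment because they need a real page and user permission.

```typescript
// Structural stand-ins for the browser objects (assumption: they model only
// the members this sketch needs from HTMLVideoElement and the 2D context).
interface VideoLike {
  videoWidth: number;
  videoHeight: number;
}

interface FrameData {
  data: Uint8ClampedArray; // RGBA bytes, 4 per pixel
  width: number;
  height: number;
}

interface Context2DLike {
  drawImage(source: VideoLike, dx: number, dy: number, dw: number, dh: number): void;
  getImageData(sx: number, sy: number, sw: number, sh: number): FrameData;
}

// A "vision algorithm" here is any function from a frame to a result.
type VisionAlgorithm<T> = (frame: FrameData) => T;

// video -> canvas -> pixels -> vision algorithm, for a single frame.
function processFrame<T>(
  video: VideoLike,
  ctx: Context2DLike,
  algorithm: VisionAlgorithm<T>
): T | null {
  const w = video.videoWidth;
  const h = video.videoHeight;
  if (w === 0 || h === 0) return null; // no frame decoded yet
  ctx.drawImage(video, 0, 0, w, h); // video -> canvas
  const frame = ctx.getImageData(0, 0, w, h); // canvas -> pixels
  return algorithm(frame); // pixels -> vision algorithm
}

// Toy stand-in for a vision algorithm: mean brightness in the range 0..255.
function meanBrightness(frame: FrameData): number {
  const px = frame.data;
  let sum = 0;
  for (let i = 0; i < px.length; i += 4) {
    sum += (px[i] + px[i + 1] + px[i + 2]) / 3;
  }
  return px.length === 0 ? 0 : sum / (px.length / 4);
}

// Browser wiring for camera -> gUM -> video (not run here):
//   const stream = await navigator.mediaDevices.getUserMedia({ video: true });
//   video.srcObject = stream;
//   await video.play();
//   const ctx = canvas.getContext("2d"); // canvas sized to the video frame
//   // then call processFrame(video, ctx, meanBrightness) once per animation frame
```

In a real page the video element and 2D context would be passed in place of the stand-ins (possibly with a type cast), and `processFrame` would typically be driven from a `requestAnimationFrame` loop.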

Page 6

HTMLVideoElement

This is a container for decoding and presenting video streams. This brought plugin-free video to the web.

Page 7

Canvas, WebGL & the ArrayBuffer

The 2D Canvas gave us the ability to convert a video stream into pixel data.

WebGL brought 3D Canvases with access to the GPU. But most importantly, WebGL gave us ArrayBuffers, which allowed us to access the pixel data for the first time.
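
As a rough illustration of what ArrayBuffer-backed pixel access looks like, the sketch below treats a frame as RGBA bytes (the layout `getImageData` returns) and reduces it to one gray byte per pixel, then shows two typed-array views sharing one buffer. The function name `toGrayscale` and the Rec. 601 luma weights are assumptions made for this example; the slides do not specify a conversion.

```typescript
// Convert RGBA pixel data to one gray byte per pixel.
// Assumption: Rec. 601 luma weights; the slides do not name a formula.
function toGrayscale(rgba: Uint8ClampedArray): Uint8Array {
  const gray = new Uint8Array(rgba.length / 4);
  for (let i = 0, j = 0; i < rgba.length; i += 4, j++) {
    gray[j] = Math.round(
      0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2]
    );
  }
  return gray;
}

// Two pixels of sample data: one red, one white.
const rgba = new Uint8ClampedArray([
  255, 0, 0, 255, // red
  255, 255, 255, 255, // white
]);

// A second view over the same ArrayBuffer: one 32-bit element per RGBA pixel.
const packed = new Uint32Array(rgba.buffer);

// Computed before the buffer is modified below.
const gray = toGrayscale(rgba);

// Writing through one view is visible through the other:
// this clears all four bytes of the second pixel.
packed[1] = 0;
```

Working on a single-channel `Uint8Array` like `gray` is a common first step for feature detectors, since it cuts the data to a quarter of the RGBA size.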

Page 8

JSARToolkit

In 2011 Billinghurst & Kato's ARToolkit was ported to JavaScript.

Page 9

Enter WebRTC's getUserMedia()

Some claim this has a latency that makes the web unusable for AR. But here are the numbers running on a Pixel - the max difference is ~200ms.

200-250ms - Camera stream in a native AR app

350-400ms - gUM stream in a web app

Page 10

WebRTC's getUserMedia()

FAST feature detection & Tigerstail in 2012
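
For readers unfamiliar with FAST, here is a minimal sketch of its segment test on a grayscale image. It is not the code behind the 2012 demo: the names `CIRCLE`, `isFastCorner` and `detectFastCorners`, the default threshold of 20 and the arc length of 9 (the common FAST-9 variant) are assumptions chosen for illustration, and the usual speed-ups (early rejection, non-maximum suppression) are left out.

```typescript
// Bresenham circle of radius 3 around the candidate pixel: 16 (dx, dy)
// offsets in clockwise order, starting from the top.
const CIRCLE: ReadonlyArray<readonly [number, number]> = [
  [0, -3], [1, -3], [2, -2], [3, -1], [3, 0], [3, 1], [2, 2], [1, 3],
  [0, 3], [-1, 3], [-2, 2], [-3, 1], [-3, 0], [-3, -1], [-2, -2], [-1, -3],
];

interface Corner {
  x: number;
  y: number;
}

// Segment test: (x, y) is a corner if at least `arcLength` contiguous circle
// pixels are all brighter than center + threshold, or all darker than
// center - threshold. Assumes arcLength <= 16 and a 3-pixel margin around (x, y).
function isFastCorner(
  gray: Uint8Array,
  width: number,
  x: number,
  y: number,
  threshold: number,
  arcLength: number
): boolean {
  const center = gray[y * width + x];
  let run = 0;
  let runSign = 0;
  // Walk the circle twice so runs that wrap past the last offset are counted.
  for (let i = 0; i < CIRCLE.length * 2; i++) {
    const [dx, dy] = CIRCLE[i % CIRCLE.length];
    const v = gray[(y + dy) * width + (x + dx)];
    const sign = v > center + threshold ? 1 : v < center - threshold ? -1 : 0;
    if (sign !== 0 && sign === runSign) {
      run++;
    } else {
      runSign = sign;
      run = sign === 0 ? 0 : 1;
    }
    if (run >= arcLength) return true;
  }
  return false;
}

// Scan every pixel that has a full circle inside the image.
function detectFastCorners(
  gray: Uint8Array,
  width: number,
  height: number,
  threshold = 20,
  arcLength = 9
): Corner[] {
  const corners: Corner[] = [];
  for (let y = 3; y < height - 3; y++) {
    for (let x = 3; x < width - 3; x++) {
      if (isFastCorner(gray, width, x, y, threshold, arcLength)) {
        corners.push({ x, y });
      }
    }
  }
  return corners;
}
```

The detector takes a single-channel image, so in the browser pipeline it would run on a grayscale copy of the canvas pixel data rather than on the RGBA bytes directly.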

Page 11

WebRTC's getUserMedia()

Tracking.js released in 2012

Page 12

WebRTC's getUserMedia()

AR.js released in 2017

Page 13

Transpiling OpenCV

This brings a more general computer vision toolkit to the web!

Page 14

Demo Time!

Page 15

But there's no gUM on iOS?

For Vision based functionality we fall back to Visual Search

For Location based apps we fall back to 360°/VR (like Pokemon Go with the camera off)

And remember “video see-through” is not the only form of AR
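
The fallback strategy on this slide could be expressed along these lines. This is only a sketch of the decision, not the awe.media implementation: `NavigatorLike`, `supportsGetUserMedia`, `chooseExperience` and the mode names are hypothetical, introduced here for illustration.

```typescript
type AppKind = "vision" | "location";
type Experience = "camera-ar" | "visual-search" | "360-vr";

// Models only the part of the browser's navigator object that is checked here.
interface NavigatorLike {
  mediaDevices?: { getUserMedia?: unknown };
}

// Feature-detect getUserMedia instead of sniffing for a particular platform.
function supportsGetUserMedia(nav: NavigatorLike | undefined): boolean {
  return typeof nav?.mediaDevices?.getUserMedia === "function";
}

// With a camera stream available, use it; otherwise fall back by app type.
function chooseExperience(kind: AppKind, gumAvailable: boolean): Experience {
  if (gumAvailable) return "camera-ar";
  return kind === "vision" ? "visual-search" : "360-vr";
}

// In a browser this would be called as, for example:
//   chooseExperience("vision", supportsGetUserMedia(navigator));
```

Detecting the capability rather than the device means the same code picks up the camera path automatically once a browser adds getUserMedia support.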