Apr 15, 2017
Getting from Idea to Product with 3D Vision
Avinash Nehemiah, A.G. Ramesh
May 3, 2016
Copyright © 2016 MathWorks, Inc. and Intel Corp

What Can You Do With 3-D Vision?
3-D vision uses sensors to measure, map, locate, and reconstruct the three-dimensional structure of the environment for visual perception tasks.
• Measure objects and distances
• Locate the position of autonomous systems
• Map the environment of an autonomous system
• Reconstruct the 3-D structure of the environment

Topics for Today’s Talk
• Examples of two real systems with 3-D vision
  • Box Measurement: measuring object dimensions with 3-D vision
  • 3-D Reconstruction and Odometry: reconstructing a 3-D scene map and estimating the trajectory of an autonomous system
• For each system
  • Challenges faced
  • How we solved them
  • Practical tradeoffs

3-D Vision System Design: Top 3 Challenges
• Design workflow stages (diagram labels): Select Camera, Design Algorithm, Test Algorithm, System Integration (Integrate, Test)
• Top 3 challenges
  • Sensor Selection Dilemma
  • 3-D Vision is Never Perfect
  • Very Difficult to Test

Box Measurement: Example System #1
• Measuring the dimensions of packages in 3-D using vision only
• Non-intrusive: no measuring tape etc.
• Faster: simply point the device at the box to measure

Box Measurement: Sensor Selection Dilemma
Single Camera
  • Use cases: 3-D structure from sensor or object motion; measurement (planar objects only)
  • Limitations: up-to-scale reconstruction only (not metric); planar objects only
  • Range: usually up to 8 meters
  • Software complexity: High
  • Cost: Low
Stereo Camera
  • Use cases: measurement; object recognition; navigation; 3-D reconstruction
  • Limitations: does not work well with homogeneous surfaces; requires good lighting conditions
  • Range: usually up to 10 meters
  • Software complexity: Medium
  • Cost: Medium
Depth Sensor, IR Based (time of flight, structured light, active stereo)
  • Use cases: measurement; object recognition; navigation; 3-D reconstruction; gesture recognition; augmented reality
  • Limitations: does not work well outdoors; requires calibration
  • Range: depends on implementation, 0.3 m to 4 m
  • Software complexity: Low
  • Cost: High
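One way to make the comparison above actionable is to encode each sensor as a record and filter on the application's hard requirements. The sketch below is illustrative only: the `Sensor` fields and the `shortlist` helper are hypothetical names, the range figures are the "usually up to" values from the slide, and `works_outdoors=True` for the two passive cameras is an assumption (the slide lists no outdoor limitation for them, but stereo still needs good lighting).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensor:
    name: str
    max_range_m: float        # "usually up to" figure from the slide
    works_outdoors: bool      # assumption: False only where the slide says so
    metric: bool              # False where reconstruction is up to scale only
    software_complexity: str
    cost: str


SENSORS = [
    Sensor("single camera", 8.0, True, False, "High", "Low"),
    Sensor("stereo camera", 10.0, True, True, "Medium", "Medium"),
    Sensor("IR depth sensor", 4.0, False, True, "Low", "High"),
]


def shortlist(range_m, outdoors, need_metric):
    """Return the names of sensors that meet the hard requirements."""
    return [
        s.name
        for s in SENSORS
        if s.max_range_m >= range_m
        and (s.works_outdoors or not outdoors)
        and (s.metric or not need_metric)
    ]


# Indoor box measurement at about 2 m that needs real units.
print(shortlist(range_m=2.0, outdoors=False, need_metric=True))
```

The remaining candidates still have to be weighed on software complexity versus cost, which is the tradeoff the slide highlights.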

Box Measurement: Challenges Faced
• Point-to-point measurement with user input
  • User clicks on end points
  • Significant error if the user is off by a few pixels
  • Time consuming: 4 clicks from the user
• Fully automatic measurement
  • Accuracy of RGB-D output depends on surface material
  • Boxes are hard to detect
    • No distinct texture or features
    • Can be confused with other surfaces, such as the corner between wall and floor
[Figure labels: User Clicked Points; Holes in RGB-D Output]
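To get a feel for why a click that is a few pixels off matters, a pinhole camera model gives a rough bound: at depth Z, one pixel spans about Z / f meters, where f is the focal length in pixels. The sketch below assumes a fronto-parallel surface, and the focal length and distance are hypothetical values, not figures from the talk.

```python
def click_error_mm(pixel_error, depth_m, focal_length_px):
    """Approximate metric error of one end point for a click that is
    `pixel_error` pixels off, using the pinhole relation Z / f per pixel."""
    return 1000.0 * pixel_error * depth_m / focal_length_px


# Hypothetical setup: 600 px focal length, box 1.5 m from the camera.
for px in (1, 3, 5):
    err = click_error_mm(px, depth_m=1.5, focal_length_px=600.0)
    print(f"{px} px off -> about {err:.1f} mm per end point")
```

A length measurement uses two end points, so the errors can add up, and a click that lands on a hole or on the background behind the box can be far worse than this estimate.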

Box Measurement: 3-D Vision is Never Perfect
• Initial solution approach (fully automated)
  • Use 3-D plane fitting to find the box surface
• Practical issues found
  • Confusion with floor, walls, and other planes
  • Failures when there are too many holes in the point cloud
• Final solution
  • Single click from the user to locate the box
  • Holes in RGB-D output filled using 2-D image processing techniques (flood fill)
  • Used plane intersection to find edges
• Tradeoff: a single click by the user gives substantially better accuracy
[Figure label: Detected Plane, Not Box Surface]
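The talk does not spell out how flood fill was applied to the depth holes, so the following is only one plausible reading: flood fill from each missing pixel to collect its connected hole, then fill small holes with the median of the valid depths on the rim. The function name, the `0 = missing` convention, and `max_hole_size` are assumptions made for this sketch.

```python
from collections import deque
from statistics import median


def fill_depth_holes(depth, max_hole_size=50):
    """Return a copy of `depth` (list of rows, 0 = missing) with small holes filled."""
    rows, cols = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if depth[r0][c0] != 0 or seen[r0][c0]:
                continue
            # Flood fill: gather the connected hole and the valid depths on its rim.
            hole, rim = [], []
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                hole.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if depth[nr][nc] == 0:
                        if not seen[nr][nc]:
                            seen[nr][nc] = True
                            queue.append((nr, nc))
                    else:
                        rim.append(depth[nr][nc])
            # Fill only small holes that touch at least one valid pixel.
            if rim and len(hole) <= max_hole_size:
                value = median(rim)
                for r, c in hole:
                    out[r][c] = value
    return out


depth = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
filled = fill_depth_holes(depth)
```

Capping the hole size keeps the fill from inventing depth across large missing regions, where a rim median is unlikely to describe the real surface.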

Box Measurement: Very Difficult to Test
• Consecutive runs of the algorithm: 2 cm variance in output
• Why does this happen?
  • Plane fitting uses Random Sample Consensus (RANSAC) to account for noisy points
  • This gives a small variation in the fitted planes from run to run
• Workarounds
  • Accuracy: run the algorithm multiple times on the same data and remove bad measurements
  • Testing: collect ground truth (millimeter-accurate measurements) and automate testing of outputs against ground truth
[Figure labels: #1: 270 cm x 417 cm x 174 cm; #2: 272 cm x 414 cm x 174 cm]
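The run-to-run variation and the "run multiple times" workaround can be reproduced on synthetic data. The sketch below is a minimal RANSAC plane fit in plain Python, not the implementation used in the talk: the point cloud, noise level, tolerance, and iteration count are all made-up values, and the median is used here as one simple way to discount bad runs.

```python
import random
from statistics import median


def ransac_plane_offset(points, rng, iterations=50, tol=0.01):
    """Fit a plane with RANSAC and return its distance from the origin (m)."""
    best_inliers, best_offset = -1, None
    for _ in range(iterations):
        p1, p2, p3 = rng.sample(points, 3)
        u = [p2[i] - p1[i] for i in range(3)]
        v = [p3[i] - p1[i] for i in range(3)]
        # Plane normal from the cross product of two edges of the sample.
        n = [u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0]]
        norm = sum(x * x for x in n) ** 0.5
        if norm < 1e-9:
            continue  # degenerate (collinear) sample
        n = [x / norm for x in n]
        d = sum(n[i] * p1[i] for i in range(3))
        inliers = sum(
            1 for p in points
            if abs(sum(n[i] * p[i] for i in range(3)) - d) < tol
        )
        if inliers > best_inliers:
            best_inliers, best_offset = inliers, abs(d)
    return best_offset


# Synthetic box top 1.0 m from the sensor: noisy plane points plus outliers.
data_rng = random.Random(0)
points = [(data_rng.uniform(0, 0.5), data_rng.uniform(0, 0.5),
           1.0 + data_rng.gauss(0, 0.003)) for _ in range(300)]
points += [(data_rng.uniform(0, 0.5), data_rng.uniform(0, 0.5),
            data_rng.uniform(0.5, 1.5)) for _ in range(60)]

# Same data, different random seeds: each run can return a slightly different plane.
runs = [ransac_plane_offset(points, random.Random(seed)) for seed in range(15)]
print("spread over runs (m):", max(runs) - min(runs))
print("median of runs (m):  ", median(runs))
```

Because the true offset is known for synthetic data, the same script doubles as an automated test against ground truth.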

Box Measurement: Practical Tradeoffs
• User experience vs. measurement accuracy (single click vs. fully automated)
  • Single-click user input
    • Higher measurement accuracy
    • Fewer false detections
  • Fully automatic
    • Potential for false detections
• Compute time vs. accuracy (run algorithm once vs. multiple times)
  • Single run
    • 2-5 cm variance from run to run
  • Multiple runs
    • Better measurement accuracy

3-D Reconstruction and Odometry: Example System #2
• Estimate the trajectory (visual odometry) of an autonomous robot and reconstruct a 3-D model of the scene
• Used by search and rescue robots to map areas and reconstruct 3-D models
• Elements of simultaneous localization and mapping (SLAM)

3-D Reconstruction and Odometry: Sensor Selection Dilemma
• Three sensor types considered: single camera, stereo camera, RGB-D
Single Camera (Calibrated)
  • 3-D reconstruction: up to a projective transform (no real units)
  • Cost: Low
  • Software complexity: High
  • Processing power required: High
Stereo Camera
  • 3-D reconstruction: full 3-D reconstruction with real units
  • Cost: Medium
  • Software complexity: Low
  • Processing power required: High
RGB-D
  • 3-D reconstruction: full 3-D reconstruction with real units; does not work well outdoors
  • Cost: High
  • Software complexity: Low
  • Processing power required: Medium
• Our choice: two sensors (RGB-D and single color camera)
  • 3-D reconstruction (RGB-D)
    • RGB-D chosen over a stereo camera because the application was indoors and compute was limited
  • Visual odometry (single color camera)
    • Better range and resolution than the color sensor on the RGB-D camera

3-D Reconstruction and Odometry: 3-D Vision is Never Perfect
• Issue
  • Small frame-to-frame errors compound while creating the map
• Workarounds
  • Leverage other sensors
    • Use a secondary sensor (IMU) to augment the estimate from the vision algorithm
  • Vision only (increased algorithm complexity)
    • Detect loop closure (the robot passes the same point a second time)
    • Perform bundle adjustment to adjust the transforms used for stitching point clouds
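A toy 2-D example shows how small per-frame errors compound and what a loop closure exposes. Everything here is hypothetical: the robot drives a square and returns to its start, and each frame-to-frame estimate carries a constant 0.002 rad heading error. The final linear correction is only a crude stand-in for illustration; it is not bundle adjustment.

```python
import math


def integrate(steps, heading_bias_rad):
    """Chain (turn, distance) steps into 2-D positions.
    `heading_bias_rad` is a hypothetical heading error added at every frame."""
    x = y = heading = 0.0
    path = [(x, y)]
    for turn, dist in steps:
        heading += turn + heading_bias_rad
        x += dist * math.cos(heading)
        y += dist * math.sin(heading)
        path.append((x, y))
    return path


# Square loop: 4 sides of 25 frames, 0.1 m per frame, 90 degree turn at each corner.
steps = []
for side in range(4):
    for i in range(25):
        turn = math.pi / 2 if (i == 0 and side > 0) else 0.0
        steps.append((turn, 0.1))

truth = integrate(steps, 0.0)
estimate = integrate(steps, 0.002)


def position_error(k):
    return math.dist(truth[k], estimate[k])


print("error after 10 frames (m): ", position_error(10))
print("error after 100 frames (m):", position_error(100))

# Loop closure: the robot is back at its start, so the gap between the first
# and last estimated positions is accumulated drift.
gap = (estimate[-1][0] - estimate[0][0], estimate[-1][1] - estimate[0][1])
n = len(estimate) - 1

# Crude correction: spread the gap linearly along the trajectory.
corrected = [(x - gap[0] * k / n, y - gap[1] * k / n)
             for k, (x, y) in enumerate(estimate)]
```

A real system would instead feed the loop-closure constraint into an optimization over all poses, which is where the added algorithm complexity comes from.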

3-D Reconstruction and Odometry: Very Difficult to Test
• Issues faced
  • Different runs with identical data had different results
  • Very difficult to visualize and test intermediate steps
• Why does this happen?
  • Randomness in frame-to-frame pose estimation due to RANSAC
• Workarounds
  • Simulate against perfect synthetic data to establish a “known good”, and use this for initial development
  • Test algorithms with random inputs to make sure a good result wasn’t “lucky”
  • Use bundle adjustment to refine estimates
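The first two workarounds can be combined into a small automated check: generate perfect synthetic correspondences under a known motion, add some bad matches, and confirm that the estimate holds across many random seeds rather than for one lucky run. The sketch below uses a simplified 2-D rotation estimate; the function names, point counts, and tolerances are assumptions for illustration, not the system described in the talk.

```python
import math
import random


def estimate_rotation_ransac(src, dst, rng, iterations=100, tol=0.02):
    """RANSAC estimate of the 2-D rotation angle that maps src onto dst."""
    best_angle, best_inliers = None, -1
    for _ in range(iterations):
        i, j = rng.sample(range(len(src)), 2)
        # Rotation from the direction change of the segment between two points.
        a = math.atan2(src[j][1] - src[i][1], src[j][0] - src[i][0])
        b = math.atan2(dst[j][1] - dst[i][1], dst[j][0] - dst[i][0])
        angle = math.atan2(math.sin(b - a), math.cos(b - a))  # wrap to (-pi, pi]
        c, s = math.cos(angle), math.sin(angle)
        tx = dst[i][0] - (c * src[i][0] - s * src[i][1])
        ty = dst[i][1] - (s * src[i][0] + c * src[i][1])
        inliers = sum(
            1 for (x, y), (u, v) in zip(src, dst)
            if math.hypot(c * x - s * y + tx - u, s * x + c * y + ty - v) < tol
        )
        if inliers > best_inliers:
            best_angle, best_inliers = angle, inliers
    return best_angle


def synthetic_pair(rng, true_angle, n=60, outliers=15):
    """Noise-free correspondences under a known motion, plus some bad matches."""
    src = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    c, s = math.cos(true_angle), math.sin(true_angle)
    dst = [(c * x - s * y + 0.3, s * x + c * y - 0.1) for x, y in src]
    for k in range(outliers):
        dst[k] = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    return src, dst


TRUE_ANGLE = 0.1  # known-good value by construction
src, dst = synthetic_pair(random.Random(42), TRUE_ANGLE)

# Repeat with different seeds so a single lucky sample cannot hide a problem.
errors = [abs(estimate_rotation_ransac(src, dst, random.Random(seed)) - TRUE_ANGLE)
          for seed in range(20)]
worst = max(errors)
print("worst error over 20 seeds (rad):", worst)
```

Once the algorithm passes on perfect data, noise can be added to the synthetic points in controlled amounts to see where it starts to degrade.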

3-D Reconstruction and Odometry: Practical Tradeoffs
• Tradeoff: cost vs. computation time
• Factor: additional sensor vs. increased algorithm complexity
  • Additional sensor (IMU)
    • Very accurate transform estimation for stitching
    • Less computation since the load on the vision system is reduced
    • Difficult to align and synchronize multiple sensors
  • Increased algorithm complexity
    • Cheaper since no additional sensor is required
    • Slower and more computationally expensive

Lessons Learned: 3-D Vision
• 3-D vision is never perfect
  • Leverage other sensors
  • Clever heuristics can make all the difference
• Very difficult to test
  • Establish “ground truth” and test vs. ground truth
  • Use perfect synthetic data to test against
• Sensor selection dilemma
  • Consider the software complexity vs. cost tradeoff

Resources
• Books
  • Multiple View Geometry in Computer Vision, Hartley and Zisserman
• Links
  • 3-D Point Cloud Processing
  • Stereo Vision
  • Structure from Motion
  • Measuring Planar Objects with a Calibrated Camera
• Software
  • Computer Vision System Toolbox
• Hardware
  • Intel® RealSense™