Privacy-Preserving Action Recognition using Coded Aperture Videos · 2020-06-11 · The Bright and Dark Sides of Computer Vision: Challenges and Opportunities for Privacy and Security

Privacy-Preserving Action Recognition using Coded Aperture Videos

Z. W. Wang¹, V. Vineet², F. Pittaluga³, S. N. Sinha², O. Cossairt¹, S. B. Kang⁴

¹Northwestern University, ²Microsoft Research, ³University of Florida, ⁴Zillow Group

What is a coded aperture camera?

We propose:
1. Pre-capture privacy: lens-free coded aperture cameras.
2. Post-capture privacy: “mask-invariant” motion features.

[Figure: output of a conventional camera vs. output of a lens-free CA camera]

Vision from CA images? 5-class image classification

Gray images: >95% accuracy. CA images: ~60% accuracy.

The Bright and Dark Sides of Computer Vision: Challenges and Opportunities for Privacy and Security (CV-COPS 2019), Long Beach CA, June 16, 2019, in conjunction with CVPR 2019

Encoding mask

Imaging sensor

1. Polarizing filters
2. 550 nm long-pass filter
3. Spatial light modulator (SLM)
4. Camera board

Building a lens-free CA camera

Motion features

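$$C_d(\boldsymbol{\nu}) = \frac{D_1 \cdot D_2^*}{|D_1 \cdot D_2^*|} = \phi^* \, \frac{O_1 \cdot A \cdot A^* \cdot O_1^*}{|O_1 \cdot A \cdot A^* \cdot O_1^*|} \approx \phi^*$$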

+ T features are invariant to the mask pattern (A in Fourier space).
- RS features do not share the mask-invariant property.
+ Solution: shuffle masks during training.
+ Further improvement: compute TRS features at multiple time intervals.
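The two remedies in the list above (mask shuffling and multiple time intervals) can be sketched as a small data-preparation routine. This is only an illustration under stated assumptions: the poster does not specify the mask statistics, the frame intervals, or the forward model, so the 50%-open binary mask, the strides `(1, 2, 4)`, the circular-convolution image model, and all function names below are hypothetical.

```python
import numpy as np

def random_mask(shape, rng):
    """Draw a pseudorandom binary mask (assumption: about 50% open pixels)."""
    return (rng.random(shape) > 0.5).astype(np.float64)

def encode_clip(clip, mask):
    """Simulate CA frames for a (T, H, W) clip.

    Assumed model: each frame is the circular convolution of the scene
    with the mask, computed as a product in Fourier space (D = O * A).
    """
    A = np.fft.fft2(mask)
    O = np.fft.fft2(clip, axes=(-2, -1))
    return np.fft.ifft2(O * A, axes=(-2, -1)).real

def frame_pairs(num_frames, strides=(1, 2, 4)):
    """Index pairs (t, t + k) for several time intervals k (strides are hypothetical)."""
    return [(t, t + k) for k in strides for t in range(num_frames - k)]

def training_batches(clips, num_epochs, seed=0):
    """Yield (encoded_clip, mask), drawing a fresh mask for every clip."""
    rng = np.random.default_rng(seed)
    for _ in range(num_epochs):
        for clip in clips:
            mask = random_mask(clip.shape[-2:], rng)  # mask shuffling
            yield encode_clip(clip, mask), mask
```

A training loop would compute motion features on each pair returned by `frame_pairs` and feed them to the classifier, so the network never sees the same mask paired with the same clip twice.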

Cross power spectrum of two CA images in Fourier space.
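A minimal numerical check of the mask-invariance claim, assuming the CA frame is the circular convolution of the scene with the mask (so D = O · A in Fourier space) and that the motion is a pure circular shift. The function names and the random binary masks are illustrative assumptions, not the authors' code.

```python
import numpy as np

def ca_image(scene, mask):
    """Simulate a lens-free CA frame (assumed model: circular convolution d = o * a)."""
    return np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(mask)).real

def cross_power_peak(d1, d2, eps=1e-12):
    """Locate the peak of the inverse FFT of the normalized cross power spectrum."""
    D1, D2 = np.fft.fft2(d1), np.fft.fft2(d2)
    cross = D1 * np.conj(D2)
    C_d = cross / (np.abs(cross) + eps)  # |A|^2 cancels in this ratio
    corr = np.fft.ifft2(C_d).real
    return tuple(int(i) for i in np.unravel_index(np.argmax(corr), corr.shape))

rng = np.random.default_rng(0)
o1 = rng.random((64, 64))
# o2(p) = o1(p + dp) with dp = (4, 7), implemented as a circular shift
o2 = np.roll(o1, shift=(-4, -7), axis=(0, 1))

peaks = []
for seed in (1, 2):  # two unrelated masks
    mask = (np.random.default_rng(seed).random((64, 64)) > 0.5).astype(float)
    peaks.append(cross_power_peak(ca_image(o1, mask), ca_image(o2, mask)))
print(peaks)
```

Both masks should give the same peak location, because the mask contributes only the real, non-negative factor A · A* that the normalization removes. With real sensors, noise and boundary effects make this an approximation, which matches the ≈ on the poster.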

Training with varying random masks improves accuracy!

Benefits of mask-invariant property

Application: private/public surveillance

User: a generic classifier to only monitor/respond to actions.

Manufacturer: relaxed mask design, less calibration effort.

Hacker: more challenging to recover the scenes w/o mask info.

Reconstruction with PSF info?

Non-trivial and expensive

Goal: executing visual task(s) without looking at privacy-revealing data.

Translation (phase correlation), Rotation & Scaling

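[Figure: feature pipeline. Frames $o_1(\mathbf{p})$ and $o_2(\mathbf{p})$ are Fourier-transformed to $O_1(\boldsymbol{\nu})$ and $O_2(\boldsymbol{\nu})$; the cross power spectrum $C(\boldsymbol{\nu})$ gives the T features. After the log-polar and Fourier steps (panel label: Ϝ($O_2$)), the cross power spectrum $C(\boldsymbol{\omega})$ gives the RS features.]

$$o_2(\mathbf{p}) = o_1(s\mathbf{R}\mathbf{p}')$$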

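$$C(\boldsymbol{\nu}) = \frac{O_1 \cdot O_2^*}{|O_1 \cdot O_2^*|} = \phi^* \, \frac{O_1 \cdot O_1^*}{|O_1 \cdot O_1^*|} = \phi(-\Delta\mathbf{p}), \qquad \mathbf{p}' = \mathbf{p} + \Delta\mathbf{p}$$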
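The translation case can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a pure circular shift between the two frames and the convention p′ = p + Δp used above; `phase_correlation` and `eps` are names introduced here, not taken from the poster.

```python
import numpy as np

def phase_correlation(o1, o2, eps=1e-12):
    """Return the normalized cross power spectrum C and the shift dp,
    where o2(p) = o1(p + dp) under a circular-shift assumption."""
    O1, O2 = np.fft.fft2(o1), np.fft.fft2(o2)
    cross = O1 * np.conj(O2)
    C = cross / (np.abs(cross) + eps)   # keep only the phase term
    corr = np.fft.ifft2(C).real         # ideally a single impulse at dp
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices in [0, N) to signed shifts in (-N/2, N/2]
    dp = tuple(int(p) if p <= n // 2 else int(p) - n
               for p, n in zip(peak, corr.shape))
    return C, dp

rng = np.random.default_rng(0)
o1 = rng.random((64, 64))
o2 = np.roll(o1, shift=(-3, 5), axis=(0, 1))  # o2(p) = o1(p + dp), dp = (3, -5)
C, dp = phase_correlation(o1, o2)
print(dp)
```

The T features on the poster correspond to the spectrum `C`; the peak search is only there to make the encoded shift visible.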

Log-polar transform

Fourier
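Why the log-polar step turns rotation and scaling into a translation can be sketched with a short derivation. It assumes, for illustration only, that the second frame is a scaled and rotated copy of the first, $o_2(\mathbf{p}) = o_1(s\mathbf{R}\mathbf{p})$, with scale $s$, a rotation $\mathbf{R}$ by angle $\theta_0$, and polar frequency coordinates $(\rho, \theta)$; these symbols are assumptions introduced here. Any additional translation changes only the phase of the spectrum, so it drops out of the magnitudes used below.

$$
\begin{aligned}
|O_2(\boldsymbol{\nu})| &= s^{-2}\,\bigl|O_1(s^{-1}\mathbf{R}\boldsymbol{\nu})\bigr| \\
|O_2|(\rho, \theta) &= s^{-2}\,|O_1|(\rho / s,\ \theta + \theta_0) \\
M_2(\lambda, \theta) &= s^{-2}\,M_1(\lambda - \log s,\ \theta + \theta_0), \qquad \lambda = \log \rho
\end{aligned}
$$

Here $M_i(\lambda, \theta)$ denotes the magnitude spectrum resampled on a log-polar grid. The pair $(\log s, \theta_0)$ now appears as a shift in $(\lambda, \theta)$, so a second Fourier transform and cross power spectrum recover it in the same way as the translation case.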

Results in simulation

Salient motion > subtle motion
