Energy-Efficient Face Detection Using Andes RISC-V …...Image from Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks, IEEE Signal Processing Letters

Mar 31, 2020

Energy-Efficient Face Detection Using Andes RISC-V Processor

Presenter: Chien-Hao Chen

Advisor: Prof. Chen-Yi Lee

Date: 2018/03/12

Outline

• Introduction

• Face Detector on Andes Processor

• Experiment Result

• Conclusion

• Reference

Outline

• Introduction

− Motivation

− Face Detection Model

• Face Detector on Andes Processor

• Experiment Result

• Conclusion

• Reference

Motivation

• Cloud computing

– Image uploaded to cloud → processing → result returned

• Edge computing

– Image computed directly on the device → processing → result returned

Face Detection Model

MTCNN, 2016 [1]

1. Resize the image and sample it with a sliding window

2. P-Net (Proposal): find candidate bounding boxes

3. R-Net (Refine): reject wrong candidates from P-Net

4. O-Net (Output): from the R-Net candidates, find a more accurate face region

[Figure: P-Net, R-Net and O-Net stages. Image from Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks, IEEE Signal Processing Letters (SPL), vol. 23, no. 10, pp. 1499-1503, 2016]
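The four steps above amount to a filter cascade: each stage only sees the candidates that survived the previous one. A minimal sketch of that control flow follows. The scoring functions are toy stand-ins for the three networks, the O-Net threshold is a made-up value, and bounding-box regression and NMS are left out, so this illustrates the structure only and is not the presenter's implementation.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]                 # (x, y, width, height)
Stage = Tuple[Callable[[Box], float], float]    # (scoring function, threshold)


def run_cascade(candidates: Sequence[Box], stages: Sequence[Stage]) -> List[Box]:
    """Keep only the boxes whose score exceeds the threshold at every stage."""
    survivors = list(candidates)
    for score_fn, threshold in stages:
        survivors = [box for box in survivors if score_fn(box) > threshold]
    return survivors


# Hypothetical scoring functions standing in for P-Net, R-Net and O-Net.
def toy_p_net(box: Box) -> float:
    return 0.9 if box[2] >= 12 else 0.1


def toy_r_net(box: Box) -> float:
    return 0.8 if box[2] >= 24 else 0.2


def toy_o_net(box: Box) -> float:
    return 0.95 if box[2] >= 48 else 0.3


boxes = [(0, 0, 10, 10), (5, 5, 20, 20), (8, 8, 30, 30), (2, 2, 60, 60)]
# 0.6 and 0.7 echo the P-Net and R-Net thresholds used later in the deck;
# the final 0.7 for O-Net is an assumption.
faces = run_cascade(boxes, [(toy_p_net, 0.6), (toy_r_net, 0.7), (toy_o_net, 0.7)])
print(faces)
```

Because the cheap first stage discards most windows, the more expensive later stages run on far fewer candidates.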

Face Detection Model

• P-Net (Proposal):
  − Fully convolutional, with 3 convolution layers and 1 max pooling layer
  − Rough proposals

• R-Net (Refine):
  − 3 convolution layers, 2 max pooling layers and 1 fully connected layer
  − Rejects false proposals from P-Net

• O-Net (Output):
  − 4 convolution layers, 3 max pooling layers and 1 fully connected layer
  − More complicated model
    → Rejects false results from R-Net
    → Better face bounding box position

[Figure: Image from Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks, IEEE Signal Processing Letters (SPL), vol. 23, no. 10, pp. 1499-1503, 2016]

Outline

• Introduction

• Face Detector on Andes Processor

− Hardware environment

− Model Simplification and Acceleration

• Experiment Result

• Conclusion

• Reference

Hardware environment

Andes RISC-V:

− Processor: 64-bit AndesCore, 60 MHz

− FPGA: Xilinx Kintex-7 XC7K410T

− DRAM: 1 GB

− Flash: 64 MB

Outline

• Introduction

• Face Detector on Andes Processor

− Hardware environment

− Model Simplification and Acceleration

• Experiment Result

• Conclusion

• Reference

Model Simplification and Acceleration

Model simplification: depth-wise separable convolution [3]
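To see why a depth-wise separable convolution is cheaper, one can count multiply-accumulate operations (MACs) with the cost model from the MobileNets paper [3]: a standard k×k convolution costs k·k·C_in·C_out per output position, while the separable form costs k·k·C_in (depth-wise) plus C_in·C_out (1×1 point-wise). The layer dimensions below are hypothetical and are not taken from the slides.

```python
def standard_conv_macs(k: int, c_in: int, c_out: int, h_out: int, w_out: int) -> int:
    """MACs of a standard k x k convolution."""
    return k * k * c_in * c_out * h_out * w_out


def separable_conv_macs(k: int, c_in: int, c_out: int, h_out: int, w_out: int) -> int:
    """MACs of a depth-wise k x k convolution followed by a 1 x 1 point-wise one."""
    depthwise = k * k * c_in * h_out * w_out
    pointwise = c_in * c_out * h_out * w_out
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 16 -> 32 channels, 10x10 output map.
std = standard_conv_macs(3, 16, 32, 10, 10)
sep = separable_conv_macs(3, 16, 32, 10, 10)
ratio = sep / std  # equals 1/c_out + 1/k^2
print(std, sep, round(ratio, 4))
```

The ratio 1/C_out + 1/k² shows that, for 3×3 kernels, the saving approaches 9x as the number of output channels grows.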

Model Simplification and Acceleration

Depth-wise MTCNN

• P-Net (Proposal): fully convolutional
  − 1 convolution layer: stride = 2 (channels: 10)
  − 2 DW convolution layers: stride = 1 (channels: 16, 32)

• R-Net (Refine):
  − 1 convolution layer: stride = 2
  − 1 DW convolution layer: stride = 2
  − 1 DW convolution layer: stride = 1
  − 1 fully connected layer

• O-Net (Output):
  − 1 convolution layer: stride = 2
  − 2 DW convolution layers: stride = 2
  − 2 convolution layers: stride = 1 (channels: 128, 128)
  − 1 fully connected layer

Model Simplification and Acceleration

Soft-max approximation: motivation

• Ex: if the P-Net input size is 240 × 320, then output1 size is 115 × 155 × 2 and output2 size is 115 × 155 × 4:

$$H_{out} = \frac{H_{in} - H_{filter} + Padding}{Stride} + 1 = \frac{240 - 12 + 0}{2} + 1 = 115$$

• Soft-max:

$$\sigma\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \dfrac{e^{x}}{e^{x} + e^{y}} \\[2mm] \dfrac{e^{y}}{e^{x} + e^{y}} \end{pmatrix}$$

→ 6 exponentials & 2 divisions

• For the output1 soft-max:
→ 115 × 155 × 6 ≈ 107k exponentials
→ 115 × 155 × 2 ≈ 35.7k divisions
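The output-size formula and the operation counts can be checked with a few lines of arithmetic. This sketch applies the formula exactly as written on the slide (padding added once, integer division by the stride); the function name is illustrative.

```python
def conv_output_size(size_in: int, size_filter: int, padding: int, stride: int) -> int:
    """Output size per the slide: (in - filter + padding) / stride + 1."""
    return (size_in - size_filter + padding) // stride + 1


h_out = conv_output_size(240, 12, 0, 2)
w_out = conv_output_size(320, 12, 0, 2)
positions = h_out * w_out

# The slide counts 6 exponentials and 2 divisions per soft-max evaluation.
exponentials = positions * 6
divisions = positions * 2
print(h_out, w_out, exponentials, divisions)
```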

Model Simplification and Acceleration

Soft-max approximation

• The soft-max output is only compared against a threshold $P$:

$$\sigma\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \dfrac{e^{x}}{e^{x} + e^{y}} \\[2mm] \dfrac{e^{y}}{e^{x} + e^{y}} \end{pmatrix} > threshold(P)$$

$$\frac{e^{x}}{e^{x} + e^{y}} > P$$

$$e^{x} > P e^{x} + P e^{y}$$

$$(1 - P)\, e^{x} > P e^{y}$$

$$\ln(1 - P) + x > \ln P + y$$

$$x > \ln\left(\frac{P}{1 - P}\right) + y$$

where $\ln\left(\frac{P}{1 - P}\right)$ is a constant for a fixed threshold $0 < P < 1$.
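The derivation can be checked numerically: the threshold test on the soft-max probability and the comparison of raw logits against a precomputed constant should agree on every input. The sample logit pairs below are arbitrary values chosen for illustration.

```python
import math


def softmax_face_prob(x: float, y: float) -> float:
    """First soft-max component, e^x / (e^x + e^y)."""
    return math.exp(x) / (math.exp(x) + math.exp(y))


def make_fast_test(p: float):
    """Return a test x > ln(p / (1 - p)) + y with the constant computed once."""
    offset = math.log(p / (1.0 - p))

    def passes(x: float, y: float) -> bool:
        return x > offset + y

    return passes


passes_07 = make_fast_test(0.7)
samples = [(2.0, 0.5), (0.3, 0.1), (-1.0, -3.0), (0.0, 0.0), (1.5, 0.9)]
agree = all(
    (softmax_face_prob(x, y) > 0.7) == passes_07(x, y) for x, y in samples
)
print(agree)
```

The fast test needs one addition and one comparison per position, with no exponential or division at run time.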

Model Simplification and Acceleration

Example with $P = 0.7$:

$$\frac{e^{x}}{e^{x} + e^{y}} = 0.7 \quad\Rightarrow\quad x = \ln\left(\frac{0.7}{1 - 0.7}\right) + y \approx 0.847 + y$$

Outline

• Introduction

• Face Detector on Andes Processor

• Experiment Result

• Conclusion

• Reference

Experiment Result

• On FDDB [4] database:
  − P-Net, R-Net threshold = 0.6, 0.7; min-face = 25x25

| Method | Accuracy @ FPPI 0.01 | Accuracy @ FPPI 0.1 | Accuracy @ FPPI 1.0 | Speedup @ Andes RISC-V Processor |
|---|---|---|---|---|
| MTCNN | 84.95% | 92.40% | 94.66% | - |
| Ours | 82.59% | 88.15% | 90.68% | 106x |

• FPPI: False Positives Per Image

Experiment Result

• On FDDB database:

• FPPI: False Positives Per Image

| Method | Accuracy @ FPPI 1.0 | Speedup @ Andes RISC-V Processor |
|---|---|---|
| MTCNN | 94.66% | - |
| Ours | 90.68% | 106x |

| Method | Accuracy @ FPPI 0.1 | Accuracy @ FPPI 0.01 | FPS (Titan X GPU) / FPS (1080-Ti) |
|---|---|---|---|
| Brodmann17 | 89.25% | 81.88% | 200 / 90 |
| DeepIR | 88.45% | 82.16% | <=1 |
| Xiaomi | 87.82% | 77.99% | 2? |
| Faceness | 86.04% | 79.67% | 1 |
| Hyperface | 85.63% | 80.68% | 0.33 |
| DP2MFD | 85.57% | 76.73% | <0.05 |
| Ours | 88.15% | 82.59% | 54 |

Experiment Result

• On FDDB database:

• Performance without considering faces smaller than 48x48

• P-Net, R-Net threshold = 0.9, 0.85; min-face = 48x48

| Method | Accuracy @ FPPI 0.01 | Accuracy @ FPPI 0.1 |
|---|---|---|
| Ours | 86.64% | 87.7% |

• P-Net, R-Net threshold = 0.6, 0.7; min-face = 48x48

| Method | Accuracy @ FPPI 0.01 | Accuracy @ FPPI 0.1 |
|---|---|---|
| Ours | 90.53% | 93.81% |

Outline

• Introduction

• Face Detector on Andes Processor

• Experiment Result

• Conclusion

• Reference

Conclusion

• Proposed face detection model:
  − Model size: 3.6x smaller
  − Speedup @ Andes processor: 106x faster
  − Accuracy @ FPPI 1.0: 90.68%

References

[1] Zhang, Kaipeng, et al. "Joint face detection and alignment using multitask cascaded convolutional networks." IEEE Signal Processing Letters 23.10 (2016): 1499-1503.

[2] Li, Haoxiang, et al. "A convolutional neural network cascade for face detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.

[3] Howard, Andrew G., et al. "Mobilenets: Efficient convolutional neural networks for mobile vision applications." arXiv preprint arXiv:1704.04861 (2017).

[4] Jain, Vidit, and Erik Learned-Miller. "FDDB: A benchmark for face detection in unconstrained settings." UMass Amherst Technical Report, 2010.

[5] Sun, Xudong, Pengcheng Wu, and Steven CH Hoi. "Face detection using deep learning: An improved faster rcnn approach." Neurocomputing 299 (2018): 42-50.

[6] Jiang, Huaizu, and Erik Learned-Miller. "Face detection with the faster R-CNN." 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017). IEEE, 2017.

[7] Yang, Shuo, et al. "Faceness-net: Face detection through deep facial part responses." IEEE transactions on pattern analysis and machine intelligence 40.8 (2018): 1845-1859.

[8] Ranjan, Rajeev, Vishal M. Patel, and Rama Chellappa. "Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence 41.1 (2019): 121-135.

[9] Ranjan, Rajeev, Vishal M. Patel, and Rama Chellappa. "A deep pyramid deformable part model for face detection." 2015 IEEE 7th International Conference on Biometrics Theory, Applications and Systems (BTAS). IEEE, 2015.

Thank you for listening!

Soft-max with NMS

Model Simplification and Acceleration

Soft-max approximation

• $\dfrac{e^{x}}{e^{x} + e^{y}} > P \;\rightarrow\; x > \ln\left(\dfrac{P}{1 - P}\right) + y$

Soft-max approximation with NMS

• NMS keeps the box with the highest score, so it only needs to compare scores:

$$\frac{e^{x_1}}{e^{x_1} + e^{y_1}} > \frac{e^{x_2}}{e^{x_2} + e^{y_2}}$$

$$\rightarrow\; e^{x_1}\left(e^{x_2} + e^{y_2}\right) > e^{x_2}\left(e^{x_1} + e^{y_1}\right)$$

$$\rightarrow\; e^{x_1} \cdot e^{x_2} + e^{x_1} \cdot e^{y_2} > e^{x_2} \cdot e^{x_1} + e^{x_2} \cdot e^{y_1}$$

$$\rightarrow\; e^{x_1 + y_2} > e^{x_2 + y_1}$$

$$\rightarrow\; x_1 + y_2 > x_2 + y_1$$

$$\rightarrow\; x_1 - y_1 > x_2 - y_2$$

• Speedup: 1.43x faster
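The last line of the derivation says that ranking boxes by the logit difference x − y gives the same order as ranking by soft-max score. A small check, using arbitrary example logits rather than values from the slides:

```python
import math


def softmax_score(x: float, y: float) -> float:
    """First soft-max component, e^x / (e^x + e^y)."""
    return math.exp(x) / (math.exp(x) + math.exp(y))


# Hypothetical (x, y) logit pairs for four candidate boxes.
logits = [(1.2, 0.3), (0.4, -2.0), (3.0, 2.5), (-0.5, -0.1)]

by_softmax = sorted(
    range(len(logits)), key=lambda i: softmax_score(*logits[i]), reverse=True
)
by_difference = sorted(
    range(len(logits)), key=lambda i: logits[i][0] - logits[i][1], reverse=True
)
print(by_softmax, by_difference)
```

Since NMS only needs the order of the scores, the subtraction can replace the soft-max entirely in that step.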

Computational Complexity

Experiment Result

Model operation complexity comparison

Original MTCNN

| Network | Input size | MAC number |
|---|---|---|
| P-Net | 12x12 | 44.76K |
| P-Net* | 120x160 | 55x75x44.76K = 184.6M |
| R-Net | 24x24 | 1.531M |
| O-Net | 48x48 | 12.91M |

Ours

| Network | Input size | MAC number |
|---|---|---|
| P-Net | 12x12 | 7.872K |
| P-Net* | 120x160 | 55x75x7.872K = 32.47M |
| R-Net | 24x24 | 319.3K |
| O-Net | 48x48 | 2.267M |

*: P-Net applied to a full 120x160 image rather than to a single 12x12 block.
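The P-Net* rows follow from counting the 12x12 windows in a 120x160 image. The sketch below assumes a window stride of 2, which reproduces the 55x75 grid in the table and matches the stride in the output-size formula earlier in the deck; the function name is illustrative.

```python
def sliding_positions(size_in: int, window: int = 12, stride: int = 2) -> int:
    """Number of window positions along one image dimension."""
    return (size_in - window) // stride + 1


rows = sliding_positions(120)
cols = sliding_positions(160)

original_macs = rows * cols * 44.76e3   # original MTCNN P-Net, per-window MACs from the table
ours_macs = rows * cols * 7.872e3       # simplified P-Net, per-window MACs from the table
reduction = original_macs / ours_macs
print(rows, cols, original_macs, ours_macs, round(reduction, 2))
```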

Quantization

Experiment Result

Model size comparison

Original MTCNN

| Network | Data type | Model size (bytes) |
|---|---|---|
| P-Net | float32 | 26.04K |
| R-Net | float32 | 398.5K |
| O-Net | float32 | 1.542M |
| Total | | 1.966M |

Ours

| Network | Data type | Model size (bytes) |
|---|---|---|
| P-Net | int8 | 1.088K |
| R-Net | int8 | 137.4K |
| O-Net | int8 | 402.6K |
| Total | | 541.2K |

Quantization Result

• On FDDB database:

| Word length | Accuracy @ FPPI 0.1 |
|---|---|
| Original MTCNN | 92.40% |
| Ours (float32) | 88.20% |
| Ours (int8) | 88.15% |

• FPPI: False Positives Per Image

Quantization Method (Andes DSP)

• Weight quantization

$$\text{shift number} = 7 - \operatorname{ceil}\left(\log_2\left(\max\left(\left|\text{weight}_{\min}\right|, \left|\text{weight}_{\max}\right|\right)\right)\right)$$

$$\text{shifted weights} = \operatorname{round\ down}\left(\text{old weights} \times 2^{\text{shift number}}\right)$$

$$\text{shifted weights}\left[\text{shifted weights} > 126\right] = 127$$

$$\text{shifted weights}\left[\text{shifted weights} < -127\right] = -128$$

$$\text{final weights} = \text{shifted weights} \div 2^{\text{shift number}}$$

Quantization Method (Andes DSP)

• Layer output quantization

$$\text{shift number} = 7 - \operatorname{ceil}\left(\log_2\left(\max\left(\left|\text{layer output}_{\min}\right|, \left|\text{layer output}_{\max}\right|\right)\right)\right)$$

while (shift_start):

    $\text{output} = \operatorname{round\ down}\left(\text{output} \times 2^{\text{shift number}}\right)$

    $\text{output}\left[\text{output} > 126\right] = 127$

    $\text{output}\left[\text{output} < -127\right] = -128$

    $\text{final output} = \text{output} \div 2^{\text{shift number}}$

    $\text{shift number} \mathrel{+}= 1$

Example (Andes DSP)

• output = [−4, −0.24, −0.20, …, 0.19, 0.23, 4]

• shift number: 7 − log2(4) = 5

• original quantization (shift 5) = [−4, −0.25, −0.1875, …, 0.1875, 0.21875, 3.96875]

• corrected quantization (shift 6) = [−2, −0.234375, −0.203125, …, 0.1875, 0.234375, 1.984375] → more precise
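The example values can be reproduced with a short sketch of the shift, saturate and rescale steps. One caveat: the formulas in the deck say "round down", but the numbers listed in this example agree with round-to-nearest (for instance −0.20 × 32 = −6.4 becomes −6, giving −0.1875), so the sketch assumes round-to-nearest. Function names are illustrative, and the elided middle values ("…") are simply left out.

```python
import math


def shift_number(values):
    """7 - ceil(log2(largest magnitude)), following the deck's formula."""
    peak = max(abs(min(values)), abs(max(values)))
    return 7 - math.ceil(math.log2(peak))


def quantize(values, shift):
    """Scale by 2**shift, round, saturate to the int8 range, then scale back."""
    scale = 2 ** shift
    result = []
    for value in values:
        # Assumption: round to nearest, which matches the example's numbers.
        level = math.floor(value * scale + 0.5)
        level = max(-128, min(127, level))  # int8 saturation
        result.append(level / scale)
    return result


output = [-4, -0.24, -0.20, 0.19, 0.23, 4]
shift = shift_number(output)
original = quantize(output, shift)
corrected = quantize(output, shift + 1)
print(shift, original, corrected)
```

With the larger shift the small values are represented on a finer grid, while the two extreme values saturate at −2 and 1.984375.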

Speed-up each step

Experiment Result

• On FDDB database:

• FPPI: False Positives Per Image

| Method | Accuracy @ FPPI 1.0 | Speedup @ Andes RISC-V Processor |
|---|---|---|
| Ori-MTCNN | 94.66% | - |
| Ours | 90.68% | 106x |

| Method | Accuracy @ FPPI 0.1 | Accuracy @ FPPI 0.01 | FPS (Titan X GPU) / FPS (1080-Ti) |
|---|---|---|---|
| Brodmann17 | 89.25% | 81.88% | 200 / 90 |
| DeepIR | 88.45% | 82.16% | <=1 |
| Xiaomi | 87.82% | 77.99% | 2? |
| Faceness | 86.04% | 79.67% | 1 |
| Hyperface | 85.63% | 80.68% | 0.33 |
| DP2MFD | 85.57% | 76.73% | <0.05 |
| MTCNN | 92.40% | 84.95% | 51 |
| Ours | 88.15% | 82.59% | 54 |

Times are in seconds (FPS = 1 / overall time); "normalize" rows give the stage time divided by its trigger count.

| Step | Baseline | Sim#1 | Fast soft-max | DSP-Sim#1 | DSP-Sim#2 |
|---|---|---|---|---|---|
| Overall | 294.0129 | 99.81 | 53.69 | 3.88 | 2.78 |
| Overall Speedup | - | 2.95 | 1.86 | 13.84 | 1.397 |
| FPS | 0.0034 | 0.01002 | 0.01863 | 0.25776 | 0.3601 |
| P-Net Overall time | 97.25 | 77.2 | 31.2 | 1.54 | 1.18 |
| P-Net Overall speedup | - | 1.26 | 2.47 | 20.30 | 1.30 |
| R-Net Overall time | 59.08 | 6.158 | 6.028 | 0.989 | 0.628 |
| R-Net Trigger Times | 46 | 22 | 22 | 32 | 29 |
| R-Net normalize | 1.28 | 0.28 | 0.274 | 0.0309 | 0.022 |
| R-Net normalize speedup | - | 4.59 | 1.02 | 8.87 | 1.43 |
| O-Net Overall time | 132.19 | 15.034 | 15.004 | 1.35 | 0.96 |
| O-Net Trigger Times | 14 | 9 | 9 | 8 | 9 |
| O-Net normalize | 9.44 | 1.67 | 1.67 | 0.17 | 0.107 |
| O-Net normalize speedup | - | 5.65 | 1.002 | 9.9 | 1.57 |
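The per-step speedups multiply up to the headline figure. A short check using the rounded "Overall" times from the table (so the last digits differ slightly from the full-precision values):

```python
steps = ["Baseline", "Sim#1", "Fast soft-max", "DSP-Sim#1", "DSP-Sim#2"]
overall_seconds = [294.0129, 99.81, 53.69, 3.88, 2.78]

# Speedup of each step relative to the step before it.
step_speedups = [
    prev / cur for prev, cur in zip(overall_seconds, overall_seconds[1:])
]
total_speedup = overall_seconds[0] / overall_seconds[-1]
fps = [1.0 / t for t in overall_seconds]

for name, gain in zip(steps[1:], step_speedups):
    print(f"{name}: {gain:.2f}x over the previous step")
print(f"Total: {total_speedup:.0f}x")
```

The total rounds to the 106x reported in the conclusion, and most of it comes from the DSP-Sim#1 step.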

| Step | Baseline | Sim#1 | Fast soft-max | DSP-Sim#1 | DSP-Sim#2 |
|---|---|---|---|---|---|
| Overall | 294.0129 | 99.8111858 | 53.687959 | 3.879538 | 2.777296 |
| Overall Speedup | - | 2.94569088 | 1.8590982 | 13.8388 | 1.396875954 |
| FPS | 0.003401210740668638 | 0.010018917134550104 | 0.018626150268485554 | 0.25776 | 0.360062449 |
| P-Net Overall time | 97.248312473297119 | 77.170741379261017 | 31.195177435874939 | 1.536423 | 1.180413 |
| P-Net Overall speedup | - | 1.26017077 | 2.4738036 | 20.3038 | 1.301597831 |
| R-Net Overall time | 59.077883005142212 | 6.1582962274551392 | 6.0284666419029236 | 0.988531 | 0.627762 |
| R-Net Trigger Times | 46 | 22 | 22 | 32 | 29 |
| R-Net normalize | 1.284302 | 0.27992256 | 0.2740212 | 0.03089 | 0.021646966 |
| R-Net normalize speedup | - | 4.58806178 | 1.0215361 | 8.87087 | 1.426989815 |
| O-Net Overall time | 132.18732833862305 | 15.033685207366943 | 15.003592789173126 | 1.345193 | 0.961341 |
| O-Net Trigger Times | 14 | 9 | 9 | 8 | 9 |
| O-Net normalize | 9.441952 | 1.67040947 | 1.6670659 | 0.16815 | 0.106815667 |
| O-Net normalize speedup | - | 5.65247753 | 1.0020057 | 9.91416 | 1.574207274 |