LOSSLESS IMAGE COMPRESSION USING REVERSIBLE INTEGER WAVELET TRANSFORMS AND CONVOLUTIONAL NEURAL NETWORKS

by Eze Ahanonu

Copyright © Eze Ahanonu 2018

A Thesis Submitted to the Faculty of the DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING In Partial Fulfillment of the Requirements For the Degree of MASTER OF SCIENCE In the Graduate College

THE UNIVERSITY OF ARIZONA

2018
This thesis has been submitted in partial fulfillment of the requirements for an advanced degree at the University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.
Brief quotations from this thesis are allowable without special permission, provided that accurate acknowledgment of the source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College
when in his or her judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.
SIGNED:
Eze Ahanonu
APPROVAL BY THESIS DIRECTOR
This thesis has been approved on the date shown below:
ACKNOWLEDGEMENTS
First, I would like to thank Dr. Ali Bilgin, who inspired me to pursue a graduate education, as well as provided me with outstanding feedback and support throughout my master's work.

I am also very grateful for the support of Dr. Michael Marcellin, who played a crucial role in my graduate success. Additionally, I would like to thank Dr. Amit Ashok for agreeing to be a member of my thesis committee.

I highly value my various discussions with Dr. Feng Liu, Dr. Miguel Hernández, and Yuzhang Lin, which have both expanded and refined my understanding of several key concepts.

I would like to thank Maria Teresa Velez, Shetara OliwoOlabode, and Jim Field for the unconditional funding and professional development resources they have provided during my master's.

I would like to thank Tami Whelan and Diana Wilson for their guidance and patience while ensuring I satisfied the various requirements to complete my degrees.

I would like to thank my friends for keeping me company over the years, in particular Corey Zammit, Matt Konen, Robert Blair, Otis Blank, and Shayan Milani. Additionally, I greatly appreciate the support of Savanna Weninger, who kept me motivated while completing my thesis.

Finally, I would like to thank my family. Throughout the years they have given me endless support, and have always believed in me more than I believed in myself.
… dataset.
4.7 Network loss over training epochs for networks trained on the first level of decomposition.
4.8 Network loss over training epochs for networks trained on the second level of decomposition.
4.9 Entropy reductions over training epochs for networks trained on the first level of decomposition.
4.10 Entropy reduction over training epochs for networks trained on the second level of decomposition.
4.11 Distribution of wavelet coefficients within the first two levels of wavelet decomposition.
4.12 Distributions of entropy reduction over the natural dataset for subbands in the 1st decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
4.13 Distributions of entropy reduction over the natural dataset for subbands in the 2nd decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
4.14 Distributions of entropy reduction over the pathology dataset for subbands in the 1st decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
4.15 Distributions of entropy reduction over the pathology dataset for subbands in the 2nd decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
4.16 Distributions of entropy reduction over the graphics dataset for subbands in the 1st decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
4.17 Distributions of entropy reduction over the graphics dataset for subbands in the 2nd decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
4.18 Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.19 Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.20 Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.21 Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.22 Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.23 Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.24 Demonstration of the non-shift invariance of the DWT by looking at the DWT of two image crops which only differ by a 1 pixel shift in the row dimension.
LIST OF FIGURES – Continued
4.25 Example prediction from the pathology dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.26 Example prediction from the pathology dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.27 Example prediction from the pathology dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.28 Example prediction from the pathology dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.29 Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.30 Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.31 Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.32 Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.33 Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
4.34 Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
provements (+0.14 bpc); interestingly, a performance reduction of -0.04 bpc is seen. We point out that the results in Figures 4.8 and 4.10 indicate that the networks used for graphics prediction were not fully converged after 40 training epochs. As a result, these values may not completely reflect the optimal performance for the network configuration.
The benefit of the SBE procedure is evident when comparing the distributions before and after its application. The SBE procedure prevents any residual blocks from being coded which would result in an increase in bitrate. As a result, the average bitrate across an image is at most equal to that of the original coefficients. This benefit is seen most prominently in the results for the one-to-one and one-to-many models, though there are cases in which the many-to-one models also benefit. The benefit of the SBE procedure may not be fully appreciated when only considering the average entropy reduction; in Tables 4.3 - 4.5, the minimum and maximum entropy reductions are also provided (shown as [minimum, maximum]). In a majority of cases, the minimum reduction is negative, indicating an increase in bitrate, while after SBE the minimum reduction is 0 bpc.
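The clipping effect described above can be pictured with a short sketch. Here we assume SBE amounts to a per-block choice between coding the residual and falling back to the original coefficients, with first-order entropy as the cost; the function names and toy data below are our own, not from the thesis implementation.

```python
import numpy as np

def entropy_bpc(block):
    """First-order entropy of an integer block, in bits per coefficient (bpc)."""
    _, counts = np.unique(block, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def sbe_select(original, residual):
    """Keep the residual only if it codes more cheaply than the original block.

    Returns the chosen block and the resulting entropy reduction, which is
    clipped at 0 bpc by construction.
    """
    h_orig, h_res = entropy_bpc(original), entropy_bpc(residual)
    if h_res < h_orig:
        return residual, h_orig - h_res
    return original, 0.0

rng = np.random.default_rng(0)
orig = rng.integers(-8, 8, size=(32, 32))        # toy wavelet block
good_res = (orig * 0.25).astype(int)             # residual with lower entropy
bad_res = rng.integers(-64, 64, size=(32, 32))   # residual that would inflate bitrate

_, gain = sbe_select(orig, good_res)   # residual kept, positive reduction
_, loss = sbe_select(orig, bad_res)    # original kept, reduction clipped to 0
```

With this selection rule, an "entropy reduction" below zero can never occur, which is why the post-SBE minima in Tables 4.3 - 4.5 are all 0 bpc.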
Table 4.3: Entropy Reduction (bpc) - Natural. Values are shown as mean [minimum, maximum].

Input         | Output             | Original | L1                 | L1+SBE            | L2                 | L2+SBE
LL1           | HL1                | 3.50     | 0.23 [−0.12, 0.94] | 0.24 [0, 0.94]    | 0.21 [−0.27, 0.95] | 0.23 [0, 0.95]
LL1, LH1      | HL1                | 3.50     | 0.42 [−0.06, 1.06] | 0.42 [0.02, 1.06] | 0.41 [−0.06, 0.98] | 0.41 [0.01, 0.98]
LL1           | HL1 (One-to-Many)  | 3.50     | 0.22 [−0.19, 0.89] | 0.22 [0, 0.89]    | 0.20 [−0.31, 0.84] | 0.22 [0, 0.84]
LL1           | LH1                | 3.57     | 0.25 [−0.15, 1.16] | 0.25 [0, 1.16]    | 0.25 [−0.12, 1.2]  | 0.25 [0, 1.2]
LL1, HL1      | LH1                | 3.57     | 0.43 [−0.14, 1.22] | 0.43 [0.02, 1.22] | 0.44 [−0.14, 1.27] | 0.44 [0.02, 1.27]
LL1           | LH1 (One-to-Many)  | 3.57     | 0.24 [−0.1, 1.14]  | 0.24 [0, 1.14]    | 0.21 [−0.23, 1.07] | 0.23 [0, 1.07]
LL1           | HH1                | 2.95     | 0.13 [−0.63, 0.69] | 0.16 [0, 0.69]    | 0.13 [−0.61, 0.68] | 0.16 [0, 0.68]
LL1, HL1, LH1 | HH1                | 2.95     | 0.43 [−0.07, 1.03] | 0.43 [0, 1.03]    | 0.44 [−0.05, 1.05] | 0.44 [0, 1.05]
LL1           | HH1 (One-to-Many)  | 2.95     | 0.12 [−0.57, 0.65] | 0.16 [0, 0.65]    | 0.11 [−0.53, 0.55] | 0.13 [0, 0.55]
LL2           | HL2                | 4.39     | 0.12 [−0.01, 0.58] | 0.12 [0, 0.58]    | 0.11 [−0.03, 0.53] | 0.11 [0, 0.53]
LL2, LH2      | HL2                | 4.39     | 0.12 [−0.16, 0.60] | 0.13 [0, 0.60]    | 0.12 [−0.14, 0.56] | 0.12 [0.12, 0.56]
LL2           | HL2 (One-to-Many)  | 4.39     | 0.11 [−0.01, 0.58] | 0.12 [0, 0.58]    | 0.09 [−0.41, 0.46] | 0.10 [0, 0.46]
LL2           | LH2                | 4.49     | 0.10 [−0.36, 0.54] | 0.11 [0, 0.54]    | 0.12 [−0.36, 0.52] | 0.12 [0, 0.52]
LL2, HL2      | LH2                | 4.49     | 0.13 [−0.24, 0.65] | 0.13 [0, 0.65]    | 0.12 [−0.29, 0.55] | 0.12 [0, 0.55]
LL2           | LH2 (One-to-Many)  | 4.49     | 0.12 [−0.30, 0.60] | 0.12 [0, 0.60]    | 0.10 [−0.36, 0.47] | 0.10 [0, 0.47]
LL2           | HH2                | 4.33     | 0.03 [−0.10, 0.30] | 0.03 [0, 0.30]    | 0.03 [−0.19, 0.25] | 0.03 [0, 0.25]
LL2, HL2, LH2 | HH2                | 4.33     | 0.12 [−0.01, 0.66] | 0.12 [0, 0.66]    | 0.09 [−0.39, 0.65] | 0.11 [0, 0.65]
LL2           | HH2 (One-to-Many)  | 4.33     | 0.03 [−0.1, 0.37]  | 0.03 [0, 0.37]    | 0.02 [−0.35, 0.25] | 0.02 [0, 0.26]
Table 4.4: Entropy Reduction (bpc) - Pathology. Values are shown as mean [minimum, maximum].

Input         | Output             | Original | L1                 | L1+SBE         | L2                 | L2+SBE
LL1           | HL1                | 3.09     | 0.22 [−0.01, 0.77] | 0.23 [0, 0.77] | 0.23 [−0.09, 0.77] | 0.23 [0, 0.77]
LL1, LH1      | HL1                | 3.09     | 0.25 [−0.04, 0.81] | 0.26 [0, 0.81] | 0.26 [−0.04, 0.81] | 0.26 [0, 0.81]
LL1           | HL1 (One-to-Many)  | 3.09     | 0.22 [−0.11, 0.76] | 0.23 [0, 0.76] | 0.22 [−0.06, 0.76] | 0.22 [0, 0.76]
LL1           | LH1                | 3.10     | 0.27 [−0.02, 0.80] | 0.27 [0, 0.80] | 0.28 [−0.01, 0.80] | 0.28 [0, 0.8]
LL1, HL1      | LH1                | 3.10     | 0.31 [0, 0.86]     | 0.32 [0, 0.86] | 0.32 [0, 0.87]     | 0.32 [0, 0.87]
LL1           | LH1 (One-to-Many)  | 3.10     | 0.27 [−0.01, 0.79] | 0.27 [0, 0.79] | 0.27 [−0.02, 0.79] | 0.27 [0, 0.79]
LL1           | HH1                | 3.28     | 0.02 [−0.01, 0.09] | 0.02 [0, 0.09] | 0.02 [−0.04, 0.09] | 0.02 [0, 0.09]
LL1, HL1, LH1 | HH1                | 3.28     | 0.14 [0, 0.24]     | 0.14 [0, 0.24] | 0.14 [0, 0.25]     | 0.14 [0, 0.25]
LL1           | HH1 (One-to-Many)  | 3.28     | 0.02 [0, 0.09]     | 0.02 [0, 0.09] | 0.02 [−0.02, 0.09] | 0.02 [0, 0.09]
LL2           | HL2                | 3.53     | 0.16 [−0.31, 0.50] | 0.17 [0, 0.50] | 0.16 [−0.03, 0.51] | 0.17 [0, 0.51]
LL2, LH2      | HL2                | 3.53     | 0.16 [−0.15, 0.55] | 0.17 [0, 0.55] | 0.15 [−0.43, 0.57] | 0.18 [0, 0.57]
LL2           | HL2 (One-to-Many)  | 3.53     | 0.15 [−0.02, 0.48] | 0.15 [0, 0.48] | 0.17 [−0.01, 0.51] | 0.17 [0, 0.51]
LL2           | LH2                | 3.54     | 0.17 [−0.06, 0.49] | 0.18 [0, 0.49] | 0.18 [−0.01, 0.49] | 0.18 [0, 0.49]
LL2, HL2      | LH2                | 3.54     | 0.21 [−0.02, 0.59] | 0.21 [0, 0.59] | 0.20 [−0.03, 0.59] | 0.20 [0, 0.59]
LL2           | LH2 (One-to-Many)  | 3.54     | 0.17 [−0.01, 0.48] | 0.17 [0, 0.48] | 0.18 [−0.01, 0.52] | 0.18 [0, 0.52]
LL2           | HH2                | 3.70     | 0.06 [−0.05, 0.20] | 0.06 [0, 0.20] | 0.06 [−0.01, 0.19] | 0.06 [0, 0.19]
LL2, HL2, LH2 | HH2                | 3.70     | 0.16 [−0.03, 0.48] | 0.16 [0, 0.48] | 0.17 [−0.01, 0.49] | 0.17 [0, 0.49]
LL2           | HH2 (One-to-Many)  | 3.70     | 0.03 [−0.22, 0.15] | 0.04 [0, 0.15] | 0.06 [−0.01, 0.20] | 0.06 [0, 0.20]
Table 4.5: Entropy Reduction (bpc) - Graphics. Values are shown as mean [minimum, maximum].

Input         | Output             | Original | L1                 | L1+SBE         | L2                 | L2+SBE
LL1           | HL1                | 0.53     | 0.07 [−0.37, 2.15] | 0.10 [0, 2.15] | 0.15 [−0.35, 2.46] | 0.15 [0, 2.46]
LL1, LH1      | HL1                | 0.53     | 0.20 [−0.21, 2.62] | 0.21 [0, 2.62] | 0.16 [−0.30, 2.48] | 0.17 [0, 2.48]
LL1           | HL1 (One-to-Many)  | 0.53     | 0.19 [−0.28, 2.60] | 0.19 [0, 2.60] | 0.12 [−0.48, 2.04] | 0.13 [0, 2.04]
LL1           | LH1                | 0.55     | 0.17 [−0.18, 2.50] | 0.18 [0, 2.50] | 0.16 [−0.34, 2.39] | 0.16 [0, 2.39]
LL1, HL1      | LH1                | 0.55     | 0.19 [−0.21, 2.55] | 0.19 [0, 2.55] | 0.18 [−0.27, 2.53] | 0.18 [0, 2.53]
LL1           | LH1 (One-to-Many)  | 0.55     | 0.18 [−0.27, 2.40] | 0.18 [0, 2.40] | 0.13 [−0.46, 1.94] | 0.13 [0, 1.94]
LL1           | HH1                | 0.45     | 0.10 [−0.12, 1.71] | 0.10 [0, 1.71] | 0.08 [−0.18, 1.48] | 0.08 [0, 1.48]
LL1, HL1, LH1 | HH1                | 0.45     | 0.13 [−0.08, 2.19] | 0.13 [0, 2.19] | 0.17 [−0.13, 2.23] | 0.17 [0, 2.23]
LL1           | HH1 (One-to-Many)  | 0.45     | 0.12 [−0.27, 2.15] | 0.12 [0, 2.15] | 0.04 [−0.66, 1.25] | 0.06 [0, 1.27]
LL2           | HL2                | 0.94     | 0.17 [−0.33, 2.49] | 0.18 [0, 2.49] | 0.06 [−0.45, 2.34] | 0.12 [0, 2.34]
LL2, LH2      | HL2                | 0.94     | 0.19 [−0.36, 2.86] | 0.19 [0, 2.86] | 0.16 [−0.45, 2.44] | 0.18 [0, 2.44]
LL2           | HL2 (One-to-Many)  | 0.94     | 0.16 [−0.27, 2.32] | 0.16 [0, 2.32] | 0.15 [−0.91, 2.36] | 0.16 [0, 2.36]
LL2           | LH2                | 0.96     | 0.22 [−0.30, 2.35] | 0.22 [0, 2.35] | 0.14 [−0.37, 2.04] | 0.15 [0, 2.04]
LL2, HL2      | LH2                | 0.96     | 0.18 [−0.34, 2.7]  | 0.19 [0, 2.7]  | 0.18 [−0.34, 2.59] | 0.18 [0, 2.59]
LL2           | LH2 (One-to-Many)  | 0.96     | 0.15 [−0.29, 2.39] | 0.15 [0, 2.39] | 0.14 [−0.43, 2.28] | 0.16 [0, 2.28]
LL2           | HH2                | 0.86     | 0.05 [−0.57, 2.14] | 0.10 [0, 2.14] | 0.05 [−0.62, 2.06] | 0.08 [0, 2.06]
LL2, HL2, LH2 | HH2                | 0.86     | 0.19 [−0.40, 2.69] | 0.19 [0, 2.69] | 0.03 [−0.67, 3.02] | 0.12 [0, 3.02]
LL2           | HH2 (One-to-Many)  | 0.86     | 0.06 [−0.68, 2.04] | 0.08 [0, 2.04] | 0.06 [−1.07, 2.21] | 0.09 [0, 2.21]
[Figure: distributions of entropy reduction (bpc) for the Natural HL1, LH1, and HH1 subbands, with and without SBE; curves for the One-to-One, Many-to-One, and One-to-Many models.]
Figure 4.12: Distributions of entropy reduction over the natural dataset for subbands in the 1st decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
[Figure: distributions of entropy reduction (bpc) for the Natural HL2, LH2, and HH2 subbands, with and without SBE; curves for the One-to-One, Many-to-One, and One-to-Many models.]
Figure 4.13: Distributions of entropy reduction over the natural dataset for subbands in the 2nd decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
[Figure: distributions of entropy reduction (bpc) for the Pathology HL1, LH1, and HH1 subbands, with and without SBE; curves for the One-to-One, Many-to-One, and One-to-Many models.]
Figure 4.14: Distributions of entropy reduction over the pathology dataset for subbands in the 1st decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
[Figure: distributions of entropy reduction (bpc) for the Pathology HL2, LH2, and HH2 subbands, with and without SBE; curves for the One-to-One, Many-to-One, and One-to-Many models.]
Figure 4.15: Distributions of entropy reduction over the pathology dataset for subbands in the 2nd decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
[Figure: distributions of entropy reduction (bpc) for the Graphics HL1, LH1, and HH1 subbands, with and without SBE; curves for the One-to-One, Many-to-One, and One-to-Many models.]
Figure 4.16: Distributions of entropy reduction over the graphics dataset for subbands in the 1st decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
[Figure: distributions of entropy reduction (bpc) for the Graphics HL2, LH2, and HH2 subbands, with and without SBE; curves for the One-to-One, Many-to-One, and One-to-Many models.]
Figure 4.17: Distributions of entropy reduction over the graphics dataset for subbands in the 2nd decomposition level. Solid and dashed lines represent networks trained with L1 and L2 loss, respectively.
4.3.1 CNN Prediction Examples

In this section, several visual examples are given to demonstrate the efficacy of the proposed CNN prediction method. For simplicity, we only show results for networks which were trained using L1 loss on D1. Within each figure, the entropies of the original coefficients and prediction residuals are provided for comparison.
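As context for the residual entropies quoted in these figures, the sketch below shows one way an integer residual can be formed from a real-valued CNN prediction so that the original coefficients remain exactly recoverable at the decoder. The rounding convention is an assumption for illustration, not necessarily the one used in the thesis.

```python
import numpy as np

def prediction_residual(coeffs, prediction):
    """Integer residual between original subband coefficients and a
    real-valued CNN prediction. Rounding the prediction to integers
    keeps the mapping invertible, so the decoder (which can form the
    same prediction) recovers the coefficients losslessly."""
    pred_int = np.round(prediction).astype(coeffs.dtype)
    return coeffs - pred_int

coeffs = np.array([[5, -3], [0, 2]], dtype=np.int32)   # toy subband block
pred = np.array([[4.6, -2.9], [0.2, 1.4]])             # hypothetical CNN output

res = prediction_residual(coeffs, pred)
# Decoder side: add the rounded prediction back to the residual.
recovered = res + np.round(pred).astype(np.int32)
```

When the prediction is accurate, the residual concentrates near zero, which is what lowers its first-order entropy relative to the original coefficients.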
In Figures 4.18 through 4.23, we look at examples from the natural dataset. In general, we see that all networks are capable of effectively recovering the significant structural information within a subband. Noticeable differences in prediction performance occur when looking at the ability to correctly predict signs, as well as to capture subtle textural details contained within small magnitude coefficients.
In Figures 4.18 through 4.20, incorrect sign prediction by the one-to-one and one-to-many models results in poor predictions and low entropy reductions, while the many-to-one model produces good sign predictions and high entropy reductions. Issues related to sign prediction appear to occur primarily in the HH1 subband. We note that the nature of this incorrect sign prediction is largely uniform over a given subband prediction, where nearly all predicted values may have the opposite sign of the original coefficients. This may be attributed to the non-shift invariance of the DWT, where shifting a signal can result in completely different subband coefficients. This phenomenon is demonstrated in Figure 4.24. In this example, two crops are taken from the same image, the only difference being a 1 pixel shift in the row dimension during cropping. While the crops are virtually identical, the resulting subbands after decomposition show prominent structural, as well as sign, differences. In the many-to-one model, having access to information within neighboring detail subbands may provide contextual information that allows the network to better compensate for this effect.
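This shift sensitivity is easy to reproduce on a 1-D signal. The sketch below uses the Haar analysis filters for simplicity (any critically sampled DWT behaves similarly); the function name and toy signal are our own.

```python
import numpy as np

def haar_detail(x):
    """Detail (highpass) coefficients of a single-level Haar DWT:
    differences of non-overlapping sample pairs, scaled by 1/sqrt(2)."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] - x[1::2]) / np.sqrt(2)

signal = np.array([0, 0, 4, 4, 0, 0, 4, 4], dtype=float)
shifted = np.roll(signal, 1)  # the same signal, delayed by one sample

d0 = haar_detail(signal)   # all pairs are constant, so details are zero
d1 = haar_detail(shifted)  # the shift re-pairs the samples: details change completely
```

Because the transform is critically sampled, a one-sample shift changes which samples are paired by the analysis filters, so even the signs of the detail coefficients can flip, mirroring the behavior seen between the two crops in Figure 4.24.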
In Figures 4.21 through 4.23, the ability of the many-to-one model to effectively
recover small magnitude coefficients is demonstrated. This is most apparent in
Figure 4.23, in which substantial improvements are seen across all subbands.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 3.42, LH1 2.94, HH1 2.95; One-to-One residual: HL1 3.28, LH1 2.92, HH1 3.12; One-to-Many residual: HL1 3.29, LH1 2.9, HH1 3.1; Many-to-One residual: HL1 3.11, LH1 2.71, HH1 2.46.]
Figure 4.18: Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 4.73, LH1 4.44, HH1 4.18; One-to-One residual: HL1 3.73, LH1 3.77, HH1 4.54; One-to-Many residual: HL1 3.82, LH1 3.73, HH1 4.52; Many-to-One residual: HL1 3.57, LH1 3.66, HH1 3.53.]
Figure 4.19: Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 3.61, LH1 3.44, HH1 3.34; One-to-One residual: HL1 3.27, LH1 3.32, HH1 3.6; One-to-Many residual: HL1 3.31, LH1 3.31, HH1 3.58; Many-to-One residual: HL1 3.04, LH1 3.07, HH1 2.68.]
Figure 4.20: Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 2.93, LH1 3.22, HH1 2.34; One-to-One residual: HL1 2.79, LH1 2.95, HH1 2.22; One-to-Many residual: HL1 2.78, LH1 2.92, HH1 2.21; Many-to-One residual: HL1 2.64, LH1 2.8, HH1 2.09.]
Figure 4.21: Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 5.51, LH1 5.35, HH1 4.83; One-to-One residual: HL1 5.12, LH1 5.04, HH1 4.63; One-to-Many residual: HL1 5.11, LH1 5.04, HH1 4.6; Many-to-One residual: HL1 4.85, LH1 4.78, HH1 4.15.]
Figure 4.22: Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 3.53, LH1 3.96, HH1 3.18; One-to-One residual: HL1 3.17, LH1 3.49, HH1 2.73; One-to-Many residual: HL1 3.16, LH1 3.48, HH1 2.78; Many-to-One residual: HL1 3.07, LH1 3.42, HH1 2.53.]
Figure 4.23: Example prediction from the natural dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: four panels — Image A, DWT(Image A), Image B, DWT(Image B).]
Figure 4.24: Demonstration of the non-shift invariance of the DWT by looking at the DWT of two image crops which only differ by a 1 pixel shift in the row dimension.
Figures 4.25 through 4.28 give prediction examples from the pathology dataset. The results in Table 4.4 implied that little benefit may be seen by using the many-to-one prediction model on pathology images, except on the HH1 subband. We see this in the examples provided below, where in all cases the many-to-one model produced nearly identical predictions to the one-to-one and one-to-many models on HL1 and LH1.

The examples in Figures 4.25 through 4.27 demonstrate the noise contamination in HH1 mentioned in Section 4.1.3. While the network does not learn the noise, it does learn the banding within the noise; this banding is likely the result of a scanning procedure used to capture the images. By learning this banding, the network is able to achieve modest entropy reductions. In Figure 4.28 we see an example in which structure does exist within the HH1 subband. All networks do a poor job of predicting this structure, which likely results from a lack of training examples from which to learn.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 3.43, LH1 3.25, HH1 3.44; One-to-One residual: HL1 3.13, LH1 2.99, HH1 3.43; One-to-Many residual: HL1 3.16, LH1 3, HH1 3.42; Many-to-One residual: HL1 3.11, LH1 2.94, HH1 3.3.]
Figure 4.25: Example prediction from the pathology dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 2.85, LH1 2.96, HH1 3.49; One-to-One residual: HL1 2.85, LH1 2.9, HH1 3.48; One-to-Many residual: HL1 2.85, LH1 2.9, HH1 3.48; Many-to-One residual: HL1 2.82, LH1 2.84, HH1 3.38.]
Figure 4.26: Example prediction from the pathology dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 4.18, LH1 3.95, HH1 3.43; One-to-One residual: HL1 3.47, LH1 3.26, HH1 3.37; One-to-Many residual: HL1 3.49, LH1 3.26, HH1 3.37; Many-to-One residual: HL1 3.45, LH1 3.2, HH1 3.18.]
Figure 4.27: Example prediction from the pathology dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 2.15, LH1 2.1, HH1 1.83; One-to-One residual: HL1 1.86, LH1 1.72, HH1 1.79; One-to-Many residual: HL1 1.83, LH1 1.73, HH1 1.79; Many-to-One residual: HL1 1.77, LH1 1.72, HH1 1.81.]
Figure 4.28: Example prediction from the pathology dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
Figures 4.29 through 4.34 provide several examples from the graphics dataset. Between networks, we see that predictions made by the one-to-one network for HL1 contain a glowing effect around edges. This is likely the result of network weights which did not converge well, which agrees with our observations in Section 4.2. In all examples, the networks are able to sufficiently capture structural information. The textural details which gave the many-to-one network the largest advantage on the natural dataset are not present in the graphics dataset. This likely contributes to its similar performance to the one-to-one and one-to-many models in this case.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 1.76, LH1 1.7, HH1 1.47; One-to-One residual: HL1 1.66, LH1 1.23, HH1 1.27; One-to-Many residual: HL1 1.19, LH1 1.27, HH1 1.24; Many-to-One residual: HL1 1.12, LH1 1.19, HH1 1.08.]
Figure 4.29: Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 3.54, LH1 3.5, HH1 3.33; One-to-One residual: HL1 2.81, LH1 2.37, HH1 2.99; One-to-Many residual: HL1 2.17, LH1 2.46, HH1 2.71; Many-to-One residual: HL1 2.03, LH1 2.34, HH1 2.48.]
Figure 4.30: Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 2.09, LH1 1.72, HH1 1.52; One-to-One residual: HL1 1.86, LH1 1.03, HH1 1; One-to-Many residual: HL1 0.94, LH1 1.04, HH1 0.94; Many-to-One residual: HL1 0.97, LH1 1.01, HH1 0.86.]
Figure 4.31: Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 3.48, LH1 3.68, HH1 3.23; One-to-One residual: HL1 2.89, LH1 2.62, HH1 2.78; One-to-Many residual: HL1 2.51, LH1 2.55, HH1 2.63; Many-to-One residual: HL1 2.42, LH1 2.47, HH1 2.36.]
Figure 4.32: Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: LL1, HL1, LH1, and HH1 panels showing the original coefficients and the one-to-one, one-to-many, and many-to-one predictions and residuals. Entropies (bpc) — Original: HL1 2.78, LH1 2.79, HH1 2.41; One-to-One residual: HL1 2.4, LH1 1.93, HH1 2.07; One-to-Many residual: HL1 1.93, LH1 1.97, HH1 1.99; Many-to-One residual: HL1 1.79, LH1 1.86, HH1 1.65.]
Figure 4.33: Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients, and prediction residuals, are provided below each corresponding column.
[Figure: subbands LL1, HL1, LH1, HH1 with original coefficients, predictions, and residuals. Entropies: Original HL1 2.81, LH1 2.17, HH1 1.97; One-to-One residual HL1 2.17, LH1 1.15, HH1 1.27; One-to-Many residual HL1 1.47, LH1 1.17, HH1 1.19; Many-to-One residual HL1 1.52, LH1 1.09, HH1 1.07.]
Figure 4.34: Example prediction from the graphics dataset of subbands within D1. The entropy of the original coefficients and prediction residuals is provided below each corresponding column.
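The per-subband values reported with Figures 4.30-4.34 are entropies in bits per coefficient. Assuming these are first-order (empirical) entropies computed from the histogram of the integer coefficient or residual values, the computation can be sketched as follows:

```python
import numpy as np

def first_order_entropy(coeffs):
    """Empirical first-order entropy, in bits per coefficient, of an
    integer-valued subband or prediction-residual array."""
    _, counts = np.unique(np.asarray(coeffs).ravel(), return_counts=True)
    p = counts / counts.sum()            # empirical symbol probabilities
    return float(-(p * np.log2(p)).sum())
```

For a residual taking two values with equal frequency this returns 1.0; a constant residual returns 0.0, which is the sense in which lower residual entropy indicates a better prediction.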
4.4 Baseline Prediction Framework
A baseline prediction framework is now developed, guided by the results obtained
throughout this section. Since we seek to minimize the potential bit-rate of the
entire image, we must now consider bit-rate in bits-per-pixel (bpp). Because the
total compressed file consists of compressed data from multiple subbands, we
must scale the entropy reduction of each subband by its relative contribution to
the full-image codestream to obtain the bit-rate reduction in bpp. The average
relative contribution of each detail subband in the first two decomposition
levels is given in Table 4.6, where compressed data from test images in the
natural, pathology, and graphics datasets are used to generate the statistics.
With these values, along with those provided in Tables 4.3-4.5, we estimate the
potential bit-rate reduction of each proposed framework using Equation 4.3.
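Equation 4.3 and the values of Tables 4.3-4.6 are not reproduced here, but one plausible reading of the weighting described above can be sketched as follows; the subband names, entropy reductions, and contribution fractions below are hypothetical placeholders, not the values from those tables:

```python
# Hypothetical per-subband entropy reductions (bits per coefficient) and
# relative contributions to the full-image codestream. Placeholder values
# for illustration only, not those of Tables 4.3-4.6.
entropy_reduction = {"HL1": 0.5, "LH1": 0.4, "HH1": 0.6,
                     "HL2": 0.3, "LH2": 0.3, "HH2": 0.2}
contribution = {"HL1": 0.20, "LH1": 0.18, "HH1": 0.22,
                "HL2": 0.06, "LH2": 0.05, "HH2": 0.04}

def estimated_bpp_reduction(entropy_reduction, contribution):
    """Scale each subband's entropy reduction by its relative contribution
    to the codestream and sum, giving an image-level bpp reduction."""
    return sum(entropy_reduction[s] * contribution[s] for s in entropy_reduction)

print(estimated_bpp_reduction(entropy_reduction, contribution))
```

Under this weighting, subbands that occupy a larger share of the codestream dominate the estimate, which is why entropy reductions in the first-level detail subbands matter most for the overall bit-rate.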
For lossless JPEG2000, JPEG-LS, GLICBAWLS, and FLIF, compression experiments
were run on a machine running Ubuntu 16.04.4 LTS with Intel(R) Xeon(R)
E5-2699A CPUs and an Nvidia Tesla P100 GPU. For CALIC, only Windows binaries
were available; these experiments were run on a machine running Windows
10.0.17134 with an Intel(R) Core(TM) i5-4670K CPU.
CHAPTER 5
CONCLUSION
In this thesis, we proposed a lossless compression framework which incorporates
CNNs for prediction of wavelet coefficients. Multiple CNN prediction frameworks
were assessed for use with natural, pathology, and graphics image data. From these, a
baseline prediction framework was proposed by analyzing the optimal potential
bit-rate reductions of each framework. Using this framework, an end-to-end
implementation was developed and compared with current standards as well as
state-of-the-art image compression techniques. In these experiments, we found that
the proposed model produces bit-rates which compete strongly with state-of-the-art
techniques for natural images. Weaker, but still competitive, performance was seen
in models trained for pathology and graphics images. Additionally, the proposed
model has a computational complexity which is practical, even when implemented
using only CPUs.
In future work, the proposed model can be extended to multi-component imagery
(e.g., RGB, YCbCr, or hyperspectral), for which a CNN prediction framework
employing cross-component prediction may be developed. The proposed model may
also be extended to lossy image compression. For this, the effects of
quantization on CNN prediction must first be studied; a rate allocation model
may then be developed which takes into consideration the information required
for prediction at the decoder.
REFERENCES
[1] X. Wu and N. Memon, "CALIC - a context based adaptive lossless image codec," in Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '96), vol. 4, (Washington, DC, USA), pp. 1890-1893, IEEE Computer Society, 1996.
[2] M. J. Weinberger, G. Seroussi, and G. Sapiro, "The LOCO-I lossless image compression algorithm: Principles and standardization into JPEG-LS," IEEE Trans. Image Process., vol. 9, pp. 1309-1324, Aug. 2000.
[3] D. Taubman and M. Marcellin, JPEG2000 Image Compression Fundamentals, Standards and Practice. Springer Publishing Company, Incorporated, 2002.
[4] B. Meyer and P. Tischer, "GLICBAWLS - grey level image compression by adaptive weighted least squares," in Proceedings of the Data Compression Conference (DCC 2001), p. 503, 2001.
[5] J. Sneyers and P. Wuille, "FLIF: free lossless image format based on MANIAC compression," in 2016 IEEE International Conference on Image Processing (ICIP 2016), Phoenix, AZ, USA, September 25-28, 2016, pp. 66-70, 2016.
[6] A. van den Oord and B. Schrauwen, "The student-t mixture as a natural image patch prior with application to image compression," Journal of Machine Learning Research, vol. 15, pp. 2061-2086, 2014.
[7] K. Gregor, F. Besse, D. Jimenez Rezende, I. Danihelka, and D. Wierstra, "Towards conceptual compression," in Advances in Neural Information Processing Systems 29 (D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, eds.), pp. 3549-3557, Curran Associates, Inc., 2016.
[8] C. C. Cutler, "Differential quantization of communication signals," 1950. US Patent 2605361 A.
[9] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Trans. Comput., vol. 23, pp. 90-93, Jan. 1974.
[10] G. Strang and T. Nguyen, Wavelets and Filter Banks. Wellesley-Cambridge Press, 1997.
[11] D. A. Huffman, "A method for the construction of minimum-redundancy codes," Proceedings of the Institute of Radio Engineers, vol. 40, pp. 1098-1101, September 1952.
[12] A. Robinson and C. Cherry, "Results of a prototype television bandwidth compression scheme," Proceedings of the IEEE, vol. 55, pp. 356-364, Mar. 1967.
[13] I. H. Witten, R. M. Neal, and J. G. Cleary, "Arithmetic coding for data compression," Commun. ACM, vol. 30, pp. 520-540, June 1987.
[14] C. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423, 1948.
[15] D. Taylor, D. Newman, and B. Schunck, "CineForm," Dec. 2017.
[16] S. Mallat, A Wavelet Tour of Signal Processing, Third Edition: The Sparse Way. Academic Press, 3rd ed., 2008.
[17] I. Daubechies and W. Sweldens, "Factoring wavelet transforms into lifting steps," J. Fourier Anal. Appl., vol. 4, no. 3, pp. 245-267, 1998.
[18] I. Daubechies, "Biorthogonal bases of compactly supported wavelets," 1992.
[19] J. Kiefer and J. Wolfowitz, "Stochastic estimation of the maximum of a regression function," Ann. Math. Statist., vol. 23, pp. 462-466, Sept. 1952.
[20] Y. Chauvin and D. E. Rumelhart, eds., Backpropagation: Theory, Architectures, and Applications. Hillsdale, NJ, USA: L. Erlbaum Associates Inc., 1995.
[21] A. Calderbank, I. Daubechies, W. Sweldens, and B.-L. Yeo, "Wavelet transforms that map integers to integers," Applied and Computational Harmonic Analysis, vol. 5, pp. 332-369, 1998.
[22] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," CoRR, vol. abs/1409.1556, 2014.
[23] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, "TensorFlow: Large-scale machine learning on heterogeneous systems," 2015. Software available from tensorflow.org.
[24] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," CoRR, vol. abs/1412.6980, 2014.
[25] J. Kim, J. K. Lee, and K. M. Lee, "Accurate image super-resolution using very deep convolutional networks," June 2016.
[26] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, pp. 295-307, Feb. 2016.
[27] P. M. Radiuk, "Impact of training set batch size on the performance of convolutional neural networks for diverse datasets," Information Technology and Management Science, vol. 20, no. 1, pp. 20-24, 2017.
[28] D.-T. Dang-Nguyen, C. Pasquini, V. Conotter, and G. Boato, "RAISE: a raw images dataset for digital image forensics," 2015.
[29] "Public domain vectors," June 2018.
[30] Contributors, "Inkscape." Online, March 2018.
[31] D. Janssens, K. Hagihara, J. Fimes, Giuseppe, M. Savinaud, M. Malaterre, Y. Verschueren, H. Drolon, F.-O. Devaux, and A. Descampe, "OpenJPEG," Oct. 2018.