
Noisy-As-Clean: Learning Unsupervised Denoising from the Corrupted Image

Jun Xu1∗, Yuan Huang2, Ming-Ming Cheng1, Li Liu3, Fan Zhu3, Xingsong Hou2, Ling Shao3

1College of Computer Science, Nankai University, Tianjin, China  2School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China

3Inception Institute of Artificial Intelligence (IIAI), Abu Dhabi, UAE

Abstract

Recently, supervised deep networks have achieved promising performance on image denoising, by learning image priors and noise statistics on plenty of pairs of noisy and clean images. Unsupervised denoising networks have also been proposed to use external noisy images for training. However, for an unseen test image, these denoising networks ignore either its particular image prior, the noise statistics, or both. That is, the networks learned from external images inherently suffer from a domain gap problem: the image priors and noise statistics are very different between the training and test images. This problem becomes more evident when dealing with signal-dependent realistic noise.

To bridge this gap, in this work, we propose a novel "Noisy-As-Clean" (NAC) strategy of training unsupervised denoising networks. In NAC, the corrupted test image is taken as the "clean" target, while the inputs are simulated images consisting of the test image and similar corruptions. A simple and useful observation on our NAC is: as long as the noise is weak, it is feasible to learn an unsupervised network only with the corrupted image, approximating the optimal parameters of a supervised network learned with pairs of noisy and clean images. Experiments show that the unsupervised networks trained with our NAC strategy outperform previous networks, including supervised ones, on synthetic and realistic noise removal.

1. Introduction

Image denoising is an ill-posed inverse problem to recover a clean image x from the observed noisy image y = x + n_o, where n_o is the observed corrupting noise. One popular assumption on n_o is the additive white Gaussian noise (AWGN) with standard deviation (std) σ. AWGN serves as a perfect test bed for supervised methods in the deep learning era [18, 19, 22, 38, 39]. Numerous supervised

∗Jun Xu (nankaimathxujun@gmail.com) is the corresponding author. The first two authors contribute equally.

(a) Noisy: 24.62dB/0.4595  (b) Clean Image  (c) CDnCNN [44]: 34.23dB/0.8695  (d) CDnCNN+NAC: 35.80dB/0.9116

Figure 1. Denoised images and PSNR/SSIM results of CDnCNN [44] (c) and CDnCNN trained by our NAC strategy ("CDnCNN+NAC") (d) on the color image House corrupted by AWGN noise (σ = 15).

networks [9, 17, 25, 29, 31, 35, 40, 42, 44] learn the image priors and noise statistics on plenty of pairs of clean and corrupted images, and achieve promising denoising performance on images with similar priors and noise statistics (e.g., AWGN).

With the advances on AWGN noise removal [25, 29, 31, 35, 40, 44], a question arises: how can these denoising networks exert their effect on real noisy photographs? Realistic noise is signal-dependent and more complex than AWGN [4, 15, 28, 33, 34]. Thus, previous supervised denoising networks unavoidably suffer from a domain gap problem: both the image priors and noise statistics in training are different from those of the real test photograph. Recently, several unsupervised networks [6, 10, 23, 24, 26, 27] have


been proposed to remove the dependence on clean training images. However, when processing a corrupted test image, these networks still suffer from the domain gap on either image priors or noise statistics between the external training images and the test one. Two interesting works are Noise2Noise [26] and Deep Image Prior (DIP) [27], which succeed on zero-mean noise. But the realistic noise on real photographs is usually not zero-mean [4, 33, 34].

In this paper, we propose a "Noisy-As-Clean" (NAC) strategy to alleviate the domain gap problem. In our NAC, the noisy test image y = x + n_o is directly taken as the "clean" target to train an image-specific network. Thus, the domain gap on image priors is largely bridged by our NAC. To reduce the gap on noise statistics, with the noisy test image y as target, we take as the input of our NAC a simulated noisy image z = y + n_s, consisting of the corrupted test image y and a simulated noise n_s which is statistically close to the corrupting noise n_o in y. In this way, during the training stage, our NAC network learns to remove the simulated noise n_s from z, and thus is able to remove the observed noise n_o from the noisy test image y in the testing stage.

A simple and useful observation about our NAC strategy is: as long as the corrupting noise is "weak", it is feasible to train an unsupervised denoising network only with the corrupted test image, and the learned parameters are very close to those of a supervised network trained with pairs of noisy and clean images. Though very simple, our NAC strategy is very effective for image denoising. In Figure 1, we compare the denoised images by the vanilla CDnCNN [44] and the CDnCNN trained with our NAC (CDnCNN+NAC) on the image "House" corrupted by AWGN (σ = 15). We observe that "CDnCNN+NAC" achieves better visual quality and higher PSNR/SSIM results than CDnCNN [44], which is trained on plenty of noisy and clean image pairs. Experiments on diverse synthetic and real-world benchmarks demonstrate that, when trained with our NAC strategy, an unpolished ResNet [18] outperforms previous supervised denoising networks on removing "weak" noise. Our work reveals that, when the noise is "weak", an unsupervised network trained with only the corrupted test image can obtain better denoising performance than the supervised ones.
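To make the strategy concrete, the following is a minimal PyTorch sketch of NAC training and testing. The names `denoiser` and `simulate_noise` are placeholders (the backbone and noise models are described in §4 and §5); this is an illustration under the paper's setup, not the authors' released code.

```python
import torch

def train_nac(denoiser, y, simulate_noise, epochs=1000, lr=1e-3):
    """Train an image-specific denoiser with the NAC strategy.

    y: observed noisy test image (1, C, H, W), used as the "clean" target.
    simulate_noise: callable drawing noise n_s statistically close to the
    observed noise n_o in y.
    """
    optimizer = torch.optim.Adam(denoiser.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()            # the l2 loss used in the paper
    for _ in range(epochs):
        z = y + simulate_noise(y)           # simulated noisy input z = y + n_s
        loss = loss_fn(denoiser(z), y)      # learn to remove n_s from z
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return denoiser

# Testing is one forward pass on the observed noisy image itself, e.g.:
# denoised = train_nac(net, y, lambda t: 15/255 * torch.randn_like(t))(y)
```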

2. Related Work

Supervised denoising networks are trained with plenty of pairs of noisy and clean images. This category of networks can learn external image priors and noise statistics from the training data. Numerous methods [9, 17, 25, 29, 31, 35, 40, 42, 44] have been developed, achieving promising performance on AWGN noise removal, where the statistics of the training and test noise are similar. However, due to the aforementioned domain gap problem, the performance of these networks degrades severely on real noisy photographs [1, 3, 4, 34].

Type | Method | Year'Pub | Image Prior | Noise Stat.
S | DnCNN [44] | 17'TIP | Ext | ✓
S | CBDNet [17] | 19'CVPR | Ext | ✓
U | N2N [26] | 18'ICML | Ext | ✓
U | DIP [27] | 18'CVPR | Int |
U | N2S [6] | 19'ICML | Ext |
U | N2V [23] | 19'CVPR | Ext |
U | SS [24] | 19'NeurIPS | Ext |
U | NAC (Ours) | 20'Submit | Int | ✓

Table 1. Summary of representative networks for image denoising. S: Supervised networks. U: Unsupervised networks. N2N: Noise2Noise [26]. DIP: Deep Image Prior [27]. N2S: Noise2Self [6]. N2V: Noise2Void [23]. SS: Self-Supervised [24]. Pub: Publication. Int: Internal image priors. Ext: External image priors. Stat.: Statistics. The networks with "✓" are able to learn the noise statistics from training data.

Unsupervised denoising networks are developed to remove the need for plenty of clean images. Along this direction, Noise2Noise (N2N) [26] trains the network between pairs of corrupted images of the same scene but with independently sampled noise. This work is able to learn external image priors and noise statistics from the training data. However, in real-world scenarios, it is difficult to collect large amounts of paired images with independent corruption for training. Noise2Void (N2V) [23] predicts a pixel from its surroundings by learning blind-spot networks, but it still suffers from the domain gap on image priors between the training images and test images. This work assumes that the corruption is zero-mean and independent between pixels. However, as mentioned in Noise2Self (N2S) [6], N2V [23] significantly degrades the training efficiency and the denoising performance at test time. Recently, Deep Image Prior (DIP) [27] reveals that the network structure can resonate with natural image priors, and can be utilized in image restoration without external images. However, it is not practical to select a suitable network and early-stop its training at the right moment for each corrupted image.

Internal and external image priors are widely used for diverse image restoration tasks [14, 36, 43, 45, 46]. Internal priors are directly learned from the input test image itself, while the external ones are learned on external images (as long as they are not the test one). The internal priors are adaptive to the image contents, but somewhat affected by the corruptions [14, 45]. By contrast, the external priors are effective for restoring images with general contents, but may not be optimal for a specific test image [11, 14, 36, 37, 43, 46].

Noise statistics are of key importance for image denoising. AWGN is one typical noise with widespread study. Recently, researchers have shifted more attention to the realistic noise produced in camera sensors [4, 34], which is usually modeled as a mixed Poisson and Gaussian distribution [15]. The Poisson component mainly comes from the irregular

photons hitting the sensor [28], while the Gaussian noise is majorly produced by dark current [33]. Though performing well on synthetic noise, supervised denoisers [9, 17, 25, 29, 31, 40, 42, 44] still suffer from the domain gap problem when processing real noisy photographs.

In Table 1, we summarize several recently published denoising networks [6, 8, 23, 26, 27, 29, 31, 35, 40, 44] from the aspects of supervised and unsupervised networks, image priors, and noise statistics. In this work, to bridge the domain gap problem, we propose a "Noisy-As-Clean" strategy to learn the image-specific internal priors and noise statistics directly from the corrupted test image.

3. Theoretical Background of the "Noisy-As-Clean" Strategy

Training a supervised network f_θ (parameterized by θ) requires many pairs (y_i, x_i) of noisy image y_i and clean image x_i, by minimizing an empirical loss function L as

$$\arg\min_{\theta} \sum_{i} \mathcal{L}\big(f_{\theta}(y_i), x_i\big). \quad (1)$$

Assume that the probability of occurrence for the pair (y_i, x_i) is p(y_i, x_i); then, statistically, we have

$$\theta^{*} = \arg\min_{\theta} \sum_{i} p(y_i, x_i)\,\mathcal{L}\big(f_{\theta}(y_i), x_i\big) = \arg\min_{\theta} \mathbb{E}_{(y,x)}\big[\mathcal{L}(f_{\theta}(y), x)\big], \quad (2)$$

where y and x are random variables of noisy and clean images, respectively. The paired variables (y, x) are dependent, and their relationship is y = x + n_o, where n_o is the random variable of observed noise. By exploring the dependence p(y_i, x_i) = p(x_i) p(y_i|x_i), Eqn. (2) is equivalent to

$$\theta^{*} = \arg\min_{\theta} \sum_{i} p(x_i)\, p(y_i|x_i)\,\mathcal{L}\big(f_{\theta}(y_i), x_i\big) = \arg\min_{\theta} \mathbb{E}_{x}\big[\mathbb{E}_{y|x}[\mathcal{L}(f_{\theta}(y), x)]\big]. \quad (3)$$

Eqn. (3) indicates that the network f_θ can minimize the loss function by solving the same problem separately for each clean image sample.

Different from the "zero-mean" assumption in [23, 26], here we study a practical assumption on noise statistics: the expectation E[x] and variance Var[x] of the signal intensity are much stronger than those of the noise, E[n_o] and Var[n_o] (such that the latter are negligible, but not necessarily zero):

$$\mathbb{E}[x] \gg \mathbb{E}[n_o], \quad \mathrm{Var}[x] \gg \mathrm{Var}[n_o]. \quad (4)$$

This is actually valid in real-world scenarios, since we can clearly observe the contents in most real photographs with little influence from the noise. The noise therein is often modeled by zero-mean Gaussian, or mixed Poisson and Gaussian (for realistic noise). Hence, the noisy image y should have a similar expectation to the clean image x:

$$\mathbb{E}[y] = \mathbb{E}[x + n_o] = \mathbb{E}[x] + \mathbb{E}[n_o] \approx \mathbb{E}[x]. \quad (5)$$

Now we add simulated noise n_s to the observed noisy image y, and generate a new noisy image z = y + n_s. We assume that n_s is statistically close to n_o, i.e., E[n_s] ≈ E[n_o] and Var[n_s] ≈ Var[n_o]. Then we have

$$\mathbb{E}[z] \gg \mathbb{E}[n_s], \quad \mathrm{Var}[z] \gg \mathrm{Var}[n_s]. \quad (6)$$

Therefore, the simulated noisy image z has a similar expectation to the observed noisy image y:

$$\mathbb{E}[z] = \mathbb{E}[y + n_s] \approx \mathbb{E}[y]. \quad (7)$$

By the Law of Total Expectation [7], we have

$$\mathbb{E}_{y}\big[\mathbb{E}_{z}[z|y]\big] = \mathbb{E}[z] \approx \mathbb{E}[y] = \mathbb{E}_{x}\big[\mathbb{E}_{y}[y|x]\big]. \quad (8)$$

Since the loss function L (usually ℓ2) and the conditional probability density functions p(y|x) and p(z|y) are all continuous everywhere, the optimal network parameters θ* of Eqn. (3) change little with the addition of the negligible noise n_o or n_s. With Eqns. (4)-(8), when the x-conditioned expectation E_{y|x}[L(f_θ(y), x)] is replaced with the y-conditioned expectation E_{z|y}[L(f_θ(z), y)], f_θ obtains similar y-conditioned optimal parameters θ*:

$$\arg\min_{\theta} \mathbb{E}_{y}\big[\mathbb{E}_{z|y}[\mathcal{L}(f_{\theta}(z), y)]\big] \approx \arg\min_{\theta} \mathbb{E}_{x}\big[\mathbb{E}_{y|x}[\mathcal{L}(f_{\theta}(y), x)]\big] = \theta^{*}. \quad (9)$$

The network f_θ minimizes the loss function L for each input image pair separately, which equals minimizing it on all finite pairs of images. Through simple manipulations, Eqn. (9) is equivalent to

$$\arg\min_{\theta} \sum_{i} p(y_i)\, p(z_i|y_i)\,\mathcal{L}\big(f_{\theta}(z_i), y_i\big) = \arg\min_{\theta} \mathbb{E}_{y}\big[\mathbb{E}_{z|y}[\mathcal{L}(f_{\theta}(z), y)]\big] \approx \theta^{*}. \quad (10)$$

By exploring the dependence p(z_i, y_i) = p(y_i) p(z_i|y_i), Eqn. (10) is equivalent to

$$\arg\min_{\theta} \mathbb{E}_{(z,y)}\big[\mathcal{L}(f_{\theta}(z), y)\big] = \arg\min_{\theta} \sum_{i} p(z_i, y_i)\,\mathcal{L}\big(f_{\theta}(z_i), y_i\big) \approx \theta^{*}. \quad (11)$$

Our observation is very simple and useful: as long as the noise is weak, the optimal parameters of an unsupervised network trained on noisy image pairs (z_i, y_i) are very close to the optimal parameters of a supervised network trained on pairs of noisy and clean images (y_i, x_i).

Figure 2. Proposed "Noisy-As-Clean" strategy for training unsupervised image denoising networks. In this strategy, we take the observed noisy image y = x + n_o as the "clean" target, and take the simulated noisy image z = y + n_s (element-wise sum) as the input; the loss function is L(f_θ(z), y).

Consistency of noise statistics. Since our contexts are real-world scenarios, the noise can be modeled by a mixed Poisson and Gaussian distribution [15]. Fortunately, both distributions are linearly additive, i.e., the sum of two Poisson (or Gaussian) distributed variables is still Poisson (or Gaussian) distributed. Assume that the observed (simulated) noise n_o (n_s) follows a mixed, x-dependent (y-dependent) Poisson distribution parameterized by λ_o (λ_s) and a Gaussian distribution N(0, σ_o²) (N(0, σ_s²)), i.e.,

$$n_o \sim x \odot \mathcal{P}(\lambda_o) + \mathcal{N}(0, \sigma_o^2), \quad n_s \sim y \odot \mathcal{P}(\lambda_s) + \mathcal{N}(0, \sigma_s^2) \approx x \odot \mathcal{P}(\lambda_s) + \mathcal{N}(0, \sigma_s^2), \quad (12)$$

where x ⊙ P(λ_o) and y ⊙ P(λ_s) indicate that the noise n_o and n_s are element-wise dependent on x and y, respectively. The "≈" holds because the observed noise n_o is negligible. Thus, we have

$$n_o + n_s \sim x \odot \mathcal{P}(\lambda_o + \lambda_s) + \mathcal{N}(0, \sigma_o^2 + \sigma_s^2 + 2\rho\sigma_o\sigma_s), \quad (13)$$

where ρ is the correlation between n_o and n_s (ρ = 0 if they are independent). This indicates that the summed noise variable n_o + n_s still follows a mixed, x-dependent Poisson and Gaussian distribution, guaranteeing the consistency in noise statistics between the observed realistic noise and the simulated noise. As can be seen from the experiments (§5), this property makes our "Noisy-As-Clean" strategy consistently effective on different noise removal tasks.
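As a quick empirical check of Eqn. (13) in the independent case (ρ = 0), the sketch below draws two mixed Poisson-Gaussian noise samples on a constant signal and verifies that their variances add. The zero-mean normalization of the Poisson part is an assumed parameterization, made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.full(10**6, 0.5)                 # constant "clean" signal, for illustration
lam_o = lam_s = 10.0                    # Poisson intensities lambda_o, lambda_s
sig_o = sig_s = 0.01                    # Gaussian stds sigma_o, sigma_s

def mixed_noise(signal, lam, sigma):
    # signal-dependent Poisson part (normalized to zero mean) + Gaussian part
    poisson = rng.poisson(lam * signal) / lam - signal
    return poisson + rng.normal(0.0, sigma, signal.shape)

n_o = mixed_noise(x, lam_o, sig_o)
n_s = mixed_noise(x, lam_s, sig_s)
# For independent n_o and n_s (rho = 0), Var[n_o + n_s] = Var[n_o] + Var[n_s]:
print(np.var(n_o + n_s))                # ~0.1002 empirically
print(np.var(n_o) + np.var(n_s))        # ~0.1002
```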

4. Learning "Noisy-As-Clean" Networks for Unsupervised Image Denoising

With the above statistical analysis, we now propose to learn unsupervised networks with our "Noisy-As-Clean" (NAC) strategy for image denoising. Note that we only need the observed noisy image y to generate noisy image pairs (z, y) with simulated noise n_s. Our idea is illustrated in Figure 2.

Training NAC networks. For real-world images captured by camera sensors, one can hardly distinguish the realistic noise from the signal. Our observation is that the signal intensity x is usually stronger than the noise intensity. That is, the expectation of the observed (realistic) noise n_o is usually much smaller than that of the latent clean image x. If we train an image-specific network for the new noisy image z and regard the original noisy image y as the ground-truth image, then the trained image-specific network basically learns the image-specific prior and noise statistics jointly. It has the capacity to remove the noise n_s from the new noisy image z. Then, if we perform denoising on the original noisy image y, the observed noise n_o can easily be removed. Note that we do not use the clean image x as "ground-truth" in training our NAC networks.

Training blind denoising. Most existing supervised denoising networks train a specific model to process a fixed noise pattern [8, 9, 29, 33, 40]. To tackle unknown noise, one feasible solution for these networks is to assume the noise is AWGN and estimate its noise deviation; the noise is then removed using the network trained with the estimated level. But this strategy largely degrades the denoising performance when the noise deviation is not estimated accurately. Besides, this solution can hardly deal with realistic noise captured on real photographs, which is usually not AWGN. In order to be effective on removing realistic noise, our NAC networks should have the ability to blindly remove unknown noise from real photographs. Inspired by [17, 44], we propose to train a blind version of our NAC networks by using AWGN noise within a range of levels (e.g., [0, 55]) for removing unknown AWGN noise; a sketch of this level sampling is given below. We also use mixed AWGN and Poisson noise (both within a range of intensities) for removing realistic noise. More details will be introduced in §5.2.
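A hedged sketch of the level sampling for the blind AWGN variant follows; the helper name and the uniform draw are our assumptions (the text states the levels are drawn randomly within [0, 55], and that Gaussian and uniform sampling gave similar results).

```python
import torch

def sample_blind_awgn(y, max_sigma=55.0):
    """Draw AWGN with a random level in [0, max_sigma] (on the 8-bit scale)
    for one training step of the blind NAC variant."""
    sigma = torch.empty(1).uniform_(0.0, max_sigma).item() / 255.0
    return sigma * torch.randn_like(y)

# Each training step then uses z = y + sample_blind_awgn(y) as the input
# and y as the target, exactly as in the non-blind NAC training loop.
```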

Testing is performed by directly taking an observed noisy image y = x + n_o as input. We only test the image y once. The denoised image can be represented as ŷ = f_θ*(y), with

Method | σ=5 PSNR/SSIM | σ=10 PSNR/SSIM | σ=15 PSNR/SSIM | σ=20 PSNR/SSIM | σ=25 PSNR/SSIM
BM3D [13] | 38.07/0.9580 | 34.40/0.9234 | 32.38/0.8957 | 31.00/0.8717 | 29.97/0.8503
DnCNN [44] | 38.76/0.9633 | 34.78/0.9270 | 32.86/0.9027 | 31.45/0.8799 | 30.43/0.8617
N2N [26] | 39.72/0.9665 | 36.18/0.9446 | 33.99/0.9149 | 32.10/0.8788 | 30.72/0.8446
DIP [27] | 32.49/0.9344 | 31.49/0.9299 | 29.59/0.8636 | 27.67/0.8531 | 25.82/0.7723
N2V [23] | 27.06/0.8174 | 26.79/0.7859 | 26.12/0.7468 | 25.89/0.7405 | 25.01/0.6564
NAC | 39.99/0.9820 | 36.55/0.9569 | 34.24/0.9277 | 32.46/0.8961 | 31.08/0.8654
Blind-NAC | 38.48/0.9805 | 36.65/0.9564 | 34.77/0.9275 | 33.13/0.9024 | 31.78/0.8802

Table 2. Average PSNR (dB) and SSIM [41] results of different methods on the Set12 dataset corrupted by AWGN noise. The best and second best results were highlighted in red and blue in the original.

Method | σ=5 PSNR/SSIM | σ=10 PSNR/SSIM | σ=15 PSNR/SSIM | σ=20 PSNR/SSIM | σ=25 PSNR/SSIM
BM3D [13] | 37.59/0.9640 | 33.32/0.9163 | 31.07/0.8720 | 29.62/0.8342 | 28.57/0.8017
DnCNN [44] | 38.07/0.9695 | 33.88/0.9270 | 31.73/0.8706 | 30.27/0.8563 | 29.23/0.8278
N2N [26] | 38.58/0.9627 | 34.07/0.9200 | 31.81/0.8770 | 30.14/0.8550 | 28.67/0.8123
DIP [27] | 29.74/0.8435 | 28.16/0.8310 | 27.07/0.7867 | 25.80/0.7205 | 24.63/0.6680
N2V [23] | 26.70/0.7915 | 26.39/0.7621 | 25.77/0.7126 | 25.41/0.6678 | 24.83/0.6305
NAC | 39.00/0.9707 | 34.60/0.9324 | 32.13/0.8942 | 30.47/0.8636 | 28.96/0.8185
Blind-NAC | 38.26/0.9605 | 34.26/0.9266 | 32.06/0.8919 | 30.50/0.8609 | 29.33/0.8327

Table 3. Average PSNR (dB) and SSIM [41] results of different methods on the BSD68 dataset corrupted by AWGN noise. The best and second best results were highlighted in red and blue in the original.

which the objective metrics, e.g., PSNR and SSIM [41], can be computed against the clean image x.

Implementation details. We employ the ResNet-20 network used in [27] as the backbone network, which includes 10 residual blocks. Each block contains two convolutional layers, each followed by Batch Normalization (BN) [20]. The Rectified Linear Unit (ReLU) activation [32] is used after the first BN. The parameters are randomly initialized, without pretraining. The optimizer is Adam [21] with default parameters. The learning rate is fixed at 0.001 in all experiments. We use the ℓ2 loss function. The network is trained for 1000 epochs for each test image. For data augmentation, we employ 4 rotations (0°, 90°, 180°, 270°) combined with 2 mirror (vertical and horizontal) reflections, resulting in 8 transformations in total. We implement our ResNet-based NAC networks in PyTorch [2].
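The backbone just described can be sketched as follows; the channel width (64) and the head/tail convolutions are assumptions, since the text only specifies 10 residual blocks with two conv+BN layers each and a ReLU after the first BN.

```python
import torch.nn as nn

class ResBlock(nn.Module):
    """Two 3x3 convolutions, each followed by BN; ReLU after the first BN."""
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))

    def forward(self, x):
        return x + self.body(x)          # residual (skip) connection

class NACResNet(nn.Module):
    """ResNet-style backbone with 10 residual blocks, as used for NAC."""
    def __init__(self, in_ch=3, ch=64, n_blocks=10):
        super().__init__()
        self.head = nn.Conv2d(in_ch, ch, 3, padding=1)
        self.blocks = nn.Sequential(*[ResBlock(ch) for _ in range(n_blocks)])
        self.tail = nn.Conv2d(ch, in_ch, 3, padding=1)

    def forward(self, x):
        return self.tail(self.blocks(self.head(x)))
```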

5. Experiments

In this section, we evaluate the performance of our "Noisy-As-Clean" (NAC) networks on image denoising. In all experiments, we train a denoising network using only the noisy test image y as the target, and the simulated noisy image z (with data augmentation) as the input. For all comparison methods, the source codes or trained models are downloaded from the corresponding authors' websites. We use the default parameter settings unless otherwise specified. The PSNR, SSIM [41], and visual quality of the different methods are used for evaluation. We first test with synthetic noise, such as additive white Gaussian noise (AWGN), in §5.1, continue with blind image denoising in §5.2, and finally tackle realistic noise in §5.3. In §5.4, we conduct comprehensive ablation studies to gain deeper insights into the proposed NAC strategy.

5.1. Synthetic Noise Removal With Known Noise

We evaluate the proposed NAC networks on images corrupted by synthetic noise, such as AWGN. More experimental results on signal-dependent Poisson noise and mixed Poisson-AWGN noise are provided in the Supplementary File.

Training NAC networks. Here, we train an image-specific denoising network using the observed noisy test image y as the target, and the simulated noisy image z as the input. Each observed noisy image y = x + n_o is generated by adding the observed noise n_o to the clean image x. The simulated noisy image z = y + n_s is generated by adding simulated noise n_s to the observed noisy image y.

Comparison methods. We compare our NAC networks with state-of-the-art image denoising methods [13, 26, 27, 30, 44]. On AWGN noise, we compare with BM3D [13], DnCNN [44], Noise2Noise (N2N) [26], Deep Image Prior (DIP) [27], and Noise2Void (N2V) [23].

Test datasets. We evaluate the comparison methods on the Set12 and BSD68 datasets, which are widely tested by supervised denoising networks [29, 31, 40, 44] and previous methods [13, 16, 43, 46]. The Set12 dataset contains 12 images of size 512×512 or 256×256, while the BSD68 dataset contains 68 images of different sizes.

Results on AWGN noise. We test AWGN with noise deviation (noise level) σ ∈ {5, 10, 15, 20, 25}, i.e., the observed noise n_o is AWGN with standard deviation (std) of

Dataset | Metric | CBM3D [12] | NI [5] | DnCNN+ [44] | CBDNet [17] | GCBD [10] | N2N [26] | DIP [27] | NAC
CC [33] | PSNR↑ | 35.19 | 35.33 | 35.40 | 36.44 | NA | 35.32 | 35.69 | 36.59
CC [33] | SSIM↑ | 0.9063 | 0.9212 | 0.9115 | 0.9460 | NA | 0.9160 | 0.9259 | 0.9502
DND [34] | PSNR↑ | 34.51 | 35.11 | 37.90 | 38.06 | 35.58 | 33.10 | NA | 36.20
DND [34] | SSIM↑ | 0.8507 | 0.8778 | 0.9430 | 0.9421 | 0.9217 | 0.8110 | NA | 0.9252

Table 4. Average PSNR (dB) and SSIM [41] of different methods (CBM3D and NI are traditional methods; DnCNN+ and CBDNet are supervised networks; GCBD, N2N, DIP, and NAC are unsupervised networks) on the CC dataset [33] and the DND dataset [34]. The best results were highlighted in bold in the original. "NA" means "Not Available", due to unavailable code (GCBD on CC [33]) or difficult experiments (DIP on DND [34]).

σ. Since AWGN noise is signal-independent, the simulated noise n_s is set with the same σ as that of n_o. The comparison results are listed in Tables 2 and 3. It can be seen that the networks trained with the proposed NAC strategy achieve much better performance on PSNR and SSIM [41] than BM3D [13] and DnCNN [44], two previous leading image denoising methods. Note that DnCNN is a supervised network trained on clean and synthetic noisy image pairs. Our NAC networks outperform the other unsupervised networks, N2N [26], DIP [27], and N2V [23], by a significant margin on PSNR and SSIM [41].

5.2. Synthetic Noise Removal With Unknown Noise

To deal with unknown noise, we propose to train a blind version of our NAC networks. Here, we test our NAC networks on AWGN noise with unknown noise deviation. We use the same training strategy, comparison methods, and test datasets as in §5.1.

Training blind NAC networks. For each test image corrupted by AWGN with unknown noise levels (deviations), we train a NAC network. The noise levels are randomly sampled (from a Gaussian distribution) within [0, 55]. We also tested noise levels from a uniform distribution and obtained similar results. We repeat the training of the NAC network on the test image with different deviations. Our NAC networks trained on AWGN with unknown noise levels are termed "Blind-NAC".

Results on blind denoising. For the same test image, we add AWGN noise whose deviation is also in {5, 10, 15, 20, 25}. The blindly trained NAC network is directly utilized to denoise the test image, without estimating its deviation. The results are also listed in Tables 2 and 3. We observe that our Blind-NAC networks, trained on AWGN noise with unknown levels, can achieve even better PSNR and SSIM [41] results than our NAC networks trained on specific noise levels. Note that on BSD68, our Blind-NAC networks achieve better performance than DnCNN [44], which achieves higher PSNR and SSIM results than our NAC networks. This demonstrates the effectiveness of our NAC networks on blind image denoising. With this success on blind image denoising, we next turn to real-world image denoising, in which the noise is unknown and complex.

5.3. Practice on Real Photographs

With the promising performance on blind image denoising, here we tackle realistic noise for practical applications. The observed realistic noise n_o can be roughly modeled as mixed Poisson noise and AWGN noise [15, 17]. Hence, for each observed noisy image y, we generate the simulated noise n_s by sampling the y-dependent Poisson part and the independent AWGN noise.

Training blind NAC networks is also performed for each test image, i.e., the observed noisy image y. In real-world scenarios, each observed noisy image y is corrupted without knowing the specific noise statistics of the observed noise n_o. Therefore, the simulated noise n_s is directly estimated on y as mixed, y-dependent Poisson and AWGN noise. For each transformed image in data augmentation, the Poisson noise is randomly sampled with the parameter λ in 0 < λ ≤ 25, and the AWGN noise is randomly sampled with the noise level σ in 0 < σ ≤ 25; a sketch of this sampling is given below.

Comparison methods. We compare with state-of-the-art methods on real-world image denoising, including CBM3D [12], the commercial software Neat Image [5], two supervised networks, DnCNN+ [44] and CBDNet [17], and three unsupervised networks, GCBD [10], Noise2Noise [26], and DIP [27]. Note that DnCNN+ [44] and CBDNet [17] are two state-of-the-art supervised networks for real-world image denoising, and DnCNN+ is an improved extension of DnCNN [44] with much better performance (the authors of DnCNN+ provide us with the models/results of DnCNN+).

Test datasets. We evaluate the comparison methods on the Cross-Channel (CC) dataset [33] and the DND dataset [34]. The CC dataset [33] includes noisy images of 11 static scenes captured by Canon 5D Mark 3, Nikon D600, and Nikon D800 cameras. The real-world noisy images are collected under a highly controlled indoor environment. Each scene is shot 500 times using the same camera and settings, and the average of the 500 shots is taken as the "ground-truth". We use the default 15 images of size 512×512 cropped by the authors to evaluate the different image denoising methods. The DND dataset [34] contains 50 scenes captured by Sony A7R, Olympus E-M10, Sony RX100 IV, and Huawei Nexus 6P cameras. Each scene is cropped to 20 bounding boxes of 512×512 pixels, generating 1000 test images in total. The noisy images are collected under higher ISO values with
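The following sketch shows one plausible way to draw the simulated noise n_s for a real photograph, per the ranges above (0 < λ ≤ 25, 0 < σ ≤ 25); the helper name and the zero-mean normalization of the Poisson part are assumptions for illustration.

```python
import torch

def simulate_realistic_noise(y, max_lam=25.0, max_sigma=25.0):
    """Mixed y-dependent Poisson + AWGN noise with random intensities,
    resampled for each augmented copy of the noisy test image y."""
    lam = torch.empty(1).uniform_(1e-3, max_lam).item()
    sigma = torch.empty(1).uniform_(1e-3, max_sigma).item() / 255.0
    poisson_part = torch.poisson(lam * y.clamp(min=0.0)) / lam - y  # y-dependent
    gaussian_part = sigma * torch.randn_like(y)                     # independent
    return poisson_part + gaussian_part   # n_s; the training input is z = y + n_s
```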

(a) Noisy: 31.46dB/0.9370  (b) CBM3D [12]: 36.26dB/0.9811  (c) NI [5]: 37.52dB/0.9868  (d) DnCNN+ [44]: 38.25dB/0.9888  (e) CBDNet [17]: 39.34dB/0.9905  (f) GCBD [10]: 37.52dB/0.9765  (g) N2N [26]: 34.95dB/0.9621  (h) NAC: 38.34dB/0.9887

Figure 3. Denoised images and PSNR (dB)/SSIM by the comparison methods on "0017 3" in DND [34]. The "ground-truth" image is not released, but the PSNR (dB)/SSIM results are publicly provided on the DND benchmark.

shorter exposure times, while the "ground truth" images are captured under lower ISO values with adjusted, longer exposure times. The "ground truth" images are not released, but we can obtain the PSNR and SSIM performance by submitting the denoised images to the DND website.

Results on PSNR and SSIM. The comparisons on average PSNR and SSIM results are listed in Table 4. As can be seen, on the CC dataset the proposed NAC networks achieve better performance than all the other denoising methods, including CBM3D [12], the supervised networks DnCNN+ [44] and CBDNet [17], and the unsupervised networks GCBD [10], N2N [26], and DIP [27]. This demonstrates that the proposed NAC networks can indeed handle the complex, unknown, realistic noise, and achieve better performance than supervised networks such as DnCNN+ [44] and CBDNet [17].

Qualitative results. In Figure 3, we show the denoised images of our NAC network and the comparison methods on the image "0017 3" from the DND dataset. We observe that our unsupervised NAC networks are very effective at removing realistic noise from the real photograph. Besides, our NAC networks achieve competitive PSNR and SSIM results when compared with the other methods, including supervised ones such as DnCNN+ [44] and CBDNet [17].

Speed. The work most similar to ours is Deep Image Prior (DIP) [27], which also trains an image-specific network for each test image. On average, DIP needs 603.9 seconds to process a 512×512 color image, on which our NAC network needs 583.2 seconds (on an NVIDIA Titan X GPU).

5.4. Ablation Study

To further study our NAC strategy, we conduct a detailed examination of our NAC networks on image denoising.

1) Generality of our NAC strategy. To evaluate the generality of the proposed NAC strategy, we apply it to the DnCNN [44] network and denote the resulting network as "DnCNN-NAC". We train DnCNN with our NAC strategy (DnCNN-NAC), and the comparison results with DnCNN are listed in Table 5. One can see that DnCNN-NAC achieves better PSNR results than the original DnCNN when σ = 5, 10, 15 (but worse when σ = 20, 25). Note that the original DnCNN network is trained offline on the BSD400 dataset, while the DnCNN-NAC network here is trained online for each specific test image.

σ | 5 | 10 | 15 | 20 | 25
DnCNN [44] | 38.76 | 34.78 | 32.86 | 31.45 | 30.43
DnCNN-NAC | 43.18 | 37.16 | 33.65 | 31.16 | 29.23

Table 5. PSNR (dB) results of DnCNN and DnCNN-NAC on Set12 corrupted by AWGN noise with different σ.

2) Differences from DIP [27]. Though the basic network in our work is the ResNet used in DIP [27], our NAC network is essentially different from DIP in at least two aspects. First, our NAC is a novel strategy for unsupervised learning of adaptive network parameters for the degraded image, while

DIP aims to investigate an adaptive network structure without learning the parameters. Second, our NAC learns a mapping from the synthetic noisy image z = y + n_s to the noisy image y, which approximates the mapping from the noisy image y = x + n_o to the clean image x; DIP instead maps a random noise map to the noisy image y, and the denoised image is obtained during this process. For these two reasons, DIP needs early stopping for different images, while our NAC achieves more robust (and better) denoising performance than DIP on diverse images. In Figure 4, we plot the curves of training loss and test PSNR of the DIP (a) and NAC (b) networks over 10000 epochs on the two images "Cameraman" and "House". We observe that DIP needs early stopping to select the best results, while our NAC can stably achieve better denoising results within 1000 epochs.

Figure 4. Training loss and PSNR (dB) curves of (a) DIP [27] and (b) our NAC networks w.r.t. the number of epochs, on the images "Cameraman" and "House" from Set12.

3) Influence of the number of residual blocks and epochs. Our backbone network is the ResNet [27] with 10 residual blocks, trained for 1000 epochs. Now we study how the number of residual blocks and epochs influences the performance of NAC on image denoising. The experiments are performed on the Set12 dataset corrupted by AWGN noise (σ = 15). From Table 6, we observe that with more residual blocks, the NAC networks can achieve better PSNR and SSIM [41] results, and 10 residual blocks are enough to achieve satisfactory results. With more (e.g., 15) blocks, there is little improvement on PSNR and SSIM. Hence, we use 10 residual blocks, the same as [27]. Then we study how the number of epochs influences the performance of NAC on image denoising. From Table 7, one can see that on the Set12 dataset corrupted by AWGN noise (σ = 15), with more training epochs, our NAC networks achieve better PSNR and SSIM results, but with longer processing times.

# of Blocks | 1 | 2 | 5 | 10 | 15
PSNR↑ | 33.58 | 33.85 | 34.14 | 34.24 | 34.28
SSIM↑ | 0.9161 | 0.9226 | 0.9272 | 0.9277 | 0.9272

Table 6. Average PSNR (dB)/SSIM of NAC with different numbers of blocks on Set12 corrupted by AWGN noise (σ = 15).

4) Comparison with Oracle. We also study the "Oracle" performance of our NAC networks. In "Oracle", we train our NAC networks on the pair of the observed noisy image y and its clean image x, corrupted by AWGN noise or signal-

Figure 5. Comparisons of PSNR (dB) and SSIM results on Set12: (a) by our NAC networks and their "Oracle" version for AWGN with σ ∈ {5, 10, 15, 20, 25}; (b) by BM3D [13], DnCNN [44], and our NAC networks for strong AWGN (σ = 50).

dependent Poisson noise. The experiments are performed on the Set12 dataset corrupted by AWGN or signal-dependent Poisson noise. The noise deviations are in {5, 10, 15, 20, 25}. Figure 5 (a) shows comparisons of our NAC and its "Oracle" networks on PSNR and SSIM. It can be seen that the "Oracle" networks trained on the pairs of noisy-clean images only perform slightly better than the original NAC networks trained with the simulated-observed noisy image pairs (z, y). With our NAC strategy, the NAC networks trained only with the noisy test image can achieve similar denoising performance on weak noise.

5) Performance on strong noise. Our NAC strategy is based on the assumption of "weak noise". It is natural to wonder how well NAC performs against strong noise. To answer this question, we compare the NAC networks with BM3D [13] and DnCNN [44] on Set12 corrupted by AWGN noise with σ = 50. The PSNR and SSIM results are plotted in Figure 5 (b). One can see that our NAC networks are limited in handling strong AWGN noise when compared with BM3D [13] and DnCNN [44].

# of Epochs | 100 | 200 | 500 | 1000 | 5000
PSNR↑ | 31.80 | 32.79 | 33.77 | 34.24 | 34.32
SSIM↑ | 0.8714 | 0.9023 | 0.9189 | 0.9277 | 0.9280
Time (s)↓ | 67.4 | 132.5 | 302.0 | 583.2 | 2815.6

Table 7. Average PSNR (dB), SSIM, and time (s) of NAC with different numbers of epochs on Set12 corrupted by AWGN noise (σ = 15).

6. Conclusion

In this work, we proposed a "Noisy-As-Clean" (NAC) strategy for learning unsupervised image denoising networks. In our NAC, we trained an image-specific network by taking the noisy test image as the target, and adding simulated noise to it to generate the simulated noisy input. The simulated noise is close to the observed noise in the noisy test image. This strategy can be seamlessly embedded into existing supervised denoising networks. We provided a simple and useful observation: it is possible to learn an unsupervised network only with the noisy image, approximating the optimal parameters of a supervised network learned with pairs of noisy and clean images. Extensive experiments on benchmark datasets demonstrate that the networks trained with our NAC strategy achieve better or comparable performance on PSNR, SSIM, and visual quality when compared to previous state-of-the-art image denoising methods, including several supervised denoising networks. These results validate that our NAC strategy can learn image-specific priors and noise statistics only from the corrupted test image.

References

[1] Darmstadt Noise Dataset Benchmark. https://noise.visinf.tu-darmstadt.de/benchmark/#results_srgb. Accessed 2019-05-23.
[2] PyTorch. https://pytorch.org. Accessed 2019-05-23.
[3] Smartphone Image Denoising Dataset Benchmark. https://www.eecs.yorku.ca/~kamel/sidd/benchmark.php. Accessed 2019-05-23.
[4] Abdelrahman Abdelhamed, Stephen Lin, and Michael S. Brown. A high-quality denoising dataset for smartphone cameras. In CVPR, June 2018.
[5] Neatlab ABSoft. Neat Image. https://ni.neatvideo.com/home.
[6] Joshua Batson and Loic Royer. Noise2Self: Blind denoising by self-supervision. In ICML, volume 97, pages 524-533. PMLR, 2019.
[7] Patrick Billingsley. Probability and Measure. Wiley Series in Probability and Statistics. Wiley, 1995.
[8] Tim Brooks, Ben Mildenhall, Tianfan Xue, Jiawen Chen, Dillon Sharlet, and Jonathan T. Barron. Unprocessing images for learned raw denoising. In CVPR, pages 9446-9454, 2019.
[9] Harold Christopher Burger, Christian J. Schuler, and Stefan Harmeling. Image denoising: Can plain neural networks compete with BM3D? In CVPR, pages 2392-2399, 2012.
[10] Jingwen Chen, Jiawei Chen, Hongyang Chao, and Ming Yang. Image blind denoising with generative adversarial network based noise modeling. In CVPR, pages 3155-3164, 2018.
[11] Yunjin Chen and Thomas Pock. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6):1256-1272, 2017.
[12] Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, and Karen Egiazarian. Color image denoising via sparse 3D collaborative filtering with grouping constraint in luminance-chrominance space. In ICIP, pages 313-316. IEEE, 2007.
[13] Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, and Karen Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing, 16(8):2080-2095, 2007.
[14] Michael Elad and Michal Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 15(12):3736-3745, 2006.
[15] Alessandro Foi, Mejdi Trimeche, Vladimir Katkovnik, and Karen Egiazarian. Practical Poissonian-Gaussian noise modeling and fitting for single-image raw-data. IEEE Transactions on Image Processing, 17(10):1737-1754, Oct 2008.
[16] Shuhang Gu, Qi Xie, Deyu Meng, Wangmeng Zuo, Xiangchu Feng, and Lei Zhang. Weighted nuclear norm minimization and its applications to low level vision. International Journal of Computer Vision, 121(2):183-208, 2017.
[17] Shi Guo, Zifei Yan, Kai Zhang, Wangmeng Zuo, and Lei Zhang. Toward convolutional blind denoising of real photographs. In CVPR, 2019.
[18] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, pages 770-778, 2016.
[19] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. Densely connected convolutional networks. In CVPR, pages 4700-4708, 2017.
[20] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
[21] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[22] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, pages 1097-1105, 2012.
[23] Alexander Krull, Tim-Oliver Buchholz, and Florian Jug. Noise2Void - learning denoising from single noisy images. In CVPR, 2019.
[24] Samuli Laine, Tero Karras, Jaakko Lehtinen, and Timo Aila. High-quality self-supervised deep image denoising. In NeurIPS, 2019.
[25] Stamatios Lefkimmiatis. Non-local color image denoising with convolutional neural networks. In CVPR, pages 3587-3596, 2017.
[26] Jaakko Lehtinen, Jacob Munkberg, Jon Hasselgren, Samuli Laine, Tero Karras, Miika Aittala, and Timo Aila. Noise2Noise: Learning image restoration without clean data. In ICML, pages 2971-2980, 2018.
[27] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep image prior. In CVPR, pages 9446-9454, 2018.
[28] Ce Liu, William T. Freeman, Richard Szeliski, and Sing Bing Kang. Noise estimation from a single image. CVPR, 1:901-908, 2006.
[29] Ding Liu, Bihan Wen, Yuchen Fan, Chen Change Loy, and Thomas S. Huang. Non-local recurrent network for image restoration. In NeurIPS, pages 1673-1682, 2018.
[30] Markku Makitalo and Alessandro Foi. Optimal inversion of the Anscombe transformation in low-count Poisson image denoising. IEEE Transactions on Image Processing, 20(1):99-109, 2011.
[31] Xiao-Jiao Mao, Chunhua Shen, and Yu-Bin Yang. Image restoration using convolutional auto-encoders with symmetric skip connections. In NIPS, 2016.
[32] Vinod Nair and Geoffrey E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, pages 807-814, 2010.
[33] Seonghyeon Nam, Youngbae Hwang, Yasuyuki Matsushita, and Seon Joo Kim. A holistic approach to cross-channel image noise modeling and its application to image denoising. In CVPR, pages 1683-1691, 2016.
[34] Tobias Plotz and Stefan Roth. Benchmarking denoising algorithms with real photographs. In CVPR, 2017.
[35] Tobias Plotz and Stefan Roth. Neural nearest neighbors networks. In NeurIPS, 2018.
[36] Stefan Roth and Michael J. Black. Fields of experts. International Journal of Computer Vision, 82(2):205-229, 2009.
[37] Uwe Schmidt and Stefan Roth. Shrinkage fields for effective image restoration. In CVPR, pages 2774-2781, June 2014.
[38] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[39] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In CVPR, pages 1-9, 2015.
[40] Ying Tai, Jian Yang, Xiaoming Liu, and Chunyan Xu. MemNet: A persistent memory network for image restoration. In ICCV, 2017.
[41] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600-612, 2004.
[42] Junyuan Xie, Linli Xu, and Enhong Chen. Image denoising and inpainting with deep neural networks. In NIPS, pages 341-349, 2012.
[43] Jun Xu, Wangmeng Zuo, Lei Zhang, David Zhang, and X. Feng. Patch group based nonlocal self-similarity prior learning for image denoising. In ICCV, pages 244-252, 2015.
[44] Kai Zhang, Wangmeng Zuo, Yunjin Chen, Deyu Meng, and Lei Zhang. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 2017.
[45] Maria Zontak and Michal Irani. Internal statistics of a single natural image. In CVPR, 2011.
[46] Daniel Zoran and Yair Weiss. From learning models of natural image patches to whole image restoration. In ICCV, pages 479-486, 2011.

Page 2: Noisy-As-Clean: Learning Unsupervised Denoising from the ... · (d) CDnCNN+NAC: 35.80dB/0.9116 Figure 1. Denoised images and PSNR/SSIM results of CD-nCNN [44] (c) and CDnCNN trained

been proposed to remove the dependence on clean trainingimages However when processing a corrupted test imagethese networks are still subjected to the domain gap on ei-ther image priors or noise statistics between the externaltraining images and the test one Two interesting works areNoise2Noise [26] and Deep Image Prior (DIP) [27] whichsucceed on the zero-mean noise But the realistic noise onreal photographs is usually not zero-mean [4 33 34]

In this paper we propose a ldquoNoisy-As-Cleanrdquo (NAC)strategy to alleviate the domain gap problem In our NACthe noisy test image y = x + no is directly taken as theldquocleanrdquo target to train an image-specific network Thus thedomain gap on image priors are largely bridged by our NACTo reduce the gap on noise statistics for the noisy test imagey as target we take as the input of our NAC a simulated noisyimage z = y + ns consisting of the corrupted test image yand a simulated noise ns which is statistically close to thecorrupted noise no in y By this way in the training stageour NAC network learns to remove the simulated noise nsfrom z and thus is able to remove the observed noise nofrom the noisy test image y in the testing stage

A simple and useful observation about our NAC strategyis as long as the corrupted noise is ldquoweakrdquo it is feasibleto train an unsupervised denoising network only with thecorrupted test image and the learned parameters are veryclose to those of a supervised network trained with a pair ofnoisy and clean images Though being very simple our NACstrategy is very effective for image denoising In Figure 1 wecompare the denoised images by the vanilla CDnCNN [44]and the CDnCNN trained with our NAC (CDnCNN+NAC)on the image ldquoHouserdquo corrupted by AWGN (σ = 15) Weobserve that the ldquoCDnCNN+NACrdquo achieves better visualquality and higher PSNRSSIM results than CDnCNN [44]which is trained on plenty of noisy and clean image pairsExperiments on diverse synthetic and real-world benchmarksdemonstrate that when trained with our NAC strategy anunpolished ResNet [18] outperforms previous superviseddenoising networks on removing ldquoweakrdquo noise Our workreveals that when the noise is ldquoweakrdquo an unsupervisednetwork trained with only the corrupted test image can obtainbetter denoising performance than the supervised ones

2 Related WorkSupervised denoising networks are trained with plentypairs of noisy and clean images This category of networkscan learn external image priors and noise statistics from thetraining data Numerous methods [9 17 25 29 31 35 40 4244] have been developed with achieving promising perfor-mance on AWGN noise removal where the statistics of train-ing and test noise are similar However due to the aforemen-tioned domain gap problem the performance of these net-works degrade severely on real noisy photographs [13434]Unsupervised denoising networks are developed to re-

Image NoiseType Method YearrsquoPub Prior Stat

S DnCNN [44] 17rsquoTIP Ext 3CBDNet [17] 19rsquoCVPR Ext 3

U

N2N [26] 18rsquoICML Ext 3DIP [27] 18rsquoCVPR IntN2S [6] 19rsquoICML ExtN2V [23] 19rsquoCVPR Ext

SS [24] 19rsquoNeurIPS ExtNAC (Ours) 20rsquoSubmit Int 3

Table 1 Summary of representative networks for image de-noising S Supervised networks U Unsupervised networksN2N Noise2Noise [26] DIP Deep Image Prior [27] N2SNoise2Self [6] N2V Noise2Void [23] SS Self-Supervised [24]Pub Publication Int Internal image priors Ext External imagepriors Stat Statistics The networks with ldquo3rdquo are able to learnthe noise statistics from training data

move the need on plenty of clean images Along this di-rection Noise2Noise (N2N) [26] trains the network betweenpairs of corrupted images with the same scene but indepen-dently sampled noise This work is feasible to learn externalimage priors and noise statistics from the training data How-ever in real-world scenarios it is difficult to collect largeamounts of paired images with independent corruption fortraining Noise2Void (N2V) [23] predicts a pixel from itssurroundings by learning blind-spot networks but it stillsuffers from the domain gap on image priors between thetraining images and test images This work assumes thatthe corruption is zero-mean and independent between pixelsHowever as mentioned in Noise2Self (N2S) [6] N2V [23]significantly degrades the training efficiency and denois-ing performance at test time Recently Deep Image Prior(DIP) [27] reveals that the network structure can resonatewith the natural image priors and can be utilized in imagerestoration without external images However it is not prac-tical to select a suitable network and early-stop its trainingat right moments for each corrupted imageInternal and external image priors are widely used fordiverse image restoration tasks [14 36 43 45 46] Internalpriors are directly learned from the input test image itselfwhile the external ones are learned on external images (aslong as not the test one) The internal priors are adaptiveto its image contents but somewhat affected by the corrup-tions [14 45] By contrast the external priors are effectivefor restoring images with general contents but may not beoptimal for specific test image [11 14 36 37 43 46]Noise statistics is of key importance for image denoisingThe AWGN noise is one typical noise with widespread studyRecently researchers shift more attention to the realisticnoise produced in camera sensors [4 34] which is usuallymodeled as mixed Poisson and Gaussian distribution [15]The Poisson component mainly comes from the irregular

photons hitting the sensor [28] while Gaussian noise is ma-jorly produced by dark current [33] Though performing wellon the synthetic noise being trained with supervised denois-ers [9 17 25 29 31 40 42 44] still suffer from the domaingap problem when processing the real noisy photographs

In Table 1 we summarize several recently published de-noising networks [6 8 23 26 27 29 31 35 40 44] fromthe aspects of supervised and unsupervised networks imagepriors and noise statistics In this work to bridge the domaingap problem we propose a ldquoNoisy-As-Cleanrdquo strategy tolearn the image-specific internal priors and noise statisticsdirectly from the corrupted test image

3 Theoretical Background of ldquoNoisy-As-Cleanrdquo Strategy

Training a supervised network fθ (parameterized by θ)requires many pairs (yixi) of noisy image yi and cleanimage xi by minimizing an empirical loss function L as

argminθ

sumi=1

L(fθ(yi)xi) (1)

Assume that the probability of occurrence for pair (yixi)is p(yixi) then statistically we have

θlowast = argminθ

sumi=1

p(yixi)L(fθ(yi)xi)

= argminθ

E(yx)[L(fθ(y)x)](2)

where y and x are random variables of noisy and clean im-ages respectively The paired variables (yx) are dependentand their relationship is y = x+no where no is the randomvariable of observed noise By exploring the dependence ofp(yixi) = p(xi)p(yi|xi) Eqn (2) is equivalent to

θlowast = argminθ

sumi=1

p(xi)p(yi|xi)L(fθ(yi)xi)

= argminθ

Ex[Ey|x[L(fθ(y)x)]](3)

Eqn (3) indicates that the network fθ can minimize the lossfunction by solving the same problem separately for eachclean image sample

Different with the ldquozero-meanrdquo assumption in [23 26]here we study a practical assumption on noise statistics iethe expectation E[x] and variance Var[x] of signal intensityare much stronger than those of noise E[no] and Var[no](such that they are negligible but not necessarily zero)

E[x] E[no] Var[x] Var[no] (4)

This is actually valid in real-world scenarios since we canclearly observe the contents in most real photographs withlittle influence of the noise The noise therein is often mod-eled by zero-mean Gaussian or mixed Poisson and Gaussian

(for realistic noise) Hence the noisy image y should havesimilar expectation with the clean image x

E[y] = E[x+ no] = E[x] + E[no] asymp E[x] (5)

Now we add simulated noise ns to the observed noisy imagey and generate a new noisy image z = y + ns We assumethat ns is statisticly close to no ie E[ns] asymp E[no] andVar[ns] asymp Var[no] Then we have

E[z] E[ns] Var[z] Var[ns] (6)

Therefore the simulated noisy image z has similar expecta-tion with the observed noisy image y

E[z] = E[y + ns] asymp E[y] (7)

By the Law of Total Expectation [7] we have

Ey[Ez[z|y]] = E[z] asymp E[y] = Ex[Ey[y|x]] (8)

Since the loss function L (usually `2) and the conditionalprobability density functions p(y|x) and p(z|y) are all con-tinuous everywhere the optimal network parameters θlowast

of Eqn (3) changes little with the addition of negligiblenoise no or ns With Eqns (4)-(8) when the x-conditionedexpectation of Ey|x[L(fθ(y)x)] are replaced with the y-conditioned expectation of Ez|y[L(fθ(z)y)] fθ obtainssimilar y-conditioned optimal parameters θlowast

argminθ

Ey[Ez|y[L(fθ(z)y)]]

asymp argminθ

Ex[Ey|x[L(fθ(y)x)]] = θlowast(9)

The network fθ minimizes the loss function L for each inputimage pair separately which equals to minimize it on all fi-nite pairs of images Through simple manipulations Eqn (9)is equivalent to

argminθ

sumi=1

p(yi)p(zi|yi)L(fθ(zi)yi)

= argminθ

Ey[Ez|y[L(fθ(z)y)]] asymp θlowast(10)

By exploring the dependence of p(ziyi) = p(yi)p(zi|yi)Eqn (10) is equivalent to

argminθ

E(zy)[L(fθ(z)y)]

= argminθ

sumi=1

p(ziyi)L(fθ(zi)yi) asymp θlowast(11)

Our observation is very simple and useful as long asthe noise is weak the optimal parameters of unsupervisednetwork trained on noisy image pairs (ziyi) are veryclose to the optimal parameters of the supervised networkstrained on pairs of noisy and clean images (yixi)

x

CleanImage

no

ObservedNoise

ns

SimulatedNoise

z

InputImage

Element-wiseSum

x

CleanImage

OutputImage

TargetImage

ObservedNoise

( (z) y)fθ

( (z) y)fθ LossFunction

Network(z)fθ

no

y

Figure 2 Proposed ldquoNoisy-As-Cleanrdquo strategy for training unsupervised image denoising networks In this strategy we take theobserved noisy image y = x+ no as the ldquocleanrdquo target and take the simulated noisy image z = y + ns as the input

Consistency of noise statistics. Since our contexts are real-world scenarios, the noise can be modeled by a mixed Poisson and Gaussian distribution [15]. Fortunately, both distributions are additive, i.e., the sum of two Poisson (or Gaussian) distributed variables is still Poisson (or Gaussian) distributed. Assume that the observed (simulated) noise n_o (n_s) follows a mixed x-dependent (y-dependent) Poisson distribution parameterized by λ_o (λ_s) and a Gaussian distribution N(0, σ_o²) (N(0, σ_s²)), i.e.,

n_o ~ x ⊙ P(λ_o) + N(0, σ_o²),
n_s ~ y ⊙ P(λ_s) + N(0, σ_s²) ≈ x ⊙ P(λ_s) + N(0, σ_s²),   (12)

where x ⊙ P(λ_o) and y ⊙ P(λ_s) indicate that the noise n_o and n_s are element-wise dependent on x and y, respectively. The "≈" holds because the observed noise n_o is negligible. Thus we have

n_o + n_s ~ x ⊙ P(λ_o + λ_s) + N(0, σ_o² + σ_s² + 2ρσ_oσ_s),   (13)

where ρ is the correlation between n_o and n_s (ρ = 0 if they are independent). This indicates that the summed noise variable n_o + n_s still follows a mixed x-dependent Poisson and Gaussian distribution, guaranteeing the consistency in noise statistics between the observed realistic noise and the simulated noise. As shown by the experiments (§5), this property makes our "Noisy-As-Clean" strategy consistently effective on different noise removal tasks.
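To make the mixed Poisson-Gaussian simulation of Eqn. (12) concrete, here is a minimal PyTorch sketch; the function name `simulate_noise` and its default parameters are our own placeholders rather than part of the paper's released code, and the photon-count scaling of the Poisson part is one common convention.

```python
import torch

def simulate_noise(y, lam=25.0, sigma=25.0 / 255.0):
    """Sample y-dependent mixed Poisson-Gaussian noise n_s, cf. Eqn. (12).

    y     : observed noisy image, tensor with values in [0, 1]
    lam   : Poisson rate parameter lambda_s (hypothetical default)
    sigma : std of the AWGN component, in [0, 1] image scale
    """
    # Signal-dependent Poisson part: treat y * lam as photon counts, sample,
    # rescale, and subtract the mean so only the fluctuation remains.
    poisson_part = torch.poisson(y * lam) / lam - y
    # Signal-independent Gaussian part.
    gaussian_part = torch.randn_like(y) * sigma
    return poisson_part + gaussian_part
```

The simulated noisy input of the NAC pair (z, y) is then simply z = y + simulate_noise(y).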

4. Learning "Noisy-As-Clean" Networks for Unsupervised Image Denoising

With the statistical analysis above, we now propose to learn unsupervised networks with our "Noisy-As-Clean" (NAC) strategy for image denoising. Note that we only need the observed noisy image y to generate noisy image pairs (z, y) with simulated noise n_s. Our idea is illustrated in Figure 2.

Training NAC networks. For real-world images captured by camera sensors, one can hardly distinguish the realistic noise from the signal. Our observation is that the signal intensity is usually much stronger than the noise intensity; that is, the expectation of the observed (realistic) noise n_o is usually much smaller than that of the latent clean image x. If we train an image-specific network for the new noisy image z and regard the original noisy image y as the ground-truth image, then the trained image-specific network jointly learns the image-specific prior and the noise statistics, and it has the capacity to remove the noise n_s from the new noisy image z. If we then perform denoising on the original noisy image y, the observed noise n_o can easily be removed. Note that we do not use the clean image x as the "ground-truth" in training our NAC networks.
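This training procedure can be summarized in a short PyTorch sketch. It is a simplified illustration under our own naming (`train_nac`, `net`, and a `simulate_noise` callable like the one sketched in §3), not the authors' released implementation:

```python
import torch
import torch.nn as nn

def train_nac(net, y, simulate_noise, epochs=1000, lr=1e-3):
    """Train an image-specific NAC network on one observed noisy image y.

    net            : denoising network f_theta (e.g., a small ResNet)
    y              : observed noisy image, tensor of shape (1, C, H, W)
    simulate_noise : callable returning a noise sample n_s shaped like y
    """
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()                 # the l2 loss used in the paper
    net.train()
    for _ in range(epochs):
        z = y + simulate_noise(y)          # simulated noisy input z = y + n_s
        loss = loss_fn(net(z), y)          # noisy image y acts as the "clean" target
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    net.eval()
    with torch.no_grad():
        return net(y)                      # single test pass: f_theta*(y)
```

Calling `train_nac` once per test image realizes the image-specific training described above; the final line performs the single test pass described under "Testing" below.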

Training blind denoising. Most existing supervised denoising networks train a specific model to process a fixed noise pattern [8, 9, 29, 33, 40]. To tackle unknown noise, one feasible solution for these networks is to assume the noise to be AWGN and estimate its deviation; the noise is then removed by the network trained at the estimated level. But this strategy largely degrades the denoising performance when the noise deviation is not estimated accurately. Besides, this solution can hardly deal with realistic noise captured on real photographs, which is usually not AWGN. In order to be effective on removing realistic noise, our NAC networks should be able to blindly remove unknown noise from real photographs. Inspired by [17, 44], we propose to train a blind version of our NAC networks by using AWGN within a range of levels (e.g., [0, 55]) for removing unknown AWGN, and mixed AWGN and Poisson noise (both within a range of intensities) for removing realistic noise. More details are given in §5.2.
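For the blind variant, the noise level can simply be redrawn at every training step; a minimal sketch follows (the helper name is ours, and the paper samples the level from either a Gaussian or a uniform distribution over [0, 55], of which we show the uniform case):

```python
import torch

def sample_blind_awgn(y, max_sigma=55.0 / 255.0):
    """AWGN with a random level in [0, max_sigma], for blind NAC training."""
    sigma = torch.rand(1).item() * max_sigma  # uniform draw over the level range
    return torch.randn_like(y) * sigma
```

Passing `sample_blind_awgn` as the `simulate_noise` callable of the training sketch above yields the "Blind-NAC" setting of §5.2.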

Testing is performed by directly taking an observed noisy image y = x + n_o as the input. We only test the image y once. The denoised image can be represented as x̂ = f_θ*(y), with which the objective metrics, e.g., PSNR and SSIM [41], can be computed against the clean image x.

Noise Level    σ = 5            σ = 10           σ = 15           σ = 20           σ = 25
Metric         PSNR↑   SSIM↑    PSNR↑   SSIM↑    PSNR↑   SSIM↑    PSNR↑   SSIM↑    PSNR↑   SSIM↑
BM3D [13]      38.07   0.9580   34.40   0.9234   32.38   0.8957   31.00   0.8717   29.97   0.8503
DnCNN [44]     38.76   0.9633   34.78   0.9270   32.86   0.9027   31.45   0.8799   30.43   0.8617
N2N [26]       39.72   0.9665   36.18   0.9446   33.99   0.9149   32.10   0.8788   30.72   0.8446
DIP [27]       32.49   0.9344   31.49   0.9299   29.59   0.8636   27.67   0.8531   25.82   0.7723
N2V [23]       27.06   0.8174   26.79   0.7859   26.12   0.7468   25.89   0.7405   25.01   0.6564
NAC            39.99   0.9820   36.55   0.9569   34.24   0.9277   32.46   0.8961   31.08   0.8654
Blind-NAC      38.48   0.9805   36.65   0.9564   34.77   0.9275   33.13   0.9024   31.78   0.8802

Table 2. Average PSNR (dB) and SSIM [41] results of different methods on the Set12 dataset corrupted by AWGN noise. The best and second best results are highlighted in red and blue, respectively.

Noise Level    σ = 5            σ = 10           σ = 15           σ = 20           σ = 25
Metric         PSNR↑   SSIM↑    PSNR↑   SSIM↑    PSNR↑   SSIM↑    PSNR↑   SSIM↑    PSNR↑   SSIM↑
BM3D [13]      37.59   0.9640   33.32   0.9163   31.07   0.8720   29.62   0.8342   28.57   0.8017
DnCNN [44]     38.07   0.9695   33.88   0.9270   31.73   0.8706   30.27   0.8563   29.23   0.8278
N2N [26]       38.58   0.9627   34.07   0.9200   31.81   0.8770   30.14   0.8550   28.67   0.8123
DIP [27]       29.74   0.8435   28.16   0.8310   27.07   0.7867   25.80   0.7205   24.63   0.6680
N2V [23]       26.70   0.7915   26.39   0.7621   25.77   0.7126   25.41   0.6678   24.83   0.6305
NAC            39.00   0.9707   34.60   0.9324   32.13   0.8942   30.47   0.8636   28.96   0.8185
Blind-NAC      38.26   0.9605   34.26   0.9266   32.06   0.8919   30.50   0.8609   29.33   0.8327

Table 3. Average PSNR (dB) and SSIM [41] results of different methods on the BSD68 dataset corrupted by AWGN noise. The best and second best results are highlighted in red and blue, respectively.

Implementation details. We employ the ResNet-20 network used in [27] as the backbone network, which includes 10 residual blocks. Each block contains two convolutional layers, each followed by Batch Normalization (BN) [20]; the Rectified Linear Unit (ReLU) activation [32] is used after the first BN. The parameters are randomly initialized without pretraining. The optimizer is Adam [21] with default parameters, and the learning rate is fixed at 0.001 in all experiments. We use the ℓ_2 loss function. The network is trained for 1000 epochs for each test image. For data augmentation, we employ 4 rotations (0°, 90°, 180°, 270°) combined with 2 mirror (vertical and horizontal) reflections, resulting in 8 transformations in total. We implement our ResNet-based NAC networks in PyTorch [2].
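The 8 augmentation transforms amount to the dihedral group of the square; one possible PyTorch realization (our own helper, equivalent to combining the 4 rotations with mirror reflections) is:

```python
import torch

def augment_8(img):
    """Return the 8 dihedral transforms of img (4 rotations x 2 mirror states).

    img : image tensor of shape (C, H, W)
    """
    out = []
    for k in range(4):                          # rotations by 0/90/180/270 degrees
        rot = torch.rot90(img, k, dims=(1, 2))
        out.append(rot)                         # rotated copy
        out.append(torch.flip(rot, dims=(2,)))  # mirrored copy of that rotation
    return out
```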

5. Experiments

In this section, we evaluate the performance of our "Noisy-As-Clean" (NAC) networks on image denoising. In all experiments, we train a denoising network using only the noisy test image y as the target, and using the simulated noisy image z (with data augmentation) as the input. For all comparison methods, the source codes or trained models are downloaded from the corresponding authors' websites, and we use the default parameter settings unless otherwise specified. The PSNR, SSIM [41], and visual quality of different methods are used for evaluation. We first test on synthetic noise, such as additive white Gaussian noise (AWGN), in §5.1, then perform blind image denoising in §5.2, and finally tackle realistic noise in §5.3. In §5.4, we conduct comprehensive ablation studies to gain deeper insights into the proposed NAC strategy.

5.1. Synthetic Noise Removal With Known Noise

We evaluate the proposed NAC networks on images corrupted by synthetic noise such as AWGN. More experimental results on signal-dependent Poisson noise and mixed Poisson-AWGN noise are provided in the Supplementary File.

Training NAC networks. Here, we train an image-specific denoising network using the observed noisy test image y as the target and the simulated noisy image z as the input. Each observed noisy image y = x + n_o is generated by adding the observed noise n_o to the clean image x. The simulated noisy image z = y + n_s is generated by adding simulated noise n_s to the observed noisy image y.

Comparison methods. We compare our NAC networks with state-of-the-art image denoising methods [13, 26, 27, 30, 44]. On AWGN noise, we compare with BM3D [13], DnCNN [44], Noise2Noise (N2N) [26], Deep Image Prior (DIP) [27], and Noise2Void (N2V) [23].

Test datasets. We evaluate the comparison methods on the Set12 and BSD68 datasets, which are widely used by supervised denoising networks [29, 31, 40, 44] and previous methods [13, 16, 43, 46]. The Set12 dataset contains 12 images of size 512×512 or 256×256, while the BSD68 dataset contains 68 images of different sizes.

Results on AWGN noise. We test AWGN with noise deviation (noise level) σ ∈ {5, 10, 15, 20, 25}, i.e., the observed noise n_o is AWGN with standard deviation (std) σ.

Dataset     Metric   CBM3D [12]   NI [5]   DnCNN+ [44]   CBDNet [17]   GCBD [10]   N2N [26]   DIP [27]   NAC
CC [33]     PSNR↑    35.19        35.33    35.40         36.44         NA          35.32      35.69      36.59
            SSIM↑    0.9063       0.9212   0.9115        0.9460        NA          0.9160     0.9259     0.9502
DND [34]    PSNR↑    34.51        35.11    37.90         38.06         35.58       33.10      NA         36.20
            SSIM↑    0.8507       0.8778   0.9430        0.9421        0.9217      0.8110     NA         0.9252

Table 4. Average PSNR (dB) and SSIM [41] of different methods on the CC dataset [33] and the DND dataset [34]. CBM3D and NI are traditional methods, DnCNN+ and CBDNet are supervised networks, and GCBD, N2N, DIP, and NAC are unsupervised networks. The best results are highlighted in bold. "NA" means "Not Available", due to unavailable code (GCBD on CC [33]) or difficult experiments (DIP on DND [34]).

Since AWGN is signal independent, the simulated noise n_s is set with the same σ as that of n_o. The comparison results are listed in Tables 2 and 3. It can be seen that the networks trained with the proposed NAC strategy achieve much better PSNR and SSIM [41] performance than BM3D [13] and DnCNN [44], two previous leading image denoising methods. Note that DnCNN is a supervised network trained on pairs of clean and synthetic noisy images. Our NAC networks also outperform the other unsupervised networks, N2N [26], DIP [27], and N2V [23], by a significant margin on PSNR and SSIM [41].

5.2. Synthetic Noise Removal With Unknown Noise

To deal with unknown noise, we train a blind version of our NAC networks. Here we test our NAC networks on AWGN with unknown noise deviation, using the same training strategy, comparison methods, and test datasets as in §5.1.

Training blind NAC networks. For each test image, we train a NAC network on copies of the image corrupted by AWGN with unknown noise levels (deviations): the noise levels are randomly sampled (from a Gaussian distribution) within [0, 55]. We also tested noise levels sampled from a uniform distribution and obtained similar results. We repeat the training of the NAC network on the test image with different deviations. The NAC networks trained on AWGN with unknown noise levels are termed "Blind-NAC".

Results on blind denoising. For the same test image, we add AWGN whose deviation is also in {5, 10, 15, 20, 25}. The blindly trained NAC network is directly utilized to denoise the test image, without estimating its deviation. The results are also listed in Tables 2 and 3. We observe that our Blind-NAC networks, trained on AWGN with unknown levels, can achieve even better PSNR and SSIM [41] results than our NAC networks trained at specific noise levels. Note that on BSD68, our Blind-NAC networks achieve better performance than DnCNN [44], which achieves higher PSNR and SSIM results than our NAC networks. This demonstrates the effectiveness of our NAC networks on blind image denoising. With this success on blind denoising, we next turn to real-world image denoising, in which the noise is unknown and complex.

5.3. Practice on Real Photographs

With the promising performance on blind image denoising, here we tackle realistic noise for practical applications. The observed realistic noise n_o can be roughly modeled as mixed Poisson and AWGN noise [15, 17]. Hence, for each observed noisy image y, we generate the simulated noise n_s by sampling the y-dependent Poisson part and the independent AWGN part.

Training blind NAC networks is also performed for each test image, i.e., the observed noisy image y. In real-world scenarios, each observed noisy image y is corrupted without knowledge of the specific statistics of the observed noise n_o. Therefore, the simulated noise n_s is directly estimated on y as mixed y-dependent Poisson and AWGN noise. For each transformed image in data augmentation, the Poisson noise is randomly sampled with parameter λ in 0 < λ ≤ 25, and the AWGN is randomly sampled with noise level σ in 0 < σ ≤ 25.

Comparison methods. We compare with state-of-the-art methods on real-world image denoising, including CBM3D [12], the commercial software Neat Image [5], two supervised networks, DnCNN+ [44] and CBDNet [17], and three unsupervised networks, GCBD [10], Noise2Noise [26], and DIP [27]. Note that DnCNN+ [44] and CBDNet [17] are two state-of-the-art supervised networks for real-world image denoising, and DnCNN+ is an improved extension of DnCNN [44] with much better performance (the authors of DnCNN+ provided us with the models/results of DnCNN+).

Test datasets. We evaluate the comparison methods on the Cross-Channel (CC) dataset [33] and the DND dataset [34]. The CC dataset [33] includes noisy images of 11 static scenes captured by Canon 5D Mark 3, Nikon D600, and Nikon D800 cameras. The real-world noisy images are collected under a highly controlled indoor environment: each scene is shot 500 times using the same camera and settings, and the average of the 500 shots is taken as the "ground-truth". We use the default 15 images of size 512×512 cropped by the authors to evaluate different image denoising methods. The DND dataset [34] contains 50 scenes captured by Sony A7R, Olympus E-M10, Sony RX100 IV, and Huawei Nexus 6P cameras. Each scene is cropped to 20 bounding boxes of 512×512 pixels, generating 1000 test images in total. The noisy images are collected under higher ISO values with
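As an illustration of this per-augmentation sampling scheme, a self-contained sketch follows, with our own helper name and the parameter ranges stated above (an assumption about one reasonable realization, not the authors' exact code):

```python
import torch

def sample_realistic_noise(y, max_lam=25.0, max_sigma=25.0 / 255.0):
    """Draw one mixed y-dependent Poisson + AWGN sample with random parameters."""
    lam = torch.empty(1).uniform_(1e-3, max_lam).item()     # 0 < lambda <= 25
    sigma = torch.empty(1).uniform_(0.0, max_sigma).item()  # 0 < sigma <= 25 (in [0,1] scale)
    poisson_part = torch.poisson(y * lam) / lam - y         # signal-dependent part
    gaussian_part = torch.randn_like(y) * sigma             # independent AWGN part
    return poisson_part + gaussian_part
```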

Figure 3. Denoised images and PSNR (dB)/SSIM by comparison methods on "0017_3" in DND [34]: (a) Noisy, 31.46/0.9370; (b) CBM3D [12], 36.26/0.9811; (c) NI [5], 37.52/0.9868; (d) DnCNN+ [44], 38.25/0.9888; (e) CBDNet [17], 39.34/0.9905; (f) GCBD [10], 37.52/0.9765; (g) N2N [26], 34.95/0.9621; (h) NAC, 38.34/0.9887. The "ground-truth" image is not released, but PSNR (dB)/SSIM results are publicly provided by the DND benchmark.

shorter exposure times, while the "ground truth" images are captured under lower ISO values with adjusted, longer exposure times. The "ground truth" images are not released, but we can obtain the PSNR and SSIM performance by submitting the denoised images to the DND website.

Results on PSNR and SSIM. The comparisons on average PSNR and SSIM are listed in Table 4. As can be seen, on the CC dataset the proposed NAC networks achieve better performance than all the comparison methods, including CBM3D [12], the supervised networks DnCNN+ [44] and CBDNet [17], and the unsupervised networks GCBD [10], N2N [26], and DIP [27]. This demonstrates that the proposed NAC networks can indeed handle complex, unknown, realistic noise, achieving performance better than (on CC) or competitive with (on DND) supervised networks such as DnCNN+ [44] and CBDNet [17].

Qualitative results. In Figure 3, we show the denoised images of our NAC network and the comparison methods on the image "0017_3" from the DND dataset. We observe that our unsupervised NAC networks are very effective at removing realistic noise from the real photograph. Besides, our NAC networks achieve competitive PSNR and SSIM results when compared with the other methods, including the supervised DnCNN+ [44] and CBDNet [17].

Speed. The work most similar to ours is Deep Image Prior (DIP) [27], which also trains an image-specific network for each test image. On average, DIP needs 603.9 seconds to process a 512×512 color image, on which our NAC network needs 583.2 seconds (on an NVIDIA Titan X GPU).

5.4. Ablation Study

To further study our NAC strategy, we conduct a detailed examination of our NAC networks on image denoising.

1) Generality of our NAC strategy. To evaluate the generality of the proposed NAC strategy, we apply it to the DnCNN [44] network and denote the resulting network as "DnCNN-NAC". The comparison results between DnCNN and DnCNN-NAC are listed in Table 5. One can see that DnCNN-NAC achieves better PSNR results than the original DnCNN when σ = 5, 10, 15 (but worse when σ = 20, 25). Note that the original DnCNN network is trained offline on the BSD400 dataset, while the DnCNN-NAC network here is trained online for each specific test image.

σ              5       10      15      20      25
DnCNN [44]     38.76   34.78   32.86   31.45   30.43
DnCNN-NAC      43.18   37.16   33.65   31.16   29.23

Table 5. PSNR (dB) results of DnCNN and DnCNN-NAC on Set12 corrupted by AWGN noise with different σ.

2) Differences from DIP [27]. Though the basic network in our work is the ResNet used in DIP [27], our NAC network is essentially different from DIP in at least two aspects. First, our NAC is a novel strategy for unsupervised learning of adaptive network parameters for the degraded image, while DIP aims to investigate an adaptive network structure without learning the parameters. Second, our NAC learns a mapping from the synthetic noisy image z = y + n_s to the noisy image y, which approximates the mapping from the noisy image y = x + n_o to the clean image x; DIP instead maps a random noise map to the noisy image y, and the denoised image is obtained during this process. For these two reasons, DIP needs early stopping for different images, while our NAC achieves more robust (and better) denoising performance than DIP on diverse images. In Figure 4, we plot the curves of training loss and test PSNR of the DIP (a) and NAC (b) networks over 10,000 epochs on the two images "Cameraman" and "House". We observe that DIP needs early stopping to select the best results, while our NAC stably achieves better denoising results within 1000 epochs.

Figure 4. Training loss and PSNR (dB) curves of the DIP [27] (a) and our NAC (b) networks w.r.t. the number of epochs, on the images "Cameraman" and "House" from Set12.

3) Influence of the number of residual blocks and epochs. Our backbone network is the ResNet [27] with 10 residual blocks, trained for 1000 epochs. Here we study how the numbers of residual blocks and epochs influence the denoising performance of NAC. The experiments are performed on the Set12 dataset corrupted by AWGN noise (σ = 15). From Table 6, we observe that with more residual blocks, the NAC networks achieve better PSNR and SSIM [41] results, and 10 residual blocks are enough for satisfactory results; with more (e.g., 15) blocks, there is little improvement in PSNR and SSIM. Hence, we use 10 residual blocks, the same as [27]. We then study how the number of epochs influences the performance. From Table 7, one can see that on the Set12 dataset corrupted by AWGN noise (σ = 15), with more training epochs our NAC networks achieve better PSNR and SSIM results, but at the cost of longer processing time.

# of Blocks    1        2        5        10       15
PSNR↑          33.58    33.85    34.14    34.24    34.28
SSIM↑          0.9161   0.9226   0.9272   0.9277   0.9272

Table 6. Average PSNR (dB)/SSIM of NAC with different numbers of blocks on Set12 corrupted by AWGN noise (σ = 15).

4) Comparison with Oracle. We also study the "Oracle" performance of our NAC networks. In "Oracle", we train our NAC networks on the pair of the observed noisy image y and its clean image x, corrupted by AWGN or signal-dependent Poisson noise. The experiments are performed on the Set12 dataset, with noise deviations in {5, 10, 15, 20, 25}. Figure 5 (a) compares our NAC networks and their "Oracle" versions on PSNR and SSIM. It can be seen that the "Oracle" networks, trained on noisy-clean image pairs, perform only slightly better than the original NAC networks trained with the simulated-observed noisy image pairs (z, y). That is, with our NAC strategy, the networks trained only with the noisy test image achieve similar denoising performance on weak noise.

5) Performance on strong noise. Our NAC strategy is based on the "weak noise" assumption, so it is natural to wonder how well NAC performs against strong noise. To answer this question, we compare the NAC networks with BM3D [13] and DnCNN [44] on Set12 corrupted by AWGN noise with σ = 50. The PSNR and SSIM results are plotted in Figure 5 (b). One can see that our NAC networks are limited in handling strong AWGN noise when compared with BM3D [13] and DnCNN [44].

Figure 5. Comparisons of PSNR (dB) and SSIM results on Set12: (a) by our NAC networks and their "Oracle" versions for AWGN with σ ∈ {5, 10, 15, 20, 25}; (b) by BM3D [13], DnCNN [44], and our NAC networks for strong AWGN (σ = 50).

# of Epochs    100      200      500      1000     5000
PSNR↑          31.80    32.79    33.77    34.24    34.32
SSIM↑          0.8714   0.9023   0.9189   0.9277   0.9280
Time↓ (s)      67.4     132.5    302.0    583.2    2815.6

Table 7. Average PSNR (dB), SSIM, and time (s) of NAC with different numbers of epochs on Set12 corrupted by AWGN noise (σ = 15).

6. Conclusion

In this work, we proposed a "Noisy-As-Clean" (NAC) strategy for learning unsupervised image denoising networks. In NAC, we train an image-specific network by taking the noisy test image as the target, and adding simulated noise to it to generate the simulated noisy input; the simulated noise is close to the observed noise in the noisy test image. This strategy can be seamlessly embedded into existing supervised denoising networks. We provided a simple and useful observation: it is possible to learn an unsupervised network only with the noisy image, approximating the optimal parameters of a supervised network learned with pairs of noisy and clean images. Extensive experiments on benchmark datasets demonstrate that the networks trained with our NAC strategy achieve better or comparable performance on PSNR, SSIM, and visual quality when compared to previous state-of-the-art image denoising methods, including several supervised denoising networks. These results validate that our NAC strategy can learn image-specific priors and noise statistics only from the corrupted test image.

References

[1] Darmstadt Noise Dataset Benchmark. https://noise.visinf.tu-darmstadt.de/benchmark/#results_srgb. Accessed 2019-05-23.
[2] PyTorch. https://pytorch.org. Accessed 2019-05-23.
[3] Smartphone Image Denoising Dataset Benchmark. https://www.eecs.yorku.ca/~kamel/sidd/benchmark.php. Accessed 2019-05-23.
[4] Abdelrahman Abdelhamed, Stephen Lin, and Michael S. Brown. A high-quality denoising dataset for smartphone cameras. In CVPR, June 2018.
[5] Neatlab ABSoft. Neat Image. https://ni.neatvideo.com/home.
[6] Joshua Batson and Loic Royer. Noise2Self: Blind denoising by self-supervision. In ICML, volume 97, pages 524–533. PMLR, 2019.
[7] Patrick Billingsley. Probability and Measure. Wiley Series in Probability and Statistics. Wiley, 1995.
[8] Tim Brooks, Ben Mildenhall, Tianfan Xue, Jiawen Chen, Dillon Sharlet, and Jonathan T. Barron. Unprocessing images for learned raw denoising. In CVPR, pages 9446–9454, 2019.
[9] Harold Christopher Burger, Christian J. Schuler, and Stefan Harmeling. Image denoising: Can plain neural networks compete with BM3D? In CVPR, pages 2392–2399, 2012.
[10] Jingwen Chen, Jiawei Chen, Hongyang Chao, and Ming Yang. Image blind denoising with generative adversarial network based noise modeling. In CVPR, pages 3155–3164, 2018.
[11] Yunjin Chen and Thomas Pock. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6):1256–1272, 2017.
[12] Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, and Karen Egiazarian. Color image denoising via sparse 3D collaborative filtering with grouping constraint in luminance-chrominance space. In ICIP, pages 313–316. IEEE, 2007.
[13] Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, and Karen Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing, 16(8):2080–2095, 2007.
[14] Michael Elad and Michal Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 15(12):3736–3745, 2006.
[15] Alessandro Foi, Mejdi Trimeche, Vladimir Katkovnik, and Karen Egiazarian. Practical Poissonian-Gaussian noise modeling and fitting for single-image raw-data. IEEE Transactions on Image Processing, 17(10):1737–1754, Oct 2008.
[16] Shuhang Gu, Qi Xie, Deyu Meng, Wangmeng Zuo, Xiangchu Feng, and Lei Zhang. Weighted nuclear norm minimization and its applications to low level vision. International Journal of Computer Vision, 121(2):183–208, 2017.
[17] Shi Guo, Zifei Yan, Kai Zhang, Wangmeng Zuo, and Lei Zhang. Toward convolutional blind denoising of real photographs. In CVPR, 2019.
[18] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, pages 770–778, 2016.
[19] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. Densely connected convolutional networks. In CVPR, pages 4700–4708, 2017.
[20] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
[21] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[22] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, pages 1097–1105, 2012.
[23] Alexander Krull, Tim-Oliver Buchholz, and Florian Jug. Noise2Void: Learning denoising from single noisy images. In CVPR, 2019.
[24] Samuli Laine, Tero Karras, Jaakko Lehtinen, and Timo Aila. High-quality self-supervised deep image denoising. In NeurIPS, 2019.
[25] Stamatios Lefkimmiatis. Non-local color image denoising with convolutional neural networks. In CVPR, pages 3587–3596, 2017.
[26] Jaakko Lehtinen, Jacob Munkberg, Jon Hasselgren, Samuli Laine, Tero Karras, Miika Aittala, and Timo Aila. Noise2Noise: Learning image restoration without clean data. In ICML, pages 2971–2980, 2018.
[27] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep image prior. In CVPR, pages 9446–9454, 2018.
[28] Ce Liu, William T. Freeman, Richard Szeliski, and Sing Bing Kang. Noise estimation from a single image. In CVPR, volume 1, pages 901–908, 2006.
[29] Ding Liu, Bihan Wen, Yuchen Fan, Chen Change Loy, and Thomas S. Huang. Non-local recurrent network for image restoration. In NeurIPS, pages 1673–1682, 2018.
[30] Markku Makitalo and Alessandro Foi. Optimal inversion of the Anscombe transformation in low-count Poisson image denoising. IEEE Transactions on Image Processing, 20(1):99–109, 2011.
[31] Xiao-Jiao Mao, Chunhua Shen, and Yu-Bin Yang. Image restoration using convolutional auto-encoders with symmetric skip connections. In NIPS, 2016.
[32] Vinod Nair and Geoffrey E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, pages 807–814, 2010.
[33] Seonghyeon Nam, Youngbae Hwang, Yasuyuki Matsushita, and Seon Joo Kim. A holistic approach to cross-channel image noise modeling and its application to image denoising. In CVPR, pages 1683–1691, 2016.
[34] Tobias Plotz and Stefan Roth. Benchmarking denoising algorithms with real photographs. In CVPR, 2017.
[35] Tobias Plotz and Stefan Roth. Neural nearest neighbors networks. In NeurIPS, 2018.
[36] Stefan Roth and Michael J. Black. Fields of experts. International Journal of Computer Vision, 82(2):205–229, 2009.
[37] Uwe Schmidt and Stefan Roth. Shrinkage fields for effective image restoration. In CVPR, pages 2774–2781, June 2014.
[38] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[39] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In CVPR, pages 1–9, 2015.
[40] Ying Tai, Jian Yang, Xiaoming Liu, and Chunyan Xu. MemNet: A persistent memory network for image restoration. In ICCV, 2017.
[41] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
[42] Junyuan Xie, Linli Xu, and Enhong Chen. Image denoising and inpainting with deep neural networks. In NIPS, pages 341–349, 2012.
[43] Jun Xu, Wangmeng Zuo, Lei Zhang, David Zhang, and X. Feng. Patch group based nonlocal self-similarity prior learning for image denoising. In ICCV, pages 244–252, 2015.
[44] Kai Zhang, Wangmeng Zuo, Yunjin Chen, Deyu Meng, and Lei Zhang. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 2017.
[45] Maria Zontak and Michal Irani. Internal statistics of a single natural image. In CVPR, 2011.
[46] Daniel Zoran and Yair Weiss. From learning models of natural image patches to whole image restoration. In ICCV, pages 479–486, 2011.


[36] Stefan Roth and Michael J Black Fields of experts Inter-national Journal of Computer Vision 82(2)205ndash229 20092

[37] Uwe Schmidt and Stefan Roth Shrinkage fields for effectiveimage restoration In CVPR pages 2774ndash2781 June 2014 2

[38] Karen Simonyan and Andrew Zisserman Very deep convolu-tional networks for large-scale image recognition In ICLR2015 1

[39] Christian Szegedy Wei Liu Yangqing Jia Pierre SermanetScott Reed Dragomir Anguelov Dumitru Erhan VincentVanhoucke and Andrew Rabinovich Going deeper withconvolutions In CVPR pages 1ndash9 2015 1

[40] Ying Tai Jian Yang Xiaoming Liu and Chunyan Xu Mem-net A persistent memory network for image restoration InICCV 2017 1 2 3 4 5

[41] Zhou Wang Alan C Bovik Hamid R Sheikh and Eero PSimoncelli Image quality assessment from error visibility tostructural similarity IEEE Transactions on Image Processing13(4)600ndash612 2004 5 6 8

[42] Junyuan Xie Linli Xu and Enhong Chen Image denoisingand inpainting with deep neural networks In NIPS pages341ndash349 2012 1 2 3

[43] Jun Xu Wangmeng Zuo Lei Zhang David Zhang and XFeng Patch group based nonlocal self-similarity prior learn-ing for image denoising In ICCV pages 244ndash252 2015 25

[44] Kai Zhang Wangmeng Zuo Yunjin Chen Deyu Meng andLei Zhang Beyond a Gaussian denoiser Residual learning ofdeep cnn for image denoising IEEE Transactions on ImageProcessing 2017 1 2 3 4 5 6 7 8

[45] Maria Zontak and Michal Irani Internal statistics of a singlenatural image In CVPR 2011 2

[46] Daniel Zoran and Yair Weiss From learning models of naturalimage patches to whole image restoration In ICCV pages479ndash486 2011 2 5

Page 4: Noisy-As-Clean: Learning Unsupervised Denoising from the ... · (d) CDnCNN+NAC: 35.80dB/0.9116 Figure 1. Denoised images and PSNR/SSIM results of CD-nCNN [44] (c) and CDnCNN trained

Figure 2. Proposed "Noisy-As-Clean" strategy for training unsupervised image denoising networks. In this strategy, we take the observed noisy image y = x + n_o as the "clean" target and take the simulated noisy image z = y + n_s as the input; the network f_θ maps z to an output image, which is compared against the target y by the loss function ℓ(f_θ(z), y).

Consistency of noise statistics. Since we target real-world scenarios, the noise can be modeled by a mixed Poisson and Gaussian distribution [15]. Fortunately, both distributions are additive: the sum of two independent Poisson (or Gaussian) distributed variables is still Poisson (or Gaussian) distributed. Assume that the observed (simulated) noise $n_o$ ($n_s$) follows a mixed $x$-dependent ($y$-dependent) Poisson distribution parameterized by $\lambda_o$ ($\lambda_s$) and a Gaussian distribution $\mathcal{N}(0, \sigma_o^2)$ ($\mathcal{N}(0, \sigma_s^2)$), i.e.,

$$n_o \sim x \cdot \mathcal{P}(\lambda_o) + \mathcal{N}(0, \sigma_o^2), \qquad n_s \sim y \cdot \mathcal{P}(\lambda_s) + \mathcal{N}(0, \sigma_s^2) \approx x \cdot \mathcal{P}(\lambda_s) + \mathcal{N}(0, \sigma_s^2), \tag{12}$$

where $x \cdot \mathcal{P}(\lambda_o)$ and $y \cdot \mathcal{P}(\lambda_s)$ indicate that the noise $n_o$ and $n_s$ are element-wise dependent on $x$ and $y$, respectively. The "$\approx$" holds because the observed noise $n_o$ is negligible relative to the signal, so $y \approx x$. Thus we have

$$n_o + n_s \sim x \cdot \mathcal{P}(\lambda_o + \lambda_s) + \mathcal{N}(0, \sigma_o^2 + \sigma_s^2 + 2\rho\sigma_o\sigma_s), \tag{13}$$

where $\rho$ is the correlation between $n_o$ and $n_s$ ($\rho = 0$ if they are independent). This indicates that the summed noise $n_o + n_s$ still follows a mixed $x$-dependent Poisson and Gaussian distribution, guaranteeing consistency in noise statistics between the observed realistic noise and the simulated noise. As shown by the experiments (§5), this property makes our "Noisy-As-Clean" strategy consistently effective on different noise removal tasks.
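To make Eqs. (12)-(13) concrete, the following is a minimal NumPy sketch of the mixed signal-dependent Poisson plus Gaussian noise model; the function name, the [0, 1] intensity range, and the parameter values are our illustrative assumptions, not the paper's code.

```python
import numpy as np

def mixed_noise(img, lam, sigma, rng):
    """Sample mixed signal-dependent Poisson + Gaussian noise for `img`.

    The Poisson residual has variance proportional to the local intensity
    (one common parameterization of x-dependent Poisson noise); the
    Gaussian part is independent AWGN with standard deviation `sigma`.
    """
    poisson_part = rng.poisson(lam * img) / lam - img   # zero-mean, signal-dependent
    gaussian_part = rng.normal(0.0, sigma, size=img.shape)
    return poisson_part + gaussian_part

rng = np.random.default_rng(0)
x = rng.random((256, 256))                           # stand-in "clean" image in [0, 1]
n_o = mixed_noise(x, lam=30.0, sigma=0.05, rng=rng)  # observed noise (Eq. 12, left)
y = x + n_o                                          # observed noisy image
n_s = mixed_noise(y, lam=30.0, sigma=0.05, rng=rng)  # simulated noise drawn from y
z = y + n_s                                          # simulated noisy input

# Eq. (13): for independent draws, std(n_o + n_s) ~= sqrt(std(n_o)^2 + std(n_s)^2).
print(n_o.std(), n_s.std(), (n_o + n_s).std())
```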

4. Learning "Noisy-As-Clean" Networks for Unsupervised Image Denoising

With the above statistical analysis, we propose to learn unsupervised networks with our "Noisy-As-Clean" (NAC) strategy for image denoising. Note that we only need the observed noisy image y to generate the noisy image pairs (z, y) with simulated noise n_s. Our idea is illustrated in Figure 2.

Training NAC networks. For real-world images captured by camera sensors, one can hardly distinguish the realistic noise from the signal. Our observation is that the signal intensity is usually much stronger than the noise intensity; that is, the expectation of the observed (realistic) noise n_o is usually much smaller than that of the latent clean image x. If we train an image-specific network on the new noisy image z and regard the original noisy image y as the ground truth, the trained network jointly learns the image-specific prior and the noise statistics, and thus has the capacity to remove the noise n_s from z. Then, when we perform denoising on the original noisy image y, the observed noise n_o can readily be removed. Note that we never use the clean image x as "ground truth" in training our NAC networks.
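A minimal PyTorch sketch of this image-specific training loop follows; `backbone` and `simulate_noise` are placeholders for the ResNet and noise sampler described in §5, data augmentation is omitted for brevity, and the defaults mirror the schedule reported there.

```python
import torch
import torch.nn.functional as F

def train_nac(backbone, y, simulate_noise, epochs=1000, lr=1e-3):
    """Train an image-specific NAC network on a single noisy image `y`.

    `y` (1, C, H, W) is the observed noisy image y = x + n_o, used as the
    "clean" target; each iteration draws fresh simulated noise n_s so that
    the input is z = y + n_s.
    """
    optimizer = torch.optim.Adam(backbone.parameters(), lr=lr)
    for _ in range(epochs):
        z = y + simulate_noise(y)            # simulated noisy input
        loss = F.mse_loss(backbone(z), y)    # l2 loss against the noisy target
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return backbone(y)                   # test once: denoised image f_theta*(y)
```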

Training blind denoising. Most existing supervised denoising networks train a specific model for a fixed noise pattern [8, 9, 29, 33, 40]. To tackle unknown noise, one feasible solution for these networks is to assume the noise is AWGN, estimate its deviation, and remove it with the network trained at the estimated level. But this strategy largely degrades the denoising performance when the noise deviation is not estimated accurately. Besides, it can hardly deal with realistic noise captured on real photographs, which is usually not AWGN. To be effective on realistic noise, our NAC networks should be able to blindly remove unknown noise from real photographs. Inspired by [17, 44], we propose to train a blind version of our NAC networks, using AWGN within a range of levels (e.g., [0, 55]) for removing unknown AWGN, and using mixed AWGN and Poisson noise (both within a range of intensities) for removing realistic noise. More details are introduced in §5.2.
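As a sketch of this blind variant (the function name is ours, and uniform sampling is one of the two sampling options mentioned in §5.2), the AWGN level can simply be re-drawn on every call:

```python
import torch

def simulate_blind_awgn(y, max_sigma=55.0):
    """Draw AWGN whose level is unknown to the network: sigma (on the
    0-255 intensity scale, for images in [0, 1]) is re-sampled per call.
    Uniform sampling is shown here; a Gaussian distribution over the same
    range is the alternative tested in Sec. 5.2."""
    sigma = float(torch.empty(1).uniform_(0.0, max_sigma)) / 255.0
    return torch.randn_like(y) * sigma
```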

Testing is performed by directly taking an observed noisy image y = x + n_o as the input. We only test the image y once.

Noise Level   σ = 5           σ = 10          σ = 15          σ = 20          σ = 25
Metric        PSNR↑ / SSIM↑   PSNR↑ / SSIM↑   PSNR↑ / SSIM↑   PSNR↑ / SSIM↑   PSNR↑ / SSIM↑
BM3D [13]     38.07 / 0.9580  34.40 / 0.9234  32.38 / 0.8957  31.00 / 0.8717  29.97 / 0.8503
DnCNN [44]    38.76 / 0.9633  34.78 / 0.9270  32.86 / 0.9027  31.45 / 0.8799  30.43 / 0.8617
N2N [26]      39.72 / 0.9665  36.18 / 0.9446  33.99 / 0.9149  32.10 / 0.8788  30.72 / 0.8446
DIP [27]      32.49 / 0.9344  31.49 / 0.9299  29.59 / 0.8636  27.67 / 0.8531  25.82 / 0.7723
N2V [23]      27.06 / 0.8174  26.79 / 0.7859  26.12 / 0.7468  25.89 / 0.7405  25.01 / 0.6564
NAC           39.99 / 0.9820  36.55 / 0.9569  34.24 / 0.9277  32.46 / 0.8961  31.08 / 0.8654
Blind-NAC     38.48 / 0.9805  36.65 / 0.9564  34.77 / 0.9275  33.13 / 0.9024  31.78 / 0.8802

Table 2. Average PSNR (dB) and SSIM [41] results of different methods on the Set12 dataset corrupted by AWGN noise. The best and second best results are highlighted in red and blue, respectively.

Noise Level   σ = 5           σ = 10          σ = 15          σ = 20          σ = 25
Metric        PSNR↑ / SSIM↑   PSNR↑ / SSIM↑   PSNR↑ / SSIM↑   PSNR↑ / SSIM↑   PSNR↑ / SSIM↑
BM3D [13]     37.59 / 0.9640  33.32 / 0.9163  31.07 / 0.8720  29.62 / 0.8342  28.57 / 0.8017
DnCNN [44]    38.07 / 0.9695  33.88 / 0.9270  31.73 / 0.8706  30.27 / 0.8563  29.23 / 0.8278
N2N [26]      38.58 / 0.9627  34.07 / 0.9200  31.81 / 0.8770  30.14 / 0.8550  28.67 / 0.8123
DIP [27]      29.74 / 0.8435  28.16 / 0.8310  27.07 / 0.7867  25.80 / 0.7205  24.63 / 0.6680
N2V [23]      26.70 / 0.7915  26.39 / 0.7621  25.77 / 0.7126  25.41 / 0.6678  24.83 / 0.6305
NAC           39.00 / 0.9707  34.60 / 0.9324  32.13 / 0.8942  30.47 / 0.8636  28.96 / 0.8185
Blind-NAC     38.26 / 0.9605  34.26 / 0.9266  32.06 / 0.8919  30.50 / 0.8609  29.33 / 0.8327

Table 3. Average PSNR (dB) and SSIM [41] results of different methods on the BSD68 dataset corrupted by AWGN noise. The best and second best results are highlighted in red and blue, respectively.

The denoised image can be represented as ŷ = f_θ*(y), with which the objective metrics, e.g., PSNR and SSIM [41], can be computed against the clean image x.
Implementation details. We employ the ResNet-20 network used in [27] as the backbone, which includes 10 residual blocks. Each block contains two convolutional layers, each followed by Batch Normalization (BN) [20]; the Rectified Linear Unit (ReLU) activation [32] is used after the first BN. The parameters are randomly initialized without pretraining. The optimizer is Adam [21] with default parameters, and the learning rate is fixed at 0.001 in all experiments. We use the ℓ2 loss function. The network is trained for 1000 epochs for each test image. For data augmentation, we employ 4 rotations (0°, 90°, 180°, 270°) combined with 2 mirror (vertical and horizontal) reflections, resulting in 8 transformations in total. We implement our ResNet-based NAC networks in PyTorch [2].
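The eight-fold augmentation is the set of rotation and mirror transformations; a minimal sketch for a (C, H, W) tensor could look as follows (the function name is ours):

```python
import torch

def augment_8(img):
    """Return the 8 views of `img` (C, H, W): 4 rotations by 0/90/180/270
    degrees, each with and without a mirror reflection."""
    views = []
    for k in range(4):
        rot = torch.rot90(img, k, dims=(1, 2))      # rotate in the H-W plane
        views.append(rot)
        views.append(torch.flip(rot, dims=(2,)))    # mirrored counterpart
    return views
```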

5. Experiments

In this section, we evaluate the performance of our "Noisy-As-Clean" (NAC) networks on image denoising. In all experiments, we train a denoising network using only the noisy test image y as the target and the simulated noisy image z (with data augmentation) as the input. For all comparison methods, the source code or trained models are downloaded from the corresponding authors' websites, and we use the default parameter settings unless otherwise specified. The PSNR, SSIM [41], and visual quality of the different methods are used for evaluation. We first test with synthetic noise such as additive white Gaussian noise (AWGN) in §5.1, continue with blind image denoising in §5.2, and finally tackle realistic noise in §5.3. In §5.4, we conduct comprehensive ablation studies to gain deeper insight into the proposed NAC strategy.

5.1. Synthetic Noise Removal With Known Noise

We evaluate the proposed NAC networks on images corrupted by synthetic noise such as AWGN. More experimental results on signal-dependent Poisson noise and mixed Poisson-AWGN noise are provided in the Supplementary File.
Training NAC networks. Here, we train an image-specific denoising network using the observed noisy test image y as the target and the simulated noisy image z as the input. Each observed noisy image y = x + n_o is generated by adding the observed noise n_o to the clean image x; the simulated noisy image z = y + n_s is generated by adding simulated noise n_s to the observed noisy image y.
Comparison methods. We compare our NAC networks with state-of-the-art image denoising methods [13, 26, 27, 30, 44]. On AWGN noise, we compare with BM3D [13], DnCNN [44], Noise2Noise (N2N) [26], Deep Image Prior (DIP) [27], and Noise2Void (N2V) [23].
Test datasets. We evaluate the comparison methods on the Set12 and BSD68 datasets, which are widely used by supervised denoising networks [29, 31, 40, 44] and previous methods [13, 16, 43, 46]. The Set12 dataset contains 12 images of size 512×512 or 256×256, while the BSD68 dataset contains 68 images of different sizes.

Method        Type            CC [33] PSNR↑ / SSIM↑   DND [34] PSNR↑ / SSIM↑
CBM3D [12]    Traditional     35.19 / 0.9063           34.51 / 0.8507
NI [5]        Traditional     35.33 / 0.9212           35.11 / 0.8778
DnCNN+ [44]   Supervised      35.40 / 0.9115           37.90 / 0.9430
CBDNet [17]   Supervised      36.44 / 0.9460           38.06 / 0.9421
GCBD [10]     Unsupervised    N/A                      35.58 / 0.9217
N2N [26]      Unsupervised    35.32 / 0.9160           33.10 / 0.8110
DIP [27]      Unsupervised    35.69 / 0.9259           N/A
NAC           Unsupervised    36.59 / 0.9502           36.20 / 0.9252

Table 4. Average PSNR (dB) and SSIM [41] of different methods on the CC dataset [33] and the DND dataset [34]. The best results are highlighted in bold. "N/A" means "Not Available", due to unavailable code (GCBD on CC [33]) or difficult experiments (DIP on DND [34]).

Results on AWGN noise. We test AWGN with noise deviation (noise level) σ ∈ {5, 10, 15, 20, 25}, i.e., the observed noise n_o is AWGN with standard deviation (std) σ. Since AWGN is signal-independent, the simulated noise n_s is set with the same σ as that of n_o. The comparison results are listed in Tables 2 and 3. It can be seen that the networks trained with the proposed NAC strategy achieve much better PSNR and SSIM [41] performance than BM3D [13] and DnCNN [44], two previous leading image denoising methods. Note that DnCNN is a supervised network trained on pairs of clean and synthetic noisy images. Our NAC networks also outperform the other unsupervised networks, N2N [26], DIP [27], and N2V [23], by a significant margin on PSNR and SSIM [41].
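Under this known-σ setting, constructing a training pair reduces to two independent Gaussian draws at the same level; a minimal sketch with σ = 15 (chosen for illustration):

```python
import torch

sigma = 15.0 / 255.0                  # noise level 15 on the 0-255 scale
x = torch.rand(1, 1, 256, 256)        # stand-in clean image in [0, 1]
n_o = torch.randn_like(x) * sigma     # observed AWGN
y = x + n_o                           # observed noisy image (the target)
n_s = torch.randn_like(y) * sigma     # simulated AWGN at the same sigma
z = y + n_s                           # simulated noisy input
```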

5.2. Synthetic Noise Removal With Unknown Noise

To deal with unknown noise, we train a blind version of our NAC networks. Here, we test our NAC networks on AWGN with unknown noise deviation, using the same training strategy, comparison methods, and test datasets as in §5.1.

Training blind NAC networks. For each test image corrupted by AWGN with an unknown noise level (deviation), we train a NAC network whose simulated noise levels are randomly sampled (from a Gaussian distribution) within [0, 55]; we also tested uniformly sampled levels and obtained similar results. The training of the NAC network on the test image is repeated with different deviations. Our NAC networks trained on AWGN with unknown noise levels are termed "Blind-NAC".

Results on blind denoising. For each test image, we add AWGN whose deviation is again in {5, 10, 15, 20, 25}. The blindly trained NAC network is used directly to denoise the test image, without estimating the deviation. The results are also listed in Tables 2 and 3. We observe that our Blind-NAC networks, trained on AWGN with unknown levels, can achieve even better PSNR and SSIM [41] results than our NAC networks trained at specific noise levels. Note that on BSD68 our Blind-NAC networks outperform DnCNN [44], which itself achieves higher PSNR and SSIM results than our (non-blind) NAC networks there. This demonstrates the effectiveness of our NAC networks on blind image denoising. With this success on blind denoising, we next turn to real-world image denoising, in which the noise is unknown and complex.

5.3. Practice on Real Photographs

With the promising performance on blind image denoising, here we tackle realistic noise for practical applications. The observed realistic noise n_o can be roughly modeled as mixed Poisson and AWGN noise [15, 17]. Hence, for each observed noisy image y, we generate the simulated noise n_s by sampling the y-dependent Poisson part and the independent AWGN part.

Training blind NAC networks is again performed for each test image, i.e., each observed noisy image y. In real-world scenarios, an observed noisy image y is corrupted without knowledge of the specific statistics of the observed noise n_o. Therefore, the simulated noise n_s is estimated directly on y as mixed y-dependent Poisson and AWGN noise. For each transformed image in data augmentation, the Poisson noise is randomly sampled with parameter λ in 0 < λ ≤ 25, and the AWGN noise is randomly sampled with noise level σ in 0 < σ ≤ 25 (see the sketch at the end of this subsection).

Comparison methods. We compare with state-of-the-art methods on real-world image denoising, including CBM3D [12], the commercial software Neat Image [5], two supervised networks (DnCNN+ [44] and CBDNet [17]), and three unsupervised networks (GCBD [10], Noise2Noise [26], and DIP [27]). Note that DnCNN+ [44] and CBDNet [17] are two state-of-the-art supervised networks for real-world image denoising, and DnCNN+ is an improved extension of DnCNN [44] with much better performance (the authors provided us with the models and results of DnCNN+).

Test datasets. We evaluate the comparison methods on the Cross-Channel (CC) dataset [33] and the DND dataset [34]. The CC dataset [33] includes noisy images of 11 static scenes captured by Canon 5D Mark 3, Nikon D600, and Nikon D800 cameras. The real-world noisy images were collected under a highly controlled indoor environment; each scene was shot 500 times using the same camera and settings, and the average of the 500 shots is taken as the "ground truth". We use the default 15 images of size 512×512 cropped by the authors to evaluate the different denoising methods. The DND dataset [34] contains 50 scenes captured by Sony A7R, Olympus E-M10, Sony RX100 IV, and Huawei Nexus 6P cameras. Each scene is cropped to 20 bounding boxes of 512×512 pixels, generating 1000 test images in total. The noisy images were collected under higher ISO values with shorter exposure times, while the "ground truth" images were captured under lower ISO values with adjusted longer exposure times. The "ground truth" images are not released, but PSNR and SSIM results can be obtained by submitting the denoised images to the DND website.

Results on PSNR and SSIM. The comparisons of average PSNR and SSIM are listed in Table 4. As can be seen, the proposed NAC networks achieve better performance on CC than all previous denoising methods, including CBM3D [12], the supervised networks DnCNN+ [44] and CBDNet [17], and the unsupervised networks GCBD [10], N2N [26], and DIP [27]. This demonstrates that the proposed NAC networks can indeed handle complex, unknown, realistic noise and achieve better performance than supervised networks such as DnCNN+ [44] and CBDNet [17].

Qualitative results. In Figure 3, we show the denoised images of our NAC network and the comparison methods on the image "0017_3" from the DND dataset. We observe that our unsupervised NAC networks are very effective at removing realistic noise from the real photograph. Besides, our NAC networks achieve competitive PSNR and SSIM results when compared with the other methods, including the supervised DnCNN+ [44] and CBDNet [17].

Figure 3. Denoised images and PSNR (dB)/SSIM of the comparison methods on "0017_3" from DND [34]: (a) Noisy, 31.46/0.9370; (b) CBM3D [12], 36.26/0.9811; (c) NI [5], 37.52/0.9868; (d) DnCNN+ [44], 38.25/0.9888; (e) CBDNet [17], 39.34/0.9905; (f) GCBD [10], 37.52/0.9765; (g) N2N [26], 34.95/0.9621; (h) NAC, 38.34/0.9887. The "ground-truth" image is not released, but the PSNR/SSIM results are publicly provided by the DND benchmark.

Speed. The work most similar to ours is Deep Image Prior (DIP) [27], which also trains an image-specific network for each test image. On average, DIP needs 60.39 seconds to process a 512×512 color image, on which our NAC network needs 58.32 seconds (on an NVIDIA Titan X GPU).
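As a concrete illustration of the simulated noise n_s used in this subsection, the sketch below re-draws λ ∈ (0, 25] and σ ∈ (0, 25] per call; the Poisson parameterization is one common choice, not necessarily the paper's exact code.

```python
import torch

def simulate_real_noise(y, max_lam=25.0, max_sigma=25.0):
    """Sample mixed y-dependent Poisson + AWGN noise for a real photograph,
    with the Poisson parameter lambda and the AWGN level sigma re-drawn
    for every (augmented) image, as described in Sec. 5.3."""
    lam = float(torch.empty(1).uniform_(1e-3, max_lam))
    sigma = float(torch.empty(1).uniform_(0.0, max_sigma)) / 255.0
    y_pos = y.clamp(min=0.0)                                  # Poisson rates must be >= 0
    poisson_part = torch.poisson(lam * y_pos) / lam - y_pos   # y-dependent residual
    gaussian_part = torch.randn_like(y) * sigma               # independent AWGN
    return poisson_part + gaussian_part
```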

5.4. Ablation Study

To further study our NAC strategy, we conduct a detailed examination of our NAC networks on image denoising.
1) Generality of our NAC strategy. To evaluate the generality of the proposed NAC strategy, we apply it to the DnCNN [44] network and denote the resulting network "DnCNN-NAC". The comparison results with DnCNN are listed in Table 5. One can see that DnCNN-NAC achieves better PSNR results than the original DnCNN when σ = 5, 10, 15 (but worse when σ = 20, 25). Note that the original DnCNN network is trained offline on the BSD400 dataset, while the DnCNN-NAC network is trained online for each specific test image.

σ            5       10      15      20      25
DnCNN [44]   38.76   34.78   32.86   31.45   30.43
DnCNN-NAC    43.18   37.16   33.65   31.16   29.23

Table 5. PSNR (dB) results of DnCNN and DnCNN-NAC on Set12 corrupted by AWGN noise with different σ.
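In terms of the train_nac sketch in §4, this ablation only swaps the backbone; `DnCNN` below is a placeholder for any implementation of [44]:

```python
# Hypothetical usage: train an image-specific DnCNN with the NAC strategy,
# reusing the train_nac loop and the blind AWGN sampler sketched earlier.
denoised = train_nac(DnCNN(), y, simulate_blind_awgn, epochs=1000)
```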

2) Differences from DIP [27]. Though the basic network in our work is the ResNet used in DIP [27], our NAC network is essentially different from DIP in at least two aspects. First, our NAC is a strategy for unsupervised learning of adaptive network parameters for the degraded image, while DIP investigates an adaptive network structure without learning the parameters. Second, our NAC learns a mapping from the synthetic noisy image z = y + n_s to the noisy image y, which approximates the mapping from the noisy image y = x + n_o to the clean image x; DIP instead maps a random noise map to the noisy image y, and the denoised image is obtained during this process. For these two reasons, DIP needs early stopping tuned for different images, while our NAC achieves more robust (and better) denoising performance than DIP on diverse images. In Figure 4, we plot the training loss and test PSNR curves of the DIP (a) and NAC (b) networks over 10,000 epochs on the images "Cameraman" and "House". We observe that DIP needs early stopping to select the best results, while our NAC stably achieves better denoising results within 1000 epochs.

Figure 4. Training loss and PSNR (dB) curves of DIP [27] (a) and our NAC (b) networks w.r.t. the number of epochs, on the images "Cameraman" and "House" from Set12.

3) Influence of the number of residual blocks and epochs. Our backbone network is the ResNet [27] with 10 residual blocks, trained for 1000 epochs. Here we study how the numbers of residual blocks and epochs influence the denoising performance of NAC. The experiments are performed on the Set12 dataset corrupted by AWGN noise (σ = 15). From Table 6, we observe that with more residual blocks, the NAC networks achieve better PSNR and SSIM [41] results, and 10 residual blocks are enough for satisfactory results; with more (e.g., 15) blocks, there is little further improvement in PSNR and SSIM. Hence, we use 10 residual blocks, the same as [27]. From Table 7, one can see that with more training epochs, our NAC networks achieve better PSNR and SSIM results, at the cost of longer processing time.

# of Blocks   1        2        5        10       15
PSNR↑         33.58    33.85    34.14    34.24    34.28
SSIM↑         0.9161   0.9226   0.9272   0.9277   0.9272

Table 6. Average PSNR (dB)/SSIM of NAC with different numbers of blocks on Set12 corrupted by AWGN noise (σ = 15).

4) Comparison with Oracle. We also study the "Oracle" performance of our NAC networks. In the "Oracle" setting, we train our NAC networks on the pair of the observed noisy image y and its clean image x, corrupted by AWGN or signal-dependent Poisson noise. The experiments are performed on the Set12 dataset, with noise deviations in {5, 10, 15, 20, 25}. Figure 5 (a) compares our NAC networks and their "Oracle" counterparts on PSNR and SSIM. It can be seen that the "Oracle" networks, trained on noisy-clean image pairs, perform only slightly better than the original NAC networks trained with the simulated-observed noisy image pairs (z, y). That is, with our NAC strategy, networks trained only with the noisy test image achieve similar denoising performance on weak noise.

5) Performance on strong noise. Our NAC strategy is based on the assumption of "weak noise", so it is natural to wonder how well NAC performs against strong noise. To answer this question, we compare the NAC networks with BM3D [13] and DnCNN [44] on Set12 corrupted by AWGN noise with σ = 50. The PSNR and SSIM results are plotted in Figure 5 (b). One can see that our NAC networks are limited in handling strong AWGN noise when compared with BM3D [13] and DnCNN [44].

Figure 5. Comparisons of PSNR (dB) and SSIM results on Set12 (a) by our NAC networks and their "Oracle" version for AWGN with σ ∈ {5, 10, 15, 20, 25}, and (b) by BM3D [13], DnCNN [44], and our NAC networks for strong AWGN (σ = 50).

# of Epochs   100      200      500      1000     5000
PSNR↑         31.80    32.79    33.77    34.24    34.32
SSIM↑         0.8714   0.9023   0.9189   0.9277   0.9280
Time↓ (s)     6.74     13.25    30.20    58.32    281.56

Table 7. Average PSNR (dB), SSIM, and time (s) of NAC with different numbers of epochs on Set12 corrupted by AWGN noise (σ = 15).

6. Conclusion

In this work, we proposed a "Noisy-As-Clean" (NAC) strategy for learning unsupervised image denoising networks. In NAC, we train an image-specific network by taking the noisy test image as the target and adding simulated noise to it to generate the simulated noisy input, where the simulated noise is close to the observed noise in the noisy test image. This strategy can be seamlessly embedded into existing supervised denoising networks. We provided a simple and useful observation: it is possible to learn an unsupervised network only with the noisy image, approximating the optimal parameters of a supervised network learned with pairs of noisy and clean images. Extensive experiments on benchmark datasets demonstrate that networks trained with our NAC strategy achieve better or comparable performance in PSNR, SSIM, and visual quality when compared to previous state-of-the-art image denoising methods, including several supervised denoising networks. These results validate that our NAC strategy can learn image-specific priors and noise statistics only from the corrupted test image.

References
[1] Darmstadt Noise Dataset Benchmark. https://noise.visinf.tu-darmstadt.de/benchmark/#results_srgb. Accessed: 2019-05-23.
[2] PyTorch. https://pytorch.org. Accessed: 2019-05-23.
[3] Smartphone Image Denoising Dataset Benchmark. https://www.eecs.yorku.ca/~kamel/sidd/benchmark.php. Accessed: 2019-05-23.

[4] Abdelrahman Abdelhamed, Stephen Lin, and Michael S. Brown. A high-quality denoising dataset for smartphone cameras. In CVPR, June 2018.
[5] Neatlab ABSoft. Neat Image. https://ni.neatvideo.com/home.
[6] Joshua Batson and Loic Royer. Noise2Self: Blind denoising by self-supervision. In ICML, volume 97, pages 524-533. PMLR, 2019.
[7] Patrick Billingsley. Probability and Measure. Wiley Series in Probability and Statistics. Wiley, 1995.
[8] Tim Brooks, Ben Mildenhall, Tianfan Xue, Jiawen Chen, Dillon Sharlet, and Jonathan T. Barron. Unprocessing images for learned raw denoising. In CVPR, pages 9446-9454, 2019.
[9] Harold Christopher Burger, Christian J. Schuler, and Stefan Harmeling. Image denoising: Can plain neural networks compete with BM3D? In CVPR, pages 2392-2399, 2012.
[10] Jingwen Chen, Jiawei Chen, Hongyang Chao, and Ming Yang. Image blind denoising with generative adversarial network based noise modeling. In CVPR, pages 3155-3164, 2018.
[11] Yunjin Chen and Thomas Pock. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6):1256-1272, 2017.
[12] Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, and Karen Egiazarian. Color image denoising via sparse 3D collaborative filtering with grouping constraint in luminance-chrominance space. In ICIP, pages 313-316. IEEE, 2007.
[13] Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, and Karen Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing, 16(8):2080-2095, 2007.
[14] Michael Elad and Michal Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 15(12):3736-3745, 2006.
[15] Alessandro Foi, Mejdi Trimeche, Vladimir Katkovnik, and Karen Egiazarian. Practical Poissonian-Gaussian noise modeling and fitting for single-image raw-data. IEEE Transactions on Image Processing, 17(10):1737-1754, Oct 2008.
[16] Shuhang Gu, Qi Xie, Deyu Meng, Wangmeng Zuo, Xiangchu Feng, and Lei Zhang. Weighted nuclear norm minimization and its applications to low level vision. International Journal of Computer Vision, 121(2):183-208, 2017.
[17] Shi Guo, Zifei Yan, Kai Zhang, Wangmeng Zuo, and Lei Zhang. Toward convolutional blind denoising of real photographs. In CVPR, 2019.
[18] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, pages 770-778, 2016.
[19] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. Densely connected convolutional networks. In CVPR, pages 4700-4708, 2017.
[20] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
[21] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[22] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, pages 1097-1105, 2012.
[23] Alexander Krull, Tim-Oliver Buchholz, and Florian Jug. Noise2Void - learning denoising from single noisy images. In CVPR, 2019.
[24] Samuli Laine, Tero Karras, Jaakko Lehtinen, and Timo Aila. High-quality self-supervised deep image denoising. In NeurIPS, 2019.
[25] Stamatios Lefkimmiatis. Non-local color image denoising with convolutional neural networks. In CVPR, pages 3587-3596, 2017.
[26] Jaakko Lehtinen, Jacob Munkberg, Jon Hasselgren, Samuli Laine, Tero Karras, Miika Aittala, and Timo Aila. Noise2Noise: Learning image restoration without clean data. In ICML, pages 2971-2980, 2018.
[27] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep image prior. In CVPR, pages 9446-9454, 2018.
[28] Ce Liu, William T. Freeman, Richard Szeliski, and Sing Bing Kang. Noise estimation from a single image. CVPR, 1:901-908, 2006.
[29] Ding Liu, Bihan Wen, Yuchen Fan, Chen Change Loy, and Thomas S. Huang. Non-local recurrent network for image restoration. In NeurIPS, pages 1673-1682, 2018.
[30] Markku Makitalo and Alessandro Foi. Optimal inversion of the Anscombe transformation in low-count Poisson image denoising. IEEE Transactions on Image Processing, 20(1):99-109, 2011.
[31] Xiao-Jiao Mao, Chunhua Shen, and Yu-Bin Yang. Image restoration using convolutional auto-encoders with symmetric skip connections. In NIPS, 2016.
[32] Vinod Nair and Geoffrey E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, pages 807-814, 2010.
[33] Seonghyeon Nam, Youngbae Hwang, Yasuyuki Matsushita, and Seon Joo Kim. A holistic approach to cross-channel image noise modeling and its application to image denoising. In CVPR, pages 1683-1691, 2016.
[34] Tobias Plotz and Stefan Roth. Benchmarking denoising algorithms with real photographs. In CVPR, 2017.
[35] Tobias Plotz and Stefan Roth. Neural nearest neighbors networks. In NeurIPS, 2018.
[36] Stefan Roth and Michael J. Black. Fields of experts. International Journal of Computer Vision, 82(2):205-229, 2009.
[37] Uwe Schmidt and Stefan Roth. Shrinkage fields for effective image restoration. In CVPR, pages 2774-2781, June 2014.
[38] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[39] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In CVPR, pages 1-9, 2015.
[40] Ying Tai, Jian Yang, Xiaoming Liu, and Chunyan Xu. MemNet: A persistent memory network for image restoration. In ICCV, 2017.
[41] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600-612, 2004.
[42] Junyuan Xie, Linli Xu, and Enhong Chen. Image denoising and inpainting with deep neural networks. In NIPS, pages 341-349, 2012.
[43] Jun Xu, Wangmeng Zuo, Lei Zhang, David Zhang, and Xiangchu Feng. Patch group based nonlocal self-similarity prior learning for image denoising. In ICCV, pages 244-252, 2015.
[44] Kai Zhang, Wangmeng Zuo, Yunjin Chen, Deyu Meng, and Lei Zhang. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 2017.
[45] Maria Zontak and Michal Irani. Internal statistics of a single natural image. In CVPR, 2011.
[46] Daniel Zoran and Yair Weiss. From learning models of natural image patches to whole image restoration. In ICCV, pages 479-486, 2011.


Dataset    Metric   CBM3D [12]   NI [5]   DnCNN+ [44]   CBDNet [17]   GCBD [10]   N2N [26]   DIP [27]   NAC
CC [33]    PSNR↑    35.19        35.33    35.40         36.44         N/A         35.32      35.69      36.59
           SSIM↑    0.9063       0.9212   0.9115        0.9460        N/A         0.9160     0.9259     0.9502
DND [34]   PSNR↑    34.51        35.11    37.90         38.06         35.58       33.10      N/A        36.20
           SSIM↑    0.8507       0.8778   0.9430        0.9421        0.9217      0.8110     N/A        0.9252

Table 4. Average PSNR (dB) and SSIM [41] of different methods on the CC dataset [33] and the DND dataset [34]. CBM3D [12] and NI [5] are traditional methods; DnCNN+ [44] and CBDNet [17] are supervised networks; GCBD [10], N2N [26], DIP [27], and NAC are unsupervised networks. The best results are highlighted in bold. "N/A" means "Not Available", due to unavailable code (GCBD on CC [33]) or difficult experiments (DIP on DND [34]).

Since AWGN noise is signal-independent, the simulated noise n_s is set with the same σ as that of n_o. The comparison results are listed in Tables 2 and 3. It can be seen that the networks trained with the proposed NAC strategy achieve much better PSNR and SSIM [41] performance than BM3D [13] and DnCNN [44], two previous leading image denoising methods. Note that DnCNN is a supervised network trained on pairs of clean and synthetic noisy images. Our NAC networks also outperform the other unsupervised networks, N2N [26], DIP [27], and N2V [23], by a significant margin on PSNR and SSIM [41].
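To make this recipe concrete, the following is a minimal PyTorch-style sketch of one NAC training step for AWGN. The names denoiser and optimizer, the [0, 1] intensity range, and the L2 loss are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn.functional as F

def nac_train_step(denoiser, optimizer, y, sigma):
    # NAC for AWGN: the observed noisy image y is the target, and the
    # input z is y plus freshly simulated noise n_s with the same std
    # sigma as the observed noise n_o.
    n_s = torch.randn_like(y) * (sigma / 255.0)  # simulated AWGN (images in [0, 1])
    z = y + n_s                                  # "doubly noisy" input
    optimizer.zero_grad()
    loss = F.mse_loss(denoiser(z), y)            # assumed L2 loss
    loss.backward()
    optimizer.step()
    return loss.item()

# After training, the denoised estimate of the test image is simply denoiser(y).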

5.2. Synthetic Noise Removal with Unknown Noise

To deal with unknown noise, we propose to train a blind version of our NAC networks. Here, we test our NAC networks on AWGN noise with unknown standard deviation. We use the same training strategy, comparison methods, and test datasets as in §5.1.

Training blind NAC networks. For each test image corrupted by AWGN, we train a NAC network with unknown noise levels (standard deviations). The noise levels are randomly sampled (from a Gaussian distribution) within [0, 55]. We also tested noise levels sampled from a uniform distribution and obtained similar results. We repeat the training of the NAC network on the test image with different deviations. The NAC networks trained on AWGN with unknown noise levels are termed "Blind-NAC".
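A sketch of this sampling step follows; since the paper does not give the exact distribution parameters, the half-normal scale below is one plausible reading of "sampled in Gaussian distribution within [0, 55]", with the uniform alternative the authors also tested included.

import torch

def sample_blind_sigma(max_sigma=55.0, gaussian=True):
    # Draw a random AWGN level in [0, max_sigma] for Blind-NAC training.
    if gaussian:
        sigma = torch.randn(1).abs() * (max_sigma / 3.0)  # assumed scale
    else:
        sigma = torch.rand(1) * max_sigma                 # uniform alternative
    return float(sigma.clamp(0.0, max_sigma))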

Results on blind denoising. For the same test image, we add to it AWGN noise whose deviation is in {5, 10, 15, 20, 25}. The blindly trained NAC network is directly utilized to denoise the test image, without estimating its noise deviation. The results are also listed in Tables 2 and 3. We observe that our Blind-NAC networks, trained on AWGN noise with unknown levels, can achieve even better PSNR and SSIM [41] results than our NAC networks trained on specific noise levels. Note that on BSD68, our Blind-NAC networks achieve better performance than DnCNN [44], which achieves higher PSNR and SSIM results than our NAC networks. This demonstrates the effectiveness of our NAC networks on blind image denoising. With this success on blind denoising, we next turn to real-world image denoising, in which the noise is unknown and complex.

5.3. Practice on Real Photographs

With the promising performance on blind image denoising, here we tackle realistic noise for practical applications. The observed realistic noise n_o can be roughly modeled as mixed Poisson noise and AWGN noise [15, 17]. Hence, for each observed noisy image y, we generate the simulated noise n_s by sampling the y-dependent Poisson part and the independent AWGN part.

Training blind NAC networks is also performed for each test image, i.e., the observed noisy image y. In real-world scenarios, each observed noisy image y is corrupted without knowing the specific statistics of the observed noise n_o. Therefore, the simulated noise n_s is directly estimated on y as mixed y-dependent Poisson and AWGN noise. For each transformed image in data augmentation, the Poisson noise is randomly sampled with the parameter λ in 0 < λ ≤ 25, and the AWGN noise is randomly sampled with the noise level σ in 0 < σ ≤ 25 (a simulation sketch is given below, after the dataset descriptions).

Comparison methods. We compare with state-of-the-art methods on real-world image denoising, including CBM3D [12], the commercial software Neat Image [5], two supervised networks, DnCNN+ [44] and CBDNet [17], and three unsupervised networks, GCBD [10], Noise2Noise [26], and DIP [27]. Note that DnCNN+ [44] and CBDNet [17] are two state-of-the-art supervised networks for real-world image denoising, and DnCNN+ is an improved extension of DnCNN [44] with much better performance (the authors of DnCNN+ provided us with the models and results of DnCNN+).

Test datasets. We evaluate the comparison methods on the Cross-Channel (CC) dataset [33] and the DND dataset [34]. The CC dataset [33] includes noisy images of 11 static scenes captured by Canon 5D Mark 3, Nikon D600, and Nikon D800 cameras. The real-world noisy images are collected under a highly controlled indoor environment. Each scene is shot 500 times using the same camera and settings, and the average of the 500 shots is taken as the "ground truth". We use the default 15 images of size 512×512 cropped by the authors to evaluate the different image denoising methods. The DND dataset [34] contains 50 scenes captured by Sony A7R, Olympus E-M10, Sony RX100 IV, and Huawei Nexus 6P cameras. Each scene is cropped into 20 bounding boxes of 512×512 pixels, generating 1,000 test images in total.
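As referenced above, here is a sketch of generating the simulated noise n_s on the observed noisy image y under the mixed Poisson-Gaussian model [15]. Treating λ as a photon-count "peak" parameter is an assumption about the exact scaling, which the paper does not specify.

import torch

def simulate_real_noise(y, lam, sigma):
    # y: observed noisy image in [0, 1]; lam in (0, 25]; sigma in (0, 25].
    # Signal-dependent Poisson part, rescaled to be zero-mean around y.
    poisson_part = torch.poisson(y * lam) / lam - y
    # Signal-independent Gaussian part.
    gaussian_part = torch.randn_like(y) * (sigma / 255.0)
    return poisson_part + gaussian_part  # n_s; the NAC input is z = y + n_s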

Figure 3. Denoised images and PSNR (dB)/SSIM by the comparison methods on image "0017_3" in DND [34]: (a) Noisy, 31.46/0.9370; (b) CBM3D [12], 36.26/0.9811; (c) NI [5], 37.52/0.9868; (d) DnCNN+ [44], 38.25/0.9888; (e) CBDNet [17], 39.34/0.9905; (f) GCBD [10], 37.52/0.9765; (g) N2N [26], 34.95/0.9621; (h) NAC, 38.34/0.9887. The "ground-truth" image is not released, but PSNR/SSIM results are publicly provided on the DND benchmark.

The noisy images are collected under higher ISO values with shorter exposure times, while the "ground truth" images are captured under lower ISO values with adjusted longer exposure times. The "ground truth" images are not released, but we can obtain the PSNR and SSIM performance by submitting the denoised images to the DND website.
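Where the "ground truth" is available (as on CC), PSNR and SSIM [41] can be computed locally; a small sketch using scikit-image (a recent version providing channel_axis) follows.

from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(denoised, clean):
    # Both images: float arrays in [0, 1] of shape (H, W, 3).
    psnr = peak_signal_noise_ratio(clean, denoised, data_range=1.0)
    ssim = structural_similarity(clean, denoised, data_range=1.0, channel_axis=-1)
    return psnr, ssim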

Results on PSNR and SSIM. The comparisons of average PSNR and SSIM results are listed in Table 4. As can be seen, the proposed NAC networks achieve the best performance on the CC dataset [33], outperforming CBM3D [12], the supervised networks DnCNN+ [44] and CBDNet [17], and the unsupervised networks GCBD [10], N2N [26], and DIP [27], while remaining competitive on DND [34]. This demonstrates that the proposed NAC networks can indeed handle complex, unknown, realistic noise, and can rival supervised networks such as DnCNN+ [44] and CBDNet [17].

Qualitative results. In Figure 3, we show the denoised images of our NAC network and the comparison methods on the image "0017_3" from the DND dataset. We observe that our unsupervised NAC networks are very effective at removing realistic noise from real photographs. Besides, our NAC networks achieve competitive PSNR and SSIM results when compared with the other methods, including the supervised ones such as DnCNN+ [44] and CBDNet [17].

Speed. The work most similar to ours is Deep Image Prior (DIP) [27], which also trains an image-specific network for each test image. On average, DIP needs 60.39 seconds to process a 512×512 color image, on which our NAC network needs 58.32 seconds (on an NVIDIA Titan X GPU).

5.4. Ablation Study

To further study our NAC strategy, we conduct a detailed examination of our NAC networks on image denoising.

1) Generality of our NAC strategy. To evaluate the generality of the proposed NAC strategy, we apply it to the DnCNN [44] network and denote the resulting network as "DnCNN-NAC". We train DnCNN with our NAC strategy (DnCNN-NAC); the comparison results with DnCNN are listed in Table 5. One can see that DnCNN-NAC achieves better PSNR results than the original DnCNN when σ = 5, 10, 15 (but worse when σ = 20, 25). Note that the original DnCNN network is trained offline on the BSD400 dataset, while the DnCNN-NAC network is trained online for each specific test image. A sketch of this online training procedure is given after Table 5.

σ            5       10      15      20      25
DnCNN [44]   38.76   34.78   32.86   31.45   30.43
DnCNN-NAC    43.18   37.16   33.65   31.16   29.23

Table 5. PSNR (dB) results of DnCNN and DnCNN-NAC on Set12 corrupted by AWGN noise with different σ.
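As noted above, here is a sketch of how the NAC strategy wraps an arbitrary denoising backbone (e.g., a DnCNN-style network) for online training on a single noisy test image; the epoch count, learning rate, and L2 loss are illustrative assumptions rather than the paper's exact settings.

import torch
import torch.nn.functional as F

def train_nac(backbone, y, sigma, epochs=1000, lr=1e-4):
    # Online NAC training: any denoiser can be plugged in as `backbone`.
    optimizer = torch.optim.Adam(backbone.parameters(), lr=lr)
    for _ in range(epochs):
        z = y + torch.randn_like(y) * (sigma / 255.0)  # fresh simulated input
        loss = F.mse_loss(backbone(z), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return backbone(y)  # denoised estimate of the test image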

2) Differences from DIP [27]. Though the basic network in our work is the ResNet used in DIP [27], our NAC network is essentially different from DIP in at least two aspects. First, our NAC is a novel strategy for unsupervised learning of adaptive network parameters for the degraded image, while

Figure 4. Training loss and PSNR (dB) curves of the DIP [27] (a) and our NAC (b) networks w.r.t. the number of epochs (up to 10,000), on the images "Cameraman" and "House" from Set12.

DIP aims to investigate adaptive network structure without learning the parameters. Second, our NAC learns a mapping from the synthetic noisy image z = y + n_s to the noisy image y, which approximates the mapping from the noisy image y = x + n_o to the clean image x. By contrast, DIP maps a random noise map to the noisy image y, and the denoised image is obtained during this process. Due to these two reasons, DIP needs early stopping, tuned per image, while our NAC achieves more robust (and better) denoising performance than DIP on diverse images. In Figure 4, we plot the curves of training loss and test PSNR of the DIP (a) and NAC (b) networks over 10,000 epochs on the two images "Cameraman" and "House". We observe that DIP needs early stopping to select the best results, while our NAC stably achieves better denoising results within 1,000 epochs.

3) Influence of the number of residual blocks and epochs. Our backbone network is the ResNet used in DIP [27], with 10 residual blocks trained for 1,000 epochs. Here we study how the numbers of residual blocks and epochs influence the denoising performance of NAC. The experiments are performed on the Set12 dataset corrupted by AWGN noise (σ = 15). From Table 6, we observe that with more residual blocks, the NAC networks achieve better PSNR and SSIM [41] results, and 10 residual blocks are enough to achieve satisfactory results. With more (e.g., 15) blocks, there is little improvement in PSNR and SSIM. Hence, we use 10 residual blocks, the same as [27]. We then study how the number of epochs influences the performance of NAC. From Table 7, one can see that on the Set12 dataset corrupted by AWGN noise (σ = 15), with more training epochs, our NAC networks achieve better PSNR and SSIM results, but with longer processing time.

# of Blocks   1        2        5        10       15
PSNR↑         33.58    33.85    34.14    34.24    34.28
SSIM↑         0.9161   0.9226   0.9272   0.9277   0.9272

Table 6. Average PSNR (dB)/SSIM of NAC with different numbers of blocks on Set12 corrupted by AWGN noise (σ = 15).

4) Comparison with Oracle. We also study the "Oracle" performance of our NAC networks. For the "Oracle" version, we train our NAC networks on the pair of the observed noisy image y and its clean image x.

Figure 5. Comparisons of PSNR (dB) and SSIM results on Set12: (a) by our NAC networks and their "Oracle" version for AWGN with σ ∈ {5, 10, 15, 20, 25}; (b) by BM3D [13], DnCNN [44], and our NAC networks for strong AWGN noise (σ = 50).

The experiments are performed on the Set12 dataset corrupted by AWGN or signal-dependent Poisson noise, with noise deviations in {5, 10, 15, 20, 25}. Figure 5 (a) shows comparisons of our NAC and its "Oracle" networks on PSNR and SSIM. It can be seen that the "Oracle" networks, trained on pairs of noisy and clean images, perform only slightly better than the original NAC networks trained with the simulated-observed noisy image pairs (z, y). That is, with our NAC strategy, the networks trained only with the noisy test image can achieve similar denoising performance on weak noise.

5) Performance on strong noise. Our NAC strategy is based on the assumption of "weak noise". It is natural to wonder how well NAC performs against strong noise. To answer this question, we compare the NAC networks with BM3D [13] and DnCNN [44] on Set12 corrupted by AWGN noise with σ = 50. The PSNR and SSIM results are plotted in Figure 5 (b). One can see that our NAC networks are limited in handling strong AWGN noise when compared with BM3D [13] and DnCNN [44].

# of Epochs   100      200      500      1000     5000
PSNR↑         31.80    32.79    33.77    34.24    34.32
SSIM↑         0.8714   0.9023   0.9189   0.9277   0.9280
Time↓ (s)     6.74     13.25    30.20    58.32    281.56

Table 7. Average PSNR (dB), SSIM, and time (s) of NAC with different numbers of epochs on Set12 corrupted by AWGN noise (σ = 15).

6. Conclusion

In this work, we proposed a "Noisy-As-Clean" (NAC) strategy for learning unsupervised image denoising networks. In our NAC, we train an image-specific network by taking the noisy test image as the target, and adding simulated noise to it to generate the simulated noisy input. The simulated noise is close to the observed noise in the noisy test image. This strategy can be seamlessly embedded into existing supervised denoising networks. We provided a simple and useful observation: it is possible to learn an unsupervised network only with the noisy image, approximating the optimal parameters of a supervised network learned with pairs of noisy and clean images. Extensive experiments on benchmark datasets demonstrate that the networks trained with our NAC strategy achieve better or comparable performance in PSNR, SSIM, and visual quality when compared to previous state-of-the-art image denoising methods, including several supervised denoising networks. These results validate that our NAC strategy can learn image-specific priors and noise statistics only from the corrupted test image.

References

[1] Darmstadt Noise Dataset Benchmark. https://noise.visinf.tu-darmstadt.de/benchmark/#results_srgb. Accessed 2019-05-23.
[2] PyTorch. https://pytorch.org. Accessed 2019-05-23.
[3] Smartphone Image Denoising Dataset Benchmark. https://www.eecs.yorku.ca/~kamel/sidd/benchmark.php. Accessed 2019-05-23.
[4] Abdelrahman Abdelhamed, Stephen Lin, and Michael S. Brown. A high-quality denoising dataset for smartphone cameras. In CVPR, June 2018.
[5] Neatlab ABSoft. Neat Image. https://ni.neatvideo.com/home.
[6] Joshua Batson and Loic Royer. Noise2Self: Blind denoising by self-supervision. In ICML, volume 97, pages 524–533. PMLR, 2019.
[7] Patrick Billingsley. Probability and Measure. Wiley Series in Probability and Statistics. Wiley, 1995.
[8] Tim Brooks, Ben Mildenhall, Tianfan Xue, Jiawen Chen, Dillon Sharlet, and Jonathan T. Barron. Unprocessing images for learned raw denoising. In CVPR, 2019.
[9] Harold Christopher Burger, Christian J. Schuler, and Stefan Harmeling. Image denoising: Can plain neural networks compete with BM3D? In CVPR, pages 2392–2399, 2012.
[10] Jingwen Chen, Jiawei Chen, Hongyang Chao, and Ming Yang. Image blind denoising with generative adversarial network based noise modeling. In CVPR, pages 3155–3164, 2018.
[11] Yunjin Chen and Thomas Pock. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6):1256–1272, 2017.
[12] Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, and Karen Egiazarian. Color image denoising via sparse 3D collaborative filtering with grouping constraint in luminance-chrominance space. In ICIP, pages 313–316. IEEE, 2007.
[13] Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, and Karen Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing, 16(8):2080–2095, 2007.
[14] Michael Elad and Michal Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 15(12):3736–3745, 2006.
[15] Alessandro Foi, Mejdi Trimeche, Vladimir Katkovnik, and Karen Egiazarian. Practical Poissonian-Gaussian noise modeling and fitting for single-image raw-data. IEEE Transactions on Image Processing, 17(10):1737–1754, Oct 2008.
[16] Shuhang Gu, Qi Xie, Deyu Meng, Wangmeng Zuo, Xiangchu Feng, and Lei Zhang. Weighted nuclear norm minimization and its applications to low level vision. International Journal of Computer Vision, 121(2):183–208, 2017.
[17] Shi Guo, Zifei Yan, Kai Zhang, Wangmeng Zuo, and Lei Zhang. Toward convolutional blind denoising of real photographs. In CVPR, 2019.
[18] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, pages 770–778, 2016.
[19] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. Densely connected convolutional networks. In CVPR, pages 4700–4708, 2017.
[20] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
[21] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[22] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, pages 1097–1105, 2012.
[23] Alexander Krull, Tim-Oliver Buchholz, and Florian Jug. Noise2Void: Learning denoising from single noisy images. In CVPR, 2019.
[24] Samuli Laine, Tero Karras, Jaakko Lehtinen, and Timo Aila. High-quality self-supervised deep image denoising. In NeurIPS, 2019.
[25] Stamatios Lefkimmiatis. Non-local color image denoising with convolutional neural networks. In CVPR, pages 3587–3596, 2017.
[26] Jaakko Lehtinen, Jacob Munkberg, Jon Hasselgren, Samuli Laine, Tero Karras, Miika Aittala, and Timo Aila. Noise2Noise: Learning image restoration without clean data. In ICML, pages 2971–2980, 2018.
[27] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep image prior. In CVPR, pages 9446–9454, 2018.
[28] Ce Liu, William T. Freeman, Richard Szeliski, and Sing Bing Kang. Noise estimation from a single image. In CVPR, volume 1, pages 901–908, 2006.
[29] Ding Liu, Bihan Wen, Yuchen Fan, Chen Change Loy, and Thomas S. Huang. Non-local recurrent network for image restoration. In NeurIPS, pages 1673–1682, 2018.
[30] Markku Makitalo and Alessandro Foi. Optimal inversion of the Anscombe transformation in low-count Poisson image denoising. IEEE Transactions on Image Processing, 20(1):99–109, 2011.
[31] Xiao-Jiao Mao, Chunhua Shen, and Yu-Bin Yang. Image restoration using convolutional auto-encoders with symmetric skip connections. In NIPS, 2016.
[32] Vinod Nair and Geoffrey E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, pages 807–814, 2010.
[33] Seonghyeon Nam, Youngbae Hwang, Yasuyuki Matsushita, and Seon Joo Kim. A holistic approach to cross-channel image noise modeling and its application to image denoising. In CVPR, pages 1683–1691, 2016.
[34] Tobias Plotz and Stefan Roth. Benchmarking denoising algorithms with real photographs. In CVPR, 2017.
[35] Tobias Plotz and Stefan Roth. Neural nearest neighbors networks. In NeurIPS, 2018.
[36] Stefan Roth and Michael J. Black. Fields of experts. International Journal of Computer Vision, 82(2):205–229, 2009.
[37] Uwe Schmidt and Stefan Roth. Shrinkage fields for effective image restoration. In CVPR, pages 2774–2781, June 2014.
[38] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[39] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In CVPR, pages 1–9, 2015.
[40] Ying Tai, Jian Yang, Xiaoming Liu, and Chunyan Xu. MemNet: A persistent memory network for image restoration. In ICCV, 2017.
[41] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
[42] Junyuan Xie, Linli Xu, and Enhong Chen. Image denoising and inpainting with deep neural networks. In NIPS, pages 341–349, 2012.
[43] Jun Xu, Wangmeng Zuo, Lei Zhang, David Zhang, and Xiangchu Feng. Patch group based nonlocal self-similarity prior learning for image denoising. In ICCV, pages 244–252, 2015.
[44] Kai Zhang, Wangmeng Zuo, Yunjin Chen, Deyu Meng, and Lei Zhang. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 2017.
[45] Maria Zontak and Michal Irani. Internal statistics of a single natural image. In CVPR, 2011.
[46] Daniel Zoran and Yair Weiss. From learning models of natural image patches to whole image restoration. In ICCV, pages 479–486, 2011.

Page 7: Noisy-As-Clean: Learning Unsupervised Denoising from the ... · (d) CDnCNN+NAC: 35.80dB/0.9116 Figure 1. Denoised images and PSNR/SSIM results of CD-nCNN [44] (c) and CDnCNN trained

(a) Noisy 3146dB09370

(e) CBDNet [17] 3934dB09905

(b) CBM3D [12] 3626dB09811

(f) GCBD [10] 3752dB09765

(c) NI [5] 3752dB09868

(g) N2N [26] 3495dB09621

(d) DnCNN+ [44] 3825dB09888

(h) NAC 3834dB09887Figure 3 Denoised images and PSNR(dB)SSIM by comparison methods on ldquo0017 3rdquo in DND [34] The ldquoground-truthrdquo image is notreleased but PSNR(dB)SSIM results are publicly provided on DND Benchmark

shorter exposure times while the ldquoground truthrdquo imagesare captured under lower ISO values with adjusted longerexposure times The ldquoground truthrdquo images are not releasedbut we can obtain the performance on PSNR and SSIM bysubmitting the denoised images to the DND rsquos Website

Results on PSNR and SSIM The comparisons on averagePSNR and SSIM results are listed in Table 4 As can be seenthe proposed NAC networks achieve better performance thanall previous denoising methods including the CBM3D [12]the supervised networks DnCNN+ [44] and CBDNet [17]and the unsupervised networks GCBD [10] N2N [26] andDIP [27] This demonstrates that the proposed NAC net-works can indeed handle the complex unknown and real-istic noise and achieve better performance than supervisednetworks such as DnCNN+ [44] and CBDNet [17]

Qualitative results In Figure 3 we show the denoised im-ages of our NAC network and the comparison methods onthe image ldquo0017 3rdquo from the DND dataset We observethat our unsupervised NAC networks are very effective onremoving realistic noise from the real photograph Besidesour NAC networks achieve competitive PSNR and SSIMresults when compared with the other methods includingthe supervised ones such as DnCNN+ [44] and CBDNet [17]Speed The work most similar to ours is Deep Image Prior(DIP) [27] which also trains an image-specific network foreach test image Averagely DIP needs 6039 seconds to pro-cess a 512times 512 color image on which our NAC network

needs 5832 seconds (on an NVIDIA Titan X GPU)

54 Ablation Study

To further study our NAC strategy we conduct detailedexamination of our NAC networks on image denoising1) Generality of our NAC strategy To evaluate the gen-erality of the proposed NAC strategy we apply it on theDnCNN [44] network and denote the resulting network asldquoDnCNN-NACrdquo We train DnCNN with our NAC strategy(DnCNN-NAC) and the comparison results with DnCNNare listed in Tab 5 One can see that DnCNN-NAC achievesbetter PSNR results than that of the original DnCNN whenσ = 5 10 15 (but worse when σ = 20 25) Note that theoriginal DnCNN network is trained offline on the BSD400dataset while here the DnCNN-NAC network is trainedonline for each specific test image

σ 5 10 15 20 25DnCNN [44] 3876 3478 3286 3145 3043DnCNN-NAC 4318 3716 3365 3116 2923

Table 5 PSNR (dB) results of DnCNN and DnCNN-NAC onSet12 corrupted by AWGN noise with different σ

2) Differences from DIP [27] Though the basic network inour work is the ResNet used in DIP [27] our NAC network isessentially different from DIP on at least two aspects Firstour NAC is a novel strategy for unsupervised learning ofadaptive network parameters for the degraded image while

0 2500 5000 7500 10000Epochs

000002004006008010012

Loss

Loss CameramanLoss House

6121824303642

PSNR

(dB)

PSNR CameramanPSNR House

(a) Curves of DIP [27] (b) Curves of our NACFigure 4 Training loss and PSNR (dB) curves of DIP [27] (a)and our NAC (b) networks wrt the number of epochs on theimages of ldquoCameramanrdquo and ldquoHouserdquo from Set12

DIP aims to investigate adaptive network structure withoutlearning the parameters Second our NAC learns a mappingfrom the synthetic noisy image z = y + ns to the noisyimage y which approximates the mapping from the noisyimage y = x + no to the clean image x But DIP maps arandom noise map to the noisy image y and the denoisedimage is obtained during the process Due to the two reasonsDIP needs early stop for different images while our NACachieves more robust (and better) denoising performancethan DIP on diverse images In Figure 4 we plot the curvesof training loss and test PSNR of DIP (a) and NAC (b)networks in 10000 epochs on two images of ldquoCameramanrdquoand ldquoHouserdquo We observe that DIP needs early stop to selectthe best results while our NAC can stably achieve betterdenoising results within 1000 epochs3) Influence on the number of residual blocks andepochs Our backbone network is the ResNet [27] with 10residual blocks trained in 1000 epochs Now we study howthe number of residual blocks and epochs influence the per-formance of NAC on image denoising The experiments areperformed on the Set12 dataset corrupted by AWGN noise(σ = 15) From Table 6 we observe that with more residualblocks the NAC networks the NAC networks can achievebetter PSNR and SSIM [41] results And 10 residual blocksare enough to achieve satisfactory results With more (eg15) blocks there is little improvement on PSRN and SSIMHence we use 10 residual blocks the same as [27] Then westudy how the number of epochs influence the performanceof NAC on image denoising From Table 7 one can see thaton the Set12 dataset corrupted by AWGN noise (σ = 15)with more training epochs our NAC networks achieve betterPSNR and SSIM results but with longer processing time

of Blocks 1 2 5 10 15PSNRuarr 3358 3385 3414 3424 3428SSIMuarr 09161 09226 09272 09277 09272

Table 6 Average PSNR (dB)SSIM of NAC with different num-ber of blocks on Set12 corrupted by AWGN noise (σ = 15)

4) Comparison with Oracle We also study the ldquoOraclerdquoperformance of our NAC networks In ldquoOraclerdquo we trainour NAC networks on the pair of observed noisy image yand its clean image x corrupted by AWGN noise or signal

5 10 15 20 25 AWGN Noise Level

2830323436384042444648

PSNR

(dB)

PSNR (dB) OraclePSNR (dB) Ours

080082084086088090092094096098100

SSIM

SSIM OracleSSIM Ours

(a)

BM3D DnCNN NACMethod

1416182022242628

PSNR

(dB)

PSNR (dB) = 50

066068070072074076078080

SSIM

SSIM = 50

(b)Figure 5 Comparisons of PSNR (dB) and SSIM results on Set12(a) by our NAC networks and its ldquoOraclerdquo version for AWGN withσ = 5 10 15 20 25 and (b) by BM3D [13] DnCNN [44] andour NAC networks for strong AWGN (σ = 50)

dependent Poisson noise The experiments are performed onSet12 dataset corrupted by AWGN or signal dependent Pois-son noise The noise deviations are in 5 10 15 20 25Figure 5 (a) shows comparisons of our NAC and its ldquoOr-aclerdquo networks on PSNR and SSIM It can be seen thatthe ldquoOraclerdquo networks trained on the pair of noisy-cleanimages only perform slightly better than the original NACnetworks trained with the simulated -observed noisy imagepairs (zy) With our NAC strategy the NAC networkstrained only with the noisy test image can achieve similardenoising performance on the weak noise5) Performance on strong noise Our NAC strategy isbased on the assumption of ldquoweak noiserdquo It is natural towonder how well NAC performs against strong noise Toanswer this question we compare the NAC networks withBM3D [13] and DnCNN [44] on Set12 corrupted by AWGNnoise with σ = 50 The PSNR and SSIM results are plottedin Figure 5 (b) One can see that our NAC networks arelimited in handling strong AWGN noise when comparedwith BM3D [13] and DnCNN [44]

of Epochs 100 200 500 1000 5000PSNRuarr 3180 3279 3377 3424 3432SSIMuarr 08714 09023 09189 09277 09280Timedarr 674 1325 3020 5832 28156

Table 7 Average PSNR (dB) and time (s) of NAC with differentnumber of epochs on Set12 corrupted by AWGN noise (σ = 15)

6 ConclusionIn this work we proposed a ldquoNoisy-As-Cleanrdquo (NAC)

strategy for learning unsupervised image denoising networksIn our NAC we trained an image-specific network by tak-ing the noisy test image as the target and adding to it thesimulated noise to generate the simulated noisy input Thesimulated noise is close to the observed noise in the noisytest image This strategy can be seamlessly embedded into ex-isting supervised denoising networks We provided a simpleand useful observation it is possible to learn an unsupervisednetwork only with the noisy image approximating the opti-mal parameters of a supervised network learned with pairsof noisy and clean images Extensive experiments on bench-

mark datasets demonstrate that the networks trained withour NAC strategy achieved better comparable performanceon PSNR SSIM and visual quality when compared to pre-vious state-of-the-art image denoising methods includingseveral supervised learning denoising networks These re-sults validate that our NAC strategy can learn image-specificpriors and noise statistics only from the corrupted test image

References[1] Darmstadt Noise Dataset Benchmark https

noisevisinftu-darmstadtdebenchmarkresults_srgb Accessed 2019-05-23 2

[2] PyTorch httpspytorchorg Accessed 2019-05-23 5

[3] Smartphone Image Denoising Dataset Benchmarkhttpswwweecsyorkuca˜kamelsiddbenchmarkphp Accessed 2019-05-23 2

[4] Abdelrahman Abdelhamed Stephen Lin and Michael SBrown A high-quality denoising dataset for smartphonecameras In CVPR June 2018 1 2

[5] Neatlab ABSoft Neat Image httpsnineatvideocomhome 6 7

[6] Joshua Batson and Loic Royer Noise2Self Blind denoisingby self-supervision In ICML volume 97 pages 524ndash533PMLR 2019 1 2 3

[7] Patrick Billingsley Probability and Measure Wiley Seriesin Probability and Statistics Wiley 1995 3

[8] Tim Brooks Ben Mildenhall Tianfan Xue Jiawen ChenDillon Sharlet and Jonathan T Barron Unprocessing imagesfor learned raw denoising In CVPR pages 9446ndash9454 20193 4

[9] Harold Christopher Burger Christian J Schuler and StefanHarmeling Image denoising Can plain neural networkscompete with BM3D In CVPR pages 2392ndash2399 2012 12 3 4

[10] Jingwen Chen Jiawei Chen Hongyang Chao and Ming YangImage blind denoising with generative adversarial networkbased noise modeling In CVPR pages 3155ndash3164 2018 16 7

[11] Yunjin Chen and Thomas Pock Trainable nonlinear reac-tion diffusion A flexible framework for fast and effectiveimage restoration IEEE Transactions on Pattern Analysisand Machine Intelligence 39(6)1256ndash1272 2017 2

[12] Kostadin Dabov Alessandro Foi Vladimir Katkovnik andKaren Egiazarian Color image denoising via sparse 3Dcollaborative filtering with grouping constraint in luminance-chrominance space In ICIP pages 313ndash316 IEEE 2007 67

[13] Kostadin Dabov Alessandro Foi Vladimir Katkovnik andKaren Egiazarian Image denoising by sparse 3-D transform-domain collaborative filtering IEEE Transactions on ImageProcessing 16(8)2080ndash2095 2007 5 6 8

[14] Michael Elad and Michal Aharon Image denoising via sparseand redundant representations over learned dictionaries IEEETransactions on Image Processing 15(12)3736ndash3745 20062

[15] Alessandro Foi Mejdi Trimeche Vladimir Katkovnik andKaren Egiazarian Practical poissonian-gaussian noise model-ing and fitting for single-image raw-data IEEE Transactionson Image Processing 17(10)1737ndash1754 Oct 2008 1 2 4 6

[16] Shuhang Gu Qi Xie Deyu Meng Wangmeng Zuo XiangchuFeng and Lei Zhang Weighted nuclear norm minimizationand its applications to low level vision International Journalof Computer Vision 121(2)183ndash208 2017 5

[17] Shi Guo Zifei Yan Kai Zhang Wangmeng Zuo and LeiZhang Toward convolutional blind denoising of real pho-tographs In CVPR 2019 1 2 3 4 6 7

[18] Kaiming He Xiangyu Zhang Shaoqing Ren and Jian SunDeep residual learning for image recognition In CVPR pages770ndash778 2016 1 2

[19] Gao Huang Zhuang Liu Laurens van der Maaten and Kil-ian Q Weinberger Densely connected convolutional net-works In CVPR pages 4700ndash4708 2017 1

[20] Sergey Ioffe and Christian Szegedy Batch normalizationAccelerating deep network training by reducing internal co-variate shift In ICML 2015 5

[21] Diederik P Kingma and Jimmy Ba Adam A method forstochastic optimization In ICLR 2015 5

[22] Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton Im-agenet classification with deep convolutional neural networksIn NIPS pages 1097ndash1105 2012 1

[23] Alexander Krull Tim-Oliver Buchholz and Florian JugNoise2Void-learning denoising from single noisy imagesIn CVPR 2019 1 2 3 5 6

[24] Samuli Laine Tero Karras Jaakko Lehtinen and TimoAila High-quality self-supervised deep image denoisingIn NeurIPS 2019 1 2

[25] Stamatios Lefkimmiatis Non-local color image denoisingwith convolutional neural networks In CVPR pages 3587ndash3596 2017 1 2 3

[26] Jaakko Lehtinen Jacob Munkberg Jon Hasselgren SamuliLaine Tero Karras Miika Aittala and Timo AilaNoise2Noise Learning image restoration without clean dataIn ICML pages 2971ndash2980 2018 1 2 3 5 6 7

[27] Victor Lempitsky Dmitry Ulya Andrea Vedaldi and VictorLempitsky Deep image prior In CVPR pages 9446ndash94542018 1 2 3 5 6 7 8

[28] Ce Liu William T Freeman Richard Szeliski and Sing BingKang Noise estimation from a single image CVPR 1901ndash908 2006 1 3

[29] Ding Liu Bihan Wen Yuchen Fan Chen Change Loy andThomas S Huang Non-local recurrent network for imagerestoration In NeurIPS pages 1673ndash1682 2018 1 2 3 4 5

[30] Markku Makitalo and Alessandro Foi Optimal inversionof the anscombe transformation in low-count poisson imagedenoising IEEE Transactions on Image Processing 20(1)99ndash109 2011 5

[31] Xiao-Jiao Mao Chunhua Shen and Yu-Bin Yang Imagerestoration using convolutional auto-encoders with symmetricskip connections In NIPS 2016 1 2 3 5

[32] Vinod Nair and Geoffrey E Hinton Rectified linear unitsimprove restricted boltzmann machines In ICML pages807ndash814 2010 5

[33] Seonghyeon Nam Youngbae Hwang Yasuyuki Matsushitaand Seon Joo Kim A holistic approach to cross-channelimage noise modeling and its application to image denoisingIn CVPR pages 1683ndash1691 2016 1 2 3 4 6

[34] Tobias Plotz and Stefan Roth Benchmarking denoising al-gorithms with real photographs In CVPR 2017 1 2 67

[35] Tobias Plotz and Stefan Roth Neural nearest neighbors net-works In NeurIPS 2018 1 2 3

[36] Stefan Roth and Michael J Black Fields of experts Inter-national Journal of Computer Vision 82(2)205ndash229 20092

[37] Uwe Schmidt and Stefan Roth Shrinkage fields for effectiveimage restoration In CVPR pages 2774ndash2781 June 2014 2

[38] Karen Simonyan and Andrew Zisserman Very deep convolu-tional networks for large-scale image recognition In ICLR2015 1

[39] Christian Szegedy Wei Liu Yangqing Jia Pierre SermanetScott Reed Dragomir Anguelov Dumitru Erhan VincentVanhoucke and Andrew Rabinovich Going deeper withconvolutions In CVPR pages 1ndash9 2015 1

[40] Ying Tai Jian Yang Xiaoming Liu and Chunyan Xu Mem-net A persistent memory network for image restoration InICCV 2017 1 2 3 4 5

[41] Zhou Wang Alan C Bovik Hamid R Sheikh and Eero PSimoncelli Image quality assessment from error visibility tostructural similarity IEEE Transactions on Image Processing13(4)600ndash612 2004 5 6 8

[42] Junyuan Xie Linli Xu and Enhong Chen Image denoisingand inpainting with deep neural networks In NIPS pages341ndash349 2012 1 2 3

[43] Jun Xu Wangmeng Zuo Lei Zhang David Zhang and XFeng Patch group based nonlocal self-similarity prior learn-ing for image denoising In ICCV pages 244ndash252 2015 25

[44] Kai Zhang Wangmeng Zuo Yunjin Chen Deyu Meng andLei Zhang Beyond a Gaussian denoiser Residual learning ofdeep cnn for image denoising IEEE Transactions on ImageProcessing 2017 1 2 3 4 5 6 7 8

[45] Maria Zontak and Michal Irani Internal statistics of a singlenatural image In CVPR 2011 2

[46] Daniel Zoran and Yair Weiss From learning models of naturalimage patches to whole image restoration In ICCV pages479ndash486 2011 2 5

Page 8: Noisy-As-Clean: Learning Unsupervised Denoising from the ... · (d) CDnCNN+NAC: 35.80dB/0.9116 Figure 1. Denoised images and PSNR/SSIM results of CD-nCNN [44] (c) and CDnCNN trained

0 2500 5000 7500 10000Epochs

000002004006008010012

Loss

Loss CameramanLoss House

6121824303642

PSNR

(dB)

PSNR CameramanPSNR House

(a) Curves of DIP [27] (b) Curves of our NACFigure 4 Training loss and PSNR (dB) curves of DIP [27] (a)and our NAC (b) networks wrt the number of epochs on theimages of ldquoCameramanrdquo and ldquoHouserdquo from Set12

DIP aims to investigate adaptive network structure withoutlearning the parameters Second our NAC learns a mappingfrom the synthetic noisy image z = y + ns to the noisyimage y which approximates the mapping from the noisyimage y = x + no to the clean image x But DIP maps arandom noise map to the noisy image y and the denoisedimage is obtained during the process Due to the two reasonsDIP needs early stop for different images while our NACachieves more robust (and better) denoising performancethan DIP on diverse images In Figure 4 we plot the curvesof training loss and test PSNR of DIP (a) and NAC (b)networks in 10000 epochs on two images of ldquoCameramanrdquoand ldquoHouserdquo We observe that DIP needs early stop to selectthe best results while our NAC can stably achieve betterdenoising results within 1000 epochs3) Influence on the number of residual blocks andepochs Our backbone network is the ResNet [27] with 10residual blocks trained in 1000 epochs Now we study howthe number of residual blocks and epochs influence the per-formance of NAC on image denoising The experiments areperformed on the Set12 dataset corrupted by AWGN noise(σ = 15) From Table 6 we observe that with more residualblocks the NAC networks the NAC networks can achievebetter PSNR and SSIM [41] results And 10 residual blocksare enough to achieve satisfactory results With more (eg15) blocks there is little improvement on PSRN and SSIMHence we use 10 residual blocks the same as [27] Then westudy how the number of epochs influence the performanceof NAC on image denoising From Table 7 one can see thaton the Set12 dataset corrupted by AWGN noise (σ = 15)with more training epochs our NAC networks achieve betterPSNR and SSIM results but with longer processing time

of Blocks 1 2 5 10 15PSNRuarr 3358 3385 3414 3424 3428SSIMuarr 09161 09226 09272 09277 09272

Table 6 Average PSNR (dB)SSIM of NAC with different num-ber of blocks on Set12 corrupted by AWGN noise (σ = 15)

4) Comparison with Oracle We also study the ldquoOraclerdquoperformance of our NAC networks In ldquoOraclerdquo we trainour NAC networks on the pair of observed noisy image yand its clean image x corrupted by AWGN noise or signal

5 10 15 20 25 AWGN Noise Level

2830323436384042444648

PSNR

(dB)

PSNR (dB) OraclePSNR (dB) Ours

080082084086088090092094096098100

SSIM

SSIM OracleSSIM Ours

(a)

BM3D DnCNN NACMethod

1416182022242628

PSNR

(dB)

PSNR (dB) = 50

066068070072074076078080

SSIM

SSIM = 50

(b)Figure 5 Comparisons of PSNR (dB) and SSIM results on Set12(a) by our NAC networks and its ldquoOraclerdquo version for AWGN withσ = 5 10 15 20 25 and (b) by BM3D [13] DnCNN [44] andour NAC networks for strong AWGN (σ = 50)

dependent Poisson noise The experiments are performed onSet12 dataset corrupted by AWGN or signal dependent Pois-son noise The noise deviations are in 5 10 15 20 25Figure 5 (a) shows comparisons of our NAC and its ldquoOr-aclerdquo networks on PSNR and SSIM It can be seen thatthe ldquoOraclerdquo networks trained on the pair of noisy-cleanimages only perform slightly better than the original NACnetworks trained with the simulated -observed noisy imagepairs (zy) With our NAC strategy the NAC networkstrained only with the noisy test image can achieve similardenoising performance on the weak noise5) Performance on strong noise Our NAC strategy isbased on the assumption of ldquoweak noiserdquo It is natural towonder how well NAC performs against strong noise Toanswer this question we compare the NAC networks withBM3D [13] and DnCNN [44] on Set12 corrupted by AWGNnoise with σ = 50 The PSNR and SSIM results are plottedin Figure 5 (b) One can see that our NAC networks arelimited in handling strong AWGN noise when comparedwith BM3D [13] and DnCNN [44]

of Epochs 100 200 500 1000 5000PSNRuarr 3180 3279 3377 3424 3432SSIMuarr 08714 09023 09189 09277 09280Timedarr 674 1325 3020 5832 28156

Table 7 Average PSNR (dB) and time (s) of NAC with differentnumber of epochs on Set12 corrupted by AWGN noise (σ = 15)

6 ConclusionIn this work we proposed a ldquoNoisy-As-Cleanrdquo (NAC)

strategy for learning unsupervised image denoising networksIn our NAC we trained an image-specific network by tak-ing the noisy test image as the target and adding to it thesimulated noise to generate the simulated noisy input Thesimulated noise is close to the observed noise in the noisytest image This strategy can be seamlessly embedded into ex-isting supervised denoising networks We provided a simpleand useful observation it is possible to learn an unsupervisednetwork only with the noisy image approximating the opti-mal parameters of a supervised network learned with pairsof noisy and clean images Extensive experiments on bench-

mark datasets demonstrate that the networks trained withour NAC strategy achieved better comparable performanceon PSNR SSIM and visual quality when compared to pre-vious state-of-the-art image denoising methods includingseveral supervised learning denoising networks These re-sults validate that our NAC strategy can learn image-specificpriors and noise statistics only from the corrupted test image

References[1] Darmstadt Noise Dataset Benchmark https

noisevisinftu-darmstadtdebenchmarkresults_srgb Accessed 2019-05-23 2

[2] PyTorch httpspytorchorg Accessed 2019-05-23 5

[3] Smartphone Image Denoising Dataset Benchmarkhttpswwweecsyorkuca˜kamelsiddbenchmarkphp Accessed 2019-05-23 2

[4] Abdelrahman Abdelhamed Stephen Lin and Michael SBrown A high-quality denoising dataset for smartphonecameras In CVPR June 2018 1 2

[5] Neatlab ABSoft Neat Image httpsnineatvideocomhome 6 7

[6] Joshua Batson and Loic Royer Noise2Self Blind denoisingby self-supervision In ICML volume 97 pages 524ndash533PMLR 2019 1 2 3

[7] Patrick Billingsley Probability and Measure Wiley Seriesin Probability and Statistics Wiley 1995 3

[8] Tim Brooks Ben Mildenhall Tianfan Xue Jiawen ChenDillon Sharlet and Jonathan T Barron Unprocessing imagesfor learned raw denoising In CVPR pages 9446ndash9454 20193 4

[9] Harold Christopher Burger Christian J Schuler and StefanHarmeling Image denoising Can plain neural networkscompete with BM3D In CVPR pages 2392ndash2399 2012 12 3 4

[10] Jingwen Chen Jiawei Chen Hongyang Chao and Ming YangImage blind denoising with generative adversarial networkbased noise modeling In CVPR pages 3155ndash3164 2018 16 7

[11] Yunjin Chen and Thomas Pock Trainable nonlinear reac-tion diffusion A flexible framework for fast and effectiveimage restoration IEEE Transactions on Pattern Analysisand Machine Intelligence 39(6)1256ndash1272 2017 2

[12] Kostadin Dabov Alessandro Foi Vladimir Katkovnik andKaren Egiazarian Color image denoising via sparse 3Dcollaborative filtering with grouping constraint in luminance-chrominance space In ICIP pages 313ndash316 IEEE 2007 67

[13] Kostadin Dabov Alessandro Foi Vladimir Katkovnik andKaren Egiazarian Image denoising by sparse 3-D transform-domain collaborative filtering IEEE Transactions on ImageProcessing 16(8)2080ndash2095 2007 5 6 8

[14] Michael Elad and Michal Aharon Image denoising via sparseand redundant representations over learned dictionaries IEEETransactions on Image Processing 15(12)3736ndash3745 20062

[15] Alessandro Foi Mejdi Trimeche Vladimir Katkovnik andKaren Egiazarian Practical poissonian-gaussian noise model-ing and fitting for single-image raw-data IEEE Transactionson Image Processing 17(10)1737ndash1754 Oct 2008 1 2 4 6

[16] Shuhang Gu Qi Xie Deyu Meng Wangmeng Zuo XiangchuFeng and Lei Zhang Weighted nuclear norm minimizationand its applications to low level vision International Journalof Computer Vision 121(2)183ndash208 2017 5

[17] Shi Guo Zifei Yan Kai Zhang Wangmeng Zuo and LeiZhang Toward convolutional blind denoising of real pho-tographs In CVPR 2019 1 2 3 4 6 7

[18] Kaiming He Xiangyu Zhang Shaoqing Ren and Jian SunDeep residual learning for image recognition In CVPR pages770ndash778 2016 1 2

[19] Gao Huang Zhuang Liu Laurens van der Maaten and Kil-ian Q Weinberger Densely connected convolutional net-works In CVPR pages 4700ndash4708 2017 1

[20] Sergey Ioffe and Christian Szegedy Batch normalizationAccelerating deep network training by reducing internal co-variate shift In ICML 2015 5

[21] Diederik P Kingma and Jimmy Ba Adam A method forstochastic optimization In ICLR 2015 5

[22] Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton Im-agenet classification with deep convolutional neural networksIn NIPS pages 1097ndash1105 2012 1

[23] Alexander Krull Tim-Oliver Buchholz and Florian JugNoise2Void-learning denoising from single noisy imagesIn CVPR 2019 1 2 3 5 6

[24] Samuli Laine Tero Karras Jaakko Lehtinen and TimoAila High-quality self-supervised deep image denoisingIn NeurIPS 2019 1 2

[25] Stamatios Lefkimmiatis Non-local color image denoisingwith convolutional neural networks In CVPR pages 3587ndash3596 2017 1 2 3

[26] Jaakko Lehtinen Jacob Munkberg Jon Hasselgren SamuliLaine Tero Karras Miika Aittala and Timo AilaNoise2Noise Learning image restoration without clean dataIn ICML pages 2971ndash2980 2018 1 2 3 5 6 7

[27] Victor Lempitsky Dmitry Ulya Andrea Vedaldi and VictorLempitsky Deep image prior In CVPR pages 9446ndash94542018 1 2 3 5 6 7 8

[28] Ce Liu William T Freeman Richard Szeliski and Sing BingKang Noise estimation from a single image CVPR 1901ndash908 2006 1 3

[29] Ding Liu Bihan Wen Yuchen Fan Chen Change Loy andThomas S Huang Non-local recurrent network for imagerestoration In NeurIPS pages 1673ndash1682 2018 1 2 3 4 5

[30] Markku Makitalo and Alessandro Foi Optimal inversionof the anscombe transformation in low-count poisson imagedenoising IEEE Transactions on Image Processing 20(1)99ndash109 2011 5

[31] Xiao-Jiao Mao Chunhua Shen and Yu-Bin Yang Imagerestoration using convolutional auto-encoders with symmetricskip connections In NIPS 2016 1 2 3 5

[32] Vinod Nair and Geoffrey E Hinton Rectified linear unitsimprove restricted boltzmann machines In ICML pages807ndash814 2010 5

[33] Seonghyeon Nam Youngbae Hwang Yasuyuki Matsushitaand Seon Joo Kim A holistic approach to cross-channelimage noise modeling and its application to image denoisingIn CVPR pages 1683ndash1691 2016 1 2 3 4 6

[34] Tobias Plotz and Stefan Roth Benchmarking denoising al-gorithms with real photographs In CVPR 2017 1 2 67

[35] Tobias Plotz and Stefan Roth Neural nearest neighbors net-works In NeurIPS 2018 1 2 3

[36] Stefan Roth and Michael J Black Fields of experts Inter-national Journal of Computer Vision 82(2)205ndash229 20092

[37] Uwe Schmidt and Stefan Roth Shrinkage fields for effectiveimage restoration In CVPR pages 2774ndash2781 June 2014 2

[38] Karen Simonyan and Andrew Zisserman Very deep convolu-tional networks for large-scale image recognition In ICLR2015 1

[39] Christian Szegedy Wei Liu Yangqing Jia Pierre SermanetScott Reed Dragomir Anguelov Dumitru Erhan VincentVanhoucke and Andrew Rabinovich Going deeper withconvolutions In CVPR pages 1ndash9 2015 1

[40] Ying Tai Jian Yang Xiaoming Liu and Chunyan Xu Mem-net A persistent memory network for image restoration InICCV 2017 1 2 3 4 5

[41] Zhou Wang Alan C Bovik Hamid R Sheikh and Eero PSimoncelli Image quality assessment from error visibility tostructural similarity IEEE Transactions on Image Processing13(4)600ndash612 2004 5 6 8

[42] Junyuan Xie Linli Xu and Enhong Chen Image denoisingand inpainting with deep neural networks In NIPS pages341ndash349 2012 1 2 3

[43] Jun Xu Wangmeng Zuo Lei Zhang David Zhang and XFeng Patch group based nonlocal self-similarity prior learn-ing for image denoising In ICCV pages 244ndash252 2015 25

[44] Kai Zhang Wangmeng Zuo Yunjin Chen Deyu Meng andLei Zhang Beyond a Gaussian denoiser Residual learning ofdeep cnn for image denoising IEEE Transactions on ImageProcessing 2017 1 2 3 4 5 6 7 8

[45] Maria Zontak and Michal Irani Internal statistics of a singlenatural image In CVPR 2011 2

[46] Daniel Zoran and Yair Weiss From learning models of naturalimage patches to whole image restoration In ICCV pages479ndash486 2011 2 5

Page 9: Noisy-As-Clean: Learning Unsupervised Denoising from the ... · (d) CDnCNN+NAC: 35.80dB/0.9116 Figure 1. Denoised images and PSNR/SSIM results of CD-nCNN [44] (c) and CDnCNN trained

mark datasets demonstrate that the networks trained withour NAC strategy achieved better comparable performanceon PSNR SSIM and visual quality when compared to pre-vious state-of-the-art image denoising methods includingseveral supervised learning denoising networks These re-sults validate that our NAC strategy can learn image-specificpriors and noise statistics only from the corrupted test image

References[1] Darmstadt Noise Dataset Benchmark https

noisevisinftu-darmstadtdebenchmarkresults_srgb Accessed 2019-05-23 2

[2] PyTorch httpspytorchorg Accessed 2019-05-23 5

[3] Smartphone Image Denoising Dataset Benchmarkhttpswwweecsyorkuca˜kamelsiddbenchmarkphp Accessed 2019-05-23 2

[4] Abdelrahman Abdelhamed Stephen Lin and Michael SBrown A high-quality denoising dataset for smartphonecameras In CVPR June 2018 1 2

[5] Neatlab ABSoft Neat Image httpsnineatvideocomhome 6 7

[6] Joshua Batson and Loic Royer Noise2Self Blind denoisingby self-supervision In ICML volume 97 pages 524ndash533PMLR 2019 1 2 3

[7] Patrick Billingsley Probability and Measure Wiley Seriesin Probability and Statistics Wiley 1995 3

[8] Tim Brooks Ben Mildenhall Tianfan Xue Jiawen ChenDillon Sharlet and Jonathan T Barron Unprocessing imagesfor learned raw denoising In CVPR pages 9446ndash9454 20193 4

[9] Harold Christopher Burger Christian J Schuler and StefanHarmeling Image denoising Can plain neural networkscompete with BM3D In CVPR pages 2392ndash2399 2012 12 3 4

[10] Jingwen Chen Jiawei Chen Hongyang Chao and Ming YangImage blind denoising with generative adversarial networkbased noise modeling In CVPR pages 3155ndash3164 2018 16 7

[11] Yunjin Chen and Thomas Pock Trainable nonlinear reac-tion diffusion A flexible framework for fast and effectiveimage restoration IEEE Transactions on Pattern Analysisand Machine Intelligence 39(6)1256ndash1272 2017 2

[12] Kostadin Dabov Alessandro Foi Vladimir Katkovnik andKaren Egiazarian Color image denoising via sparse 3Dcollaborative filtering with grouping constraint in luminance-chrominance space In ICIP pages 313ndash316 IEEE 2007 67

[13] Kostadin Dabov Alessandro Foi Vladimir Katkovnik andKaren Egiazarian Image denoising by sparse 3-D transform-domain collaborative filtering IEEE Transactions on ImageProcessing 16(8)2080ndash2095 2007 5 6 8

[14] Michael Elad and Michal Aharon Image denoising via sparseand redundant representations over learned dictionaries IEEETransactions on Image Processing 15(12)3736ndash3745 20062

[15] Alessandro Foi Mejdi Trimeche Vladimir Katkovnik andKaren Egiazarian Practical poissonian-gaussian noise model-ing and fitting for single-image raw-data IEEE Transactionson Image Processing 17(10)1737ndash1754 Oct 2008 1 2 4 6

[16] Shuhang Gu Qi Xie Deyu Meng Wangmeng Zuo XiangchuFeng and Lei Zhang Weighted nuclear norm minimizationand its applications to low level vision International Journalof Computer Vision 121(2)183ndash208 2017 5

[17] Shi Guo Zifei Yan Kai Zhang Wangmeng Zuo and LeiZhang Toward convolutional blind denoising of real pho-tographs In CVPR 2019 1 2 3 4 6 7

[18] Kaiming He Xiangyu Zhang Shaoqing Ren and Jian SunDeep residual learning for image recognition In CVPR pages770ndash778 2016 1 2

[19] Gao Huang Zhuang Liu Laurens van der Maaten and Kil-ian Q Weinberger Densely connected convolutional net-works In CVPR pages 4700ndash4708 2017 1

[20] Sergey Ioffe and Christian Szegedy Batch normalizationAccelerating deep network training by reducing internal co-variate shift In ICML 2015 5

[21] Diederik P Kingma and Jimmy Ba Adam A method forstochastic optimization In ICLR 2015 5

[22] Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton Im-agenet classification with deep convolutional neural networksIn NIPS pages 1097ndash1105 2012 1

[23] Alexander Krull Tim-Oliver Buchholz and Florian JugNoise2Void-learning denoising from single noisy imagesIn CVPR 2019 1 2 3 5 6

[24] Samuli Laine Tero Karras Jaakko Lehtinen and TimoAila High-quality self-supervised deep image denoisingIn NeurIPS 2019 1 2

[25] Stamatios Lefkimmiatis Non-local color image denoisingwith convolutional neural networks In CVPR pages 3587ndash3596 2017 1 2 3

[26] Jaakko Lehtinen Jacob Munkberg Jon Hasselgren SamuliLaine Tero Karras Miika Aittala and Timo AilaNoise2Noise Learning image restoration without clean dataIn ICML pages 2971ndash2980 2018 1 2 3 5 6 7

[27] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep image prior. In CVPR, pages 9446–9454, 2018.

[28] Ce Liu, William T. Freeman, Richard Szeliski, and Sing Bing Kang. Noise estimation from a single image. In CVPR, volume 1, pages 901–908, 2006.

[29] Ding Liu, Bihan Wen, Yuchen Fan, Chen Change Loy, and Thomas S. Huang. Non-local recurrent network for image restoration. In NeurIPS, pages 1673–1682, 2018.

[30] Markku Makitalo and Alessandro Foi. Optimal inversion of the Anscombe transformation in low-count Poisson image denoising. IEEE Transactions on Image Processing, 20(1):99–109, 2011.

[31] Xiao-Jiao Mao, Chunhua Shen, and Yu-Bin Yang. Image restoration using convolutional auto-encoders with symmetric skip connections. In NIPS, 2016.

[32] Vinod Nair and Geoffrey E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, pages 807–814, 2010.

[33] Seonghyeon Nam, Youngbae Hwang, Yasuyuki Matsushita, and Seon Joo Kim. A holistic approach to cross-channel image noise modeling and its application to image denoising. In CVPR, pages 1683–1691, 2016.

[34] Tobias Plotz and Stefan Roth. Benchmarking denoising algorithms with real photographs. In CVPR, 2017.

[35] Tobias Plotz and Stefan Roth. Neural nearest neighbors networks. In NeurIPS, 2018.

[36] Stefan Roth and Michael J. Black. Fields of experts. International Journal of Computer Vision, 82(2):205–229, 2009.

[37] Uwe Schmidt and Stefan Roth. Shrinkage fields for effective image restoration. In CVPR, pages 2774–2781, June 2014.

[38] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.

[39] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In CVPR, pages 1–9, 2015.

[40] Ying Tai, Jian Yang, Xiaoming Liu, and Chunyan Xu. MemNet: A persistent memory network for image restoration. In ICCV, 2017.

[41] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.

[42] Junyuan Xie, Linli Xu, and Enhong Chen. Image denoising and inpainting with deep neural networks. In NIPS, pages 341–349, 2012.

[43] Jun Xu, Wangmeng Zuo, Lei Zhang, David Zhang, and X. Feng. Patch group based nonlocal self-similarity prior learning for image denoising. In ICCV, pages 244–252, 2015.

[44] Kai Zhang, Wangmeng Zuo, Yunjin Chen, Deyu Meng, and Lei Zhang. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 2017.

[45] Maria Zontak and Michal Irani. Internal statistics of a single natural image. In CVPR, 2011.

[46] Daniel Zoran and Yair Weiss. From learning models of natural image patches to whole image restoration. In ICCV, pages 479–486, 2011.
