IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 20, NO. 9, SEPTEMBER 2011 2683


Nonlocal Means With Dimensionality Reduction and SURE-Based Parameter Selection

Dimitri Van De Ville, Member, IEEE, and Michel Kocher

Abstract—Nonlocal means (NLM) is an effective denoising method that applies adaptive averaging based on similarity between neighborhoods in the image. An attractive way to both improve and speed up NLM is to first perform a linear projection of the neighborhood. One particular example is to use principal component analysis (PCA) to perform dimensionality reduction. Here, we derive Stein’s unbiased risk estimate (SURE) for NLM with linear projection of the neighborhoods. The SURE can then be used to optimize the parameters by a search algorithm, or we can consider a linear expansion of multiple NLMs, each with a fixed parameter set, for which the optimal weights can be found by solving a linear system of equations. The experimental results demonstrate the accuracy of the SURE and its successful application to tune the parameters for NLM.

Index Terms—Linear transforms, nonlocal means (NLM), principal component analysis (PCA), Stein’s unbiased risk estimate.

I. INTRODUCTION

Learning from neighborhoods has become an important and powerful data-driven approach for various applications in image processing. Most notably, the nonlocal means (NLM) [1] algorithm applies adaptive averaging based on similar neighborhoods in a search region. Various methods have been proposed to accelerate the initial approach using preselection of the contributing neighborhoods based on average value and gradient [2], average and variance [3], or higher-order statistical moments [4], and cluster tree arrangement [5]–[7]. The computation of the distance measure between different neighborhoods itself can be optimized using the fast Fourier transform [8], a moving average filter [9], [10], early termination of the search [11], or by reducing redundant comparisons [12].

Variations of the NLM algorithm have also been proposed to improve the denoising performance; e.g., adaptive neighborhoods [13], iterative application [5], combination with kernel regression [14] and spectral analysis [15], and other similarity measures based on principal component analysis (PCA) [6], [16] or rotation invariance [17]. The smoothing parameter that determines the contributions of the patches has been locally optimized using Mallows’s $C_p$ statistic [18]. The most evolved version of the nonlocal principle is probably BM3D [19], which further processes the selected neighborhoods and gives high-quality results.

The combination of NLM with dimensionality-reduction methods such as PCA [6], [16] and SVD [7] has gained increased interest since the advantages are twofold. First, the computational complexity is greatly reduced. Second, measuring the distance between neighborhoods in a lower-dimensional subspace improves robustness to noise;

Manuscript received August 11, 2010; revised December 06, 2010 and February 10, 2011; accepted February 21, 2011. Date of publication March 07, 2011; date of current version August 19, 2011. This work was funded in part by the Swiss National Science Foundation (PP00P2-123438, D. Van De Ville) and in part by the Centre for Biomedical Imaging (CIBM). The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Rafael Molina.

D. Van De Ville is with the Department of Radiology and Medical Informatics, University of Geneva, 1211 Geneva 14, Switzerland, and also with the Institute of Bioengineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland.

M. Kocher is with the Biomedical Imaging Group, EPFL, Lausanne 1015, Switzerland.

Digital Object Identifier 10.1109/TIP.2011.2121083

1057-7149/$26.00 © 2011 IEEE


e.g., results for NLM denoising with a 7×7 neighborhood are clearly improved by reducing the 49 dimensions to 5–10. Moreover, Tasdizen [20] proposed the combination of PCA and NLM with parallel analysis to select the dimensionality [21].

Stein’s unbiased risk estimate (SURE) [22] is an elegant way to estimate the mean squared error (MSE) of an image degraded by additive Gaussian noise. Following this principle, one can select optimal parameters for regularization in inverse problems [23]–[25], in denoising strategies for wavelet thresholding [26]–[28], or using a numerical procedure for denoising approaches in general [29]. In recent work, we derived an analytical form of SURE [22] for the NLM algorithm [30]. This way, the MSE can be monitored from the noisy image only, which is a very useful property for optimally tuning the NLM algorithm. This concept can also be used to locally adapt the NLM parameters [12].

Here we further extend the analytical form of the SURE for NLM with linear projection of the neighborhoods, including projection onto a dimensionality-reduced subspace as specified by PCA. Since the PCA-NLM algorithm depends nonlinearly on the different parameters (neighborhood size, width of the smoothing kernel, search region, PCA dimensionality), we propose to optimize a linear expansion of several NLMs with different parameter settings, an approach that is inspired by the SURE-based linear expansion of thresholds (LET) proposed for wavelet denoising [31]. In our case, the optimal linear combination can be retrieved using the SURE of the individual PCA-NLM contributions.

In Section II, we briefly review the NLM algorithm and the SURE principle, together with the extension of SURE for NLM with linear projection. We also show how a linear expansion of multiple NLMs reduces to solving a linear system of equations. Next, in Section III, we present and discuss the experimental results to demonstrate the feasibility of using SURE for NLM parameter selection.

II. METHODS

A. Nonlocal Means Algorithm

We consider the observation model

$$\mathbf{y} = \mathbf{x} + \mathbf{b} \qquad (1)$$

where $\mathbf{x} \in \mathbb{R}^N$ stands for the vector representation of the noise-free image containing $N$ pixels, $\mathbf{b}$ is the zero-mean white Gaussian noise of variance $\sigma^2$, and $\mathbf{y}$ is the observed noisy data. We denote the grayscale value of the individual pixel at position $k = 1, \ldots, N$ as $y_k$, where we implicitly assume that vector indexing is mapped to a scalar index (e.g., using lexicographic ordering); this notation better reflects the spatial dependencies of the image. The pixel-based NLM algorithm [1] is a spatially adaptive filter that maps the measured data $\mathbf{y}$ into $\hat{\mathbf{x}}$ as follows:

$$\hat{x}_k = \frac{\sum_{l \in \mathcal{S}_k} w_{k,l}\, y_l}{\sum_{l \in \mathcal{S}_k} w_{k,l}} \qquad (2)$$

where $\mathcal{S}_k$ is the search region around $k$ and $w_{k,l}$ are the weights that compare the neighborhoods around pixels $k$ and $l$, respectively. The weights are defined as

$$w_{k,l} = \exp\left(-\frac{\sum_{m \in \mathcal{N}} (y_{k+m} - y_{l+m})^2}{2Ph^2}\right) \qquad (3)$$

where $\mathcal{N}$ defines the neighborhood and $P$ is its total size; e.g., $\mathcal{N} = [-3,3] \times [-3,3]$ and $P = 49$ for a 7×7 neighborhood.
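As an illustration of (1)–(3), a minimal (unoptimized) NLM sketch in NumPy; the function and parameter names and their default values are ours, not the paper's:

```python
import numpy as np

def nlm_denoise(y, half_patch=3, half_search=10, h=0.4):
    """Pixelwise nonlocal means: each pixel becomes a weighted average
    over a search window, with weights driven by the normalized squared
    distance between the surrounding patches (cf. (2) and (3))."""
    pad = half_patch + half_search
    yp = np.pad(y, pad, mode="reflect")
    P = (2 * half_patch + 1) ** 2          # patch size, e.g., 49 for 7x7
    H, W = y.shape
    out = np.empty((H, W), dtype=float)
    for i in range(H):
        for j in range(W):
            ci, cj = i + pad, j + pad      # center in padded coordinates
            ref = yp[ci - half_patch:ci + half_patch + 1,
                     cj - half_patch:cj + half_patch + 1]
            num = den = 0.0
            for di in range(-half_search, half_search + 1):
                for dj in range(-half_search, half_search + 1):
                    ni, nj = ci + di, cj + dj
                    patch = yp[ni - half_patch:ni + half_patch + 1,
                               nj - half_patch:nj + half_patch + 1]
                    d2 = float(np.sum((ref - patch) ** 2))
                    w = np.exp(-d2 / (2.0 * P * h * h))  # weights, cf. (3)
                    num += w * yp[ni, nj]
                    den += w
            out[i, j] = num / den                        # average, cf. (2)
    return out
```

The double loop makes the cost explicit: number of pixels × search positions × patch size, which is why the projection of Section II-C pays off.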

B. Mean Squared Error and Stein’s Unbiased Risk Estimate

The mean squared error (MSE) of the denoised image with respect to its noise-free version is

$$\mathrm{MSE}\{\hat{\mathbf{x}}\} = \frac{1}{N}\|\hat{\mathbf{x}} - \mathbf{x}\|^2 = \frac{1}{N}\sum_{k=1}^{N}(\hat{x}_k - x_k)^2 \qquad (4)$$

where $\|\cdot\|$ is the Euclidean norm. The peak signal-to-noise ratio (PSNR) is then defined as

$$\mathrm{PSNR}\{\hat{\mathbf{x}}\} = 10\log_{10}\frac{I_{\max}^2}{\mathrm{MSE}\{\hat{\mathbf{x}}\}} \qquad (5)$$

where the numerator contains the squared peak intensity value $I_{\max}$ of the image. SURE provides a means for unbiased estimation of the true MSE. It is specified by the following analytical expression [22]:

$$\mathrm{SURE}\{\hat{\mathbf{x}}\} = \frac{1}{N}\|\hat{\mathbf{x}} - \mathbf{y}\|^2 - \sigma^2 + \frac{2\sigma^2}{N}\,\mathrm{div}_{\mathbf{y}}\{\hat{\mathbf{x}}\} \qquad (6)$$

where $\mathrm{div}_{\mathbf{y}}\{\hat{\mathbf{x}}\}$ is the divergence of the NLM algorithm with respect to the measurements

$$\mathrm{div}_{\mathbf{y}}\{\hat{\mathbf{x}}\} = \sum_{k=1}^{N}\frac{\partial \hat{x}_k}{\partial y_k} \qquad (7)$$

which needs to be well defined in the weak sense. The derivation of SURE relies on the additive white Gaussian noise hypothesis and assumes knowledge of the noise variance $\sigma^2$. In practice, $\sigma^2$ can be easily estimated from the measured data (e.g., using the median of absolute deviations). The SURE-based PSNR, which we will name SURE-PSNR from now on, can then be computed as $10\log_{10}(I_{\max}^2/\mathrm{SURE}\{\hat{\mathbf{x}}\})$.
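A direct transcription of (6), together with one common way to estimate the noise level from the data via the median absolute deviation (the text mentions the MAD estimator without specifying its exact form, so this variant is an assumption of ours):

```python
import numpy as np

def estimate_sigma_mad(y):
    """Noise std from the median absolute deviation of horizontal
    finite differences; differences of i.i.d. N(0, sigma^2) pixels
    have standard deviation sigma*sqrt(2), and MAD/0.6745 ~ std."""
    d = np.diff(y, axis=-1).ravel()
    return np.median(np.abs(d - np.median(d))) / (0.6745 * np.sqrt(2.0))

def sure(y, x_hat, div, sigma2):
    """Unbiased MSE estimate, cf. (6): ||x_hat - y||^2/N - sigma^2
    + 2*sigma^2*div/N, with div the divergence (7) of the denoiser."""
    N = y.size
    return float(np.sum((x_hat - y) ** 2)) / N - sigma2 \
        + 2.0 * sigma2 * div / N
```

As a sanity check, the identity "denoiser" $\hat{\mathbf{x}} = \mathbf{y}$ has divergence $N$, and SURE then evaluates to $\sigma^2$, the true expected MSE of the noisy image itself.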

In previous work, we derived the analytical form of SURE for NLM [30]; under the convention that the weights depend on $y_k$ through the central pixel of the neighborhood, it reads

$$\mathrm{SURE}\{\hat{\mathbf{x}}\} = \frac{1}{N}\|\mathbf{y} - \hat{\mathbf{x}}\|^2 - \sigma^2 + \frac{2\sigma^2}{N}\sum_{k=1}^{N}\left[\frac{1}{C_k} + \frac{1}{Ph^2}\left(\widehat{x^2}_k - \hat{x}_k^2\right)\right] \qquad (8)$$

where $C_k = \sum_{l\in\mathcal{S}_k} w_{k,l}$ and where $\widehat{x^2}$ is the NLM algorithm applied to the squared pixel values. The computation of the divergence term can be readily incorporated within the core of the NLM algorithm. Specifically, implementing (8) requires an additional memory array to store $\widehat{x^2}$ (next to $\hat{\mathbf{x}}$), and its computational complexity takes only $O(N\cdot K)$ operations, compared to $O(N\cdot K\cdot P)$ of the NLM algorithm itself, where $K$ is the number of pixels in the search region.

C. Nonlocal Means for Transformed Neighborhoods

Instead of using directly the pixel values of the neighborhoods as positions in the high-dimensional space, an appealing alternative is to first transform the neighborhoods into another domain with some favorable properties. For example, the computational burden of the NLM algorithm can be alleviated by projecting the neighborhood into a subspace of lower dimensionality as determined by PCA [6], [16], [20]. Specifically, the projection matrix $\mathbf{Q}$ that diagonalizes the demeaned covariance matrix of all patches in the image is computed. Then, each neighborhood centered around $k$ can be projected onto the vector $\mathbf{z}_k = [z_k^{(1)}, \ldots, z_k^{(P')}]^T$ that is in a subspace with $P'$ dimensions with $P' \le P$:

$$z_k^{(d)} = \sum_{m\in\mathcal{N}} [\mathbf{Q}]_{m,d}\, y_{k+m}, \qquad d = 1, \ldots, P'. \qquad (9)$$

The only adaptation to the NLM algorithm is to redefine the weights as

$$w_{k,l} = \exp\left(-\frac{\|\mathbf{z}_k - \mathbf{z}_l\|^2}{2P'h^2}\right). \qquad (10)$$

The use of dimensionality reduction ($P' < P$) can significantly speed up the algorithm. Here we extend the derivation of the SURE for the case of PCA-based NLM. The first step consists of deriving the divergence term.
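The patch-PCA step described above can be sketched as follows; the routine computes the eigenvectors of the demeaned patch covariance and keeps the leading components (function and variable names are ours):

```python
import numpy as np

def pca_patch_projection(y, half_patch=1, n_dims=4):
    """Project every patch of the image onto its n_dims leading
    principal components; the projected vectors are then what the
    modified NLM weights compare instead of raw patches."""
    H, W = y.shape
    p = 2 * half_patch + 1
    yp = np.pad(y, half_patch, mode="reflect")
    # all P-dimensional patches as rows of an (H*W, P) matrix
    patches = np.stack([yp[i:i + p, j:j + p].ravel()
                        for i in range(H) for j in range(W)])
    mu = patches.mean(axis=0)
    cov = np.cov(patches - mu, rowvar=False)   # demeaned patch covariance
    evals, evecs = np.linalg.eigh(cov)         # ascending eigenvalues
    Q = evecs[:, ::-1][:, :n_dims]             # leading n_dims components
    z = (patches - mu) @ Q                     # projected coordinates
    return z.reshape(H, W, n_dims), Q
```

Since $\mathbf{Q}$ has orthonormal columns, distances between the reduced vectors lower-bound the full patch distances; this is what makes the comparison both cheaper and more robust to noise.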


Fig. 1. Results for the “peppers256” test image corrupted with additive white noise of $\sigma = 20$. (a) Original image. (b) Noisy image. (c) Result obtained with BM3D [19]. Without dimensionality reduction (no PCA): (d) 1 NLM, exhaustive optimization; (e) 12 NLMs with fixed parameters, optimal linear expansion using SURE; (f) 12 NLMs with Monte-Carlo generated parameters, optimal linear expansion using SURE. With dimensionality reduction (6 PCs): (g) 1 NLM, exhaustive optimization; (h) 12 NLMs with fixed parameters, optimal linear expansion using SURE; (i) 12 NLMs with Monte-Carlo generated parameters, including PCA dimensionality $P'$, optimal linear expansion using SURE.

Proposition 1 (Divergence of NLM With Linear Transform): The individual terms of the divergence $\mathrm{div}_{\mathbf{y}}\{\hat{\mathbf{x}}\}$ of the NLM algorithm after transforming the neighborhoods according to (9) are given by

$$\frac{\partial \hat{x}_k}{\partial y_k} = \frac{1}{C_k}\left[1 - \frac{1}{P'h^2}\sum_{l\in\mathcal{S}_k} w_{k,l}\,(y_l - \hat{x}_k)\sum_{d=1}^{P'}\left(z_k^{(d)} - z_l^{(d)}\right)[\mathbf{Q}]_{0,d}\right]. \qquad (11)$$

Given our vector-indexing, it is important to note that the element $[\mathbf{Q}]_{0,d}$ corresponds to the weight of the projection matrix $\mathbf{Q}$ for the center position in the neighborhood contributing to the projection on the $d$th component.

Proposition 2 (SURE for NLM With Linear Transform): The SURE for the NLM algorithm can be expressed as

$$\mathrm{SURE}\{\hat{\mathbf{x}}\} = \frac{1}{N}\|\mathbf{y} - \hat{\mathbf{x}}\|^2 - \sigma^2 + \frac{2\sigma^2}{N}\sum_{k=1}^{N}\frac{1}{C_k}\left[1 - \frac{1}{P'h^2}\sum_{l\in\mathcal{S}_k} w_{k,l}\,(y_l - \hat{x}_k)\sum_{d=1}^{P'}\left(z_k^{(d)} - z_l^{(d)}\right)[\mathbf{Q}]_{0,d}\right]. \qquad (12)$$


Fig. 2. Results for the “lena512” test image corrupted with additive white noise of $\sigma = 20$. (a) Noisy image. (b) Result obtained with BM3D [19], PSNR 33.05 dB. (c) Best result obtained with the proposed method using dimensionality reduction (6 PCs) and 12 NLMs with Monte-Carlo generated parameters, including PCA dimensionality $P'$, optimal linear expansion using SURE, PSNR 32.53 dB.

Note that this expression is valid for any linear transformation of the neighborhood, with or without dimensionality reduction. The computational complexity for obtaining SURE has increased with respect to (8); i.e., it is now $O(N\cdot K\cdot P')$, which is of the same order as the NLM algorithm. However, the operations of (12) can be incorporated in the core loop of the NLM algorithm.

D. Selection of Best NLM

Using the proposed SURE for NLM, we are able to compare the performance of NLMs with different parameter sets $\{P, P', K, h\}$, in order to improve the denoising capabilities.

E. Linear Expansion of Multiple NLMs

Another possibility is inspired by the approach from [31]: we consider the linear combination of the outputs of several NLMs with different fixed parameter sets, and we optimize these linear weights by SURE to hopefully exceed the performance of each NLM taken individually. Specifically, we consider the linear expansion approach as

$$\hat{\mathbf{x}} = \sum_{j=1}^{J} a_j\,\hat{\mathbf{x}}^{(j)}. \qquad (13)$$

In our case, $\hat{\mathbf{x}}^{(j)}$ is the $j$th NLM with parameter set $\Theta_j = \{P_j, P'_j, K_j, h_j\}$ and $a_j$ is the weight in the linear expansion. The optimal weights are obtained by minimizing the SURE of the linear combination. From (6)

$$\mathrm{SURE}\{\hat{\mathbf{x}}\} = \frac{1}{N}\Bigl\|\sum_{j=1}^{J} a_j\,\hat{\mathbf{x}}^{(j)} - \mathbf{y}\Bigr\|^2 - \sigma^2 + \frac{2\sigma^2}{N}\sum_{j=1}^{J} a_j\,\mathrm{div}_{\mathbf{y}}\{\hat{\mathbf{x}}^{(j)}\} \qquad (14)$$

the partial derivatives towards the weights $a_j$ are then given by

$$\frac{\partial\,\mathrm{SURE}\{\hat{\mathbf{x}}\}}{\partial a_j} = \frac{2}{N}\left[\sum_{i=1}^{J} a_i\,\bigl\langle \hat{\mathbf{x}}^{(i)}, \hat{\mathbf{x}}^{(j)}\bigr\rangle - \bigl\langle \mathbf{y}, \hat{\mathbf{x}}^{(j)}\bigr\rangle + \sigma^2\,\mathrm{div}_{\mathbf{y}}\{\hat{\mathbf{x}}^{(j)}\}\right] \qquad (15)$$

which leads to the following system of equations:

$$\sum_{i=1}^{J} a_i\,\bigl\langle \hat{\mathbf{x}}^{(i)}, \hat{\mathbf{x}}^{(j)}\bigr\rangle = \bigl\langle \mathbf{y}, \hat{\mathbf{x}}^{(j)}\bigr\rangle - \sigma^2\,\mathrm{div}_{\mathbf{y}}\{\hat{\mathbf{x}}^{(j)}\}, \qquad j = 1, \ldots, J \qquad (16)$$

where the derivation of the SURE provides us with the divergence terms. We can then find the linear weights that optimize the SURE of the linear expansion efficiently.
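Solving the system (16) amounts to a small dense linear solve in the number $J$ of NLMs; a sketch in NumPy, where `x_hats` and `divs` are assumed to come from $J$ independent NLM runs and their divergence terms:

```python
import numpy as np

def let_weights(y, x_hats, divs, sigma2):
    """Optimal expansion weights a_j from the normal equations (16):
    sum_i a_i <x_i, x_j> = <y, x_j> - sigma^2 * div_j, j = 1..J."""
    A = np.array([[float(np.sum(xi * xj)) for xj in x_hats]
                  for xi in x_hats])                 # Gram matrix of outputs
    b = np.array([float(np.sum(y * xj)) - sigma2 * dj
                  for xj, dj in zip(x_hats, divs)])  # right-hand side
    return np.linalg.solve(A, b)
```

Appending the noisy image itself to `x_hats` (the identity operator) is then a one-line change, which is how the central-pixel correction of Section III is realized.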

III. RESULTS AND DISCUSSION

We describe how SURE for NLM can be successfully deployed for automatic parameter selection following various optimization

TABLE I
HEURISTIC CHOICE OF PARAMETERS FOR EACH OF THE NLMS WHEN PERFORMING LINEAR EXPANSION OF MULTIPLE ONES

strategies. In the various experiments, the parameter space that we will sample from is as follows:

• neighborhood $\mathcal{N}$: 3×3, 5×5, and 7×7, so $P = 9, 25, 49$;
• dimensionality of the projection: $P' = 1, 2, \ldots, P$;
• search region $\mathcal{S}$: a range of window sizes, with $K$ the corresponding number of contributing pixels;
• smoothing parameter $h$: a range of equally spaced values.

All results discussed in detail below are summarized in Table II. We also show some visual examples for the “peppers” test image, Fig. 1(a), corrupted with additive Gaussian noise of $\sigma = 20$, Fig. 1(b), and the “lena” test image corrupted with noise of $\sigma = 20$, Fig. 2(a).

A. Exhaustive Optimization

As a starting point, we perform an exhaustive optimization for a single NLM to determine the best parameters $\{P, K, h\}$ for $P' = P$ (which corresponds to no PCA dimensionality reduction) and $P' = 6$, respectively. The global optimum within this parameter space is found by choosing the settings corresponding to the best SURE-PSNR, which coincides with the optimal setting for the ground-truth PSNR for all test images and noise levels. Moreover, the SURE-PSNR was always within 0.10 dB of the true one; see the results in the rows “1 NLM” in Table II. The best parameter setting of the NLM varied with the test image and with the noise level, which indicates the importance of a data-adaptive strategy such as obtained using SURE. It is surprising


TABLE II
PERFORMANCE OF THE VARIOUS APPROACHES AS MEASURED BY PSNR AND SURE-PSNR (BETWEEN PARENTHESES)

that dimensionality reduction of the patches onto 6 principal components (PCs) improved the performance by 0.5–1.1 dB. Visually, this difference is also striking, as can be observed by comparing Fig. 1(d) and (g). Moreover, next to the performance gain, the computational complexity of PCA-NLM with 6 PCs is reduced by almost one order of magnitude (a factor of 6/49).

One way to improve the NLM method is to change the weight of the central pixel, which is overestimated in the classical NLM formulation; i.e., $w_{k,k}$ is always at the maximum of one, independently of the noise level, since the same two noise realizations are compared. This possibility has been mentioned by various authors and solved in different ways; e.g., the NLM weights can be estimated using the SURE principle as in [32]. Here, we propose to add the original (noisy) image to the set of images of the linear expansion. This image’s weight will be determined by SURE and turns out to be negative in practice, in order to lower the importance of the central pixel.¹ This way we easily obtain the optimal weights of the two contributions that result in the best performance. Despite the fact that providing the noisy image improves the

¹Providing the noisy image itself to the linear expansion corresponds to adding the output of the identity operator, for which $\mathrm{div}_{\mathbf{y}}\{\mathbf{y}\} = N$ holds.

results for the NLM without dimensionality reduction, especially for high noise levels, the PCA-NLM method does not improve by doing so. This can be explained by the fact that the projection onto the most important components of the patches automatically also removes the sole influence of the central weight, because none of the PC vectors will be localized at the central pixel of the patch. These results are listed in the corresponding rows of Table II.

B. Heuristic Optimization

Given the improvement obtained by adding a proportion of the original noisy image, with the weight determined by the SURE, it is tempting to add more NLMs to the linear expansion. However, finding the optimal parameters of all NLMs jointly becomes unfeasible. Therefore, in the next series of experiments, we verify how the performance can be further improved by linearly combining the outputs of multiple NLMs, each one with a predefined parameter set. In particular, when more NLMs are added, the parameter set is chosen according to Table I; i.e., we used 3, 6, and 12 NLMs, respectively, each time together with the original noisy image. The results are listed in the corresponding rows of Table II. For the case without dimensionality reduction, increasing


Fig. 3. Average PSNR (with respect to ground truth) over 10 realizations of NLM linear expansions with 12 NLMs and Monte-Carlo generated parameters $\{P, K, h\}$. The SURE-PSNR (not indicated) is very close to the true PSNR (within 0.10 dB). (a) Lower noise level. (b) Higher noise level.

the number of NLMs always improved the results. For the case with dimensionality reduction, the improvement from 6 to 12 NLMs becomes less significant. In some cases (e.g., “house” with dimensionality reduction), the SURE-PSNR improved but the true PSNR decreased. We believe that this is due to an overfitting of the linear expansion, especially for a simple image such as “house” where the neighborhoods have a relatively low dimensionality. In this case, the difference between the true PSNR and the SURE-PSNR increases, as does the dynamic range of the weights of the linear expansion (typically below 0.50). An example of “peppers” using 12 NLMs is shown in Fig. 1(e) and (h), without and with dimensionality reduction, respectively.

C. Monte-Carlo Optimization

From the optimal weights of the NLM linear expansions, we could not identify a clear trend that would be indicative of the “right” NLM parameters to use; i.e., there is a large variability for different images and noise levels. Therefore, we consider another experiment where all NLM parameters $\{P, P', K, h\}$ are randomly generated according to a uniform distribution within the ranges of neighborhood size, dimensionality of the neighborhood after PCA, search region, and smoothing parameter defined at the beginning of Section III.
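A uniform sampler of such parameter sets might look as follows; the exact ranges used in the paper are not reproduced here, so the bounds below are purely illustrative:

```python
import numpy as np

def sample_nlm_parameters(rng=None, n_sets=12):
    """Draw n_sets random NLM parameter sets (patch radius, PCA
    dimensionality, search radius, smoothing h) from uniform ranges;
    all bounds are illustrative placeholders, not the paper's."""
    rng = np.random.default_rng(rng)
    sets = []
    for _ in range(n_sets):
        half_patch = int(rng.integers(1, 4))       # 3x3 .. 7x7 patches
        P = (2 * half_patch + 1) ** 2
        p_dims = int(rng.integers(1, P + 1))       # 1 .. P components
        half_search = int(rng.integers(4, 11))     # 9x9 .. 21x21 search
        h = float(rng.uniform(0.05, 1.0))
        sets.append({"half_patch": half_patch, "p_dims": p_dims,
                     "half_search": half_search, "h": h})
    return sets
```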

1) Selection of Best NLM: We now take one step back and select the best performing NLM, according to its SURE, out of 120 realizations. This amounts to parameter optimization by random sampling of the parameter space $\{P, P', K, h\}$. The results are indicated in the corresponding rows of Table II. We observe that in almost all cases the performance of the exhaustive search (with fixed $P'$) is not reached. This suggests that random sampling of the parameter space remains suboptimal, despite the high number (120) of realizations.

2) Linear Expansion of Multiple NLMs: We have seen before, when combining multiple NLMs with heuristic parameter sets by SURE-based linear expansion, that the diversity of the various contributions is more important than their individual quality. Therefore, we now use random parameters for 3, 6, and 12 NLMs; the results are again listed in Table II. Each time, the best performance out of 10 Monte-Carlo realizations, as indicated by the SURE-PSNR, is reported. Interestingly, this simple method outperforms both the best single NLM and the NLM linear expansion with fixed parameters: combining only 3 NLMs often reached or improved upon the results of 12 NLMs with fixed parameters. The results for “peppers” are shown in Fig. 1(f) and (i). Despite the improved PSNR, visual observation of the images reveals a grainy appearance, which is probably due to contributions of NLMs with small neighborhoods. As a comparison, the result obtained by the

Fig. 4. Computation time of a single NLM algorithm for a 256×256 image as a function of the neighborhood dimensionality $P'$ and the search region size $K$.

state-of-the-art algorithm BM3D [19], which uses basis functions that are better adapted to edges, is shown in Fig. 1(c). For natural images such as “lena,” this difference is less obvious; e.g., see Fig. 2(b) and (c).

D. On the Dimensionality Reduction

We have observed that using PCA to reduce the dimensionality of the neighborhood is beneficial for both quality and computational complexity. The optimal number of dimensions is still dependent on the image content and the noise level, which explains why the linear expansion of NLMs with Monte-Carlo generated parameters is advantageous. In Fig. 3, we plot the average PSNR as a function of the number of dimensions for a linear expansion of 12 NLMs with Monte-Carlo generated parameters.
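The dimensionality reduction itself can be illustrated by assembling all neighborhoods into a patch matrix and projecting onto the leading principal components; a small numpy sketch (patch and image sizes are illustration values of ours):

```python
import numpy as np

def pca_project_patches(img, half=2, d=6):
    # Collect all (2*half+1)^2 overlapping neighborhoods of a 2-D image and
    # project them onto their first d principal components.
    k = 2 * half + 1
    H, W = img.shape
    patches = np.stack([img[i:i + k, j:j + k].ravel()
                        for i in range(H - k + 1)
                        for j in range(W - k + 1)])
    mean = patches.mean(axis=0)
    X = patches - mean
    # SVD of the centered patch matrix; rows of Vt are principal directions,
    # sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:d].T            # reduced coordinates, one row per patch
```

NLM then compares these d-dimensional coordinates instead of the full patches, which is where the computational saving comes from.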

E. On the Computational Complexity

Finally, we briefly discuss the computational complexity of the proposed method, which was implemented in Matlab (R2010b) using C for the core calculations (Intel Core 2 Duo, 2.66 GHz; 4 GB RAM). The dimensionality of the neighborhood (possibly after projection) and the search region are the two main parameters that influence the computation time. Therefore, in Fig. 4, we plot the computation time of a single NLM for a 256 × 256 image as a function of these two parameters. Compared to the main NLM algorithm, the computational load of the divergence term for the SURE calculation and of the optimal weights of the linear expansion (when combining multiple NLMs) is negligible.



[Eq. (18): the derivative of the weights with respect to the pixel value, displayed at the top of the page.]

Note that our best results, taking the best realization out of 10 Monte-Carlo parameter sets for 12 NLMs, require 120 NLM evaluations, which typically adds up to 15–20 minutes. However, this type of algorithm lends itself to parallel implementation in an almost trivial way.
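The parallelization is trivial because the evaluations share no state: a pool simply maps the denoiser over the parameter sets. An illustrative sketch (`denoise` stands for any NLM routine; a thread pool is shown, which is adequate when the core computation runs in C and releases the GIL, and a process pool works the same way):

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def run_all_nlm(noisy, sigma, param_sets, denoise, workers=4):
    # One independent task per parameter set; results come back in order.
    job = partial(denoise, noisy, sigma)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(job, param_sets))
```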

IV. CONCLUSION

We derived the SURE for the NLM algorithm with linear projection of the neighborhoods. The key feature of this derivation is the explicit analytical form of the divergence term of NLM, which is surprising for a nonlinear algorithm. The SURE can be easily computed on-the-fly as part of the original NLM algorithm.
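For reference, the risk estimate in question has the standard SURE form for i.i.d. Gaussian noise of variance $\sigma^2$ (notation ours):

```latex
\mathrm{SURE}(\mathbf{y}) = \frac{1}{N}\,\|\mathbf{y}-F(\mathbf{y})\|^2
  - \sigma^2 + \frac{2\sigma^2}{N}\,\operatorname{div}_{\mathbf{y}} F(\mathbf{y})
```

where $\operatorname{div}_{\mathbf{y}} F = \sum_k \partial F_k/\partial y_k$ is the divergence term whose analytical form is derived in the Appendix.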

The parameter setting of NLM depends on image content and noise level. Therefore, the SURE is a useful measure to estimate and tune these parameters. Next to exhaustive optimization, we considered a linear expansion of multiple NLMs. We obtained the best performance for a linear combination of 12 NLMs using Monte-Carlo generated parameter sets. These results are close to state-of-the-art denoising schemes while relying on the relatively simple NLM algorithm and the SURE-based optimization of the linear weights.

Future work could further investigate the optimal structure of the NLM parameter settings. Promising avenues also include the use of different linear projections of the neighborhoods (e.g., to improve invariance to some features) and the further development of a spatially adaptive version of NLM, including speeding up the algorithm [12]. Finally, the reprojection method from [33] could be incorporated to improve visual quality once the optimal parameter set is determined.

APPENDIX A
DERIVATION OF THE DIVERGENCE TERM

To obtain the divergence term, we introduce the distance between the (projected) neighborhoods and derive it with respect to the pixel value, which results in

[Eq. (17): the derivative of the inter-neighborhood distance.]

Further on, deriving the weights gives (18), shown at the top of the page. By using the previous relations, we obtain the constituting term of the divergence as in (11). The divergence finally follows by combining the previous relation with (7).
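Only the generic chain-rule structure of this step can be restated here, in notation of our own (not necessarily the paper's): for Gaussian weights $w_{k,l} = \exp\!\left(-d_{k,l}/(2h^2)\right)$ built from a distance $d_{k,l}$ between (projected) neighborhoods,

```latex
\frac{\partial w_{k,l}}{\partial y_k}
  = -\frac{w_{k,l}}{2h^2}\,\frac{\partial d_{k,l}}{\partial y_k}
```

i.e., the weight derivative in (18) is the distance derivative of (17) scaled by the weight itself.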

ACKNOWLEDGMENT

The authors would like to thank Dr. G. Peyré for making his Matlab/C implementation of the NLM algorithm available to the community. The modified source code is available on request.

REFERENCES

[1] A. Buades, B. Coll, and J. Morel, "A review of image denoising algorithms, with a new one," SIAM Interdisciplinary J.: Multiscale Model. Simulat., vol. 4, no. 2, pp. 490–530, 2005.

[2] M. Mahmoudi and G. Sapiro, "Fast image and video denoising via nonlocal means of similar neighborhoods," IEEE Signal Process. Lett., vol. 12, no. 12, pp. 839–842, Dec. 2005.

[3] P. Coupé, P. Yger, S. Prima, P. Hellier, C. Kervrann, and C. Barillot, "An optimized blockwise nonlocal means denoising filter for 3-D magnetic resonance images," IEEE Trans. Med. Imag., vol. 27, no. 4, pp. 425–441, Apr. 2008.

[4] A. Dauwe, B. Goossens, H. Luong, and W. Philips, "A fast non-local image denoising algorithm," in Proc. SPIE Electron. Imag., 2008, vol. 6812, pp. 681210–681210-8.

[5] T. Brox, O. Kleinschmidt, and D. Cremers, "Efficient nonlocal means for denoising of textural patterns," IEEE Trans. Image Process., vol. 17, no. 7, pp. 1083–1092, Jul. 2008.

[6] N. Azzabou, N. Paragios, and F. Guichard, "Image denoising based on adapted dictionary computation," in Proc. IEEE Int. Conf. Image Process. (ICIP), 2007, vol. 3, pp. 109–112.

[7] J. Orchard, M. Ebrahimi, and A. Wong, "Efficient nonlocal-means denoising using the SVD," in Proc. IEEE Int. Conf. Image Process. (ICIP), 2008, pp. 1732–1735.

[8] J. Wang, Y. Guo, Y. Ying, Y. Liu, and Q. Peng, "Fast non-local algorithm for image denoising," in Proc. IEEE Int. Conf. Image Process. (ICIP), 2006, pp. 1429–1432.

[9] B. Goossens, H. Luong, A. Pizurica, and W. Philips, "An improved non-local denoising algorithm," in Proc. Int. Workshop Local Non-Local Approximat. Image Process. (LNLA), 2008, pp. 143–156.

[10] J. Darbon, A. Cunha, T. F. Chan, S. Osher, and G. J. Jensen, "Fast nonlocal filtering applied to electron cryomicroscopy," in Proc. 5th IEEE Int. Symp. Biomed. Imag.: Nano Macro (ISBI 2008), Paris, France, May 14–17, 2008, pp. 1331–1334.

[11] R. Vignesh, B. T. Oh, and C. Kuo, "Fast non-local means (NLM) computation with probabilistic early termination," IEEE Signal Process. Lett., vol. 17, no. 3, pp. 277–280, Mar. 2010.

[12] V. Duval, J.-F. Aujol, and Y. Gousseau, "On the parameter choice for the non-local means," HAL, Tech. Rep. HAL-00468856, 2010.

[13] C. Kervrann and J. Boulanger, "Optimal spatial adaptation for patch-based image denoising," IEEE Trans. Image Process., vol. 15, no. 10, pp. 2866–2878, Oct. 2006.

[14] P. Chatterjee and P. Milanfar, "A generalization of non-local means via kernel regression," in Proc. SPIE Electron. Imag., 2008, vol. 6814, pp. 68140P–68140P-9.

[15] G. Peyré, "Image processing with non-local spectral bases," SIAM Multiscale Model. Simulat., vol. 7, no. 2, pp. 703–730, 2008.

[16] T. Tasdizen, "Principal components for non-local means image denoising," in Proc. 15th IEEE Int. Conf. Image Process. (ICIP), San Diego, CA, 2008, pp. 1728–1731.

[17] S. Zimmer, S. Didas, and J. Weickert, "A rotationally invariant block matching strategy improving image denoising with non-local means," in Proc. Int. Workshop Local Non-Local Approximat. Image Process., 2008, pp. 135–142.

[18] V. Doré and M. Cheriet, "Robust NL-means filter with optimal pixel-wise smoothing parameter for statistical image denoising," IEEE Trans. Signal Process., vol. 57, no. 5, pp. 1703–1716, May 2009.

[19] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, "Image denoising by sparse 3-D transform-domain collaborative filtering," IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080–2095, Aug. 2007.

[20] T. Tasdizen, "Principal neighborhood dictionaries for nonlocal means image denoising," IEEE Trans. Image Process., vol. 18, no. 12, pp. 2649–2660, Dec. 2009.

[21] J. L. Horn, "A rationale and test for the number of factors in factor analysis," Psychometrika, vol. 30, no. 2, pp. 179–185, 1965.

[22] C. Stein, "Estimation of the mean of a multivariate normal distribution," Ann. Statist., vol. 9, pp. 1135–1151, 1981.

[23] K. C. Li, "From Stein's unbiased risk estimates to the method of generalized cross validation," Ann. Statist., vol. 13, no. 4, pp. 1352–1377, 1985.

[24] V. Solo, "A SURE-fired way to choose smoothing parameters in ill-conditioned inverse problems," in Proc. IEEE Int. Conf. Image Process. (ICIP), 1996, vol. 3, pp. 89–92.

[25] C. Vonesch, S. Ramani, and M. Unser, "Recursive risk estimation for non-linear image deconvolution with a wavelet-domain sparsity constraint," in Proc. IEEE Int. Conf. Image Process. (ICIP 2008), San Diego, CA, Oct. 12–15, 2008, pp. 665–668.


[26] D. L. Donoho and I. M. Johnstone, "Adapting to unknown smoothness via wavelet shrinkage," J. Amer. Statist. Assoc., vol. 90, no. 432, pp. 1200–1224, 1995.

[27] A. Benazza-Benyahia and J.-C. Pesquet, "Building robust wavelet estimators for multicomponent images using Stein's principle," IEEE Trans. Image Process., vol. 14, no. 11, pp. 1814–1830, Nov. 2005.

[28] F. Luisier, T. Blu, and M. Unser, "A new SURE approach to image denoising: Interscale orthonormal wavelet thresholding," IEEE Trans. Image Process., vol. 16, no. 3, pp. 593–606, Mar. 2007.

[29] S. Ramani, T. Blu, and M. Unser, "Monte-Carlo SURE: A black-box optimization of regularization parameters for general denoising algorithms," IEEE Trans. Image Process., vol. 17, no. 9, pp. 1540–1554, Sep. 2008.

[30] D. Van De Ville and M. Kocher, "SURE-based non-local means," IEEE Signal Process. Lett., vol. 16, no. 11, pp. 973–976, Nov. 2009.

[31] T. Blu and F. Luisier, "The SURE-LET approach to image denoising," IEEE Trans. Image Process., vol. 16, no. 11, pp. 2778–2786, Nov. 2007.

[32] J. Salmon, "On two parameters for denoising with non-local means," IEEE Signal Process. Lett., vol. 17, no. 3, pp. 269–272, Mar. 2010.

[33] J. Salmon and Y. Strozecki, "From patches to pixels in non-local methods: Weighted-average reprojection," in Proc. 17th IEEE Int. Conf. Image Process. (ICIP), 2010, pp. 1929–1932.

Fast Bilateral Filter With Arbitrary Range and Domain Kernels

Bahadir K. Gunturk, Senior Member, IEEE

Abstract—In this paper, we present a fast implementation of the bilateral filter with arbitrary range and domain kernels. It is based on the histogram-based fast bilateral filter approximation that uses a uniform box as the domain kernel. Instead of a single box kernel, multiple box kernels are used and optimally combined to approximate an arbitrary domain kernel. The method achieves a better approximation of the bilateral filter than the single-box-kernel version, with little increase in computational complexity. We also derive the optimal kernel size when a single box kernel is used.

Index Terms—Image enhancement, nonlinear filtering.

I. INTRODUCTION

The bilateral filter is a nonlinear weighted averaging filter, where the weights depend on both the spatial distance and the intensity distance with respect to the center pixel. The main feature of the bilateral filter is its ability to preserve edges while doing spatial smoothing. The term "bilateral filter" was first used by Tomasi and Manduchi in [1]; the same filter was earlier called the SUSAN (Smallest Univalue Segment Assimilating Nucleus) filter by Smith and Brady in [2]. Variants of the bilateral filter were published even earlier, as the sigma filter [3] and the neighborhood filter [4].

Manuscript received April 14, 2010; revised November 02, 2010; accepted February 18, 2011. Date of publication March 14, 2011; date of current version August 19, 2011. This work was supported in part by the National Science Foundation under Grant 0528785 and the National Institutes of Health under Grant 1R21AG032231-01. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Luminita Aura Vese.

The author is with the Department of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803 USA (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIP.2011.2126585

At a pixel location $\mathbf{x} = (x_1, x_2)$, the output of the bilateral filter is calculated as follows:

$$\tilde{I}(\mathbf{x}) = \frac{1}{C(\mathbf{x})} \sum_{\mathbf{y} \in N(\mathbf{x})} G_d(\mathbf{x}-\mathbf{y})\, G_r\big(I(\mathbf{y})-I(\mathbf{x})\big)\, I(\mathbf{y}) \quad (1)$$

where $G_d(\cdot)$ is the spatial domain kernel, $G_r(\cdot)$ is the intensity range kernel, $N(\mathbf{x})$ is the set of pixels within a spatial neighborhood of $\mathbf{x}$, and $C(\mathbf{x})$ is the normalization term

$$C(\mathbf{x}) = \sum_{\mathbf{y} \in N(\mathbf{x})} G_d(\mathbf{x}-\mathbf{y})\, G_r\big(I(\mathbf{y})-I(\mathbf{x})\big). \quad (2)$$

The kernels $G_d(\cdot)$ and $G_r(\cdot)$ determine how the spatial and intensity differences are treated. The contribution (weight) of a pixel $I(\mathbf{y})$ is determined by the product of $G_d$ and $G_r$. The bilateral filter in [1] uses the Gaussian kernel, $G_\sigma(x) = \exp(-x^2/2\sigma^2)$, for both the domain and range kernels:

$$G_{\sigma_d}(\mathbf{x}-\mathbf{y}) = \exp\left(-\frac{\|\mathbf{x}-\mathbf{y}\|^2}{2\sigma_d^2}\right) \quad (3)$$

and

$$G_{\sigma_r}\big(I(\mathbf{y})-I(\mathbf{x})\big) = \exp\left(-\frac{\big(I(\mathbf{y})-I(\mathbf{x})\big)^2}{2\sigma_r^2}\right). \quad (4)$$

On the other hand, the sigma filter [3] and the neighborhood filter [4] use different kernels. The sigma filter [3] first calculates the local standard deviation around $\mathbf{x}$; the standard deviation is then used to determine a threshold on pixel intensities, and pixels that are within the threshold of the center pixel $I(\mathbf{x})$ are averaged (with equal weights) to calculate the filter output at that pixel. In the case of the neighborhood filter [4], the range kernel is a Gaussian as in (4), and the spatial kernel is a uniform box kernel. Among the different kernel options, the Gaussian kernel is the most popular choice for both the range and spatial kernels, as it gives an intuitive and simple control of the behavior of the filter with two parameters, $\sigma_d$ and $\sigma_r$.
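The Gaussian bilateral filter of (1)–(4) can be written down directly; a straightforward (unoptimized) reference sketch in numpy, with parameter values chosen only for illustration:

```python
import numpy as np

def bilateral(img, sigma_d=2.0, sigma_r=0.1, radius=4):
    # Direct implementation of (1)-(4) with Gaussian domain and range
    # kernels: readable, not fast.
    H, W = img.shape
    out = np.zeros_like(img)
    ax = np.arange(-radius, radius + 1)
    dy, dx = np.meshgrid(ax, ax, indexing="ij")
    Gd = np.exp(-(dy**2 + dx**2) / (2 * sigma_d**2))       # domain kernel (3)
    pad = np.pad(img, radius, mode="reflect")
    for i in range(H):
        for j in range(W):
            win = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            Gr = np.exp(-(win - img[i, j])**2 / (2 * sigma_r**2))  # range (4)
            w = Gd * Gr
            out[i, j] = np.sum(w * win) / np.sum(w)        # (1) with norm (2)
    return out
```

The double loop makes the cost per pixel proportional to the window area, which is exactly what the fast approximations reviewed below avoid.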

The bilateral filter has found a wide range of applications in image processing and computer vision. The most immediate application of the bilateral filter is image denoising, as it can do spatial averaging without blurring edges. A multiresolution extension of the bilateral filter for image denoising, together with an empirical study on optimal parameter selection, is presented in [5]. It is shown that the optimal value of $\sigma_d$ is relatively insensitive to noise power, while the optimal $\sigma_r$ value is linearly proportional to the noise standard deviation. Other applications of the bilateral filter include tone mapping in high-dynamic-range imaging [6], contrast enhancement [7], [8], fusion of flash and no-flash images [9], [10], fusion of visible-spectrum and infrared-spectrum images [11], compression artifact reduction [12], 3-D mesh denoising [13], [14], depth map estimation [15], video stylization [16], video enhancement [17], texture and illumination separation [18], orientation smoothing [19], and optical flow estimation [20].

This paper presents a fast approximation of the bilateral filter with arbitrary range and domain kernels. It is based on a method presented by Porikli in [21]. The method in [21] (which uses a box domain kernel) is extended by optimally combining multiple box kernels to approximate an arbitrary domain kernel. As there is no restriction on the range kernel either, any range and domain kernels can be used with this fast bilateral filter implementation. Section II reviews the fast bilateral filter techniques in the literature. The proposed method is explained in Section III. In Section IV, the question of optimal kernel size in the case of a single box kernel is addressed. Section V provides experimental results, and Section VI concludes the paper.
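The core idea of combining box kernels can be previewed with a least-squares fit: choose weights so that a sum of concentric boxes matches a target Gaussian domain kernel as closely as possible. A hypothetical sketch (radii, sizes, and function names are ours, and this shows only the kernel-fitting step, not the full histogram-based filter):

```python
import numpy as np

def fit_boxes_to_gaussian(sigma_d=2.0, radii=(1, 3, 5), size=15):
    # Least-squares weights for concentric box (uniform) kernels
    # approximating a Gaussian domain kernel.
    ax = np.arange(size) - size // 2
    dy, dx = np.meshgrid(ax, ax, indexing="ij")
    target = np.exp(-(dy**2 + dx**2) / (2 * sigma_d**2)).ravel()
    boxes = np.stack([((np.abs(dy) <= r) & (np.abs(dx) <= r))
                      .astype(float).ravel() for r in radii])
    # Minimize ||target - coeffs^T boxes||^2 over the box weights.
    coeffs, *_ = np.linalg.lstsq(boxes.T, target, rcond=None)
    approx = coeffs @ boxes
    return coeffs, approx.reshape(size, size)
```

Because each box can be evaluated with a constant-time (histogram/integral) scheme, the weighted sum of a few boxes keeps the complexity close to the single-box case while fitting the target kernel much more closely.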

1057-7149/$26.00 © 2011 IEEE