Signal Processing 87 (2007) 799–810
www.elsevier.com/locate/sigpro

A novel approach for vector quantization using a neural network, mean shift, and principal component analysis-based seed re-initialization

Chin-Chuan Han (a), Ying-Nong Chen (b), Chih-Chung Lo (c), Cheng-Tzu Wang (d)

a Department of Computer Science and Information Engineering, National United University, Miaoli, Taiwan
b Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
c Department of Informatics, Fo Guang College of Humanities and Social Sciences, Ilan, Taiwan
d Department of Computer Science, National Taipei University of Education, Taipei, Taiwan

Received 6 December 2005; received in revised form 15 July 2006; accepted 7 August 2006
Available online 7 September 2006

doi:10.1016/j.sigpro.2006.08.006
Corresponding author: C.-C. Han. Tel.: +886 3 7381258; fax: +886 3 7354326. E-mail address: [email protected]
Abstract
In this paper, a hybrid approach for vector quantization (VQ) is proposed to obtain a better codebook. It modifies and improves the centroid neural network adaptive resonance theory (CNN-ART) and the enhanced Linde–Buzo–Gray (LBG) approaches to obtain a near-optimal solution. Three modules, a neural net (NN)-based clustering, a mean shift (MS)-based refinement, and a principal component analysis (PCA)-based seed re-initialization, are repeatedly applied in this study. Basically, the seed re-initialization module generates a new initial codebook to replace the low-utilized codewords during the iterations. The NN-based clustering module clusters the training vectors using a competitive learning approach, and the clustered results are refined using the mean shift operation. Experiments in image compression applications were conducted to show the effectiveness of the proposed approach.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Vector quantization; ELBG algorithm; Neural network; Mean shift; Principal component analysis
1. Introduction
Vector quantization (VQ) is an efficient and simple approach for data compression. It was derived from Shannon's rate-distortion theory [1] and has been successfully applied in image compression and speech coding. It encodes images by vectors instead of scalars to obtain better performance. In the procedure for image compression, VQ first
partitions an image into several blocks to form a vector set (the input vectors). The input vectors are individually quantized to the closest codeword in a codebook. This codebook (i.e., a set of codewords) is generated from a set of training vectors by clustering techniques. An image is encoded by the indices of the codewords and decoded by a table-look-up technique.
In general, VQ partitions a vector set X of size N_p into a codebook C of size N_c. That is, X = {x_1, x_2, …, x_{N_p}}, C = {c_1, c_2, …, c_{N_c}}, and the dimensions of all vectors are m. Here, m is the size of an encoded image block, and N_c ≪ N_p. To quantize
each vector x_k, the codeword c_j with the shortest distance d(x_k, c_j) between x_k and c_j is selected. The Euclidean distance is the most used metric, i.e., c_j = q(x_k) = argmin_{c_i ∈ C} ||x_k − c_i||. Assume a universal set U of all possible codebooks. The better codebook is obtained by minimizing the squared errors between the two sets X and C:

$$C = \arg\min_{q_j \in U} \left( \frac{1}{N_p} \sum_{k=1}^{N_p} d(x_k, q_j(x_k))^2 \right). \qquad (1)$$
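As an illustration (not from the original paper), the nearest-codeword encoding rule and the distortion of Eq. (1) can be sketched in NumPy as follows; the helper name `quantize` is hypothetical:

```python
import numpy as np

def quantize(X, C):
    """Assign each input vector to its nearest codeword (Euclidean metric)
    and return the mean squared quantization error of Eq. (1)."""
    # Pairwise squared distances between the N_p vectors and N_c codewords.
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)                          # index of nearest codeword
    distortion = d2[np.arange(len(X)), idx].mean()   # average of d(x_k, q(x_k))^2
    return idx, distortion

# Example: 4-dimensional vectors quantized with a 2-codeword codebook.
X = np.array([[0., 0., 0., 0.], [1., 1., 1., 1.], [0.9, 1., 1., 1.1]])
C = np.array([[0., 0., 0., 0.], [1., 1., 1., 1.]])
idx, D = quantize(X, C)
# idx → [0, 1, 1]; D is the mean squared quantization error
```

An encoded image is then just the array `idx`; decoding is the table look-up `C[idx]`.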
Linde et al. [1] proposed the famous Linde–Buzo–Gray (LBG) algorithm to find the codebook for image compression. However, the results of VQ methods are greatly affected by the initialization of the codebook; they are frequently trapped in a local solution, resulting in poor performance. Besides, the low-utilized codewords in the generated codebook distort the decoded images. These three problems are still present in the LBG algorithm. Patane and Russo [2] proposed the enhanced LBG (ELBG) algorithm to find an optimal codebook by shifting the low-utilized codewords to other ones with high utility. This is a split-and-merge-based algorithm for improving the utility of codewords. Kaukoranta et al. [3] also proposed an iterative algorithm combined with split and merge operations (ISM) for the generation of a codebook. Haber and Seidel [4] modified the LBG algorithm, called ILBG, to reduce the codebook errors by a few additional iteration steps. Huang et al. [5] proposed a novel algorithm to improve the codeword utility and the training performance by combining the genetic algorithm and the simulated annealing technique. The crossover and mutation operations were based on the simulated annealing process, and the local optimal solution was avoided.
Recently, many researchers have tried to solve the problems in finding the optimal solution using neural network-based learning approaches. Lin and Yu [6] proposed a centroid neural network adaptive resonance theory (CNN-ART) to improve the performance of VQ. CNN-ART, an unsupervised and competitive neural network model, considered the weights of the neurons as the codewords. Although CNN-ART relieved the dependence on the initialization of the codebook, it was still affected, as can be seen from the results. Besides, there is a low-utility problem in their proposed approach under poor initialization. Laha et al. [7] designed a codebook using a self-organizing feature map technique. During the training process, the weights of the nodes were considered to be the codewords, and all the node weights built up the codebook. However, block effects frequently distorted the reconstructed images. This problem was solved by a polynomial surface fitting technique [7].
On the other hand, researchers have tried to speed up the process of finding the optimal solution. The tree search vector quantizer (TSVQ) is a classical fast-search algorithm using a tree structure [8]. Chang and Lin [9] eliminated most of the impossible codewords to decrease the number of matches and speed up the searching process. Chan and Ma [10] proposed a fast maximum descent (MD) approach to quickly find the codebook. Pan et al. [11] modified the L2-norm pyramid encoding approach to decrease the searching time. Chen [12] utilized the fuzzy reasoning technique to predict the codewords and improve the searching performance. Huang and Chang [13] proposed a color finite-state LBG (CFSLBG) algorithm to reduce the searching time of the LBG algorithm.
Since the compression performance is greatly affected by the codebook, codebook generation plays a key role in VQ. In summary, four problems frequently occur in the clustering process: (1) the initial condition, (2) the local optimum solution, (3) the low-utilized codeword, and (4) the sequence of training vectors. Given a training set X = {x_1, x_2, …, x_{N_p}}, the N_p elements are partitioned into N_c subsets such that the sum of squared errors in Eq. (1) is minimized. There are approximately N_c^{N_p}/N_c! ways to partition them, so it is infeasible to find the optimal partition by an exhaustive search. Iterative optimization is the most frequently used approach: N_c initial codewords c_1, c_2, …, c_{N_c} are randomly guessed. At each iteration, the N_p training vectors are sequentially assigned to the nearest codeword, and each new codeword is re-calculated as the mean of all instances belonging to the same cluster. These two steps are repeated until the codewords stabilize. The iterative procedure guarantees local but not global optimization, and the solution found depends highly on the initial condition C_i and the sequence of training vectors x_1, x_2, …, x_{N_p}. These are the well-known problems of the initial condition, local optimization, and sample sequence. In addition to the sum-of-squared-errors criterion, the codeword utility is another indicator that evaluates the total distortion related to codeword c_j. The equalization of the codeword distortions is equivalent to the equalization of the codeword
utilities. According to the conclusions in Ref. [2], the aim of the clustering operations is to obtain equal distortion for each codeword in optimal VQ. Therefore, discarding the low-utility codewords and splitting the high-utility codewords into smaller ones are effective strategies for solving the low-utility problem.
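The two-step iterative optimization described above (assign, then recompute means) can be sketched as follows; this is a generic Lloyd-style illustration, not the authors' implementation, and the helper name `lloyd` is hypothetical:

```python
import numpy as np

def lloyd(X, C, iters=50):
    """Iterative codebook optimization: assign each training vector to its
    nearest codeword, then recompute each codeword as the mean of its
    assigned vectors, until the codewords stabilize."""
    lab = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        lab = d2.argmin(axis=1)                      # assignment step
        newC = np.array([X[lab == j].mean(axis=0) if np.any(lab == j) else C[j]
                         for j in range(len(C))])    # mean-update step
        if np.allclose(newC, C):
            break                                    # codewords have stabilized
        C = newC
    return C, lab

X = np.array([[0., 0.], [0., 1.], [10., 10.], [10., 11.]])
C0 = np.array([[0., 0.], [10., 10.]])
C, lab = lloyd(X, C0)
# C → [[0, 0.5], [10, 10.5]]; the result depends on the initial codebook C0
```

Starting from a poor `C0`, the same routine converges to a worse local solution, which is exactly the initialization problem discussed above.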
In this paper, a hybrid algorithm is proposed to find a better codebook for image compression applications. Three modules, a neural net (NN)-based clustering, a mean shift (MS)-based refinement, and a principal component analysis (PCA)-based seed re-initialization, were integrated in this study. The approach is abbreviated as PCA-NN-Meanshift (PNM). Initially, a codebook of size N_c was randomly generated, and these codewords were input as the weights of a NN. Next, the training vectors were clustered and the weights were adapted until a converged state. In addition, an MS-based operation was performed to refine the codeword in each cluster. However, a local optimal solution was frequently obtained, and codewords with low utility were frequently generated. A PCA-based process was therefore performed to analyze the sample distribution, and new seeds were generated to replace the codewords with low utility. These three modules were repeatedly executed to find a better solution. The rest of this paper is organized as follows. The backgrounds of ELBG and CNN-ART are briefly reviewed in Section 2. In Section 3, the three modules, NN-based clustering, MS-based refinement, and PCA-based seed re-initialization, are designed to find the better codebook. In Section 4, some experiments in the image compression application are presented to show the feasibility of the proposed method. Some conclusions are given in Section 5.
2. Background
In this paper, the ELBG and CNN-ART algorithms were improved to obtain a better codebook for VQ. Following is a brief description of the backgrounds of these two algorithms.
2.1. Enhanced LBG algorithm
An ELBG algorithm was proposed by Patane and Russo to improve the LBG method [2]. They applied a utility rate for each codeword to enhance the performance of the LBG algorithm. The main improvement was based on a shifting of codeword attempts (SoCA) procedure. Basically, a codeword c_l with a low utility rate was heuristically shifted to a nearby codeword c_h with a high rate. Suppose that the training set X assigned to codeword c_h was bounded in a hyper-box I = [x_{1l}, x_{1u}] × [x_{2l}, x_{2u}] × ⋯ × [x_{ml}, x_{mu}], and codeword c_h was the center vector of set X. Codewords c_l and c_h were recomputed as

$$c_l = \left[x_{1l} + \tfrac{1}{4}(x_{1u} - x_{1l}),\; x_{2l} + \tfrac{1}{4}(x_{2u} - x_{2l}),\; \ldots,\; x_{ml} + \tfrac{1}{4}(x_{mu} - x_{ml})\right]$$

and

$$c_h = \left[x_{1u} - \tfrac{1}{4}(x_{1u} - x_{1l}),\; x_{2u} - \tfrac{1}{4}(x_{2u} - x_{2l}),\; \ldots,\; x_{mu} - \tfrac{1}{4}(x_{mu} - x_{ml})\right].$$
All vectors in set X were re-clustered to these two new codewords; that is, the two codewords split the training vectors in set X. This is a split-and-merge process: the largest cluster was split into two clusters by the two new codewords c_l and c_h, and the smallest cluster was merged into another cluster. The SoCA procedure is summarized as follows:
1. Check the stopping criterion: check whether at least one codeword with a utility rate of less than 1 has been shifted in the previous iteration. If not, terminate the procedure.
2. Codeword selection: select two codewords, a codeword c_h with a utility rate greater than 1 and a codeword c_l with a rate less than 1, to execute the codeword shifting process.
3. Codeword shifting and local rearrangement: shift codeword c_l to codeword c_h. After that, adjust these two codewords by the traditional LBG algorithm under a termination criterion with a higher threshold.
4. Quantization error estimation: calculate the expected values of the quantization errors (QE) before and after shifting. If the QE value has decreased after shifting, confirm the shifting process. Otherwise, reset codewords c_l and c_h to their original positions.
Repeat the above steps until the stopping criterion is satisfied. The ELBG algorithm is used to solve the problem of the local optimal solution.
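The quarter-box recomputation of c_l and c_h in the SoCA shifting step can be sketched as follows (an illustrative fragment under the hyper-box description above; the helper name `shift_codewords` is hypothetical):

```python
import numpy as np

def shift_codewords(X_h):
    """Place c_l and c_h a quarter of the way in from the lower and upper
    corners of the hyper-box bounding the cluster of a high-utility codeword,
    as in the SoCA recomputation of Section 2.1."""
    lo, hi = X_h.min(axis=0), X_h.max(axis=0)   # hyper-box corners
    c_l = lo + 0.25 * (hi - lo)
    c_h = hi - 0.25 * (hi - lo)
    return c_l, c_h

X_h = np.array([[0., 0.], [4., 8.]])   # toy cluster bounded by [0,4] x [0,8]
c_l, c_h = shift_codewords(X_h)
# c_l → [1., 2.], c_h → [3., 6.]
```

The two returned codewords then re-cluster the vectors in the box, splitting the largest cluster in two.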
2.2. Centroid neural network adaptive resonance theory (CNN-ART)
Lin and Yu [6] proposed a novel algorithm, CNN-ART, to generate a codebook. It is an unsupervised competitive learning algorithm and an extension of Grossberg's adaptive resonance theory [14]. Basically, the CNN-ART algorithm considered the synaptic weight vectors in each
neuron as codewords. A gradient-descent-based algorithm was used as the weight updating rule. CNN-ART started with a single node, i.e., the size of the codebook was initialized to one. Under the competitive learning rules, the winner node was rewarded with a positive learning gain and the loser nodes were punished with a negative one. This is a threshold criterion-based clustering approach: a centroid node is added as a new cluster when the Euclidean distances between an input vector and all existing nodes are larger than a vigilance value. The CNN-ART approach repeated this incremental process until the codebook size reached N_c. The algorithm is summarized as follows:
1. Initialization: initialize the following variables: the codebook size N_c, the initial codebook C_0 = ∅, the training set X = {x_k : k = 1, 2, …, N_p}, a pre-defined threshold v, and the iteration index t = 0.
2. Clustering: given the codebook C_t at iteration t, calculate the Euclidean distances between vector x_k and the weights in the network. Assign vector x_k to the node with the smallest distance.
3. Node increment or weight updating: if the smallest distance is larger than threshold v, and the node number is smaller than N_c, generate a new node. Otherwise, update the weights of the winner with the rewarding rules and those of the losers with the punishing rules.
4. Check the stopping criterion: when the state is stable, the process is terminated.
5. Generate the codebook: if the codebook size is N_c and a converged and stable state has occurred, the codebook is assigned as the set of all nodes' synaptic weights.
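Steps 2 and 3 above can be sketched as a single presentation of one training vector. This is a simplified illustration under stated assumptions: the punishing rule for losers is omitted and the learning gain `lr` is an arbitrary constant; `cnn_art_step` is a hypothetical helper name:

```python
import numpy as np

def cnn_art_step(x, W, v, Nc, lr=0.1):
    """One CNN-ART presentation: grow a new node if x is farther than the
    vigilance v from every existing weight (and the node count allows it),
    otherwise reward the winner by moving it toward x."""
    if len(W) == 0:
        return [x.copy()]                     # first node is the first sample
    d = [np.linalg.norm(x - w) for w in W]
    j = int(np.argmin(d))
    if d[j] > v and len(W) < Nc:
        W.append(x.copy())                    # node increment
    else:
        W[j] = W[j] + lr * (x - W[j])         # reward the winner
    return W

W = []
for x in [np.array([0., 0.]), np.array([5., 5.]), np.array([0.2, 0.])]:
    W = cnn_art_step(x, W, v=1.0, Nc=4)
# three presentations yield two nodes: one near (0, 0) and one at (5, 5)
```

The vigilance value v controls how readily new nodes are created, which is why CNN-ART can still suffer a low-utility problem under poor settings.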
Given the scenarios of the ELBG and CNN-ART algorithms, four problems should be solved in the VQ process. To overcome these problems and to improve the performance of the ELBG and CNN-ART algorithms, a hybrid algorithm was developed as follows.
3. PNM for VQ
The architecture of PNM is composed of a PCA-based seed re-initialization module, an NN-based clustering module, and an MS-based refinement module. They were iteratively performed to find the optimal solution.
3.1. NN-based clustering
A simple MINNET network is a competitive learning algorithm. It determines the nearest distance between an input vector and the neurons' weight vectors at the output layer. The dimension of the input layer is m, the same as the number of each neuron's synaptic weights, because of the full connection between them. Therefore, the weights of a neuron are considered as a codeword. The main difference between the PNM and CNN-ART architectures is the number of initial neurons: CNN-ART repeatedly increases the neurons until N_c nodes (the codebook size) exist, whereas in PNM the N_c neurons are present from the start and their weights are randomly initialized. The functions of MINNET in PNM are the same as those in CNN-ART. All neurons are completely interconnected in MINNET. Each neuron receives the value from its own node and the lateral inhibition (−ε) from the other neurons. The output value O_j^{(t)} of the jth neuron at iteration t is thus given as follows:

$$O_j^{(t)} = f_t\!\left( O_j^{(t-1)} - \epsilon \sum_{i \ne j} O_i^{(t-1)} \right), \quad i, j = 1, 2, \ldots, N_c;\ t \ge 1, \qquad (2)$$

$$f_t(\beta) = \begin{cases} \beta & \text{if } \beta < 0, \\ 0 & \text{otherwise.} \end{cases} \qquad (3)$$

Here, the initial condition was set as

$$O_j^{(0)} = \| x_k - w_j \|. \qquad (4)$$

N_c denotes the number of neurons in the MINNET, and ε > 1/N_c. Each node was repeatedly compared with the other nodes until only one negative output was generated; meanwhile, the other neurons all output zero. The node with the negative output was the desired codeword.
Next, let us describe the learning rules in PNM. Similar to the rules in the CNN-ART and LBG algorithms, the rewarding equation is written below:

$$w_j^{(t)} = w_j^{(t-1)} + \frac{1}{|c_j| + 1}\left[x_k - w_j^{(t-1)}\right], \quad k = 1, 2, \ldots, N_p;\ j = 1, 2, \ldots, N_c, \qquad (5)$$

and the value 1/(|c_j| + 1) is its learning rate. The above learning approach plays a clustering role in PNM: the training vectors are sequentially clustered around the cluster centers (i.e., the codewords, the weights of the nodes).
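The rewarding rule of Eq. (5) can be illustrated directly (a minimal sketch; the helper name `reward` and the toy values are not from the paper):

```python
import numpy as np

def reward(w, x, cluster_size):
    """Rewarding rule of Eq. (5): move the winner's weight toward x_k with
    learning rate 1/(|c_j| + 1), where cluster_size is |c_j|."""
    return w + (x - w) / (cluster_size + 1)

w = np.array([0., 0.])
w = reward(w, np.array([2., 2.]), cluster_size=1)   # rate 1/2 → w = [1., 1.]
w = reward(w, np.array([4., 4.]), cluster_size=2)   # rate 1/3 → w = [2., 2.]
```

With this decreasing rate, the weight behaves like a running mean of the samples assigned to the node, which is consistent with the centroid interpretation of the codewords.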
3.2. MS-based refinement
After the NN-based clustering procedure, the MS operation [15] was performed on each cluster to refine the codeword. In this operation, the multivariate kernel density estimate within an observation window W was estimated as

$$\hat{f}(x) = \frac{1}{N_w h^m} \sum_{k=1}^{N_w} k_e\!\left(\frac{x - x_k}{h}\right). \qquad (6)$$

Here, x is an observation sample, and m, N_w, and h represent the window's dimensionality, the number of samples in window W, and the window's radius, respectively. The Epanechnikov kernel function was applied in this procedure as

$$k_e(x) = \begin{cases} \tfrac{1}{2} c_m^{-1} (m + 2)(1 - x^{\mathrm T} x) & \text{if } x^{\mathrm T} x < 1, \\ 0 & \text{otherwise,} \end{cases} \qquad (7)$$

where c_m is the volume of the unit sphere. The codeword shifted toward its next position by the density gradient of Eq. (6) as

$$M_h(x) = \frac{h^2}{m + 2} \frac{\hat{\nabla} f(x)}{\hat{f}(x)}, \qquad (8)$$

where ∇̂f(x) is a density gradient estimate. According to the conclusions in Ref. [15], the sample MS with a uniform kernel centered on x is the normalized gradient of a kernel density estimate. Consider a region S_h(x), a hypersphere of radius h centered at point x and containing N_w samples. The sample MS is defined as

$$M_h(x) = \frac{1}{N_w} \sum_{x_k \in S_h(x)} (x_k - x). \qquad (9)$$

The value h was set as half of the second eigenvalue λ_2 in this study. The MS vector M_h(x) is computed using the training samples belonging to codeword c_j, and the codeword is shifted to the next position. The moving process is repeated to refine the codeword.
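The sample mean-shift iteration of Eq. (9) can be sketched as follows (an illustrative implementation with a uniform kernel; the helper name `mean_shift` and the toy radius are assumptions, not the paper's settings):

```python
import numpy as np

def mean_shift(x, samples, h, iters=10):
    """Iterate the sample mean shift of Eq. (9): average the offsets of the
    samples inside the radius-h ball S_h(x) around x, and move x by that
    vector until the shift vanishes."""
    for _ in range(iters):
        inside = samples[np.linalg.norm(samples - x, axis=1) < h]
        if len(inside) == 0:
            break
        shift = (inside - x).mean(axis=0)   # M_h(x) of Eq. (9)
        x = x + shift
        if np.linalg.norm(shift) < 1e-6:
            break                           # converged to a density mode
    return x

samples = np.array([[0., 0.], [0.2, 0.], [0., 0.2], [5., 5.]])
c = mean_shift(np.array([0.5, 0.5]), samples, h=2.0)
# c converges to the dense mode near the origin, ignoring the outlier at (5, 5)
```

Applied per cluster, this moves each codeword toward a high-density position, which is exactly the refinement role the MS module plays in PNM.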
3.3. PCA-based seed re-initialization
Since the results of the CNN-ART algorithm are influenced by the initialization problem, a local optimal solution is obtained. Besides, codewords with low utility are frequently generated from the local optimal solution. As mentioned in the previous section, four problems frequently occur in many proposed approaches. In order to remedy these problems, a PCA-based seed re-initialization procedure was performed to generate a new codebook. Using this procedure, the influence of the initial conditions was reduced. In addition, the probability of finding an optimal solution was increased, and the number of low-utilized codewords was thus decreased. The clustering results of the NN-based and MS-based procedures were applied in the seed re-initialization module. Generally, the large clusters were split into several smaller clusters, and the small clusters with low utility were discarded and merged into bigger clusters. The splitting rules are designed as follows. Consider N_s split clusters and N_m merged clusters: N_m cluster centers were discarded, and N_m new cluster centers were generated for the N_s clusters, so on average N_m/N_s new cluster centers were generated for each larger cluster. The clusters were sorted by size; the largest N_s clusters were selected to be split and the smallest N_m clusters to be discarded in this module. In this study, the number of new centers depended on a generation rate r: N_m = rN_c new centers were generated in each iteration, and N_s = 3. In addition, the value r was adjusted according to the distortion d between two iterations. Next, the PCA-based strategy to generate new cluster centers is designed as follows.
3.3.1. PCA-based seed selection
PCA is a popular technique in many applications. Suppose there are M training samples possessing feature vectors of length m. Image blocks of size 4 by 4, i.e., m = 16, are frequently encoded in image compression. The covariance matrix S is defined as

$$S = \sum_{k=1}^{M} (x_k - \mu)(x_k - \mu)^{\mathrm T},$$

where μ denotes the sample center. The K major eigenvectors φ_i, corresponding to the K largest eigenvalues λ_i, capture the major distributions of these samples: the samples can be represented by the K eigenvectors, called bases, with minimal error. That means new centers located along the major bases have higher probabilities than randomly selected ones. Consider a specified cluster c_j of larger size which is split by N_m/N_s new centers. The samples belonging to cluster c_j generate the first K major bases, K = 3 in this study. The samples were projected onto a specified axis φ_i, i = 1, 2, 3, to obtain temporary points with scalar values. Those temporary points were re-clustered into N_m/N_s clusters by the traditional K-means clustering method on the 1D feature values. The cluster centers of the temporary points were the candidates for new centers: a temporary center on the first axis φ_1 with 1D value l is also a point with coordinate lφ_1 of dimension m in the original space. Similarly, the samples were projected and re-clustered on the other axes to obtain further candidate centers. The distance variances of the 3N_m/N_s candidate clusters were calculated and sorted. Finally, the N_m/N_s candidate centers with the smallest variances were selected to be the new centers.
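The projection, 1D clustering, and back-projection steps for a single axis can be sketched as follows. This is a simplified illustration: it uses only the first principal axis (the paper repeats the process on three axes and keeps the lowest-variance candidates), and the helper name `pca_seeds` is hypothetical:

```python
import numpy as np

def pca_seeds(samples, n_seeds, iters=20):
    """Sketch of PCA-based seed generation for one split cluster: project the
    samples onto the first principal axis phi_1, cluster the 1-D projections
    with k-means, and lift each 1-D center l back to mu + l * phi_1."""
    mu = samples.mean(axis=0)
    Xc = samples - mu
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    phi1 = Vt[0]                          # first principal axis
    proj = Xc @ phi1                      # 1-D projected values
    centers = np.linspace(proj.min(), proj.max(), n_seeds)
    for _ in range(iters):                # plain 1-D k-means on the projections
        lab = np.abs(proj[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(n_seeds):
            if np.any(lab == j):
                centers[j] = proj[lab == j].mean()
    return mu + centers[:, None] * phi1   # lift back to the original space

data = np.array([[0., 0.], [1., 0.], [9., 0.], [10., 0.]])
seeds = pca_seeds(data, n_seeds=2)
# seeds lie near (0.5, 0) and (9.5, 0), the two modes along the major axis
```

Placing seeds along the major basis puts them where the cluster's mass actually lies, which is why these candidates beat randomly drawn ones.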
3.3.2. Adaptive learning rate for seed selection
In this study, adaptive learning rules were designed to avoid divergence. The generation rate r was adapted according to the distortion d between two iterations. The distortion d is defined as

$$d = \frac{\dfrac{1}{N_p}\displaystyle\sum_{k=1}^{N_p} d(x_k, q_{t-1}(x_k)) - \dfrac{1}{N_p}\displaystyle\sum_{k=1}^{N_p} d(x_k, q_t(x_k))}{\dfrac{1}{N_p}\displaystyle\sum_{k=1}^{N_p} d(x_k, q_{t-1}(x_k))}, \qquad (10)$$

where q_t is the mapping function from vector x_k to its corresponding codeword at iteration t. The adaptive rules described in [16] were utilized to vary the learning rate. Three parameters, ξ, κ, and η, should be determined, and three conditions were considered as follows:
1. If the distortion d increased by more than parameter ξ = 0.04, the learning rate was multiplied by parameter η = 1.05. The rate r was raised to generate more possible new seeds. In addition, the new seeds generated in this iteration were discarded.
2. If the distortion d decreased, the new seeds were accepted. The learning rate was multiplied by parameter κ = 0.9, and the number of new seeds was decreased accordingly.
3. If the distortion increased by less than parameter ξ, the rate was unchanged, to finely adapt the seed positions.
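The three conditions can be collected into one update rule. A minimal sketch under an assumed sign convention: with d defined as in Eq. (10), d is negative when the distortion grows, so "increased by more than ξ" is read here as d < −ξ. The function name `adapt_rate` is hypothetical:

```python
def adapt_rate(r, d, xi=0.04, kappa=0.9, eta=1.05):
    """Adaptive generation-rate rules: grow r (and reject the new seeds) when
    distortion rose by more than xi, shrink r (and accept them) when it fell,
    and leave r unchanged for small increases.  Returns (new_r, accepted)."""
    if d < -xi:                # distortion increased by more than xi
        return r * eta, False  # raise r, discard this iteration's seeds
    if d > 0:                  # distortion decreased
        return r * kappa, True # accept seeds, generate fewer next time
    return r, True             # small increase: fine-tune in place

# r = 0.1 under the three conditions of the text:
r1, ok1 = adapt_rate(0.1, -0.10)   # large increase → r grows, seeds rejected
r2, ok2 = adapt_rate(0.1, 0.20)    # decrease       → r shrinks, seeds accepted
r3, ok3 = adapt_rate(0.1, -0.02)   # small increase → r unchanged
```

The multiplicative η/κ pair is the same increase/decrease pattern used for adaptive learning rates in [16].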
In summary, these three modules were iteratively performed. The NN module played a clustering role, the MS module refined the codewords, and the PCA module assigned new and better seeds. The PCA module tried to find possible codewords near the optimal solution: according to the sample distribution, large clusters were split into several smaller clusters to reduce the distortion, while codewords with low utility were discarded and replaced by better ones with high utility. The PCA module could thus ease the problems of codebook initialization, codeword utility, and local optimization. The MS module moved each codeword toward a better position with high sample density, improving codeword utility and solution optimality. Since these modules were iteratively performed, a near-globally optimal solution was found, the number of low-utilized codewords was decreased, poor initial seeds were discarded, and the influence of the sample sequence was reduced. The proposed iterative scheme could therefore address the four clustering problems; it is designed as follows.
3.3.3. Algorithm of PNM
Input: A training set X = {x_1, x_2, …, x_{N_p}} of size N_p.
Output: A set of cluster centers C = {c_1, c_2, …, c_{N_c}} of size N_c.
Step 1: Initialize the parameter r.
Step 2: Cluster the training samples sequentially to obtain the cluster centers using the neural net-based clustering.
Step 3: For each codeword c_j, compute the mean shift vector using the training samples in cluster c_j, and shift to the next position.
Step 4: PCA-based seed re-initialization.
  Step 4.1: Determine the N_s split and N_m = rN_c merged clusters based on their sizes, i.e., codeword utility.
  Step 4.2: For each split cluster c_j:
    1. Compute the eigenvectors and eigenvalues from the samples in cluster c_j.
    2. Project the samples onto the first three eigenvectors φ_1, φ_2, and φ_3, i.e., the projection axes.
    3. On each projection axis, cluster the samples using the 1D projected values.
    4. Select the N_m/N_s candidate centers with the smallest variances to be the new seeds for the next iteration.
  Step 4.3: Discard the N_m codewords with low utility.
Step 5: Re-calculate the parameters r and d.
Step 6: Repeat Steps 2 to 5 until a converged state occurs, e.g., d < 0.01.
4. Experimental results
In this section, some experiments were conducted to show the efficiency of the proposed method. The PSNR value is a popular measure for evaluating compression algorithms. The benchmark images of 512 by 512 pixels used in the experiments are shown in Fig. 1. In this study, several experiments were designed to compare the performance of the algorithm with the others.
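For reference, the PSNR measure used throughout this section can be computed as follows (a standard definition for 8-bit images, not code from the paper; the helper name `psnr` is hypothetical):

```python
import numpy as np

def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between an original and a decoded
    8-bit image: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((original.astype(float) - decoded.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

a = np.zeros((4, 4))
b = np.full((4, 4), 5.0)
p = psnr(a, b)
# MSE = 25 → PSNR = 10 * log10(65025 / 25) ≈ 34.15 dB
```

Higher PSNR values indicate less distortion in the decoded image, which is the sense in which the tables below compare the algorithms.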
The first three experiments were conducted to show the effectiveness of the found codebooks. First, an experiment was designed to evaluate the effect of the codebook size. Two images, 'Lena' and 'Pepper', were encoded using codebook sizes 32, 64, 128, and 256. The codebooks were generated by the LBG [1], ELBG [2], GVQ [5], GSAVQ [5], CNN-ART [6], CNN-ART-LBG, ILBG [4], ISM [3], MD [10], and PNM approaches, respectively.
Fig. 1. The images used in this study: (a) 'Lena', (b) 'Pepper', (c) 'F-16', (d) 'Sailboat', (e) 'Tiffany', (f) 'Goldhill', (g) 'Toys', (h) 'Zelda', (i) 'Girl', (j) 'Couple', (k) 'Baboon', and (l) 'Aerial'.
The comparisons between the algorithms using the PSNR-based measurement are shown in Table 1. In this table, the CNN-ART-LBG algorithm is an approach combining the CNN-ART and LBG approaches: the codebook was first generated by the CNN-ART algorithm, and the final codebook was refined by the LBG algorithm. Similarly, the GSAVQ algorithm is an approach using genetic algorithm encoding and the simulated annealing technique. The proposed approach is clearly superior to the others. Next, two experiments on codebook generality were conducted. In the second experiment, the 'Lena' image was used as the training image to generate a codebook. The codebook was then used to encode and decode the images 'Pepper', 'F-16', 'Sailboat', and 'Tiffany' shown in Fig. 1(b)–(e). The PSNR values generated by the 10 algorithms are tabulated in Table 2. Similar to the second
experiment, four images, shown in Fig. 1(b)–(e), were used to generate the codebook. Seven images, Fig. 1(f)–(l), were encoded and decoded using the generated codebook. The PSNR values for the various images are tabulated in Table 3. From these tables, the codebook generated by PNM generalizes more robustly than the others.
The last three experiments were conducted to compare the algorithms on the codebook initialization, codeword utility, and training-vector sequence problems. Similar to the first three experiments, the 10 algorithms were executed for different factors at various codebook sizes. The results are shown in Tables 4–6.

The statistical utility rates of the codewords were obtained at various compression rates, as listed in Table 4(a). The 'Lena' image was encoded and decoded in this experiment. The compression rates were set to 0.5625, 0.625, and 0.6875 BPP. From this table, the PNM algorithm generated more highly utilized codewords than the other algorithms at various rates. The codebook generated by the PNM
Table 3
The PSNR values for various images encoded by the codebook generated from the images 'Lena', 'Pepper', 'F-16', 'Sailboat', and 'Tiffany'
comprises fewer low-utilized codewords. In order to show invariance to the initial codebook, 30 initial codebooks were randomly generated. The image 'Lena' was used to generate the encoding codebooks with the 10 algorithms. The PSNR values over the 30 initial conditions were averaged for various codebook sizes, as tabulated in Table 4(b). Similarly, various sequences of the training vectors of the image 'Lena' were randomly generated to test invariance to the sample sequence. The variances of the averaged PSNR values for these algorithms are shown in Table 4(c). In addition, five images, 'Lena', 'Pepper', 'F-16', 'Sailboat', and 'Tiffany', were encoded and decoded to perform an inside test, as tabulated in Table 5. Seven images, 'Goldhill', 'Toys', 'Zelda', 'Girl', 'Couple', 'Baboon', and 'Aerial', were decoded using the codebook generated from the previous five training images; the results of this outside test are tabulated in Table 6 for the different factors. From these three tables, the proposed approach has the smallest variance of PSNR values, showing it to be superior to the others. Additionally, the training times for the various algorithms are shown in Fig. 2. Initially, the performance
Table 4
The comparisons of 10 algorithms for different factors when encoding and decoding the image 'Lena'
BPP | LBG | ELBG | GVQ | GSAVQ | C-ART | CAL | ILBG | ISM | MD | PNM
(a) The utility rates at various compression rates
of the proposed approach was similar to that of CNN-ART. The PSNR value increased after the MS and PCA modules. After 350 s, the PSNR value of the proposed approach was the highest.
5. Conclusions
In this paper, a novel algorithm has been proposed to find a better solution for VQ. Four problems in the clustering process have been solved using the NN-based clustering, MS-based refinement, and PCA-based seed re-initialization modules. These three modules were repeatedly performed to obtain a better codebook. Some experiments were conducted to compare with the other algorithms on the codebook initialization, the found solution, the codeword utility, and the sample sequence problems. From the experimental results, the proposed algorithm performed better than the others under the PSNR criterion.
Table 6
The comparisons of an outside test for different factors
BPP | LBG | ELBG | GVQ | GSAVQ | C-ART | CAL | ILBG | ISM | MD | PNM
(a) The utility rates at various compression rates