A Comparative Analysis of Dimension Reduction Algorithms

Oct 24, 2014


A Comparative Analysis of Dimension Reduction Algorithms on Hyperspectral Data

Kate Burgers, Yohannes Fessehatsion, Sheida Rahmani, Jia Yin Seo

Advisor: Todd Wittman

August 7, 2009

Abstract

In the past there has been little study to determine an optimal dimension reduction algorithm for hyperspectral images. In this paper we investigate the performance of different dimension reduction algorithms, including PCA and Isomap, and compare them on various hyperspectral tasks. We considered runtime, classification, anomaly detection, target detection and unmixing. We use both synthetic and real hyperspectral images throughout the experiment. The results are analyzed both quantitatively and qualitatively.

1 Introduction

Hyperspectral sensors collect information as a set of images represented by different bands. Hyperspectral images are three-dimensional images with sometimes over 100 bands, whereas regular images have only three bands: red, green and blue. Each pixel has a hyperspectral signature that represents different materials. Hyperspectral images can be used for geology, forestry and agriculture mapping, land cover analysis, and atmospheric analysis. Even though hyperspectral images can sometimes contain over 100 bands, relatively few bands can explain the vast majority of the information. For this reason, hyperspectral images are mapped into a lower dimension while preserving the main features of the original data by a process called dimension reduction. There is evidence that performing dimension reduction may affect the performance of image processing tasks including target detection algorithms, anomaly detection algorithms, classification and unmixing [1]. The dimension reduction codes are taken from van der Maaten's MATLAB dimension reduction toolbox, except for ICA and classification, which come from ENVI 4.6.1 software [2, 3].

In this paper, we compare various algorithms for dimension reduction that have been developed. The paper investigates the following 8 dimension reduction techniques: (1) PCA [4], (2) Kernel PCA [5], (3) Isomap [6], (4) Diffusion Maps [7], (5) Laplacian Eigenmaps [8], (6) ICA [9], (7) LMVU [10], and (8) LTSA [11]. The aims of the paper are to (1) evaluate the performance of each dimension reduction algorithm on the basic image processing tasks, and (2) determine the intrinsic dimensionality of hyperspectral images. The performance of each algorithm is evaluated qualitatively for classification and quantitatively for runtime, target detection, anomaly detection and unmixing.

This paper is organized as follows: Section 2 presents and discusses the eight dimension reduction techniques. Section 3 introduces the data used and the methods of producing synthetic images for target detection, anomaly detection and unmixing. Section 4 discusses the approach to measuring runtime, anomaly and target detection, classification and unmixing. Section 5 outlines the results of the different algorithms on each task. Section 6 discusses the conclusions of the experiment.

2 Dimension Reduction Algorithms

Eight different dimension reduction methods are compared in the experiment on runtime and on the tasks of classification, anomaly detection, target detection, and unmixing. These methods fall into two categories: linear and nonlinear. There are often differences between linear and nonlinear methods regarding both runtime and performance. Nonlinear methods, such as Isomap, Laplacian, LMVU and KPCA, are thought to give better results with the trade-off of slower runtime [1]. For our experiments, different subsets of algorithms were tested for different tasks. The methods that we chose were all run with their default parameters except for KPCA, for which we chose the 'poly' kernel.

3 Images

We used four different hyperspectral images to evaluate runtime and classification, because some algorithms give better results on some images while failing to give adequate results on others. These four hyperspectral images are Urban (162 bands) [12], Terrain (162 bands) [12], Smith Island (126 bands) [13] and San Diego Airport (224 bands) [3], shown in figure 1. Urban is an image of a Walmart in Cypress Cove, Texas. Terrain is an aerial image of roads in a desert region. Smith Island and San Diego Airport are aerial images of Smith Island, Maryland and an airport in San Diego.

Ideally, we would use these four images to compare the success of each algorithm at correctly identifying each target, anomaly and class. We are unable to do so because we have no ground truth for these images. As such, we created different synthetic images for each task. From the hyperspectral images, we obtained a number of spectral signatures of identifiable materials and labeled them accordingly, keeping the same number of bands as the images from which they were acquired. The synthetic images for anomaly detection and target detection were 50 × 50, whereas the synthetic images for unmixing were 100 × 100 and downsampled to 50 × 50. Each pixel was assigned a known material, and Gaussian noise with mean zero and variance 0.00005 was added.
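The construction of a noisy synthetic image can be illustrated with a minimal Python sketch. It is not the code used in the experiments: the function name `make_synthetic_image`, the three-band signatures and the label map are hypothetical stand-ins, and only the noise parameters (mean zero, variance 0.00005) come from the text.

```python
import math
import random

def make_synthetic_image(label_map, signatures, variance=0.00005, seed=0):
    """Build a noisy hyperspectral cube from a per-pixel material label map.

    label_map: 2-D list of material names, one per pixel.
    signatures: dict mapping material name -> list of band values.
    Returns a rows x cols x bands nested list.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(variance)  # random.gauss expects a standard deviation
    cube = []
    for row in label_map:
        cube_row = []
        for material in row:
            spectrum = signatures[material]
            # Add independent zero-mean Gaussian noise to every band.
            cube_row.append([value + rng.gauss(0.0, sigma) for value in spectrum])
        cube.append(cube_row)
    return cube

# Hypothetical 3-band signatures; the real ones have 162 or 224 bands.
signatures = {"grass": [0.10, 0.40, 0.30], "road": [0.25, 0.25, 0.25]}
label_map = [["grass"] * 50 for _ in range(50)]
label_map[10][10] = "road"  # a single-pixel target on a grass background
cube = make_synthetic_image(label_map, signatures)
```

Because the label map is known, it doubles as the ground truth against which detection results can later be scored.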

Figure 1: Images used in our experiment. (a) Urban (b) Terrain (c) Smith Island (d) San Diego Airport

Figure 2: Images used for anomaly detection, target detection and unmixing with Gaussian noise. (a) Fake San Diego (b) Fake Urban (c) Urban Blocks

There are three synthetic images for anomaly detection and two synthetic images for target detection, shown in figure 2. The spectral signatures of each image were taken from Urban and San Diego Airport; the images are labeled Fake Urban, Urban Blocks, and Fake San Diego, and use the same number of bands as the parent image. Fake Urban and Fake San Diego were used for anomaly detection and target detection. Urban Blocks was used solely for anomaly detection. For unmixing, Fake Urban and Fake San Diego were enlarged to 100×100 pixels, each pixel was blurred with its neighbours, and the image was downsampled.
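The enlarge, blur and downsample procedure for the unmixing images can be sketched as follows for a single band or abundance map. The text does not state the blur kernel or the resampling scheme, so the 3 × 3 mean filter, the pixel replication and the stride-2 downsampling below are assumptions made only for illustration.

```python
def enlarge(image, factor=2):
    """Nearest-neighbour enlargement: repeat each pixel factor x factor times."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def box_blur(image):
    """Replace each pixel by the mean of its 3 x 3 neighbourhood (clipped at borders)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = sum(window) / len(window)
    return out

def downsample(image, factor=2):
    """Keep every factor-th pixel in each direction."""
    return [row[::factor] for row in image[::factor]]

# Abundance map of one material: left half pure (1.0), right half absent (0.0).
abundance = [[1.0] * 25 + [0.0] * 25 for _ in range(50)]
mixed = downsample(box_blur(enlarge(abundance)))
```

Pixels along the material boundary end up with fractional abundances, which is what gives the unmixing task a known, non-trivial ground truth.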

4 Experimental Design

4.1 Classification

Classification is the process of taking an image and breaking it up into specified classes depending on the differences in hyperspectral signature of each material [14]. Pixels with similar hyperspectral signatures are generally grouped into the same class. An effective dimension reduction algorithm will retain enough information on each material that, when classification is run, the same material is grouped into the same class.

In Meiching Fong's paper it was mentioned that dimension reduction could possibly improve classification [1]. We used experiments to analyze and evaluate this idea. A set of dimension reduction methods and images was chosen for testing. The methods are KPCA, PCA, and ICA, each reducing to three, four, and five bands. The images are Urban, Terrain, San Diego Airport and Smith Island. We performed classification with the K-Means classification method from ENVI. The default parameters for K-Means were retained, except for the number of classes, which depends on the image. The number of endmembers was used to select the number of classes in each image. The number of classes chosen for each image is shown in table 1.

The classification results were evaluated qualitatively by comparing methods and the different dimensionalities for each method. The images were analyzed based on the accuracy of the size of each class, the ability to distinguish between manmade and natural materials, and consistency in assigning the same materials to the same class.
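For readers unfamiliar with K-Means, the idea can be sketched in a few lines of Python. This is a generic textbook version, not ENVI's implementation; the function names, the random initialisation and the four sample pixels are assumptions chosen for illustration.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(pixels, k, iterations=20, seed=0):
    """Cluster pixel spectra into k classes; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(pixels, k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in pixels]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:  # keep the old centroid if a class is empty
                centroids[j] = [sum(band) / len(members) for band in zip(*members)]
    return labels, centroids

# Two hypothetical materials in a 3-band reduced image.
pixels = [[0.10, 0.40, 0.30], [0.11, 0.41, 0.29],
          [0.80, 0.20, 0.10], [0.79, 0.21, 0.12]]
labels, centroids = kmeans(pixels, k=2)
```

The class indices themselves are arbitrary, which is one reason the classified images have to be compared visually rather than label by label.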

Urban    San Diego Airport    Terrain    Smith Island
5        7                    4          4

Table 1: The number of classes in each image.

4.2 Anomaly Detection

In our experiment, we tested the performance of five different dimension reduction methods on the tasks of anomaly and target detection. We also tested for the intrinsic dimensionality of each image on both tasks. The dimension reduction methods compared are Diffusion Map, ICA, KPCA, LTSA and PCA. Dimension reduction was performed on the prepared synthetic hyperspectral data to reduce the dimensionality to 3, 4 and 5 dimensions. We compare the methods by means of the True Positive Rate (TPR) and False Positive Rate (FPR) of the anomalies or targets detected.

Anomaly detection refers to selecting pixels from a given hyperspectral image that are dissimilar to their surroundings [15]. Finding an anomaly is generally not an easy task because anomalies are small, so they can be mistaken for noise, and deciding whether something is an anomaly is fairly subjective. The tool used in our experiment is the RX Anomaly Detection in ENVI with a local mean source. On each of the three synthetic hyperspectral images, we run each of the five dimension reduction methods to reduce the number of dimensions to 3, 4, and 5 bands. This gives a total of 45 result images to compare with the 3 original images, to determine both the optimal dimension reduction method and the intrinsic dimensionality.

4.3 Target Detection

The purpose of target detection is to find other pixels in the image that have the same spectral signature as a pre-determined material [15]. The difficulty in finding the targets arises from three sources: 1) with noise added, the spectral signatures of the target pixels are not identical; 2) target patches are of various sizes, down to one pixel, which could be mistaken for noise; 3) each target lies on a different background, with some backgrounds more similar to the target than others. To perform target detection, we first select a pixel as the target spectral signature. The tool used in our experiment is the SAM Target Finder from ENVI. The parameters are left at their defaults, with a SAM maximum angle of 0.1. We first run five dimension reduction methods on the two synthetic images with output dimensionalities of 3, 4, and 5 to obtain 30 results. Each of these results is fed through the SAM Target Finder, using the same target pixel for each image. The resulting images are compared with the original synthetic images to evaluate the performance of each dimension reduction method.
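The spectral angle test behind SAM can be sketched as below. This is a plain Python illustration rather than ENVI's SAM Target Finder; the helper names and sample spectra are hypothetical, and we assume the maximum angle of 0.1 is expressed in radians.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cosine = dot / (norm_a * norm_b)
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp against rounding error

def sam_detect(pixels, target, max_angle=0.1):
    """Binary mask: 1 where a pixel lies within max_angle of the target spectrum."""
    return [1 if spectral_angle(p, target) <= max_angle else 0 for p in pixels]

target = [0.25, 0.25, 0.25]
pixels = [[0.50, 0.50, 0.50],   # same shape as the target, only brighter
          [0.26, 0.25, 0.24],   # target plus a little noise
          [0.10, 0.40, 0.30]]   # a different material
mask = sam_detect(pixels, target)
```

Since the angle ignores vector length, a brighter or darker copy of the target spectrum still matches, while noise only matters to the extent that it changes the shape of the spectrum.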

We tested whether each of the dimension reduction methods was able to retain the anomalies and targets in the original image even after removing the majority of the bands. To compare the different methods, we use the metrics TPR and FPR. First, we create the binary image representation of the original synthetic images, with one denoting the presence of an anomaly or target and zero denoting the background. Second, we take the dimension reduction result images and obtain the binary image representation of each one. We compare the binary image of the original data with each of the binary result images to obtain a TPR and FPR for each. Comparing the two images pixel by pixel, we classify each pixel as True Positive (TP), False Positive (FP), True Negative (TN), or False Negative (FN). The TPR and FPR are defined as follows:

$$\mathrm{TPR} = \frac{TP}{TP + FN} \quad \text{and} \quad \mathrm{FPR} = \frac{FP}{FP + TN}.$$

Each rate ranges from zero to one. The optimal value is one for the TPR and zero for the FPR. The different methods can be evaluated using these rates.
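The pixel-by-pixel comparison and the two rates can be written out as a short Python sketch. The function name `detection_rates` and the toy masks are illustrative assumptions, and returning 0.0 when a denominator is zero is our own convention, not something stated in the text.

```python
def detection_rates(truth, result):
    """Return (TPR, FPR) for two same-sized binary masks given as 2-D lists."""
    tp = fp = tn = fn = 0
    for truth_row, result_row in zip(truth, result):
        for t, r in zip(truth_row, result_row):
            if t == 1 and r == 1:
                tp += 1      # anomaly or target correctly detected
            elif t == 0 and r == 1:
                fp += 1      # background flagged as a detection
            elif t == 0 and r == 0:
                tn += 1      # background correctly ignored
            else:
                fn += 1      # anomaly or target missed
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr

truth  = [[1, 1, 0, 0],
          [0, 0, 0, 0]]
result = [[1, 0, 1, 0],
          [0, 0, 0, 0]]
tpr, fpr = detection_rates(truth, result)
```

In this toy case one of the two true pixels is found (TPR = 1/2) and one of the six background pixels is flagged (FPR = 1/6).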

4.4 Unmixing

Hyperspectral images generally have very low spatial resolution, so each pixel tends to have multiple materials mixed together. Unmixing is the process of determining which materials are in each pixel, and in what proportion [16]. In order to use unmixing, we first had to extract endmembers, which are pixels in the image that are representative of a material. For this task, we used SMACC Endmember extraction in ENVI. Since the endmembers are not found in any particular order, we then had to match the extracted endmembers to known endmembers in our image before finally unmixing the images using Linear Spectral Unmixing in ENVI.

In this portion of the experiment, we looked at four different algorithms, PCA, KPCA, ICA and Isomap, and compared their results to those of each image before it was dimensionally reduced. We can then compare the algorithms in three different ways: first by considering the number of endmembers that were found, second by determining how many of those endmembers matched a known endmember, and third by determining how closely the unmixed images match a ground truth.

In choosing endmembers, there were several parameters to set. The first was the maximum number of endmembers. We did not want to limit the number of endmembers directly, so we chose 30 endmembers as an upper bound. In order to avoid pixels that were too impure, we set a maximum error of 0.1 and also chose to have the algorithm coalesce redundant endmembers. Once a set of endmembers had been chosen, we matched them to our ground truth endmembers. We could then compare the ground truth map for each material to the map for each corresponding endmember, if one was found. We did this using the Euclidean metric across the pixels. The root mean squared error E is given by

$$E = \sqrt{\sum_{i=1}^{n} (G_i - R_i)^2}$$

where $n$ is the number of pixels, $G_i$ is the $i$th pixel of the ground truth image and $R_i$ is the $i$th pixel in the result image.
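As a small worked illustration, the error can be computed in Python as follows. The sketch follows the formula as printed, that is, the Euclidean distance between the two maps without dividing by the number of pixels; the function name and the sample maps are assumptions.

```python
import math

def abundance_error(ground_truth, result):
    """Euclidean distance between two same-sized abundance maps (2-D lists)."""
    total = 0.0
    for truth_row, result_row in zip(ground_truth, result):
        for g, r in zip(truth_row, result_row):
            total += (g - r) ** 2
    return math.sqrt(total)

# Hypothetical 2 x 2 abundance maps for one material.
ground_truth = [[1.0, 0.0],
                [0.0, 1.0]]
result = [[1.0, 0.0],
          [0.0, 0.0]]
error = abundance_error(ground_truth, result)
```

Here a single pixel is off by 1.0, so the error is 1.0; a perfect abundance map gives an error of zero.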

5 Results

5.1 Runtime

The four hyperspectral images, Urban, Terrain, San Diego Airport, and Smith Island, are all of different sizes. When testing for runtime, we wanted to control for the size of each image. As such, 2500-pixel samples of each image were taken to test the runtime. We tested the runtime of PCA, KPCA, LLE, Isomap, LMVU, Laplacian, LTSA, and Diffusion Map. All eight dimension reduction algorithms were run three times, reducing to three, four and five bands respectively. The time was recorded in seconds and the average runtime over three, four and five bands was computed. For each image, PCA strongly outperformed the other seven algorithms, reducing the 50×50 images in less than one second. LLE, Laplacian and LTSA each took under five seconds to run, KPCA and Diffusion Map each took roughly a minute (KPCA needed 75 s on San Diego), and the runtime for Isomap was about seven minutes. LMVU is the slowest algorithm: its runtime ranged from 29 to 41 minutes per image. There are some unknown issues with LLE, which failed to reduce the dimensions on San Diego. Table 2 shows the average recorded runtime of each algorithm.

               PCA      KPCA   LLE     Isomap   LMVU     Laplacian   LTSA    DM
Urban          0.07 s   56 s   2.7 s   429 s    2073 s   1.7 s       4.9 s   42 s
Terrain        0.04 s   54 s   2.4 s   431 s    2101 s   1.6 s       4.7 s   42 s
Smith Island   0.04 s   49 s   2.5 s   420 s    1781 s   1.7 s       4.7 s   42 s
San Diego      0.07 s   75 s   n/a     428 s    2451 s   1.8 s       4.9 s   42 s

Table 2: The average runtime of all eight algorithms, which reduced 50 × 50 images to three, four, and five bands. Each time was recorded in seconds. There is no data for LLE on San Diego because LLE fails to run on San Diego.
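The averaging over three, four and five bands can be sketched with a small timing harness. This is only an illustration in Python: `average_runtime` is a hypothetical helper, and `keep_first_bands` is a trivial stand-in for a real reduction routine, not one of the toolbox functions that were timed.

```python
import time

def average_runtime(reduce_fn, data, target_dims=(3, 4, 5)):
    """Mean wall-clock seconds of reduce_fn(data, d) over the target dimensions."""
    timings = []
    for d in target_dims:
        start = time.perf_counter()
        reduce_fn(data, d)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)

def keep_first_bands(data, d):
    """Hypothetical stand-in for a reduction method: keep the first d bands."""
    return [pixel[:d] for pixel in data]

# 2500 pixels (a 50 x 50 sample) with 162 bands each.
data = [[float(band) for band in range(162)] for _ in range(2500)]
seconds = average_runtime(keep_first_bands, data)
```

Fixing the sample at 2500 pixels keeps the comparison between images fair, since only the number of bands then differs.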

5.2 Classification

We found that for classification, different methods are more suitable for different images. As each image has its own characteristics, some methods are better at retaining those characteristics than others. Overall, classification did improve after using dimension reduction. Since ground truth was not available, the results were analyzed qualitatively. One of the obvious differences was that after dimension reduction, certain materials were more distinct and were therefore correctly assigned to different categories. In our experiments we chose to reduce each image to 3, 4, and 5 bands. As shown in figure 7 in appendix A, after running classification the results were essentially identical for the different dimensionality values.

Results varied among the four images. As shown in figure 8 in appendix A, in the San Diego image some of the noticeable differences appeared with KPCA: some materials, for example grass and concrete, were classified as the same. With ICA, the planes and grass were classified as the same. PCA was detailed and classified the airplanes and grass differently. The results were drastically different for the Terrain image, as shown in figure 9 in appendix A. Terrain has four classes, and with PCA there was too much detail, which took away from the smoothness of the grass and tree areas; it also identified the shadows as a different category. KPCA performs better on this image, as it retains the smoothness of the grass area. ICA also gave good results, but the shapes of the trees were not preserved as well.

On Smith Island, the best result was achieved with ICA, as shown in figure 10 in appendix A. Its categories were more distinct and more consistent with the original hyperspectral image. In the Urban image, PCA performed best, as shown in figure 11 in appendix A. It captured all the detailed differences and was able to identify grass, tree and concrete as different materials. When using KPCA, grass, tree and concrete were identified as the same. The ICA method was also not accurate.

5.3 Anomaly Detection

The performance of each dimension reduction method after anomaly detection was evaluated against the ground truth synthetic image to obtain a TPR and FPR for each. Table 3 in appendix B shows the results of our experiment on each dimension reduction method's performance in three different images and three dimensionalities. Results obtained in the three images tend to vary as they differ in the materials, size, and location of the anomalies.

Fake Urban had all the anomalies in blocks of 2×2 pixels; however, the anomalies are laid on different backgrounds at various locations of the image. As seen in figure 3, this image also has a large number of edges, and we notice that the RX detector has a tendency to detect edges as anomalies, resulting in a fairly large number of false positives. Out of all the methods, only KPCA was able to achieve a TPR of one in all three dimensionalities. All other methods except Diffusion Map achieved a TPR of one in two of the three dimensionalities. For all the methods, the FPR appears to be very similar, except for a jump in dimensionality 4 for Diffusion Map and LTSA. There does not seem to be a correlation between TPRs and FPRs in each method. We also observe a trend of increasing TPRs and constant FPRs as we increase the dimensionality.

Fake San Diego had anomalies of various sizes and shapes, but the anomalies are still laid on different backgrounds at various locations of the image. Like Fake Urban, this image had a large number of edges, which resulted in false positives. In this image, KPCA again is the only method that generated a TPR of one in all three dimensionalities, with LTSA coming in second with a TPR of one in two dimensionalities. Both PCA and ICA generated the lowest FPR with fairly similar results, but ICA has a higher TPR. Again, there is a positive correlation between the TPR and dimensionality, and only at dimensionality five did we observe a TPR of one for all methods.

Figure 3: Anomaly detection results for all five methods at dimensionality four. (a) Diffusion Map (b) ICA (c) KPCA (d) LTSA (e) PCA. Diffusion Map and LTSA both have a fair number of false positives, with DM able to detect the actual anomalies a little better. Both methods detected edges as anomalies. ICA, KPCA, and PCA were able to detect all the correct anomalies but with different numbers of false positives. Although KPCA was able to select all the anomalies in all three dimensionalities, it selects more false positives than ICA and PCA.

Figure 4: Anomaly detection results on Urban Blocks with Diffusion Map. (a) 3 bands (b) 4 bands (c) 5 bands. Each pixel in the image has a value that indicates how close that pixel is to being an anomaly: the whiter the pixel, the closer it is to being an anomaly. At dimensionality 3, the rightmost columns of anomalies are not detected. The two leftmost columns at dimensionality 4 are darker than those at dimensionality 5. This shows that for Diffusion Map, dimensionality 5 gives the best results, with the brightest anomalies.

The Urban Blocks image contains five different types of anomalies on the same background. Each type of anomaly comes in various sizes. Unlike the other two images, this image does not have many edges, which helps the RX detector avoid generating false positives. This image also contains a larger number of anomalies than the other two images, and all anomalies are perfectly lined up. In this image, ICA and PCA give the highest TPR in all dimensionalities. All of the methods generated an FPR of zero, mainly due to the arrangement of the anomalies. Diffusion Map had the lowest average TPR over dimensionalities 3 and 4. As seen in figure 4, there is a positive correlation between TPR and the dimensionality.

5.4 Target Detection

The results of target detection are tabulated in table 4 in appendix B, for all five methods and all three dimensionalities. Both images yield similar results, and the methods are very successful in finding all the targets. LTSA is the only method that did not have a TPR of one in all dimensionalities. Diffusion Map and LTSA are the only two methods that did not yield an FPR of zero in all dimensionalities. In figure 5, we can see that Diffusion Map picks out a lot of false positives along with the true positives. Again in target detection we see that only dimensionality 5 has perfect results for all methods and all images.

Having a TPR of one in most cases seemed an unlikely result, which led us to conduct a sub-experiment to test the effect of varying the SAM Maximum Angle threshold. The results of this sub-experiment are shown in table 5 in appendix B. We found that we are able to vary this threshold greatly and still obtain constant results. Most importantly, LTSA is the most robust to this change of threshold, obtaining a TPR of one and an FPR of zero throughout. On the other hand, Diffusion Map is the most vulnerable to this change, and its TPR and FPR tend to vary more as the threshold changes. Ultimately, we verified that a SAM Maximum Angle of 0.1 is the optimal parameter, with all methods achieving a TPR of one and an FPR of zero.

Figure 5: Target detection results from Diffusion Map. (a) 3 bands (b) 4 bands (c) 5 bands. Although dimensionalities 3 and 4 were both able to detect the targets, both detected non-targets as targets, with dimensionality 4 performing the worst. Only dimensionality 5 was able to obtain all the correct targets with no false positives.

5.5 Unmixing

The results of unmixing can be assessed by looking at endmembers as well as abundance maps. Since we used synthetic data, we know exactly how many endmembers are in each image, what the endmembers are, and in what abundance they occur in each pixel. We found that in general when Isomap or PCA was run, the number of endmembers found was either correct or slightly too low, but when KPCA or ICA was run, more endmembers were found than actually existed. Despite finding so many endmembers in these cases, not many of them actually matched a known endmember, whereas the endmembers found when Isomap or PCA was run tended to match known endmembers. The results of endmember selection are shown in table 6 in appendix C.

We found that in the resulting abundance maps, the correct pixels would have the highest percentage of a given material, but many pixels that contained none of that material would be given a non-zero value as well. This is illustrated in figure 6. The algorithm also picked up the noise that was added to the synthetic image. In order to quantitatively compare the methods for dimension reduction, we took the root mean squared error of the abundance map for each endmember separately. PCA tended to have the lowest error. While KPCA generally performed well, it was inconsistent. The results are summarized in table 8 in appendix C.

We also performed unmixing on the original synthetic images to see whether dimension reduction improved the results. We found that generally the results are similar whether or not we performed dimension reduction. The exception was PCA, which improved the results. The results without dimension reduction are summarized in table 7 in appendix C.

Figure 6: The abundance map of grass in Fake San Diego. (a) Ground truth for Fake San Diego (b) Fake San Diego unmixed after ICA is run to 3 bands. Notice that unmixing tends to find some grass in places where there is not actually any grass.

We also wanted to know if the images have an intrinsic dimensionality. For unmixing, every algorithm except ICA performed best with dimensionality 5 most of the time. When ICA was run, there seemed to be no strong correlation between dimensionality and accuracy.

6 Conclusion

We compared eight different dimension reduction methods on five different hyperspectral tasks. PCA is the fastest algorithm, while LMVU is the slowest. After PCA is run, urban images are classified more accurately, and after KPCA is run, rural images are classified more accurately. For anomaly detection, KPCA works best for images with multiple edges, but PCA and ICA perform comparably on images without many edges. Target detection worked perfectly on synthetic images when we ran KPCA, PCA and ICA. PCA usually resulted in less error for unmixing than the other algorithms. In conclusion, we found that PCA outperforms the other methods in every task that we investigated.


A Classification Images

Figure 7: Images with different dimensionality values. (a) PCA with 3 dimensions (b) PCA with 4 dimensions (c) PCA with 5 dimensions. The images are very similar, regardless of dimensionality.

Figure 8: Different dimension reduction methods on the San Diego image. (a) Original San Diego Airport image (b) Classified image without dimension reduction (c) Dimensionally reduced image using PCA (d) Dimensionally reduced image using KPCA (e) Dimensionally reduced image using ICA

Figure 9: Different dimension reduction methods on the Terrain image. (a) Original Terrain image (b) Classified image without dimension reduction (c) Dimensionally reduced image using PCA (d) Dimensionally reduced image using KPCA (e) Dimensionally reduced image using ICA

Figure 10: Different dimension reduction methods on the Smith Island image. (a) Original Smith Island image (b) Classified image without dimension reduction (c) Dimensionally reduced image using PCA (d) Dimensionally reduced image using KPCA (e) Dimensionally reduced image using ICA

Figure 11: Different dimension reduction methods on the Urban image. (a) Original Urban image (b) Classified image without dimension reduction (c) Dimensionally reduced image using PCA (d) Dimensionally reduced image using KPCA (e) Dimensionally reduced image using ICA

B Anomaly and Target Detection Tables

Fake Urban       3-TPR    3-FPR    4-TPR    4-FPR    5-TPR    5-FPR
DM               0.1429   0.1429   0.1429   0.0854   1        0.0396
ICA              0.8571   0.0392   1        0.0392   1        0.0396
KPCA             1        0.0396   1        0.0396   1        0.0405
LTSA             1        0.0388   0.1429   0.0866   1        0.0396
PCA              0.8571   0.0396   1        0.0392   1        0.0396

Fake San Diego   3-TPR    3-FPR    4-TPR    4-FPR    5-TPR    5-FPR
DM               0.3077   0.0229   0.2308   0.0748   1        0.0314
ICA              0.0769   0.004    0.7692   0.0072   1        0.006
KPCA             1        0.0221   1        0.0225   1        0.0281
LTSA             0.0769   0.029    1        0.0209   1        0.031
PCA              0.0769   0.0044   0.4615   0.0072   1        0.006

Urban Blocks     3-TPR    3-FPR    4-TPR    4-FPR    5-TPR    5-FPR
DM               0.4188   0        0.6188   0        1        0
ICA              0.6063   0        0.9938   0        1        0
KPCA             0.4063   0        0.8063   0        1        0
LTSA             0.6      0        0.6375   0        1        0
PCA              0.6063   0        0.9938   0        1        0

Table 3: Anomaly Detection results of all three images, five methods and 3, 4, and 5 dimensions


Fake Urban   3-TPR    3-FPR    4-TPR    4-FPR    5-TPR    5-FPR
DM           1        0.0028   1        0.0433   1        0
ICA          1        0        1        0        1        0
KPCA         1        0        1        0        1        0
LTSA         1        0.0057   0.3571   0        1        0
PCA          1        0        1        0        1        0

Fake SD      3-TPR    3-FPR    4-TPR    4-FPR    5-TPR    5-FPR
DM           1        0.123    1        0.123    1        0
ICA          1        0        1        0        1        0
KPCA         1        0        1        0        1        0
LTSA         1        0        1        0        1        0
PCA          1        0        1        0        1        0

Table 4: Target Detection results of two images, five methods and 3, 4, and 5 dimensions

             DM                ICA               KPCA              LTSA              PCA
SAM Angle    TPR      FPR      TPR      FPR      TPR      FPR      TPR      FPR      TPR      FPR
0.005        0.25     0        0.0357   0        0.0357   0        1        0        0.0357   0
0.01         0.7143   0        0.1429   0        0.1429   0        1        0        0.3571   0
0.1          1        0        1        0        1        0        1        0        1        0
0.55         1        0.04     1        0        1        0        1        0        1        0
0.75         1        0.0433   1        0.0113   1        0        1        0        1        0.0433

Table 5: Target Detection results for Fake Urban at a dimensionality of 5 with varying SAM Maximum Angles. The SAM Maximum Angles used were 0.005, 0.01, 0.1, 0.55, and 0.75.
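
To illustrate the role of the SAM Maximum Angle, the sketch below computes the spectral angle between each pixel and a target spectrum as the arccosine of their normalized inner product, and flags a pixel when the angle is at most the threshold. It assumes angles in radians and a cube laid out as (rows, columns, bands). The helpers `spectral_angles` and `sam_detect` and the toy spectra are hypothetical and are not the ENVI implementation used in the experiments.

```python
import numpy as np

def spectral_angles(cube, target):
    """Angle in radians between every pixel spectrum and the target.

    cube has shape (rows, cols, bands); target has shape (bands,).
    """
    cube = np.asarray(cube, dtype=float)
    target = np.asarray(target, dtype=float)
    dots = cube @ target
    norms = np.linalg.norm(cube, axis=-1) * np.linalg.norm(target)
    # Clip to [-1, 1] so rounding error cannot push arccos out of range.
    cos = np.clip(dots / np.maximum(norms, 1e-12), -1.0, 1.0)
    return np.arccos(cos)

def sam_detect(cube, target, max_angle):
    """Boolean mask of pixels whose spectral angle is within max_angle."""
    return spectral_angles(cube, target) <= max_angle

# Toy 1x2 image with 3 bands (hypothetical data): the first pixel is a
# scaled copy of the target, the second differs slightly in shape.
target = np.array([1.0, 2.0, 3.0])
cube = np.array([[[2.0, 4.0, 6.0], [1.0, 2.0, 2.0]]])
mask_tight = sam_detect(cube, target, 0.1)
mask_loose = sam_detect(cube, target, 0.75)
```

With the tight threshold only the scaled copy of the target is flagged; with the loose threshold both pixels are. This mirrors the trend in the table, where small angles miss targets and large angles begin to admit false positives.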


C Unmixing Tables

(a) Results for Fake Urban.

          ICA - 3    ICA - 4    ICA - 5    PCA - 3      PCA - 4      PCA - 5
Found     6          7          8          5            5            6
Matched   2          1          1          5            5            6

          KPCA - 3   KPCA - 4   KPCA - 5   Isomap - 3   Isomap - 4   Isomap - 5
Found     13         14         14         4            5            6
Matched   5          5          4          4            5            6

(b) Results for Fake San Diego.

          ICA - 3    ICA - 4    ICA - 5    PCA - 3      PCA - 4      PCA - 5
Found     5          28         14         5            5            5
Matched   4          6          7          5            5            5

          KPCA - 3   KPCA - 4   KPCA - 5   Isomap - 3   Isomap - 4   Isomap - 5
Found     12         16         17         3            4            6
Matched   4          4          4          3            4            6

Table 6: The number of endmembers found after each algorithm was run on each image, as well as the number of those endmembers that closely matched a known endmember. Fake Urban had six materials and Fake San Diego had seven materials.

Urban       Car      Roofing   Road      Dirt      Tree      Grass    Found   Matched
            9.0723   N/M       6.1146    10.3639   29.4156   N/M      4       4

San Diego   Grass    Dirt      Plane     Street    Roofing   Cement   Car     Found   Matched
            3.0577   N/M       15.7316   38.7782   N/M       N/M      N/M     3       3

Table 7: The results of unmixing on the synthetic data without running dimension reduction.


(a) Results for Fake Urban.

          ICA - 3    ICA - 4    ICA - 5    PCA - 3      PCA - 4      PCA - 5
Car       17.1628    25.5465    37.1267    11.1771      2.0607       0.1336
Roofing   16.0306    N/M        N/M        N/M          N/M          0.3075
Road      N/M        N/M        N/M        6.9818       3.7827       0.2083
Dirt      N/M        N/M        N/M        6.5991       2.079        0.1672
Tree      N/M        N/M        N/M        14.6616      1.7435       0.4505
Grass     N/M        N/M        N/M        16.9262      2.0893       0.5688

          KPCA - 3   KPCA - 4   KPCA - 5   Isomap - 3   Isomap - 4   Isomap - 5
Car       3.8302     3.7915     3.4849     14.0413      10.5166      9.2738
Roofing   7.785      6.6579     6.0378     21.5723      11.1113      10.517
Road      10.4356    9.5683     4.7265     18.2611      14.3377      9.386
Dirt      N/M        N/M        N/M        5.8537       4.9535       4.1337
Tree      15.9174    11.2643    8.8447     N/M          11.407       12.0029
Grass     30.1688    30.4059    N/M        N/M          N/M          31.1289

(b) Results for Fake San Diego.

          ICA - 3    ICA - 4    ICA - 5    PCA - 3      PCA - 4      PCA - 5
Grass     24.6383    31.8303    17.34      2.0846       0.8043       0.7245
Dirt      80.884     13.4311    6.7828     11.9723      4.9288       4.3842
Plane     20.4755    4.5548     2.9187     14.9182      15.6564      15.4881
Street    N/M        17.2976    16.2582    2.5456       0.7424       0.7383
Roofing   N/M        17.4452    7.9619     N/M          N/M          N/M
Cement    66.783     19.9295    11.3591    9.6941       2.5046       1.9937
Car       N/M        N/M        2.5806     N/M          N/M          N/M

          KPCA - 3   KPCA - 4   KPCA - 5   Isomap - 3   Isomap - 4   Isomap - 5
Grass     27.828     24.6011    23.8715    28.8731      31.3294      20.3963
Dirt      11.4607    10.8406    12.4254    39.2464      20.9429      11.844
Plane     N/M        N/M        N/M        N/M          34.111       8.5502
Street    6.4651     7.82       8.066      N/M          N/M          18.7444
Roofing   N/M        N/M        N/M        31.0046      41.9153      22.8257
Cement    8.4147     6.0202     5.314      N/M          N/M          29.3472
Car       N/M        N/M        N/M        N/M          N/M          N/M

Table 8: Root mean squared error in unmixed image for each material. N/M indicates that no endmember was found that matched the material.
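
For reference, root mean squared error is the square root of the mean squared difference between two arrays. The sketch below assumes the comparison is between an estimated abundance map for one material and a reference map of the same shape. The exact quantities and scaling compared in the experiments are not restated in this appendix, so that pairing, the function name `rmse`, and the toy arrays are assumptions for illustration only.

```python
import numpy as np

def rmse(estimated, reference):
    """Root mean squared error between two arrays of equal shape."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((estimated - reference) ** 2)))

# Toy 2x2 abundance maps for a single material (hypothetical data).
reference = np.array([[0.0, 1.0], [0.5, 0.5]])
estimated = np.array([[0.1, 0.9], [0.5, 0.7]])
err = rmse(estimated, reference)
```

Lower values indicate that the unmixed image for that material is closer to the reference, which is how the columns of the table can be compared across methods and dimensions.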


References

[1] Fong, M. (2007, August 31). Dimension Reduction on Hyperspectral Images.

[2] van der Maaten, L. (2008, November). Matlab Toolbox for Dimensionality Reduction v0.7. http://ticc.uvt.nl/~lvdrmaaten/Laurens_van_der_Maaten/Matlab_Toolbox_for_Dimensionality_Reduction.html

[3] RSI (2008). ENVI Version 4.6.1 Computer Software. Research Systems Inc.

[4] Jolliffe, I. T. (1986). Principal Component Analysis (2nd ed.). New York: Springer-Verlag.

[5] Schölkopf, B., Smola, A., & Müller, K. (1998). Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation, 10, 1299–1319.

[6] Tenenbaum, J. B., de Silva, V., & Langford, J. C. (2000). A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science, 290, 2319.

[7] Lafon, S., & Lee, A. B. (2006, September). Diffusion Maps and Coarse-Graining: A unified framework for dimensionality reduction, graph partitioning, and data set parameterization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(9), 1393–1403.

[8] Belkin, M., & Niyogi, P. (2003, June). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6), 1373–1396.

[9] Hyvärinen, A., & Oja, E. (2000). Independent component analysis: algorithms and applications. Neural Networks, 13(4–5), 411–430.

[10] Venna, J. (2007, June 8). Dimensionality Reduction for Visual Exploration of Similarity Structures. (Doctoral dissertation, Helsinki University of Technology, 2007).

[11] Zhang, Z., & Zha, H. (2002). Principal manifolds and nonlinear dimension reduction via local tangent space alignment. (Tech. Rep. No. CSE-02-019). Pennsylvania State University, Department of Computer Science and Engineering.

[12] US Army Topographic Engineering Center, "HyperCube." http://www.tec.army.mil/Hypercube/

[13] Web Site for the University of Virginia's Long Term Ecological Research Program [Online]. Available: http://www.vcrlter.virginia.edu

[14] Canty, M. J. (2007). Image Analysis, Classification, and Change Detection in Remote Sensing with Algorithms for ENVI/IDL. Florida: CRC Press.


[15] Chang, C.-I., & Ren, H. (2000, March). An Experiment-Based Quantitative and Comparative Analysis of Target Detection and Image Classification Algorithms for Hyperspectral Imagery. IEEE Transactions on Geoscience and Remote Sensing, 38(2), 1044–1063.

[16] Winter, M. E. (2000). N-FINDR: an algorithm for fast autonomous spectral endmember determination in hyperspectral data.
