1 Information Extraction Principles for Hyperspectral Data David Landgrebe Professor of Electrical & Computer Engineering Purdue University [email protected].

1

Information Extraction Principles for Hyperspectral Data

David LandgrebeProfessor of Electrical & Computer Engineering

Purdue [email protected]

• A Historical Perspective• Data and Analysis Factors• Hyperspectral Data Characteristics• Examples• Summary of Key Factors

Outline

2

1957 - Sputnik

REMOTE SENSING OF THE EARTH

Atmosphere - Oceans - Land

Brief History

1958 - National Space Act - NASA formed

1960 - TIROS I

1960 - 1980 Some 40 Earth Observational Satellites Flown

3

Image Pixels

Enlarged 10 Times

Thematic Mapper Image

4

Three Generations of Sensors

Band No.

Re

lativ

e R

esp

on

se

05

101520253035404550

1 2 3 4

Green Veg.

Bare Soil6-bit data

MSS1968

Band No.

Re

lati

ve

Re

sp

on

se

0

50

100

150

200

1 2 3 4 5 6 7

Green Veg.

Bare Soil

8-bit data

TM1975

Wavelength (µm)

Re

lativ

e R

ad

ian

ce

Re

sp

on

se

0

500

1000

1500

2000

2500

0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4

Water

Emerging Crop

Trees

Soil

10-bit data

Hyperspectral1986

5

Systems View

Sensor On-BoardProcessing

PreprocessingData

AnalysisInformationUtilization

Human Participationwith Ancillary Data

Ephemeris,Calibration, etc.

6

Scene Effects on Pixel

7

Data Representations

Image Space Spectral Space

0

1000

2000

3000

0.40 0.80 1.20 1.60 2.00 2.40

Wavelength (µm)

Water Trees Soil

Feature Space

0

200

400

600

800

1000

1200

1400

1600

0 500 1000 1500 2000

Band 0.60

Water

Trees

Soil

Sample

• Image Space - Geographic Orientation

• Feature Space - For Use in Pattern Analysis• Spectral Space - Relate to Physical Basis for Response

8

Data Classes

9

SCATTER PLOT FOR TYPICAL DATA

30

60

90

120

150

180

210

17 34 51 68 85 102

BiPlot of Channels 4 vs 3

Channel 3

++ +++

+++++++ ++++

+++++

++++++

+++++++++ +++ +

+ + +++++ +++

+

+ ++++++++++++++ +

+

+ +++ +

++

+

++

+++ +

+++ +++

++ +++++++

+++++ +

++++ + + + + ++

++++++++

+++ ++ ++++

++++ +++++

+ + ++++ ++

+ +++

++++++

++

+ +

++

++++

+++

++

++++

+++

++

++++++

++

+ + + +++

++++++ +

++++++

+

+

+++++

+

+

+

++++

+++++

+

+

+

++

+++

+++

+++

++

+++++++

++++ ++

+ ++++++++ ++ ++

++

++++++ ++++++++

+++

++ +++++++

+++

+++++++++

++++

+

++++

++

+++++++

+

+++++++

+++

++ + +

+++

+

+++++++++

+++

+++

+++

+++

+++ +

++++

+

+

++ ++++

++++++

++++

+++ ++

+++++++ +

+ ++++++++++ + +

+

+++++++++

++ +++

+++

++++++++++

++++ ++

+

+

+++++

+++ +

++++

++++

+++++++

+ ++ ++

++ + +

+++ +

+

+++++++ +

+

+++

+

++ ++

+

+++

++++++++ +

++

+++

++++

++++++++ +++

++++++++++

++

++

++

+++++++ +

+

+

+

+ ++

++

++++++

+ +++

++

+++++++

++ +++++ ++ + +

+++++

+ +

+++++

++++

+++

++

+++

++

++++++

++++

+

+++ +++

+

+

++++

+++++ +

++

+

+ +++

+++

++++

+

+

++ +

++++

+

+++

+++

++++

++++

+

++

++++++++++

+++++

+

++

+

++

+++

++

+++ +

++ ++

+++++

++++

+++ ++

+++

+++

++

+

+++

+++++++ +

++

++++++

+

++

++++

+++

+ ++

+ +

++

+++

++

++++++ +

+++

+++

++

+++++

+++

+++ + ++

+++++

++

++

++ +

+ + ++

+++

++

++

+

+++

+++++

+

+ + +

++

+

++++ +

+ ++

+ ++

+

+++

+

+++

+++ +

++++++

+

++

+

+ ++

++

+

+

++

+

++++++ ++

+

+++

+

+

++

++

++ ++

++

++

+++

+

+++

+

+ ++++++

++++

+++

+ ++

++

+++++++

++

+++

++

+++

++

++

+++ +

++

+ ++++

+ +++ +

+++

++++ +

++

+++

+

+ +

++++

++

+++

+++

+

+

++

+ +++ + +

+

++

+

++ ++

+++

+++

+

+

+++++

+

+++

++

++++ + +

++

++++++

+

+++++

++

++

+

+++ +++

++

+++

++

++

+ + ++

+ +++

+

+

+++ ++

+

+ ++

+++

++ ++

+ + +

++++++++

++

+++++++

++ ++++

+++

++++

+++++++

++

++ +

+

++++

+

+++

+++++++

++ +++ +

+

++

++ +

+++

+

++++++++++

+++

+

+++ ++

++++

++

+++

+++

++

+ ++

++++++

++++

+

+++

+ +

++

+++

+ ++

++

+ +++

+

+++

++++

++

+

++ +

+

+

++++

++++

++++

+

++ ++

++++++++++

++

++

+ +

+

++

++

+

++

++

++

++

++++++

+

++

+ +

+

++++

+

++

++

+ + + ++

++

+ ++

+

++

+++

++

+++++

++

+

++++

+

+++

+++

+

+

+

++

+++

++++

+

+++

++

++ +

+

+

+++

++

++

+

++

+

+

++

+++

+++

+

++

+

+ ++

++

+++

+++

++

+++

++

++

+ +++

++ +

++ ++

+ +

+++++++

+++

++

+ ++++++

+

+ +++

++++

++

++

++

+

+++++++++

+

++

+ ++

+++

++

+++

++

++

+++

++

+++++++

+

++

+++

++

+ +

+

+

+

+++

+++

++

++

++

+

++

+

+

+

+

+++

+

+

+

++

+

++

++

+

+ +

+++

++

++++ +

+

++

+++

+

++

+

+

+

++

+

+

+++

+ +

+++

+

+

+

++++

+

++

++

++++

++++

++++

+

++ +

+ ++

+++

++

+ +

++++

+ ++ +

+

+++

++

++

+

++

+ +++

+

+ +

++++

+

++++

++

+++++

+

+

+

++

+

++

+

+

++

++++

+

+

++ ++

+

++

++++

+++

+

+

++

+

+

+

++++

+++

+

++

++

+

++++

++

+

+

+

+

+

+

++

+

++

+++

++

++

+++++

++

+

+

+ ++

++

++

++

+++

+

+ +

+++

++++

++++++

++

++

+

+

+

+++

+

+++++ +

+

+

++ ++++

+

+ +++

+

+

+

+++

+ ++

++

+ +

+

+

++

+

+ +++ ++

+

+++

++

+

+++

+

++

+

++

++

+

+

++ ++

++

+++++++

+

+

+

+

+

+++

+

+++

++++

++

++

++

+

+++++

++ +

+

+

+

+

+

+

++

+

+

+

+

+++++

+

+

+

++

+++

++

++++

+

++ +

+

++

++

++++ +

++

+++

+

++ +

+

+

+ ++ +

++

+

+

+

++

+

+

++

++

+

+

+++

++

+ +++

++

+

+

++

+

+++++

+++

+

+

+

++++++++++

++

++

++

+

++

+

+

+

++++

+

+

+

+++

++++

++

+

+++

++

++++

++

++

++

+

+

++

++++

+

+

+++ +++ +

+++

+

+

++

++

++

++++

+

+++++

+++++++ +

+

+

+ ++

+

+++ +

++

++

++

++ ++

+++ ++

++ ++++

++

+++ +

+++++

+

+++++ +

++

+

+ ++

+

+++

+

+++

+++

+ ++++

++

+++++

+

++

+++

++

+++ +

+

++++

+++++++

+

+++

+++ +

++ ++

+ ++

+++

+++

++++

+++

+

+++

+

+

+++

++++ ++ +

+++

++ ++++++

+ ++

++

++

++

++ +

+ +++

+++

+

+

+++ ++++

+

++++

++

+ ++ +

++

++

++

++

+

+ +++

+

++

+++

+++

++++

+++++++

+

+

++

++

++ ++++

+ +

+

+++

+

+++++++++

+ ++

+

+

+

++ +++

++

++++++++ ++++

+ +++

+

++

+

++++

+

+

++

+++++

++ + +++

+

++

++

++ ++

+ +

+

++

+++ +

++

++

+

+++++

++ ++++

+ ++

+

+++ +

+

++

+++

+++ ++

++

+++++

++

+++++

++ +

+++

+++++++ +

+ + ++ + +++++

++++

+++

+++

+ ++

+

++++ +++++ ++

++++ +

++

++ +++

+++++++ +++

+++

+

++

+ +

+

+++

++

+

++

++

++

+

++

++ +

++++

+++

++

++ +

++

++

++++

+

++++

+++

+

+

+

+

+

+

+ + ++ + + ++++

++

+ ++++ + ++ +++

++

+++++++++

+

+

++++++++ ++

++++++

+

++

+

+

+

+

+

+

+

+++++++

++

+ ++

+++++

++

++

++++++

+

+

+++

++

+++++++++++

+

+

++

++

+

++++++ +

++

++

+

++

++

+

+

+

+

+

+

++

+++++

+

++

+

++

++++

+

+

+

+

+

++++

++

++

+

++

++

+

+

+

+ ++

+

+++ +

+

+ +

++

++

++

+

+++

+

+

+

+

++

+

++

+

+

+

+

++

+++

+++

+

+++

+

+

+

+

++

++

++

+

++

++

+

++

++++

++

+

+++

+

++

+

+

+

++

+

++

+

++

+ +

+

+

+ + ++++

+

++

+ +

+

+

++ +

+

+

++

+

+

++

+

+

+ +

+

++++

+

+++

+

+

+

+++ +

++

+ +

+++

+

++

+

++

+ +++++

+

10

BHATTACHARYYA DISTANCE

B 1

81 2

T 1 2

2

1

1 2 1

2Ln

1

21 2

1 2

Mean Difference Term Covariance Term

11

Vegetation in Spectral Space

Laboratory Data: Two classes of vegetation

12

Scatter Plots of Reflectance

0.720.710.700.690.680.670.660.6510

12

14

16

18

20

22

24

Class 1 - 0.67 µm

Class 2 - 0.67 µmClass 1 - 0.69 µm

Class 2 - 0.69 µm

Scatter of 2-Class Data

Wavelength - µm

Refl

ect

ance

- %

13

Vegetation in Feature Space

15141312111016

17

18

19

20

21

22

23

Class 1

Class 2

Samples from Two Classes

% Reflectance at 0.67 µm

% R

efl

ecta

nce a

t 0

.69

µm

14

Hughes Effect

m=25

10

20

50100

200

1000

500

m =

1 100050020010050201052

MEASUREMENT COMPLEXITY n (Total Discrete Values)

0.50

0.55

0.60

0.65

0.70

0.75M

EA

N R

EC

OG

NIT

ION

AC

CU

RA

CY

G.F. Hughes, "On the mean accuracy of statistical pattern recognizers," IEEE Trans. Inform. Theory., Vol IT-14, pp. 55-63, 1968.

15

A Simple Measurement Complexity Example

16

Classifiers of Varying Complexity

• Quadratic Form

gi(X) = 1

2(X i )

T i 1(X i )

1

2ln i

• Fisher Linear Discriminant - Common class covariance

gi(X) = 1

2(X i )

T 1(X i )

• Minimum Distance to Means - Ignores second moment

gi(X) = 1

2(X i )

T (X i )

17

Classifier Complexity - con’t• Correlation Classifier

gi(X) XT i

XTX iT i

• Spectral Angle Mapper

gi(X) cos 1 XTiXTX i

Ti

• Matched Filter - Constrained Energy Minimization

gi(X) XTCb

1i iTCb

1 i• Other types - “Nonparametric”

Parzen Window Estimators Fuzzy Set - based Neural Network implementations K Nearest Neighbor - K-NN etc.

18

Covariance Coefficients to be Estimated

• Assume a 5 class problem in 6 dimensions

• Normal maximum likelihood - estimate coefficients a and b• Ignore correlation between bands - estimate coefficients b

• Ignore correlation between bands - estimate coefficients d

Class 1 Class 2 Class 3 Class 4 Class 5b b b b ba b a b a b a b a ba a b a a b a a b a a b a a ba a a b a a a b a a a b a a a b a a a ba a a a b a a a a b a a a a b a a a a b a a a a ba a a a a b a a a a a b a a a a a b a a a a a b a a a a a b

• Assume common covariance - estimate coefficients c and d

Common Covar.dc dc c dc c c dc c c c dc c c c c d

19

EXAMPLE SOURCES OFCLASSIFICATION ERROR

Decision boundary defined by the

diagonal covariance classifier

class 2

class 1

Decision boundary defined by Gaussian ML classifier

20

Number of Coefficients to be Estimated

• Assume 5 classes and p features

No. ofFeatures p

Class Covar.

(a & b above)5{{ p+1)p/2}

Diagonal ClassCommon Covar.

(b above)5p

CommonCovar.

(c & d above){ p+1)p/2}

Diagonal CommonCovar.

(d above)p

5 75 25 15 510 275 50 55 1020 1050 100 210 2050 6375 250 1275 50

200 100,500 1000 20,100 200

21

Intuition and Higher Dimensional Space

Borsuk’s Conjecture: If you break a stick in two, both pieces are shorter than the original.

Keller’s Conjecture: It is possible to use cubes (hypercubes) of equal size to fill an n-dimensional space, leaving no overlaps nor underlaps.

Science, Vol. 259, 1 Jan 1993, pp 26-27

Counter-examples to both have been found for higher dimensional spaces.

22

The Geometry of High Dimensional Space

The Volume of a Hypercube concentrates in the corners

0.6

1 2 3 4 5 6 70

0.2

0.4

0.8

1

dimension d

The Volume of a Hypersphereconcentrates in the outer shell

1 2 3 4 5 6 7 8 9 10 110

0.2

0.4

0.6

0.8

1

dimension d

Vd (r ) Vd (r )

Vd (r)rd (r )d

rd1 1

r

d

d 1

V hypersphere

Vhypercube

d

2

d2d 1 d2

d 0

23

Some Implications

High dimensional space is mostly empty. Data in high dimensional space is mostly in a lower dimensional structure.

Normally distributed data will have a tendency to concentrate in the tails; Uniformly distributed data will concentrate in the corners.

24

Volume of a hypersphere =2rd

dd / 2

(d / 2)

How can that be?

dVdr

2d / 2

(d / 2)r (d 1)

Differential Volume at r =

0 1 2 3 4 50

20

40

60

80

Distance from Class Mean, r

1

2

3 4 5

Surface of Hypersphere

Volumn of shell

25

How can that be? (continued)

rd 1e r

2

2

2d2 1d2

The Probability Mass at r =

0 1 2 3 4 50

0.2

0.4

0.6

0.8

Distance from Class Mean, r

1

2 3 4 5 10 15 20

Probability Density of Distance r

Probability mass in shell

26

MORE ON GEOMETRY

• The diagonals in high dimensional spaces become nearly orthogonal to all coordinate axes

cos d 1d

Implication: The projectionof any cluster onto anydiagonal, e.g., by averagingfeatures could destroy information

27

STILL MORE GEOMETRY

• The number of labeled samples needed for supervised classification increases rapidly with dimensionality

In a specific instance, it has been shown that the samples required for a linear classifier increases linearly, as the square for a quadratic classifier. It has been estimated that the number increases exponentially for a non-parametric classifier.

• For most high dimensional data sets, lower dimensional linear projections tend to be normal or a combination of normals.

28

A HYPERSPECTRAL DATA ANALYSIS SCHEME

200 Dimensional Data

Class ConditionalFeature Extraction

FeatureSelection

Classifier/Analyzer

Class-SpecificInformation

29

Finding Optimal Feature Subspaces

• Feature Selection (FS)

• Discriminant Analysis Feature Extraction (DAFE)

• Decision Boundary Feature Extraction (DBFE)

• Projection Pursuit (PP)

.Available in MultiSpec via WWW at: http://dynamo.ecn.purdue.edu/~biehl/MultiSpec/Additional documentation via WWW at: http://dynamo.ecn.purdue.edu/~landgreb/publications.html

30

Hyperspectral Image of DC Mall

HYDICE Airborne System1208 Scan Lines, 307 Pixels/Scan Line210 Spectral Bands in 0.4-2.4 µm Region155 Megabytes of Data(Not yet Geometrically Corrected)

31

Define Desired Classes

Training areas designated by polygons outlined in white

32

Thematic Map of DC Mall

Legend Operation CPU Time (sec.) Analyst TimeDisplay Image 18Define Classes < 20 min.Feature Extraction 12Reformat 67Initial Classification 34Inspect and Mod. Training ≈ 5 min.Final Classification 33

Total 164 sec = 2.7 min. ≈ 25 min.

Roofs

Streets

Grass

Trees

Paths

Water

Shadows

(No preprocessing involved)

33

Hyperspectral Potential - Simply Stated

• Assume 10 bit data in a 100 dimensional space.• That is (1024)100 ≈ 10300 discrete locations

Even for a data set of 106 pixels, the probability

of any two pixels lying in the same discrete location

is vanishingly small.

34

Summary - Limiting Factors

PreprocessingData

AnalysisInformationUtilization

Human Participationwith Ancillary Data

Sensor On-BoardProcessing

Ephemeris,Calibration, etc. • Scene - The most complex

and dynamic part

• Sensor - Also not under analyst’s control

• Processing System - Analyst’s choices

35

Limiting Factors

Scene - Varies from hour to hour and sq. km to sq. km

Sensor - Spatial Resolution, Spectral bands, S/N

Processing System -

• Classes to be labeled

• Number of samples to define the classes

• Complexity of the Classifier

• Features to be used

- Exhaustive,

- Separable,- Informational Value,

36

Source of Ancillary Input

Possibilities

• Ground Observations

• “Imaging Spectroscopy”

- From the Ground

- Of the Ground

• Previously Gather Spectra

• “End Members”

Image Space

Spectral Space

Feature Space

.

37

Use of Ancillary Input

A Key Point:

• Ancillary input is used to label training samples.

• Training samples are then used to compute class quantitative descriptions

Result:

• This reduces or eliminates the need for many types of preprocessing by normalizing out the difference between class descriptions and the data

1 Information Extraction Principles for Hyperspectral Data David Landgrebe Professor of Electrical & Computer Engineering Purdue University [email protected].

Documents

ba ba ba ba b

typical data slide

pixel slide

response slide

moment slide

feature space slide

coefficients c

classes of vegetation