Pine Creek Example 1

This example follows the Hematite and Methanol examples. With one major difference, Hematite and Methanol cubes are generated in laboratory conditions and are easier to analyse. For the Pine Creek example, the HSI cube comes from a satellite.

Before the analysis, four bands ranges were removed. They correspond to atmospheric scattering. Twelve endmembers was extracted and their corresponding abundance maps generated by NNLS. After inspecting the abundances maps, five endmembers are kept and are used to create the classes’ maps. For the classification step SAM and SID is used. For this particular case, SID gives a better result than SAM.

Finally, at the end of this document you can compare the endmembers classification to a supervised ECHO clustering made by David Landgrebe with the MultiSpec software. ECHO clustering do a right classification for 16 features, ECHO is a clear winner here.

In conclusion, endmembers extraction carries useful information and can be use to do a 'preview or quick' classification. However, it does not beat a standard clustering algorithm. Abundance maps can be analysed individually. By example, EM4 and EM8 show many of the buildings and roads and EM2 give a good distribution of the forest. The former is the subject for the next example on Pine Creek.

In [1]:
%matplotlib inline
import os.path as osp
import numpy as np

import pysptools.util as util
import pysptools.eea as eea
import pysptools.abundance_maps as amp
import pysptools.classifiers as cls
import pysptools.material_count as cnt

def remove_bands(M):
    """
    Remove the bands where there is atmospheric
    scattering.
    Remove:
        [0..4]
        [102..110]
        [148..169]
        [211..end]
    """
    p1 = range(5,102)
    p2 = range(111,148)
    p3 = range(170,211)
    Mp = M[:,:,p1+p2+p3]
    return Mp

data_path = '../data1'
sample = '92AV3C.hdr'

data_file = osp.join(data_path, sample)
data, header = util.load_ENVI_file(data_file)

# Remove some bands
data_clean = remove_bands(data)
# reduce the number of bands
header['wavelength'] = range(1, data_clean.shape[2]+1)

# Display the Pine Creek cube
util.display(data, 74, 46, 18)

HySime do a good guess.

In [2]:
hy = cnt.HySime()
kf, Ek = hy.count(data_clean)
print 'Testing HySime'
print '  Virtual dimensionality is: k =', kf
Testing HySime
  Virtual dimensionality is: k = 14

In [3]:
def SAM(data, E, thrs=None):
    sam = cls.SAM()
    cmap = sam.classify(data, E, threshold=thrs)
    sam.display(colorMap='Paired', suffix='Pine Creek')

def SID(data, E, thrs=None):
    sid = cls.SID()
    cmap = sid.classify(data, E, threshold=thrs)
    sid.display(colorMap='Paired', suffix='Pine Creek')

We extract 12 endmembers.

In [4]:
ee = eea.NFINDR()
U = ee.extract(data_clean, 12, maxit=5, normalize=False, ATGP_init=True)
ee.display(header, suffix='Pine Creek')
<matplotlib.figure.Figure at 0xb271e48>

And generate the corresponding abundance maps. Abundance maps 2, 4, 6, 7 and 12 are used for the classification step.

In [5]:
am = amp.FCLS()
amaps = am.map(data_clean, U, normalize=False)
am.display(colorMap='jet', suffix='Pine Creek')
<matplotlib.figure.Figure at 0x17449b38>
In [6]:
# U2 is a library made of 5 selected endmembers
U2 = U[[1,3,5,6,11],:]
SAM(data_clean, U2, [0.1,0.3,0.2,0.2,0.15])
SID(data_clean, U2, [0.1,0.2,0.1,0.2,0.03])
# The best results are obtained with SID, buildings and roads are
# well classified
<matplotlib.figure.Figure at 0x1725c4a8>
In [7]:
# Display an ECHO classification
# Reference:
#   Landgrebe, David, 1997, Multispectral Data Analysis: A Signal Theory Perspective, 
#   School of Electrical Engineering, Purdue University.
from IPython.core.display import Image 
Image(filename=r'C:\dev\pysptools\examples\pine_creek_echo_class.png')
Out[7]: