Principal Components Analysis (PCA)

We are now ready to survey the last two types of image enhancement discussed in this Tutorial. Both are also suited to Information Extraction and Interpretation, but are treated separately from Classification (considered later in the Section). We begin with a quick run-through of images of Morro Bay produced by Principal Components Analysis (PCA), with explanation and commentary. You have the option of going directly to Appendix C, which treats the theory of PCA; the theory is "tough", so take a look at the link and then decide. But first, a digression to discuss the possible effects of interband correlations.

You may have noticed in this Section that the Band 1 and Band 2 images look almost alike. The Band 5 and Band 7 images are similar. Bands 1 and 4 appear dissimilar. The main reason is that the spectral responses in, say, Bands 1 and 2 for a given object or class are close (in a spectral signature plot, the intensities in the two bands, despite the moderate spectral separation, are often about the same). If the reflectance levels (as DNs) of all pixels in TM Band 1 are plotted against those in TM Band 2, this results:

The slightly scattered data form a narrow plot that is almost a straight line. The two bands are said to be correlated; that is, as Band 1 varies, so does Band 2, and either could be used in place of the other. But there are 7 bands in the TM data set, and some are sufficiently different from others as to behave as though uncorrelated. Principal Components Analysis is a mathematical technique that uses statistical methods to decorrelate the data and reduce redundancy. All available bands participate in the calculations, and a new set of data values - the components - results. Here is a plot of the resulting first two components, showing the now emphatic dissimilarities (again, for details you should work through Appendix C); note that the distribution of plotted values is spread apart, i.e., the data set is bimodal:

Having seen these two plots, we will digress for a moment to consider this general statistical property of data sets. Two measurements of the same variable, say reflectance, using two different ranges (for sensor data, two wavelength intervals or bands, here expressed as X and Y) will lead to several patterns in an X-Y two dimensional plot:

In the upper left diagram, a large number of measurements of the two variables X and Y plot in a Scatter Diagram along a central (unshown) straight line, on either side of which the points vary a little. This case denotes a high positive correlation: over the entire range, any value of X (Measure 1) can predict the corresponding value of Y (Measure 2), and as X values increase, so do Y values. In the upper right diagram, a high negative correlation is indicated; as X values increase, their corresponding Y values decrease. The bottom two diagrams show a wide spread or scatter of values. There the X and Y values are said to be poorly correlated, and predicting Y from any X is hampered because a wide range of Y values is possible. These diagrams represent low correlations, in which the probability of getting any specific value is low. As this applies to measurements of TM bands, 1 vs 2 produces a high correlation (so that their ability to discriminate is poor) whereas 2 vs 7 may give a low correlation (meaning that together they may yield independent assessments).
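The degree of correlation seen in such scatter diagrams can be put into a single number, the Pearson correlation coefficient, which runs from +1 (high positive) through 0 (uncorrelated) to -1 (high negative). The following is a minimal sketch using NumPy; the DN values are synthetic stand-ins invented for illustration (not Morro Bay data), and the names `band_a`, `band_b`, `band_c`, and `band_correlation` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic DNs (assumed for illustration): band_b tracks band_a closely,
# while band_c is generated independently of band_a.
band_a = rng.uniform(20, 120, size=5000)
band_b = 0.9 * band_a + rng.normal(0, 3, size=5000)
band_c = rng.uniform(20, 120, size=5000)


def band_correlation(x, y):
    """Pearson correlation coefficient between two sets of DN values."""
    return float(np.corrcoef(x, y)[0, 1])


r_positive = band_correlation(band_a, band_b)    # near +1: narrow rising plot
r_negative = band_correlation(band_a, -band_b)   # near -1: narrow falling plot
r_low = band_correlation(band_a, band_c)         # near 0: wide scatter

print(r_positive, r_negative, r_low)
```

A value near +1 or -1 means one band largely predicts the other; a value near 0 means the two bands carry largely independent information.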

Let's check TM bivariate scatter plots for the Morro Bay bands. Various combinations were tried. One interesting one is Band 2 (abscissa) vs Band 4 (ordinate):

This plot shows a bimodal distribution of data points. The upper (blue & purple dots) plot shows strong correlation (spread around a mean line is small) between the two bands. This plot is for all the water in the scene - the DNs for water extend over a wide range of values but value changes in one band are matched by similar change increments in the other band. The second plot (orange/yellow/green) is for all other classes in the image. There is strong correlation when DN values are low but as these increase for both bands the plot widens. This means that for much of the DN value range, the two bands are less correlated and should serve increasingly well as discriminators in any classification.

Now, back to PCA: Let's assume for this page that you now know that the PCA output is a new set of DN pixels for each derived component. That DN set can be made to appear as an image that resembles to some extent any of the individual TM bands. We will now look at each of these components as images, keeping in mind that many of the tonal patterns in individual components do not seem to match spatially any specific features or classes identified in the TM bands; they represent linear combinations of the original values instead. We make only limited comments on the nature of those patterns that lend themselves to some interpretation.
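For readers who want to see how a set of component images can be derived as linear combinations of the band values, here is a minimal sketch of one common formulation: an eigen-decomposition of the band-to-band covariance matrix. It is not necessarily the exact procedure used by the software that produced the Morro Bay images. The 7-band scene is synthetic, and the function name `principal_components` and the weights used to build the scene are assumptions made for illustration.

```python
import numpy as np


def principal_components(bands):
    """PCA of a multiband image of shape (n_bands, rows, cols).

    Returns the component images, their variances (largest first),
    and the eigenvectors (one column per component).
    """
    n_bands, rows, cols = bands.shape
    pixels = bands.reshape(n_bands, -1).astype(float)    # one row per band
    centered = pixels - pixels.mean(axis=1, keepdims=True)
    cov = np.cov(centered)                               # band-to-band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)               # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                    # reorder: largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = eigvecs.T @ centered                           # linear combinations of bands
    return pcs.reshape(n_bands, rows, cols), eigvals, eigvecs


# Synthetic 7-band scene (assumed): two underlying patterns plus a little noise,
# so that several bands are strongly correlated with one another.
rng = np.random.default_rng(0)
shape = (64, 64)
brightness = rng.normal(0, 20, size=shape)
contrast = rng.normal(0, 8, size=shape)
w_bright = [1.0, 1.0, 0.9, 0.6, 0.8, 0.5, 0.8]
w_contrast = [-0.3, -0.2, -0.4, 1.0, 0.5, 0.1, 0.2]
scene = np.stack([
    100 + a * brightness + b * contrast + rng.normal(0, 1, size=shape)
    for a, b in zip(w_bright, w_contrast)
])

pcs, variances, eigvecs = principal_components(scene)
pc_cov = np.cov(pcs.reshape(7, -1))   # off-diagonal terms are ~0: decorrelated

print(variances / variances.sum())    # fraction of total variance per component
```

In this sketch the components come out ordered by decreasing variance, their mutual covariances vanish, and the total variance of the original bands is preserved.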

1-14: After reading through the special review of PCA accessed by link, plus the above paragraph, see if you can come up with a single key word (or perhaps a key idea in several words) that describes the main benefit from using Principal Components Analysis.

The first Principal Component contains the maximum amount of variation in the 7-dimensional space defined by the seven Thematic Mapper bands. The image produced from PC 1 data commonly resembles an actual aerial photograph.

In fact, this is the normal character of the first component, in that it broadly simulates standard black and white photography and it contains most of the pertinent information inherent to a scene. The hills appear more realistic because the sharp light-dark contrast in most TM bands is subdued. Note the internal structure of the waves and the absence of any indication of sediment load in the sea. The histogram of the first PC shows two peaks. The first, on the left, represents the ocean pixels and the second one, to the right, the land pixels.

1-15: Describe this image relative to, say, the histogram-equalization stretched image seen on the previous page.

When we look at the histogram of the second PC (PC2), we see that even though the total range (maximum value - minimum value) is greater than for the first PC, most of the pixels fall in a small range around the mean of 49. Thus, as expected from the way PCA orders its components, the second PC has a smaller variance (variance is standard deviation squared) than the first PC. Since the bulk of the pixels fall in such a narrow range, the image does not seem to have as many interpretable patterns relatable to the classes as does any TM image (Band 3 appears in the next left image).
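That a data set can have a wider total range yet a smaller variance may seem odd at first; it happens when a handful of extreme pixels stretch the range while nearly all the rest cluster tightly. A minimal sketch with made-up numbers follows (only the mean of 49 is taken from the text; every other value, and the names `broad`, `narrow`, and `dn_range`, are assumptions for illustration).

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up values: "broad" is spread evenly over its range, "narrow" clusters
# tightly around 49 except for a few extreme pixels.
broad = rng.uniform(0, 200, size=10_000)
narrow = rng.normal(49, 4, size=10_000)
narrow[:5] = [-80, -60, 180, 200, 220]


def dn_range(values):
    """Total range: maximum value minus minimum value."""
    return float(values.max() - values.min())


print("range   ", dn_range(broad), dn_range(narrow))
print("variance", broad.var(), narrow.var())
print("std**2  ", narrow.std() ** 2)   # equals the variance of "narrow"
```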

To make the PC2 image (above right) viewable, we had to expand its raw data set (extend the DN range numerically) with a histogram equalization stretch. This procedure produces a histogram in which the spacing between the most frequent values is increased while the less frequent values are combined and compressed. If we had not done this transformation, the image would appear tonally flat, with only two gray levels defining most of the land surfaces and one gray level defining the ocean. Some distinctions in the TM image that were previously small are now singled out and easier to see on the computer display. The breaker waves are uniquely singled out as very bright.
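The histogram equalization stretch just described can be sketched in a few lines. This is a basic textbook form, assuming the component has already been rescaled to 8-bit integer DNs (0 to 255); it is not claimed to match the exact routine in the software used here. The input image is synthetic and the function name `equalize` is hypothetical.

```python
import numpy as np


def equalize(image):
    """Histogram-equalize an 8-bit image (integer DNs from 0 to 255)."""
    hist = np.bincount(image.ravel(), minlength=256)     # pixel count per DN
    cdf = np.cumsum(hist) / image.size                   # cumulative fraction, 0..1
    lookup = np.round(cdf * 255).astype(np.uint8)        # new DN for each old DN
    return lookup[image]


# Synthetic "flat" image (assumed): nearly all DNs crowd around 49.
rng = np.random.default_rng(0)
raw = np.clip(np.round(rng.normal(49, 3, size=(64, 64))), 0, 255).astype(np.uint8)
stretched = equalize(raw)

raw_range = int(raw.max()) - int(raw.min())
stretched_range = int(stretched.max()) - int(stretched.min())
print(raw_range, stretched_range)
```

Because each output DN follows the cumulative pixel count, frequently occurring input values are pushed far apart, while rarely occurring ones land close together or merge.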

1-16: Make some general observations on how the tonal patterns in PC 2 differ from patterns observed in, for instance, Band TM 3.

Some of the gray patterns in the PC3 image below can be broadly correlated with two combined classes of vegetation:

The brighter tones come from the fairways in the golf course and many of the agricultural fields. Moderately darker tones coincide with some of the grasslands, forest or tree areas, and coastal marshland. Note that both the beach and waves almost disappear as patterns.

The breakers completely disappear in the PC4 image below while the rest of the scene is rather flat and mostly dark but with several patterns set forth in medium grays.

You may be wondering what the remaining PCs (through PC7) look like, and if they show any useful information. The response, after examining, for example, PC6, is that the features we are familiar with do appear but probably offer little new in interpretation. Note that the waves in the image below now are black - interesting but perhaps meaningless; the golf course pattern is also black.

The information available in the PCA images can be revealed better by combining them visually as registered overlays. Any three of these four PC images can be made into color composites with various assignments of blue, green, and red. In all, 24 different combinations are possible. Of those made experimentally for this review, this next image composed of PC 4 = blue, PC 1 = green, and PC 3 = red has proved the most interesting. In this rendition, the golf course has a singular color signature (orange-red) and a unique internal structure. Most other vegetation shows as red to purple-red tones, but the grasslands (v) have an unusual color, describable as greenish-orange. The brighter slopes of the hills and mountains appear as medium green, while some areas in shadow are bluish. The urban areas also have a deep blue color. The beach bar now appears as turquoise and the adjacent breakers are olive-green.
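The count of 24 follows from choosing an ordered triple out of four images: 4 x 3 x 2 = 24. A short sketch that enumerates the assignments (the list and variable names are hypothetical):

```python
from itertools import permutations

pc_images = ["PC1", "PC2", "PC3", "PC4"]
channels = ("blue", "green", "red")

# Every ordered choice of three different PCs, one per color channel.
assignments = [dict(zip(channels, combo)) for combo in permutations(pc_images, 3)]

print(len(assignments))
print({"blue": "PC4", "green": "PC1", "red": "PC3"} in assignments)
```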

A very instructive example of a practical use of PCA is given in Section 5, page 3.

A variant of PCA is known as Canonical Analysis (CA). Whereas PCA uses all pixels regardless of identity or class to derive the components, in CA one limits the pixels involved to those associated with pre-identified features/classes. This requires that those features can be recognized (by photointerpretation) in an image display (single band or color composite) in one to several areas within the scene. These pixels are "blocked out" as training sites much as you will see done in the Classification discussion beginning on page 1-16. Their multiband values (within the site areas) are then processed in the manner of PCA. This selective approach is designed to optimize recognition and location of the same features elsewhere in the scene.

Another use of PCA, mainly as a means to improve image enhancement, is known as a Decorrelation Stretch (DS). The DS optimizes the assignment of colors that bring out subtle differences not readily distinguished in natural and false color composites. This reduction in interband correlations emphasizes small but often diagnostic reflectance or emittance variations owing to topography and temperature. The first step is to transform band data into at least the first three PCs. Each component is rescaled by normalizing the variance of the PC vectors. Then each PC image is stretched, usually following the Gaussian mode. The stretched PC data are then projected back into the original channels which are enhanced to maximize spectral sensitivity.
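The steps just listed can be sketched as follows. This is a simplified illustration under stated assumptions, not a definitive implementation: the Gaussian stretch is replaced by a plain linear rescaling of every component to a common standard deviation, all components are kept rather than only the first three, and the 3-band scene is synthetic. The function name `decorrelation_stretch` and the parameter `target_std` are hypothetical.

```python
import numpy as np


def decorrelation_stretch(bands, target_std=50.0):
    """Simplified decorrelation stretch of an (n_bands, rows, cols) image."""
    n_bands, rows, cols = bands.shape
    pixels = bands.reshape(n_bands, -1).astype(float)
    mean = pixels.mean(axis=1, keepdims=True)
    centered = pixels - mean

    # Step 1: transform the band data into principal components.
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered))
    pcs = eigvecs.T @ centered

    # Step 2: normalize the variance of each component (guarding against ~0).
    pcs = pcs / np.sqrt(np.maximum(eigvals, 1e-12))[:, None]

    # Step 3: stretch each component (linear here; assumed in place of Gaussian).
    pcs = pcs * target_std

    # Step 4: project the stretched components back into the original channels.
    result = eigvecs @ pcs + mean
    return result.reshape(n_bands, rows, cols)


# Synthetic scene (assumed): three bands that are almost copies of one another.
rng = np.random.default_rng(0)
base = rng.normal(0, 25, size=(64, 64))
scene = np.stack([120 + base + rng.normal(0, 3, size=(64, 64)) for _ in range(3)])

stretched = decorrelation_stretch(scene)
before = np.corrcoef(scene.reshape(3, -1))       # off-diagonals close to 1
after = np.corrcoef(stretched.reshape(3, -1))    # off-diagonals close to 0

print(before.round(3))
print(after.round(3))
```

After the stretch, the output channels in this sketch are mutually uncorrelated and share the same spread, which is why a color composite built from them uses much more of the available color space.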

The IDRISI Windows version used to produce the various Morro Bay PCA images does not contain the last step in the DS process. However, here are two examples gleaned from the Internet. The first shows a Landsat subscene of an unidentified area: on the left is a standard false color composite; on the right a DS image - this illustrates the ability to extract and emphasize the tonal differences not apparent in the left image:

Users of ASTER data have found Decorrelation Stretching to be particularly effective. The stretches are variously informative depending whether the bands used are in the Visible, the SWIR, or the thermal IR interval. These three ASTER scenes (again, of an unidentified area) use inputs from 12 bands to show the effects of a DS; read the captions for the bands used.

The difference between the PC color composites and the DS color composites is generally not large, but the extra statistical manipulation of the data in the latter often leads to a better product.

Source: http://rst.gsfc.nasa.gov