As was strongly implied on page 13-1, there are two general types of training sites: 1) those we actually visit so as to have firsthand knowledge of their suitability, and 2) those whose identities we deduce from photointerpretive experience. For reasons of cost and logistics, the second approach is more common.
The alternative to depending on training sites for classification is to apply the concept of signature extension. This term refers to the assumption that we may define a single, fairly constant, spectral signature as characteristic of any class, and that this signature has broad (universal) applicability to any scene in a region, or even worldwide. As a specific example, the signature for mature winter wheat should be essentially the same for fields in the U.S. Great Plains, Argentina, the Ukraine, and Australia, provided we compensate for such variables as differing air masses, Sun position, soil types, soil moisture, etc. If that assumption proves true, then an unknown feature or class in a given scene anywhere should be classifiable by comparing its spectral properties (for a Landsat pixel, its multiband digital number [DN] values) to a data bank containing standard values for many classes. We assume the closest fit of the unknown's DN values to those of a class in the bank identifies it.
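The closest-fit idea behind signature extension can be sketched as a simple lookup against a signature bank. In this illustrative Python fragment, the class names, band count, and all DN values are invented for the example; a real bank would hold calibrated multiband statistics for many classes:

```python
import math

# Hypothetical signature bank: mean DN values in four spectral bands.
# (Class names and numbers are illustrative, not real sensor statistics.)
SIGNATURE_BANK = {
    "mature winter wheat": (42, 38, 95, 110),
    "bare soil":           (70, 65, 72, 80),
    "open water":          (55, 30, 12, 5),
}

def classify(pixel_dns):
    """Assign the class whose bank signature is nearest in DN space."""
    def distance(signature):
        return math.sqrt(sum((p - s) ** 2 for p, s in zip(pixel_dns, signature)))
    return min(SIGNATURE_BANK, key=lambda name: distance(SIGNATURE_BANK[name]))

print(classify((45, 40, 90, 105)))  # nearest to the wheat signature
```

The distance measure here is plain Euclidean distance in DN space; operational classifiers use statistically richer measures, but the "closest fit wins" logic is the same.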
A "mixed pixel" results when the individual areas making up a scene, each consisting of a different feature or class, are smaller than the resolution capability of the sensor. Consider this hypothetical "map" of a rural setting:
In this instance, we treat each category as though it were more or less homogeneous. When the scene is imaged by a sensor whose instantaneous field of view (IFOV), controlled by its optics and sampling rates, yields a small pixel size, and an individual pixel happens to lie completely within, or fortuitously coincides with, the boundaries of a given class, then the multiband spectral properties of the dominant material(s) in that class determine the DNs for the pixel.

It is more likely, however, that the pixel will straddle or cut across several class or feature boundaries. The resulting spectral content is then a composite, or weighted average, of the spectral responses of each class within it. Recognizing each feature or class then becomes difficult, since there are two primary unknowns to account for - the identity of each class and its relative proportion in the mix. Mathematical methods are available to solve for these unknowns, but some statistical uncertainty always remains.

One improvement is to reduce pixel size (increase resolution), as is done in the central rectangle above, so that more pixels fall within the space occupied by a single class/feature and fewer pixels cross boundaries. Going in the other direction, note the effect of enlarging the pixel, say, to the size of the outer boundary of the cluster of nine.

The key rule in optimizing classification is to seek a resolution that approximates the sizes of the smallest specific classes whose identities we seek. It follows that classification accuracies using, say, AVHRR images will be low unless the classes chosen are spatially large (e.g., a lake or a mountain range). Landsat should achieve superior accuracies, and IKONOS better still, because these sensors can commonly pick out smaller features as discrete entities at their pixel sizes.
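The weighted-average behavior of a mixed pixel, and the inverse problem of estimating each class's proportion, can be sketched for the simplest case of two classes. This is a minimal linear-mixing illustration; the two endmember signatures below are hypothetical four-band values, not real sensor data:

```python
# Forward linear mixing: a pixel's DN in each band is the area-weighted
# average of the classes inside it. Illustrative endmember signatures:
grass   = (30.0, 55.0, 25.0, 120.0)
asphalt = (20.0, 22.0, 24.0, 26.0)

def mix(fraction_a, a, b):
    """Composite band values for a pixel covering `fraction_a` of class a."""
    return tuple(fraction_a * ai + (1 - fraction_a) * bi
                 for ai, bi in zip(a, b))

def unmix(mixed, a, b):
    """Least-squares estimate of the fraction of class a in the pixel."""
    num = sum((m - bi) * (ai - bi) for m, ai, bi in zip(mixed, a, b))
    den = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return num / den

pixel = mix(0.75, grass, asphalt)              # 75-25 grass/asphalt straddle
print(round(unmix(pixel, grass, asphalt), 2))  # recovers 0.75
```

With noise-free data and known endmembers the fraction is recovered exactly; in practice, sensor noise and uncertain endmember signatures leave exactly the statistical uncertainty the text describes.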
Mixed pixels are usually the chief cause of reduced classification accuracy, and a variety of factors affect the mixing. This is an important point, so let us explore the idea a bit further by considering two cases:
First, start with a small, empty parking lot (for example, 50 meters on a side) covered uniformly by dark asphalt. If a lower resolution pixel (50 meters) on the remote sensor just happens to coincide with the lot as the scene is being imaged, a single uniform spatial measurement of, say, the lot's low reflectance over a given spectral range will be recorded. If instead the lot is next to a large field of green grass and the pixel straddles the lot and the field at 50-50 areal coverage, the measurement will be an average of the spectral responses (wavelength peaks, etc.) of the two classes at that proportion. If the straddling were 75-25 for field and lot respectively, the measurement would differ, and so forth for other proportions. Thus the mixing varies, and its amount might not be decipherable.

Now, let's make this a bit more complicated. We will populate the parking lot with crowds of people in small groups of varying numbers distributed haphazardly over the lot. The clothing colors in each group are allowed to vary. If the pixel size is still 50 meters, the distribution and color variance won't matter, since the groups' contributions to the measurement are integrated. But now let's make the pixel size 5 meters instead, so that the sensor uses 100 pixels to sense the scene. In this case, the values of radiance or reflectance will vary from pixel to pixel as different groups are measured. Same scene content; different results. The 5 meter situation illustrates the improved information content that higher resolution affords: there is less uncertainty owing to mixing, since each pixel is likely to encompass a group with its own distinctive character.
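The parking-lot thought experiment can be mimicked numerically. In this sketch, random reflectances on a 10 x 10 grid of 5 meter cells stand in for the haphazardly distributed groups; all values are invented for illustration:

```python
import random

random.seed(0)
# Hypothetical 50 m parking lot sampled as a 10 x 10 grid of 5 m cells;
# each cell's reflectance varies with the group of people standing in it.
scene = [[random.uniform(0.05, 0.40) for _ in range(10)] for _ in range(10)]

# One 50 m pixel: every group's contribution is integrated into a
# single average value, so the within-lot variation is lost.
coarse = sum(sum(row) for row in scene) / 100
print(f"50 m pixel: one value, {coarse:.3f}")

# One hundred 5 m pixels: per-pixel values retain the scene's variation.
fine = [cell for row in scene for cell in row]
print(f"5 m pixels: {len(fine)} values, "
      f"min {min(fine):.3f}, max {max(fine):.3f}")
```

Same scene, two pixel sizes: the coarse measurement collapses to one number between the extremes, while the fine measurements preserve the group-to-group differences.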
The second case is more a matter of semantics than of the physics of spectral and/or spatial remote sensing. Let us remotely sense a volcanic flow made up of dark basalt. A geologist in the field might map this flow as a single unit - basalt. Or, if it suits his/her purpose, the map might be more exacting, showing variants of the basalt that occur within a flow: the basalt could be dense, or vesicular, or of the aa or pahoehoe types, or show still other variations that are given descriptive names. In the field, on the ground, these variants can be assessed and mapped by the observer. Spectrally, in terms of material composition, they are fairly similar, but they are distinguishable by their shapes, extent of reflection, etc. When sensed remotely, they are probably indistinguishable to all but the highest resolution sensors. The point is this: what is being imaged, and hence mapped, depends in part on how definitive the classes involved are. If one needs only to establish the classes as trees, crops, rock outcrops, etc., the classes are "coarse," and hence the degree of spectral and spatial resolution does not have to be high. But if the types of trees (perhaps to species level), the identities of crops (corn versus wheat, etc.), or the varieties of rocks (limestones, sandstones, granites, etc.) are being sought, then the extent of mixing becomes a major factor, and higher spectral/spatial resolution becomes a requirement. More information - more detail - comes at a price: better remote sensors must be used.