The different bands of multispectral data are often highly correlated and contain similar information. For example, Landsat MSS bands 4 and 5 (green and red, respectively) produce images of very similar visual appearance, because the reflectance of a given surface type is almost identical in both bands. Image transformations based on statistical processing of multispectral data can be used to reduce this redundancy and the correlation between bands. Principal component analysis (PCA) is one such transformation.
The goal of this transformation is to reduce the number of dimensions (the number of bands) by compressing the information from several bands into a smaller number of bands. The new bands that result from this statistical compression are called components. The process aims to concentrate, in a statistical sense, as much of the information (or variance) of the original data as possible in a small number of components. For example, principal component analysis can transform the seven bands of the Landsat TM (Thematic Mapper) sensor so that the first three principal components contain more than 90% of the information in the seven original bands. Interpreting and analyzing these three components, by combining them visually or digitally, is simpler and more efficient than working with the seven original bands.
Principal component analysis and other complex transformations can be used either as visual enhancement techniques that facilitate interpretation, or to reduce the number of bands supplied as input to a digital classification procedure.
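To make the idea of correlated bands concrete, here is a minimal sketch that computes the Pearson correlation coefficient between two bands. The pixel values are invented for illustration and do not come from a real Landsat scene; the function name pearson is likewise just a label chosen for this example.

```python
import math


def pearson(band_a, band_b):
    """Pearson correlation coefficient between two equally sized pixel lists."""
    n = len(band_a)
    mean_a = sum(band_a) / n
    mean_b = sum(band_b) / n
    # Sums of cross products and squared deviations around the band means
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(band_a, band_b))
    var_a = sum((a - mean_a) ** 2 for a in band_a)
    var_b = sum((b - mean_b) ** 2 for b in band_b)
    return cov / math.sqrt(var_a * var_b)


# Invented digital numbers for the same six pixels in two bands
green = [22, 35, 41, 58, 63, 80]
red = [20, 33, 44, 55, 66, 79]

r = pearson(green, red)
print("correlation between the two bands:", round(r, 3))
```

A coefficient close to 1 means that the second band adds little information beyond the first, which is the redundancy the transformation is meant to remove.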
The ArcMap Principal Components tool
The Principal Components tool transforms the data in the input bands into a new multivariate attribute space whose axes are rotated with respect to the original space. The axes (attributes) of the new space are uncorrelated. The main reason for transforming data in a principal component analysis is to compress the data by eliminating redundancy.
Redundancy is evident, for example, in a multiband raster made up of elevation, slope and aspect (on a continuous scale). Because slope and aspect are usually derived from elevation, a large proportion of the variance within the study area can be explained by elevation alone.
The result is a multiband raster with as many bands as the specified number of components (one band per axis in the new multivariate space). The first principal component has the highest variance; the second has the highest variance not described by the first, and so on. Often, the first three or four bands of the raster produced by the Principal Components tool describe more than 95% of the variance, and the remaining bands can be dropped. Because the new raster contains fewer bands while retaining more than 95% of the variance of the original raster, calculations are faster and accuracy is maintained.
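The 95% rule of thumb above can be sketched as a small helper that counts how many leading components are needed to reach a given share of the total variance. The eigenvalues below are invented for illustration, and the function name components_needed is an assumption of this example, not part of ArcMap.

```python
def components_needed(eigenvalues, threshold=0.95):
    """Return how many leading components reach `threshold` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)


# Invented eigenvalues (variances) for a seven-band raster
eigenvalues = [620.0, 240.0, 95.0, 30.0, 10.0, 4.0, 1.0]

k = components_needed(eigenvalues)
print("components to keep:", k)
```

With these invented values, the first three components carry 95.5% of the total variance, so the remaining four bands could be dropped.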
The Principal Components tool requires an input multiband raster, the number of principal components to compute, the name of the output statistics file and the name of the output raster. The output raster has as many bands as the specified number of components, and each band describes one component.
Principal component analysis concepts
Conceptually, for a two-band raster, the shifting and rotation of the axes and the transformation of the data are carried out as follows:
- The data is plotted in a scatterplot.
- An ellipse is calculated to bound the points of the point cloud.
Ellipse bounding the point cloud
- The major axis of the ellipse is determined. The major axis becomes the new x-axis, the first principal component (PC1). PC1 describes the greatest variance because it is the longest transect that can be drawn through the ellipse. The direction of PC1 is the eigenvector and its magnitude is the eigenvalue. The angle between the x-axis and PC1 is the angle of rotation used in the transformation.
First principal component
- A line perpendicular to PC1 is calculated. This line is the second principal component (PC2) and the new axis replacing the original y-axis (see the figure below). The new axis describes the second greatest variance, which is not described by PC1.
Second principal component
- Using the eigenvectors, the eigenvalues and the covariance matrix computed from the input multiband raster, a linear formula defining the shift and the rotation is created. This formula is applied to transform each cell value relative to the new axes.
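The steps above can be sketched for two bands in plain Python. This is an illustrative sketch only, not the formula ArcMap applies internally: it uses the sample covariance matrix, the closed-form eigenvalues of a symmetric 2 x 2 matrix and the corresponding rotation angle. The pixel values and the function name pca_two_bands are assumptions made for the example.

```python
import math


def pca_two_bands(band_x, band_y):
    """Shift and rotate two bands onto their principal axes (PC1, PC2)."""
    n = len(band_x)
    mean_x = sum(band_x) / n
    mean_y = sum(band_y) / n
    # Shift: move the origin to the center of the point cloud
    dx = [x - mean_x for x in band_x]
    dy = [y - mean_y for y in band_y]
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum(a * a for a in dx) / (n - 1)
    syy = sum(b * b for b in dy) / (n - 1)
    sxy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)
    # Eigenvalues of a symmetric 2 x 2 matrix in closed form
    half_trace = (sxx + syy) / 2
    root = math.hypot((sxx - syy) / 2, sxy)
    eigenvalues = (half_trace + root, half_trace - root)
    # Angle between the original x-axis and the major axis (PC1)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Rotation: project every centered pixel onto the new axes
    pc1 = [a * cos_t + b * sin_t for a, b in zip(dx, dy)]
    pc2 = [-a * sin_t + b * cos_t for a, b in zip(dx, dy)]
    return eigenvalues, theta, pc1, pc2


# Invented values for five pixels in two bands
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

eigenvalues, theta, pc1, pc2 = pca_two_bands(x, y)
print("eigenvalues:", eigenvalues)
print("rotation angle in degrees:", math.degrees(theta))
```

After the rotation, the variance of the PC1 values equals the larger eigenvalue and the two new bands are uncorrelated, which is the property the list above describes geometrically.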
We return to the Landsat 8 image used in the previous chapters. First, we create a composite image containing all 11 bands with the Composite Bands tool:
The result appears in ArcMap:
We will use ArcToolbox -> Spatial Analyst Tools -> Multivariate -> Principal Components
Enter a name for the output raster. It will contain, for each pixel of the image, the value calculated for each requested component.
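The same two steps can also be scripted with arcpy instead of the dialog boxes. The sketch below is a hedged outline that assumes an ArcGIS installation with the Spatial Analyst extension; the scene prefix, the output paths and the `<scene>_B<n>.TIF` band naming are hypothetical placeholders to adapt to your own data, and the helper names landsat_band_paths and run_pca are invented for this example.

```python
def landsat_band_paths(scene_prefix, n_bands=11):
    """Build band file names, assuming the `<scene>_B<n>.TIF` naming pattern."""
    return ["{}_B{}.TIF".format(scene_prefix, i) for i in range(1, n_bands + 1)]


def run_pca(scene_prefix, composite_path, pca_path, stats_path, n_components=3):
    """Composite the bands, then run Principal Components (requires ArcGIS)."""
    # Imported here so that the helper above can be used without ArcGIS
    import arcpy
    from arcpy.sa import PrincipalComponents

    arcpy.CheckOutExtension("Spatial")
    try:
        # Step 1: Composite Bands, with the inputs as a semicolon-delimited list
        bands = landsat_band_paths(scene_prefix)
        arcpy.CompositeBands_management(";".join(bands), composite_path)
        # Step 2: Principal Components on the composite, writing the statistics file
        result = PrincipalComponents([composite_path], n_components, stats_path)
        result.save(pca_path)
    finally:
        arcpy.CheckInExtension("Spatial")


# Hypothetical scene prefix, shown only to illustrate the generated names
print(landsat_band_paths("LC08_scene")[:3])
```

A call such as run_pca("LC08_scene", "composite.tif", "pca.tif", "pca_stats.txt") would then reproduce the workflow described here, with all four names being placeholders.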
The number of principal components can be left at its default value, the number of input bands. To save computing time and produce a smaller output raster, however, you can directly enter 3 as the number of components to calculate: it is rare for a fourth component to contribute significant additional information. The last parameter is the output data file, in which the results of the analysis are written. For our example, the bottom of this file contains:
The Percent Eigen Values column shows the percentage of the variance explained by each calculated component. In our case:
- The first principal component explains 92% of the variance of the 11 input bands.
- The second principal component explains 7%.
- The third and last explains 1% of the total variance.
Here are the results for each calculated band:
Band 1 of the principal components
Band 2 of the principal components
Band 3 of the principal components
Depending on the type of target sought in the classification, it can be much easier to build a signature file from this new raster than from the original raster.