Image classification tutorial with ArcMap: 1- Introduction

First, a warning: this tutorial is not intended for users whose job is satellite image processing. It is intended for GIS users who need to do some image processing in order to improve their data. The goal is to help beginners avoid the usual pitfalls and to provide enough theory that the tutorial is more than a recipe to follow. The theoretical coverage is not exhaustive, and we allow ourselves some approximations so that the explanation stays accessible to the newcomer.

Much of the explanation comes from the excellent Natural Resources Canada site, to which we have added the practical side with ArcMap. You will also find on our site the same tutorial adapted to the tools offered by QGIS.

Among the wide variety of tools that ArcGIS offers for image classification, this tutorial uses the following:

  • the “Image Classification” toolbar
  • the Image Analysis window
  • the batch tools of the Toolbox

The tutorial will cover the three main phases of the image classification work:

  • pre-processing and data exploration
  • image classification proper
  • post-processing of the classifications

1- Introduction

The purpose of interpreting and analysing remote sensing imagery is to identify and measure different targets in an image in order to extract useful information. In remote sensing, a target is any structure or object observable in an image. Targets can be points, lines, or surfaces, so they can take various forms, but they must meet one requirement: they must be distinctive, that is, they must contrast with the surrounding structures.


Interpretation and identification of remote sensing targets may be performed visually, that is, by a human interpreter. In that case, the imagery is presented in a photographic format, regardless of the type of sensor used and how the data were acquired.

Visual interpretation and analysis date back to the very beginning of remote sensing, with the interpretation of aerial photos. Visual interpretation is often limited to a single data channel or a single image at a time, because it is difficult to interpret multiple images visually. Human interpretation is a subjective process, which means that the results may vary from one interpreter to another.

Image classification

An analyst who attempts to classify the features in an image uses the elements of visual interpretation (photo-interpretation) to identify homogeneous groups of pixels that represent surface classes of interest. Digital image classification uses the spectral information contained in the values of one or more spectral bands to classify each pixel individually. This type of classification is called spectral pattern recognition. In both cases (manual or automatic), the objective is to assign a particular class or theme (for example: water, coniferous forest, corn, wheat, etc.) to each pixel of an image. The “new” image that represents the classification is a mosaic of pixels, each belonging to a particular theme. This image is essentially a thematic representation of the original image.

When we talk about classes, we have to distinguish between information classes and spectral classes. Information classes are the categories of interest that the analyst attempts to identify in images, such as different types of crops, forests or tree species, different types of geological features or rocks, and so on. Spectral classes are groups of pixels that have the same (or nearly the same) intensity values in the different spectral bands of the data. The ultimate goal of classification is to match the spectral classes to the information classes. A direct one-to-one correspondence between these two types of classes is quite unusual. Well-defined spectral classes can sometimes appear without corresponding to any information class of interest for our analysis. Conversely, a very broad information class (e.g. forest) may contain several spectral sub-classes with distinct spectral variations. In the forest example, spectral sub-classes can be caused by variations in age, species, or tree density, or simply by shading effects or variations in illumination. The analyst's role is to determine the usefulness of the different spectral classes and to validate their correspondence to useful information classes.

The most common classification methods fall into two broad categories: supervised and non-supervised (more commonly called unsupervised) classification.

Supervised classification

When using a supervised classification method, the analyst identifies fairly homogeneous samples of the image that are representative of the different surface types (information classes). These samples form a set of training data. The selection of these training data relies on the knowledge of the analyst and their familiarity with the geographical region and the surface types present in the image. The analyst therefore supervises the classification into a specific set of classes. The computer uses the numerical information in each band for every pixel of these samples to define the classes and then to recognize regions with properties similar to each class. The computer uses a special program or algorithm to determine the numerical “signature” of each class. Several different algorithms are possible. Once the computer has established the spectral signature of each class, it assigns each pixel of the image to the class it most closely resembles.

Therefore a supervised classification starts with the identification of the information classes that are then used to define the spectral classes that represent them.
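As a minimal sketch of the supervised procedure described above, the following Python example computes a mean “signature” per class from training samples and assigns each pixel to the class with the closest signature. This minimum-distance rule is only one of the several possible algorithms and is not necessarily the one ArcMap applies; the band values, class names and function names are hypothetical and chosen for illustration.

```python
from math import dist

def class_signatures(training):
    """Mean value per band for each information class.

    training maps a class name to a list of pixels; each pixel is a tuple
    holding one value per spectral band.
    """
    signatures = {}
    for name, pixels in training.items():
        n_bands = len(pixels[0])
        signatures[name] = tuple(
            sum(p[b] for p in pixels) / len(pixels) for b in range(n_bands)
        )
    return signatures

def classify(pixel, signatures):
    """Assign the pixel to the class whose mean signature is closest."""
    return min(signatures, key=lambda name: dist(pixel, signatures[name]))

# Hypothetical training samples chosen by the analyst: (red, near-infrared)
training = {
    "water":  [(20, 10), (22, 12), (18, 8)],
    "forest": [(30, 90), (35, 100), (25, 95)],
    "soil":   [(90, 110), (100, 120), (95, 115)],
}
signatures = class_signatures(training)

# Hypothetical pixels to classify
image = [(21, 11), (33, 92), (97, 118), (40, 80)]
labels = [classify(p, signatures) for p in image]
print(labels)
```

The last pixel matches none of the training samples exactly, yet it still receives the label of the nearest signature. This is why the choice of representative training samples matters so much.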

Non-supervised classification

A non-supervised classification proceeds in the opposite way. The spectral classes are formed first, based solely on the numerical information in the data. An analyst then associates these classes with useful information classes (if possible). Programs called clustering algorithms are used to determine the natural statistical groups or structures in the data. Usually, the analyst specifies the number of groups or classes to be formed from the data. In addition, the analyst can specify certain parameters relating to the distance between classes and the variance within a class. The end result of this iterative process may contain classes that the analyst will want to combine, or classes that should be split further. Each of these steps requires a new application of the algorithm. Human intervention is therefore still needed in non-supervised classification. However, unlike supervised classification, this method does not start with a predetermined set of classes.
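The non-supervised procedure described above can be sketched with a basic k-means clustering, used here as a simple stand-in for the clustering algorithms of real software (it is not claimed to be the algorithm ArcMap uses). The pixel values, the seeding from the first k pixels and the final class names are assumptions made for illustration.

```python
from math import dist

def kmeans(pixels, k, iterations=10):
    """Group pixels into k spectral classes by iteratively refining means."""
    # Assumption: seed with the first k pixels; real tools choose seeds more carefully.
    centers = [pixels[i] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assign each pixel to the nearest current cluster centre.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in pixels]
        # Recompute each centre as the mean of its member pixels.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers

# Hypothetical (red, near-infrared) pixel values; no training samples are given.
pixels = [(20, 10), (30, 95), (22, 12), (35, 100), (18, 8), (25, 90)]
labels, centers = kmeans(pixels, k=2)

# Only now does the analyst attach information classes to the spectral classes.
names = {0: "water", 1: "forest"}
print([names[lab] for lab in labels])
```

The analyst supplies only the number of classes. The meaning of each cluster is decided afterwards, which is the reverse of the supervised workflow.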

Visual interpretation

Target recognition is the key to interpreting and extracting information. Observing the differences between targets and their backgrounds involves comparing different targets based on a combination of seven characteristics: tone, shape, size, pattern, texture, shadow and association. Consciously or not, we regularly use these characteristics in the visual interpretations we make every day. Identifying remote sensing targets on the basis of these seven visual characteristics allows us to improve our interpretation and analysis.

Tone refers to the relative brightness or colour (hue) of the objects in an image. Generally, tone is the fundamental element for distinguishing between targets and structures. Variations in tone also allow the shapes, textures and patterns of objects to be distinguished.

Shape refers to the general form, structure or outline of individual objects. Shape can be a very important clue for interpretation. Straight-edged shapes generally indicate urban areas or agricultural fields, while natural structures, such as forest edges, are generally more irregular, except where man has built a road or completed a clear cut. Farmland watered by rotating irrigation systems appears as circular shapes.

The size of an object in an image is a function of the scale. It is important to evaluate the size of a target relative to other objects in a scene (relative size), as well as its absolute size, to help interpret that target. A quick assessment of the approximate size of a target often facilitates interpretation. For example, in an image where one has to distinguish different areas of land use and identify an area with buildings, large structures such as factories or warehouses would suggest commercial properties, while smaller ones would suggest residential use.

Pattern refers to the spatial arrangement of visibly discernible objects. An ordered repetition of similar tones and textures produces a distinctive and easily recognizable pattern. Orchards with evenly spaced trees or streets regularly lined with houses are good examples of patterns.

Texture refers to the arrangement and frequency of tonal variation in particular regions of an image. Rough textures consist of mottled tones where grey levels change abruptly within a small region, while smooth textures have little or no tonal variation. Smooth textures are often the result of uniform surfaces such as fields, pavement or lawns. A target with a rough surface and an irregular structure, such as a forest, results in a rough-looking texture. Texture is one of the most important elements for differentiating structures on a radar image.
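To make the idea of rough versus smooth texture concrete, here is a small sketch that measures the standard deviation of grey levels in a window around a pixel. Local standard deviation is only one simple texture measure, chosen here as an assumption; the grids and the function name are hypothetical.

```python
from statistics import pstdev

def local_texture(grid, row, col, radius=1):
    """Standard deviation of grey levels in a window centred on (row, col)."""
    window = [
        grid[r][c]
        for r in range(max(0, row - radius), min(len(grid), row + radius + 1))
        for c in range(max(0, col - radius), min(len(grid[0]), col + radius + 1))
    ]
    return pstdev(window)

# Hypothetical 3x3 grey-level patches
smooth = [[100, 101, 100], [101, 100, 101], [100, 101, 100]]  # e.g. a lawn
rough = [[20, 200, 40], [180, 30, 220], [50, 190, 10]]        # e.g. a forest canopy

print(local_texture(smooth, 1, 1) < local_texture(rough, 1, 1))
```

A low value indicates little tonal variation (smooth texture), while a high value indicates abrupt changes in grey level over a small region (rough texture).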

Shadows are also useful for interpretation, since they give a hint of the profile and relative height of targets, which makes identification easier. Shadows can, however, hinder or prevent interpretation in their surroundings, since targets within shadows are less discernible, or not discernible at all. In radar imagery, shadows are particularly useful for enhancing or identifying topography and geological forms.

Association considers the relationship between the target of interest and other recognizable objects or structures nearby. Identifying elements that are normally expected to be found near other structures can provide information that facilitates identification. For example, commercial properties can be associated with nearby roads, while residential areas would be associated with schools, playgrounds and sports fields. Similarly, a lake can be associated with boats, a marina and a nearby recreational park.

The image classification process

The entire process leading from the raw image acquired by satellite or aircraft to a thematic map showing the selected geographical entities is broken down into a series of steps:

  • Data exploration and pre-processing
    • Image pre-processing
    • Image enhancement
    • Image transformations
  • Image classification
    • Collection of training samples
    • Evaluation of training samples
    • Updating the classes
    • Creation of the signature file
    • Clustering (non-supervised classification)
    • Examination of the signature file
    • Editing the signature file
    • Application of the classification
  • Post-classification processing
    • Filtering
    • Smoothing
    • Generalization

In future articles we will discuss each of these topics, and we will add the corresponding links.
