Deep learning is revolutionizing satellite image analysis.
Long reserved for large laboratories or proprietary software, it is now becoming available to the wider world thanks to PyTorch and QGIS.
This article explores the principles of Deep Learning applied to geomatics, compares ESRI models with those that can be used in QGIS, and concludes with a concrete example: the automatic detection of coral reefs using Sentinel-2 images.
Introduction to Deep Learning applied to geomatics
Deep Learning is a branch of artificial intelligence inspired by the functioning of the human brain. It is based on artificial neural networks capable of learning from examples, without being explicitly told all the rules.
Unlike traditional classification methods, where the indicators and thresholds are chosen by the user, Deep Learning automatically discovers relevant structures and patterns in the data.
In the field of geomatics, this approach opens up impressive possibilities:
- recognition of urban, agricultural, or forest areas from satellite images,
- detection of changes or natural disasters,
- identification of specific elements (roads, roofs, corals, ships, etc.),
- detailed segmentation of landscapes from Sentinel, PlanetScope, or drone images.
The principle is simple: a model is trained on a large set of annotated images until it learns to reproduce the desired task (for example, distinguishing between water, vegetation, and sand). Once trained, the model can be applied to new areas to automatically generate high-resolution thematic maps.
Why “deep”?
The term “deep” comes from the fact that these networks have many successive layers—sometimes dozens.
Each layer learns to recognize increasingly complex patterns:
- the first layers detect edges and textures,
- the next identify shapes or structures,
- and the last understand entire objects or contexts.
It is this hierarchy of representations that gives deep learning its power, but also its appetite for data and computing power.
Deep learning and remote sensing
In remote sensing, the most commonly used models are image segmentation models, which are capable of assigning a class to each pixel.
Architectures such as U-Net, DeepLab, and Mask R-CNN have become benchmarks for automatic mapping based on multispectral imagery.
These models are generally trained with frameworks such as PyTorch or TensorFlow, then deployed in GIS environments.
Both of the major GIS worlds have taken an interest in this:
- ESRI, with its .dlpk (Deep Learning Package) model format integrated into ArcGIS Pro and ArcGIS Online;
- QGIS, which allows PyTorch or TensorFlow models to be used via the Processing Toolbox or custom Python scripts.
ESRI’s DLPK format
ESRI was one of the first GIS players to integrate deep learning directly into its ArcGIS ecosystem.
To facilitate the exchange and deployment of models, the company created a standard format: DLPK (Deep Learning Package).
A .dlpk file is not just a model: it is a complete and portable package containing all the elements necessary for its execution.
It generally contains:
- The trained model (often in PyTorch .pth or TensorFlow .h5 format)
- A JSON definition file describing the model architecture, expected parameters, and class names
- Metadata: input type (raster, tile, image), patch size, number of bands, normalization, etc.
- Training samples (optional) to document or retrain the model.
Thanks to this organization, ArcGIS Pro or ArcGIS Online can automatically interpret the model without the need to write code.
Tools such as “Classify Pixels Using Deep Learning” or “Detect Objects Using Deep Learning” directly read the .dlpk, load the model, and perform inference on a raster or image set.
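For illustration, the JSON definition embedded in a .dlpk (the Esri model definition) typically records information along the following lines. The exact keys depend on the model type and the ArcGIS version, so treat this as a simplified sketch, shown here as a Python dictionary:

```python
# Simplified sketch of the kind of information a .dlpk definition file carries;
# key names vary with the model type and the ArcGIS version.
dlpk_definition = {
    "Framework": "PyTorch",
    "ModelConfiguration": "UNet",
    "ModelFile": "unet_coraux.pth",
    "ModelType": "ImageClassification",
    "ImageHeight": 256,              # patch size expected by the model
    "ImageWidth": 256,
    "ExtractBands": [0, 1, 2],       # band indices fed to the model
    "Classes": [
        {"Value": 1, "Name": "Coral", "Color": [230, 0, 0]},
        {"Value": 0, "Name": "Non-coral", "Color": [0, 77, 168]},
    ],
}
```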
This turnkey approach has two major advantages:
- Interoperability: a model trained elsewhere can be used by any ArcGIS user, without complex dependencies.
- Replicability: metadata ensures that the model is applied under the same conditions as when it was trained.
The downside, of course, is the closed format: .dlpk remains tied to the ArcGIS ecosystem and is not always easy to use elsewhere.
The open source counterpart: PyTorch and QGIS
On the open source side, QGIS does not impose a proprietary format.
Models are simply saved in their native format (often .pth for PyTorch or .pt) and executed via Python scripts integrated into the Processing Toolbox.
The idea is the same as with ESRI:
a multispectral image is loaded, a trained model is applied, and a class or probability map is generated.
But instead of relying on a packaged format such as .dlpk, QGIS gives developers complete freedom:
- The model is read with torch.load().
- The input bands can be selected dynamically (for example, B4-B3-B2 for RGB, or B8-B4-B3 in false color).
- The Python script controls the entire processing flow: normalization, water masking (NDWI), block cutting, merging of results, etc.
This approach allows for maximum flexibility, which is particularly useful for research or experimentation.
For example, a U-Net model trained in PyTorch can be applied directly in QGIS via a custom script—without relying on ArcGIS Pro.
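Stripped of all robustness, the core of such a script fits in a few lines. This is only a schematic sketch (paths are placeholders, and it assumes the raster dimensions suit the model); the full, tiled version is presented later in this article:

```python
import numpy as np
import torch
from osgeo import gdal

# Load a U-Net previously saved with torch.save() (placeholder path)
model = torch.load("unet_coraux.pth", map_location="cpu", weights_only=False)
model.eval()

# Read three bands of a multispectral GeoTIFF and rescale them to [0, 1]
ds = gdal.Open("sentinel2_subset.tif")
img = np.stack([ds.GetRasterBand(b).ReadAsArray().astype(np.float32) for b in (4, 3, 2)])
img = (img - img.min()) / (img.max() - img.min() + 1e-6)

# Pixel-wise inference: the result is a probability map the same size as the input
with torch.no_grad():
    proba = torch.sigmoid(model(torch.from_numpy(img).unsqueeze(0)))
mask = proba.squeeze().numpy()
```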
QGIS then becomes a true AI spatial analysis laboratory, where users can:
- test several models (.pth) from the community,
- adapt pre-processing according to the area (coastal strip, forest, urban, etc.),
- automate an entire workflow using Processing algorithms,
- and combine the results with other GIS layers (vegetation, bathymetry, land cover, etc.).
Summary
| Aspect | ESRI (DLPK) | QGIS (PyTorch) |
|---|---|---|
| Format | .dlpk (complete package) | .pth or .pt (model only) |
| Interoperability | Simple, but proprietary | Free and modifiable |
| Use | Integrated ArcGIS tools (“Classify Pixels,” “Detect Objects”) | Custom Python Processing scripts |
| Customization | Limited | Total |
| Learning curve | Simpler for the end user | More flexible for the developer or researcher |
Practical example: coral segmentation using Sentinel-2 images in QGIS
Now that we have explored the logic behind deep models and the formats used by ESRI and QGIS, let’s move on to a concrete example: analyzing coral reefs using Sentinel-2 satellite images.
The goal is to distinguish marine areas (sandy bottoms, seagrass beds, corals) from terrestrial or turbid areas using a pre-trained U-Net model in PyTorch.
Data used
The input image comes from Sentinel-2, a mission of the European Copernicus program.
These free multispectral images offer a resolution of 10 to 20 m and cover several bands in the visible and infrared:
| Band | Name | Wavelength (nm) | Main use |
|---|---|---|---|
| B2 | Blue | 490 | Water, turbidity |
| B3 | Green | 560 | Vegetation, marine environment |
| B4 | Red | 665 | Soil, vegetation, corals |
| B8 | NIR | 842 | Land/water differentiation |
| B11 | SWIR1 | 1610 | Humidity, sand, clouds |
The raster is derived from S2DR3 processing, a Sentinel-2 super-resolution step that brings all of the image bands to a resolution of approximately 1 m (see “Using S2DR3 in Google Colab to study corals in Mauritius”). The image used here is the same one that served as the example in that article.
The user simply loads the multispectral .tif file, for example:
Palmar_MS.tif.
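Before launching the processing, it is worth checking from the QGIS Python console that the file exposes the expected bands. A quick sketch (the path is an example):

```python
from osgeo import gdal

ds = gdal.Open("c:/data/Palmar_MS.tif")
print("Size:", ds.RasterXSize, "x", ds.RasterYSize, "-", ds.RasterCount, "bands")
for i in range(1, ds.RasterCount + 1):
    band = ds.GetRasterBand(i)
    print(f"Band {i}: {gdal.GetDataTypeName(band.DataType)}")
```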
Processing in QGIS
The analysis is performed using a Python algorithm integrated into the Processing toolbox.
This script loads a PyTorch model (unet_coraux.pth) and applies segmentation to the image in several steps:
- Reading the raster and normalizing the bands: the values are rescaled to a range compatible with the model training.
- Optional masking of land areas: a Normalized Difference Water Index, NDWI = (Green − NIR) / (Green + NIR), is calculated to isolate the sea. Pixels with high NDWI values are considered marine and processed by the model; the others are masked.
- Cutting into blocks (patches): the image is processed in portions to avoid memory overload. Each block is analyzed independently, then the overlapping results are merged.
- Application of the U-Net model: the model performs pixel-by-pixel segmentation, assigning each pixel a probability of belonging to the “coral” or “non-coral” class. The result is an output raster of probability or class values.
- Saving the output raster: the result is saved in GeoTIFF format (palmar_model_9.tif), ready to be overlaid with other GIS layers.
Procedure
Preliminary operations
We will create a QGIS Processing script. As a prerequisite, you must install the following Python packages in the Python environment used by QGIS:
python -m pip install torchgeo
python -m pip install torch torchvision
python -m pip install segmentation-models-pytorch
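On Windows, these commands are typically run from the OSGeo4W shell so that the packages land in QGIS’s own Python environment. A quick way to confirm that QGIS can see them, from the QGIS Python console:

```python
# Run in the QGIS Python console to confirm the packages are visible to QGIS
import torch
import torchvision
import segmentation_models_pytorch as smp

print("torch", torch.__version__, "| torchvision", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
```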
Then we need to create and save the model (the pre-trained encoder weights are downloaded automatically). To do this, run the following script in the QGIS Python console:
import segmentation_models_pytorch as smp
import torch

# Pre-trained RGB U-Net
model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights="imagenet",
    in_channels=3,
    classes=1  # coral / non-coral
)

# Save for QGIS
torch.save(model, "c:/models/unet_coraux.pth")
In this example, the model is saved in the c:/models directory.
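Note that encoder_weights="imagenet" only initializes the encoder: the decoder that actually produces the coral mask starts out untrained, so for real mapping the model should be fine-tuned on annotated patches before being saved. A minimal sketch, assuming annotated patches and masks are available as NumPy files (the file names, shapes, and hyperparameters are purely illustrative):

```python
import numpy as np
import torch
import segmentation_models_pytorch as smp

# Hypothetical training data: N patches of shape 3 x 256 x 256 and binary masks 1 x 256 x 256,
# already normalized to [0, 1]. Replace with your own annotated samples.
X = torch.from_numpy(np.load("c:/models/coral_patches.npy")).float()
Y = torch.from_numpy(np.load("c:/models/coral_masks.npy")).float()

model = smp.Unet(encoder_name="resnet34", encoder_weights="imagenet",
                 in_channels=3, classes=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = torch.nn.BCEWithLogitsLoss()

model.train()
for epoch in range(10):                    # a few epochs, for illustration
    for i in range(0, len(X), 8):          # mini-batches of 8 patches
        xb, yb = X[i:i+8], Y[i:i+8]
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()

torch.save(model, "c:/models/unet_coraux.pth")  # same file as loaded by the QGIS script
```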
Setting up the processing script
In QGIS → Processing Toolbox → Scripts → New Script, paste the following code:
# -*- coding: utf-8 -*-
"""
QGIS algorithm: coral segmentation (U-Net) with optional land masking
Compatible with QGIS 3.44+
"""

from qgis.core import (
    QgsProcessing,
    QgsProcessingAlgorithm,
    QgsProcessingParameterRasterLayer,
    QgsProcessingParameterFile,
    QgsProcessingParameterEnum,
    QgsProcessingParameterRasterDestination,
    QgsProcessingParameterBoolean,
)
from qgis.PyQt.QtCore import QCoreApplication
import torch
import torch.serialization
import numpy as np
from osgeo import gdal
from scipy.signal import windows


class SegmentationCoraux(QgsProcessingAlgorithm):
    """Segmentation of 1 m Sentinel-2 images with a U-Net model"""

    BAND_OPTIONS = ["RGB (4-3-2)", "False color (8-4-3)"]

    def initAlgorithm(self, config=None):
        self.addParameter(
            QgsProcessingParameterRasterLayer(
                "raster_input",
                self.tr("Sentinel-2 image (multiband GeoTIFF)")
            )
        )
        self.addParameter(
            QgsProcessingParameterEnum(
                "band_set",
                self.tr("Manual band selection"),
                options=self.BAND_OPTIONS,
                defaultValue=0
            )
        )
        self.addParameter(
            QgsProcessingParameterFile(
                "model_path",
                self.tr("PyTorch model (.pth)"),
                extension="pth"
            )
        )
        self.addParameter(
            QgsProcessingParameterRasterDestination(
                "output_raster",
                self.tr("Output raster (mask)")
            )
        )
        # --- Option: land masking ---
        self.addParameter(
            QgsProcessingParameterBoolean(
                "mask_land",
                self.tr("Mask land areas (NDWI)"),
                defaultValue=True
            )
        )

    def processAlgorithm(self, parameters, context, feedback):
        model_path = self.parameterAsFile(parameters, "model_path", context)
        mask_land = self.parameterAsBool(parameters, "mask_land", context)
        band_set = self.parameterAsEnum(parameters, "band_set", context)

        feedback.pushInfo("Loading the PyTorch model...")
        try:
            from segmentation_models_pytorch.decoders.unet.model import Unet
            torch.serialization.add_safe_globals([Unet])
        except Exception as e:
            feedback.pushInfo(f"Warning: could not register Unet as a safe class: {e}")
        model = torch.load(model_path, map_location=torch.device('cpu'), weights_only=False)
        model.eval()

        raster_input = self.parameterAsRasterLayer(parameters, "raster_input", context)
        ds = gdal.Open(raster_input.source())
        feedback.pushInfo(f"Opening raster: {raster_input.source()}")
        nrows, ncols = ds.RasterYSize, ds.RasterXSize

        # --- Automatically detect the number of input channels ---
        conv1_weights = model.encoder.conv1.weight.data
        n_channels = conv1_weights.shape[1]
        feedback.pushInfo(f"The model expects {n_channels} input channels.")
        if n_channels == 3:
            # Rank the input channels by importance (informative only)
            channel_mean = conv1_weights.abs().mean(dim=(0, 2, 3)).numpy()
            sorted_idx = list(np.argsort(-channel_mean))
            feedback.pushInfo(f"Channels ranked by importance: {sorted_idx}")
            band_selection = [4, 3, 2]  # standard RGB
            feedback.pushInfo(f"Automatic band selection: {band_selection} (RGB)")
        else:
            band_selection = [4, 3, 2] if band_set == 0 else [8, 4, 3]
            feedback.pushInfo(f"Manual band selection: {band_selection}")

        # --- Read the selected bands ---
        img_list = []
        for b in band_selection:
            band = ds.GetRasterBand(b)
            arr = band.ReadAsArray().astype(np.float32)
            img_list.append(arr)
        img = np.stack(img_list, axis=0)

        # Normalization
        feedback.pushInfo("Normalizing values...")
        img = (img - img.min()) / (img.max() - img.min() + 1e-6)

        # --- Compute the NDWI if the option is enabled ---
        if mask_land and ds.RasterCount >= 8:
            try:
                B3 = ds.GetRasterBand(3).ReadAsArray().astype(np.float32)
                B8 = ds.GetRasterBand(8).ReadAsArray().astype(np.float32)
                ndwi = (B3 - B8) / (B3 + B8 + 1e-6)
                water_mask = ndwi > 0
                feedback.pushInfo("NDWI computed: land areas will be excluded.")
            except Exception as e:
                water_mask = np.ones((nrows, ncols), dtype=bool)
                feedback.pushInfo(f"NDWI could not be computed ({e}); all areas will be processed.")
        else:
            water_mask = np.ones((nrows, ncols), dtype=bool)
            feedback.pushInfo("Land masking disabled.")

        # --- Block-by-block processing ---
        block_size = 2048
        overlap = 512
        output_mask = np.zeros((nrows, ncols), dtype=np.float32)
        weight = np.zeros((nrows, ncols), dtype=np.float32)
        total_blocks = ((nrows // (block_size - overlap) + 1) *
                        (ncols // (block_size - overlap) + 1))
        done = 0
        feedback.pushInfo("Starting block-by-block processing...")

        for y in range(0, nrows, block_size - overlap):
            for x in range(0, ncols, block_size - overlap):
                if feedback.isCanceled():
                    break
                block = img[:, y:y+block_size, x:x+block_size]
                if block.size == 0:
                    continue
                mask_block_water = water_mask[y:y+block_size, x:x+block_size]
                if not mask_block_water.any():
                    continue

                # Pad the block so its height and width are multiples of 32, as required
                # by the ResNet encoder; the prediction is cropped back afterwards.
                h0, w0 = block.shape[1], block.shape[2]
                pad_h = (32 - h0 % 32) % 32
                pad_w = (32 - w0 % 32) % 32
                if pad_h or pad_w:
                    block = np.pad(block, ((0, 0), (0, pad_h), (0, pad_w)), mode="edge")

                with torch.no_grad():
                    block_tensor = torch.from_numpy(block).unsqueeze(0)
                    pred = torch.sigmoid(model(block_tensor))  # logits -> probabilities
                    mask_block = pred.squeeze().numpy()[:h0, :w0]
                h, w = mask_block.shape
                mask_block *= mask_block_water[:h, :w]

                # Hann window: weights fade towards block edges so overlapping blocks blend smoothly
                win_y = windows.hann(h)[:, None]
                win_x = windows.hann(w)[None, :]
                weight_block = win_y * win_x
                output_mask[y:y+h, x:x+w] += mask_block * weight_block
                weight[y:y+h, x:x+w] += weight_block

                done += 1
                progress = int(100 * done / total_blocks)
                feedback.setProgress(progress)

        # Final merge of the overlapping blocks
        output_mask /= np.maximum(weight, 1e-6)
        output_mask = np.clip(output_mask, 0, 1)

        # --- Save as GeoTIFF ---
        feedback.pushInfo("Saving the final mask...")
        driver = gdal.GetDriverByName("GTiff")
        out_path = self.parameterAsOutputLayer(parameters, "output_raster", context)
        out_ds = driver.Create(out_path, ncols, nrows, 1, gdal.GDT_Float32)
        out_ds.SetGeoTransform(ds.GetGeoTransform())
        out_ds.SetProjection(ds.GetProjection())
        out_ds.GetRasterBand(1).WriteArray(output_mask)
        out_ds.FlushCache()

        feedback.pushInfo("✅ Segmentation completed successfully.")
        return {"output_raster": out_path}

    # --- Metadata ---
    def name(self):
        return "segmentation_coraux"

    def displayName(self):
        return self.tr("Segmentation of 1 m Sentinel-2 images (U-Net)")

    def group(self):
        return self.tr("Deep Learning")

    def groupId(self):
        return "deeplearning"

    def tr(self, string):
        return QCoreApplication.translate("SegmentationCoraux", string)

    def createInstance(self):
        return SegmentationCoraux()
You will find explanations of the different parts of the code in a future article.
The QGIS script presented here applies U-Net image segmentation models saved in PyTorch (.pth) format. It is designed to process multispectral images, such as those from the Sentinel-2 satellites, and automatically adapts the bands used to the number of input channels expected by the model: if the model was trained on RGB images, the script selects the red, green, and blue bands; otherwise, the manual band selection is applied. The script can also mask land areas using an NDWI calculation, so that the segmentation focuses on water areas, for example to identify corals. In practice, any properly saved U-Net PyTorch model can be loaded and applied, provided that both the architecture and the weights are included in the .pth file.
Usage
Run the script. The following window will open:

Manual band selection is only used if the script is unable to determine which bands to use from the model itself.
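The algorithm can also be launched without the dialog, from the QGIS Python console or from another script, which is convenient for batch processing. A sketch, assuming the script is registered under the default “script” provider and using example paths:

```python
import processing

result = processing.run(
    "script:segmentation_coraux",
    {
        "raster_input": "c:/data/Palmar_MS.tif",   # multiband Sentinel-2 GeoTIFF
        "band_set": 0,                             # 0 = RGB (4-3-2), 1 = false color (8-4-3)
        "model_path": "c:/models/unet_coraux.pth",
        "mask_land": True,                         # apply the NDWI land mask
        "output_raster": "c:/data/palmar_model_9.tif",
    },
)
print(result["output_raster"])
```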
Interpreting the results
The raster generated from the model represents a probability map:
values close to 1 indicate a high presence of corals, while values close to 0 correspond to non-coral areas (sand, algae, depth, etc.).
Appropriate symbology (from light blue to red) makes it easy to visualize the spatial distribution of probable coral areas.
By combining this map with other data (bathymetry, substrate, turbidity), it becomes possible to estimate the vulnerability or degradation of reefs over time.

Advantages of this approach
The integration of PyTorch into QGIS opens up new possibilities for AI-assisted environmental mapping:
- Open source and reproducible: the entire process can be shared, modified, or adapted to other coastal areas.
- Local autonomy: no need for ArcGIS Pro or expensive licenses to test or apply deep learning models.
- Flexible experimentation: other architectures (SegNet, DeepLabV3, etc.) can be tested, or the preprocessing can be adapted to the specific characteristics of each area (see the sketch below).
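Because segmentation_models_pytorch exposes several architectures behind the same interface, trying another decoder mostly amounts to changing one line when generating the .pth file. A sketch (note that the QGIS script’s automatic channel detection reads model.encoder.conv1, so keep a ResNet-style encoder if you reuse the script unchanged):

```python
import segmentation_models_pytorch as smp
import torch

# Same encoder and classes as before, but a DeepLabV3+ decoder instead of U-Net
model = smp.DeepLabV3Plus(
    encoder_name="resnet34",
    encoder_weights="imagenet",
    in_channels=3,
    classes=1,
)
torch.save(model, "c:/models/deeplab_coraux.pth")
```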
Towards an ecosystem of open models
Ultimately, we could imagine a shared library of open source environmental models—the free equivalent of the DLPK format—where each .pth file would be accompanied by its description file (bands, normalization, classes).
QGIS could then offer an interface for importing, testing, and documenting these models, facilitating their reuse in tropical, coastal, or forest contexts.
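Nothing prevents prototyping this idea today: a simple sidecar file stored next to each .pth could already carry the minimum metadata needed to apply a model correctly. A purely hypothetical example (neither the keys nor the file are an existing standard):

```python
import json

# Hypothetical companion file for unet_coraux.pth -- not an existing standard
metadata = {
    "model_file": "unet_coraux.pth",
    "framework": "pytorch",
    "architecture": "U-Net (resnet34 encoder)",
    "input_bands": ["B4", "B3", "B2"],
    "normalization": "min-max per image",
    "classes": {"0": "non-coral", "1": "coral"},
}
with open("c:/models/unet_coraux.json", "w") as f:
    json.dump(metadata, f, indent=2)
```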
Conclusion
The rise of deep learning marks a new stage in the evolution of geomatics.
While traditional image processing relied on thresholds and spectral indices, neural models now learn to recognize complex shapes, textures, and signatures directly in pixels.
Thanks to tools such as ArcGIS Pro (with its DLPK) and QGIS (via PyTorch and custom scripts), this power is now accessible to everyone: researchers, technicians, and environmental mapping enthusiasts.
The example presented here—the segmentation of corals from Sentinel-2 images—illustrates the potential of these approaches for detailed analysis of coastal environments and the preservation of marine ecosystems.
The challenge is no longer just technical, but also collective: pooling models, documenting their inputs, sharing methods, and making deep learning more transparent, reproducible, and open.
The future of remote sensing will likely be built at the crossroads of these two worlds—software engineering and field knowledge—to transform satellite data into true indicators of ecological status.