Aerial images are a gold mine: old surveys, forest inventories, coastal monitoring, digitized cadastral maps, etc.
But they often suffer from a major problem: low resolution. Blurring, pixelation, crushed textures, loss of detail… these are all limitations that complicate spatial analysis.
In recent years, a deep learning technique has changed what is possible here: ESRGAN (Enhanced Super-Resolution Generative Adversarial Network).
ESRGAN started as a research model for single-image super-resolution and was quickly adopted by the video game modding community to upscale textures; it has since become a valuable tool in remote sensing, photogrammetry, and geomatics.
In this article, we will explore how ESRGAN can improve the resolution of old aerial images, what its advantages and limitations are, and how to use it in a QGIS-compatible processing chain.
Why use ESRGAN for aerial images?
Unlike traditional resampling algorithms (bicubic, bilinear), ESRGAN does not simply enlarge the image.
It synthesizes plausible detail with a neural network trained on thousands of pairs of low- and high-resolution images.
ESRGAN can:
- reconstruct the textures of roofs, roads, fields, and coastlines
- refine lines (riverbanks, buildings, agricultural boundaries)
- make small structures that would otherwise be invisible legible
- improve the quality of old digitized orthophotos
- prepare images for segmentation or automatic classification
For geomatics specialists, this is a game changer:
we can finally take advantage of old aerial images whose quality had previously been a hindrance.
How does ESRGAN work? (simplified version)
ESRGAN is based on two neural networks:
- A generator that attempts to produce a realistic high-resolution version.
- A discriminator that attempts to distinguish between the generated images and the actual high-resolution images.
The two improve each other until they produce a detailed, sharp, and consistent image.
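To make the interplay between the two networks concrete, here is a minimal Python sketch of the relativistic average adversarial loss described in the ESRGAN paper, computed on plain lists of scores. The function names and score values are illustrative assumptions. The full ESRGAN generator objective also includes perceptual and pixel-wise content terms, which are left out here.

```python
import math
from typing import Sequence, Tuple


def sigmoid(x: float) -> float:
    """Logistic function mapping a raw score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def relativistic_losses(
    real_scores: Sequence[float], fake_scores: Sequence[float]
) -> Tuple[float, float]:
    """Return (discriminator_loss, generator_loss) for one batch.

    real_scores and fake_scores are raw discriminator outputs for real
    high-resolution patches and for generated patches (hypothetical values).
    """
    mean_real = mean(real_scores)
    mean_fake = mean(fake_scores)
    # How much more realistic is each real patch than the average fake one?
    d_real = [sigmoid(r - mean_fake) for r in real_scores]
    # How much more realistic is each fake patch than the average real one?
    d_fake = [sigmoid(f - mean_real) for f in fake_scores]

    # The discriminator wants d_real close to 1 and d_fake close to 0.
    d_loss = -mean([math.log(p) for p in d_real]) - mean(
        [math.log(1.0 - p) for p in d_fake]
    )
    # The generator wants the opposite, so the two roles are swapped.
    g_loss = -mean([math.log(1.0 - p) for p in d_real]) - mean(
        [math.log(p) for p in d_fake]
    )
    return d_loss, g_loss
```

When the discriminator cannot tell the two sets apart (identical scores), both losses are equal; when it separates them cleanly, its own loss drops while the generator's loss grows, which is the pressure that pushes the generator toward more realistic textures.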
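Real-ESRGAN, the variant used in the rest of this article, is particularly useful for aerial photographs because:
- it learns from a wide range of natural-image textures, many of which resemble ground, roofs, and vegetation
- it is trained on synthetically degraded inputs (blur, noise, JPEG compression), so it tolerates noisy sources
- it can attenuate artifacts specific to old scans (grain, halos)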
Example: 4x super-resolution on an aerial photo (generic)
We start with a digitized aerial image measuring 512×512 px.
- Original resolution: blurry, details erased
- ESRGAN ×4 → output in 2048×2048 px
- Improvements observed:
  - better readability of plots
  - clearer roof contours
  - roads and tracks clearly visible
  - better reconstructed ground textures
Using Real-ESRGAN-ncnn-vulkan on Windows
We will cover:
- download & folder
- basic command and useful parameters
- “georeferenced image → SR → GeoTIFF” pipeline (automatic)
- ready-to-paste Windows batch script
- tips/pitfalls to avoid and advanced options
- option: chain GFPGAN (face/detail restoration)
1) Download & preparation (quick)
- Download the realesrgan-ncnn-vulkan archive from the project’s Releases page (file realesrgan-ncnn-vulkan-*-windows.zip).
- Unzip into C:\SR\realesrgan-ncnn-vulkan-20220424-windows, for example.
- Place the input images in C:\SR\input\. The executable itself reads JPG, PNG, and WebP; GeoTIFFs must be converted to PNG first (see step 3).
- Create C:\SR\output\ for the results.
2) Basic command and parameters (explained)
Minimum command (x4, realesrgan-x4plus model):
.\realesrgan-ncnn-vulkan.exe -i C:\SR\input\image.png -o C:\SR\output\image_x4.png -n realesrgan-x4plus
Useful parameters:
- -i <path>: input image
- -o <path>: output image
- -n <model_name>: model (realesrgan-x4plus, realesrnet-x4plus, realesrgan-x4plus-anime, etc.). The default is an anime video model, so always pass -n for aerial images
- -s <scale>: magnification factor (2, 3, or 4; default 4). The x4plus models are trained for ×4, so keep -s 4
- -t <N>: tile size (e.g., -t 200; 0 = automatic). Smaller tiles reduce GPU memory usage, which is useful for large images
- -j <load:proc:save>: thread counts for loading, processing, and saving (optional)
- -g <index>: GPU index (if multiple)
- -h: displays all options
Tip: if your machine is not very powerful, use -t 150 or -t 100.
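When the tool is driven from a script rather than typed by hand, it helps to assemble the argument list in one place. The sketch below is a hypothetical helper, not part of Real-ESRGAN. It uses only the -i, -o, -n, -s, -t, and -g options of the ncnn-vulkan executable (in this build, tile size is set with -t). The limits it checks are assumptions taken from the tool's help text, so verify them with -h on your version.

```python
def build_realesrgan_command(exe, src, dst, model="realesrgan-x4plus",
                             scale=4, tile=0, gpu=None):
    """Assemble the argument list for realesrgan-ncnn-vulkan.

    tile=0 lets the tool choose a tile size automatically.
    gpu=None leaves the device selection to the tool.
    """
    # Assumed limits, as printed by the tool's help (-h).
    if scale not in (2, 3, 4):
        raise ValueError("scale must be 2, 3 or 4")
    if tile != 0 and tile < 32:
        raise ValueError("tile must be 0 (auto) or at least 32")

    cmd = [
        str(exe),
        "-i", str(src),
        "-o", str(dst),
        "-n", model,
        "-s", str(scale),
        "-t", str(tile),
    ]
    if gpu is not None:
        cmd += ["-g", str(gpu)]
    return cmd
```

The returned list can be passed directly to subprocess.run(cmd, check=True); using a list instead of a single string avoids quoting problems with Windows paths that contain spaces.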
3) Pipeline for georeferenced images (simple and reliable method)
Principle: upscaling is performed on the raster image (PNG/TIF), then a GeoTIFF is recreated by reapplying the geographic envelope (bounds) of the source image, which preserves the spatial geometry and adjusts the resolution to the magnification factor.
Steps (manual)
- Retrieve the bounds and CRS of the original: gdalinfo -json C:\SR\input\image.tif > info.json or directly read Upper Left (ulx, uly) and Lower Right (lrx, lry) from gdalinfo image.tif.
- Run Real-ESRGAN on a PNG copy of the image (for example, one produced with gdal_translate -of PNG): .\realesrgan-ncnn-vulkan.exe -i C:\SR\input\image.png -o C:\SR\output\image_x4.png -n realesrgan-x4plus -t 150
- Recreate a GeoTIFF with the same envelope but a new size. If the original envelope is (ulx, uly, lrx, lry), reuse it. Example: gdal_translate -of GTiff -a_srs EPSG:XXXXX -a_ullr ulx uly lrx lry C:\SR\output\image_x4.png C:\SR\output\image_x4_georef.tif
  - -a_srs EPSG:XXXXX: the CRS of the original (e.g., EPSG:4326 or EPSG:3857)
  - -a_ullr ulx uly lrx lry: corners in spatial coordinates (Upper Left X,Y; Lower Right X,Y)
Why it works: we apply the original geographic envelope to the super-resolved image. The new pixel size will automatically correspond to a new spatial resolution (pixel size = original_pixel_size / scale).
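The arithmetic behind this can be checked with a few lines of Python. The bounds below are hypothetical values in a metric CRS, chosen so that the 512×512 source has 1 m pixels; the function name is also an assumption made for this sketch.

```python
def pixel_size_from_bounds(ulx, uly, lrx, lry, width, height):
    """Return (x_size, y_size) in CRS units per pixel for a north-up raster."""
    return (lrx - ulx) / width, (uly - lry) / height


# Hypothetical envelope: 512 m x 512 m footprint.
bounds = (700000.0, 6600000.0, 700512.0, 6599488.0)

original = pixel_size_from_bounds(*bounds, width=512, height=512)
upscaled = pixel_size_from_bounds(*bounds, width=2048, height=2048)

print(original)  # (1.0, 1.0)
print(upscaled)  # (0.25, 0.25)
```

The envelope is unchanged, so the nominal pixel size is divided by the scale factor. This is a property of the file header only: it says nothing about how much real ground detail the new pixels contain.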
Note: if you prefer to set the georeferencing explicitly on an existing file, you can write the bounds in place with gdal_edit.py -a_ullr or set the full geotransform from a GDAL Python script; for north-up images, gdal_translate -a_ullr is simple and robust.
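The manual steps can also be scripted. The following Python sketch reads the output of gdalinfo -json (the cornerCoordinates, size, and geoTransform keys) and prepares the gdal_translate arguments. The helper names are hypothetical and the sample values are made up; the JSON report is used here because it keeps full coordinate precision, whereas the text report rounds coordinates for display.

```python
import json


def georef_from_gdalinfo(json_text):
    """Extract the envelope, size and a rotation flag from `gdalinfo -json` output."""
    info = json.loads(json_text)
    corners = info["cornerCoordinates"]
    ulx, uly = corners["upperLeft"]
    lrx, lry = corners["lowerRight"]
    gt = info.get("geoTransform")
    # Coefficients 2 and 4 are the rotation/shear terms of the geotransform.
    rotated = bool(gt) and (gt[2] != 0 or gt[4] != 0)
    return {"ullr": [ulx, uly, lrx, lry], "size": info["size"], "rotated": rotated}


def gdal_translate_args(georef, srs, src, dst):
    """Build the gdal_translate call that reapplies the original envelope."""
    if georef["rotated"]:
        raise ValueError("rotated raster: -a_ullr would drop the rotation")
    ullr = [repr(float(v)) for v in georef["ullr"]]
    return ["gdal_translate", "-of", "GTiff", "-a_srs", srs,
            "-a_ullr", *ullr, str(src), str(dst)]


# Made-up report for a 512 x 512 north-up image with 1 m pixels.
sample = json.dumps({
    "size": [512, 512],
    "geoTransform": [700000.0, 1.0, 0.0, 6600000.0, 0.0, -1.0],
    "cornerCoordinates": {
        "upperLeft": [700000.0, 6600000.0],
        "lowerRight": [700512.0, 6599488.0],
    },
})
georef = georef_from_gdalinfo(sample)
args = gdal_translate_args(georef, "EPSG:2154", "image_x4.png", "image_x4_georef.tif")
```

In a real run, json_text would come from subprocess.run(["gdalinfo", "-json", path], capture_output=True, text=True).stdout, and args would be passed to subprocess.run in the same way.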
4) Windows batch script (automatic for an entire folder)
Save the following script as C:\SR\run_sr_georef.bat, adjust the paths and the EPSG code to your data, then run it.
@echo off
setlocal enabledelayedexpansion
REM Run from the folder of this script so that input\ and output\ resolve
cd /d "%~dp0"
set OUTDIR=output
set SR_EXE=C:\SR\realesrgan-ncnn-vulkan-20220424-windows\realesrgan-ncnn-vulkan.exe
if not exist "%OUTDIR%" mkdir "%OUTDIR%"
REM === Loop over every TIF in input ===
for %%F in (input\*.tif) do (
echo -------------------------------
echo Processing: %%F
set BASENAME=%%~nF
REM === Read the corner coordinates from gdalinfo ===
for /f "tokens=2,3 delims=(,)" %%a in ('gdalinfo "%%F" ^| find "Upper Left"') do (
set ULX=%%a
set ULY=%%b
)
for /f "tokens=2,3 delims=(,)" %%a in ('gdalinfo "%%F" ^| find "Lower Right"') do (
set LRX=%%a
set LRY=%%b
)
echo ULX=!ULX!
echo ULY=!ULY!
echo LRX=!LRX!
echo LRY=!LRY!
REM === 1. Temporary PNG copy ===
gdal_translate -of PNG "%%F" temp.png
REM === 2. Super-resolution, with the model named explicitly ===
"%SR_EXE%" -i temp.png -o temp_sr.png -n realesrgan-x4plus -s 4
REM === 3. Conversion to a georeferenced TIF ===
gdal_translate -of GTiff -a_srs EPSG:2154 -a_ullr !ULX! !ULY! !LRX! !LRY! temp_sr.png "%OUTDIR%\!BASENAME!_sr.tif"
echo Result: %OUTDIR%\!BASENAME!_sr.tif
echo DONE
)
REM === Delete temporary files, including the sidecar GDAL may write ===
del temp.png
del temp_sr.png
if exist temp.png.aux.xml del temp.png.aux.xml

Notes:
- This script assumes that gdalinfo.exe and gdal_translate.exe are on the PATH (for example, run it from the OSGeo4W Shell, or add the paths if necessary).
- The script first converts to PNG because the executable reads JPG, PNG, and WebP rather than TIF; this also sidesteps TIF files with exotic compression.
- -a_ullr uses the exact bounds. If the original geotransform includes a rotation, -a_ullr discards it. This is rare in orthophotos, but if there is rotation, a more advanced GDAL Python script will be needed.
5) Tips & pitfalls to avoid
- Keep the original: always work on a copy, never on the source file.
- Unsupported or exotic TIFs: convert to PNG before running the executable.
- Rotation/shearing: -a_ullr assumes a north-up image. For georeferenced images with rotation, rewrite the complete geotransform with a GDAL Python script (SetGeoTransform).
- Hallucinations: ESRGAN can invent details that do not exist on the ground; check the output visually and document the processing.
- Huge size: for very large images (>10,000 px), use a small tile size (-t 50 to -t 150) and make sure enough RAM and temporary disk space are available.
- Metadata: the final GeoTIFF does not carry over the metadata of the source image (EXIF or TIFF tags); if needed, add it with gdal_translate -mo "KEY=VALUE" or gdal_edit.py -mo.
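For the rotation case mentioned above, the full six-coefficient geotransform has to be rescaled instead of relying on the corner bounds. Below is a minimal sketch in plain Python, assuming the GDAL coefficient order (origin x, pixel width, row rotation, origin y, column rotation, pixel height); the function names and the sample values are hypothetical.

```python
def scale_geotransform(gt, scale):
    """Geotransform of the same footprint after upscaling the raster by `scale`.

    The origin (upper-left corner) stays fixed; the four per-pixel terms
    shrink because each original pixel is now `scale` pixels wide and high.
    """
    x0, a, b, y0, d, e = gt
    return (x0, a / scale, b / scale, y0, d / scale, e / scale)


def pixel_to_world(gt, col, row):
    """Apply the affine geotransform to a pixel coordinate."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


# Hypothetical rotated raster, 512 x 512 px.
gt = (700000.0, 0.8, 0.6, 6600000.0, 0.6, -0.8)
gt_x4 = scale_geotransform(gt, 4)

# The far corner must land on the same ground position before and after.
corner_before = pixel_to_world(gt, 512, 512)
corner_after = pixel_to_world(gt_x4, 2048, 2048)
```

With the GDAL Python bindings installed, the result could then be written to the upscaled file by opening it in update mode and calling SetGeoTransform(gt_x4) on the dataset; that part is left out so the sketch runs without GDAL.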
6) Add GFPGAN (optional)
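GFPGAN is a face-restoration model, so it is rarely relevant for vertical aerial imagery. It can help with oblique or ground-level archive photos in which faces remain damaged after SR:
- Install GFPGAN (it is distributed as a Python project; use a packaged executable only if one is available for your platform).
- Run GFPGAN on the SR output, for example with the project's inference script (-s 1 keeps the current size, -o is an output folder):
python inference_gfpgan.py -i image_x4.png -o results -v 1.3 -s 1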
Integration into QGIS
There are three simple methods for using ESRGAN in QGIS:
1. Via Python processing (Processing Toolbox)
Create a Python script that:
- takes a raster as input
- launches ESRGAN
- automatically reloads the enhanced image into QGIS
2. Via an external plugin
Create a custom plugin based on:
- an “ESRGAN Super-resolution” button
- a simple interface: model, ×2/×4 factor, input file
3. Via a file-based workflow
Run ESRGAN as a pre-processing step outside QGIS, then import the resulting GeoTIFFs
→ ideal for old orthophotos.
Advantages
✔ Gives new life to aerial archives
✔ Enables more detailed remote sensing analysis
✔ Useful for extracting objects (buildings, beaches, ravines, etc.)
✔ Works with highly degraded images
✔ 100% open source ecosystem
Limitations
⚠ ESRGAN does not predict reality:
- it reconstructs plausible details, which are not guaranteed to be accurate.
⚠ Risk of artifacts on:
- high-contrast edges
- heterogeneous urban areas
- deep shadows
⚠ Avoid using for:
- regulatory mapping
- analyses intended to reflect reality down to the pixel
However, for exploratory mapping, data preparation, or heritage work, it is an extremely powerful tool.
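One way to document the processing is to add a simple numeric check next to the visual inspection: shrink the super-resolved image back to the original size and compare it with the source. The sketch below does this with box averaging on grayscale images stored as nested lists. It is an illustrative assumption, not part of ESRGAN, and the function names are hypothetical. A large error flags an output that contradicts its own input; a small error does not prove that the added details are real, and some difference is expected when the model removes noise or blur.

```python
def box_downsample(img, factor):
    """Average each factor x factor block of a 2D grayscale image."""
    height, width = len(img), len(img[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the factor")
    out = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [img[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def mean_abs_difference(a, b):
    """Mean absolute difference between two images of the same size."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def consistency_error(low_res, super_res, factor):
    """Compare the source with the super-resolved image shrunk back down."""
    return mean_abs_difference(low_res, box_downsample(super_res, factor))
```

For real rasters, the pixel values would typically be read with GDAL or an imaging library and the check run per band; recording the resulting value alongside the model name and scale factor gives a minimal processing log.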
Towards a specialized ESRGAN for aerial images?
An exciting avenue of research: training an ESRGAN specifically on:
- high-resolution orthophotos
- current aerial images (10–20 cm/pixel)
- natural tropical, coastal, or agricultural textures
This would result in a model perfectly suited to scanning archives from the 1960s to the 1990s.
Conclusion
ESRGAN significantly extends what we can do with archival aerial imagery.
It makes it possible to restore plausible detail to old data and turn it into a usable source for modern geomatics, provided its limitations are kept in mind.
Access to open source models and PyTorch paves the way for a new ecosystem of free tools… serving history, the environment, and cartography.

