We have already discussed the topic of geometry validation through a series of articles (in french):
- GIS tools for validation of geometries (1)
- Geometry validation GIS tools (2): Spatiality and PostGis
- Geometry validation tools (3): FME
- Geometry validation tools (4): Geomedia Intergraph
- Geometry validation tools (5) with QGis
In this article we will discuss, in detail, how to detect and rectify geometric problems in a PostgreSQL / Postgis table using SQL queries.
Error checking with QGis.
- GEOS validation method
- QGIS validation method, and
- method for interactive verification when creating a new geometry (Preferences → Scan tab).
The Qgis method can return multiple errors per object. This method returns in the ‘Invalid output’ file a layer of polygons completed by a _errors column with a less standardized message than by the GEOS method:
Personally, I only use this tool to check for errors. I execute both methods and, if they do not return any error I consider that the layer as valid.
Meanwhile, if there are errors, I go directly to pgAdmin 4 (or failing in the database manager of QGis) to work in SQL.
Detection of Geometric Errors with Postgis
The first step is the identification of errors. Note that PostGIS complies with the OGC OpenGIS Specifications, and therefore the validity is checked against GEOS.
Therefore, we will perform a query that will search the three main types of errors:
- GEOS errors ( see simple and valid geometries )
- Null geometries, which do not make sense
- Collections of geometries, ie objects composed by several geometries of the same type or not. The most frequent examples are multipolygons that are inaccurately cataloged as FeatureCollection . In general, spatial operators do not work with input collections.
A last type of error, usually of no consequence, is the presence of duplicated nodes (two points of the entity located exactly in the same place). They do not disturb space operators, but that’s no reason not to fix them.
Here is a query that allows you to detect the first three types of errors in one pass:
(We will be using the case of a polygon layer, which is the type of entity that originates most problems.)
Correct GEOS errors with St_MakeValid
Then, we rectify the collections by transforming them into Multi_polygons. Be careful not to reverse the steps. The St_MakeValid function can produce collections as an output.
Third, delete null geometries (if the error detection query found them).
And finally we remove the duplicate nodes :
This last request is a, somewhat, diverted way of doing the operation. The St_Simplify function is a generalization function, ie a function that reduces the number of points of an entity, by removing points that are at a distance smaller than the parameter of the function. By setting 0 to this distance, the function leaves only one point among all points in a zero radius (duplicates).
Once these queries have been executed, the fault finding request is executed again. If errors persist, it will be necessary to use another method to solve them.
A technique often used is the creation of a null buffer:
At this point, if errors persist, we must give up the idea of automatically correcting them, take a breath, and do it manually.