The first part of the Collaborative Financing for the Integration of Marine Data in QGIS project has been completed, thanks to contributions from ORANGE Marine, Geoconceptos Uruguay, Janez Avzec and Gaetan Mas. Our warmest thanks to all of them. We hereby publish the final content of this part of the work.
Given the volume of the result, we’re publishing it in two parts: the first concerns the creation and management of the database, the second the S57 symbology in QGis.
The S57 format is complex and does not fully comply with GIS format standards. Its conversion with GDAL is therefore either complex or incomplete. In this article, we’ll go into detail about the complete transformation of S57 into Geopackage file tables.
Geometry types
S57 “layers” are object classes. For example, the various land zones are encoded in the LNDARE object class.
The definition of this object class is:
Geo object: Land area (LNDARE) (P,L,A)
Attributes: CONDTN OBJNAM NOBJNM STATUS INFORM NINFOM
While all LNDARE objects have the same attributes, the same cannot be said for the type of geometry. The information (P,L,A) indicates that points, lines and polygons can be found in this object class. Contrary to GIS standards, all three types of geometry coexist within the same object class.
Whatever format we choose to integrate into QGis, we’ll need to create one layer per geometry type. If we don’t, GDAL will create the layer type according to the first entity found during conversion. If it’s a point type, the layer created will be a point type, and the line and polygon entities will be ignored. Similarly, if the first entity is of line type, points and polygons will be ignored.
It should also be noted that the S57 format has no constraints on object classes: if there is no object class in the area covered by the S57 file, there will be nothing concerning it in the file (no empty layer). Similarly, if only one type of geometry is present, even though all three types are possible, there will be no trace of the other types of geometry.
The processing of an S57 file with ogr2ogr must therefore be broken down into three steps, one for each type of geometry. The following options allow you to process each S57 object class by selecting only one type of geometry:
-where « OGR_GEOMETRY=’POINT’ or OGR_GEOMETRY=’MULTIPOINT’ »
-where « OGR_GEOMETRY=’LINESTRING’ or OGR_GEOMETRY=’MULTILINESTRING’ »
-where « OGR_GEOMETRY=’POLYGON’ or OGR_GEOMETRY=’MULTIPOLYGON’ »
For certain types of driver, GDAL allows you to create prefixes in output tables. In this case, you could create all tables (points, lines, polygons) in a single geopackage, prefixing them with pt_, li_ and pl_, for example. The problem is that GDAL’s S57 driver does not allow this option. This means creating three separate geopackages, in which tables are created according to geometry type. Each geopackage will then contain a table with the same name but a different geometry. While this is the simplest solution, it is not the most elegant or practical. Here, we’ll use three import geopackages for ogr2ogr commands, into which we’ll import tables from an S57 file. We’ll call them PointsENC, LinesENC and PolysENC. We’ll also create a single geopackage called ENC for the entire database. For each ENC map, we’ll import its contents into the three import geopackages with ogr2ogr, then run Python scripts to update the ENC geopackage database with the new imported tables. We are obliged to use Python scripts instead of SQL queries and functions because the SQL version of SQLite, used for geopackages, is very limited and does not allow the creation of procedures or functions.
If you have a single S57 file to import, the process is as follows:
- Create an empty ENC geopackage.
- Load point object classes into pointsENC geopackage with ogr2ogr
- Load Lines object classes into LinesENC geopackage with ogr2ogr
- Load Polygon object classes into PolysENC geopackage with ogr2ogr
- Remove empty tables from the three geopackages and add map_enc and scale attributes.
- Cloning of tables from the three import geopackages into the ENC geopackage, prefixing pointENC tables with pt_, LinesENC with li_ and PolysENC with pl_.
If you have a batch of S57 files to load simultaneously, the process is as follows:
- Load Point object classes into geopackage pointsENC with ogr2ogr, remove empty tables and add enc file name and scale to all table attributes.
- Load Lines object classes into LinesENC geopackage with ogr2ogr, deleting empty tables and adding enc file name and scale to all table attributes.
- Load Polygon object classes into PolysENC geopackage with ogr2ogr, deleting empty tables and adding enc file name and scale to all table attributes.
- Update existing tables in the ENC geopackage from the three ET import geopackages.
- Cloning of tables from the three import geopackages that were not present in the ENC geopackage,
- Detection and deletion of any duplicates resulting from ENC geopackage table updates (optional)
Processing bathymetric soundings
Bathymetric soundings pose several problems during format conversion. The first is that the depth value is not contained in an attribute, but as Z in the geometry (XYZ). The second is that the sounds are not of the Point type, but of the Multipoint type. To obtain sound values directly in an attribute field, you need to add two parameters to the ogr2ogr command line:
Retrieving “parent” identifiers
Some objects can be grouped by a non-geographic parent object. For example, fire sectors are coded with one sector per record in the “LIGHTS” table. The different sectors of the same fire have no particular attribute that could indicate that they correspond to the same entity. This information is contained in a “parent” record. To retrieve this identifier as an attribute from the LIGTHS table, you need to add an option to the command line:
This option creates a name_rcid attribute that is common to all sectors of the same fire, but also creates a series of fields (name_rcnm,ORNT,USAG,MASK).
“List” type processing
In addition to the classic field types (integer, real, text), the S57 format uses a special type: lists of character strings or integers. These field types are found in PostgreSQL, but not in shapefiles and geopackages.
In the absence of specific options, there will be warning messages indicating that the format is not supported, and the content of these fields will be ignored. To avoid losing content, add the :
This option takes a list {..,…} and generates a text string with a first number indicating the number of elements in the list, a colon, then the list elements separated by commas:
{3} -> ‘(1:3)’
{3,1} -> ‘(2:3,1)’
This inevitably complicates the processing of these fields, but the information is not lost. On the other hand, the GDAL driver presents a problem when it comes to handling the resulting String format.
First of all, there’s no problem with the attributes of S57 objects. The field in the geopackage table is of type ‘Text’, which allows the creation of a character string of any length.
However, for “parent identifiers”, the definition used by the driver is ‘Text()’, i.e. a character string with a maximum length. This length is, a priori, determined by the content of the first record processed. In other words, it’s only by chance that you have a field with the length needed to integrate all the values in the file.
You’ll then see messages like:
WARNING : Value of field ‘NAME_RCID’ has 371 characters, whereas maximum allowed is 30
The solution isn’t complicated, but it does involve a fair amount of work. Assume that for the classic display of ENC cards, this is not a problem and you can simply ignore these warning messages.
On the other hand, if you need more advanced processing including this information, here’s the only way I’ve found yet:
Workflow for loading a single S57 file
Creating the ENC geopackage
We’re going to create an empty geopackage that will receive the imported tables, then successive updates when other S57 files are loaded.
In the QGis Explorer panel, right-click on Geopackages and select Create database
In the window that opens, enter the geopackage name, leave the default option for table name and select No geometry in Geometry type.
Click OK to have the new empty geopackage added to the QGis geopackage connections. It is advisable to delete the empty ENC table created when the geopackage was created. To do this, open the QGis database manager from the main menu, position yourself on the ENC table of the ENC.gpkg geopackage and right-click to select Delete…
ogr2ogr commands for creating import geopackages
With all this in mind, here are the three lines of ogr2ogr commands for creating tables in import geopackages:
o execute these command lines, simply open the OSGeo4W shell window:
The Shell window opens
Enter the command lines, taking care to enter the path and filename with the .000 extension.
The result will be a series of tables created in each import geopackage. On the other hand, some tables will be completely empty. Indeed, if the geometry type is possible for an object class, the table will be created, but if there is no occurrence in the processed S57 file, the table will have no record.
Using Python scripts for database management
We will use the QGis Python console for all database management operations.
To open it, go to Extensions -> Python console. The Python console panels open:
To execute a Python script, load the code into the Editor panel. You can either copy and paste, or use the Directory icon to point to the .py file. As indentation is crucial for Python scripts, you should use the second option.
Remove empty tables and add map name and scale
As we saw earlier, when executing ogr2ogr commands, empty tables are created for layers where the geometry requested in the command line is not present. We will therefore remove these empty tables and take advantage of the Python script to add two attributes to all tables: the name of the ENC map and its scale.
As far as the name is concerned, it may be useful in future database maintenance to be able to select the entities corresponding to an S57 source file. This information is not contained in any attribute of the S57 file. We therefore have to enter it manually.
The same applies to the map scale. Although an attribute is provided in S57 to enter the scale of the “paper” map, the CSCALE attribute in the M_CSCL table, it is very rare for this table and its attribute to be present in S57 files.
While the name of the map is of relative use, the same cannot be said of the scale, as this is an essential criterion in the search for duplicates. Indeed, when mixing entities from several S57 files, data from different scales will coexist more or less happily.
The following Python code deletes empty tables from an import schema and adds two attributes (enc_chart and scale) to all tables:
Be careful to respect the indentation of the Python code. To simplify the task, download the .py file directly from this link: delete_empty_tables_in_geopackages.py.
For execution, be sure to modify the script’s input parameters:
You will only need to enter the geopackage_paths the first time you run the script. On the other hand, the ENC map name and scale must be modified when importing each new S57 file.
At the end of this run, you’ll have the new attributes in each table
Workflow for loading several S57 files at once
The command lines in the previous paragraphs assume that you have a single S57 file to load. But you can load several files simultaneously. By using the -append -update options, you can loop through all the .000 files in a directory. You’ll then have all the files loaded in the import geopackages.
Unlike loading a single file, when loading a batch of files, the addition of the file name and scale must be integrated into the S57->geopackage transcoding loop. To do this, we’ll use a python script that adds the enc_chart and scale attributes to the tables created by the ogr2ogr command and populates them with the corresponding file name and a scale defined by the producer. The script also deletes any empty tables present in the import geopackage.
Thanks to Janez AVSEC for the idea of using the DSID table to retrieve the data compilation scale.
The .bat files must be run from an OSGeo4W window to benefit from the QGis Python environment.
The call to each .bat file is of the form:
.\fichier.bat “directory containing S57 files to be loaded” “name of import geopackage”
For example:
.\mass_load_s57_polys.bat “c:/enc_db/enc000” “c:/enc_db/polys.gpkg”
The .bat files for this task are as follows:
To import point type tables:
You’ll notice that, in addition to the ogr2ogr command for importing point layers, we import two tables without geometry: DSID and C_AGGR. The DSID table contains two attributes: the S57 file name and the data compilation scale. This table will then be used by the Python script (update_geopackage_dsid.py) to fill each geopackage table with the original S57 file and compilation scale.
To import line-type tables:
To import polygon tables:
The .bat files contain the ogr2ogr command and a line to execute a Python script, update_geopackage_dsid.py. Be sure to change the path of this script in the .bat file to the one of your choice. This script adds the two attributes enc_chart and scale to all geopackage tables.
The python code used is as follows:
You could create a single .bat file for all three treatments, but you’ll need to check the results for each type of geometry.
You can download the .bat files and the Python code by clicking here.
Continues regardless of the number of files loaded
Cloning and/or adding imported tables to the ENC geopackage
For the first S57 load, the following Python script creates all the tables present in the three import geopackages. For subsequent loads, if the table already exists in the ENC geopackage, the records of the table in the import geopackage are added to the records already present in ENC. If, on the other hand, the table does not yet exist, it is created and populated with records from the import geopackage. In the ENC geopackage file, tables are prefixed with pt_, li_ and pl_ to indicate their geometry type.
Cloning is performed using the following code:
You can download the .py file here: clone_or_append_tables_with_prefix.py.
Duplicate management
When two ENC maps overlap, the entities present in the overlap zone will be inserted as many times as there are overlaps. To avoid duplicates and more, the database must be cleaned after each import of an S57 file.
But there are two different types of duplicates. On the one hand, overlapping areas between two maps of similar scale will produce “real” duplicates that need to be cleaned up. On the other hand, overlapping maps of very different scales need to be treated differently. The less detailed map (small scale) will have a selection of data from the more detailed map (large scale). In the latter case, it is not advisable to remove this information.
The Python code below will remove the “true” duplicates. For other duplicates, we recommend working with QGis filters. Depending on your needs, you can use the “scale” attribute we added when deleting the empty tables to define which data will or will not be displayed on your map.
In this example, only entities corresponding to a scale between 12500 and 45000 will be displayed on your map.
This operation corresponds to the filtering of a layer, but… what do you do when you have 130 layers displayed? To save you having to perform this operation 130 times, here’s a Python code that lets you filter all the layers in your project:
And another to remove the filter from all displayed layers:
Download the python codes here.
The following code creates a temporary table (temp_table) for each table in the ENC geopackage. It then fills this table with the separate LNAM-based records. It then deletes the original table and renames the temporary table with the name of the original table.
This treatment applies to all S57 tables, except:
- fire (LIGHTS) and bathymetric sounding (SOUNDG) tables, where the LNAM attribute is not useful. In fact, each fire sector has the same LNAM and name_rcid, but corresponds to a different color and geographic sector. Similarly, the LNAM and name_rcid of bathymetric soundings are the same for all sounding points in the original multipoint entity. In the first case, we need to search for duplicate colors and sectors for the same name_rcid, and for probes we need to find probes with the same XY in their geometry.
- SEAARE, LNDRGN and ADMARE tables, which cover large areas and are split according to the map footprint, while keeping the same LNAM. In this case, even if the LNAM is the same, the polygons do not have the same shape and area.
- DSID and C_AGGR tables, which have no LNAM and, by principle, cannot have duplicates.
You can download the .py file from this link: supprime_doublons.py. Don’t forget to change the paths and filenames to match your own. Note also the line :
if table_name not in [“pl_SEAARE”, “pl_LNDRGN”, “pl_ADMARE”, “layer_styles”, “natsurf”,”DSID”,”C_AGGR”]:
In principle, you’ll only have the prefixed tables and the layer_styles and natsurf tables. This line prevents pl_SEAARE, pl_LNDRGN and pl_ADMARE from being processed, but more importantly, it prevents the layer_styles and natsurf tables we use for symbologies from being emptied. If you need to create tables other than those from the s57 file in the geopackage, you need to take into account the names of these tables in this script, by adding them to this line of exceptions.