Virtual layers

There are some situations where the user would want to access data from one of different sources, indirectly, through some kind of transformations. This could be selecting a subset of attributes, renaming some of them, accessing subsets of a layer or compute dynamic attributes and geometries based on one or more underlying tables.
This concept can already be found in the context of RDBMS, where it is called a "view".

The following proposition will discuss how such a feature could be implemented in QGIS.

There is already something very close that can be found in the GDAL library: VRT. A VRT is a virtual dataset that can be composed of other datasets. Here is an example of such a file, where two X and Y woordinates from a primary dataset named "ROUTE_XY" are used to build a point layer :

<OGRVRTDataSource>
<OGRVRTLayer name="ROUTE_XY">
<SrcDataSource>ODBC:ROUTE_XY,ROUTE_XY</SrcDataSource>
<SrcLayer>ROUTE_XY</SrcLayer>
<SrcSQL>SELECT * FROM ROUTE_XY</SrcSQL>
<GeometryType>wkbPoint</GeometryType>
<LayerSRS>EPSG:2154</LayerSRS>
<GeometryField encoding="PointFromColumns" x="X" y="Y"/>
</OGRVRTLayer>
</OGRVRTDataSource>

The idea here is that if data of the primary dataset changes, the VRT layer is dynamically updated.

The GDAL's VRT driver is very powerful, but is limited to data sources handled by GDAL.

This is why we propose to have a similar concept for QGIS, where every data sources known by QGIS can be used.

Possible use cases

The proposed design is based on these different use cases for vector layers. Feel free to add/discuss your own here.

  • create a dynamic point layer based on X,Y(,Z) coordinates
  • give access to a subset of attributes of a layer
  • give access to some renamed attributes of a layer
  • give access to attributes of a layer with dynamic type casting
  • give access to attributes that are computed, based on the value of other attributes
  • allow to filter features based on a SQL query, even for data sources not designed for that (CSV, ODS, etc.), and including Spatialite operators (Transform, Intersects, etc.)
  • allow to join different layers (including with a spatial clause)
Two types of user profiles are considered here :
  • a "basic" usage of virtual layers where such a table is linked to only one layer and a subset of attributes where the user can specify their name, their type and possibly add new attributes based on an expression / SQL operators.
  • an "advanced" usage of virtual layers, where SQL can be used to build the view and where more than one layer can be linked.

Proposed design

Regarding the "advanced" mode, SQLITE has a very useful feature of virtual tables that could be used here. The idea would be to build a SQLITE database (possibly in memory) and give access to QGIS layers via a CREATE VIRTUAL TABLE statement. This way, we could benefit from the advanced SQL query language implemented in SQLITE.
Moreover, the built database should be a Spatialite one, in order to be able to access geometries and use spatial operators.
This leads us to propose a QGIS virtual table provider for Spatialite.

The "basic" mode would then be used through a special UI that ease creation of a Spatialite base with a virtual table.

Here are some questions/issues/remarks raised after a first very simple Python prototype of virtual layers by means of a QGIS plugin (https://github.com/Oslandia/qgis_vtable):
  • References to primary data sources can be stored either as source URI, or as QGIS layerid.
    • The former would allow to have virtual layers that directly embed other layers, without the need for the embedded layers to be loaded beforehand by QGIS.
    • The latter would allow to reference memory layers, but would need every embedded layers to appear as standalone layer.
  • Offering a complete Spatialite geometric view from QGIS data sources implies to return a BLOB for geometries formatted with the internal Spatialite format for geometries. The Python API regarding virtual tables support is too limited to implement that.
  • Regular indices for attributes (btree) cannot be created on SQLITE's virtual tables
  • R-Tree indices for spatial indexing can be added on a virtual table (not verified). This could be useful for computed geometries
  • In-memory spatialite databases are possible. It could be used for spatialite databases that are only composed of virtual layers. It would have to be storable directly into the project file, and re-created on project loading.
    • That implies there is a data provider (Spatialite) that is able to create a virtual layer based on a list of layer references, passed as its source URI (or through custom properties ?)
    • That also implies dependencies of vector layers during project loading : a virtual layer cannot reference a layer that has not yet been created. Possible workarounds: add a way to express dependencies between layers on project saving (-) or only check for embedded layer existence on first feature access, rather than during the creation (+)
    • If reference to embedded layers or stored as ID, a virtual layer could normally reference a memory layer that have been saved by the memory layer saver plugin.

Possible UI integration

The creation of a virtual layer would be available through a new option for existing Spatialite databases (in QSpatialite probably).

Some shortcuts would be provided to create a virtual table, for instance, by right-cliking on a vector layer, it would be proposed to create a new virtual layer based on the selected layer. This could also work for multiple layers (in case of an "advanced" use for SQL joins)

Possible dialog for the creation of a virtual layer:

These shortcuts would automatically create a spatialite databse with virtual tables and views according to parameters set by the user.
For the previous example, it would be equivalent to the creation of a Spatialite database with the following statements :

CREATE VIRTUAL TABLE virtual_layer_ USING QgsVirtualVectorLayer("point_layer");
CREATE VIEW virtual_layer AS SELECT NOM as Nom, CAST(HAUTZ AS INTEGER) AS Hauteur, MakePoint(X,Y) AS Geometry FROM virtual_layer_;
# commands to declare this view as a spatial view
INSERT INTO geometry_columns ....

For more advanced uses, a shortcut in QSpatialite would be added to ease the creation of a virtual table from a QGIS layer.
Example of a spatial join with a virtual layer that embeds two other layers:

CREATE VIRTUAL TABLE point_layer_vl USING QgsVirtualVectorLayer("point_layer");
CREATE VIRTUAL TABLE polygon_layer_vl USING QgsVirtualVectorLayer("polygon_layer");
CREATE VIEW virtual_layer AS SELECT b.id, b.geometry where Contains(b.geom, a.geom) FROM point_layer_vl AS a, polygon_layer_vl AS b;
INSERT INTO geometry_columns ...

new_virtual_layer.png (42.8 KB) Hugo Mercier, 2014-04-22 07:52 AM

new_virtual_layer.png (42.1 KB) Hugo Mercier, 2014-04-23 02:12 AM