Commit

Merge pull request #527 from ccrook/master
Fix of delimited text provider to handle CSV files including quoted newlines properly
timlinux committed Apr 15, 2013
2 parents 82b41db + 632bfbb commit fab2c57
Showing 21 changed files with 3,886 additions and 1,021 deletions.
263 changes: 259 additions & 4 deletions src/core/qgsvectorlayer.h
@@ -142,8 +142,252 @@ struct CORE_EXPORT QgsVectorJoinInfo


/** \ingroup core
* Vector layer backed by a data source provider.
* Represents a vector layer which manages a vector-based data set.
*
* The QgsVectorLayer is instantiated by specifying the name of a data provider,
* such as postgres or wfs, and a url defining the specific data set to connect to.
* The vector layer constructor in turn instantiates a QgsVectorDataProvider subclass
* corresponding to the provider type, and passes it the url. The data provider
* connects to the data source.
*
* The QgsVectorLayer provides a common interface to the different data types. It also
* manages editing transactions.
*
* Sample usage of the QgsVectorLayer class:
*
* \code
* QString uri = "point?crs=epsg:4326&field=id:integer";
* QgsVectorLayer *scratchLayer = new QgsVectorLayer(uri, "Scratch point layer", "memory");
* \endcode
*
* The main data providers supported by QGIS are listed below.
*
* \section providers Vector data providers
*
* \subsection memory Memory data provider (memory)
*
* The memory data provider is used to construct in-memory data, for example scratch
* data or data generated from spatial operations such as contouring. There is no
* inherent persistent storage of the data. The data source url specifies the
* geometry type ("point", "linestring", "polygon", "multipoint", "multilinestring",
* "multipolygon"), optionally followed by url parameters as follows:
*
* - crs=definition
* Defines the coordinate reference system to use for the layer.
* definition is any string accepted by QgsCoordinateReferenceSystem::createFromString()
*
* - index=yes
* Specifies that the layer will be constructed with a spatial index
*
* - field=name:type(length,precision)
* Defines an attribute of the layer. Multiple field parameters can be added
* to the data provider definition. type is one of "integer", "double", "string".
*
* An example url is "Point?crs=epsg:4326&field=id:integer&field=name:string(20)&index=yes"
*
* \subsection ogr OGR data provider (ogr)
*
* Accesses data using the OGR drivers (http://www.gdal.org/ogr/ogr_formats.html). The url
* is the OGR connection string. A wide variety of data formats can be accessed using this
* driver, including file based formats used by many GIS systems, database formats, and
* web services. Some of these formats are also supported by custom data providers listed
* below.
*
* \subsection spatialite Spatialite data provider (spatialite)
*
* Accesses data in a spatialite database. The url defines the connection parameters, table,
* geometry column, and other attributes. The url can be constructed using the
* QgsDataSourceURI class.
*
* \subsection postgres Postgresql data provider (postgres)
*
* Connects to a postgresql database. The url defines the connection parameters, table,
* geometry column, and other attributes. The url can be constructed using the
* QgsDataSourceURI class.
*
* \subsection mssql Microsoft SQL server data provider (mssql)
*
* Connects to a Microsoft SQL server database. The url defines the connection parameters, table,
* geometry column, and other attributes. The url can be constructed using the
* QgsDataSourceURI class.
*
* \subsection sqlanywhere SQL Anywhere data provider (sqlanywhere)
*
* Connects to an SQL Anywhere database. The url defines the connection parameters, table,
* geometry column, and other attributes. The url can be constructed using the
* QgsDataSourceURI class.
*
* \subsection wfs WFS (web feature service) data provider (wfs)
*
* Used to access data provided by a web feature service.
*
* The url can be an HTTP url to a WFS 1.0.0 server or a GML2 data file path.
* Examples are http://foobar/wfs or /foo/bar/file.gml
*
* If a GML2 file path is provided the driver will attempt to read the schema from a
* file in the same directory with the same basename + ".xsd". This xsd file must be
* in the same format as a WFS describe feature type response. If no xsd file is provided
* then the driver will attempt to guess the attribute types from the file.
*
* In the case of an HTTP URL the 'FILTER' query string parameter can be used to filter
* the WFS feature type. The 'FILTER' key value can either be a QGIS expression
* or an OGC XML filter. If the value is set to a QGIS expression the driver will
* turn it into an OGC XML filter before passing it to the WFS server. Beware that the
* QGIS expression filter only supports the "=, !=, <, >, <=, >=, AND, OR, NOT, LIKE, IS NULL"
* attribute operators, the "BBOX, Disjoint, Intersects, Touches, Crosses, Contains, Overlaps, Within"
* spatial binary operators, and the QGIS local "geomFromWKT, geomFromGML"
* geometry constructor functions.
*
* Also note:
*
* - You can use various functions available in the QGIS Expression list,
* however the function must exist server side and have the same name and arguments to work.
*
* - Use the special $geometry parameter to provide the layer geometry column as input
* into the spatial binary operators, e.g. intersects($geometry, geomFromWKT('POINT (5 6)'))
*
* \subsection delimitedtext Delimited text file data provider (delimitedtext)
*
* Accesses data in a delimited text file, for example CSV files generated by
* spreadsheets. The contents of the file are split into columns based on specified
* delimiter characters. Each record may be represented spatially either by
* X and Y coordinate columns, or by a WKT (well known text) formatted column.
*
* The url defines the filename, the formatting options (how the
* text in the file is divided into data fields), and which fields contain the
* X,Y coordinates or WKT text definition. The options are specified as url query
* items.
*
* At its simplest the url can just be the filename, in which case it will be loaded
* as a CSV formatted file.
*
* The url may include the following items:
*
* - encoding=UTF-8
*
* Defines the character encoding in the file. The default is UTF-8. To use
* the default encoding for the operating system use "System".
*
* - type=(csv|regexp|whitespace|plain)
*
* Defines the algorithm used to split records into columns. Records are
* defined by new lines, except for csv format files for which quoted fields
* may span multiple lines. The default type is csv.
*
* - "csv" splits the file based on three sets of characters:
* delimiter characters, quote characters,
* and escape characters. Delimiter characters mark the end
* of a field. Quote characters enclose a field which can contain
* delimiter characters, and newlines. Escape characters cause the
* following character to be treated literally (including delimiter,
* quote, and newline characters). Escape and quote characters must
* be different from delimiter characters. Escape characters that are
* also quote characters are treated specially - they can only
* escape themselves within quotes. Elsewhere they are treated as
* quote characters. The defaults for delimiter, quote, and escape
* are ',', '"', '"'.
* - "regexp" splits each record using a regular expression (see QRegExp
* documentation for details).
* - "whitespace" splits each record based on whitespace (one or more whitespace
* characters). Leading whitespace in the record is ignored.
* - "plain" is provided for backwards compatibility. It is equivalent to
* csv except that the default quote characters are single and double quotes,
* and there are no escape characters.
*
* - delimiter=characters
*
* Defines the delimiter characters used for csv and plain type files, or the
* regular expression for regexp type files. It is a literal string of characters
* except that "\t" may be used to represent a tab character.
*
* - quote=characters
*
* Defines the characters that are used as quote characters for csv and plain type
* files.
*
* - escape=characters
*
* Defines the characters used to escape delimiter, quote, and newline characters.
*
* - skipEmptyFields=(yes|no)
*
* If yes then empty fields will be discarded (equivalent to concatenating consecutive
* delimiters)
*
* - trimFields=(yes|no)
*
* If yes then leading and trailing whitespace will be removed from fields
*
* - skipLines=n
*
* Defines the number of lines to ignore at the beginning of the file (default 0)
*
* - useHeader=(yes|no)
*
* Defines whether the first record in the file (after skipped lines) contains
* column names (default yes)
*
* - xField=column yField=column
*
* Defines the names of the columns holding the x and y coordinates for XY point geometries.
* If useHeader is no (i.e. there are no column names), then this is the column
* number (with the first column as 1).
*
* - decimalPoint=c
*
* Defines a character that is used as a decimal point in the X and Y columns.
* The default is '.'.
*
* - xyDms=(yes|no)
*
* If yes then the X and Y coordinates are interpreted as
* degrees/minutes/seconds format (fairly permissively),
* or degree/minutes format.
*
* - wktField=column
*
* Defines the name of the column holding the WKT geometry definition for WKT geometries.
* If useHeader is no (i.e. there are no column names), then this is the column
* number (with the first column as 1).
*
* - geomType=(point|line|polygon|none)
*
* Defines the geometry type for WKT type geometries. QGIS will only display one
* type of geometry for the layer - any others will be ignored when the file is
* loaded. By default the provider uses the type of the first geometry in the file.
* Use geomType to override this type.
*
* geomType can also be set to none, in which case the layer is loaded without
* geometries.
*
* - crs=crsstring
*
* Defines the coordinate reference system used for the layer. This can be
* any string accepted by QgsCoordinateReferenceSystem::createFromString()
*
* - quiet
*
* Errors encountered loading the file will not be reported in a user dialog if
* quiet is included (they will still be shown in the output log).
*
* \subsection gpx GPX data provider (gpx)
*
* Provider reads tracks, routes, and waypoints from a GPX file. The url
* defines the name of the file, and the type of data to retrieve from it
* ("track", "route", or "waypoint").
*
* An example url is "/home/user/data/holiday.gpx?type=route"
*
* \subsection grass Grass data provider (grass)
*
* Provider to display vector data in a GRASS GIS layer.
*
*
*
*/


class CORE_EXPORT QgsVectorLayer : public QgsMapLayer
{
Q_OBJECT
@@ -235,7 +479,18 @@ class CORE_EXPORT QgsVectorLayer : public QgsMapLayer
QList<GroupData> mGroups;
};

/** Constructor */
/** Constructor - creates a vector layer
*
* The QgsVectorLayer is constructed by instantiating a data provider. The provider
* interprets the supplied path (url) of the data source to connect to and access the
* data.
*
* @param path The path or url of the data source. Typically this encodes
* parameters used by the data provider as url query items.
* @param baseName The name used to represent the layer in the legend
* @param providerLib The name of the data provider, e.g. "memory", "postgres"
*
*/
QgsVectorLayer( QString path = QString::null, QString baseName = QString::null,
QString providerLib = QString::null, bool loadDefaultStyleFlag = true );
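
The documentation in this diff describes the `path` argument as a data source name followed by url query items. A minimal sketch of assembling such a string is shown here. `buildProviderUrl` is a hypothetical helper that is not part of QGIS, it uses plain `std::string` rather than `QString`, and it assumes the option values need no percent-encoding.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of QGIS): joins a source name and a list of
// key/value options into "source?key=value&key=value".
// Assumption: keys and values contain no characters that need URL escaping.
std::string buildProviderUrl(
  const std::string &source,
  const std::vector<std::pair<std::string, std::string>> &options )
{
  std::string url = source;
  char separator = '?';  // first option follows '?', later ones follow '&'
  for ( const auto &option : options )
  {
    url += separator;
    url += option.first;
    url += '=';
    url += option.second;
    separator = '&';
  }
  return url;
}
```

With the examples quoted in the class documentation, `buildProviderUrl( "/home/user/data/holiday.gpx", { { "type", "route" } } )` yields the gpx url, and repeating the `field` key yields the memory provider url. The resulting string would then be converted with `QString::fromStdString()` and passed as `path`, with the provider name ("gpx", "memory") as `providerLib`.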

@@ -337,7 +592,7 @@ class CORE_EXPORT QgsVectorLayer : public QgsMapLayer
* @see deselect(QgsFeatureIds)
* @see deselect(QgsFeatureId)
*/
void modifySelection(QgsFeatureIds selectIds, QgsFeatureIds deselectIds );
void modifySelection( QgsFeatureIds selectIds, QgsFeatureIds deselectIds );

/** Select not selected features and deselect selected ones */
void invertSelection();
@@ -940,7 +1195,7 @@ class CORE_EXPORT QgsVectorLayer : public QgsMapLayer
*
* @see deselect(QgsFeatureId)
*/
void deselect(const QgsFeatureIds& featureIds );
void deselect( const QgsFeatureIds& featureIds );

/**
* Clear selection
1 change: 1 addition & 0 deletions src/providers/delimitedtext/CMakeLists.txt
@@ -5,6 +5,7 @@
SET (DTEXT_SRCS
qgsdelimitedtextfeatureiterator.cpp
qgsdelimitedtextprovider.cpp
qgsdelimitedtextfile.cpp
qgsdelimitedtextsourceselect.cpp
)

