
Commit 4fa44cf

committed Apr 17, 2013
Merge pull request #532 from ccrook/delimited_text_bug_fixes
Three delimited text bug fixes, GUI tidyup, context help fix
2 parents 22e54b9 + e1421f5 commit 4fa44cf

20 files changed: +1070 -790 lines changed
 

‎resources/context_help/QgsDelimitedTextPluginGui-en_US

Lines changed: 0 additions & 59 deletions
This file was deleted.
Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
+<h3>Delimited Text File Layer</h3>
+Loads and displays delimited text files
+<p>
+<a href="#re">Overview</a><br/>
+<a href="#creating">Creating a delimited text layer</a><br/>
+<a href="#csv">How the delimiter, quote, and escape characters work</a><br />
+<a href="#regexp">How regular expression delimiters work</a><br />
+<a href="#wkt">How WKT text is interpreted</a><br />
+<a href="#example">Example of a text file with X,Y point coordinates</a><br/>
+<a href="#wkt_example">Example of a text file with WKT geometries</a><br/>
+<a href="#notes">Notes</a><br/>
+</p>
+
+<h4><a name="re">Overview</a></h4>
+<p>A &quot;delimited text file&quot; contains data in which each record starts on a new line, and
+is split into fields by a delimiter such as a comma.
+This type of file is commonly exported from spreadsheets (for example CSV files) or databases.
+Typically the first line of a delimited text file contains the names of the fields.
+</p>
+<p>
+Delimited text files can be loaded into QGIS as a layer.
+The records can be displayed spatially either as a point
+defined by X and Y coordinates, or using a Well Known Text (WKT) definition of a geometry which may
+describe points, lines, and polygons of arbitrary complexity. The file can also be loaded as an attribute
+only table, which can then be joined to other tables in QGis.
+</p>
+<p>
+In addition to the geometry definition the file can contain text, integer, and real number fields. QGis
+will choose the type of field based on its contents.
+</p>
+<h4><a name="creating">Creating a delimited text layer</a></h4>
+<p>Creating a delimited text layer involves choosing the data file, defining the format (how each record is to
+be split into fields), and defining how the geometry is represented.
+This is managed with the delimited text dialog as detailed below.
+The dialog box displays a sample from the beginning of the file which shows how the format
+options have been applied.
+</p>
+<h5>Choosing the data file</h5>
+<p>Use the &quot;Browse...&quot; button to select the data file. Once the file is selected the
+layer name will automatically be populated based on the file name. The layer name is used to represent
+the data in the QGis legend.
+</p>
+<p>
+By default files are assumed to be encoded as UTF-8. However other file
+encodings can be selected. For example &quot;System&quot; uses the default encoding for the operating system.
+If you are expecting to move the QGis project then it is safer to use a specific encoding.
+</p>
+<h5>Specifying the file format</h5>
+<p>The file format can be one of
+<ul>
+<li>CSV file format. This is a format commonly used by spreadsheets, in which fields are delimited
+by a comma character, and quoted using a &quot;(quote) character. Within quoted fields, a quote
+mark is entered as &quot;&quot;.</li>
+<li>Selected delimiters. Each record is split into fields using one or more delimiter characters.
+Quote characters are used for fields which may contain delimiters. Escape characters may be used
+to treat the following character as a normal character (i.e. to include delimiter, quote, and
+new line characters in text fields). The use of delimiter, quote, and escape characters is detailed <a href="#csv">below</a>.
+<li>Regular expression. Each line is split into fields using a &quot;regular expression&quot; delimiter.
+The use of regular expressions is detailed <a href="#regexp">below</a>.
+</ul>
+<h5>Record and field options</h5>
+<p>The following options affect the selection of records and fields from the data file</p>
+<ul>
+<li>Number of header lines to discard: used to skip over header lines at the beginning of the text file</li>
+<li>First record has field names: if selected then the first record in the file (after the skipped lines) is interpreted as names of fields, rather than as a data record.</li>
+<li>Trim fields: if selected then leading and trailing whitespace characters will be removed from each field (except quoted fields). </li>
+<li>Discard empty fields: if selected then empty fields (after trimming) will be discarded. This
+affects the alignment of data into fields and is equivalent to treating consecutive delimiters as a
+single delimiter. Quoted fields are never discarded.</li>
+<li>Decimal point is comma: if selected then commas in real numbers represent the decimal point. For
+example &quot;-51,354&quot; is equivalent to -51.354.
+</li>
+</ul>
+<h5>Geometry definition</h5>
+<p>The geometry can be defined as one of</p>
+<ul>
+<li>Point coordinates: each feature is represented as a point defined by X and Y coordinates.</li>
+<li>Well known text (WKT) geometry: each feature is represented as a well known text string, for example
+&quot;POINT(1.525622 51.20836)&quot;. See details of the <a href="#wkt">well known text</a> format.
+<li>No geometry (attribute only table): records will not be displayed on the map, but can be viewed
+in the attribute table and joined to other layers in QGis</li>
+</ul>
+<p>For point coordinates the following options apply:</p>
+<ul>
+<li>X field: specifies the field containing the X coordinate</li>
+<li>Y field: specifies the field containing the Y coordinate</li>
+<li>DMS angles: if selected coordinates are represented as degrees/minutes/seconds
+or degrees/minutes. QGis is quite permissive in its interpretation of degrees/minutes/seconds.
+A valid DMS coordinate will contain three numeric fields with an optional hemisphere prefix or suffix
+(N, E, or + are positive, S, W, or - are negative). Additional non numeric characters are
+generally discarded. For example &quot;N41d54'01.54&quot;&quot; is a valid coordinate.
+</li>
+</ul>
+<p>For well known text geometry the following options apply:</p>
+<ul>
+<li>Geometry field: the field containing the well known text definition.</li>
+<li>Geometry type: one of &quot;Detect&quot; (detect), &quot;Point&quot;, &quot;Line&quot;, or &quot;Polygon&quot;.
+QGis layers can only display one type of geometry feature (point, line, or polygon). This option selects
+which geometry type is displayed in text files containing multiple geometry types. Records containing
+other geometry types are discarded.
+If &quot;Detect&quot; is selected then the type of the first geometry in the file will be used.
+&quot;Point&quot; includes POINT and MULTIPOINT WKT types, &quot;Line&quot; includes LINESTRING and
+MULTILINESTRING WKT types, and &quot;Polygon&quot; includes POLYGON and MULTIPOLYGON WKT types.
+</ul>
+
+<h4><a name="csv">How the delimiter, quote, and escape characters work</a></h4>
+<p>Records are split into fields using three character sets: delimiter characters, quote characters,
+and escape characters. Quote and escape characters cannot be the same as delimiter characters - they
+will be ignored if they are. Escape characters can be the same as quote characters, but behave differently
+if they are.</p>
+<p>The delimiter characters are used to mark the end of each field. If more than one delimiter character
+is defined then any one of the characters can mark the end of a field. The quote and escape characters
+can override the delimiter character, so that it is treated as a normal character.</p>
+<p>Quote characters may be used to mark the beginning and end of quoted fields. Quoted fields can
+contain delimiters and may span multiple lines in the text file. If a field is quoted then it must
+start and end with the same quote character. Quote characters cannot occur within a field unless they
+are escaped.</p>
+<p>Escape characters which are not quote characters force the following character to be treated normally
+(that is, to stop it being treated as a new line, delimiter, or quote character).
+</p>
+<p>If a quote character is also an escape character, then it can be represented in a quoted field by
+entering it twice. For example if ' is a quote character and an escape character, then the string
+'Smith''s&nbsp;Creek' will represent the value Smith's&nbsp;Creek.
+</p>
+<h4><a name="regexp">How regular expression delimiters work</a></h4>
+<p>Regular expressions are a mini-language used to represent character patterns. There are many variations
+of regular expression syntax - QGis uses the syntax provided by the <a href="http://qt-project.org/doc/qt-4.8/qregexp.html">QRegExp</a> class of the <a href="http://qt.digia.com">Qt</a> framework.</p>
+<p>In a regular expression delimited file each line is treated as a record. Each match of the regular expression in the line is treated as the end of a field.</p>
+
+<h4><a name="wkt">How WKT text is interpreted</a></h4>
+<p>
+The delimited text layer recognizes the following
+<a href="http://en.wikipedia.org/wiki/Well-known_text">well known text</a> types -
+POINT, MULTIPOINT, LINESTRING, MULTILINESTRING, POLYGON, and MULTIPOLYGON. It will accept geometries with
+a Z coordinate (eg &quot;POINT&nbsp;Z&quot;), a measure (&quot;POINT&nbsp;M&quot;), or both (&quot;POINT&nbsp;ZM&quot;).
+</p>
+<p>
+It can also handle the PostGIS EWKT variation, in which the geometry is preceded by a spatial reference
+system id (eg &quot;SRID=4326;POINT(175.3&nbsp;41.2)&quot;), and a variant used by Informix in which the WKT is
+preceded by an integer spatial reference id (eg &quot;1 POINT(175.3&nbsp;41.2)&quot;).
+In both cases the SRID is ignored.
+</p>
+<h4><a name="example">Example of a text file with X,Y point coordinates</a></h4>
+<pre>
+X;Y;ELEV<br />
+-300120;7689960;13<br />
+-654360;7562040;52<br />
+1640;7512840;3<br />
+</pre>
+<p>This file:</p>
+<ul>
+<li> Uses <b>;</b> as delimiter. Any character can be used to delimit the fields.</li>
+<li>The first row is the header row. It contains the field names X, Y and ELEV.</li>
+<li>No quotes (") are used to delimit text fields.</li>
+<li>The x coordinates are contained in the X field.</li>
+<li>The y coordinates are contained in the Y field.</li>
+</ul>
+<h4><a name="wkt_example">Example of a text file with WKT geometries</a></h4>
+<pre>
+id|wkt<br />
+1|POINT(172.0702250 -43.6031036)<br />
+2|POINT(172.0702250 -43.6031036)<br />
+3|POINT(172.1543206 -43.5731302)<br />
+4|POINT(171.9282585 -43.5493308)<br />
+5|POINT(171.8827359 -43.5875983)<br />
+</pre>
+<p>This file:</p>
+<ul>
+<li>Has two fields defined in the header row: id and wkt.
+<li>Uses <b>|</b> as a delimiter.</li>
+<li>Specifies each point using the WKT notation
+</ul>
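
Reviewer note: the quote and escape rules described in the help text above can be illustrated with a small standalone sketch. This is not the provider's parser. `splitRecord` is a hypothetical helper written in plain standard C++, and it assumes a single-line record and that the quote and escape sets do not overlap the delimiter set (the help text says such characters are ignored, which this sketch does not model).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not QGIS code): splits one record into fields using
// sets of delimiter, quote, and escape characters.
std::vector<std::string> splitRecord( const std::string &line,
                                      const std::string &delims,
                                      const std::string &quotes,
                                      const std::string &escapes )
{
  std::vector<std::string> fields;
  std::string field;
  char openQuote = 0; // 0 while outside a quoted field

  for ( std::size_t i = 0; i < line.size(); ++i )
  {
    const char c = line[i];
    const bool isQuote = quotes.find( c ) != std::string::npos;
    const bool isEscape = escapes.find( c ) != std::string::npos;
    const bool hasNext = i + 1 < line.size();

    if ( openQuote )
    {
      if ( c == openQuote )
      {
        // A quote that is also an escape is written twice to stand for itself
        if ( isEscape && hasNext && line[i + 1] == c )
        {
          field += c;
          ++i;
        }
        else
        {
          openQuote = 0; // closing quote
        }
      }
      else if ( isEscape && !isQuote && hasNext )
      {
        field += line[++i]; // escaped character taken literally
      }
      else
      {
        field += c;
      }
    }
    else if ( isEscape && !isQuote && hasNext )
    {
      field += line[++i]; // escaped character taken literally
    }
    else if ( isQuote )
    {
      openQuote = c; // start of a quoted field
    }
    else if ( delims.find( c ) != std::string::npos )
    {
      fields.push_back( field ); // delimiter ends the current field
      field.clear();
    }
    else
    {
      field += c;
    }
  }
  fields.push_back( field );
  return fields;
}
```

With ' as both quote and escape character, the record 1;'Smith''s Creek';x splits into the fields 1, Smith's Creek, and x, matching the example in the help text.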

‎src/core/qgsvectorlayer.h

Lines changed: 1 addition & 1 deletion
@@ -336,7 +336,7 @@ struct CORE_EXPORT QgsVectorJoinInfo
 *
 * - decimalPoint=c
 *
-* Defines a character that is used as a decimal point in the X and Y columns.
+* Defines a character that is used as a decimal point in the numeric columns
 * The default is '.'.
 *
 * - xyDms=(yes|no)

‎src/providers/delimitedtext/qgsdelimitedtextfeatureiterator.cpp

Lines changed: 13 additions & 5 deletions
@@ -216,16 +216,24 @@ void QgsDelimitedTextFeatureIterator::fetchAttribute( QgsFeature& feature, int f
   switch ( P->attributeFields[fieldIdx].type() )
   {
     case QVariant::Int:
-      if ( !value.isEmpty() )
-        val = QVariant( value );
-      else
+      if ( value.isEmpty() )
         val = QVariant( P->attributeFields[fieldIdx].type() );
+      else
+        val = QVariant( value );
       break;
     case QVariant::Double:
-      if ( !value.isEmpty() )
+      if ( value.isEmpty() )
+      {
+        val = QVariant( P->attributeFields[fieldIdx].type() );
+      }
+      else if ( P->mDecimalPoint.isEmpty() )
+      {
         val = QVariant( value.toDouble() );
+      }
       else
-        val = QVariant( P->attributeFields[fieldIdx].type() );
+      {
+        val = QVariant( QString( value ).replace( P->mDecimalPoint, "." ).toDouble() );
+      }
       break;
     default:
       val = QVariant( value );

‎src/providers/delimitedtext/qgsdelimitedtextprovider.cpp

Lines changed: 4 additions & 0 deletions
@@ -372,6 +372,10 @@ QgsDelimitedTextProvider::QgsDelimitedTextProvider( QString uri )
       }
       if ( couldBeDouble[fieldPos] )
       {
+        if ( ! mDecimalPoint.isEmpty() )
+        {
+          value.replace( mDecimalPoint, "." );
+        }
         value.toDouble( &couldBeDouble[fieldPos] );
       }
       fieldPos++;

‎src/providers/delimitedtext/qgsdelimitedtextsourceselect.cpp

Lines changed: 10 additions & 5 deletions
@@ -35,7 +35,8 @@ QgsDelimitedTextSourceSelect::QgsDelimitedTextSourceSelect( QWidget * parent, Qt
     mFile( new QgsDelimitedTextFile() ),
     mExampleRowCount( 20 ),
     mColumnNamePrefix( "Column_" ),
-    mPluginKey( "/Plugin-DelimitedText" )
+    mPluginKey( "/Plugin-DelimitedText" ),
+    mLastFileType("")
 {
 
   setupUi( this );
@@ -250,7 +251,7 @@ void QgsDelimitedTextSourceSelect::loadSettings( QString subkey, bool loadGeomSe
   QString encoding = settings.value( key + "/encoding", "" ).toString();
   if ( ! encoding.isEmpty() ) cbxEncoding->setCurrentIndex( cbxEncoding->findText( encoding ) );
   QString delimiters = settings.value( key + "/delimiters", "" ).toString();
-  if ( delimiters.isEmpty() ) setSelectedChars( delimiters );
+  if ( ! delimiters.isEmpty() ) setSelectedChars( delimiters );
 
   txtQuoteChars->setText( settings.value( key + "/quoteChars", "\"" ).toString() );
   txtEscapeChars->setText( settings.value( key + "/escapeChars", "\"" ).toString() );
@@ -262,14 +263,14 @@ void QgsDelimitedTextSourceSelect::loadSettings( QString subkey, bool loadGeomSe
   cbxUseHeader->setChecked( settings.value( key + "/useHeader", "true" ) != "false" );
   cbxTrimFields->setChecked( settings.value( key + "/trimFields", "false" ) == "true" );
   cbxSkipEmptyFields->setChecked( settings.value( key + "/skipEmptyFields", "false" ) == "true" );
+  cbxPointIsComma->setChecked( settings.value( key + "/decimalPoint", "." ).toString().contains( "," ) );
 
   if ( loadGeomSettings )
   {
     QString geomColumnType = settings.value( key + "/geomColumnType", "xy" ).toString();
     if ( geomColumnType == "xy" ) geomTypeXY->setChecked( true );
     else if ( geomColumnType == "wkt" ) geomTypeWKT->setChecked( true );
     else geomTypeNone->setChecked( true );
-    cbxPointIsComma->setChecked( settings.value( key + "/decimalPoint", "." ).toString().contains( "," ) );
     cbxXyDms->setChecked( settings.value( key + "/xyDms", "false" ) == "true" );
   }
 
@@ -297,13 +298,13 @@ void QgsDelimitedTextSourceSelect::saveSettings( QString subkey, bool saveGeomSe
   settings.setValue( key + "/useHeader", cbxUseHeader->isChecked() ? "true" : "false" );
   settings.setValue( key + "/trimFields", cbxTrimFields->isChecked() ? "true" : "false" );
   settings.setValue( key + "/skipEmptyFields", cbxSkipEmptyFields->isChecked() ? "true" : "false" );
+  settings.setValue( key + "/decimalPoint", cbxPointIsComma->isChecked() ? "," : "." );
   if ( saveGeomSettings )
   {
     QString geomColumnType = "none";
     if ( geomTypeXY->isChecked() ) geomColumnType = "xy";
     if ( geomTypeWKT->isChecked() ) geomColumnType = "wkt";
     settings.setValue( key + "/geomColumnType", geomColumnType );
-    settings.setValue( key + "/decimalPoint", cbxPointIsComma->isChecked() ? "," : "." );
     settings.setValue( key + "/xyDms", cbxXyDms->isChecked() ? "true" : "false" );
   }
 
@@ -313,7 +314,10 @@ void QgsDelimitedTextSourceSelect::loadSettingsForFile( QString filename )
 {
   if ( filename.isEmpty() ) return;
   QFileInfo fi( filename );
-  loadSettings( fi.suffix(), true );
+  QString filetype=fi.suffix();
+  // Don't expect to change settings if not changing file type
+  if( filetype != mLastFileType ) loadSettings( fi.suffix(), true );
+  mLastFileType = filetype;
 }
 
 void QgsDelimitedTextSourceSelect::saveSettingsForFile( QString filename )
@@ -476,6 +480,7 @@ void QgsDelimitedTextSourceSelect::updateFieldLists()
 
   tblSample->setHorizontalHeaderLabels( fieldList );
   tblSample->resizeColumnsToContents();
+  tblSample->resizeRowsToContents();
 
   // We don't know anything about a text based field other
   // than its name. All fields are assumed to be text
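
Reviewer note: the `loadSettingsForFile` change above reloads saved settings only when the file suffix differs from the previous one. The sketch below models that behaviour without Qt. `SettingsLoader`, `suffix`, and `reloadCount` are hypothetical names introduced for illustration, and the suffix rule (text after the last '.' in the file name) is an approximation of `QFileInfo::suffix`.

```cpp
#include <string>

// Hypothetical stand-in for the dialog (not QGIS code): counts how often
// settings would be reloaded as different files are selected.
class SettingsLoader
{
  public:
    // Text after the last '.' in the file name part, or "" when there is none.
    static std::string suffix( const std::string &filename )
    {
      const std::string::size_type slash = filename.find_last_of( "/\\" );
      const std::string name =
        slash == std::string::npos ? filename : filename.substr( slash + 1 );
      const std::string::size_type dot = name.rfind( '.' );
      return dot == std::string::npos ? std::string() : name.substr( dot + 1 );
    }

    void loadSettingsForFile( const std::string &filename )
    {
      if ( filename.empty() )
        return;
      const std::string filetype = suffix( filename );
      // Reload only when the file type changes, as in the hunk above
      if ( filetype != mLastFileType )
        ++mReloadCount;
      mLastFileType = filetype;
    }

    int reloadCount() const { return mReloadCount; }

  private:
    std::string mLastFileType; // starts empty, like mLastFileType("")
    int mReloadCount = 0;
};
```

Selecting a.csv, then b.csv, then c.txt triggers two reloads in this model. One consequence of starting from an empty last type is that a first file with no suffix never triggers a reload, which the sketch reproduces.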

‎src/providers/delimitedtext/qgsdelimitedtextsourceselect.h

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ class QgsDelimitedTextSourceSelect : public QDialog, private Ui::QgsDelimitedTex
     int mExampleRowCount;
     QString mColumnNamePrefix;
     QString mPluginKey;
+    QString mLastFileType;
 
   private slots:
     void on_buttonBox_accepted();
