Commit dfd448e

committed Apr 16, 2013
Delimited text file: tidying up UI layout and labels, updating help file
1 parent d06c308 commit dfd448e

2 files changed

+868
-655
lines changed

 

‎resources/context_help/QgsDelimitedTextSourceSelectBase-en_US

Lines changed: 158 additions & 45 deletions
@@ -1,59 +1,172 @@
1-
<h3>Delimited Text Layer Plugin</h3>
2-
Loads and displays delimited text files containing x,y coordinates.
1+
<h3>Delimited Text File Layer</h3>
2+
Loads and displays delimited text files
33
<p>
4-
<p>
5-
<a href="#re">Requirements</a><br/>
6-
<a href="#example">Example of a valid text file</a><br/>
7-
<a href="#wkt_example">Example of a valid text file with a WKT field</a><br/>
4+
<a href="#re">Overview</a><br/>
5+
<a href="#creating">Creating a delimited text layer</a><br/>
6+
<a href="#csv">How the delimiter, quote, and escape characters work</a><br />
7+
<a href="#regexp">How regular expression delimiters work</a><br />
8+
<a href="#wkt">How WKT text is interpreted</a><br />
9+
<a href="#example">Example of a text file with X,Y point coordinates</a><br/>
10+
<a href="#wkt_example">Example of a text file with WKT geometries</a><br/>
811
<a href="#notes">Notes</a><br/>
12+
</p>
13+
14+
<h4><a name="re">Overview</a></h4>
15+
<p>A &quot;delimited text file&quot; contains data in which each record starts on a new line, and
16+
is split into fields by a delimiter such as a comma.
17+
This type of file is commonly exported from spreadsheets (for example CSV files) or databases.
18+
Typically the first line of a delimited text file contains the names of the fields.
19+
</p>
20+
<p>
21+
Delimited text files can be loaded into QGIS as a layer.
22+
The records can be displayed spatially either as a point
23+
defined by X and Y coordinates, or using a Well Known Text (WKT) definition of a geometry which may
24+
describe points, lines, and polygons of arbitrary complexity. The file can also be loaded as an attribute
25+
only table, which can then be joined to other tables in QGis.
26+
</p>
27+
<p>
28+
In addition to the geometry definition the file can contain text, integer, and real number fields. QGis
29+
will choose the type of field based on its contents.
30+
</p>
31+
<h4><a name="creating">Creating a delimited text layer</a></h4>
32+
<p>Creating a delimited text layer involves choosing the data file, defining the format (how each record is to
33+
be split into fields), and defining how the geometry is represented.
34+
This is managed with the delimited text dialog as detailed below.
35+
The dialog box displays a sample from the beginning of the file which shows how the format
36+
options have been applied.
37+
</p>
38+
<h5>Choosing the data file</h5>
39+
<p>Use the &quot;Browse...&quot; button to select the data file. Once the file is selected the
40+
layer name will automatically be populated based on the file name. The layer name is used to represent
41+
the data in the QGis legend.
42+
</p>
43+
<p>
44+
By default files are assumed to be encoded as UTF-8. However other file
45+
encodings can be selected. For example &quot;System&quot; uses the default encoding for the operating system.
46+
If you are expecting to move the QGis project then it is safer to use a specific encoding.
47+
</p>
48+
<h5>Specifying the file format</h5>
49+
<p>The file format can be one of
50+
<ul>
51+
<li>CSV file format. This is a format commonly used by spreadsheets, in which fields are delimited
52+
by a comma character, and quoted using a &quot;(quote) character. Within quoted fields, a quote
53+
mark is entered as &quot;&quot;.</li>
54+
<li>Selected delimiters. Each record is split into fields using one or more delimiter characters.
55+
Quote characters are used for fields which may contain delimiters. Escape characters may be used
56+
to treat the following character as a normal character (i.e. to include delimiter, quote, and
57+
new line characters in text fields). The use of delimiter, quote, and escape characters is detailed <a href="#csv">below</a>.</li>
58+
<li>Regular expression. Each line is split into fields using a &quot;regular expression&quot; delimiter.
59+
The use of regular expressions is detailed <a href="#regexp">below</a>.</li>
60+
</ul>
61+
<h5>Record and field options</h5>
62+
<p>The following options affect the selection of records and fields from the data file</p>
63+
<ul>
64+
<li>Number of header lines to discard: used to skip over header lines at the beginning of the text file</li>
65+
<li>First record has fields names: if selected then the first record in the file (after the skipped lines) is interpreted as names of fields, rather than as a data record.</li>
66+
<li>Trim fields: if selected then leading and trailing whitespace characters will be removed from each field (except quoted fields). </li>
67+
<li>Discard empty fields: if selected then empty fields (after trimming) will be discarded. This
68+
affects the alignment of data into fields and is equivalent to treating consecutive delimiters as a
69+
single delimiter. Quoted fields are never discarded.</li>
70+
<li>Decimal point is comma: if selected then commas in real numbers represent the decimal point. For
71+
example &quot;-51,354&quot; is equivalent to -51.354.
72+
</li>
73+
</ul>
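The record and field options above can be illustrated with a small Python sketch. This is not the QGIS implementation: the helper names `clean_fields` and `parse_real` are hypothetical, and the sketch assumes each field arrives as a (text, was_quoted) pair and that a whitespace-only field counts as empty only when trimming is enabled.

```python
def clean_fields(fields, trim=True, discard_empty=True):
    """Apply 'Trim fields' and 'Discard empty fields' to one record.

    fields is a list of (text, was_quoted) pairs (an assumption of this sketch).
    Quoted fields are neither trimmed nor discarded.
    """
    cleaned = []
    for text, was_quoted in fields:
        if trim and not was_quoted:
            text = text.strip()
        if discard_empty and not was_quoted and text == "":
            continue
        cleaned.append(text)
    return cleaned


def parse_real(text, decimal_comma=False):
    """Parse a real number, optionally treating a comma as the decimal point."""
    if decimal_comma:
        text = text.replace(",", ".")
    return float(text)


record = [(" a ", False), ("", False), ("  ", False), ("", True), (" b ", True)]
print(clean_fields(record))
print(parse_real("-51,354", decimal_comma=True))
```

Dropping empty fields shifts later values to the left, which is why the option changes how data lines up with the field names.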
74+
<h5>Geometry definition</h5>
75+
<p>The geometry can be defined as one of</p>
76+
<ul>
77+
<li>Point coordinates: each feature is represented as a point defined by X and Y coordinates.</li>
78+
<li>Well known text (WKT) geometry: each feature is represented as a well known text string, for example
79+
&quot;POINT(1.525622 51.20836)&quot;. See details of the <a href="#wkt">well known text</a> format.</li>
80+
<li>No geometry (attribute only table): records will not be displayed on the map, but can be viewed
81+
in the attribute table and joined to other layers in QGis</li>
82+
</ul>
83+
<p>For point coordinates the following options apply:</p>
84+
<ul>
85+
<li>X field: specifies the field containing the X coordinate</li>
86+
<li>Y field: specifies the field containing the Y coordinate</li>
87+
<li>DMS angles: if selected coordinates are represented as degrees/minutes/seconds
88+
or degrees/minutes. QGis is quite permissive in its interpretation of degrees/minutes/seconds.
89+
A valid DMS coordinate will contain three numeric fields with an optional hemisphere prefix or suffix
90+
(N, E, or + are positive, S, W, or - are negative). Additional non numeric characters are
91+
generally discarded. For example &quot;N41d54'01.54&quot;&quot; is a valid coordinate.
92+
</li>
93+
</ul>
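As a rough illustration of the DMS rules described above, the following Python sketch converts a degrees/minutes/seconds or degrees/minutes string to decimal degrees. The function name `parse_dms` is hypothetical and the parsing rules are a simplification; the actual QGIS parser may accept or reject different inputs.

```python
import re


def parse_dms(text):
    """Convert a DMS or DM string to decimal degrees (simplified sketch).

    N, E, or + mark positive values; S, W, or - mark negative values.
    The hemisphere may be a prefix or a suffix. Other non-numeric
    characters are ignored.
    """
    s = text.strip().upper()
    sign = 1.0
    if s and s[0] in "NE+SW-":
        if s[0] in "SW-":
            sign = -1.0
        s = s[1:]
    elif s and s[-1] in "NESW":
        if s[-1] in "SW":
            sign = -1.0
        s = s[:-1]
    numbers = re.findall(r"\d+(?:\.\d+)?", s)
    if not 2 <= len(numbers) <= 3:
        raise ValueError("expected degrees/minutes or degrees/minutes/seconds")
    parts = [float(n) for n in numbers] + [0.0]
    degrees, minutes, seconds = parts[:3]
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)


print(parse_dms("N41d54'01.54\""))
print(parse_dms("43 36 W"))
```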
94+
<p>For well known text geometry the following options apply:</p>
95+
<ul>
96+
<li>Geometry field: the field containing the well known text definition.</li>
97+
<li>Geometry type: one of &quot;Detect&quot; (detect), &quot;Point&quot;, &quot;Line&quot;, or &quot;Polygon&quot;.
98+
QGis layers can only display one type of geometry feature (point, line, or polygon). This option selects
99+
which geometry type is displayed in text files containing multiple geometry types. Records containing
100+
other geometry types are discarded.
101+
If &quot;Detect&quot; is selected then the type of the first geometry in the file will be used.
102+
&quot;Point&quot; includes POINT and MULTIPOINT WKT types, &quot;Line&quot; includes LINESTRING and
103+
MULTILINESTRING WKT types, and &quot;Polygon&quot; includes POLYGON and MULTIPOLYGON WKT types.
104+
</ul>
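The geometry type selection described above can be sketched in Python as follows. The names `WKT_CATEGORY`, `geometry_category`, and `select_geometries` are hypothetical, and the sketch only inspects the leading WKT keyword; it does not handle strings that carry a spatial reference prefix.

```python
import re

# Mapping from WKT type keyword to the layer geometry type it belongs to.
WKT_CATEGORY = {
    "POINT": "Point", "MULTIPOINT": "Point",
    "LINESTRING": "Line", "MULTILINESTRING": "Line",
    "POLYGON": "Polygon", "MULTIPOLYGON": "Polygon",
}


def geometry_category(wkt):
    """Return "Point", "Line", or "Polygon" for a WKT string, else None."""
    match = re.match(r"\s*([A-Za-z]+)", wkt)
    if match is None:
        return None
    return WKT_CATEGORY.get(match.group(1).upper())


def select_geometries(wkts, geometry_type="Detect"):
    """Keep only the records matching the chosen geometry type.

    With "Detect", the type of the first geometry in the list is used.
    """
    if geometry_type == "Detect":
        geometry_type = geometry_category(wkts[0]) if wkts else None
    if geometry_type is None:
        return []
    return [w for w in wkts if geometry_category(w) == geometry_type]


records = ["POINT(1 2)", "LINESTRING(0 0, 1 1)", "MULTIPOINT((3 4))"]
print(select_geometries(records))
print(select_geometries(records, "Line"))
```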
105+
106+
<h4><a name="csv">How the delimiter, quote, and escape characters work</a></h4>
107+
<p>Records are split into fields using three character sets: delimiter characters, quote characters,
108+
and escape characters. Quote and escape characters cannot be the same as delimiter characters - they
109+
will be ignored if they are. Escape characters can be the same as quote characters, but behave differently
110+
if they are.</p>
111+
<p>The delimiter characters are used to mark the end of each field. If more than one delimiter character
112+
is defined then any one of the characters can mark the end of a field. The quote and escape characters
113+
can override the delimiter character, so that it is treated as a normal character.</p>
114+
<p>Quote characters may be used to mark the beginning and end of quoted fields. Quoted fields can
115+
contain delimiters and may span multiple lines in the text file. If a field is quoted then it must
116+
start and end with the same quote character. Quote characters cannot occur within a field unless they
117+
are escaped.</p>
118+
<p>Escape characters which are not quote characters force the following character to be treated normally
119+
(that is, to stop it being treated as a new line, delimiter, or quote character).
120+
</p>
121+
<p>If a quote character is also an escape character, then it can be represented in a quoted field by
122+
entering it twice. For example if ' is a quote character and an escape character, then the string
123+
'Smith''s&nbsp;Creek' will represent the value Smith's&nbsp;Creek.
124+
</p>
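The doubled-quote rule can be tried out with Python's standard `csv` module, which follows a similar convention when `doublequote=True`. This is only a stand-in for illustration; it is not the parser QGIS uses, and the helper name `split_record` is hypothetical.

```python
import csv
import io


def split_record(line, delimiter=";", quote="'"):
    """Split one record; a doubled quote inside a quoted field is a literal quote."""
    reader = csv.reader(
        io.StringIO(line),
        delimiter=delimiter,
        quotechar=quote,
        doublequote=True,
    )
    return next(reader)


# The quoted fields contain an escaped quote and a delimiter respectively.
print(split_record("1;'Smith''s Creek';'a;b'"))
```

Here the second field is read as Smith's Creek, and the delimiter inside the third field is kept as part of the value.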
125+
<h4><a name="regexp">How regular expression delimiters work</a></h4>
126+
<p>Regular expressions are a mini-language used to represent character patterns. There are many variations
127+
of regular expression syntax - QGis uses the syntax provided by the <a href="http://qt-project.org/doc/qt-4.8/qregexp.html">QRegExp</a> class of the <a href="http://qt.digia.com">Qt</a> framework.</p>
128+
<p>In a regular expression delimited file each line is treated as a record. Each match of the regular expression in the line is treated as the end of a field.</p>
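The idea of a regular expression delimiter can be sketched with Python's `re` module. Python's regular expression syntax is not identical to QRegExp, so this is an approximation of the behaviour rather than a reproduction of it; the helper name `split_by_regexp` is hypothetical.

```python
import re


def split_by_regexp(line, pattern):
    """Split a line into fields, ending a field at each match of the pattern."""
    return re.split(pattern, line)


# A semicolon with optional surrounding whitespace acts as the delimiter.
print(split_by_regexp("1 ;  Smith ; 42", r"\s*;\s*"))

# One or more whitespace characters act as the delimiter.
print(split_by_regexp("172.07   -43.60\t13", r"\s+"))
```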
9129

10-
<a name="re">
11-
<h4>Requirements</h4>
12-
</a>
13-
To view a delimited text file as layer, the text file must contain:
14-
<ol>
15-
<li>A delimited header row of field names. This must be the first line in the text file.</li>
16-
<li>The header row must contain an X and Y field <em>or</em> a Well Known Text (WKT) field. These fields can have any name.</li>
17-
<li>The <B>x</B> and <B>y</B> coordinates must be specified as a number. The coordinate system is not important.</li>
18-
<li>A WKT field must be in the standard format.
19-
</ol>
20-
<a name="example">
21-
<h4>Example of a valid text file with x and y fields</h4>
22-
</a>
23-
<tt>
24-
X;Y;ELEV<br/>
25-
-300120;7689960;13<br/>
26-
-654360;7562040;52<br/>
27-
1640;7512840;3<br/>
28-
[...]<br/>
29-
</tt>
30-
<a name="wkt_example">
31-
<h4>Example of a valid text file with a WKT field</h4>
32-
</a>
33-
<tt>
34-
id|wkt<br/>
35-
1|POINT(172.0702250 -43.6031036)<br/>
36-
2|POINT(172.0702250 -43.6031036)<br/>
37-
3|POINT(172.1543206 -43.5731302)<br/>
38-
4|POINT(171.9282585 -43.5493308)<br/>
39-
5|POINT(171.8827359 -43.5875983)<br/>
40-
</tt>
41-
<a name="notes">
42-
<h4>Notes</h4>
43-
</a>
44-
<ol>
45-
<li>The example text file:</li>
130+
<h4><a name="wkt">How WKT text is interpreted</a></h4>
131+
<p>
132+
The delimited text layer recognizes the following
133+
<a href="http://en.wikipedia.org/wiki/Well-known_text">well known text</a> types -
134+
POINT, MULTIPOINT, LINESTRING, MULTILINESTRING, POLYGON, and MULTIPOLYGON. It will accept geometries with
135+
a Z coordinate (eg &quot;POINT&nbsp;Z&quot;), a measure (&quot;POINT&nbsp;M&quot;), or both (&quot;POINT&nbsp;ZM&quot;).
136+
</p>
137+
<p>
138+
It can also handle the PostGIS EWKT variation, in which the geometry is preceded by a spatial reference
139+
system id (eg &quot;SRID=4326;POINT(175.3&nbsp;41.2)&quot;), and a variant used by Informix in which the WKT is
140+
preceded by an integer spatial reference id (eg &quot;1 POINT(175.3&nbsp;41.2)&quot;).
141+
In both cases the SRID is ignored.
142+
</p>
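A minimal Python sketch of ignoring the spatial reference prefix might look like this. The function name `strip_srid` and the regular expression are assumptions made for illustration; they are not taken from the QGIS source.

```python
import re

# Matches either an EWKT prefix such as "SRID=4326;" or a leading integer id
# followed by whitespace, as in the Informix variant.
_SRID_PREFIX = re.compile(r"^\s*(?:SRID=\d+;|\d+\s+)", re.IGNORECASE)


def strip_srid(wkt):
    """Return the WKT text with any leading spatial reference id removed."""
    return _SRID_PREFIX.sub("", wkt).strip()


print(strip_srid("SRID=4326;POINT(175.3 41.2)"))
print(strip_srid("1 POINT(175.3 41.2)"))
print(strip_srid("POINT(175.3 41.2)"))
```

All three calls yield the same plain WKT string, since the prefix carries no information that is used.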
143+
<h4><a name="example">Example of a text file with X,Y point coordinates</a></h4>
144+
<pre>
145+
X;Y;ELEV
146+
-300120;7689960;13
147+
-654360;7562040;52
148+
1640;7512840;3
149+
</pre>
150+
<p>This file:</p>
46151
<ul>
47152
<li> Uses <b>;</b> as delimiter. Any character can be used to delimit the fields.</li>
48-
<li>The first row is the header row. It contains the fields X, Y and ELEV.</li>
153+
<li>The first row is the header row. It contains the field names X, Y and ELEV.</li>
49154
<li>No quotes (") are used to delimit text fields.</li>
50155
<li>The x coordinates are contained in the X field.</li>
51156
<li>The y coordinates are contained in the Y field.</li>
52157
</ul>
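For readers who want to check such a file outside QGIS, the following Python sketch reads the example with the standard `csv` module. The helper name `read_points` is hypothetical, and the sketch keeps the remaining fields as text rather than guessing their types.

```python
import csv
import io

SAMPLE = """X;Y;ELEV
-300120;7689960;13
-654360;7562040;52
1640;7512840;3
"""


def read_points(text, x_field="X", y_field="Y", delimiter=";"):
    """Return (x, y, attributes) tuples, using the first row as field names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    points = []
    for row in reader:
        attributes = {k: v for k, v in row.items() if k not in (x_field, y_field)}
        points.append((float(row[x_field]), float(row[y_field]), attributes))
    return points


for x, y, attributes in read_points(SAMPLE):
    print(x, y, attributes)
```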
53-
<li>The example text file with WKT:</li>
158+
<h4><a name="wkt_example">Example of a text file with WKT geometries</a></h4>
159+
<pre>
160+
id|wkt
161+
1|POINT(172.0702250 -43.6031036)
162+
2|POINT(172.0702250 -43.6031036)
163+
3|POINT(172.1543206 -43.5731302)
164+
4|POINT(171.9282585 -43.5493308)
165+
5|POINT(171.8827359 -43.5875983)
166+
</pre>
167+
<p>This file:</p>
54168
<ul>
55169
<li>Has two fields defined in the header row: id and wkt.
56170
<li>Uses <b>|</b> as a delimiter.</li>
57171
<li>Specifies each point using the WKT notation
58172
</ul>
59-
</ol>

‎src/ui/qgsdelimitedtextsourceselectbase.ui

Lines changed: 710 additions & 610 deletions
Large diffs are not rendered by default.
