Bug report #17150
Processing: output for standard NxT distance matrix now wrong
Status: | Closed | ||
---|---|---|---|
Priority: | High | ||
Assignee: | Victor Olaya | ||
Category: | Processing/QGIS | ||
Affected QGIS version: | master | Regression?: | Yes |
Operating System: | Easy fix?: | No | |
Pull Request or Patch supplied: | No | Resolution: | |
Crashes QGIS or corrupts data: | No | Copied to github as #: | 25049 |
Description
Up 2.14 (with the ftools tool) the result used to be correct (despite non handling well the accented letters in "unique ID fields"):
The first column had the "unique ID field" for the first input and the first row the "unique ID field" for the second input.
In 2.18 (Processing) the first column seems ok, but the first row as only DIST_1, DIST_2, DIST_3... that make the result useless.
In master the result is as in 2.18 but as additional issue a vector is outputted instead of a table.
Associated revisions
[processing] fix Standard (N x T) distance matrix (ref #17150)
Merge pull request #5312 from alexbruy/distance-matrix
[processing] fix Standard (N x T) distance matrix (ref #17150)
[processing] keep column names in standard distance matrix (fix #17150)
Merge pull request #5427 from alexbruy/processing-distance
[processing] keep column names in standard distance matrix (fix #17150)
History
#1 Updated by Nyall Dawson about 7 years ago
I agree that the loss of information in the column names is a regression.
In master the result is as in 2.18 but as additional issue a vector is outputted instead of a table.
Why is this an issue? This change at least was intentional - since the geometry can always be ignored/discarded/output saved as csv or other non-geometric format now. But if we don't keep the original geometries, then you are FORCED to rejoin the table back to the original to get these back again. To me it's a big feature that it now maintains these and just adds extra columns to the data.
#2 Updated by Giovanni Manghi about 7 years ago
Nyall Dawson wrote:
I agree that the loss of information in the column names is a regression.
In master the result is as in 2.18 but as additional issue a vector is outputted instead of a table.
Why is this an issue? This change at least was intentional - since the geometry can always be ignored/discarded/output saved as csv or other non-geometric format now. But if we don't keep the original geometries, then you are FORCED to rejoin the table back to the original to get these back again. To me it's a big feature that it now maintains these and just adds extra columns to the data.
Hi Nyall,
I see your point, thanks.
The problem I see in this approach is the following:
as column names we are placing DIST_1, DIST_2, DIST_3... instead of the "unique ID field" of one of the inputs. I suspect that if we fix it and output a shape (as Processing does) then because of the 10 char limit a number of those "unique ID field" will be truncated with the very real possibility of having a few columns named the same way. If you output as CSV there is no such problem to take into account.
#3 Updated by Nyall Dawson about 7 years ago
Processing in 3.0 uses a memory layer as intermediate steps in models, add running individual algs allows choice of output format. So I don't think this will be a restriction.
Otherwise we just flip the name to 1_dist, 2_dist, etc
#4 Updated by Håvard Tveite almost 7 years ago
I also noticed this regression a few days ago. The result of the "Standard (N x T)" version not only has new (useless) column names, each row is also ordered / sorted on increasing distance, meaning that it is impossible to find out from the result matrix which pair of points a value corresponds to!
#5 Updated by Giovanni Manghi almost 7 years ago
Håvard Tveite wrote:
I also noticed this regression a few days ago. The result of the "Standard (N x T)" version not only has new (useless) column names, each row is also ordered / sorted on increasing distance, meaning that it is impossible to find out from the result matrix which pair of points a value corresponds to!
agree.
#6 Updated by Nyall Dawson almost 7 years ago
Giovanni - I've been trying to track down when this change was made to see what the rationale behind it was. I think it may have been an issue with the processing distance matrix since forever, and it's only since the other python distance matrix tool was replaced with the processing version that this issue became known. Does that sound correct to you?
#7 Updated by Giovanni Manghi almost 7 years ago
Nyall Dawson wrote:
Giovanni - I've been trying to track down when this change was made to see what the rationale behind it was. I think it may have been an issue with the processing distance matrix since forever, and it's only since the other python distance matrix tool was replaced with the processing version that this issue became known. Does that sound correct to you?
Hi Nyall,
I think you are right: the tool in "ftools" worked as expected (despite very slow for large datasets). Cheers!
#8 Updated by Alexander Bruy almost 7 years ago
Fixed in 2.18
#9 Updated by Giovanni Manghi almost 7 years ago
- Affected QGIS version changed from 2.18.12 to master
#10 Updated by Alexander Bruy almost 7 years ago
- % Done changed from 0 to 100
- Status changed from Open to Closed
Applied in changeset qgis|c64f03d0a43822c091e3f60bd4b8b3538d000c1e.