Bug report #8385

QGIS does not deal with shapefiles with two fields with the same name

Added by Pedro Venâncio about 6 years ago. Updated almost 6 years ago.

Status:Closed
Priority:Normal
Assignee:-
Category:Data Provider/OGR
Affected QGIS version:master
Regression?:No
Operating System:
Easy fix?:No
Pull Request or Patch supplied:No
Resolution:
Crashes QGIS or corrupts data:No
Copied to github as #:17158

Description

To reproduce, try to delete one column from the attached dataset. On saving, depending on which column you try to delete, one of these errors is shown:

Could not commit changes to layer vegetacao

Errors: SUCCESS: 1 attribute(s) deleted.
  ERROR: the count of fields is incorrect after addition/removal of fields!
Could not commit changes to layer vegetacao

Errors: SUCCESS: 1 attribute(s) deleted.
  ERROR: field with index xx is not the same!

Another test: in the attribute table, start editing, open the Field Calculator, select one of the fields [CLASSE_ELE, NOME_CELUL, DESCRICAO, INFO_LINK] under "Fields and Values" and click "Load all unique values" (see the attached image). The values shown do not match the contents of the attribute table.

The problem is in the shapefile, specifically in its attribute table, which has two columns with the same name (Z83_PONTOS); QGIS master gets mixed up by this. Earlier versions (1.8, 1.7.4) handled the situation and showed both columns, so this seems to be a regression.
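
For anyone who wants to check their own data for the same condition, here is a minimal sketch in plain Python. It only looks at a list of column names; the helper name `duplicate_field_names` is hypothetical, and the order of the fields in the example list is an assumption (only the names themselves come from this report).

```python
from collections import Counter


def duplicate_field_names(names):
    """Return the names that occur more than once, in order of first appearance."""
    counts = Counter(names)  # Counter keeps first-insertion order of its keys
    return [name for name, count in counts.items() if count > 1]


# Column names taken from this report; their order here is assumed.
fields = ["CLASSE_ELE", "NOME_CELUL", "DESCRICAO", "INFO_LINK",
          "Z83_PONTOS", "Z83_PONTOS"]
print(duplicate_field_names(fields))  # prints ['Z83_PONTOS']
```

How the names are read from the file (OGR, a DBF reader, etc.) is left out on purpose; the check itself does not depend on it.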

I attached a set of images that illustrate this situation.

Thanks!

attribute_table_calc.png (33.7 KB) Pedro Venâncio, 2013-07-30 04:59 AM

dataset.zip (958 Bytes) Pedro Venâncio, 2013-07-30 04:59 AM

field_calculator_master.png (144 KB) Pedro Venâncio, 2013-07-30 04:59 AM

field_calculator_1.7.4.png (46.6 KB) Pedro Venâncio, 2013-07-30 04:59 AM

fields_attribute_table_master.png (137 KB) Pedro Venâncio, 2013-07-30 04:59 AM

fields_attribute_table_1.7.4.png (59.1 KB) Pedro Venâncio, 2013-07-30 04:59 AM

corrupted_record.png (23.6 KB) Pedro Venâncio, 2013-07-30 03:09 PM

Associated revisions

Revision 54140310
Added by Matthias Kuhn about 6 years ago

[ogr] Rename columns with non-unique name (Fix #8385)

History

#1 Updated by Matthias Kuhn about 6 years ago

  • Assignee deleted (Matthias Kuhn)

#2 Updated by Matthias Kuhn about 6 years ago

QgsFields does not accept more than one field with the same name.
  • If this is changed, what behavior should be expected of e.g. feat['attributeA']? Should it return a random one, or the first, of the two attributes?
  • How many data providers even support this? Is this a shapefile-only problem?
  • Appending '_{nr}' to the field name should be considered instead (e.g. 'attributeA', 'attributeA_1' etc.)
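
The '_{nr}' idea from the last point could look roughly like the sketch below. This is plain Python operating on a list of names, not the actual provider code; the function name `make_unique_field_names` is hypothetical, and the choice to skip suffixes that collide with an existing column name is an assumption, not something decided in this ticket.

```python
def make_unique_field_names(names):
    """Return a new list where repeated names get a '_{nr}' suffix.

    The first occurrence keeps its name; later ones become name_1, name_2, ...
    A suffixed candidate is skipped if it is already in use or if it is
    itself one of the original column names.
    """
    original = set(names)
    used = set()
    result = []
    for name in names:
        candidate = name
        nr = 0
        # Bump the suffix until the candidate clashes with nothing.
        while candidate in used or (nr > 0 and candidate in original):
            nr += 1
            candidate = f"{name}_{nr}"
        used.add(candidate)
        result.append(candidate)
    return result


print(make_unique_field_names(["attributeA", "attributeA"]))
# prints ['attributeA', 'attributeA_1']
```

With this scheme feat['attributeA'] stays unambiguous: it always refers to the first of the two columns, and the second is reachable under its own name.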

#3 Updated by Nathan Woodrow about 6 years ago

I'm not in favour of adding any workaround in the code to handle fields with the same name. I would just say we don't support it anymore and leave it at that. In most (all) cases it doesn't make sense to have two columns with the same name, and it would make working with the data hard.

#4 Updated by Matthias Kuhn about 6 years ago

But the current behavior is bad: opening a layer and silently half-dropping a column because it doesn't meet some silent QGIS standard.

I'd say we declare this a standard and defer handling to the provider level. So the Shapefile (and CSV and whatever) providers have to take care of it and provide a wrapper level or ask the user to rename the column...

#5 Updated by Nathan Woodrow about 6 years ago

Handling it at the provider level should be fine. I think it would only be the ogr provider because most of the database types don't support columns with the same name.

#6 Updated by Pedro Venâncio about 6 years ago

I don't know if this involves a lot of change in the code, but this would be, in my opinion, the best solution.

  • Appending '_{nr}' to the field name should be considered instead (e.g. 'attributeA', 'attributeA_1' etc.)

If you look at the attached image, LibreOffice also makes this change.

#7 Updated by Pedro Venâncio about 6 years ago

The problem is definitely in the data, but this behavior

But the current behavior is bad: opening a layer and silently half-dropping a column because it doesn't meet some silent QGIS standard.

is not good. It took me some time to realize what the source of the problem was when I tried to delete a column from the attached dataset.

#8 Updated by Matthias Kuhn about 6 years ago

  • Status changed from Open to Closed

#9 Updated by Pedro Venâncio about 6 years ago

Thank you very much for the quick fix Matthias!

However, it seems that there are still some problems.

1) When trying to delete the first of the columns with the repeated name, I get this on saving:

Could not commit changes to layer vegetacao

Errors: SUCCESS: 1 attribute(s) deleted.
  ERROR: field with index 2 is not the same!

2) Deleting any of the other columns generates no error but corrupts the last record of the attribute table (see the image).

3) In the Field Calculator, if I "Load all unique values" for the first of the repeated columns, nothing is shown. For the remaining fields everything now works as expected, except for the second column with the repeated name, where "Load" shows the values, but on the terminal I get "ERROR 1: Unrecognised Z83_PONTOS_1 field name in ORDER BY.".

#10 Updated by Matthias Kuhn about 6 years ago

  • Category changed from Vectors to Data Provider/OGR

#11 Updated by Jürgen Fischer about 6 years ago

  • Priority changed from Severe/Regression to Normal

I suppose the fieldname should be unique. As such it's an edge case and shouldn't be considered a blocker.

#12 Updated by Pedro Venâncio about 6 years ago

Hi Jürgen,

I suppose the fieldname should be unique. As such it's an edge case and shouldn't be considered a blocker.

I agree, but the most serious problem here is that the data may be corrupted.

#13 Updated by Giovanni Manghi about 6 years ago

Jürgen Fischer wrote:

I suppose the fieldname should be unique. As such it's an edge case and shouldn't be considered a blocker.

While I agree that it is an edge case, I guess the point here was that previous QGIS releases handled such cases better. Other software also handles this case. I would not mind if QGIS does not, but then I guess such vectors should be rejected on load.

#14 Updated by Jürgen Fischer about 6 years ago

Giovanni Manghi wrote:

While I agree that it is an edge case, I guess the point here was that previous QGIS releases handled such cases better. Other software also handles this case. I would not mind if QGIS does not, but then I guess such vectors should be rejected on load.

Does that mean you still consider this a very important issue? .oO(apparently I'm still not convinced that any regression, however minor and "edgecasey" it might be, should be considered a blocker)

#15 Updated by Giovanni Manghi about 6 years ago

Jürgen Fischer wrote:

Giovanni Manghi wrote:

While I agree that it is an edge case, I guess the point here was that previous QGIS releases handled such cases better. Other software also handles this case. I would not mind if QGIS does not, but then I guess such vectors should be rejected on load.

Does that mean you still consider this a very important issue?

not for me

#16 Updated by Giovanni Manghi about 6 years ago

Jürgen Fischer wrote:

apparently I'm still not convinced that any regression, however minor and "edgecasey" it might be, should be considered a blocker

You are probably right. My point of view is that if/when possible (time, money, etc.) all regressions should be squashed; if that is not possible, minor/edge cases can be left open and a "known regressions" list added to the release notes.

#17 Updated by Matthias Kuhn almost 6 years ago

  • Status changed from Reopened to Closed
