Bug report #11007
Deleted/edited features within SHAPEFILE are still recognized in other software packages
| Status: | Closed | | |
|---|---|---|---|
| Priority: | Severe/Regression | | |
| Assignee: | Marco Hugentobler | | |
| Category: | Data Provider/OGR | | |
| Affected QGIS version: | 2.8.5 | Regression?: | No |
| Operating System: | | Easy fix?: | No |
| Pull Request or Patch supplied: | No | Resolution: | |
| Crashes QGIS or corrupts data: | Yes | Copied to github as #: | 19349 |
Description
SAGA modules ignore the "delete flag" in a shapefile and use the deleted features in their processes.
To replicate the issue:
1- Open a shapefile
2- Delete a feature
3- Save the shapefile
4- Run a SAGA process on the shapefile
5- Result of the process takes into account the deleted feature, despite the fact that it has been deleted and saved.
Related issues
Associated revisions
Repack() shapefiles on unload whenever they have been modified
Previous preconditions that would only repack them when features have been
deleted seem to not have covered everything.
Fix #11007
Repack shapefiles when saving after deleting features
- QgsVectorDataProvider::dataChanged() will be emitted
- QgsVectorLayer::dataChanged() will be emitted
- Clears QgsVectorLayerCache
- Reloads the attribute table
- Clears the selection
Looking forward to people complaining about their lost selection...
Fix #10560
Fix #11989
Refs #8317
Refs #8822
Refs #10483
Refs #11007
Refs #7540
Refs #11398
Refs #11296
History
#1 Updated by Giovanni Manghi over 10 years ago
- Status changed from Open to Feedback
Hi Saber, is this a SAGA issue or a QGIS one? What happens if you do the same using native SAGA instead of Processing?
#2 Updated by Saber Razmjooei over 10 years ago
Gio, I suppose it is a problem with SAGA. But it would be good to have it fixed upstream or worked around in QGIS. Otherwise, it might confuse users.
#3 Updated by Victor Olaya over 10 years ago
This is really weird. If you have modified the shapefile and deleted data...how can SAGA know that there was a feature in there that now is missing?? If you have changed the shapefile, there is no way the deleted feature can be recovered.
Maybe you are saving to a different file and then using the original to call SAGA?
#4 Updated by Saber Razmjooei over 10 years ago
Victor,
I guess in QGIS, when you delete a feature in a shapefile, it does not physically delete that feature. It flags it as deleted but it is still there.
The shapefile gets "cleaned" when you save as...
#5 Updated by Giovanni Manghi over 10 years ago
Saber Razmjooei wrote:
Victor,
I guess in QGIS, when you delete a feature in a shapefile, it does not physically delete that feature. It flags it as deleted but it is still there.
The shapefile gets "cleaned" when you save as...
Hi Saber, this does not seem to be the case. Delete a feature, save edits and load the shape in another QGIS window (or any other GIS package): the feature is not there.
#6 Updated by Saber Razmjooei over 10 years ago
Sorry Gio, that's correct. The bug only happens in active QGIS window, where editing took place.
#7 Updated by Victor Olaya about 10 years ago
Saber
Can you confirm the QGIS version you are using? I am finding that behaviour (deleted features that are actually not deleted in the shapefile, even after closing the edit session), not just in Processing, but in QGIS in general. Taking the modified layer to a different software (as Processing does when passing the layer to SAGA) shows the deleted features. But I see this issue only in 2.4, not in 2.2. If this is the case, then it's a QGIS issue and we should open another ticket.
#8 Updated by Victor Olaya about 10 years ago
It seems DBF files have a deleted flag (http://www.clicketyclick.dk/databases/xbase/format/dbf.html#DBF_NOTE_9_TARGET). SAGA might not recognise it, and probably the way QGIS was removing features before was actually eliminating them, while now it sets that flag to deleted. That would explain the error.
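As a minimal sketch of the flag described above: in the dBASE layout documented at that link, each record starts with a one-byte deletion flag (a space for an active record, an asterisk for a deleted one). The helper below, `count_dbf_records` (a hypothetical name, not part of QGIS, OGR or SAGA), counts how many records a .dbf physically contains and how many of them carry the deleted flag. A reader that ignores the flag would report the first number; a reader that honours it would report the difference.

```python
import struct
from typing import Tuple


def count_dbf_records(data: bytes) -> Tuple[int, int]:
    """Return (total_records, flagged_deleted) for the raw bytes of a .dbf file."""
    # Header fields (little-endian): record count as uint32 at offset 4,
    # header length as uint16 at offset 8, record length as uint16 at offset 10.
    num_records, header_len, record_len = struct.unpack_from("<IHH", data, 4)
    deleted = 0
    for i in range(num_records):
        # The first byte of every record is the deletion flag.
        flag = data[header_len + i * record_len]
        if flag == 0x2A:  # '*' = deleted, 0x20 (space) = active
            deleted += 1
    return num_records, deleted
```

It could be called as `count_dbf_records(open("layer.dbf", "rb").read())`, where `layer.dbf` is a placeholder path.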
#9 Updated by Saber Razmjooei about 10 years ago
Victor, it is in 2.4. As you said the problem is not limited only to the Processing. The deleted features appear in Vector > Geoprocessing results too.
#10 Updated by Giovanni Manghi about 10 years ago
- Affected QGIS version set to 2.4.0
- Project changed from 78 to QGIS Application
- Category deleted (56)
- Crashes QGIS or corrupts data set to No
#11 Updated by Giovanni Manghi about 10 years ago
- Category set to Processing/SAGA
#12 Updated by Giovanni Manghi about 10 years ago
see also #11296
#13 Updated by Giovanni Manghi about 10 years ago
- Crashes QGIS or corrupts data changed from No to Yes
- Subject changed from SAGA does not recognise deleted features within SHAPEFILE to Deleted/edited features within SHAPEFILE are still recognized in other software packages
- Category changed from Processing/SAGA to Digitising
- Status changed from Feedback to Open
- Assignee deleted (Victor Olaya)
- Priority changed from Normal to High
- Affected QGIS version changed from 2.4.0 to master
Victor Olaya wrote:
But I see this issue only in 2.4, not in 2.2. If this is the case, then it's a QGIS issue and we should open another ticket.
It also affects 2.2 and previous QGIS releases (and master).
#14 Updated by Giovanni Manghi about 10 years ago
Saber Razmjooei wrote:
Victor, it is in 2.4. As you said the problem is not limited only to the Processing. The deleted features appear in Vector > Geoprocessing results too.
Hi Saber, while I can confirm that it also affects other software packages, I cannot confirm that it affects QGIS's own geoprocessing tools.
See also #11398
#15 Updated by Saber Razmjooei about 10 years ago
Gio,
My bad, you are right. It works even in edit mode, when your changes still have not been saved!
#16 Updated by Giovanni Manghi about 10 years ago
- Status changed from Open to Feedback
In #11398 there are examples of software packages affected by this issue; anyway, SAGA is also affected, and it is easy to test because it comes with QGIS.
A few notes:
--------------------
a) after deleting a feature using the shape as input for SAGA gives
error: DBase file could not be opened.
removing and re-adding the shape and running it against SAGA again works without errors, but the operation (ex: buffer) also takes into account the features that should have been deleted
--------------------
b) using the node tool to edit a feature and using the shape as input in SAGA gives
error: DBase file could not be opened.
removing and re-adding the shape and running it against SAGA again works without errors and as expected
--------------------
c) using the reshape tool to edit a feature and using the shape as input in SAGA gives
error: DBase file could not be opened.
removing and re-adding the shape and running it against SAGA again works without errors, but the operation (ex: buffer) result is as if it were run against the input before it was edited in QGIS.
--------------------
d) using the split features tool to edit a feature and using the shape as input in SAGA gives
error: corrupted shapefile.
removing and re-adding the shape does not help.
It is not strictly a regression, but given the huge interoperability problems that this issue creates, I would like to ask that it be raised to blocker.
#17 Updated by Giovanni Manghi about 10 years ago
I made some tests with ArcGIS 10:
--------------------
a) delete a feature, save. Add the shape into arcgis, the deleted feature is there (both geometry and attribute).
so this confirms (again) #11296
Using the shape as input in arcgis (ex: buffer) then the "phantom" feature is removed from input and output is as expected.
--------------------
doing operations as in b) c) d) does not seem to have any bad effect when adding the shape in ArcGIS
#18 Updated by Giovanni Manghi about 10 years ago
It is very hard to find a clear pattern:
there are shapes that, when edited, always cause SAGA (again, we are using SAGA in the first place because it is easily available and works together with QGIS) to also use the edited/deleted features or to throw an error (error: corrupted shapefile).
Other software packages also seem affected in cases where SAGA is not.
The only thing that seems to work almost all the time is to re-save the shapefile as a copy.
#19 Updated by Jukka Rahkonen about 10 years ago
- File shp_oj1.png added
- File shp_qgis.png added
- File shp_oj2.png added
I was reading this stackexchange question http://gis.stackexchange.com/questions/118689/cant-transform-lines-to-polygons with a sample data available for a few days now at http://dropcanvas.com/7i4oq.
When the shapefile is opened with QGIS it seems to have 12 linestrings (shp_qgis.png). However, when the same shapefile is opened with OpenJUMP it contains 85 lines which are different (shp_oj1.png). Ogrinfo does also see 85 linestrings but after running ogr2ogr from shape to shape, the new shapefile is clean and has 12 linestrings (shp_oj2.png). See attached images. Notice that for another software there are some extra features in the shapefile but also some other, newly added with QGIS, which are missing. I would suggest to raise the priority level from high because of corrupted data.
Ogrinfo summary:
ogrinfo TesteLayer.shp -al -so
INFO: Open of `TesteLayer.shp'
using driver `ESRI Shapefile' successful.
Layer name: TesteLayer
Geometry: Line String
Feature Count: 85
Extent: (-54831.944803, 144077.515208) - (-20131.746100, 200497.422392)
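Since the thread repeatedly compares the feature count reported by ogrinfo with what other programs display, a small helper for pulling that number out of the summary text can be handy. This is only a sketch: `feature_count` is a hypothetical name, and it assumes the summary contains a `Feature Count:` line formatted as in the output above.

```python
import re


def feature_count(ogrinfo_summary: str) -> int:
    """Extract the value of the 'Feature Count:' line from `ogrinfo -al -so` output."""
    match = re.search(r"^Feature Count:\s*(\d+)\s*$", ogrinfo_summary, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Feature Count' line found")
    return int(match.group(1))
```

Running it on the output before and after a re-save with ogr2ogr would make the 85 versus 12 discrepancy described in this comment easy to check in a script.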
#20 Updated by Giovanni Manghi about 10 years ago
Jukka Rahkonen wrote:
I was reading this stackexchange question http://gis.stackexchange.com/questions/118689/cant-transform-lines-to-polygons with a sample data available for a few days now at http://dropcanvas.com/7i4oq.
When the shapefile is opened with QGIS it seems to have 12 linestrings (shp_qgis.png). However, when the same shapefile is opened with OpenJUMP it contains 85 lines which are different (shp_oj1.png). Ogrinfo does also see 85 linestrings but after running ogr2ogr from shape to shape, the new shapefile is clean and has 12 linestrings (shp_oj2.png). See attached images. Notice that for another software there are some extra features in the shapefile but also some other, newly added with QGIS, which are missing. I would suggest to raise the priority level from high because of corrupted data.
it is indeed a strange situation:
another software, gvsig, reads the shape in a different way from qgis and openjump. Re-saving the shape with this sw the first returns a shape with 85 features
and the other with 93 (as shown by ogr and qgis).
Any software shows the same data when re-saving the shapefile with qgis or ogr2ogr (12 features).
My guess is that the shape was edited (features deleted) with QGIS, which we now know leaves the shape in an inconsistent state that gives unexpected results in other software. If the shape is re-saved after the edits, apparently something is used (a parameter in ogr2ogr?) and the vector gets a consistent state.
#21 Updated by Matthias Kuhn about 10 years ago
QGIS calls repack() which is supposed to clean the dbf whenever removing/unloading a layer from which features have been removed in the running QGIS session. You can test this behavior by deleting a feature from the test dataset mentioned above and then unloading the layer. The file size of the dbf decreases by a couple of kB.
Calling repack() in a running QGIS session is dangerous because it changes feature ids. And QGIS (and its plugins) should be able to rely on feature ids being unchanged for the lifetime of a layer.
- The easiest solution would be to rewrite the file to another location before sending it to SAGA.
- Or raising a feature request for SAGA to support the dbf flag for deleted features (is this an official feature of this file format?)
- Create an OGR feature request, that repack() can be called without changing feature ids (e.g. it could remap from old feature ids to new feature ids internally. We can't do this on our side, the repack() process is transparent to QGIS I think).
- Let SAGA use a QGIS feature iterator :)
- Decide that feature ids can change in a running session (and internally send a signal that any cached information needs to be discarded)
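The first option in the list above (rewriting the file to another location before handing it to SAGA) matches what earlier comments observed: a shape-to-shape run of ogr2ogr produces a clean copy. A minimal sketch of that workaround follows, assuming the `ogr2ogr` executable is on the PATH; the function names `resave_command` and `resave` are hypothetical and not part of Processing.

```python
import subprocess
from typing import List


def resave_command(src: str, dst: str) -> List[str]:
    """Build an ogr2ogr call that rewrites src into a fresh shapefile at dst."""
    # ogr2ogr expects the destination first, then the source.
    return ["ogr2ogr", "-f", "ESRI Shapefile", dst, src]


def resave(src: str, dst: str) -> None:
    """Run the rewrite; raises CalledProcessError if ogr2ogr fails."""
    subprocess.run(resave_command(src, dst), check=True)
```

The external tool would then be pointed at `dst` instead of the layer that is still open in QGIS, at the cost of an extra copy for every run.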
#22 Updated by Giovanni Manghi about 10 years ago
Matthias Kuhn wrote:
QGIS calls repack() which is supposed to clean the dbf whenever removing/unloading a layer from which features have been removed in the running QGIS session. You can test this behavior by deleting a feature from the test dataset mentioned above and then unloading the layer. The file size of the dbf decreases by a couple of kB.
Calling repack() in a running QGIS session is dangerous because it changes feature ids. And QGIS (and its plugins) should be able to rely on feature ids being unchanged for the lifetime of a layer.
- The easiest solution would be to rewrite the file to another location before sending it to SAGA.
- Or raising a feature request for SAGA to support the dbf flag for deleted features (is this an official feature of this file format?)
- Create an OGR feature request, that repack() can be called without changing feature ids (e.g. it could remap from old feature ids to new feature ids internally. We can't do this on our side, the repack() process is transparent to QGIS I think).
- Let SAGA use a QGIS feature iterator :)
- Decide that feature ids can change in a running session (and internally send a signal that any cached information needs to be discarded)
Hi Matthias, it is not (just) a SAGA issue, that would be the last of our problems.
The issue is also with other very popular gis packages.
Add a shape in qgis, edit it (delete,reshape,split,etc.), save edits. Remove the shape (or not) and open it in such software, the "phantom" features are there...
Re-save the shape with another name in QGIS and everything is OK in other software too. Asking users to re-save shapes before exchanging them seems a bit... strange and impractical, to say the least.
cheers!
#23 Updated by Matthias Kuhn about 10 years ago
If you remove features in QGIS, close the project and open the shapefile in other software, the file should be clean. Is it not?
Can you confirm that QGIS cleans the sample data from this report with the steps I outlined above:
- delete a feature, save
- unload layer
#24 Updated by Giovanni Manghi about 10 years ago
Matthias Kuhn wrote:
If you remove features in QGIS, close the project and open the shapefile in other software, the file should be clean. Is it not?
Can you confirm that QGIS cleans the sample data from this report with the steps I outlined above:
- delete a feature, save
- unload layer
cannot confirm, unloading the layer from project and/or closing qgis does not clean the shape.
#25 Updated by Giovanni Manghi about 10 years ago
cannot confirm, unloading the layer from project and/or closing qgis does not clean the shape.
re-saving it does.
#26 Updated by Giovanni Manghi about 10 years ago
- File screenshot3.png added
Matthias Kuhn wrote:
If you remove features in QGIS, close the project and open the shapefile in other software, the file should be clean. Is it not?
Can you confirm that QGIS cleans the sample data from this report with the steps I outlined above:
- delete a feature, save
- unload layer
On my PC I can only test open source programs, but what you can see in the attached image also happens with closed source ones. The shape was reshaped in QGIS and saved, then removed from the project; QGIS was closed and the shape opened in OpenJUMP.
#27 Updated by Matthias Kuhn about 10 years ago
Cannot reproduce this here in this case.
Just to be sure: you deleted a feature first, then you saved, then you removed the layer from the legend, correct (i.e. the deleting part is important, just because you didn't mention it in your comment)?
After doing this here:
INFO: Open of `/tmp/orbit-kk/TesteLayer.shp'
using driver `ESRI Shapefile' successful.
Layer name: TesteLayer
Geometry: Line String
Feature Count: 11
Extent: (-43350.686045, 162465.264117) - (-31426.039433, 174737.909080)
Layer SRS WKT:
PROJCS["ETRS89_Portugal_TM06",
GEOGCS["GCS_ETRS_1989",
DATUM["European_Terrestrial_Reference_System_1989",
SPHEROID["GRS_1980",6378137,298.257222101]],
PRIMEM["Greenwich",0],
UNIT["Degree",0.017453292519943295]],
PROJECTION["Transverse_Mercator"],
PARAMETER["latitude_of_origin",39.66825833333333],
PARAMETER["central_meridian",-8.133108333333334],
PARAMETER["scale_factor",1],
PARAMETER["false_easting",0],
PARAMETER["false_northing",0],
UNIT["Meter",1]]
LAYER: String (32.0)
COLOR: Integer (6.0)
ID: Integer (5.0)
#28 Updated by Giovanni Manghi about 10 years ago
Matthias Kuhn wrote:
Cannot reproduce this here in this case.
Just to be sure: you deleted a feature first, then you saved, then you removed the layer from the legend, correct (i.e. the deleting part is important, just because you didn't mention it in your comment)?
- open shape in qgis
- edit shape (delete feature, reshape, split and probably other tools except for the node tool, which does not seem (underline "seem") to create issues)
- save shape
- remove shape from project
- close qgis
- add shape in another software
notes from the many comments (also in duplicate tickets)
- not all input shapes seem to be affected in the same way. This one
https://issues.qgis.org/attachments/7917/Test501_before_editing.zip
is a good example to replicate the issues.
- not all other GIS software shows the issues in the same way. For example, loading such "dirty" shapes in ArcGIS causes the program to also show the incorrect/deleted features, but if the vector is then used for a geoprocessing operation the program seems to clean the shape beforehand. SAGA and many other packages show the incorrect/deleted features on the canvas, and the same features are treated as valid when the vector is used for geoprocessing. Messages that the shapefile is corrupted, that the dbf is corrupted, or that the dbf has the wrong number of records are also common.
#29 Updated by Matthias Kuhn about 10 years ago
I am using
ogrinfo -al -so
And it always shows the correct feature count. For Test501 it's originally 2 (it's clean in the state when it's downloaded, right?) and subsequently it always reflects the correct number (tried to remove / split / split and remove). I don't know what I can do to try harder... :(
#30 Updated by Saber Razmjooei about 10 years ago
- File edited_vector_buffered_1m.png added
- File sample_data.zip added
- File vector.zip added
Matthias,
Try the attached dataset (sample_data.zip).
In processing toolbox, use Shape Buffer (SAGA module with 1 metre buffer) and you should get something like edited_vector_buffered_1m.png. Despite the fact that there is no feature on the east side, still it uses the deleted features from the original vector (vector.zip)
Tested in QGIS 2.4 and Master under windows 7 (OSGeo4W install)
#31 Updated by Giovanni Manghi about 10 years ago
- File saga2.png added
- File gms.png added
- File gvsig.png added
- File openjump.png added
- File saga1.png added
Saber Razmjooei wrote:
Matthias,
Try the attached dataset (sample_data.zip).
In processing toolbox, use Shape Buffer (SAGA module with 1 metre buffer) and you should get something like edited_vector_buffered_1m.png. Despite the fact that there is no feature on the east side, still it uses the deleted features from the original vector (vector.zip)
Tested in QGIS 2.4 and Master under windows 7 (OSGeo4W install)
good example Saber.
Matthias, the shapefile that Saber attached shows 4 records (and 4 geometries) when you open it in QGIS. But ogrinfo says there are 9 features.
Re-save the shape with QGIS and you'll get a copy that ogrinfo reports as having 4 features.
In a similar way I attach screenshots of 4 GIS programs that all show the vector with 9 features (9 geometries and 9 records in the attribute table).
#32 Updated by Matthias Kuhn about 10 years ago
When I delete a feature in the attached dataset, save the layer and remove it from the project, feature count decreases to 3. I.e. QGIS cleans it up.
Saber, how was this (unclean) dataset produced?
#33 Updated by Matthias Kuhn about 10 years ago
That QGIS produces strange results when an unclean dataset is opened is not necessarily a bug in the software, but in the data.
If QGIS originally produced the unclean data, that is something we should worry about. But for this we need a reproducible way to create an unclean shapefile.
Right now we only repack() on layer unload if a feature has been deleted in the dataset (that means, that most likely the workflow to corrupt a dataset must not include deleting a feature). We could change the preconditions for a repack(). To identify the proper preconditions we need a reproducible way to create an unclean dataset.
We could also integrate a "corrupted dataset detection" when opening a shapefile that prints a warning and allows to repack() before loading the layer. That would not be a bad idea but if QGIS itself produces unclean datasets right now, this does not solve the root cause.
#34 Updated by Saber Razmjooei about 10 years ago
Saber, how was this (unclean) dataset produced?
In QGIS 2.4 Windows 7 (OSGeo4w 64 bit). I will try to see if it is reproducible.
#35 Updated by Jukka Rahkonen about 10 years ago
It is understandable from the QGIS point of view to mark deletions only in the .dbf during the edit session. However, delivering such a shapefile to other users is not safe. GDAL obviously detects the rows that are marked as deleted in the .dbf file correctly and can handle the situation, but some other software does not. For example, OpenJUMP can actually read the geometries from the .shp part even if the .dbf file does not exist at all. OpenJUMP reads shapefiles with a modified GeoTools driver, and I suppose that gvSIG uses GeoTools as well, which makes me think that QGIS can produce shapefiles that do not work as expected in GeoServer.
For making it safe it should be guaranteed that 1) Repacking finally happens and 2) Users can't capture edited but not yet repacked shapefiles and pass them on to be used in other software. I guess that 1) is somehow doable but how to implement 2)?
#36 Updated by Giovanni Manghi about 10 years ago
Hi Matthias,
I just made a pretty basic test:
- opened qgis master
- created from scratch a (polygon) shapefile and added a few features
- saved the edits
- removed the shapefile from project
- added the shapefile to other gis software -> OK
- removed the shapefile from that sw
- added the shapefile to qgis
- edited the shapefile with the reshape tool
- saved edits
- removed the shapefile from qgis
- closed qgis
- added the shapefile to other gis software -> CORRUPTION
#37 Updated by Giovanni Manghi about 10 years ago
Giovanni Manghi wrote:
Hi Matthias,
I just made a pretty basic test:
- opened qgis master
- created from scratch a (polygon) shapefile and added a few features
- saved the edits
- removed the shapefile from project
- added the shapefile to other gis software -> OK
- removed the shapefile from that sw
- added the shapefile to qgis
- edited the shapefile with the reshape tool
- saved edits
- removed the shapefile from qgis
- closed qgis
- added the shapefile to other gis software -> CORRUPTION
not strictly a regression but I really suggest to tag this as blocker and fix/workaround it before 2.6
#38 Updated by Matthias Kuhn about 10 years ago
Jukka, it's actually OGR that marks the entries in .dbf as deleted and not QGIS itself. So it's actually not a QGIS issue but an OGR issue and luckily OGR manages to a large degree to handle its own files except for ogrinfo -al -so. For #1 see the next paragraph. For #2 some kind of file locking would need to take place, I don't know if that can be done and if yes, it should probably be done in OGR and not QGIS.
Giovanni, the reshape tool does only change the geometry and not delete a feature. However this somehow still creates deleted entries in the .dbf (not sure what exactly happens there) but I think we could change the preconditions for repack() from "only when features have been deleted" to "always when a layer has been modified".
#39 Updated by Matthias Kuhn about 10 years ago
- Status changed from Feedback to Closed
Fixed in changeset 05157f89a06dd65565770303c985a6d0d137ea98.
#40 Updated by Giovanni Manghi about 10 years ago
Matthias Kuhn wrote:
Fixed in changeset 05157f89a06dd65565770303c985a6d0d137ea98.
Hi Matthias,
this may have introduced a regression when digitizing (adding) a new feature:
Now, after finishing digitizing a new feature (with the right mouse button), the feature does not show until another action is done, like a pan or a zoom.
let me know if you want me to file a new ticket.
cheers!
#41 Updated by Giovanni Manghi about 10 years ago
- Status changed from Closed to Reopened
- Priority changed from High to Severe/Regression
Unfortunately I can see that the discussed issue is not solved (but again, re-saving the edited shape "solves" the issue) and that apparently the patch also introduced a regression when adding new features (see above comment).
#42 Updated by Matthias Kuhn about 10 years ago
It is very unlikely that the regression you observe and this fix are related. This fix only touches provider code and the feature is not in the provider at the time your regression appears. This is rather introduced by 5e54912565. So, yes, please file a new ticket.
Just to be sure: You did the same test again (on a new shapefile, not an already corrupted one)? I fail to create a corrupted shapefile here, it's pretty hard to isolate the issue like this...
(Tried reshape and subsequent ogrinfo -al -so the Test501.shp from comment #28)
#43 Updated by Giovanni Manghi about 10 years ago
Matthias Kuhn wrote:
It is very unlikely that the regression you observe and this fix are related. This fix only touches provider code and the feature is not in the provider at the time your regression appears. This is rather introduced by 5e54912565. So, yes, please file a new ticket.
sorry Matthias you are right, there were so many commits yesterday that I lost track of changes while compiling to test your patch.
Just to be sure: You did the same test again (on a new shapefile, not an already corrupted one)? I fail to create a corrupted shapefile here, it's pretty hard to isolate the issue like this...
yes, the test started from a non corrupted shape. I will create a screencast for you, it is very easy to replicate.
#44 Updated by Giovanni Manghi about 10 years ago
Just to be sure: You did the same test again (on a new shapefile, not an already corrupted one)? I fail to create a corrupted shapefile here, it's pretty hard to isolate the issue like this...
here
#45 Updated by Giovanni Manghi about 10 years ago
#46 Updated by Matthias Kuhn about 10 years ago
Ok, I can reproduce it now with OpenJUMP. There seem to be two different problems:
- Marking features as deleted in .dbf (seems to be taken care of with repack)
- Not removing features from .shp (seems not to be taken care of with repack)
The command ogrinfo -al -so seems to count .dbf entries and I could therefore not detect the corruption with this command. OpenJUMP is more sensitive to this.
The problem is, that even after calling repack, the superfluous geometries in the .shp do not get removed. I am currently out of ideas why that could be.
#47 Updated by Giovanni Manghi about 10 years ago
Matthias Kuhn wrote:
OpenJUMP is more sensitive to this.
basically any other sw tested shows same or similar behavior
The problem is, that even after calling repack, the superfluous geometries in the .shp do not get removed. I am currently out of ideas why that could be.
:(
#48 Updated by Matthias Kuhn about 10 years ago
Excerpt from #gdal
TL;DR; OpenJUMP ignores the (updated) .shx, the solution will be that repack also triggers when geometries change. That will be a feature for GDAL/OGR and Even Rouault will be taking care of that.
Please consider buying him a beer at the next FOSS4G.
14:53 < EvenR> well, I know what happens
14:54 < EvenR> when you call SetFeature() in the shapefile driver and that the size of the new geometry blob is larger than the previous size, it puts the new geometry at the end of the shapefile
14:54 < EvenR> and update the index in the .shx accordingly
14:54 < EvenR> perhaps OpenJUMP doesn't read the .shx at all and tries to identify the geometries by scanning the .shp directly
14:55 < mkuhn> sounds like a plausible explanation
14:56 < mkuhn> and like something that will be hard to fix from within QGIS
14:56 < EvenR> that's the way the shapefile driver works
14:57 < EvenR> it could be argued that repack() should also trigger in such situation (rewriting of geometries). Or perhaps a "FORCE REPACK" mode
14:58 < EvenR> repack() basically iterates over all (non-deleted) shapes, writes them in a temporary file, and then rename the temporary file over the normal file
14:58 < mkuhn> the first one would certainly be easier from a qgis developer's perspective ;-)
15:00 < EvenR> mkuhn: I've remove the .shx and openjump can still open the shape (with the warning/error message). So theory confirmed regarding OpenJUMP
15:02 < mkuhn> EvenR: Either it can be blamed on OpenJUMP or the shp repacked under such circumstances, correct?
15:04 < EvenR> mkuhn: OpenJUMP should use the .shx when present certainly. Regarding OGR, if you don't delete records, Repack is currently a no-op as documented ( "Deleted shapes are marked for deletion in the .dbf file, and then ignored by OGR. To actually remove them permanently (resulting in renumbering of FIDs) invoke the SQL 'REPACK...)
15:05 < EvenR> but we could basically make REPACK also trigger each time SetFeature() is called and the new geometry size is != old geometry size
15:06 < mkuhn> In terms of interoperability this would probably be the preferred solution
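The mechanism discussed in this log can be illustrated with a rough sketch. In the shapefile layout, the .shp and .shx files each start with a 100-byte header; the .shx then holds one 8-byte entry per feature, while each .shp record has an 8-byte big-endian header (record number, content length in 16-bit words) followed by its content. If a rewritten geometry is appended at the end of the .shp and only the .shx is updated, a reader that walks the .shp sequentially finds more records than the index lists. The function names below are hypothetical, and the sequential scan only mirrors the hypothesis about index-less readers stated above; it is not OpenJUMP's actual code.

```python
import struct

HEADER_LEN = 100  # both .shp and .shx begin with a 100-byte file header


def count_shx_entries(shx: bytes) -> int:
    """Feature count according to the index: one 8-byte entry per feature."""
    return (len(shx) - HEADER_LEN) // 8


def count_shp_records_by_scan(shp: bytes) -> int:
    """Record count found by walking the .shp front to back, ignoring the .shx."""
    pos, count = HEADER_LEN, 0
    while pos + 8 <= len(shp):
        # Record header: record number and content length, big-endian int32;
        # the content length is given in 16-bit words.
        _record_number, content_words = struct.unpack_from(">ii", shp, pos)
        if content_words < 0:
            raise ValueError("negative record length, file looks damaged")
        pos += 8 + 2 * content_words
        count += 1
    return count
```

A scan count larger than the index count would suggest that the .shp still carries orphaned geometry records, which is the situation a repack is meant to clean up.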
#49 Updated by Matthias Kuhn about 10 years ago
- Status changed from Reopened to Closed
- Resolution set to up/downstream
#50 Updated by Even Rouault about 10 years ago
See OGR enhancement: http://trac.osgeo.org/gdal/ticket/5706
#51 Updated by Jukka Rahkonen about 10 years ago
"OpenJUMP should use the .shx when present certainly".
Probably yes, but OpenJUMP is just a "Small Open Source GIS that can". Because it can skip both .shx and .dbf, I have been able to rescue at least the geometries from some corrupted shapefiles, which saved a lot of work.
This ticket was not created because of any trouble with OpenJUMP but SAGA and issues were found also with gvSIG and others. I have not been able to follow if the unpacked .shp was making trouble for SAGA or if it was something else.
#52 Updated by Giovanni Manghi about 10 years ago
Jukka Rahkonen wrote:
I have not been able to follow if the unpacked .shp was making trouble for SAGA or if it was something else.
basically for any software I tested that is able to open a shapefile, open or closed source.
#53 Updated by Saber Razmjooei about 10 years ago
Tested against gdal 2.0.0dev (trunk revision 27897) and qgis master (revision aeb9d93) and it still gives the same result in saga.
os: debian jessie x86
#54 Updated by Giovanni Manghi about 10 years ago
Saber Razmjooei wrote:
Tested against gdal 2.0.0dev (trunk revision 27897) and qgis master (revision aeb9d93) and it still gives the same result in saga.
os: debian jessie x86
Hi Saber, I have also tested by compiling both gdal and qgis trunk/master on a clean VM, and I must say that the issue seems (partially?) fixed to me.
Now after editing and removing the shapefile from the qgis project, the shape shows fine in other gis sw (like openjump) and there are no errors/warnings about the incorrect number of records.
The remaining issue, which at this point I don't know is solvable, is that when the shape is edited and not removed from the QGIS project, it still gives errors. This affects QGIS, for instance when an edited shape needs to be used as input in Processing/SAGA (other affected software may show up in Processing in the future), but also interoperability: it does not seem at all unusual to me for a QGIS user to edit a shape and copy/send it to others before removing it from QGIS or closing the project. To tell the truth, it is a pretty common situation.
#55 Updated by Paolo Cavallini about 10 years ago
Seems a reasonable workflow, Giovanni
#56 Updated by Even Rouault about 10 years ago
The remaining issue, which at this point I don't know is solvable, arises when the shape is edited but not removed from the qgis project: it still gives errors. This affects QGIS, for instance when an edited shape needs to be used as input in Processing/SAGA (other affected sw may show up in Processing in the future), but also interoperability: it does not seem very strange to me that a qgis user edits a shape and copies/sends it to others before removing it from qgis or closing the qgis project. To tell the truth, it is a pretty common situation.
This is kind of dangerous to try reading/copying a file being edited by a software, whatever the software is (Do you copy a Word/Excel document being edited by Word/Excel ? I don't...). Anyway you could run SyncToDisk() + REPACK after each SetFeature()/DeleteFeature() call (or, more reasonably, at the end of the editing session), but that might have bad performance effects. For example, if you work on a shapefile with hundreds of thousands of records, or with very big geometries, repacking it might be slow. So there's a trade-off to consider.
One point to be careful is that REPACK can change feature ids (when compacting deleted features). Is QGIS ready to deal with that (e.g if you have a form opened on a feature, with a feature id greater than the feature that has been deleted?).
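To make the feature id concern concrete, here is a minimal sketch of how compaction renumbers the surviving records. It assumes feature ids are simply 0-based record positions in the file; `repack_fid_map` is a hypothetical helper for illustration, not a QGIS or GDAL function.

```python
def repack_fid_map(deleted_flags):
    """Return {old_fid: new_fid} for records that survive compaction.

    deleted_flags[i] is True when record i carries the deletion mark.
    Assumption: ids are 0-based and equal the record's position in the file.
    """
    mapping = {}
    next_fid = 0
    for old_fid, deleted in enumerate(deleted_flags):
        if deleted:
            continue  # record is dropped, so it gets no new id
        mapping[old_fid] = next_fid
        next_fid += 1
    return mapping


# Five records, record 1 deleted: ids 2, 3 and 4 each shift down by one.
fid_map = repack_fid_map([False, True, False, False, False])
```

Anything that cached an id greater than the deleted one (an open form, a selection, an attribute table row) would point at a different feature after the repack unless it is remapped or reloaded.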
#57 Updated by Giovanni Manghi about 10 years ago
Hi,
This is kind of dangerous to try reading/copying a file being edited by a software, whatever the software is (Do you copy a Word/Excel document being edited by Word/Excel ? I don't...).
of course I don't, but someone could argue that in a mixed/shared working environment this could be an issue. Anyway, the real issue was that the shapes remained in this state even after closing the qgis project; this has been solved by your patch in ogr, so I guess it is just a matter of waiting for gdal 2.
cheers!
#58 Updated by Matthias Kuhn about 10 years ago
- Category changed from Digitising to Data Provider/OGR
Originally repack was called after every delete. This was changed to "on unload" because of changing feature ids, leading to strange behavior (e.g. attribute table showing Errors)
This behavior (repack after delete and now also after change geometry) could be reintroduced and a signal sent that feature ids (may) have changed. The attribute table would need to reload. Caches would need to be dismissed. Some plugins probably will not react to this change.
Another scenario is when the shapefile is open twice: one instance will call repack (and could send a signal), but for the other one this will be hard to detect (maybe a filechanged signal from the OS...).
My preferable solution would be to have guaranteed stable feature ids from gdal (or at least a signal about how feature ids changed, so they can be remapped inside QGIS) but I don't know if this is something that can be done.
#59 Updated by Giovanni Manghi about 10 years ago
Another scenario is when the shapefile is open twice: one instance will call repack (and could send a signal), but for the other one this will be hard to detect (maybe a filechanged signal from the OS...).
so this is #7540?
#60 Updated by Jürgen Fischer about 10 years ago
rouault - wrote:
One point to be careful is that REPACK can change feature ids (when compacting deleted features). Is QGIS ready to deal with that (e.g if you have a form opened on a feature, with a feature id greater than the feature that has been deleted?).
REPACK used to be run after deleting features (6149d34a5). But the changing feature ids caused other issues in feature selection and the attribute table, and hence it was reverted. AFAIK we cannot easily anticipate how the feature ids will change, can we? We'd at least have to run REPACK before we start using any ids...
#61 Updated by Saber Razmjooei about 10 years ago
rouault - wrote:
This is kind of dangerous to try reading/copying a file being edited by a software, whatever the software is (Do you copy a Word/Excel document being edited by Word/Excel ? I don't...).
2 examples of other software using files immediately after edit and save:
- User removes the feature and calls SAGA or GRASS (or other sw) from Processing.
- I use QGIS to edit my hydraulic model shapefiles. There are sometimes tens of them and they need to be constantly edited/saved.
In both cases, you don't expect users to close QGIS/Shapefile for every edit.
#62 Updated by Giovanni Manghi about 10 years ago
at least for the case of edited features the issue seems to have been solved by 4cf08c5c112278ab4f50cf21b735a4a58d6a98aa
#63 Updated by Saber Razmjooei about 9 years ago
- Status changed from Closed to Reopened
There are a couple of tickets reported in 2.10. I am closing them and re-opening this one.
#64 Updated by Danny Duong about 9 years ago
Hi guys,
Just confirming that I'm having this issue on Pisa 2.10.1. (EDIT: Issue continues with Lyon 2.12)
The .SHP files I'm working with don't appear to repack properly, so the features are still present. When the file is opened in another GIS package, such as ArcGIS, they can see all of the features that have been marked as "deleted".
Is it possible to force a repack of the files?
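One way to force a repack from outside QGIS is the OGR SQL statement `REPACK <layername>`, which the thread above refers to. The sketch below assumes the GDAL Python bindings (`osgeo`) are installed, that the layer name equals the file name without its extension, and that the file is not open in QGIS at the same time; `repack_statement` and `repack_shapefile` are hypothetical helper names.

```python
import os


def repack_statement(shp_path):
    """Build the OGR SQL statement that asks the Shapefile driver to repack."""
    # Assumption: the layer name is the file name without its extension.
    layer_name = os.path.splitext(os.path.basename(shp_path))[0]
    return "REPACK " + layer_name


def repack_shapefile(shp_path):
    """Open the shapefile in update mode and run REPACK on it."""
    from osgeo import ogr  # imported lazily; requires the GDAL Python bindings

    ds = ogr.Open(shp_path, 1)  # 1 = update mode
    if ds is None:
        raise RuntimeError("could not open %s for update" % shp_path)
    ds.ExecuteSQL(repack_statement(shp_path))
    ds = None  # dropping the reference closes the datasource
```

Work on a copy of the data first: as noted earlier in the thread, repacking can change feature ids.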
#65 Updated by Giovanni Manghi about 9 years ago
see also #13771
#66 Updated by Giovanni Manghi about 9 years ago
see also #13821
#67 Updated by Dan Isaacs about 9 years ago
Just to give what might be a simpler method of identifying shapefiles which may behave this way. I have a shapefile which is displaying these 'phantom' polygons after deletion (originally bug #13821 re-directed here).
To test - Open the .dbf file in something like Open Office (calc or base) and it displays only the data that Qgis displays. Open the same file in WPS (a free Excel equivalent) and all the old polygon's data is still there.
Also, I'd like to clear up all the comments about 'save as' or removing the layer as being an acceptable workaround. This will eliminate any virtual fields one might have attached to that layer, a very important feature of Qgis for many users.
Furthermore, some users (myself included) need to be able to edit .dbf files directly, this cannot be done with 'phantom' entries that don't show up in a normal dbase editor.
We do need a solution which produces a 'clean' .dbf file after saving any polygon edits for all files.
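The check described above can also be scripted. This is a minimal sketch that assumes the 'phantom' rows are records still carrying the dBASE deletion flag (an asterisk in the first byte of the record); `count_dbf_records` is a hypothetical helper and only reads the file.

```python
import struct


def count_dbf_records(dbf_path):
    """Return (active, flagged_deleted) record counts for a dBASE file."""
    with open(dbf_path, "rb") as f:
        header = f.read(32)
        # Bytes 4-7: record count; 8-9: header size; 10-11: record size.
        n_records, header_len, record_len = struct.unpack("<IHH", header[4:12])
        active = deleted = 0
        for i in range(n_records):
            f.seek(header_len + i * record_len)
            flag = f.read(1)
            if flag == b"*":  # 0x2A marks a record as deleted
                deleted += 1
            elif flag == b" ":  # 0x20 marks a live record
                active += 1
    return active, deleted
```

A non-zero second value would suggest the file has not been repacked, which is the situation where some programs hide the rows and others show them.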
#68 Updated by Jérôme Guélat about 9 years ago
This problem can be extremely dangerous when sharing data with ArcGIS users, since they will see all the deleted features!
Moreover there's also a small problem in QGIS: if you activate "Show Feature Count" (right click on the layer), the number of features will be wrong!
#69 Updated by Saber Razmjooei about 9 years ago
Jérôme Guélat wrote:
Moreover there's also a small problem in QGIS: if you activate "Show Feature Count" (right click on the layer), the number of features will be wrong!
There are other tickets related to that. The root of all the tickets are the same. See #13422
#70 Updated by Björn Harrtell about 9 years ago
Was affected by this issue today (I think) and investigated. If I read the history correctly here, the fix for this issue is in upstream GDAL but not merged until 2.0.0.
It seems that QGIS 2.12 for Windows is built against GDAL 1.11 (Windows) and also most common Linux packages for QGIS seem to depend on GDAL 1.x.
This might explain why this issue persists, which is unfortunate because editing shapefiles probably is one of the more common cases for QGIS newcomers.
#71 Updated by Yuval Lorig about 9 years ago
Hi,
I'm having the same issue on 2.10 on various machines when I try to open a SHP that was created in QGIS 2.10 after some polygons were split or deleted. When I open it in ArcGIS, either the SHP is corrupted or it shows the deleted polygons. "Save as" doesn't solve the problem all the time.
I didn't understand whether this problem was solved in 2.12 or not. Could someone advise what to do?
Thanks!
#72 Updated by Saber Razmjooei about 9 years ago
My work-arounds so far have been using QGIS 2.8 + Save as...
#73 Updated by Yuval Lorig about 9 years ago
Björn Harrtell wrote:
Was affected by this issue today (I think) and investigated. If I read the history correctly here, the fix for this issue is in upstream GDAL but not merged until 2.0.0.
What do you mean? Should I upgrade GDAL to version 2 myself? How will it affect QGIS?
Thanks
#74 Updated by Uroš Preložnik about 9 years ago
Björn Harrtell wrote:
... which is unfortunate because editing shapefiles probably is one of the more common cases for QGIS newcomers.
Not just newcomers... :(
Had this today (QGIS 2.10, Windows; also tested with 2.8.3 on Linux, same problem) and it took me some time to figure out what the problem was. It was really simple to reproduce, as with others:
- Create Shapefile, Add couple polygons, Save. Everything OK in QGIS and ArcGIS.
- Edit Shapefile, delete one record, Save. QGIS OK, but when opened with ArcGIS deleted feature is there!
This is such a serious problem if you cannot trust QGIS doing such a basic task. I think there should be some release notes warning about that and similar problems if they exist.
#75 Updated by Björn Harrtell about 9 years ago
Yuval: AFAIK the issue remains on QGIS 2.12. There is hope for the 2.14 LTS release which will probably be built against a newer GDAL. Upgrading the bundled GDAL in the current release could in best case scenario resolve the issue but I have not tried it and would expect that it can also cause other problems.
Uros: Agreed. I even think editing shapefiles should be disabled in future releases if this issue remains.
#76 Updated by Jürgen Fischer about 9 years ago
can anyone confirm that this is fixed with GDAL 2?
#77 Updated by Saber Razmjooei almost 9 years ago
Jürgen Fischer wrote:
can anyone confirm that this is fixed with GDAL 2?
Hi Jef,
It does work fine in master (001f4bc) using Gdal/ogr 2.0.1.
I tested it with GRASS 7, but I suppose other external programs will be fine too.
#78 Updated by Antoine SIG almost 9 years ago
Jürgen Fischer wrote:
can anyone confirm that this is fixed with GDAL 2?
Hi,
Same problem for me with GDAL 2 (tested with this version: http://www.geoinformations.developpement-durable.gouv.fr/installer-la-version-recommandee-de-qgis-a2747.html)
I have the same problem with 2.12.1, but with version 2.8.4 it's OK! Both have GDAL 1.11.3 and there is no problem on 2.8.
(deleted features appear in ArcGIS and SAGA)
#79 Updated by jerome andal almost 9 years ago
Hi,
If you open the attribute table, select all rows, and use "Save as" with the option to save only selected features, it's OK: the 'phantom polygons' are really deleted.
This bug makes me waste a lot of time. I have to do this every time after deleting or after a spatial operation. I need to check often whether the feature count is identical to the number of rows in the attribute table. If I forget to check this, I get a corrupted shapefile (offset between geometries and attribute table).
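The manual count comparison can be approximated with a small script. This sketch relies only on the file layouts (a .shx file has a 100-byte header followed by one 8-byte entry per geometry; a .dbf file stores its record count in header bytes 4-7). The function names are hypothetical, and a matching count does not prove the files are otherwise consistent.

```python
import os
import struct


def shx_record_count(shx_path):
    """Number of geometries indexed by a .shx file (100-byte header, 8 bytes per entry)."""
    return (os.path.getsize(shx_path) - 100) // 8


def dbf_record_count(dbf_path):
    """Record count stored in bytes 4-7 of the .dbf header (includes flagged-deleted rows)."""
    with open(dbf_path, "rb") as f:
        return struct.unpack("<I", f.read(8)[4:8])[0]


def counts_match(base_path):
    """True when <base>.shx and <base>.dbf report the same number of records."""
    return shx_record_count(base_path + ".shx") == dbf_record_count(base_path + ".dbf")
```

For a layer stored as `roads.shp`, `counts_match("roads")` returning False would be a hint that geometries and attribute rows have drifted apart.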
#80 Updated by Marco Hugentobler almost 9 years ago
- Assignee set to Marco Hugentobler
#81 Updated by Marco Hugentobler almost 9 years ago
200ce04b88c515d9880615d7254e45d1a50049a4 fixes the problem (at least for my test cases on windows).
Please do some further testing (especially on windows).
#82 Updated by Antoine SIG almost 9 years ago
Marco Hugentobler wrote:
200ce04b88c515d9880615d7254e45d1a50049a4 fixes the problem (at least for my test cases on windows).
Please do some further testing (especially on windows).
It's OK for me, (test with ArcMAP 10.3 / windows 10).
Thanks for fix !!
#83 Updated by Saber Razmjooei almost 9 years ago
- Affected QGIS version changed from master to 2.8.5
Tested in the latest master and it works as expected.
Is there a chance to have a patch for 2.8 LTR or is it reliant on the latest OGR in master?
#84 Updated by Marco Hugentobler almost 9 years ago
- Status changed from Reopened to Closed
Is there a chance to have a patch for 2.8 LTR or is it reliant on the latest OGR in master?
This bug is not in 2.8 ( only in 2.10 and 2.12 ).
#85 Updated by Sandro Santilli almost 9 years ago
Could anyone test this ticket again with the code in https://github.com/qgis/QGIS/pull/2755 ?
#86 Updated by Danny Duong almost 9 years ago
- Status changed from Closed to Reopened
I've just tested this on Windows 8 64-bit, QGIS 2.12.3 64-bit and it seems to still happen.
The steps I followed were:
1. Open a .SHP file and toggle on "Show Feature Count".
2. Delete objects from the .SHP within QGIS. The feature count drops down.
3. Save the .SHP within QGIS. The feature count goes back up to the original number.
4. Open this .SHP in another package. The deleted items are still there.
Otherwise, re-opening this file within QGIS 2.12.3 will show the items have been deleted but the feature count remains unchanged.
This problem does not occur in QGIS 2.8.6 LTR. Re-opening and saving within QGIS 2.8.6 changes the feature count number to the correct one. After doing this, opening this file in another package works as expected.
#87 Updated by Nyall Dawson almost 9 years ago
- Status changed from Reopened to Closed
This fix is not present in 2.12.3
#88 Updated by Uroš Preložnik almost 9 years ago
Marco Hugentobler wrote:
Is there a chance to have a patch for 2.8 LTR or is it reliant on the latest OGR in master?
This bug is not in 2.8 ( only in 2.10 and 2.12 ).
Hi Marco,
This bug was also in the 2.8 branch all along, as it was reported by many users. I did another test today and it looks like it is solved in 2.8.6 (tested on Ubuntu), but it definitely exists in 2.8.5.
Another thing is the 2.12 release. Like others, I can also confirm the bug still exists in 2.12.3 (tested on Windows).
So I think this should be reopened.
#89 Updated by Danny Duong almost 9 years ago
Uros: Only after posting did I realise that "Master" referred to the 2.13 Master; I wrongly thought the fix was incorporated into 2.12.3.
Assuming all goes well, I'd expect the fix in 2.13 to be incorporated into the 2.14 LTR release which is due at the end of this month.
#90 Updated by Miroslav Umlauf over 8 years ago
This issue persists in 2.8.8 but seems to be fixed in 2.14.
#91 Updated by Antoine SIG over 8 years ago
- Target version set to Version 2.14
#92 Updated by Antoine SIG over 8 years ago
- Status changed from Closed to Reopened
No issue with 2.14.1 / 2.14.2 but the same issue with 2.14.3
(test with Arcgis 10.4)
#93 Updated by Andreas Neumann over 8 years ago
Antoine SIG: can you please add more details (Devs can't reproduce your issue):
- QGIS code revision
- GDAL/OGR version
- operating system
You can find all this information in the QGIS about screen.
Please also explain the steps to reproduce and provide some demo data.
#94 Updated by Andreas Neumann over 8 years ago
see discussion at http://lists.osgeo.org/pipermail/qgis-developer/2016-June/043531.html
#95 Updated by Giovanni Manghi over 8 years ago
Hi Andreas.
Andreas Neumann wrote:
Antoine SIG: can you please add more details (Devs can't reproduce your issue):
- QGIS code revision
- GDAL/OGR version
- operating system
You can find all this information in the QGIS about screen.
Please also explain the steps to reproduce and provide some demo data.
replicating is easy: edit a shapefile (save edits) and without exiting qgis (or removing the shape from project) open the vector with another gis software.
This step is not that unusual: think about people who send shapes to each other, possibly after an edit. If I'm using the shape, why should I think about closing qgis or removing the vector from the project before sharing it?
Having a shape (or other formats) open in qgis has recently also started to cause various locking issues; just search redmine for it and you'll see a few tickets about it.
cheers!
#96 Updated by Even Rouault over 8 years ago
- what kind of edits exactly: adding shapes, removing shapes, adding vertex to a pre-existing shape, ... ?
- I'm surprised that removing the shape from project causes issue: this should be equivalent to exiting QGIS. You don't have problem when saving & exiting QGIS right ?
#97 Updated by Even Rouault over 8 years ago
Giovanni: and which GDAL version and OS ?
#98 Updated by Giovanni Manghi over 8 years ago
Hi Even, good morning.
Even Rouault wrote:
Giovanni:
- what kind of edits exactly: adding shapes, removing shapes, adding vertex to a pre-existing shape, ... ?
The last time I was warned about this problem it was triggered by deleting features in a shapefile: this didn't happen to me directly, but to people I work with. At that point I was sure that the issue was solved (these QGIS users are using QGIS 2.14.3 on Windows, so GDAL 2.0.* I guess) and I have confirmed that, when opening the shape on a machine with ESRI software, the deleted features were there.
- I'm surprised that removing the shape from project causes issue: this should be equivalent to exiting QGIS. You don't have problem when saving & exiting QGIS right ?
I'm sorry, I probably did not express myself very well. The issue seems to arise when the shapefile is shared after editing without first removing it from the project or closing qgis.
cheers!
-- G --
#99 Updated by Giovanni Manghi over 8 years ago
Even Rouault wrote:
Giovanni: and which GDAL version and OS ?
The last time I was warned about the issue the user was using QGIS 2.14.3 on Windows, so GDAL 2.0.*.
#100 Updated by Giovanni Manghi over 8 years ago
Even Rouault wrote:
Giovanni:
- what kind of edits exactly: adding shapes, removing shapes, adding vertex to a pre-existing shape, ... ?
- I'm surprised that removing the shape from project causes issue: this should be equivalent to exiting QGIS. You don't have problem when saving & exiting QGIS right ?
Hi again,
I just tested on Windows, using QGIS 2.14.3 (now with gdal 2.1) and OpenJump/ogrinfo.
I opened a point shapefile in QGIS and deleted a few features. Then I opened the shape in OpenJump and ran ogrinfo; both still show the original number of features. Removing the shape from QGIS or closing QGIS makes no difference.
I will test master now.
#101 Updated by Giovanni Manghi over 8 years ago
I will test master now.
on qgis master everything seems to work as expected. So if something was fixed in master I guess it would need to be backported to 2.14, as it is the next lts(?).
#102 Updated by Antoine SIG over 8 years ago
My computer : Windows 10 (64)
QGIS 2.8.9 (23c3ece / GDAL 2.1.0) > no issue after closing QGIS or removing the layer from the project (REPACK used after closing the layer?)
QGIS 2.14.3 (cf2ebb8 / GDAL 2.1.0) > issue
Except if the layer has sbn/sbx files: the first save deletes the sbn/sbx files and the DBF is updated (and repacked?), but after that the issue occurs on every save.
QGIS 2.15 (b5f0fc8 / GDAL 2.1.0) > no issue
#103 Updated by Jukka Rahkonen over 8 years ago
When you test with OpenJUMP please report the exact OpenJUMP revision from Info-About. There have been some changes in the OpenJUMP shp driver and it is possible that different revisions give different results.
#104 Updated by Even Rouault over 8 years ago
OK, I confirm that (regarding the scenario of deletion of features in a shapefile):
- QGIS 2.14.3 on Windows has the issue (with GDAL 2.1.0)
- QGIS 2.14.3 and latest state of 2.14 branch on Linux do NOT have the issue
- QGIS master (on Windows or Linux) does NOT have the issue (with GDAL 2.1.0)
Since QGIS 2.14.3, I backported the most relevant changes to the 2.14 branch (https://github.com/qgis/QGIS/commits/release-2_14/src/providers/ogr), so I'm confident that 2.14.4 will behave like master. I do think the issue is specific to Windows and the way it manages exclusive locks. It has probably been fixed by this series of commits: #c913e83f5ad14b011be0a44ec9a59d80d103f929, #32ea6514a92610bba9d29fd85792204f78b584ac and #fe32ba40b920038e418a653b51a69ce43549e4d9
#105 Updated by Andreas Neumann over 8 years ago
In the OSGeo4W build there are nightlies of the 2.14 branch. I just checked and it seems to be up to date with this commit: #843d17e - this version contains the fixes that Even mentioned.
So Antoine or Giovanni - are you able to test with this version and report back if the issue still exists?
Thanks,
Andreas
#106 Updated by Even Rouault over 8 years ago
Andreas, I also now see those nightlies of the 2.14 branch (the title bar says 2.14.3, which is a bit confusing; ideally it should say release-2_14 to make clear that it isn't a tagged version), and my above tests work now with them!
I've also added a unit test for that issue (would need to run on Windows to be really significant as Linux "hides" the issue) : #3cba1aa263a72e6aadcb002730f221c0e2e50f1e
#107 Updated by Antoine SIG over 8 years ago
Test with nightly (Windows 10 64)
QGIS 2.14.3 nightly (843d17e / GDAL 2.1.0) : no issue ! It works perfectly.
#108 Updated by Andreas Neumann over 8 years ago
- Status changed from Reopened to Closed
- Resolution changed from up/downstream to fixed/implemented
Phew - that was a complicated bug with lots of discussion - over two years since the original report of the issue.
Thanks to everyone involved!
#109 Updated by Saber Razmjooei over 8 years ago
- Status changed from Closed to Reopened
- Resolution deleted (fixed/implemented)
The resolution seems to cause some side-effects/data corruption, see #15407.
#110 Updated by Sandro Santilli about 8 years ago
I'm not sure it's useful to keep this ticket open: the bug in this ticket's subject was fixed, so it is better to use the new ticket for the new problem.
#111 Updated by Even Rouault about 8 years ago
- Status changed from Reopened to Closed
Closing. This ticket is not manageable and #15407 has been opened.