Bug report #6327

non-ASCII character corruption in shapefile

Added by Leyan Ouyang over 11 years ago. Updated almost 11 years ago.

Affected QGIS version: master
Regression?: No
Operating System:
Easy fix?: No
Pull Request or Patch supplied: No
Resolution: fixed
Crashes QGIS or corrupts data: No
Copied to github as #: 15614


Non-ASCII characters seem to be corrupted when saving a new shapefile. Steps to reproduce:
  • Create a shapefile, choosing UTF8 and adding a text attribute
  • Create a feature and fill the text attribute with non-ASCII text (for example 汉字)
  • Save the shapefile.

The text of the attribute is replaced by "??"
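For illustration, the symptom (one "?" per character) is what you would see if the text were pushed through a codec that cannot represent it. This is only a sketch of that mechanism in plain Python, not a confirmed diagnosis of what QGIS or GDAL does here. The helper name `roundtrip` and the choice of Latin-1 as the lossy codec are assumptions made for the example.

```python
# Illustration only: a lossy codec turns each unencodable character into "?".
# Latin-1 is an assumed stand-in for whatever encoding might be applied on save.
text = "汉字"


def roundtrip(value: str, codec: str) -> str:
    """Encode with `codec`, substituting "?" where needed, then decode again."""
    return value.encode(codec, errors="replace").decode(codec)


lossy = roundtrip(text, "latin-1")   # neither character exists in Latin-1
lossless = roundtrip(text, "utf-8")  # UTF-8 can represent both characters

print(lossy)
print(lossless)
```

If the attribute were really written as UTF-8, the second case would apply and the text would survive the round trip.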

This does not happen when loading an existing UTF8 shapefile, which can even be modified and saved. "Save as", however, is also affected.

I use QGIS master and the latest GDAL build from SVN. I am using ArchLinux with an en_US.UTF-8 locale.

I am aware of the previous issues between QGIS and GDAL, but Alexander Bruy told me they were mostly solved now and this problem looks different, hence this new bug report.


#1 Updated by Leyan Ouyang over 11 years ago

Everything seems to work much better if I choose the option "Ignore shapefile encoding". What does this new option do exactly?

#2 Updated by Alexander Bruy over 11 years ago

  • Operating System deleted (ArchLinux)
  • Status changed from Open to Feedback

Should be fixed in master, see 75dc85b4d652116814873bb7674cab15ce6cde66.

#3 Updated by Leyan Ouyang over 11 years ago

This is the option "Ignore shapefile encoding" I mentioned in the first comment, and it does seem to solve the issue. However, the bug is not fixed in my opinion, as the default setting will lead to loss of data. This option should be checked by default until the issues with GDAL are sorted out.

#4 Updated by Giovanni Manghi about 11 years ago

  • Priority changed from High to Normal

#5 Updated by Borys Jurgiel almost 11 years ago

  • Resolution set to fixed
  • Status changed from Feedback to Closed

Fixed in 4fb9879
