Bug report #13796
Attribute table swallows values
| Status: | Closed | | |
|---|---|---|---|
| Priority: | Normal | | |
| Assignee: | - | | |
| Category: | Attribute table | | |
| Affected QGIS version: | master | Regression?: | No |
| Operating System: | Ubuntu | Easy fix?: | No |
| Pull Request or Patch supplied: | No | Resolution: | end of life |
| Crashes QGIS or corrupts data: | No | Copied to github as #: | 21821 |
Description
If you open the attached shapefile (original encoding is windows-1251) and choose UTF-8, you see question marks, which is expected. But if you choose System encoding, the value of the CLNAME attribute is empty.
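The expected half of this behavior can be reproduced outside QGIS. The following is a minimal Python 3 sketch using only the standard library; it assumes the CLNAME value is the word "ЛЕСА" quoted later in this thread, and it says nothing about how QGIS or OGR decode the field internally.

```python
# Bytes of the Cyrillic word "ЛЕСА" as a windows-1251 (cp1251) file stores them.
raw = "ЛЕСА".encode("cp1251")  # four bytes: CB C5 D1 C0

# Decoding with the encoding the data was written in recovers the text.
as_cp1251 = raw.decode("cp1251")

# None of these bytes forms valid UTF-8, so errors="replace" substitutes
# U+FFFD (usually rendered as a question mark glyph) for each of them.
as_utf8 = raw.decode("utf-8", errors="replace")

print(repr(as_cp1251))
print(repr(as_utf8))
```

The replacement characters are the "question marks" described above: a visible sign that the chosen encoding does not match the data.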
History
#1 Updated by dr - about 9 years ago
#2 Updated by Giovanni Manghi about 9 years ago
- Category set to Attribute table
#3 Updated by Saber Razmjooei over 8 years ago
- File encoding_master.PNG added
- Status changed from Open to Feedback
Looks fine in the latest master under Windows.
#4 Updated by Giovanni Manghi over 8 years ago
- Resolution set to worksforme
- Status changed from Feedback to Closed
closing for lack of feedback, please reopen if necessary.
#5 Updated by dr - over 8 years ago
- Resolution deleted (worksforme)
- Status changed from Closed to Reopened
Issue is still present.
#6 Updated by Jürgen Fischer over 8 years ago
- Status changed from Reopened to Feedback
And what behavior do you expect? If you select "windows-1251" the value is "ЛЕСА" (forests). I suppose that would also occur if your "System" encoding was "windows-1251".
#7 Updated by dr - over 8 years ago
If I select "UTF-8", I see the field value as "����", but if I select "System", I see the field value as an empty string. Why do I get a different result if my system encoding is UTF-8?
#8 Updated by Jürgen Fischer over 8 years ago
dr - wrote:
If I select "UTF-8", I see the field value as "����", but if I select "System", I see the field value as an empty string. Why do I get a different result if my system encoding is UTF-8?
Qt's detection (or iconv's) of the system encoding is tricky. With LC_CTYPE=de_DE.UTF-8 the system codec is different from the UTF-8 codec.
#!/usr/bin/python
# Python 2 / PyQt4: decode the same bytes with four different Qt codecs.
from PyQt4.QtCore import QTextCodec

sc = QTextCodec.codecForName( "System" )        # codec registered under the name "System"
lc = QTextCodec.codecForLocale()                # codec Qt derives from the locale
uc = QTextCodec.codecForName( "utf-8" )
wc = QTextCodec.codecForName( "windows-1251" )

# "ЛЕСА" encoded as windows-1251, written as octal escapes
a = "\313\305\321\300"
print "input:", a

b = sc.toUnicode(a)
print u"sc output: {} ({})".format(unicode(b), sc.name())
b = lc.toUnicode(a)
print u"lc output: {} ({})".format(unicode(b), lc.name())
b = uc.toUnicode(a)
print u"uc output: {} ({})".format(unicode(b), uc.name())
b = wc.toUnicode(a)
print u"wc output: {} ({})".format(unicode(b), wc.name())
produces
input: ����
sc output: (System)
lc output: (System)
uc output: ���� (UTF-8)
wc output: ЛЕСА (windows-1251)
But I don't see much difference between the empty string and the question mark junk. What's the point of not using the actual encoding of the data?
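For readers without PyQt4, both outcomes seen above (junk characters versus nothing at all) can be imitated with the error policies of Python 3's built-in decoder. This is only a sketch of how one set of bytes can yield either result; it does not claim that Qt's System codec uses any of these policies. The helper name `decode_with` is made up for illustration.

```python
raw = b"\xcb\xc5\xd1\xc0"  # "ЛЕСА" in windows-1251

def decode_with(policy):
    """Decode raw as UTF-8 under the given error policy; return None on failure."""
    try:
        return raw.decode("utf-8", errors=policy)
    except UnicodeDecodeError:
        return None

# strict  -> raises, reported here as None
# replace -> one U+FFFD per undecodable byte
# ignore  -> undecodable bytes are dropped silently
results = {policy: decode_with(policy) for policy in ("strict", "replace", "ignore")}

for policy, text in results.items():
    print(policy, repr(text))
```

With "ignore" the result is an empty string, which is indistinguishable from a field that really was empty.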
#9 Updated by dr - over 8 years ago
I see a lot of difference between an empty string and question marks! Question marks are strongly associated with 'something is wrong'; empty strings, on the other hand, are common in the data and 'look' correct.
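The concern about silent loss can be turned into a check. Below is a rough Python 3 sketch, assuming the raw bytes of a field are still available next to the decoded text; `looks_lossy` is a hypothetical helper, not part of QGIS, OGR, or Qt.

```python
REPLACEMENT = "\ufffd"  # Unicode replacement character

def looks_lossy(raw: bytes, text: str) -> bool:
    """Return True when the decoded text probably lost information from raw."""
    if raw and not text:
        return True  # non-empty bytes turned into an empty string
    return REPLACEMENT in text  # decoder had to substitute at least one byte

raw = b"\xcb\xc5\xd1\xc0"  # "ЛЕСА" in windows-1251
checks = {
    "cp1251": looks_lossy(raw, raw.decode("cp1251")),
    "utf-8 replace": looks_lossy(raw, raw.decode("utf-8", errors="replace")),
    "utf-8 ignore": looks_lossy(raw, raw.decode("utf-8", errors="ignore")),
}
print(checks)
```

A check like this flags the empty result as suspicious in the same way as the question marks, while a field that was empty to begin with passes.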
#10 Updated by Giovanni Manghi almost 8 years ago
- Status changed from Feedback to Open
#11 Updated by Giovanni Manghi over 7 years ago
- Regression? set to No
- Easy fix? set to No
#12 Updated by Giovanni Manghi over 5 years ago
- Resolution set to end of life
- Status changed from Open to Closed
End of life notice: QGIS 2.18 LTR
Source:
http://blog.qgis.org/2019/03/09/end-of-life-notice-qgis-2-18-ltr/