How to get unicode from python to $INFO label - bossanova808 - 2012-02-23 07:17
I have some code that uses this string:
Code:
u'Sigur R\xc3\xb3s'
(python repr())
...which would appear to be a UTF-8 encoded unicode string (although I am very weak in this area!)
and I am setting that to a window property via:
Code:
xbmcgui.Window(xbmcgui.getCurrentWindowId()).setProperty("CURRENTARTIST", artist)
(in a WindowXML)
...however, this results in gobbledygook on screen.
I suspect I am going wrong somewhere basic, but an arvo of researching various encoding things has got me no closer...
Anyone have ideas??
- VictorV - 2012-02-23 21:30
Try to convert it to a bytestring
s = u'Sigur R\xc3\xb3s'.encode('utf-8')
- bossanova808 - 2012-02-25 02:19
Unfortunately that doesn't work...same result.
Any other ideas? I think the info IS UTF-8, but maybe XBMC isn't interpreting it as such.
- bossanova808 - 2012-02-25 02:24
Hmmm, OK, passing it just artist = 'Sigur R\xc3\xb3s' WITHOUT making it a unicode string works!
That's odd...must be a double translation thing, I guess?
Now, how to get the unicode strings into a basic string in Python, i.e. cast them, I guess. I find this area a bit confusing....
- bossanova808 - 2012-02-25 02:36
The problem is that I am using a downstream library and it is returning strings with these characters in them, so 'Sigur R\xc3\xb3s' - and these are typed as unicode.
If I then pass them as this type, they come out wonky in XBMC. I need to just cast them or get the literal value of the string...but I can't seem to just get the literal value from a unicode string in a variable...
I think I am missing something obvious but have been missing it for two days now and it's driving me nuts!
Any python experts know how to do this??
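For readers following along, here is a minimal sketch of what the repr() above suggests is going on: the UTF-8 bytes C3 B3 (the encoding of "ó") appear to have been read one byte per character, as if they were Latin-1, so the unicode object holds two wrong characters instead of one right one. It is written for Python 3, where str plays the role of Python 2's unicode; the helper name repair_mojibake is invented for this example and is not part of XBMC or any library mentioned in the thread.

```python
def repair_mojibake(text):
    """Undo a UTF-8-read-as-Latin-1 mix-up; return text unchanged if that does not apply."""
    try:
        # Latin-1 maps code points 0-255 straight back to the byte values 0-255,
        # which recovers the original UTF-8 bytes; then decode them properly.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside Latin-1, or the recovered
        # bytes are not valid UTF-8: assume it was not mojibake and leave it.
        return text


broken = "Sigur R\xc3\xb3s"          # what the library returned (as text)
fixed = repair_mojibake(broken)       # "Sigur R\xf3s", i.e. "Sigur Rós"

# Encoding the broken text as UTF-8 only encodes the two wrong characters
# again (double encoding), which would explain why .encode('utf-8') did not help.
double_encoded = broken.encode("utf-8")   # b'Sigur R\xc3\x83\xc2\xb3s'
```

Under Python 2, as used in this thread, artist.encode('latin-1') on its own would give the plain byte string 'Sigur R\xc3\xb3s', which is the form reported above to display correctly.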
- giftie - 2012-02-25 04:48
I thought it looked like a unicoded UTF-8 string...
I use the following Python code to ensure that the string is in UTF-8 coding.
Code:
def get_unicode( to_decode ):
    final = []
    try:
        temp_string = to_decode.encode('utf8')
        return to_decode
    except:
        while True:
            try:
                final.append(to_decode.decode('utf8'))
                break
            except UnicodeDecodeError, exc:
                # everything up to crazy character should be good
                final.append(to_decode[:exc.start].decode('utf8'))
                # crazy character is probably latin1
                final.append(to_decode[exc.start].decode('latin1'))
                # remove already encoded stuff
                to_decode = to_decode[exc.start+1:]
        return "".join(final)
Then I send the string to XBMC with a '.decode("utf-8")'. This shows the artist in the proper format (usually...).
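A rough Python 3 rendering of the same idea, for comparison with the Python 2 function above: decode as UTF-8, and treat any byte that breaks the decoding as Latin-1. The name decode_mixed is made up for this sketch. Note the early return: input that is already text passes through untouched, so a value that arrives as a unicode object with the wrong characters in it (like u'Sigur R\xc3\xb3s') would come back unchanged.

```python
def decode_mixed(raw):
    """Decode bytes as UTF-8, falling back to Latin-1 for individual bad bytes."""
    if isinstance(raw, str):
        # Already text: nothing to decode, return it as-is.
        return raw
    parts = []
    while True:
        try:
            parts.append(raw.decode("utf-8"))
            break
        except UnicodeDecodeError as exc:
            # Everything before the offending byte is valid UTF-8.
            parts.append(raw[:exc.start].decode("utf-8"))
            # Assume the offending byte is a single Latin-1 character.
            parts.append(raw[exc.start:exc.start + 1].decode("latin-1"))
            # Continue with whatever follows that byte.
            raw = raw[exc.start + 1:]
    return "".join(parts)


clean = decode_mixed(b"Sigur R\xc3\xb3s")        # valid UTF-8 bytes
mixed = decode_mixed(b"caf\xe9 R\xc3\xb3s")      # one Latin-1 byte among UTF-8
untouched = decode_mixed("Sigur R\xc3\xb3s")     # text input is returned unchanged
```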
- bossanova808 - 2012-02-25 05:09
mmm, that seemed to give me the same results. This might make it clearer (perhaps)!
Code:
title, artist, album = self.player.getCurrentTrack()
print "artist (raises exception about ordinal out of range if printed as is) "
print repr(artist)
artist2 = 'Sigur R\xc3\xb3s'
print "artist2 is " + artist2
print type(artist2)
#newa =self.get_unicode(artist)
xbmcgui.Window(xbmcgui.getCurrentWindowId()).setProperty("CURRENTTITLE", title)
xbmcgui.Window(xbmcgui.getCurrentWindowId()).setProperty("CURRENTARTIST", artist)
and output:
Code:
14:06:58 T:756 NOTICE: artist (raises exception about ordinal out of range if printed as is)
14:06:58 T:756 NOTICE: u'Sigur R\xc3\xb3s'
14:06:58 T:756 NOTICE: artist2 is Sigur Rós
14:06:58 T:756 NOTICE: <type 'str'>
If I pass artist2 - correct onscreen display
If I pass artist - gobbledygook
- giftie - 2012-02-25 06:12
What's the code in self.player.getCurrentTrack()? I think the problem is there. Without the u' prefix it works properly, as you say, but nothing seems to be able to strip it out.
- bossanova808 - 2012-02-25 06:18
Code:
artist = self.playlist[currentIndex]['artist']
...which is looking at the result of getplaylist:
self.playlist = self.sb.playlist_get_info()
...
def playlist_get_info(self):
    """Get info about the tracks in the current playlist"""
    amount = self.playlist_track_count()
    response = self.request('status 0 %i' % amount, True)
    encoded_list = response.split('playlist%20index')[1:]
    playlist = []
    for encoded in encoded_list:
        data = [self.__unquote(x) for x in ('position' + encoded).split(' ')]
        item = {}
        for info in data:
            info = info.split(':')
            key = info.pop(0)
            if key:
                item[key] = ':'.join(info)
        item['position'] = int(item['position'])
        item['id'] = int(item['id'])
        item['duration'] = float(item['duration'])
        playlist.append(item)
    return playlist
and __unquote is:
def __unquote(self, text):
    try:
        import urllib.parse
        return urllib.parse.unquote(text, encoding=self.charset)
    except ImportError:
        import urllib
        return urllib.unquote(text)
(It does raise the exception and fall through to just urllib.unquote(text) rather than the .parse version.)
I wrote basically none of those functions; they are from pysqueezecenter and I use this in lots of places, so ideally I want to fix it externally if I can...if I change the output it will likely break other things.
I even tried using repr() on it and then stripping off the u' and the final ' in a gross hack but that didn't work...which surprised me.
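One plausible explanation, not confirmed anywhere in this thread: if the text handed to __unquote is already a unicode object, Python 2's urllib.unquote turns each %XX escape into the single character with that code point, which amounts to decoding the percent-escaped bytes as Latin-1 instead of UTF-8. The sketch below reproduces that effect in Python 3 by asking urllib.parse.unquote for Latin-1 explicitly; the sample string is an assumption about what the server response might contain.

```python
from urllib.parse import unquote

# Assumed sample of a percent-encoded artist name in the server response.
quoted = "Sigur%20R%C3%B3s"

# Escaped bytes interpreted one byte per character (Latin-1):
# this gives the same value seen in the log, 'Sigur R\xc3\xb3s'.
wrong = unquote(quoted, encoding="latin-1")

# Escaped bytes interpreted as UTF-8: the intended 'Sigur R\xf3s' ("Sigur Rós").
right = unquote(quoted, encoding="utf-8")

# The wrong value can still be repaired outside the library, without
# touching its code, by reversing the Latin-1 step and decoding as UTF-8.
repaired = wrong.encode("latin-1").decode("utf-8")
```

If that is what is happening, the repair on the last line can be applied to the value after it comes back from the library, which fits the wish to fix things externally.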
- giftie - 2012-02-25 07:33
I know you really don't want to change the coding, but can you change the response line to the following:
Code:
response = self.request('status 0 %i' % amount, False)