Please help with unicode issue
#1
I'm working on a new add-on (github), and I'm running into issues displaying Arabic text on screen.
I've looked at but unfortunately, I still can't get it working right.

The problem would be in
I'm close but I don't know what I'm doing wrong.

If you run the add-on,
- under "add-on settings" change language to "Arabic", and
- in the add-on, select "Middle East" from the menu, it just shows

instead of
  1. {arabic}
  2. {arabic}

Edit: added links to github file line
#2
You have to be sure that your text is unicode (the decoded form) before calling .encode(); otherwise you're "double translating" it, so to speak. (Spoiler alert: it probably isn't. Websites send encoded bytes, not unicode objects.) Once you know which encoding it's being sent in, you can decode it to unicode and then re-encode it to whatever you want (utf-8 in your case).

Example:
Code:
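# Assuming the website sends you the text encoded as iso-8859-1
# (get_webpage_function is a placeholder that returns the raw page bytes)
text = get_webpage_function('http://www.example.com')
text = text.decode('iso-8859-1')  # encoded bytes -> unicode
text = text.encode('utf-8')       # unicode -> utf-8 bytes
print text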

How to find out what the page is encoded as:
http://stackoverflow.com/questions/14592...-in-python
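
To make the decode-then-encode step concrete, here is a minimal sketch. It is written for Python 3, where bytes and text are separate types (the snippet above is Python 2). `reencode` is a hypothetical helper, and the choice of windows-1256 as the "wire" encoding is only an assumption for the demo, not something known about the site in question.

```python
# Sketch only: `reencode` is a hypothetical helper, not part of any library.

def reencode(raw, source_encoding, target_encoding="utf-8"):
    """Decode raw bytes with their real encoding, then encode to the target."""
    text = raw.decode(source_encoding)   # encoded bytes -> unicode text
    return text.encode(target_encoding)  # unicode text -> target bytes


sample = u"\u0645\u0631\u062d\u0628\u0627"  # an Arabic word, used as test data

# Pretend the site sent the text as windows-1256 (assumption for this demo).
wire_bytes = sample.encode("windows-1256")

# Decoding with the right codec round-trips cleanly to utf-8.
good = reencode(wire_bytes, "windows-1256")

# Decoding with the wrong codec does not raise; it silently yields mojibake.
garbled = wire_bytes.decode("iso-8859-1")
```

The point of the last line is that a wrong guess usually produces no error at all, just the wrong characters, which is why it matters to find out the real encoding first.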
#3
(2013-12-26, 19:59)Bstrdsmkr Wrote: You have to be sure that your text is unicode (the decoded form) before calling .encode(); otherwise you're "double translating" it, so to speak. (Spoiler alert: it probably isn't. Websites send encoded bytes, not unicode objects.) Once you know which encoding it's being sent in, you can decode it to unicode and then re-encode it to whatever you want (utf-8 in your case).

Example:
Code:
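# Assuming the website sends you the text encoded as iso-8859-1
# (get_webpage_function is a placeholder that returns the raw page bytes)
text = get_webpage_function('http://www.example.com')
text = text.decode('iso-8859-1')  # encoded bytes -> unicode
text = text.encode('utf-8')       # unicode -> utf-8 bytes
print text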

How to find out what the page is encoded as:
http://stackoverflow.com/questions/14592...-in-python


I'm using the requests library, and I've specified 'utf-8' in my method:
resources/lib/util.py
That should take care of it, no?

Also, I had tried r.text instead of r.content and I got the same results...

When I encode the text in abc_base.py, I get the correct result in my xbmc.log file. But I can't get it to show up on screen.
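
A rough way to separate "the data is wrong" from "the screen can't draw it" is to inspect the decoded string itself rather than how it renders. The sketch below (Python 3, hypothetical helper names) checks whether a string actually contains Arabic code points and prints an escaped form that is safe to write to any log.

```python
import unicodedata


def contains_arabic(text):
    """Return True if any character's Unicode name mentions ARABIC."""
    return any("ARABIC" in unicodedata.name(ch, "") for ch in text)


def escaped(text):
    """Return an ASCII-only view of the string, e.g. for log files."""
    return text.encode("unicode_escape").decode("ascii")


label = u"\u0645\u0631\u062d\u0628\u0627"  # sample Arabic text (test data)
print(contains_arabic(label), escaped(label))
```

If a check like this passes but the text still doesn't appear on screen, the decoding is probably fine and the problem lies somewhere in the display path.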
#4
Check out http://www.python-requests.org/en/latest...se-content
When you set r.encoding here: https://github.com/irfancharania/plugin....til.py#L28
you're telling requests to decode the page as utf-8 when it builds r.text. Are you sure that's what it's being sent as?
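
One way to check what the server claims to be sending is to read the charset out of the Content-Type response header. The sketch below uses only the standard library (Python 3); `charset_from_content_type` is a hypothetical helper, and the header strings are made-up examples. With requests, the header value would come from `r.headers`, and `r.apparent_encoding` offers a guess based on the body instead.

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Extract the charset parameter from a Content-Type header value.

    Returns `default` when the header declares no charset.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset(failobj=default)


declared = charset_from_content_type("text/html; charset=UTF-8")
missing = charset_from_content_type("text/html")
print(declared, missing)
```

A declared charset can still be wrong, so treat it as a first hint rather than proof.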

Another possibility is the font in your skin might not support those characters?
#5
(2013-12-26, 22:51)Bstrdsmkr Wrote: Check out http://www.python-requests.org/en/latest...se-content
When you set r.encoding here: https://github.com/irfancharania/plugin....til.py#L28
you're telling requests to decode the page as utf-8 when it builds r.text. Are you sure that's what it's being sent as?

Another possibility is the font in your skin might not support those characters?

I'll check it out. I'll try the code you provided earlier and report back tomorrow.

I don't think the skin's the problem. I'm using Confluence to test.
#6
You may have to change the font type in skin settings; I don't think the default in Confluence supports Arabic chars.
#7
(2013-12-27, 13:22)divingmule Wrote: You may have to change the font type in skin settings, I don't think the default in Confluence supports Arabic chars.

You guys were right -- it's the skin...
It was working correctly all along. D'oh!
Changing Font in Appearance to "Arial Based" did the trick.