Subtitle character joining/shaping for Semitic languages (Arabic, Hebrew, Farsi...)
#1
hello everyone!

first of all... mad respect to the developers of xbmc and everyone else who put so much time and effort into making it what it is today.. simply amazing!

now for my question..
i have modded over 100 xboxes and have played around with xbmc on many occasions. most of the time, if i had a problem, i could solve it myself, get help on irc, or find it fixed in a newer release.

but now i am stuck on a tricky lil thing.. i am trying to show arabic subtitles for some very good friends of mine and they tell me the subtitles are not appearing correctly.

i have downloaded the latest build i could find on the server (the one of 15-01-2005), uploaded a ttf font which contains the arabic alphabet, and changed the settings accordingly (charset and bi-dir-flipping). it is showing the correct font, the letters are appearing in arabic, and the subtitles are showing from right to left.

however.. the words appear as single letters rather than true arabic words like you would see in windows. the letters are not joined into words; they are separate letters, with spaces in between the words. this happens with every subtitle file and every movie i try them on.

is there anyone that has experience with getting arabic subtitles to work in xbmc? or can anyone help me fix them?

i can ask anyone here with me to help, as i can not read arabic myself. my friends and their family here in qatar sometimes have a hard time reading and understanding english. there are a lot of arabic subtitles available for download, which is super, but unfortunately they are not usable in xbmc at the moment.

please help me and others by making this software even better, so the arabic-speaking parts of the world can enjoy this great piece of software as much as the rest of us do.

thanks a lot,


ralph
#2
i believe yuvalt mentioned this issue when he first implemented bi-dir. it's not a bug but by design: bi-dir only 'mirrors' the letters, not the words, as it's not intelligent. that is fine for hebrew but causes problems with arabic, as apparently you can't just do that with arabic (i can't read either, so i can't tell you why). the only solution is to get subtitles that are formatted correctly, so you don't have to use bi-dir at all.
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
#3
hey man, thanks for replying.
i understand what you're saying, it makes sense, and sounds like that's exactly what my prob is.

so what you're saying is that i should have like.. .idx subs or something..? with embedded fonts etc?
is there a way to make those when i have just .srt?

and does anyone know whether the developers will be looking into this issue? if they need any help with the arabization, please don't hesitate to contact me.

thanks a lot for replying.
cheers,

ralph
#4
(canphaz @ feb. 11 2005,10:00 Wrote:so what you're saying is that i should have like.. .idx subs or something..? with embedded fonts etc? is there a way to make those when i have just .srt?
no, not exactly. yes, vobsub (.idx + .sub) subs will work, because they are pictures already in the correct order to be displayed (i.e. xbmc doesn't have to flip them, they are just displayed 'as is'). no, you can not convert yours or any srt/ssa etc. to vobsub (and even if you could, they would be displayed in the same order as the original srt/ssa, i.e. wrong). what you must do if you want to use text-based subtitles (like srt and ssa) is find and download subtitle files whose text is already stored in the correct display order. to check, simply open the subtitle file in a text editor/reader that supports arabic (for example notepad or wordpad on an arabic windows pc): what you see there is exactly what will be displayed on the screen when bi-di is not enabled.

(canphaz @ feb. 11 2005,10:00 Wrote:does anyone know whether the developers will be looking into this issue? if they need any help with the arabization, please dont hesitate to contact me.
i very much doubt that this is something our xbmc devs will or even can work on (for starters, we don't have anyone who speaks/reads arabic, and even if we did, if i'm right, it would take a lot to code an intelligent library that can flip the arabic language in text-based subtitles). if you'd like to see this happen in xbmc someday, you would probably be far better off lobbying/helping the developers of fribidi (link) instead. but like i already mentioned, the fastest solution is to get subtitles that are correctly formatted in the first place, or, if you plan to create/modify subtitles yourself, do it right the first time so you don't have to use workarounds like a bidirectional algorithm to flip the text. good luck.
#5
hey man, thanks again for the time and effort you took to reply to my post. i really appreciate it!

Quote:no what you must do if want to use text based subtitles (like srt and ssa) is to find/download such subtitle files that already is displayed in the correct order

which do you mean? do you mean i should find those subs in the .idx/.sub format? i think they are rather rare, especially in arabic. the most common formats are of course .srt, .sub etc., and those are more likely to be found in arabic.

Quote:simply open the subtitle file in a text-editor/reader that support arabic (like example notepad or wordpad on a arabic windows pc)

the texts in the .sub look like this: "ßçä íçãç ßçä". if i change the font in notepad / xbmc these weird looking characters appear as arabic characters. what should i do after that? or did you just try to show me how to verify that they'll appear correct?
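A possible explanation for the odd characters, offered as a sketch rather than a diagnosis: text like this is what Arabic saved as Windows-1256 tends to look like when a viewer reads the same bytes as Windows-1252. The Python below assumes exactly that. `fix_mojibake` and `reencode_subtitle` are hypothetical helper names, not part of xbmc or of any subtitle tool, and the source encoding is a guess that has to be checked against the actual file.

```python
from pathlib import Path


def fix_mojibake(text: str) -> str:
    """Undo a Windows-1252 misreading of Windows-1256 (Arabic) bytes."""
    # Recover the original bytes, then decode them with the Arabic code page.
    # Raises UnicodeEncodeError if the text contains characters cp1252 lacks.
    return text.encode("cp1252").decode("cp1256")


def reencode_subtitle(src: str, dst: str, source_encoding: str = "cp1256") -> None:
    """Rewrite a subtitle file as UTF-8; source_encoding is an assumption."""
    text = Path(src).read_bytes().decode(source_encoding)
    # Write bytes directly so the original line endings are left untouched.
    Path(dst).write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    # The three characters "ß Ç ä" correspond to the Arabic letters kaf, alef, nun.
    print(fix_mojibake("\u00df\u00c7\u00e4"))
```

This only addresses how the characters are encoded. It does nothing about the letters not being joined, which is a separate problem.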

isn't there a way to render the subtitles in an arabic font, into a .idx/.sub? can you only convert dvd subs to .idx/.sub?


Quote:...if you like to see this happen in xbmc someday then you would probably be far better off lobbying/helping the developers of fribidi (link) instead.

yeah, it looks like they are working on something interesting. has the xbmc team no plans to integrate that into xbmc?

anyhow.. sorry to keep bothering you.. i hate to ask for things and would rather try to fix them myself, but it looks like this is bigger than i can handle :( i really appreciate how you helped me. thanks a lot man.

ralph
#6
hello

first i want to thank all of those who worked to bring xbmc into existence.

xbmc is a very good media center and it works fine in all aspects except for arabic language support.

i tried everything: changing the font, flipping the direction. but nothing seems to work.

in one of the recent cvs builds, hebrew and arabic subtitles were fixed, but arabic still has a big problem.

arabic letters in one word should be connected with each other and appear in different shapes according to their location in the word. the problem with xbmc now is that it doesn't connect the letters with each other.

can this problem be solved?
#7
(still_alive @ april 16 2005,11:03 Wrote:arabic letters in one word should be connected with each other and appear in different shapes according to their location in the word. the problem with xbmc now is that it doesn't connect the letters with each other. can this problem be solved ?
not by us. read this forum thread, which i merged your post into, as the same thing applies (xbmc only displays what's in text-based subtitle files).
#8
i found a patch for mplayer that displays arabic

but we need someone to try it

mplayer patch

here is a thread about the patch
http://lists.arabeyes.org/archives/devel...00041.html

hope someone is interested in fixing this problem :)
#9
guys! i found a way to convert a subtitle in srt format (or whatever) to idx, and i tried it... it works :kickass:


========================
you'll need these progz:
subtitle workshop: to convert the sub into .ssa
maestrosbt: to convert .ssa to .son
son2vobsub: to convert .son to .idx/.sub

=======================
enjoy! :thumbsup:
#10
(balbaid @ may 08 2005,14:33 Wrote:guys ! i found the way to convert a subtitle with srt format or whatever to idx and i tried it ...it works :kickass:


========================
you'll need these progz:
subtitle workshop: to convert the sub into .ssa
maestrosbt : to convert .ssa to .son
son2vobsub : to convert .son to .idx-.sub

=======================
enjoy ! :thumbsup:
thanks for the nice idea.

i have used that method before, but it would be better to make xbmc compatible with the arabic language.

we need an arabic interface ;)
#11
i believe that arabic requires certain letter combinations to be swapped around - not just stuff to be read from right to left, correct?

i also believe these character combinations are context-sensitive, in that they depend somewhat on both the word they're part of, and the sentence they're part of.

the bidi stuff swaps letters around and has a few simple rules as to what must stay in order, but for things that are context-sensitive, it's very hard without the program having a large arabic dictionary/rule set. i therefore don't see how xbmc can really do much about this without the code knowing arabic!
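To make the "swaps letters around, with a few simple rules about what must stay in order" idea concrete, here is a rough Python sketch. It is not the Unicode Bidirectional Algorithm, and it is not the code used by fribidi or xbmc; `naive_flip` and `is_ltr` are made-up names for illustration. It reverses a right-to-left line for display while keeping runs of Latin letters and digits in their original order.

```python
import unicodedata
from itertools import groupby

# Bidi classes kept in their original order: Latin letters and both kinds of digits.
LTR_CLASSES = {"L", "EN", "AN"}


def is_ltr(ch: str) -> bool:
    """True for characters that should keep left-to-right order."""
    return unicodedata.bidirectional(ch) in LTR_CLASSES


def naive_flip(line: str) -> str:
    """Convert one right-to-left line from logical (typed) to visual order."""
    # Split the line into alternating left-to-right and right-to-left runs.
    runs = ["".join(group) for _, group in groupby(line, key=is_ltr)]
    flipped = []
    for run in reversed(runs):
        # Left-to-right runs (e.g. "123") keep their internal order;
        # everything else is mirrored character by character.
        flipped.append(run if is_ltr(run[0]) else run[::-1])
    return "".join(flipped)
```

Note that nothing here looks at neighbouring letters to pick a letter shape, which is why this kind of flipping can be enough for hebrew but leaves arabic letters unjoined.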

if not, please correct me - i know virtually nothing about foreign languages, and am going purely from memory.

cheers,
jonathan
#12
thanks everyone =)
that has helped me a lot! i guess we are all waiting for fribidi to release final code so it can be implemented in xbmc =)
i'm gonna try that combo of software now to convert those subs ;)
thanks again!

ralph
#13
(jmarshall @ may 21 2005,18:54 Wrote:i believe that arabic requires certain letter combinations to be swapped around - not just stuff to be read from right to left, correct?

i also believe these character combinations are context-sensitive, in that they depend somewhat on both the word they're part of, and the sentence they're part of.
100% correct :)


if the developers need an arabic speaker, i would be happy to help ;)
#14
(jmarshall @ may 21 2005,18:54 Wrote:i believe that arabic requires certain letter combinations to be swapped around - not just stuff to be read from right to left, correct?

i also believe these character combinations are context-sensitive, in that they depend somewhat on both the word they're part of, and the sentence they're part of.

the bidi stuff swaps letters around and has a few simple rules as to what must stay in order, but for things that are context-sensitive, it's very hard without the program having a large arabic dictionary/rule set.  i therefore don't see how xbmc can really do much about this without the code knowing arabic!

if not, please correct me - i know virtually nothing about foreign languages, and am going purely from memory.

cheers,
jonathan
thanks for your interest in the problem.

i think the problem would be easy to solve by integrating another project into xbmc, like canphaz said: fribidi, or any other library that is compatible with arabic.

if you need support then i can help
#15
i found some useful sources for understanding how arabic is displayed.

from what i understand, arabic needs an extra function to be added after the bidi function that is shared between arabic and hebrew.

this function decides which glyph to use from the font.

all glyphs of a letter share the same letter code in iso-8859-6, but which glyph is chosen depends on the letter's position in the word.
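A minimal sketch of such a glyph-choosing function in Python, under several assumptions: it uses the Unicode "Arabic Presentation Forms-B" code points (not iso-8859-6) as the positional glyphs, it reads the letter-to-glyph table from Python's `unicodedata` module, and it expects text in logical (typed) order, so in this sketch it would run before any flipping; on already-flipped text the previous/next roles would be swapped. `build_forms` and `shape` are hypothetical names. Diacritics, tatweel, the lam-alef ligature and the extra farsi letters are not handled, and the font must actually contain the presentation-form glyphs.

```python
import unicodedata

ISOLATED, FINAL, INITIAL, MEDIAL = range(4)
FORM_TAGS = {"<isolated>": ISOLATED, "<final>": FINAL,
             "<initial>": INITIAL, "<medial>": MEDIAL}


def build_forms() -> dict:
    """Map each base letter to its positional glyphs, read from the Unicode database."""
    forms = {}
    for code in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B
        parts = unicodedata.decomposition(chr(code)).split()
        # Keep single-letter forms only; this skips ligatures such as lam-alef.
        if len(parts) == 2 and parts[0] in FORM_TAGS:
            base = chr(int(parts[1], 16))
            forms.setdefault(base, [None] * 4)[FORM_TAGS[parts[0]]] = chr(code)
    return forms


FORMS = build_forms()


def shape(text: str) -> str:
    """Pick a positional glyph for every letter; text must be in logical order."""
    out = []
    for i, ch in enumerate(text):
        glyphs = FORMS.get(ch)
        if glyphs is None:
            out.append(ch)  # not a letter in the table: copy it through
            continue
        prev = FORMS.get(text[i - 1]) if i > 0 else None
        nxt = FORMS.get(text[i + 1]) if i + 1 < len(text) else None
        # A letter joins its predecessor only if that one can connect forwards.
        joins_prev = bool(prev and prev[INITIAL] and glyphs[FINAL])
        joins_next = bool(nxt and nxt[FINAL] and glyphs[INITIAL])
        if joins_prev and joins_next and glyphs[MEDIAL]:
            out.append(glyphs[MEDIAL])
        elif joins_prev:
            out.append(glyphs[FINAL])
        elif joins_next:
            out.append(glyphs[INITIAL])
        else:
            out.append(glyphs[ISOLATED] or ch)
    return "".join(out)
```

For example, for the word kaf-alef-nun this picks the initial form of kaf, the final form of alef, and the isolated form of nun, because alef never connects to the letter that follows it.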

http://www.global-translation-services.c...iso-8859-6

http://lists.arabeyes.org/archives/core/...00132.html

http://ead.staatsbibliothek-berlin.de/20...arabic.pdf

http://www.leeds.ac.uk/acom/teaching/mlc...andout.pdf