Show originaltitle (international original movie title) of movies in an extra field?
#46
I am interested, definitely. I do not live in an English speaking country.

I feel that having something clear in mind might help any developer willing to look into it.
For troubleshooting and bug reporting please make sure you read this first (usually it's enough to follow instructions in the second post).
#47
This is already partially supported in Dharma and more support is in trunk.
The db can hold both title and originaltitle
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
#48
Hi,

First of all, my congrats to all the XBMC community, great development !!!

Probably I am doing something wrong, but I am also trying to get only English titles, with no success.

I am running 10.0 r35648 and IMDB 2.1.6. My preferred title language is set to "USA/International", but I am getting original titles, for example:

http://akas.imdb.com/title/tt0389557/ --> "Zwartboek" instead of "Black Book"
http://akas.imdb.com/title/tt0190332/ --> "Wo hu cang long" instead of "Crouching Tiger..."

Using "Keep Original" gives the same results, as does "Canada".

Any help?
#49
It turns out that the criteria used to determine if the film is "foreign" includes studios.
So if one of the film's production companies is from an English speaking country (which is undoubtedly the case for these movies), XBMC will determine the original title as international / US.
I have been told that is the best that can be done, altering these rules will create even more false positives in other areas.
#50
sho Wrote:This is already partially supported in Dharma and more support is in trunk.
The db can hold both title and originaltitle
Ok, so is it just a matter of skins giving visibility to the information stored?
#51
Sho, thanks for your prompt response.

Cheers
#52
sho Wrote:It turns out that the criteria used to determine if the film is "foreign" includes studios.
So if one of the film's production companies is from an English speaking country (which is undoubtedly the case for these movies), XBMC will determine the original title as international / US.
I have been told that is the best that can be done, altering these rules will create even more false positives in other areas.

I found that if in the file:

\XBMC\userdata\addon_data\metadata.imdb.com\settings.xml

The line

<setting id="akatitles" value="USA / International" />

is forced to only "USA":

<setting id="akatitles" value="USA" />

The number of true American Titles improves a lot.
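
For anyone who would rather script this edit than use a text editor, here is a minimal Python sketch. It only relies on the `akatitles` line quoted above. The `<settings>` root element and the second setting (`fanart`) in the sample are assumptions for illustration, not copied from a real settings.xml, and `set_aka_titles` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def set_aka_titles(settings_xml: str, value: str) -> str:
    """Return settings_xml with the akatitles setting changed to value."""
    root = ET.fromstring(settings_xml)
    for setting in root.iter("setting"):
        # Only touch the setting discussed in this post.
        if setting.get("id") == "akatitles":
            setting.set("value", value)
    return ET.tostring(root, encoding="unicode")


# Sample content: the root element and the "fanart" entry are assumed.
sample = (
    '<settings>'
    '<setting id="akatitles" value="USA / International" />'
    '<setting id="fanart" value="true" />'
    '</settings>'
)

updated = set_aka_titles(sample, "USA")
print(updated)
```

Keep a backup of the real file before trying anything like this, since the scraper may rewrite the value when its settings dialog is opened again.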

In a collection of 60 or so non-USA/UK movies, the standard "USA / International" setting gave me the original title (or worse, the Spanish title) instead of the American one for about 40~50% of them.

When forced to "USA", it was less than 10%. I also tried just "International", but there were no significant changes.

It may be a good idea to separate USA and International.

My movie collection is organized with each movie in its own folder, and by default there is a Movie.nfo in each folder with the IMDB URL. So I have no problem with false positives, but others may.

I also discovered that for these stubborn movies I can force the title in the same nfo file (see the Library Import Export section in the wiki).
#53
Example movies please.
#54
olympia Wrote:Example movies please.

I will send them in the next couple of days. To do so I have to re-scrape my collection with the two settings to identify the differences.

Something worth mentioning is that I have played for some time with Ember Media Manager Revisited (EMM-R) and, even though I have not made a complete test, EMM-R is able to present the American title in all the cases I tested.

It has an option to force the title language to one selected by the user. One of my stubborn movies is Crouching Tiger, Hidden Dragon (2000), and under no circumstances have I been able to make the XBMC IMDB scraper bring that title, except by forcing it in the movie.nfo file. EMM-R picks that one up. I will also send a comparison with the EMM-R scraper.

Cheers.
#55
Yeah, I just had a look at it, and because IMDB has changed their layout since the "preferred title language" part of the scraper was written, it is completely broken.

It would need a rewrite for very little gain. It probably didn't break just today, and you're the first one to notice it.

I will think about it, but this is very hard to maintain because:
- there is no common way in which IMDB names the titles, and there are many exceptions
- this part seems to change more often than other parts of the layout
#56
Hi Olympia,

Some examples,

Forcing the setting to "USA" with a text editor, I got:

9th Company
Black Book
Come and See
Mesrine: Killer Instinct
Mesrine: Public Enemy No. 1
Tae Guk Gi: The Brotherhood of War
The Barbarian Invasions
The Secret of the Grain

For the same movies with the standard "USA / International" I got:

9 rota
Zwartboek
Go and Look
L'instinct de mort
L'ennemi public n°1
Brotherhood
Les invasions barbares
Couscous

I tested EMM-R with the title language set to USA and had no problems.

I am not sure the scraper is broken; to me it seems to be working as designed. I am totally ignorant of the code, but my intuition tells me that the scraper goes to the akas.imdb page and looks for the title with the language tag chosen in the scraper settings.

Actually, I changed the setting to "English title" with a text editor because I saw this tag in akas.imdb, and yes, I got the English title, but it was a mess, since what is defined as the English title could be anything. For example, the movie "Almost Famous (2000)" becomes "Back in Those Days with Penny Lane" (the English title in Japan)!!!

As I mentioned, I am totally incompetent in the coding part, and in many more things about IMDB, and you probably know this already, but if you allow me, what I think EMM-R does is make a request to IMDB with an additional parameter.

No idea how they do this but I have my own account in IMDB and in my settings I have the following two site preferences set to:

Title display country: USA
Title display language: English

I am not sure if it is possible to simply add these parameters to the scraper. Maybe, as you say, it requires a complete rebuild.

For me it works great, but like everything it could be improved.

Decoupling "USA / International" into two settings should not be difficult (I think); the other part is a different story.

Thanks for your attention.
#57
You're right, the function works, I was looking at the wrong imdb page.

On the other hand, please stop comparing to EMM, as it does not allow scraping titles in foreign languages, so its job is a little bit easier. And EMM is EMM and XBMC is XBMC.

Please revert ALL your changes, down to the last bit, and test your movies again after deleting lines 175-180 of metadata.imdb.com/imdb.xml.
#58
olympia Wrote:You're right, the function works, I was looking at the wrong imdb page.

Please revert ALL your changes, down to the last bit, and test your movies again after deleting lines 175-180 of metadata.imdb.com/imdb.xml.

Hi Olympia,

Mixed results: it helped for some movies and not for others. My conclusion is that by going to AKAS it is impossible to get 100% accurate results (for me an accurate result is getting the title displayed at http://www.imdb.com/title/ttxxxxxx), though maybe my criteria are wrong.

Ideally, in my view, and after analyzing the countries and AKAs of several movies in my collection, the logic should be something like:

1. If the movie's country is in (USA, UK, Canada and alike), then Title = Original Title.

2. If there is a country not in the above list, then look for the AKA title tagged as USA.

3. If there is no AKA title tagged as USA, then take the one with the International tag.

I am incapable of doing the above because I have no idea of regex and XML (I am impressed that there are no "if"s and "then"s in the imdb.xml file). In any case it is not perfect: it would fit 95% of the movies, but take the following 4 exception examples:

Code:
*** Das Experiment (2001) (Germany)

The title displayed on www.imdb.com is Das Experiment and the country is not USA or alike, but there is no way to get that title from the AKAS list below.

AKAS:

"The Experiment" - International (English title) (literal title), USA (video title)
"El experimento" - Argentina, Peru, Spain
"A Experiência" - Brazil, Portugal
"L'expérience" - Canada (French title), France
"Эксперимент" - Russia
"A kísérlet" - Hungary
"Deney" - Turkey (Turkish title)
"Eksperiment" - Croatia (imdb display title)
"Eksperiment" - Serbia (imdb display title)
"Eksperimentet" - Norway
"Eksperimentet" - Denmark
"Eksperyment" - Poland
"Es" - Japan
"Experiment - Ihmiskoe" - Finland (TV title)
"Experimentet" - Sweden
"L'expériment" - Canada (French title) (DVD title)
"O Experimento" - Brazil (festival title)
"The Experiment" - Italy
"The Experiment - Cercasi cavie umane" - Italy (imdb display title)
"To peirama" - Greece (transliterated ISO-LATIN-1 title)


*** Walk the Line (2005) (USA, Germany)

Germany is in the country list, the USA AKA is not the American/International title (it is not the one displayed at www.imdb.com), and the International tag corresponds to the Spanish title.

AKAS:

Walk the Line    Canada (French title) / Finland / France / Germany / Greece
Johnny & June - Pasión y locura    Argentina / International (Spanish title) / Venezuela
A nyughatatlan    Hungary
Cash    USA (original script title)
En la cuerda floja    Spain
Hod po rubu    Croatia
Johnny & June    Brazil
Nagu noateral    Estonia
Quando l'amore brucia l'anima    Italy
Sinirlari asmak    Turkey (Turkish title)
Spacer po linie    Poland
Wôku za rain: Kimi ni tsuzuku michi    Japan

*** The Good, The Bad, The Weird (2008) (South Korea)

There are two International tags that also have "English title"; the scraper returns the wrong one: "Nom Nom Nom".

AKAS:

The Good, the Bad, the Weird    International (English title) / Sweden (imdb display title)
Хороший, плохой, долбанутый    Russia
A jó, a rossz és a furcsa    Hungary (imdb display title)
Dobar, los, cudan    Croatia (imdb display title)
Dobry, zly i zakrecony    Poland
El bueno, el malo y el raro    Spain (imdb display title)
El bueno, el malo, el loco    Argentina (festival title)
Good Bad Weird    Japan (English title)
Hea, halb ja kummaline    Estonia (alternative title)
Hodný, zlý a divný    Czech Republic
Le bon, la brute et le cinglé    France
Nom Nom Nom    International (informal short title) (English title)
O kalos, o kakos kai o paraxenos    Greece (festival title)
O kalos, o kakos ki o periergos    Greece (transliterated ISO-LATIN-1 title)
Os Invisíveis    Brazil (DVD title)

*** Shinjuku Incident (2009) (Hong Kong)

The country is not USA, but the USA AKA is not correct, so my ideal algorithm does not work here either.

AKAS:

Jackie Chan in Shinjuku Incident    USA (dubbed version)
Kanli hesaplasma    Turkey (Turkish title)
La Vendetta del Dragone    Italy
Leszámolás Tokióban    Hungary (imdb display title)
O ektelestis tis Yakuza    Greece (DVD title)
Shinjuku Incident    International (English title)
Shinjuku Incident: Guerre Des Gangs à Tokyo    France (DVD title)
Stadt der Gewalt    Germany (DVD title)
Xin Su shi jian    Hong Kong (Mandarin title)
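
To make the three-step logic proposed earlier in this post concrete, here is a rough Python sketch. It is not how the scraper works (imdb.xml is regex-based), just the rule written out. Assumptions: the list of English-speaking countries is my own guess at "and alike", step 1 is read as "all countries are in the list", the AKA tags are passed as bare country names with qualifiers such as "(dubbed version)" already dropped, and the final fallback to the original title is not covered by the steps. `pick_title` is a hypothetical name.

```python
# Assumed list; the post leaves "and alike" open.
ENGLISH_SPEAKING = {"USA", "UK", "Canada"}


def pick_title(original_title, countries, akas):
    """Pick a display title following the three proposed steps.

    akas is a list of (title, tags) pairs, where tags are bare country or
    region names with qualifiers such as "(DVD title)" already dropped.
    """
    # Step 1: every production country is English speaking.
    if all(country in ENGLISH_SPEAKING for country in countries):
        return original_title
    # Step 2, then step 3: first AKA tagged USA, else first tagged International.
    for wanted in ("USA", "International"):
        for title, tags in akas:
            if wanted in tags:
                return title
    # Fallback (an assumption, the steps do not cover this case).
    return original_title


# Two of the exceptions listed above, with a subset of their AKAs.
shinjuku = pick_title(
    "Shinjuku Incident", ["Hong Kong"],
    [("Jackie Chan in Shinjuku Incident", ["USA"]),
     ("Shinjuku Incident", ["International"])],
)
walk_the_line = pick_title(
    "Walk the Line", ["USA", "Germany"],
    [("Walk the Line", ["Canada", "Finland", "France", "Germany", "Greece"]),
     ("Cash", ["USA"])],
)
print(shinjuku)
print(walk_the_line)
```

With this data the sketch picks the dubbed-version title for Shinjuku Incident and "Cash" for Walk the Line, which is exactly why both are listed as exceptions.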

Is there a way to search for the title at http://www.imdb.com (or Google) instead of akas.imdb.com and take the displayed title, in order to get the American/International title?

My second option would be to let the user choose to get the title from TheMovieDB.com in addition to the fanart (a recommended option for those looking for the American/International title).

In my collection the TheMovieDB titles are perfect except for "Das Experiment", where I also got "The Experiment". I prefer the movie info provided by IMDB, so, as IMDB is unpredictable, I will keep this scraper with "Keep Original" and force the foreign titles with the movie.nfo file. It would be nice if other approaches such as the ones suggested were taken into consideration.

Again, thank you very much for your attention and sorry for the length of this note.
#59
There are more things to take into consideration when you design scraper logic. One (and one of the most important) is speed. If you fetch info from different pages without thinking, it will slow down scraping like hell. The IMDB scraper uses akas because that page is already cached. If it used another page, that would be an additional page to open. There is no way I can be convinced to do that. Given the chaos IMDB creates with AKA titles (as you discovered as well), I consider a 5% failure rate to be a good result already. If we can make it better within the given page, I am up for it.

As you probably noticed, if you delete the lines I suggested, the country bit gets disregarded, but the international title wins over the USA title (in most cases), so the current logic is the opposite of the one you drafted.

Going to www instead of akas is a no-go too, because then it is unpredictable which server IMDB drops you on, and there can be differences between servers. Also, in some countries it is not always the International title that gets displayed at the top, but the local title (iirc, but I'm pretty sure there was some issue with that).

The other thing is that later on there will be another user with another set of foreign movies, where the same logic might not work as well. So this is not easy.
#60
Hi Olympia,

I see, I was not aware of these things. And I agree, the IMDB scraper is really fast and this is something that has to be kept.

I said that the "patch" yielded mixed results. What I forgot to say (sorry) is that it was much better for movies from non-USA (or non-English-speaking) countries but much worse for movies from the USA or alike countries.

Take a look below at some examples of USA (or alike) movies with lines 175-180 deleted and the setting at "USA / International". The first part is the American title; the second part, after ==>, is the title returned by the scraper together with the AKAS tag that made the scraper choose it.

There are really strange things. For "The Shawshank Redemption" and "Due Date" the scraper returns the Ukrainian name, I guess because of the "UK" in Ukraine. Other cases, like the 3rd or 4th one, are really strange too: the USA or International names can be in Spanish or another language.

Code:
The Shawshank Redemption (1994) ==> "Втеча з Шоушенка" - Ukraine

Due Date (2010) ==> "Встигнути до" Ukraine

Monty Python and the Holy Grail (1975) ==> "Mønti Pythøn ik den Høli Gräilen" - International

Office Space (1999) ==> "Cubiculos de la oficina" - USA (Spanish title)

Harry Potter and the Deathly Hallows: Part 1 (2010) ==> "The Deathly Hallows" - USA (short title)

Interstate 60: Episodes of the Road (2002) ==> "I-60" USA (promotional abbreviation)

Almost Famous (2000) ==> "Untitled: Almost Famous the Bootleg Cut" USA (director's cut (DVD title))

Arsenic and Old Lace (1944) ==> "Frank Capra's 'Arsenic and Old Lace" USA (complete title)

In Cold Blood (1967) ==> "Truman Capote's In Cold Blood" USA (complete title)

Magnolia (1999) ==> "Mag·no'li·a" USA (Promotional title)

North by Northwest (1959) ==> "Alfred Hitchcock's North by Northwest" - UK (complete title), USA (complete title)

Precious (2009) ==> "Precious (Base on Nol by Saf) (Based on the Novel 'Push' by Sapphire)" - USA (complete title)

Rear Window (1954) ==> "Alfred Hitchcock's Rear Window" - USA (complete title)

The Bad Lieutenant: Port of Call - New Orleans (2009) ==> "Bad Lieutenant" - UK

The Man from Earth (2007) ==> "Jerome Bixby's The Man from Earth" - USA (complete title)
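
If the guess about "UK" matching inside "Ukraine" is right, the difference would look something like the Python sketch below. This is not the scraper's actual expression (that lives in imdb.xml, which I have not reproduced here); `loose_match` and `strict_match` are hypothetical helpers that only contrast a case-insensitive substring test with a whole-word test on the country tag.

```python
import re


def loose_match(country, tag):
    """Case-insensitive substring test, the kind of match the guess implies."""
    return re.search(re.escape(country), tag, re.IGNORECASE) is not None


def strict_match(country, tag):
    """Match the country only as a whole word at the start of the tag."""
    return re.match(rf"{re.escape(country)}\b", tag) is not None


# Tags in the style of the AKAS entries listed above.
tags = ["Ukraine", "UK", "UK (complete title)", "USA (short title)"]
for tag in tags:
    print(tag, loose_match("UK", tag), strict_match("UK", tag))
```

The loose test accepts "Ukraine" as a "UK" tag, while the strict one only accepts tags that begin with the whole word "UK".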

With the new configuration, and the big mess in the IMDB akas, I have to take care of all movies, USA and non-USA; with the old one, just the non-USA ones. I prefer the old method, forcing the setting to "USA", or else "Keep Original" and forcing the title with the Movie.nfo file, but only for non-English movies. This may mean more work, but it ensures that things keep working after any change in the scraper or IMDB (hopefully).

The 95% accuracy is achieved (in theory, because I can't test it) with the algorithm I mentioned, not with this patch.

I am aware that my movie collection may differ from other users' in terms of AKAS messiness. I am just trying to help and, as you may note, my comments are based on very detailed and extensive data.

I have heard that in some countries http://www.imdb.com returns the title according to the country of the IP address. This was the case for a while here in Mexico but not anymore; maybe in other countries it still is. In case of doubt, I agree this is not an option.

Is taking the title from TheMovieDB, while keeping IMDB's original title, not an option? My understanding is that the TheMovieDB page is visited in any case for the fanart (if the user checks this option).

Last but not least, something very good about the IMDB scraper is its support. Thanks again for listening.