Sho, thanks for your prompt response.
Cheers
Show originaltitle (international original movie title) of movies in an extra field?
carmatana
Junior Member Posts: 37 Joined: Jan 2011 Reputation: 0 |
2011-01-17 18:49
Post: #51
carmatana
Junior Member Posts: 37 Joined: Jan 2011 Reputation: 0 |
2011-05-12 04:04
Post: #52
sho Wrote: It turns out that the criteria used to determine if the film is "foreign" includes studios.

I found that if, in the file

\XBMC\userdata\addon_data\metadata.imdb.com\settings.xml

the line

<setting id="akatitles" value="USA / International" />

is forced to only "USA":

<setting id="akatitles" value="USA" />

the number of true American titles improves a lot. In a collection with 60 or so non-USA/UK movies, the standard "USA / International" setting gave me the original title (or worse, the Spanish title) instead of the American one for about 40~50% of them. When forced to "USA", it was less than 10%. I also tried just "International", but there were no significant changes. It may be a good idea to separate USA and International.

My movie collection is organized with each movie in its own folder, and by default there is a Movie.nfo in each folder with the imdb url. So I have no problem with false positives, although others may.

I also discovered that for these stubborn movies I can force the title in the same nfo file (see the Library Import Export section in the wiki).
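For anyone who prefers not to edit the file by hand, the change described above could be scripted. This is only a minimal sketch: it works on an in-memory sample string, and both the `<settings>` root element and the second `fanart` entry are assumptions made for illustration; only the `akatitles` line comes from this post.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; only the "akatitles" line is taken from the post.
# The <settings> root and the "fanart" entry are assumed for illustration.
SAMPLE_SETTINGS = """<settings>
    <setting id="akatitles" value="USA / International" />
    <setting id="fanart" value="true" />
</settings>"""


def set_aka_titles(settings_xml: str, new_value: str) -> str:
    """Return the settings XML with the 'akatitles' value replaced."""
    root = ET.fromstring(settings_xml)
    for setting in root.iter("setting"):
        if setting.get("id") == "akatitles":
            setting.set("value", new_value)
    return ET.tostring(root, encoding="unicode")


updated = set_aka_titles(SAMPLE_SETTINGS, "USA")
print(updated)
```

To apply this to the real file you would read and write settings.xml instead of the sample string, and keep a backup first, since the scraper settings dialog may overwrite the value again.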
olympia
Team-XBMC Member Joined: May 2008 Reputation: 30 |
2011-05-12 07:25
Post: #53
Example movies please.
ASUS P5N7A-VM - Intel C2D E7300 - 2GB RAM - Silverstone GD02 Black - MS MCE Remote - Patriot Warp 32GB SSD Drive |
carmatana
Junior Member Posts: 37 Joined: Jan 2011 Reputation: 0 |
2011-05-12 07:53
Post: #54
olympia Wrote: Example movies please.

I will send them in the next couple of days. To do so I have to re-scrape my collection with the two settings to identify the differences.

Something worth mentioning is that I have spent some time with Ember Media Manager - Revisited (EMM-r) and, even though I have not done a complete test, EMM-r was able to present the American title in all the cases I tested. It has an option to force the title language to one selected by the user.

One of my stubborn movies is Crouching Tiger, Hidden Dragon (2002). Under no circumstances have I been able to make the xbmc imdb scraper bring back that title, except by forcing it in the movie.nfo file. EMM-r picks that one. I will also send a comparison with the EMM-r scraper.

Cheers.
olympia
Team-XBMC Member Joined: May 2008 Reputation: 30 |
2011-05-12 11:58
Post: #55
Yeah, I just had a look at it, and because imdb has changed their layout since the "preferred title language" part of the scraper was written, it is completely broken.
It would need a rewrite for very little gain. It probably didn't break just today, and you're the first one who noticed it. I will think about it, but this is very hard to maintain because:
- there is no common way in which imdb names the titles; there are many exceptions
- it seems to change more often than other parts of the layout
carmatana
Junior Member Posts: 37 Joined: Jan 2011 Reputation: 0 |
2011-05-13 05:15
Post: #56
Hi Olympia,
Some examples. Forcing the setting to "USA" with a text editor I got:

9th Company
Black Book
Come and See
Mesrine: Killer Instinct
Mesrine: Public Enemy No. 1
Tae Guk Gi: The Brotherhood of War
The Barbarian Invasions
The Secret of the Grain

For the same movies with the standard "USA / International" I got:

9 rota
Zwartboek
Go and Look
L'instinct de mort
L'ennemi public n°1
Brotherhood
Les invasions barbares
Couscous

I tested EMM-R with the title language set to USA and had no problems.

I am not sure the scraper is broken; to me it seems that it is working as designed. I am totally ignorant of the code, but my intuition tells me that the scraper goes to the akas-imdb page and looks for the title with the language tag chosen in the scraper. Actually, I changed the setting to "English title" with a text editor because I saw this tag in akas.imdb, and yes, I got the English title, but it was a mess, since what is defined as the English title could be anything. For example, the movie "Almost Famous (2000)" becomes "Back in Those Days with Penny Lane" (the English title in Japan)!!!

As I mentioned, I am totally incompetent in the coding part, and in many more things about IMDB, and you probably know this already, but if you allow me: what I think EMM-R does is make a request to IMDB with an additional parameter. I have no idea how they do this, but I have my own account in IMDB and in my settings I have the following two site preferences:

Title display country: USA
Title display language: English

I am not sure if it is possible to simply add these parameters to the scraper. Maybe, as you say, it requires a complete rebuild. For me it works great, but like everything it could be improved. Decoupling "USA / International" into two settings should not be difficult (I think); the other part is a different story.

Thanks for your attention.
(This post was last modified: 2011-05-13 05:18 by carmatana.)
olympia
Team-XBMC Member Joined: May 2008 Reputation: 30 |
2011-05-13 10:05
Post: #57
You're right, the function works, I was looking at the wrong imdb page.
On the other hand, please stop comparing to EMM: it does not allow scraping titles in foreign languages, so its job is a little bit easier, and EMM is EMM and XBMC is XBMC. Please revert ALL your changes, down to the last bit, and then test your movies after deleting lines 175-180 of metadata.imdb.com/imdb.xml.
carmatana
Junior Member Posts: 37 Joined: Jan 2011 Reputation: 0 |
2011-05-15 03:53
Post: #58
olympia Wrote: You're right, the function works, I was looking at the wrong imdb page.

Hi Olympia,

Mixed results: it helped for some movies and not for others. But my conclusion is that by going to AKAS it is impossible to get 100% accurate results (for me an accurate result is the title displayed in http://www.imdb.com/title/ttxxxxxx) - maybe I am wrong in my criteria.

Ideally, in my opinion and after analyzing the countries and akas of several movies in my collection, the logic should be something like:

1. If the movie's country is in (USA, UK, Canada and alike), then Title = Original Title.
2. If the country is not in the above list, then look for the AKA title tagged as USA.
3. If there is no AKA title tagged as USA, then take the one with the International tag.

I am incapable of doing the above because I have no idea of regex and xml (I am impressed that there are no "if"s and "then"s in the imdb.xml file). In any case it is not perfect: it would fit 95% of the movies, but take the following 4 exception examples:

Code:
*** Das Experiment (2001) (Germany)

Is there a way to search for the title in http://www.imdb.com (or google) instead of akas.imdb.com and take the displayed title, in order to get the American/International title?

My second option would be to let the user choose, in addition to getting fanart from TheMovieDB.com, to get the title from there as well (a recommended option for those looking for the American/International title). In my collection the TheMovieDB titles are perfect except for the "Das Experiment" movie, where I also got "The Experiment".

I prefer the movie info provided by IMDB, so I will keep this scraper set to "Keep Original", as IMDB is unpredictable, and force the foreign titles with the movie.nfo file. But it would be nice if other approaches like the ones suggested were taken into consideration.

Again, thank you very much for your attention, and sorry for the length of this note.
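The three-step logic proposed in this post can be sketched in Python as follows. This is not how the XBMC scraper works (it is regex-driven XML); it is only an illustration of the rule. The names `ENGLISH_SPEAKING` and `choose_title` are hypothetical, the country list is an assumed reading of "USA, UK, Canada and alike", the final fallback to the original title is an addition not stated in the post, and the sample data is hand-written rather than scraped from IMDB.

```python
from typing import Dict, List

# Assumed reading of "USA, UK, Canada and alike"; extend as needed.
ENGLISH_SPEAKING = {"USA", "UK", "Canada"}


def choose_title(original_title: str, countries: List[str], akas: Dict[str, str]) -> str:
    """Pick a display title following the three proposed steps."""
    # Step 1: a movie from an English-speaking country keeps its original title.
    if any(country in ENGLISH_SPEAKING for country in countries):
        return original_title
    # Step 2: otherwise prefer the AKA tagged exactly "USA".
    if "USA" in akas:
        return akas["USA"]
    # Step 3: otherwise fall back to the AKA tagged "International".
    if "International" in akas:
        return akas["International"]
    # Not in the post: with no usable AKA, keep the original title.
    return original_title


# Hand-written sample data, for illustration only.
print(choose_title("9 rota", ["Russia"], {"USA": "9th Company"}))
print(choose_title("Example Original", ["France"], {"International": "Example International"}))
print(choose_title("Almost Famous", ["USA"], {"Japan": "Back in Those Days with Penny Lane"}))
```

Because the tags are looked up as exact dictionary keys, a tag from another country can never be picked by accident; co-productions that list both an English-speaking and a non-English-speaking country are resolved by step 1 in this sketch, which is one of several possible readings.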
(This post was last modified: 2011-05-15 04:30 by carmatana.)
olympia
Team-XBMC Member Joined: May 2008 Reputation: 30 |
2011-05-15 07:45
Post: #59
There are more things to take into consideration when you design a scraper's logic. One (and one of the most important) is speed. If you fetch info from different pages without thinking, it will slow down scraping like hell. So the IMDB scraper uses akas, because that page is already cached. If it used another page, that would be an additional page to open. There is no way I can be convinced to do that. Given the chaos IMDB makes of the AKA titles (as you discovered as well), I consider a 5% failure rate to be a good result already. If we can make it better within the given page, I am up for it.
As you probably noticed, if you delete the lines I suggested, the country bit gets disregarded, but the international title wins over the USA title (in most cases), so the current logic is the opposite of the one you drafted. Going to www instead of akas is a no-go too, because then it becomes unpredictable which server IMDB drops you on, and there can be differences between servers. Also, in some countries it is not always the International title that gets displayed at the top, but the local title (iirc - but I am pretty sure there was some issue with that). The other thing is that later on there will be another user with another set of foreign movies, where the same logic might not work as well. So this is not easy.
carmatana
Junior Member Posts: 37 Joined: Jan 2011 Reputation: 0 |
2011-05-15 18:22
Post: #60
Hi Olympia,
I see, I was not aware of these things. And I agree, the IMDB scraper is really fast and this is something that has to be kept.

I said that the "patch" yielded mixed results. What I forgot to say (sorry) is that it was much better for non-USA (or non-English-speaking) movies but much worse for movies from the USA or similar countries. Take a look below at some examples of USA (or alike) movies with lines 175-180 deleted and the setting at "USA / International". The first part is the American title; the second part, after ==>, is the title returned by the scraper and the tag in AKAS that made the scraper choose it. There are really strange things: for "The Shawshank Redemption" and "Due Date" the scraper returns the Ukrainian name, I guess because of the "UK" in Ukraine. Other cases, like the 3rd or 4th one, are really strange too; the USA or International names could be in Spanish or another language.

Code:
The Shawshank Redemption (1994) ==> "Втеча з Шоушенка" - Ukraine

With the new configuration, and the big mess in the IMDB akas, I have to take care of all movies, USA and non-USA; with the old one, only the non-USA ones. I prefer the old method, forcing the setting to "USA". Or setting it to "Keep Original" and forcing the title with the Movie.nfo file, but only for non-English movies; this may mean more work but ensures that it will keep working after any change in the scraper or in IMDB (hopefully).

The 95% accuracy would be achieved (in theory, because I can't test it) with the algorithm I mentioned, not with this patch. I am aware that my movie collection may differ from other users' in terms of AKAS "messiness"; I am just trying to help, and as you may note my comments are based on very detailed and extensive data.

I have heard that in some countries http://www.imdb.com returns the title according to the country of the ip. This was the case for a while here in Mexico but not anymore; maybe in other countries this is still the case. But in case of doubt, I agree this is not an option.
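The guess above about "UK" matching inside "Ukraine" is easy to reproduce in isolation. The sketch below does not use the scraper's real regular expression, which is not shown in this thread; `naive_match` and `exact_match` are hypothetical helpers that only illustrate how a substring test on a country tag could produce this collision and how matching the whole tag would avoid it.

```python
import re


def naive_match(tag: str, wanted: str) -> bool:
    """Substring test: 'UK' is found inside 'Ukraine'."""
    return wanted.lower() in tag.lower()


def exact_match(tag: str, wanted: str) -> bool:
    """Whole-tag test: the tag must be the wanted country and nothing else."""
    pattern = rf"\s*{re.escape(wanted)}\s*"
    return re.fullmatch(pattern, tag, flags=re.IGNORECASE) is not None


for tag in ("UK", "Ukraine", "USA"):
    print(tag, naive_match(tag, "UK"), exact_match(tag, "UK"))
```

If the scraper's pattern behaves like the substring test, anchoring it to the full tag would be one possible fix, though that is an assumption until someone checks the actual expression in imdb.xml.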
Is taking the title from TheMovieDB, while keeping IMDB's original title, not an option? My understanding is that the TheMovieDB page is visited in any case for the fanart (if the user checks this option). Last but not least, something very good about the IMDB scraper is its support - thanks again for listening.
(This post was last modified: 2011-05-15 18:28 by carmatana.)
