Kodi Community Forum

Full Version: Scraping doesn't work with file names containing foreign characters
Hello

I'm using XBMC 12.2 (Release May 2 2013) on a Mac mini with OS X 10.8.5
My language settings are German (both OS and XBMC).
All my media files (mkv only) are located on an SMB share and named "title (year).mkv". Some titles contain foreign characters (e.g. German umlauts such as ö, ä, ü).
I'm using the themoviedb.org scraper (set to German).

When scraping titles that contain German umlauts, the scraping process fails with the following message: "Unable to connect remote server. Would you like to continue scanning?"
After renaming the files to use plain characters (e.g. ü=u, ö=o, ä=a), scraping works. Because themoviedb has all movie titles with the correct foreign characters, XBMC shows them correctly once scraping has finished (database mode). But the files themselves are still named "wrongly".
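
The renaming workaround described above could be scripted roughly as follows. This is only a sketch: the function name `ascii_name` and the extra `ß` mapping are assumptions made for illustration, and it only computes the new name without touching any files. The name is normalized to NFC first, because some file systems store "ö" as "o" plus a combining diaeresis.

```python
import unicodedata

# Plain-character replacements as described in the post (ü=u, ö=o, ä=a).
# The upper-case variants and "ß" are added here as an assumption.
UMLAUT_MAP = str.maketrans({
    "ä": "a", "ö": "o", "ü": "u",
    "Ä": "A", "Ö": "O", "Ü": "U",
    "ß": "ss",
})


def ascii_name(filename: str) -> str:
    """Return the filename with German umlauts replaced by plain letters."""
    # Compose "o" + combining diaeresis into a single "ö" before mapping.
    composed = unicodedata.normalize("NFC", filename)
    return composed.translate(UMLAUT_MAP)


if __name__ == "__main__":
    print(ascii_name("Königreich der Himmel (2005).mkv"))
```

Since the scraper returns the correctly spelled title anyway, only the file name on disk loses its umlauts.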

This issue doesn't seem to be new: I found reports of it in several forums, but it was never fixed.
Here is an example: http://forum.xbmc.org/showthread.php?tid=175654

Today I wanted to reproduce this issue to generate some debug logs for this post and found some interesting behaviour. When I change the UI language of XBMC from German to English, scraping works correctly, even with German umlauts!

Below are three log excerpts from different tests for the movie "Kingdom of Heaven":

Test 1: Scraping the filename "Königreich der Himmel (2005).mkv" with the XBMC UI set to English (works)
Code:
19:40:32 T:2957651968   DEBUG: GetMovieId (smb://ad-server.intra/movies/test/Königreich der Himmel (2005).mkv), query = select idMovie from movie where idFile=702
19:40:32 T:2957651968   DEBUG: VideoInfoScanner: No NFO file found. Using title search for 'smb://ad-server.intra/movies/test/Königreich der Himmel (2005).mkv'
19:40:32 T:2957651968   DEBUG: FindMovie: Searching for 'Königreich der Himmel' using The Movie Database scraper (path: '/Applications/XBMC.app/Contents/Resources/XBMC/addons/metadata.themoviedb.org', content: 'movies', version: '3.7.3')
19:40:32 T:2957651968   DEBUG: scraper: CreateSearchUrl returned <url>http://api.themoviedb.org/3/search/movie?api_key=57983e31fb435df4df77afb854740ea9&amp;query=k%c3%b6nigreich%20der%20himmel&amp;year=2005&amp;language=de</url>
19:40:32 T:2957651968   DEBUG: CurlFile::Open(0x3e29640) http://api.themoviedb.org/3/search/movie?api_key=57983e31fb435df4df77afb854740ea9&query=k%c3%b6nigreich%20der%20himmel&year=2005&language=de
19:40:32 T:2957651968    INFO: easy_aquire - Created session to http://api.themoviedb.org
... (snip)
Code:
19:40:32 T:2957651968   DEBUG: scraper: GetSearchResults returned <results><entity><title>Königreich der Himmel</title><id>1495</id><year>2005</year><url cache="tmdb-de-1495.json">http://api.themoviedb.org/3/movie/1495?api_key=57983e31fb435df4df77afb854740ea9&amp;language=de</url></entity><entity><title>Kingdom of Heaven</title><id>1495</id><year>2005</year><url cache="tmdb-de-1495.json">http://api.themoviedb.org/3/movie/1495?api_key=57983e31fb435df4df77afb854740ea9&amp;language=de</url></entity></results>
19:40:32 T:2957651968   DEBUG: GetVideoDetails: Reading movie 'http://api.themoviedb.org/3/movie/1495?api_key=57983e31fb435df4df77afb854740ea9&language=de' using The Movie Database scraper (file: '/Applications/XBMC.app/Contents/Resources/XBMC/addons/metadata.themoviedb.org', content: 'movies', version: '3.7.3')
19:40:32 T:2957651968   DEBUG: CurlFile::Open(0x3a30590) http://api.themoviedb.org/3/movie/1495?api_key=57983e31fb435df4df77afb854740ea9&language=de
19:40:33 T:2957651968   DEBUG: scraper: GetDetails returned <details><id>tt0320661</id><chain function="GetTMDBTitleByIdChain">1495</chain><originaltitle>Kingdom of Heaven</originaltitle>...

Test 2: Scraping the filename "Königreich der Himmel (2005).mkv" with the XBMC UI set to German (doesn't work)
Code:
19:51:52 T:2957119488   DEBUG: GetMovieId (smb://ad-server.intra/movies/test/Königreich der Himmel (2005).mkv), query = select idMovie from movie where idFile=702
19:51:52 T:2957119488   DEBUG: VideoInfoScanner: No NFO file found. Using title search for 'smb://ad-server.intra/movies/test/Königreich der Himmel (2005).mkv'
19:51:52 T:2957119488   DEBUG: FindMovie: Searching for 'Königreich der Himmel' using The Movie Database scraper (path: '/Applications/XBMC.app/Contents/Resources/XBMC/addons/metadata.themoviedb.org', content: 'movies', version: '3.7.3')
19:51:52 T:2957119488   DEBUG: scraper: CreateSearchUrl returned <url>http://api.themoviedb.org/3/search/movie?api_key=57983e31fb435df4df77afb854740ea9&amp;query=kÃ%b6nigreich%20der%20himmel&amp;year=2005&amp;language=de</url>
19:51:52 T:2957119488   DEBUG: CurlFile::Open(0x5009a60) http://api.themoviedb.org/3/search/movie?api_key=57983e31fb435df4df77afb854740ea9&query=kÃ%b6nigreich%20der%20himmel&year=2005&language=de
19:51:52 T:2957119488    INFO: easy_aquire - Created session to http://api.themoviedb.org
19:51:52 T:2957119488 WARNING: FillBuffer: curl failed with code 22
19:51:52 T:2957119488   ERROR: CCurlFile::CReadState::Open, didn't get any data from stream.
19:51:52 T:2957119488   ERROR: Run: Unable to parse web site

Test 3: Scraping the filename "Konigreich der Himmel (2005).mkv" with the XBMC UI set to German (works)
Code:
19:54:39 T:2957651968   DEBUG: VideoInfoScanner: No NFO file found. Using title search for 'smb://ad-server.intra/movies/test/Konigreich der Himmel (2005).mkv'
19:54:39 T:2957651968   DEBUG: FindMovie: Searching for 'Konigreich der Himmel' using The Movie Database scraper (path: '/Applications/XBMC.app/Contents/Resources/XBMC/addons/metadata.themoviedb.org', content: 'movies', version: '3.7.3')
19:54:39 T:2957651968   DEBUG: scraper: CreateSearchUrl returned <url>http://api.themoviedb.org/3/search/movie?api_key=57983e31fb435df4df77afb854740ea9&amp;query=konigreich%20der%20himmel&amp;year=2005&amp;language=de</url>
19:54:39 T:2957651968   DEBUG: CurlFile::Open(0x1001f300) http://api.themoviedb.org/3/search/movie?api_key=57983e31fb435df4df77afb854740ea9&query=konigreich%20der%20himmel&year=2005&language=de
19:54:39 T:2957651968    INFO: easy_aquire - Created session to http://api.themoviedb.org
...(snip)
Code:
19:54:40 T:2957651968   DEBUG: scraper: GetSearchResults returned <results><entity><title>Königreich der Himmel</title><id>1495</id><year>2005</year><url cache="tmdb-de-1495.json">http://api.themoviedb.org/3/movie/1495?api_key=57983e31fb435df4df77afb854740ea9&amp;language=de</url></entity><entity><title>Kingdom of Heaven</title><id>1495</id><year>2005</year><url cache="tmdb-de-1495.json">http://api.themoviedb.org/3/movie/1495?api_key=57983e31fb435df4df77afb854740ea9&amp;language=de</url></entity></results>
19:54:40 T:2957651968   DEBUG: GetVideoDetails: Reading movie 'http://api.themoviedb.org/3/movie/1495?api_key=57983e31fb435df4df77afb854740ea9&language=de' using The Movie Database scraper (file: '/Applications/XBMC.app/Contents/Resources/XBMC/addons/metadata.themoviedb.org', content: 'movies', version: '3.7.3')
19:54:40 T:2957651968   DEBUG: CurlFile::Open(0x1065fc50) http://api.themoviedb.org/3/movie/1495?api_key=57983e31fb435df4df77afb854740ea9&language=de
19:54:40 T:2957651968   DEBUG: scraper: GetDetails returned <details><id>tt0320661</id><chain function="GetTMDBTitleByIdChain">1495</chain><originaltitle>Kingdom of Heaven</originaltitle>...
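
Comparing the CurlFile::Open lines of Test 1 and Test 2, the only difference is the encoding of the "ö": `k%c3%b6nigreich` with the English UI versus `kÃ%b6nigreich` with the German UI. In UTF-8, "ö" is the two bytes 0xC3 0xB6, so in the failing case the first byte was passed through unescaped. Curl code 22 means the server answered with an HTTP error status. The sketch below reproduces both URLs under one assumption that the logs do not confirm: that the URL encoder decides per byte whether it is "alphanumeric", and that this check treats 0xC3 as the Latin-1 letter "Ã" when the German locale is active. It is an illustration of that hypothesis, not XBMC's actual code, and both function names are made up.

```python
# Characters that never need escaping in a URL query (RFC 3986 "unreserved").
UNRESERVED = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~"


def encode_locale_independent(text: str) -> str:
    """Percent-encode every UTF-8 byte outside the fixed ASCII unreserved set."""
    out = []
    for byte in text.encode("utf-8"):
        if byte in UNRESERVED:
            out.append(chr(byte))
        else:
            out.append(f"%{byte:02x}")
    return "".join(out)


def encode_latin1_alnum(text: str) -> str:
    """Hypothetical buggy encoder: the alphanumeric test sees Latin-1 letters."""
    out = []
    for byte in text.encode("utf-8"):
        as_latin1 = chr(byte)  # byte value read as a Latin-1 character
        if as_latin1.isalnum() or as_latin1 in "-_.~":
            out.append(as_latin1)  # 0xC3 is the letter "Ã", so it is kept raw
        else:
            out.append(f"%{byte:02x}")
    return "".join(out)


if __name__ == "__main__":
    query = "königreich der himmel"
    print(encode_locale_independent(query))
    print(encode_latin1_alnum(query))
```

If this hypothesis holds, the file name itself is fine and the fix belongs in the URL encoding step, which would also explain why the umlaut-free name in Test 3 works in both UI languages.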

It would be great if this small but annoying bug could be fixed in a future release.

Thanks

All 3 complete logs:
Test 1
Test 2
Test 3
I don't really think this is a Mac issue. This seems to be a common issue with a lot of scrapers. You would probably have better luck with getting this resolved if you post to that forum.
The other reports of this issue that I found were all related to OS X only.

By the way, I just found a bug ticket for this issue here: http://trac.xbmc.org/ticket/14666
It's not about German umlauts, but it's the same issue (and only on OS X).
Try the next nightly (20140115 or later) once it has been uploaded: http://mirrors.xbmc.org/nightlies/osx/x86_64/
Back up all your XBMC userdata first, as it's still a nightly build.
(2014-01-16, 07:08) Karlson2k wrote: Try the next nightly (20140115 or later) once it has been uploaded: http://mirrors.xbmc.org/nightlies/osx/x86_64/
Back up all your XBMC userdata first, as it's still a nightly build.

Solved the (related) case of "German Umlaut can't be found scraping" - thank you!
... for all platforms, by the way