German IMDB scraper, please test it and give feedback - Printable Version
German IMDB scraper, please test it and give feedback - Eisbahn - 2010-06-04 23:33
new version online: http://github.com/Eisbahn/IMDb_de-Scraper/zipball/3.0.5
v3.0.4 and v2.0.2 out now: <http://github.com/Eisbahn/IMDb_de-Scraper/>
After some work I have just finished a first version of a German scraper for IMDB (results in German). The current version, 1.0.0, is at http://ul.to/v5d9j0, ready for testing. It grabs every tag available; only the trailer is not implemented (because for me it's a useless feature). Please feel free to report bugs or issues.
The latest version, 2.0.0, for XBMC v9.11 can be found here:
What is _not_ working
<premiered>premiere date</premiered> not imported into/exported to XBMC
<aired>???</aired> only for TV shows/series?
<set>???</set> don't know what this is
<artist>???</artist> how does this differ from actor?
<status>???</status> don't know what this is
<certification>age rating for all countries except Germany</certification> not imported into/exported to XBMC
<sorttitle>alternative film titles</sorttitle> only the first title is imported into/exported to XBMC
<code>???</code> don't know what this is; I think it's the codec => no sense in importing anything into this field
<trailer>trailer</trailer> pointless for me because the whole DVD is present in XBMC
Any hints for the corrupted tags are highly welcome.
- Spaggi - 2010-06-05 00:35
Great work! Will test on the weekend
- Eisbahn - 2010-06-05 09:27
Because I'm new to XBMC: which tags are supported by, or should be provided by, a scraper?
Could you give me an overview of mandatory and optional tags?
- vdrfan - 2010-06-05 10:55
Eisbahn Wrote: Because I'm new to XBMC: what tags are supported/should be provided by a scraper?
Check out the other scrapers. The imdb.com scraper is pretty feature-complete.
- donabi - 2010-06-05 10:56
Well, that varies a lot.
Some users "need" studio tags to get fancy icons in the skin,
or the narrator.
Others, like me, just need things like runtime, year, actors, FSK (MPAA) rating and ONE genre.
The original IMDb scraper fetches a lot of genre tags,
which makes the genre filter pointless.
We would like to see you at the German xbmc.de.
- Eisbahn - 2010-06-05 18:20
Hmmm, sorry. Do we have a spec showing which tags are mandatory/optional? If not: how can I figure out which tags are supported? The imdb.com scraper fetches no info about sound, subtitles or video format (if I looked correctly), yet in several screenshots I could see info about these things... So the answer "please do reverse engineering, because everybody can implement tags however he/she likes" is a bit counterproductive and suggests quick-and-dirty hacking without any concept. Is this the XBMC style?
@donabi: cutting some info away is not a real problem and is done in a few seconds. But gathering all possible things is a bit more complicated. So first I would like to have a scraper which gets all the info.
If you have a description of the allowed tags, please provide it: is the order/sequence relevant, which tags are supported, what format is expected and so on. If the German board has active members, why not. But to be honest: I think my active work will be over once the scraper is done :=(
@all: Where can I find out which tags are supported by XBMC? Whether the skins show the info doesn't matter at all; I think a "good scraper" should gather as much as possible. For the result of a scraper: is the order/sequence relevant, which tags are supported, what format is expected and so on. So far all I've done is reverse engineering, but I think that's not the right way...
- olympia - 2010-06-05 21:31
As for the starting point:
I think you are chasing something like the first result?
Other than that, I am not sure you can seriously call it "reverse engineering" to just have a look at which tags are being used by other scrapers.
- Eisbahn - 2010-06-05 23:04
Great, I found Google. If you know the right words and do not type "scraper, tags, xbmc" as a newbie or anything like that, it really works. If my questions are so easy: why do I only get an answer from you? I think it's a bit frustrating for both of us: for you as an expert and for me as a new user...
- The set tag is useless for a standalone XBMC because you cannot edit the tag before importing, so it should not be used by a scraper. Am I right?
- What about the order? Is it relevant? It seems not to be (looking at your nfo and the imdb.com output).
- File info is imported by XBMC by analysing the video file on its own, without interaction?
New version:
- year gets imported if a quarter is added, like in "Insomnia (2002)"
- importing up to 6 genres (9 easily possible)
- trimming of spaces
=> to come: all tags like in the nfo
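The clean-up steps listed for the new version (taking the year from a title like "Insomnia (2002)", capping the number of genres, trimming spaces) could look roughly like the sketch below. It is only an illustration in Python, not code from the scraper itself; the helper names `split_title_year` and `clean_genres` and the default limit of 6 are assumptions made for the example.

```python
import re

# Hypothetical helpers for illustration; not taken from the actual scraper.
# Matches a four-digit year in parentheses at the end of the string.
YEAR_RE = re.compile(r"\((\d{4})\)\s*$")


def split_title_year(raw):
    """Split a string such as 'Insomnia (2002)' into (title, year).

    Returns (title, None) when no trailing '(YYYY)' is found.
    """
    raw = raw.strip()
    match = YEAR_RE.search(raw)
    if not match:
        return raw, None
    return raw[:match.start()].strip(), match.group(1)


def clean_genres(genres, limit=6):
    """Trim spaces, drop empty and duplicate entries, keep at most `limit` genres."""
    cleaned = []
    for genre in genres:
        genre = genre.strip()
        if genre and genre not in cleaned:
            cleaned.append(genre)
    return cleaned[:limit]


if __name__ == "__main__":
    print(split_title_year("  Insomnia (2002) "))
    print(clean_genres([" Drama ", "Thriller", "", "Drama"]))
```

Users who only want ONE genre, as mentioned earlier in the thread, would call `clean_genres(genres, limit=1)`.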
- Nicezia - 2010-06-05 23:15
All tags are optional, but I would say it's best that the TITLE is at least supplied.
Of course it goes without saying that actor, audio (inside stream info), video (inside stream info) and subtitle (inside stream info) can occur multiple times and are optional.
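As a rough illustration of "title plus optional, repeatable tags", here is a minimal Python sketch that builds such a result with the standard library. The function name `build_movie_nfo` is made up for the example, and the nesting `fileinfo/streamdetails` with `language` children is an assumption about the nfo layout, not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET


def build_movie_nfo(title, actors=(), audio_langs=(), subtitle_langs=()):
    """Build a movie nfo string; everything except the title is optional.

    Assumption: stream info lives under <fileinfo><streamdetails>.
    """
    movie = ET.Element("movie")
    ET.SubElement(movie, "title").text = title

    # <actor> may appear any number of times, including zero.
    for name, role in actors:
        actor = ET.SubElement(movie, "actor")
        ET.SubElement(actor, "name").text = name
        ET.SubElement(actor, "role").text = role

    # Only emit the stream info container when there is something to put in it.
    if audio_langs or subtitle_langs:
        fileinfo = ET.SubElement(movie, "fileinfo")
        details = ET.SubElement(fileinfo, "streamdetails")
        for lang in audio_langs:
            ET.SubElement(ET.SubElement(details, "audio"), "language").text = lang
        for lang in subtitle_langs:
            ET.SubElement(ET.SubElement(details, "subtitle"), "language").text = lang

    return ET.tostring(movie, encoding="unicode")


if __name__ == "__main__":
    # Placeholder data, purely for demonstration.
    print(build_movie_nfo("Insomnia",
                          actors=[("Actor One", "Role One")],
                          audio_langs=["de", "en"]))
```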
- olympia - 2010-06-06 08:57
Eisbahn Wrote: - The set tag is for a standalone XBMC useless because you could not edit the tag before importing, so should not be used by a scraper. Am I right?
Yes, it's only useful when you have an XBMC-compliant external nfo to import from. In any case, you couldn't scrape this info from anywhere.
Eisbahn Wrote: - what about the order. Is it relevant? Seemed to be not (looking at your nfo and the imdb.com output)
Order doesn't matter.
Eisbahn Wrote: - fileinfos are imported by XBMC by analysing the video file on its own without interaction?
Yes, if this option is enabled in XBMC. But obviously this is again info that you couldn't scrape from a web site. These tags exist for nfo files, because you might not want XBMC to do the extraction from the media file itself: you may use an external nfo manager for that purpose and want XBMC to import the data it generates.
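To illustrate the "order doesn't matter" answer: a consumer that looks tags up by name, rather than by position, gets the same result however the scraper orders them. The Python sketch below only demonstrates that idea; it says nothing about how XBMC actually parses the data, and `tags_by_name` is a name invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def tags_by_name(nfo_xml):
    """Group the text of the top-level child tags by tag name.

    Repeated tags (e.g. several <genre> entries) are collected in a list;
    the order between different tag names is ignored.
    """
    grouped = defaultdict(list)
    for child in ET.fromstring(nfo_xml):
        grouped[child.tag].append((child.text or "").strip())
    return dict(grouped)


if __name__ == "__main__":
    a = "<movie><title>Insomnia</title><year>2002</year><genre>Thriller</genre></movie>"
    b = "<movie><genre>Thriller</genre><year>2002</year><title>Insomnia</title></movie>"
    # Same tags in a different order give the same lookup table.
    print(tags_by_name(a) == tags_by_name(b))
```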