XBMC Community Forum
[WIP] AniDB.net Anime Video Scraper - Printable Version

+---- Thread: [WIP] AniDB.net Anime Video Scraper (/showthread.php?tid=64587)



- bambi73 - 2011-04-04 21:48

ZERO <ibis> Wrote:On this note, and noting that others have commented about cool things the scraper could do but cannot because it is not a full Python plugin, I suggest the following:

How about a feature support plugin? Basically a Python plugin that works with this scraper to offer additional features that would not be possible otherwise. For example:
The ability to perform the above-requested automatic downloading of fan art
The ability to auto-generate theme.mp3 files by also looking up the series on gendou.com and automatically downloading the full version of the OP1
The ability to have an anime version of TV Show Next Aired that uses data from AniDB to show the next episode release date (I have TV Show Next Aired disabled b/c it is borked as far as anime goes, thinking it is some crazy crap)
Etc.

I think there are all sorts of features that could be made possible with an expansion plugin that works hand in hand with this scraper in order to provide us anime lovers the same interface experiences offered to users of American content.

It's hard to argue with you, because everything you wrote would be nice to have, but ... you would need some support for scraper and Python addon cooperation from the XBMC side, and there have been no signs that any regular developer is interested in this topic, which is bad. Of course you can always program it yourself and post a patch :)
Personally I'm satisfied with how the AniDB scraper works for me now, so don't expect any activity from my side. I simply have no time and energy for such a large project, even though it looks like a fine challenge :)


- ZERO <ibis> - 2011-04-06 19:30

Oh, if I could code XBMC addons in SourcePawn or PHP I would have released it already. Unfortunately Python gives me brain cancer, but maybe I will pick it up eventually. Although I wonder if it may be possible to attract a programmer with some $$, b/c I would be willing to pay to get support for those features :D


- NoValidTitle - 2011-04-08 04:39

Ok, I'm sorry to be a pest but would someone mind giving me a hand? I've spent hours reading and tinkering but I don't fully understand all the scraper stuff apparently. I'm not even a huge anime watcher to be honest, just a few series. I'm running XBMC Live(installed) and I'm trying to use the anidb scraper to get my One Piece episodes into my library. No matter what I do the closest I can seem to get is it scans my folder and adds every episode as a special and not an actual episode. My episodes are named like this: One Piece - 001 - I'm Luffy! The boy who will become the Pirate King! [K-F&AKUPX].avi which I used WebAOM to rename.

Where do I even start at fixing this?


- bambi73 - 2011-04-08 14:02

NoValidTitle Wrote:Ok, I'm sorry to be a pest but would someone mind giving me a hand? I've spent hours reading and tinkering but I don't fully understand all the scraper stuff apparently. I'm not even a huge anime watcher to be honest, just a few series. I'm running XBMC Live(installed) and I'm trying to use the anidb scraper to get my One Piece episodes into my library. No matter what I do the closest I can seem to get is it scans my folder and adds every episode as a special and not an actual episode. My episodes are named like this: One Piece - 001 - I'm Luffy! The boy who will become the Pirate King! [K-F&AKUPX].avi which I used WebAOM to rename.

Where do I even start at fixing this?

Your problem lies in the parsing of episode and season numbers from file names, which XBMC itself does before the scraper is even started. What you are looking for is the <tvshowmatching> setting in Advancedsettings.xml (http://wiki.xbmc.org/index.php?title=Advancedsettings.xml). I guess you haven't touched this setting and still have the defaults, in which case your file name is caught by the 4th default regexp

Code:
<regexp>[\._ \-]([0-9]+)([0-9][0-9])([\._ \-][^\\/]*)</regexp>  <!-- foo.103 -->
and the result is SeasonNr='0' (first capture group) and EpisodeNr='01' (second capture group), which is wrong because SeasonNr=0 means specials. You must add something like
Code:
<tvshowmatching action="prepend">
  <regexp>(?i)[/\\].*?()\s-\s(\d{2,3})([^/\\]*)</regexp>
</tvshowmatching>
It expects " - " before EpisodeNr (two or three digits). Empty first capture group means SeasonNr=1.
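
The effect of the two patterns can be checked outside XBMC. The following is only a rough sketch in Python: it uses Python's `re` module as a stand-in for XBMC's own regex engine, which may differ in details; the directory part of `PATH` is hypothetical (only the file name comes from this thread); and the "empty group means season 1" rule is taken from the explanation above rather than from XBMC's source.

```python
import re

# The 4th default pattern quoted above, and the suggested custom pattern.
DEFAULT_RE = re.compile(r"[\._ \-]([0-9]+)([0-9][0-9])([\._ \-][^\\/]*)")
CUSTOM_RE = re.compile(r"(?i)[/\\].*?()\s-\s(\d{2,3})([^/\\]*)")

# Hypothetical directory; only the file name is taken from the thread.
PATH = ("/media/anime/One Piece/One Piece - 001 - I'm Luffy! "
        "The boy who will become the Pirate King! [K-F&AKUPX].avi")


def parse(pattern, path):
    """Return (season, episode) as integers, or None if nothing matches."""
    match = pattern.search(path)
    if match is None:
        return None
    season_text, episode_text = match.group(1), match.group(2)
    # Assumption from the post above: an empty season group means season 1.
    season = int(season_text) if season_text else 1
    return season, int(episode_text)


print(parse(DEFAULT_RE, PATH))  # season 0, i.e. a special
print(parse(CUSTOM_RE, PATH))   # season 1, episode 1
```

The default pattern splits "001" into "0" and "01" because its second group insists on exactly two digits, which is where the unwanted season 0 comes from.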


- NoValidTitle - 2011-04-08 16:26

bambi73 Wrote:Your problem lies in the parsing of episode and season numbers from file names, which XBMC itself does before the scraper is even started. What you are looking for is the <tvshowmatching> setting in Advancedsettings.xml (http://wiki.xbmc.org/index.php?title=Advancedsettings.xml). I guess you haven't touched this setting and still have the defaults, in which case your file name is caught by the 4th default regexp

Code:
<regexp>[\._ \-]([0-9]+)([0-9][0-9])([\._ \-][^\\/]*)</regexp>  <!-- foo.103 -->
and the result is SeasonNr='0' (first capture group) and EpisodeNr='01' (second capture group), which is wrong because SeasonNr=0 means specials. You must add something like
Code:
<tvshowmatching action="prepend">
  <regexp>(?i)[/\\].*?()\s-\s(\d{2,3})([^/\\]*)</regexp>
</tvshowmatching>
It expects " - " before EpisodeNr (two or three digits). Empty first capture group means SeasonNr=1.

I'll give that a try! I appreciate your time.


- ZERO <ibis> - 2011-04-10 09:02

Found what I believe to be a small bug. It appears that studios are not scraped correctly. I assume that the studio is supposed to come from anidb.net, where it is listed under "Animation Work". I noticed that for some shows this item is not scraped and is left blank instead.

For example, Ore no Imouto ga Konna ni Kawaii Wake ga Nai does not get a studio even though the staff section says: "Animation Work (アニメーション制作) AIC Build"

I also noticed that when I ran the scraper it actually made my genre list more restricted than before. Where it had read Comedy, Seinen, it changed to just Seinen, when according to AniDB it should actually say Comedy, Novel, Seinen.

Also, for some shows like 30-sai no Hoken Taiiku the scraper fails to find the studio or genres even though they are all right there on AniDB: http://anidb.net/perl-bin/animedb.pl?show=anime&aid=8106


- bambi73 - 2011-04-10 13:14

ZERO <ibis> Wrote:Found what I believe to be a small bug. It appears that studios are not scraped correctly. I assume that the studio is supposed to come from anidb.net, where it is listed under "Animation Work". I noticed that for some shows this item is not scraped and is left blank instead.

For example, Ore no Imouto ga Konna ni Kawaii Wake ga Nai does not get a studio even though the staff section says: "Animation Work (アニメーション制作) AIC Build"
Unfortunately it's not a problem with the scraper but with the data provided by AniDB. You can check it yourself: there is no "Animation Work" type creator. Maybe they limit the creator list to 15 records and "Animation Work" didn't make it under this limit.

ZERO <ibis> Wrote:I also noticed that when I ran the scraper it actually made my genre list more restricted than before. Where it had read Comedy, Seinen, it changed to just Seinen, when according to AniDB it should actually say Comedy, Novel, Seinen.
The scraper takes over only genres with weight 500 or 600, which for Ore no Imouto are Seinen, Novel, Earth, Asia, Japan, Present. Additionally it filters out those which don't qualify (IMHO ;)) as genres, in this case everything except Seinen. Maybe I can leave Novel in too: when I wrote this part of the scraper I linked Novel to Visual Novel (Eroge) in my mind and filtered it out, but in this case it means Light Novel, which is "legal" for me :)
As for Comedy, it now has weight 400, so it's not in the genre list. The reason it was there in the past is that Comedy originally had weight 600 but was changed down to 400 in November. You can check the CREQ history for this record on AniDB.
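
The selection rule described above can be illustrated with a small sketch. This is not the scraper's code (the scraper is not a Python plugin); it is only a Python illustration of the rule "weight 500 or 600, minus names that do not count as genres". The individual weights are made up for the example, apart from Comedy's 400, and the contents of `NOT_GENRES` are an assumption based on the names mentioned in this post.

```python
# Hypothetical (name, weight) pairs. The post only says that the first six
# have weight 500 or 600 and that Comedy now has 400; the exact split
# between 500 and 600 is invented for this example.
CATEGORIES = [
    ("Seinen", 600), ("Novel", 500), ("Earth", 500), ("Asia", 500),
    ("Japan", 500), ("Present", 500), ("Comedy", 400),
]

ACCEPTED_WEIGHTS = {500, 600}
# Assumed blocklist: names the post says are filtered out as "not genres".
NOT_GENRES = {"Novel", "Earth", "Asia", "Japan", "Present"}


def select_genres(categories, accepted=ACCEPTED_WEIGHTS, blocked=NOT_GENRES):
    """Keep names whose weight is accepted and which are not blocked."""
    return [name for name, weight in categories
            if weight in accepted and name not in blocked]


print(select_genres(CATEGORIES))
print(select_genres(CATEGORIES, blocked=NOT_GENRES - {"Novel"}))
```

The first call keeps only Seinen, matching what was observed; the second shows what unblocking Novel would change.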

ZERO <ibis> Wrote:Also, for some shows like 30-sai no Hoken Taiiku the scraper fails to find the studio or genres even though they are all right there on AniDB: http://anidb.net/perl-bin/animedb.pl?show=anime&aid=8106
For this show I see studio Gathering in my DB, and there are only two genres, both with weight only 400, on AniDB (you can check it here - half star = 100 weight points)


- ZERO <ibis> - 2011-04-10 17:58

OK, I will try to look into why the studio is not being reported in the API output. However, your comments on the genre section have led me to a request:

Can we have the ability to adjust the weights and/or use a fallback option? For example, you could filter by weight, but if fewer than X results remain, then take the Y highest instead. Conversely, there could be another option to prevent too many results by limiting the output to Z.

My other request involves the filtering. As you stated, some results are filtered out in order to provide higher-quality output. However, some users may want more or less restrictive filters. For example, some users may still want to filter out Novel, while people like me want to allow Novel and Visual Novel as well. If there were an option that let us edit this list and/or add to it, that would be awesome! It could be implemented as a list of items separated by commas, so the program knows how to break them up.
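
To make the two requests concrete, here is a rough Python sketch of what such options might do. Everything in it is hypothetical: the function and parameter names are invented, `min_count`, `fallback_count` and `max_count` stand for X, Y and Z above, and the genre names in the usage lines are placeholders.

```python
def parse_list(text):
    """Split a comma-separated setting into a set of trimmed names."""
    return {item.strip() for item in text.split(",") if item.strip()}


def pick_genres(categories, min_weight, min_count, fallback_count,
                max_count, blocked):
    """Pick genre names from (name, weight) pairs.

    Names in `blocked` are dropped first. Entries at or above `min_weight`
    are kept; if fewer than `min_count` (X) remain, the `fallback_count`
    (Y) highest-weighted entries are used instead. The result is capped
    at `max_count` (Z) names.
    """
    allowed = [(n, w) for n, w in categories if n not in blocked]
    # sorted() is stable, so equal weights keep their original order.
    ranked = sorted(allowed, key=lambda pair: pair[1], reverse=True)
    chosen = [pair for pair in ranked if pair[1] >= min_weight]
    if len(chosen) < min_count:
        chosen = ranked[:fallback_count]
    return [name for name, _ in chosen[:max_count]]


# A show with only two weight-400 genres (placeholder names).
low_weight_show = [("Genre A", 400), ("Genre B", 400)]
print(pick_genres(low_weight_show, 500, 1, 2, 5, parse_list("")))

# User-editable filter list supplied as a comma-separated string.
print(pick_genres([("Seinen", 600), ("Novel", 500), ("Comedy", 400)],
                  500, 1, 2, 5, parse_list("Earth, Asia, Japan, Present")))
```

With the fallback in place, a show whose genres all sit below the threshold would still get its highest-weighted ones instead of an empty list.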

BTW, I have made a thread reporting the issue here: http://anidb.net/perl-bin/animedb.pl?show=cmt&id=36248#c199729

bambi73 Wrote:For this show I see studio Gathering in my DB, and there are only two genres, both with weight only 400, on AniDB (you can check it here - half star = 100 weight points)

I have checked this in the API output and see that it is listed; however, every time I refresh, the studio is not loaded. I even deleted the show from my XBMC database and scraped it again, and the studio still comes up as "Not available". Perhaps something tricky is going on where the scraper does not write the studio if it did not find anything to use for genre?


- Armitage - 2011-04-15 02:25

Thanks for the update of the scraper! It really works well, good job!


- bambi73 - 2011-04-19 18:01

1.2.0:
Changed: Replace "`" with "'" in all significant texts
Changed: Configuration for genres
Added: Loading characters + actors/seiyus

Should be available soon from XBMC repo