Anime lookup scraper for anidb.net and/or animenfo.com
#61
AniDB is as complete as it gets for anime series info, but it's still sub-optimal for XBMC scraping purposes.

What is really needed is a scrapable, wiki-based anime database in the vein of TheTVdb.com and TheMoviedb.org. AniDB doesn't want to be that kind of a site, so we're just going to have to wait for somebody to step up and do it the right way.
#62
What about AnimeNFO or Anime News Network as a replacement? Or someone could download all the info from AniDB bit by bit, host it somewhere, and use that as the scraper source (might be a stupid suggestion, but just an idea xD).
#63
Neither has proper banner / artwork / poster images, and AnimeNFO doesn't even have standardized episode lists.

TheMoviedb.org came about because IMDB was too limited and frowned on scraping. TheTVdb.com came about because TV.com was similarly limited. Eventually somebody with the necessary resources will do for anime what the creators of those other sites have done for TV and movies.
#64
candre23 Wrote:AniDB is as complete as it gets for anime series info, but it's still sub-optimal for XBMC scraping purposes.

What is really needed is a scrapable, wiki-based anime database in the vein of TheTVdb.com and TheMoviedb.org. AniDB doesn't want to be that kind of a site, so we're just going to have to wait for somebody to step up and do it the right way.


This will be nearly impossible, or it will take forever. The problem is that there is much more anime than normal TV shows/movies, and tons of new titles come out every month. Also, for some reason only anidb.net is complete when it comes to show info, specials, or exotic anime content. To get proper results at the moment, there is no way around AniDB for an anime fan.

What I can agree with is that for just those 20 anime you can already use thetvdb.com, but even then one of the biggest shows, Bleach, only has valid information up to episode 199, while we are currently at 241.

Also, if you have followed anime sites over the last few years, you know there won't be a new site that tries to build up a database.
#65
Don't take this the wrong way, this really is a genuine question for the anime fans.

Given that there doesn't seem to be a good scrapable site, nor a dev interested in creating a scraper for the sites that do exist (with partial content), why don't you just add the content to thetvdb.com?

This has been suggested on a few occasions, but never really answered. I know that tvdb has little anime content on it now, but as with most of the content there, once a few people start adding to it, more people will contribute.
#66
prae5 Wrote:why don't you just add the content to thetvdb.com?
People do, or there wouldn't be any content there. Still, even if TheTVdb were 100% up to date on anime TV shows, that's only half the issue. Anime as a genre is unique in that a "series" usually contains one or more seasons of TV episodes, movies, and OVA episodic content. Neither TheTVdb nor TheMoviedb is set up to handle everything. It really is going to take a dedicated, scraper-friendly anime site for this to work.
#67
prae5 Wrote:Don't take this the wrong way, this really is a genuine question for the anime fans.

Given that there doesn't seem to be a good scrapable site, nor a dev interested in creating a scraper for the sites that do exist (with partial content), why don't you just add the content to thetvdb.com?

This has been suggested on a few occasions, but never really answered. I know that tvdb has little anime content on it now, but as with most of the content there, once a few people start adding to it, more people will contribute.

Just look at the daily updates on anidb.net. Sure, if 3-5 people spend an hour updating every day, you can keep up.

But keep in mind that AniDB already has a database with around 250000 episodes listed. I have a huge collection, and AniDB tells me I have watched just 2% of their database.
#68
Well, it might be a stupid idea, but what if a few people slowly started scraping all the content off AniDB, stored it on a server, and let us use it through that, or something similar? :)
#69
Saint Wrote:Well, it might be a stupid idea, but what if a few people slowly started scraping all the content off AniDB, stored it on a server, and let us use it through that, or something similar? :)

Not a reliable solution.
#70
True, but until AniDB decides to join in on the fun, it might be the only solution that makes any sense whatsoever...
#71
Saint Wrote:Well, it might be a stupid idea, but what if a few people slowly started scraping all the content off AniDB, stored it on a server, and let us use it through that, or something similar? :)
Totally unnecessary. AniDB already offers a fairly current (monthly?) copy of their DB, downloadable as a torrent. It's been suggested in this thread already that a scraper could be modified to use this downloaded DB. Unless somebody volunteers for the task, it's still just a suggestion.
#72
Well then, they basically did what I said for us ^-^ I had no idea they had a torrent-downloadable database. Now someone just has to volunteer to build a scraper-friendly site/database/whatever you want to call it. :)
#73
Just started playing with XBMC on OS X for watching anime, dramas, TV, and movies. I really want my anime and drama downloads to be more organized, so making a scraper seemed like the way to go. Here's what I've got and the problems I am still having with my content or library:

1. I originally tried to scrape anidb.net, but they return the content gzipped, so the scraper doesn't know what to do with it. I tested their search URL in curl to confirm. It would be great if there was a way to specify --compressed so that the result would get handled by gzip.

2. So I went with the backup, animenfo.com. I got it working. It pulls the show title and thumbnail from the site based on the folder name and adds the show to my library. That's pretty cool, but the problem is that the files in my folder are not being added to the library. According to the debug log, they are either not 'enumerated' or they match this pattern:

Code:
DEBUG: found match /anime/canaan/[anbu-menclave]_canaan_-_01_[1024x576_h.264_aac][12f00e89].mkv (s1024e576) [[\\/\._ \[-]([0-9]+)x([0-9]+)([^\\/]*)$]
DEBUG: could not enumerate file /Anime/Canaan/[gg]_CANAAN_-_01v2_[3561386D].mkv

Now those two files don't really match anyway, but this is happening even when all the files have similar patterns. I am wondering how can I customize this so that my files will get added? Does the scraper have anything to do with the file(name)s themselves?
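
For what it's worth, the first DEBUG line can be reproduced outside XBMC. Below is a minimal Python sketch, assuming the pattern printed in the log is applied to the lowercased path. The "[1024x576" in the first filename looks like the NxM season/episode form, so it gets picked up as s1024e576, while the second filename has no NxM anywhere and is not enumerated. FANSUB_PATTERN, enumerate_nxm and enumerate_fansub are made-up names for illustration, and the fansub pattern is only a guess at what would fit "_-_01_" style names, not an XBMC default.

```python
import re

# Pattern copied from the DEBUG line in the log (the "NxM" season/episode form).
XBMC_PATTERN = re.compile(r"[\\/\._ \[-]([0-9]+)x([0-9]+)([^\\/]*)$")

# Hypothetical pattern for fansub-style names such as "_-_01_" or "_-_01v2_".
# This is an assumption for illustration, not something XBMC ships with.
FANSUB_PATTERN = re.compile(r"_-_([0-9]+)(?:v[0-9]+)?_")


def enumerate_nxm(path):
    """Return (season, episode) if the logged pattern matches, else None."""
    match = XBMC_PATTERN.search(path.lower())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def enumerate_fansub(path):
    """Return the episode number found by the hypothetical fansub pattern, else None."""
    match = FANSUB_PATTERN.search(path.lower())
    if match is None:
        return None
    return int(match.group(1))


if __name__ == "__main__":
    files = [
        "/Anime/Canaan/[anbu-menclave]_canaan_-_01_[1024x576_h.264_aac][12f00e89].mkv",
        "/Anime/Canaan/[gg]_CANAAN_-_01v2_[3561386D].mkv",
    ]
    for name in files:
        print(name, enumerate_nxm(name), enumerate_fansub(name))
```

With the hypothetical pattern both files come out as episode 1, and the video resolution in the brackets is no longer mistaken for a season and episode.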

Anyway if anyone out there wants thumbnails from animenfo.com on their anime show folders add this xml to /XBMC/system/scrapers/video - enjoy!

Code:
<?xml version="1.0" encoding="UTF-8"?>
<scraper framework="1.0" date="2009-11-15" name="animenfo.com" content="tvshows" thumb="animenfo.png">
    <NfoUrl dest="3">
        <RegExp input="$$1" output="\1" dest="3">
            <expression></expression>
        </RegExp>
    </NfoUrl>
    <CreateSearchUrl dest="3">
        <RegExp input="$$1" output="http://animenfo.com/search.php?query=\1&amp;queryin=anime_titles&amp;option=smart" dest="3">
            <expression noclean="1"/>
        </RegExp>
    </CreateSearchUrl>
    <GetSearchResults dest="8">
        <RegExp input="$$5" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="8">
            <RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\3&lt;/title&gt;&lt;url&gt;http://animenfo.com/\1&lt;/url&gt;&lt;/entity&gt;" dest="5">
                <expression repeat="yes">&lt;a href='(animetitle,([0-9]*),[^']*)'&gt;([^&lt;]*)&lt;/a&gt;</expression>
            </RegExp>
            <expression noclean="1" />
        </RegExp>
    </GetSearchResults>
    <GetDetails dest="3">
        <RegExp input="$$8" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
            <!-- Title !-->
            <RegExp input="$$1" output="&lt;title&gt;\1&lt;/title&gt;" dest="8">
                <expression trim="1" noclean="1">&lt;td class=&quot;anime_info_top&quot;&gt;([^&lt;]*)&lt;/td&gt;&lt;/tr&gt;</expression>
            </RegExp>
            <!-- Year (dest="8+" appends; a plain "8" would overwrite the title) !-->
            <RegExp input="$$1" output="&lt;year&gt;\1&lt;/year&gt;" dest="8+">
                <expression trim="1" noclean="1">&lt;td class=&quot;anime_info_top&quot;&gt;&lt;a href='[^']*'&gt;([^&lt;]*)&lt;/a&gt;</expression>
            </RegExp>
            <!-- Thumbnail !-->
            <RegExp input="$$1" output="&lt;thumb&gt;&lt;url spoof=&quot;http://animenfo.com&quot;&gt;http://animenfo.com/\1&lt;/url&gt;&lt;/thumb&gt;" dest="8+">
                <expression>&lt;img class=&quot;float&quot; src=&quot;([^&quot;]*)&quot;</expression>
            </RegExp>
            <expression noclean="1"/>
        </RegExp>
    </GetDetails>
</scraper>
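
On point 1 (anidb.net returning gzipped content): as far as I understand it, curl's --compressed asks for a compressed response and decompresses it before anything else looks at the body. Here is a minimal Python sketch of that decompression step, assuming the body arrives as raw bytes and that checking for the gzip magic bytes is good enough. decode_body is a made-up helper name and is not part of XBMC or AniDB.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def decode_body(body, charset="utf-8"):
    """Return the response body as text, gunzipping it first if needed."""
    if body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    return body.decode(charset, errors="replace")


if __name__ == "__main__":
    # Stand-in for a search result page; no network access involved.
    page = "<a href='animetitle,1,example'>Canaan</a>"
    compressed = gzip.compress(page.encode("utf-8"))
    print(decode_body(compressed) == page)
    print(decode_body(page.encode("utf-8")) == page)
```

The scraper regexes only ever see plain text this way, whether or not the server compressed the response.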
#74
Please use the [ code] tag when including XML and logs.
It's easier to read.
#75
Don't know if it helps, but there is a fully working anime plugin for MP which uses anidb: http://www.team-mediaportal.com/files/Do.../MyAnime2/
http://forum.team-mediaportal.com/mediap...2-a-60793/
Posting this in the hope that some kind person will someday be able to implement something like that for XBMC. :)
