Anime lookup scraper for anidb.net and/or animenfo.com

candre23 Offline
Member
Posts: 58
Joined: Jul 2009
Reputation: 0
Post: #71
Saint Wrote: Well, it might be a stupid idea, but what if a few people slowly started scraping all the content off of AniDB, stored it on a server, and let us use it through that, or something similar? :)
Totally unnecessary. AniDB already offers a fairly current (monthly?) copy of their DB, downloadable as a torrent. It's been suggested in this thread already that a scraper could be modified to use this downloaded DB. Unless somebody volunteers for the task, it's still just a suggestion.
Saint Offline
Member
Posts: 55
Joined: Sep 2009
Reputation: 0
Post: #72
Well, then they basically did what I said for us ^-^ I had no idea they had a torrent-downloadable database. Now someone just has to volunteer to build a scraper-friendly site/database/whatever you want to call it. :)
(This post was last modified: 2009-10-20 22:50 by Saint.)
minimonk Offline
Junior Member
Posts: 3
Joined: Nov 2009
Reputation: 0
animenfo.com scraper Post: #73
Just started playing with XBMC on OS X for watching anime, dramas, TV and movies. I really want my anime and drama downloads to be more organized, so making a scraper seemed like the way to go. Here's what I've got, and the problems I am still having with my content or library:

1. I originally tried to scrape anidb.net, but they return the content gzipped, so the scraper doesn't know what to do with it. I tested their search URL in curl to confirm. It would be great if there was a way to specify something like curl's --compressed so that the result would get handled by gzip.

2. So I went with the backup, animenfo.com, and got it working. It pulls the show title and thumbnail from the site based on the folder name and adds the show to my library. That's pretty cool, but the files in my folder are not being added to the library: according to the debug log, they are either not 'enumerated' or are matching the wrong pattern:

Code:
DEBUG: found match /anime/canaan/[anbu-menclave]_canaan_-_01_[1024x576_h.264_aac][12f00e89].mkv (s1024e576) [[\\/\._ \[-]([0-9]+)x([0-9]+)([^\\/]*)$]
DEBUG: could not enumerate file /Anime/Canaan/[gg]_CANAAN_-_01v2_[3561386D].mkv

Now those two files don't really match each other anyway, but this is happening even when all the files have similar patterns. How can I customize this so that my files will get added? Does the scraper have anything to do with the file(name)s themselves?

Anyway, if anyone out there wants thumbnails from animenfo.com on their anime show folders, add this XML to /XBMC/system/scrapers/video. Enjoy!

Code:
<?xml version="1.0" encoding="UTF-8"?>
<scraper framework="1.0" date="2009-11-15" name="animenfo.com" content="tvshows" thumb="animenfo.png">
    <NfoUrl dest="3">
        <RegExp input="$$1" output="\1" dest="3">
            <expression></expression>
        </RegExp>
    </NfoUrl>
    <CreateSearchUrl dest="3">
        <RegExp input="$$1" output="http://animenfo.com/search.php?query=\1&amp;queryin=anime_titles&amp;option=smart" dest="3">
            <expression noclean="1"/>
        </RegExp>
    </CreateSearchUrl>
    <GetSearchResults dest="8">
        <RegExp input="$$5" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="8">
            <RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\3&lt;/title&gt;&lt;url&gt;http://animenfo.com/\1&lt;/url&gt;&lt;/entity&gt;" dest="5">
                <expression repeat="yes">&lt;a href='(animetitle,([0-9]*),[^']*)'&gt;([^&lt;]*)&lt;/a&gt;</expression>
            </RegExp>
            <expression noclean="1" />
        </RegExp>
    </GetSearchResults>
    <GetDetails dest="3">
        <RegExp input="$$8" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
            <!-- Title -->
            <RegExp input="$$1" output="&lt;title&gt;\1&lt;/title&gt;" dest="8">
                <expression trim="1" noclean="1">&lt;td class=&quot;anime_info_top&quot;&gt;([^&lt;]*)&lt;/td&gt;&lt;/tr&gt;</expression>
            </RegExp>
            <!-- Year -->
            <RegExp input="$$1" output="&lt;year&gt;\1&lt;/year&gt;" dest="8">
                <expression trim="1" noclean="1">&lt;td class=&quot;anime_info_top&quot;&gt;&lt;a href='[^']*'&gt;([^&lt;]*)&lt;/a&gt;</expression>
            </RegExp>
            <!-- Thumbnail -->
            <RegExp input="$$1" output="&lt;thumb&gt;&lt;url spoof=&quot;http://animenfo.com&quot;&gt;http://animenfo.com/\1&lt;/url&gt;&lt;/thumb&gt;" dest="8+">
                <expression>&lt;img class=&quot;float&quot; src=&quot;([^&quot;]*)&quot;</expression>
            </RegExp>
            <expression noclean="1"/>
        </RegExp>
    </GetDetails>
</scraper>
(This post was last modified: 2009-11-16 21:06 by minimonk.)
ppic Offline
Skilled Python Coder
Posts: 2,663
Joined: Feb 2009
Reputation: 10
Location: France idf
Post: #74
Please use the [code] tag when including XML and logs.
It is easier to read.

gokudo Offline
Member
Posts: 68
Joined: Dec 2009
Reputation: 1
Location: Germany
Post: #75
Don't know if it helps, but there is a fully working anime plugin for MP which uses anidb: http://www.team-mediaportal.com/files/Do.../MyAnime2/
http://forum.team-mediaportal.com/mediap...2-a-60793/
Posting this in the hope that some kind person will someday be able to implement something like that for XBMC. :)
spiff Offline
Grumpy Bastard Developer
Posts: 12,185
Joined: Nov 2003
Reputation: 82
Post: #76
minimonk: for the gzipped content, use <url gzip="yes">...</url>
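
A minimal sketch of where that element might go, assuming it is emitted entity-escaped from the output attribute of CreateSearchUrl, in the same way the animenfo.com scraper posted earlier escapes its <url> element in GetSearchResults. The search address is a placeholder, not AniDB's real endpoint, and the exact placement is an assumption rather than something confirmed in this thread:

Code:
<CreateSearchUrl dest="3">
    <!-- Placeholder URL (assumption): substitute the real AniDB search address -->
    <!-- gzip="yes" asks XBMC to decompress the response before the regexps run -->
    <RegExp input="$$1" output="&lt;url gzip=&quot;yes&quot;&gt;http://example.invalid/search?query=\1&lt;/url&gt;" dest="3">
        <expression noclean="1"/>
    </RegExp>
</CreateSearchUrl>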

The scraper has nothing to do with the filenames. See the <tvshowmatching> advancedsetting.
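
For context, the debug log earlier in the thread shows the first file matching the built-in ([0-9]+)x([0-9]+) pattern against the "1024x576" resolution tag, which is where s1024e576 comes from. A custom pattern in advancedsettings.xml (in the userdata folder) might look something like the sketch below. The regular expression is an assumption written for the "_-_01_" and "_-_01v2_" naming in those two filenames, and it relies on the build accepting an empty first group as "no season given". Whether that works, and whether custom patterns add to or replace the built-in ones, depends on the XBMC version, so check the manual before relying on it.

Code:
<advancedsettings>
    <tvshowmatching>
        <!-- Illustrative pattern only (assumption), not taken from the XBMC defaults -->
        <!-- Group 1 (season) is left empty on purpose; group 2 is the episode number -->
        <!-- (?:v[0-9]+)? skips an optional release version tag such as "v2" -->
        <regexp>_-_()([0-9]+)(?:v[0-9]+)?_</regexp>
    </tvshowmatching>
</advancedsettings>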

MukiDA Offline
Junior Member
Posts: 16
Joined: Dec 2009
Reputation: 0
Post: #77
Junk, I didn't realize this was written 5 days ago. I've tried to write the AniDB script using the previously posted AnimeNFO script. It's pre-pre-pre-alpha, majorly WIP.

So far it grabs thumbs, title, and year. At the moment it still doesn't get episode titles (so it's kind of a no-go for libraries), and I can't figure out how to get rid of stuff in brackets or parentheses. It's driving me nuts, so I'm taking a break in hopes that someone will bring up a solution I missed:

http://forum.xbmc.org/showthread.php?tid=64587
(This post was last modified: 2010-01-01 03:01 by MukiDA.)
Cheesekun Offline
Junior Member
Posts: 2
Joined: May 2009
Reputation: 0
Post: #78
I am interested in creating an AniDB scrape mirror with someone.
someoneelse Offline
Junior Member
Posts: 1
Joined: Jan 2010
Reputation: 0
Post: #79
candre23 Wrote:Totally unnecessary. AniDB already offers a fairly current (monthly?) copy of their DB, downloadable as a torrent.

Good day to you, sir. Would you be so kind as to provide me with a link to said torrent?
candre23 Offline
Member
Posts: 58
Joined: Jul 2009
Reputation: 0
Post: #80
someoneelse Wrote: Good day to you, sir. Would you be so kind as to provide me with a link to said torrent?
Looks like they don't do a torrent any more. Instead you can use this program to download a (presumably) up-to-date version of their database. Not sure what format it ends up in, but it should all be there. There's a link on that page to download an archive of all the artwork as well.