zonsky, feel free to beta test the scraper as-is. There are no episodes yet (hence why I said pre-pre-alpha), but at the moment the thumb, title, and plot summary work just fine. I would, of course, suggest testing it on only a few folders at a time (e.g. don't just point the whole anime root folder at it).
Here's the latest version. I'm working on episode support and, once again, I have no idea what I'm doing wrong (seems to be a running theme).
Code:
<?xml version="1.0" encoding="utf-8"?>
<scraper framework="1" date="2009-11-15" name="AniDB.net" content="tvshows" thumb="anidb.jpg" language="en">
  <NfoUrl dest="3">
    <RegExp input="$$1" output="\1" dest="3">
      <expression></expression>
    </RegExp>
  </NfoUrl>
  <CreateSearchUrl dest="3">
    <RegExp input="$$1" output="<url gzip="yes">http://anidb.net/perl-bin/animedb.pl\?show=animelist&adb.search=\1</url>" dest="3">
      <expression></expression>
    </RegExp>
  </CreateSearchUrl>
  <GetSearchResults dest="8">
    <!-- Multiple Results -->
    <RegExp input="$$5" output="<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?><results>\1</results>" dest="8">
      <RegExp input="$$1" output="<entity><title>\3</title><url gzip="yes">http://anidb.net/perl-bin/\1</url></entity>" dest="5">
        <expression repeat="yes" noclean="1"><a href="(animedb.pl\?show=anime&amp;aid=([0-9]*))">([^<]*)</a></expression>
      </RegExp>
      <expression noclean="1"></expression>
      <!-- Only one Result -->
      <RegExp input="$$1" output="<entity><title>\1</title><url gzip="yes">\2</url></entity>" dest="5+">
        <expression repeat="no" noclean="1"><th class="field">Main Title</th>.....<td class="value">(.[^\n]*)....<a class="shortlink" href="(http.[^"]*)</expression>
      </RegExp>
    </RegExp>
  </GetSearchResults>
  <GetDetails dest="3">
    <RegExp input="$$8" output="<details>\1</details>" dest="3">
      <RegExp input="$$1" output="<title>\1</title>" dest="8">
        <expression repeat="yes"><th class="field">Main Title</th>.....<td class="value">(.[^\n]*)</expression>
      </RegExp>
      <RegExp input="$$1" output="<year>\1</year>" dest="8+">
        <expression trim="1" noclean="1"><th class="field">Year</th>.[^>]*>([^<]*)|$</expression>
      </RegExp>
      <RegExp input="$$1" output="<thumb>\1</thumb>" dest="8+">
        <expression><div class="image".[^"]*"(http.[^"]*)</expression>
      </RegExp>
      <RegExp input="$$1" output="<rating>\1</rating>" dest="8+">
        <expression>animevotes&amp;aid=[0-9]*">(.[^<]*)</expression>
      </RegExp>
      <RegExp input="$$1" output="<plot>\1</plot>" dest="8+">
        <expression>class="desc">(.*)</div></expression>
      </RegExp>
      <RegExp input="$$1" output="<episode>\1</episode>" dest="8+">
        <expression repeat="no"><td class="epno lastep">([0-9]+)</td></expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
    <RegExp input="$$10" output="<episodeguide>\1</episodeguide>" dest="3+">
      <RegExp input="$$1" output="<episode><title>\2</title><epnum>\1</epnum></episode>" dest="10">
        <expression repeat="yes"><td class="id eid"><a href.[^>]*>([0-9]+).*?label.[^>]*>(.[^<]*)</expression>
      </RegExp>
      <expression noclean="1"></expression>
    </RegExp>
  </GetDetails>
</scraper>
From what I can gather from glancing at the tvdb scraper source (I am SO adding this to the scraper Wiki once I understand it), the episode format is part of the "GetDetails" section (are these "sections" arbitrary?). The format seems to be as follows:
Code:
<episodeguide>
  <episode>
    <title>Title of this ep</title>
    <epnum>XX</epnum>
  </episode>
</episodeguide>
And it seems to go after <details>. At the moment, I think I'm doing this right, but once again, I can only check my regex with ScraperXML. Is there any way to see XBMC's output on a scrape run? Is there a scraper log flag I need to toggle?
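
In the meantime, here is a rough way to sanity-check the episode expression outside of ScraperXML. It is only a sketch in Python, not what XBMC actually does internally: it applies the same pattern as the episode RegExp above to a made-up HTML snippet and builds the <episodeguide> string from the matches. The sample HTML, the `build_episodeguide` helper, and the choice to let `.` match newlines are all my assumptions; the real AniDB markup and XBMC's regex engine may behave differently.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Same pattern as the repeat="yes" episode expression in the scraper.
# re.DOTALL is an assumption about how the scraper engine treats ".".
EPISODE_RE = re.compile(
    r'<td class="id eid"><a href.[^>]*>([0-9]+).*?label.[^>]*>(.[^<]*)',
    re.DOTALL,
)

# Hypothetical stand-in for an AniDB episode table, NOT real page markup.
SAMPLE_HTML = (
    '<tr><td class="id eid"><a href="animedb.pl?show=ep&amp;eid=1">1</a></td>'
    '<td class="title"><label title="x">First Episode</label></td></tr>\n'
    '<tr><td class="id eid"><a href="animedb.pl?show=ep&amp;eid=2">2</a></td>'
    '<td class="title"><label title="y">Second Episode</label></td></tr>\n'
)


def build_episodeguide(html):
    """Mimic output="<episode><title>\\2</title><epnum>\\1</epnum></episode>"."""
    episodes = "".join(
        "<episode><title>{}</title><epnum>{}</epnum></episode>".format(
            escape(m.group(2).strip()), m.group(1)
        )
        for m in EPISODE_RE.finditer(html)
    )
    return "<episodeguide>{}</episodeguide>".format(episodes)


guide = build_episodeguide(SAMPLE_HTML)

# Parsing the result confirms the generated fragment is well-formed XML.
root = ET.fromstring(guide)
for ep in root.findall("episode"):
    print(ep.findtext("epnum"), ep.findtext("title"))
```

If the parse step throws, or the list comes back empty, the expression (or my guess at the page markup) is the first thing to look at.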

