Kodi Community Forum

Full Version: Python scraper
I've started a project similar to ScraperXML but in Python, and the goal is compatibility with Dharma+ addons.
However, all information about scraper development is kind of (well, that's an understatement) outdated, or perhaps I've missed something?

I'm trying to reverse engineer the ones that are included in the Dharma release, but I'm getting very confused. Is there -any- information on how the Dharma engine works with scrapers?

Perhaps a flowchart? :P
The code. See addons/Scraper.cpp and video/VideoInfoDownloader.cpp.
Oh, my C/C++ is very rusty. This will be interesting :P

-Z
I've put up a git repository on GitHub for the project. There's not much yet since I started today, but here it is anyway.

https://github.com/ztripez/pyScraper
OK, I've built an addon class that builds a stack with all functions from its addon and from its dependencies.

I have a couple of questions though:

* The buffer(s) have 20 slots. Is there a local buffer in every function, or is it one global?


* A snippet from tmdb.xml:
Quote:
<CreateSearchUrl dest="3">
  <RegExp input="$$1" output="<url>http://api.themoviedb.org/2.1/Movie.search/$INFO[language]/xml/57983e31fb435df4df77afb854740ea9/\1</url>" dest="3">
    <RegExp input="$$2" output="+\1" dest="4">
      <expression clear="yes">(.+)</expression>
    </RegExp>
    <expression noclean="1"/>
  </RegExp>
</CreateSearchUrl>

The basics are simple:
- Use buffer 1 as the source, do a regex replace on it with the output pattern, and put the result in buffer 3.

However, since there is a nested RegExp, should I run the regex on the parent buffer, and if so, should I do it before or after I've applied the parent's regex?
the buffers are global to the parser. if you dig a bit you'll see the 'clearbuffers=no' tag. that's a way to pass information between functions.

expressions are evaluated in a LIFO/depth-first fashion, i.e. dig into the deepest one and evaluate that first.
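
For anyone porting this to Python, here is a minimal sketch of the depth-first order described above, with one buffer list shared by every RegExp. It is not the Scraper.cpp logic: the helper names (render, evaluate, run_function), the example.invalid URL, the default input of $$1, and the fallback pattern (.*) for an empty expression are all assumptions made for illustration, and attributes such as clear, noclean, and $INFO[...] are ignored.

```python
import re
import xml.etree.ElementTree as ET

NUM_BUFFERS = 20  # slots $$1..$$20; index 0 is left unused


def render(output, match, buffers):
    # Substitute \N (match groups) and $$N (buffers) in a single pass.
    def repl(m):
        if m.group(1) is not None:
            n = int(m.group(1))
            return (match.group(n) or "") if n <= match.re.groups else ""
        return buffers[int(m.group(2))]
    return re.sub(r"\\(\d)|\$\$(\d+)", repl, output)


def evaluate(regexp_el, buffers):
    # Depth first: nested RegExp elements run before their parent,
    # so the parent can already read what the children wrote.
    for child in regexp_el.findall("RegExp"):
        evaluate(child, buffers)
    # Assumption: a missing input attribute means buffer 1.
    source = re.sub(r"\$\$(\d+)", lambda m: buffers[int(m.group(1))],
                    regexp_el.get("input", "$$1"))
    # Assumption: an empty or missing expression matches everything.
    pattern = regexp_el.findtext("expression") or "(.*)"
    match = re.search(pattern, source, re.DOTALL)
    if match is None:
        return  # no match: the destination buffer is left untouched
    dest = int(regexp_el.get("dest", "1"))
    buffers[dest] = render(regexp_el.get("output", r"\1"), match, buffers)


def run_function(function_el, buffers):
    # Evaluate each top-level RegExp, then return the function's dest buffer.
    for regexp_el in function_el.findall("RegExp"):
        evaluate(regexp_el, buffers)
    return buffers[int(function_el.get("dest", "1"))]


# Hypothetical scraper function, simplified from the tmdb.xml shape.
XML = r"""
<CreateSearchUrl dest="3">
  <RegExp input="$$1" output="http://example.invalid/search/\1$$4" dest="3">
    <RegExp input="$$2" output="+\1" dest="4">
      <expression>(.+)</expression>
    </RegExp>
    <expression/>
  </RegExp>
</CreateSearchUrl>
"""

buffers = [""] * (NUM_BUFFERS + 1)
buffers[1] = "Alien"  # search title
buffers[2] = "1979"   # year
url = run_function(ET.fromstring(XML), buffers)
```

Because the inner RegExp runs first, buffer 4 is already filled when the outer output is rendered. Evaluating the parent first would see an empty $$4.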
spiff Wrote:the buffers are global to the parser. if you dig a bit you'll see the 'clearbuffers=no' tag. that's a way to pass information between functions.
But if the buffers are global for the scraper, why is 'clearbuffers=no' needed? When do they get cleared?

spiff Wrote:expressions are evaluated in a LIFO/depth-first fashion, i.e. dig into the deepest one and evaluate that first.
Alright, I thought so, thanks.


Thanks for the info
-Z
by default, if that tag isn't set, you clear the buffers at the end of a function call (or well, somewhere before the next function is called, but logic-wise it's easiest to have it at the end of an evaluation).
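
A minimal Python sketch of that clearing rule, assuming clearbuffers is an attribute on the function element and that any value other than "no" means "clear". The Parser class, the callback signature, and the sample function bodies are hypothetical and only illustrate the behaviour described here, not the actual engine code.

```python
import xml.etree.ElementTree as ET

NUM_BUFFERS = 20  # slots 1..20; index 0 is left unused


class Parser:
    """Hypothetical parser that owns one set of buffers shared by all functions."""

    def __init__(self):
        self.buffers = [""] * (NUM_BUFFERS + 1)

    def call(self, function_el, body):
        # Run the function body against the shared buffers.
        result = body(function_el, self.buffers)
        # Default: wipe the buffers at the end of the call.
        # clearbuffers="no" keeps them alive for the next function.
        if function_el.get("clearbuffers", "yes") != "no":
            self.buffers = [""] * (NUM_BUFFERS + 1)
        return result


def remember_id(function_el, buffers):
    # Stand-in for a function that stores something for later use.
    buffers[5] = "tt0078748"
    return "ok"


def read_id(function_el, buffers):
    # Stand-in for a later function that reads the stored value.
    return buffers[5]


parser = Parser()
keep = ET.fromstring('<GetSearchResults dest="8" clearbuffers="no"/>')
default = ET.fromstring('<GetDetails dest="3"/>')

parser.call(keep, remember_id)            # buffers survive this call
carried = parser.call(default, read_id)   # sees buffer 5, then clears
after = parser.call(default, read_id)     # buffers were cleared
```

Here carried holds the value written by the first function, while after is empty because the second call used the default and cleared everything when it finished.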
Alright, thanks
Thanks for the info