Kodi Community Forum


can we please please provide an alternate way to write scrapers - pathw - 2011-03-12

I've spent hours trying to tweak the AniDB scraper, jumping through hoops with the crazy XML and regex DSL to do some of the complex stuff that's required.

I constantly find myself faced with a problem and having to attempt a dozen different solutions before finding one that works, simply because most of the usual ways of solving a problem are not viable in this DSL.

I understand that scrapers were probably designed this way either to be easy to write or to sandbox them. But it no longer serves the purpose of ease.

Firstly, manipulating XML streams with regex is really hard, and it's sad that we don't have DOM methods. Secondly, there is so much logic in some of our scrapers that would be much better expressed in a general-purpose language.
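
To illustrate the difference, here is a minimal sketch in Python (picked only for illustration). The XML snippet, the element and attribute names, and both function names are made up for this example; they are not the real AniDB response format or anything from an existing scraper.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample document, NOT the real AniDB response format.
SAMPLE = """<anime id="1">
  <titles>
    <title type="main" lang="x-jat">Rei no Bangumi</title>
    <title type="official" lang="en">Example Show</title>
  </titles>
</anime>"""


def english_title_regex(xml_text):
    # Regex approach: only works while attribute order, quoting and
    # spacing stay exactly as written in the pattern.
    match = re.search(r'<title type="official" lang="en">([^<]*)</title>', xml_text)
    return match.group(1) if match else None


def english_title_dom(xml_text):
    # Tree approach: parse once, then select by element name and
    # attribute values, regardless of attribute order.
    root = ET.fromstring(xml_text)
    for title in root.iter("title"):
        if title.get("type") == "official" and title.get("lang") == "en":
            return title.text
    return None


# Same data, but with the two attributes swapped.
REORDERED = SAMPLE.replace('type="official" lang="en"', 'lang="en" type="official"')

print(english_title_regex(SAMPLE))     # Example Show
print(english_title_dom(SAMPLE))       # Example Show
print(english_title_regex(REORDERED))  # None
print(english_title_dom(REORDERED))    # Example Show
```

The regex version silently stops matching as soon as the attributes are reordered, while the tree-based version keeps working. That kind of fragility is what I mean.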

Right now it's almost torture to fix bugs in scrapers. I'm not suggesting dropping the current scraper technology, but how about adding the ability to write scrapers in a more general-purpose scripting language?

If sandboxing is an issue, we could bundle a JavaScript or Lua runtime into the system. Either way, I think this would bring a lot of sanity to the scraper scripts. So many of the subtle bugs that scrapers have are actually artifacts of the technology choice. I get a lot of false negatives and false positives that I just cannot fix because I'm unable to make my scraper smarter.
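
As a rough idea of what "smarter" could look like in a general-purpose language, here is a small Python sketch of fuzzy title matching with a similarity threshold. The function names, the 0.8 threshold and the sample titles are all assumptions made up for illustration, not part of any existing scraper.

```python
from difflib import SequenceMatcher


def normalize(title):
    # Lowercase the title and replace anything that is not a letter or
    # digit with a space, then collapse runs of whitespace.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())


def best_match(query, candidates, threshold=0.8):
    # Return the candidate most similar to the query, or None when no
    # candidate reaches the (hypothetical) similarity threshold.
    best_title, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, normalize(query), normalize(candidate)).ratio()
        if score > best_score:
            best_title, best_score = candidate, score
    return best_title if best_score >= threshold else None


# A file name with dots instead of spaces still finds its title.
print(best_match("example.show", ["Example Show", "Another Series"]))
```

Raising the threshold trades false positives for false negatives and lowering it does the opposite, which is exactly the kind of tuning I can't do with the current DSL.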

thanks


- jmarshall - 2011-03-12

We have python available already, so using that instead is a reasonable option. A patch would be welcome.

Cheers,
Jonathan