ScraperXML (Open Source XML Web Scraper C# Library) please help verify my work...
+- XBMC Community Forum (http://forum.xbmc.org)
+-- Forum: Development (/forumdisplay.php?fid=32)
+--- Forum: Scraper Development (/forumdisplay.php?fid=60)
+--- Thread: ScraperXML (Open Source XML Web Scraper C# Library) please help verify my work... (/showthread.php?tid=50055)
- spiff - 2009-05-20 13:23
- Nicezia - 2009-05-20 14:33
I can honestly say I couldn't have done it without you, spiff.
I did find one error in it: the function that I made to clean the output strips all the HTML entities, but not the nested tags. (I found this out because I tried to run a scraper I made with my library on XBMC; the debugging messages for scrapers really help.)
So I'm going to have to rewrite the function to remove tags,
so version 1.1 will be released soon.
And oh, I rewrote the Excalibur scraper, and it's working perfectly now, along with a few others that weren't working in XBMC for me.
I'm looking into info on POST data now. I haven't run into a scraper that uses it yet, but I might as well go ahead and make provisions for it... spoof is already supported.
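For readers following along, here is a minimal sketch of the kind of cleaning function described in this post: one that drops tags at any nesting depth and also decodes HTML entities. It is written in Python purely as an illustration and is not the ScraperXML code; the names `clean_html` and `_TextOnly` are made up for this example.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects only text nodes; every tag, nested or not, is ignored."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before
        # the text reaches handle_data.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def clean_html(fragment: str) -> str:
    """Hypothetical helper: strip all tags and decode entities."""
    parser = _TextOnly()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    # Collapse runs of whitespace left behind by the removed tags.
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    sample = "<div><b>Excalibur</b> &amp; <i>more <u>nested</u></i></div>"
    print(clean_html(sample))  # prints: Excalibur & more nested
```

Using a real parser rather than a single regular expression is one way to avoid the nested-tag problem, since the parser reports text regardless of how deeply it is wrapped.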
- spiff - 2009-05-20 15:12
allmusic uses post iirc. asiandb is a movie scraper that posts
- Nicezia - 2009-05-20 16:03
Post method is handled now. Are there any other methods of downloading pages supported in XBMC that I need to know about?
- spiff - 2009-05-20 16:05
nope, post and submit (i.e. no post) are the two forms we support
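A rough sketch of how a client library might tell the two forms apart when it builds a request. This is an illustration in Python, not XBMC's or ScraperXML's code. It assumes that in the post form the URL's query string is sent as the request body, and that "spoof" means sending a Referer header; both are assumptions made for this example. `build_request` and its parameters are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request


def build_request(url, post=False, spoof=None):
    """Hypothetical helper: build a GET or POST request for a scraper URL."""
    headers = {}
    if spoof:
        # Assumption: spoofing means presenting a chosen Referer.
        headers["Referer"] = spoof
    if not post:
        # Plain submit: fetch the URL as-is.
        return Request(url, headers=headers, method="GET")
    # Assumption: for post, the query string becomes the request body.
    parts = urlsplit(url)
    body = parts.query.encode("utf-8")
    bare = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    headers["Content-Type"] = "application/x-www-form-urlencoded"
    return Request(bare, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_request("http://example.com/search?q=excalibur", post=True)
    print(req.get_method(), req.full_url, req.data)
```

The request object is only constructed here, not sent, so the sketch can be checked without network access.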
- Nicezia - 2009-05-20 16:18
Good, now I can move on to supporting TVShow scrapers
Then music video scrapers
Then music scrapers
The hardest part is over; now it's just a matter of adding in the other functions and considerations for things other than movies.
After that I'm going to start making custom scrapers for games, comics, books, magazines
(I'm quite ambitious)
- spiff - 2009-05-20 16:31
i'd do mvids first as that's pretty much identical to movies, the online part that is. the other part is just another function that is fed the filename.
- Nicezia - 2009-05-20 16:39
All right, I'll go with music vids next.
Is there somewhere I could find a list of the standard functions in each scraper type and what the return data is supposed to look like? (I've already noticed TVShows has a completely different return format from movies in get search results.)
- spiff - 2009-05-20 16:55
the only documentation, as ever when i do stuff, is the source itself
the return format should be the same, maybe some different attributes, but it's all a basic <details><entity>..</entity></details> thingie. as "proof" of that, we have only one method that we use in XBMC itself.
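To make the "one method for every scraper type" point concrete, here is a small Python sketch of a single parser that reads any return value shaped like <details><entity>..</entity></details>. It is an illustration only, not XBMC's code. The child element names in the sample (`title`, `url`) and the sample data are assumptions made for this example; the thread does not list them.

```python
import xml.etree.ElementTree as ET


def parse_entities(xml_text):
    """Hypothetical helper: return one dict per <entity>, keyed by child tag."""
    root = ET.fromstring(xml_text)
    results = []
    for entity in root.iter("entity"):
        # Keep whatever child elements the scraper returned; the parser
        # does not need to know which scraper type produced them.
        results.append({child.tag: (child.text or "").strip() for child in entity})
    return results


if __name__ == "__main__":
    # Sample data is invented for illustration.
    sample = (
        "<details>"
        "<entity><title>Excalibur</title><url>http://example.com/1</url></entity>"
        "<entity><title>Another Title</title></entity>"
        "</details>"
    )
    for item in parse_entities(sample):
        print(item)
```

Because the parser keys on the child tags it finds, differing fields between movie and TV show results would not need a separate code path.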
- nul7 - 2009-05-20 17:36
Congrats on the release! Sorry I missed it last night.... ended up passing out during the movie. :/