ScraperXML (Open Source XML Web Scraper C# Library) please help verify my work...
+- XBMC Community Forum (http://forum.xbmc.org)
+--- Thread: ScraperXML (Open Source XML Web Scraper C# Library) please help verify my work... (/showthread.php?tid=50055)
Lmfao - Nicezia - 2009-05-30 16:42
I finally boned up enough on my C++ to be able to follow the flow of XBMC's ScraperParser, and now I've run into 3 curiosities.
Correct me if I'm wrong, but from the source code it looks to me like clean cleans the buffers, not the regular expression match strings.
That might be my whole problem, as I'm cleaning the matches before applying them to the output, rather than cleaning the input before matching the regular expression.
And one last thing: does the trim option remove ALL whitespace, leading and trailing whitespace, just leading whitespace, or just trailing whitespace?
Don't answer just yet. I'm still examining the XBMC source, so I might figure out the answer myself.
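To make the distinction between the two orderings concrete, here is a small sketch. It is written in Python for brevity, not in XBMC's C++ or the library's C#. `strip_tags`, `clean_after_match` and `clean_before_match` are hypothetical names, tag stripping is only an assumed stand-in for whatever clean really does, and the sketch makes no claim about which ordering XBMC actually uses.

```python
import re

# Assumed stand-in for "clean": remove anything that looks like an HTML tag.
TAG = re.compile(r"<[^>]+>")


def strip_tags(text):
    return TAG.sub("", text)


def clean_after_match(pattern, buffer):
    """Match against the raw buffer, then clean the captured group."""
    m = re.search(pattern, buffer)
    return strip_tags(m.group(1)) if m else None


def clean_before_match(pattern, buffer):
    """Clean the whole buffer first, then match against the cleaned text."""
    m = re.search(pattern, strip_tags(buffer))
    return m.group(1) if m else None


buffer = "<td><b>Title:</b> <i>Alien</i></td>"
pattern = r"<b>Title:</b>\s*(.*?)</td>"

print(clean_after_match(pattern, buffer))   # the capture is cleaned afterwards
print(clean_before_match(pattern, buffer))  # the tags the pattern relies on are gone
```

With an expression that relies on tags, the second ordering finds nothing, so a scraper written for one ordering can silently fail under the other.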
All right, everything figured out now - Nicezia - 2009-05-31 14:37
Version 4.0 should be up sometime before midnight. I've figured everything out now and just finished testing all the changes. As soon as I optimize the code and add logging, all XBMC features, and everything that XBMC scrapes, will be fully supported. Of course, this one is also not a drop-in replacement for the older versions.
- DonJ - 2009-05-31 19:37
Good to hear!
- Nicezia - 2009-05-31 21:38
Good to actually be at this point, finally. The scrapers-for-dummies tutorial was a little misleading as to when and where things happen, and a little bit wrong about what some attributes actually do (though close enough to allow a working scraper to be made by following it), but once I got to a point where I could read the XBMC source code, it all became clear.
The next step is letting other people mess around with it, make suggestions, and point out bugs.
I've tried to make it as easy as possible to use. Each type of scraper takes only two commands: one to retrieve the search results and one to retrieve the details. The exception is the TV scrapers, which take three commands: two to get info about the series and one to get episode details (with an optional episode list update to retrieve new episode info).
Later I'm going to add a relevance feature so that results can be auto-selected, but I need a minute's break from this library.
I'm looking forward to suggestions and criticism, and I'm getting started on a scraper tester tomorrow (though that's partially done in the program that I've developed to test the library) for both .NET and Mono.
- Nicezia - 2009-05-31 22:05
Still one question about the trim option: I can't figure out whether it removes ALL whitespace, just leading, just trailing, or both leading and trailing whitespace.
Most of my results look okay, but I'm still not sure I have that function right. For the moment I have it removing leading and trailing whitespace.
- jmarshall - 2009-06-01 01:56
Leading and trailing I should think, yes.
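For reference, the leading-and-trailing behaviour suggested above can be sketched as follows. This is an illustration in Python rather than XBMC's C++ or the library's C#, and `trim_match` is a made-up name, not a function from either code base.

```python
def trim_match(value: str) -> str:
    """Remove leading and trailing whitespace; keep whitespace inside the text."""
    return value.strip()


# Only the ends are trimmed; the spaces between the words are left alone.
print(repr(trim_match("  \n The Dark Knight \t ")))
```

Removing ALL whitespace instead would run the words of a title together, which is probably why trimming only the ends gives results that "look okay".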
4.0 Uploaded - Nicezia - 2009-06-01 07:47
Well, it's up.
The Good: Everything works for me. I tested all the scrapers on it, and it retrieves all the info asked for.
The Bad: I haven't finished adding the logging yet. There are sure to be some bugs when running on other systems.
The Ugly: There are no programs up to demonstrate its usage, though I'm working on that and should be done by this time tomorrow. The program I'm working on can also double as a scraper tester (a pure console program) and won't necessarily use the objects I've made for GUI programs (which are mostly objects to save and restore settings at runtime).
- Nicezia - 2009-06-01 10:52
Just a little note: Adultcdmovies doesn't exist anymore, so you might want to remove it from the svn.
- zag - 2009-06-01 12:25
Great work, looking forward to some kind of test app just to show how it works to normal users like in v1.0
Also do you have a list of the working XBMC scrapers? I've had a look at the Wiki and SVN but couldn't find anything.
EDIT: I guessed the names from the svn and posted a list here
If the XBMC scrapers do become the standard for other HTPC apps, would people want a site that collects them all in one place and shows if they are working/broken?
- Nicezia - 2009-06-01 13:19
zag2me Wrote: Great work, looking forward to some kind of test app just to show how it works to normal users like in v1.0
I'm currently uploading a few edits and a console test app that I've just finished up. It displays the function calls. As for the actual GUI usage of the settings, I still have a tiny bit of work to do on it; at the moment, all scrapers will only run with default settings.