ScraperXML (Open Source XML Web Scraper C# Library) please help verify my work...

Nicezia Offline
Fan
Posts: 369
Joined: Nov 2006
Reputation: 0
Location: Montgomery, Alabama
Post: #1
Just to make sure I'm getting this right (don't blast me, I am just trying to verify my work):

Code:
scraper $$(20)  // array of 20 strings
|
|
Function  // 9 string fields (the info is compiled to XML format and sent
     |    // back as a single string to one of the 20 buffers)
     |
     |
     Regular Expression  // sends the info back to the function arrays
         |
         |
         Expression  // makes matches for each field and sends each field
                     // 1-9 back to the RegExp as an array
Am I understanding this right? From the expression we get a results-determined number of 9-string arrays (var[?][8]), which the RegExp compresses into a single string (var[8]). That string is sent to the function in one of 9 possible string variables (var = single string), one collected from each RegExp. The function then compresses these 9 fields into a single string, which is sent to one of the twenty scraper buffers. And at the end of a function, clearbuffers (if set) clears the 9 function fields?

(This post was last modified: 2009-06-12 15:47 by Gamester17.)
Nicezia Offline
Fan
Posts: 369
Joined: Nov 2006
Reputation: 0
Location: Montgomery, Alabama
Post: #2
Never mind, I just figured the whole thing out. It seems I was thinking about it in the wrong way.

However, does the override option make the regexp ignore culture-specifics?
(This post was last modified: 2009-05-01 05:30 by Nicezia.)
spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #3
i do not understand what you mean.
Nicezia Offline
Fan
Posts: 369
Joined: Nov 2006
Reputation: 0
Location: Montgomery, Alabama
Post: #4
I suppose it would be easier just to ask what the override option does, because I haven't got a clue. I'm guessing it has something to do with the regular expression engine, but I'm not quite clear on what it sets the regular expression engine to do.
spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #5
there is no override option?
Nicezia Offline
Fan
Posts: 369
Joined: Nov 2006
Reputation: 0
Location: Montgomery, Alabama
Post: #6
From the Scraper.xml wiki:

Quote: conditional="<condition>": A condition that must resolve to TRUE for the particular RegExp to be run. Currently the only available condition is "override", which is set based on the Language Override setting in the scraper.

Can you point me to the code that handles this function in XBMC?
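
One possible reading of the wiki passage quoted above, as a minimal Python sketch rather than XBMC's actual code: a RegExp runs only when its conditional names a setting that is currently true. The function name `should_run` and the plain dictionary of settings are assumptions made for illustration.

```python
def should_run(conditional, settings):
    """Decide whether a RegExp element should be run.

    `conditional` is the value of the conditional attribute, or None when
    the attribute is absent. `settings` is a hypothetical dict mapping
    scraper setting names to booleans.
    """
    if not conditional:
        # No condition on the RegExp: it always runs.
        return True
    # The named setting must resolve to true; unknown settings count as false.
    return bool(settings.get(conditional, False))


# Hypothetical settings, as if the Language Override option were switched on.
settings = {"override": True}
runs_with_override = should_run("override", settings)
runs_without_setting = should_run("override", {})
runs_unconditionally = should_run(None, {})
```

Treating a missing setting as false is a choice made for this sketch; the thread does not say how XBMC handles that case.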
spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #7
aha.

those are scraper settings. they are the stuff returned from the <GetSettings> scraper function.

see PluginSettings.cpp (CBasicSettings) and ScraperSettings.cpp.

they are also used with the $INFO[settingname] construct
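
a rough sketch of how the $INFO[settingname] construct could be expanded, written in Python for illustration only and not taken from the XBMC source. the helper name `expand_info`, the dictionary of settings, and the choice to replace unknown settings with an empty string are all assumptions.

```python
import re

# Matches $INFO[name] and captures the setting name between the brackets.
INFO_PATTERN = re.compile(r"\$INFO\[([^\]]+)\]")


def expand_info(text, settings):
    """Replace every $INFO[name] in `text` with the value of that setting.

    `settings` is a hypothetical dict of scraper settings. Unknown names
    expand to an empty string, which is an assumption of this sketch.
    """
    def lookup(match):
        return str(settings.get(match.group(1), ""))

    return INFO_PATTERN.sub(lookup, text)


# Hypothetical setting name and URL, for illustration only.
expanded = expand_info("http://example.com/search?lang=$INFO[language]",
                       {"language": "en"})
```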
(This post was last modified: 2009-05-03 14:15 by spiff.)
Nicezia Offline
Fan
Posts: 369
Joined: Nov 2006
Reputation: 0
Location: Montgomery, Alabama
Post: #8
Oh OK, perhaps that's why it didn't make sense to me. I haven't tackled the whole custom function thing yet.
Nicezia Offline
Fan
Posts: 369
Joined: Nov 2006
Reputation: 0
Location: Montgomery, Alabama
Post: #9
By the way, I have it running most scrapers. I ported it to MonoDevelop and compiled it, and it works the same on both Mono and .NET. Currently it's console only. (I separated it from the UI because the UI was kind of distracting me from coding the damn thing: I'd code a bit, and then slip over to the UI to consider how to integrate that code into it.)

I had asked a question, but once I looked into plugin settings that was answered for me :)
(This post was last modified: 2009-05-05 01:59 by Nicezia.)
spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #10
great! :)

i would put a lot of effort into having your parser work as a library if i were you. it will make it a lot more useful, in particular i sincerely hope that stuff like MIP will pick it up.
Gamester17 Offline
Team-XBMC Forum Moderator
Posts: 10,523
Joined: Sep 2003
Reputation: 9
Location: Sweden
Post: #11
If you open source this code/library and could pitch this concept in a better way to MediaPortal and MeediOS developers, then we could all use the same scrapers and share the development of them. Check out:
http://forum.team-mediaportal.com/improv...ing-35312/
and:
http://www.meedios.com/forum/viewtopic.php?t=2238

In the same way, all other open source media managers that are .NET-based could use this as well:
http://forum.xbmc.org/tags.php?tag=media+manager

Possibly making this scraper method (as well as XBMC's NFO formatting) the open standard for all open source media center and media management applications 8)

(This post was last modified: 2009-05-05 17:14 by Gamester17.)
Nicezia Offline
Fan
Posts: 369
Joined: Nov 2006
Reputation: 0
Location: Montgomery, Alabama
Post: #12
spiff Wrote: great! :)

i would put a lot of effort into having your parser work as a library if i were you. it will make it a lot more useful, in particular i sincerely hope that stuff like MIP will pick it up.



I think that's going to be the only way, seeing as MonoDevelop has absolutely no support for creating a GUI with .NET except in C#, which I don't know the slightest bit about.

As far as the console module goes, all I have to finish up at the moment is my handling of custom functions (which I'm working on now), and then go back to account for error handling and streamline the code.
Nicezia Offline
Fan
Posts: 369
Joined: Nov 2006
Reputation: 0
Location: Montgomery, Alabama
Post: #13
Gamester17 Wrote: If you open source this code/library and could pitch this concept in a better way to MediaPortal and MeediOS developers, then we could all use the same scrapers and share the development of them. Check out:
http://forum.team-mediaportal.com/improv...ing-35312/
and:
http://www.meedios.com/forum/viewtopic.php?t=2238

In the same way, all other open source media managers that are .NET-based could use this as well:
http://forum.xbmc.org/tags.php?tag=media+manager

Possibly making this scraper method (as well as XBMC's NFO formatting) the open standard for all open source media center and media management applications 8)

Not a bad idea at all; I hadn't even considered that. I was considering integrating this into a catalog manager that I wrote, which handles books, comics, movies, and TV shows. The only thing with it so far is that all info has to be entered manually (except for movies, which use the theMovieDB API).

But it would definitely be a good idea for everyone to have a unified and forward-thinking scraper library.

Since my end goal is making an editor for the ScraperXMLs (and now that I have a better understanding of how these things work, that goal seems simpler than it did last week), maybe that would calm the people who are saying it's too difficult to program for... (And I actually disagree with that now, considering that I have had no formal programming training and have only been teaching myself programming for about 11 months.)
(This post was last modified: 2009-05-06 06:33 by Nicezia.)
Nicezia Offline
Fan
Posts: 369
Joined: Nov 2006
Reputation: 0
Location: Montgomery, Alabama
Post: #14
Ummm, I kind of coded it this way without thinking about it, but just to make sure... the function dest only tells XBMC where to look for the information that was gathered by its root RegExp element... right?

I was just looking back through my code and checking for errors when I realized there was a skip in the chain between the RegExp and the function, but I was using the function dest to pull the info to send, either to get a page or to pull results (based on which function is in play).

And the last thing I need to verify is... while a custom function is running, does it use a separate buffer space from the main one, with a completely fresh set of buffers $$1-$$20?
spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #15
yes on the first one.

the second one; depends on whether or not the calling function has clearbuffers="no" set. if it isn't set, clear buffers after function execution, if it is set do not and hence the next function should be called with the previous buffer state (excluding the first one which holds the data of course).
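
the clearbuffers rule above could be sketched roughly like this, in Python for illustration only and not lifted from XBMC. the class `ScraperBuffers`, the function `run_function`, and the two example callables are hypothetical names; the only things taken from the thread are the twenty buffers, the first buffer holding the data, and clearing after execution unless clearbuffers="no" is set.

```python
BUFFER_COUNT = 20  # the thread describes buffers $$1-$$20


class ScraperBuffers:
    """Hypothetical container for the scraper buffers; index 0 stands for $$1."""

    def __init__(self):
        self.values = [""] * BUFFER_COUNT

    def clear(self):
        self.values = [""] * BUFFER_COUNT


def run_function(buffers, page_data, body, clearbuffers=True):
    """Run one scraper function against the given buffers.

    `body` is a callable standing in for the function's RegExp work.
    Passing clearbuffers=False models clearbuffers="no".
    """
    buffers.values[0] = page_data  # the first buffer always holds the new data
    result = body(buffers)
    if clearbuffers:
        buffers.clear()  # default: wipe the buffers after execution
    return result


def remember_title(buffers):
    # Stores something in $$2 so a later function can see it.
    buffers.values[1] = "cached title"
    return buffers.values[0].upper()


def read_title(buffers):
    # Reads whatever the previous function left in $$2.
    return buffers.values[1]


buffers = ScraperBuffers()
first = run_function(buffers, "page one", remember_title, clearbuffers=False)
second = run_function(buffers, "page two", read_title)
```

in this sketch the second call still sees what the first one left in $$2 because the first call kept its buffers, while $$1 is overwritten with the new data either way.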