Nicezia · (This post was last modified: 2009-05-18, 15:40 by Nicezia.)

I haven't got a clue. I'm still checking all my source code to see why it keeps recreating itself.
I haven't figured out a way to clear the function, and for whatever reason it keeps popping back up.

**spiff** · 2009-05-18, 15:42

that must mean that you get a match on that regexp for some reason. looking at the regexps, it might be you not respecting the "clear" keyword on the expression. clear means we shall clear the dest buffer no matter if the expression matches or not. if you do not respect that, we have a chain already stored in $$5 and shit hits the fan
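A minimal sketch of the "clear" behaviour described above, not the XBMC implementation. The helper name `apply_expression` and the dict-of-buffers layout are assumptions made for illustration; the point is only that the destination is wiped before the match is attempted, so a failed match cannot leave stale content behind.

```python
import re

def apply_expression(buffers, dest, pattern, text, output, clear=False):
    """Run one expression against `text` (hypothetical helper).

    `buffers` is a plain dict keyed by buffer number.
    """
    if clear:
        # "clear" wipes the destination whether or not the pattern matches.
        buffers[dest] = ""
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        return False
    # Overwrite the destination with the expanded output template.
    buffers[dest] = match.expand(output)
    return True

buffers = {5: "<chain>stale</chain>"}
apply_expression(buffers, 5, r"poster=(\w+)", "no poster here", r"\1", clear=True)
print(repr(buffers[5]))  # ''
```

Without `clear=True` the same failed match would leave the stale chain sitting in buffer 5.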

Nicezia · (This post was last modified: 2009-05-18, 16:48 by Nicezia.)

spiff Wrote:that must mean that you get a match on that regexp for some reason. looking at the regexps, it might be you not respecting the "clear" keyword on the expression. clear means we shall clear the dest buffer no matter if the expression matches or not. if you do not respect that, we have a chain already stored in $$5 and shit hits the fan

ok so the "clear" keyword clears the buffer whether it matches or not. the scraper tutorial gave me the wrong impression of what the clear keyword did, but i've found the problem now.

**spiff** · 2009-05-18, 17:01

yeah. sorry i haven't even read that tutorial, it was based on some early ramblings of mine, but other than that it's all third party :)

Nicezia · (This post was last modified: 2009-05-18, 18:45 by Nicezia.)

well i'm glad that got cleared up, i was thinking it was impossible to get around.
anyway, my problem with clear-on-fail was that i was checking for "regexp.match.group.count > 0", not knowing that it would make groups even if the match wasn't successful.
got to get out of the habit of lazy programming!
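The same trap can be shown in Python as a rough analogy (the library in this thread is not Python, so this is only an illustration of the idea): the number of groups is a property of the pattern, so it is greater than zero whether or not anything matched. Success has to be tested on the match itself.

```python
import re

# Pattern taken from the scraper XML in this thread.
pattern = re.compile(r"(.*?_SX[0-9]+_SY[0-9]+_.jpg)")
match = pattern.search("no poster url in this text")

# The group count comes from the pattern, not from the outcome of the search.
print(pattern.groups)      # 1
# This is the check that actually says whether the match succeeded.
print(match is not None)   # False
```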

Nicezia · (This post was last modified: 2009-05-19, 06:24 by Nicezia.)

Code:
<GetIMDBPoster dest="5">
    <RegExp input="$$8$$9$$10$$11" output="&lt;details&gt;&lt;thumbs&gt;\1&lt;/thumbs&gt;&lt;/details&gt;" dest="5">
        <RegExp input="$$1" output="\1_SX$INFO[imdbscale]_SY$INFO[imdbscale]_\2" dest="6">
            <expression noclean="1,2">&lt;a name=&quot;poster&quot;.*?src=&quot;(.*?)_S.*?(.jpg)&quot;.*?&lt;/a&gt;</expression>
        </RegExp>
        <RegExp input="$$6" output="&lt;thumb&gt;\1&lt;/thumb&gt;" dest="11">
            <expression clear="yes" noclean="1">(.*?_SX[0-9]+_SY[0-9]+_.jpg)</expression>
        </RegExp>
        <expression noclean="1"></expression>
    </RegExp>
</GetIMDBPoster>

okay, another question... this function clears the buffers, however it's asking for info from 4 of the buffers, info created from other functions, so when exactly is clearbuffers supposed to happen?

from this function's behaviour i would guess that at the beginning of a new function it checks the state of clearbuffers from the last function, and if clearbuffers was true for the last function it clears the buffers and then sets the state for the current function, am i right?

Code:
Function Process

1. Check clearbuffers
   1.a if clearbuffers="no", leave everything in the buffers intact
   1.b if clearbuffers="yes" or is not set, delete everything from all buffers
2. Either download the specific page referred to, or take the title and year and set that to $$1
3. Parse through the regular expressions
   execution starts from the first RegExp's deepest descendant
   check conditional
      if condition met...
         replace buffers on input
         if clear is set on the expression, clear the destination before execution
         replace buffers on expression before compile
         apply expression to text
            check if there are any matches
               if repeat....
               if noclean....
               if trim.....
         apply results to output
            replace buffers on output
            check whether to append or overwrite
      if condition not met
         do not execute regexp
4. Check output for custom function calls
   go over the same process above with custom functions
   check each custom function's output once more for any newly created custom function calls
5. Final results

This is the process of my parser as i have it so far. other than not being sure of where and when to clear buffers, or at what point in each function it reads this info, i think i have it licked. can you verify?
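Step 3 of the outline above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the XBMC parser or ScraperXML: `substitute`, `run_regexp` and `run_function` are hypothetical names, buffers are a plain dict, an empty expression is assumed to behave like `(.*)`, a trailing `+` on `dest` is assumed to mean append, and conditional, repeat, noclean, trim and `$INFO[...]` handling are left out.

```python
import re
import xml.etree.ElementTree as ET

def substitute(text, buffers):
    """Replace $$N references with the contents of buffer N."""
    return re.sub(r"\$\$(\d+)", lambda m: buffers.get(int(m.group(1)), ""), text)

def run_regexp(node, buffers):
    # Nested RegExp children run first, so execution starts at the deepest descendant.
    for child in node.findall("RegExp"):
        run_regexp(child, buffers)

    dest_attr = node.get("dest")
    append = dest_attr.endswith("+")       # assumption: "5+" appends, "5" overwrites
    dest = int(dest_attr.rstrip("+"))
    expr = node.find("expression")

    if expr.get("clear") == "yes":
        buffers[dest] = ""                 # cleared even if the match below fails

    source = substitute(node.get("input", "$$1"), buffers)
    pattern = substitute(expr.text or "(.*)", buffers)
    match = re.search(pattern, source, re.DOTALL)
    if match is None:
        return
    result = match.expand(substitute(node.get("output"), buffers))
    buffers[dest] = buffers.get(dest, "") + result if append else result

def run_function(xml_text, page):
    function = ET.fromstring(xml_text.strip())
    buffers = {1: page}                    # buffer 1 holds the downloaded data
    for regexp in function.findall("RegExp"):
        run_regexp(regexp, buffers)
    return buffers.get(int(function.get("dest")), "")

SCRAPER = r'''
<GetPoster dest="5">
  <RegExp input="$$6" output="&lt;thumbs&gt;\1&lt;/thumbs&gt;" dest="5">
    <RegExp input="$$1" output="&lt;thumb&gt;\1&lt;/thumb&gt;" dest="6">
      <expression clear="yes">src="(.*?\.jpg)"</expression>
    </RegExp>
    <expression></expression>
  </RegExp>
</GetPoster>
'''

print(run_function(SCRAPER, '<img src="poster.jpg">'))
# <thumbs><thumb>poster.jpg</thumb></thumbs>
```

The recursion before any work on the current node is what gives the "deepest descendant first" order: the inner RegExp fills buffer 6 before the outer one reads it as input.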

**spiff** · 2009-05-19, 10:03

spiff Wrote:yes on the first one.

the second one; depends on whether or not the calling function has clearbuffers="no" set. if it isn't set, clear buffers after function execution, if it is set do not and hence the next function should be called with the previous buffer state (excluding the first one which holds the data of course).

already explained. in particular note the AFTER. your stuff looks okay but i only had half an eye to spare
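The timing spiff describes can be sketched like this. It is an illustration only: `run_function` and the callable `body` are hypothetical, and the count of 20 buffers is an assumption. The buffers are cleared after the function has run, and only when that function does not carry clearbuffers="no"; buffer 1 is always overwritten with the new data for the next call.

```python
NUM_BUFFERS = 20  # assumption for this sketch

def run_function(buffers, data, body, clearbuffers=True):
    """Run one scraper function (hypothetical helper)."""
    buffers[1] = data              # buffer 1 always holds this call's input data
    result = body(buffers)
    if clearbuffers:
        # Clearing happens AFTER execution, not at the start of the next function.
        for n in buffers:
            buffers[n] = ""
    return result

buffers = {n: "" for n in range(1, NUM_BUFFERS + 1)}

def get_poster(b):
    b[8] = "<thumb>a.jpg</thumb>"
    return b[8]

def get_details(b):
    # Sees what get_poster left in buffer 8, plus its own data in buffer 1.
    return b[8] + b[1]

run_function(buffers, "page one", get_poster, clearbuffers=False)
print(run_function(buffers, "page two", get_details))  # <thumb>a.jpg</thumb>page two
print(buffers[8] == "")                                # True
```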

Nicezia · (This post was last modified: 2009-05-20, 02:58 by Nicezia.)

Just wanted to inform everyone that ScraperXML version 1.0 will hit the svn repository tonight at midnight (Central US time), it has full support for all XBMC Movie Scrapers!

There will also be a dll release for windows, and soon to follow after, a mono-library

Nicezia · 2009-05-20, 07:45

30 minutes late but it's up

Schenk2302 · (This post was last modified: 2009-05-20, 13:11 by Schenk2302.)

Nicezia Wrote:30 minutes late but its up

Okay, found it !!!

**spiff** · 2009-05-20, 13:23

congrats!

Nicezia · (This post was last modified: 2009-05-20, 14:44 by Nicezia.)

I can honestly say i couldn't have done it without you spiff,

i did find one error in it: the function that i made to clean is getting out all the html entities, but not the nested tags (found this out because i tried to run a scraper i made with my library on XBMC; the debugging messages for scrapers really help).

so i'm going to have to rewrite the function to remove tags

so version 1.1 soon to be released
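For illustration, a cleaning pass that handles both tags and entities could look something like the sketch below. This is a guess at the general shape, not the ScraperXML function or XBMC's own cleaning code; `clean` is a hypothetical name, and a regex-based tag stripper like this will not cope with every malformed page.

```python
import html
import re

TAG = re.compile(r"<[^>]*>")

def clean(text):
    """Strip tags first, then decode entities, then trim whitespace."""
    # Removing tags before decoding keeps text such as "&lt;b&gt;" from being
    # mistaken for markup once it has been turned back into "<b>".
    without_tags = TAG.sub("", text)
    return html.unescape(without_tags).strip()

print(clean("<p><b>Excalibur</b> &amp; <i>friends</i></p>"))  # Excalibur & friends
```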

And oh, i rewrote the Excalibur scraper, and it's working perfectly now, along with a few others that weren't working in XBMC for me.
Looking into info on post data now. haven't run into a scraper that uses it yet, but I might as well go ahead and make provisions for it... spoof is already supported.

**spiff** · 2009-05-20, 15:12

allmusic is a poster iirc. asiandb is a movie scraper posting

Nicezia · 2009-05-20, 16:03

Post method handled now. any other methods of downloading pages i need to know about that are supported in XBMC?

**spiff** · 2009-05-20, 16:05

nope, post and submit (i.e. no post) are the two forms we support
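A rough sketch of handling the two forms with Python's standard library. The details are assumptions, not confirmed in this thread: that "submit" means a plain GET, that in post mode the URL's query string is sent as the request body, and that spoof maps to a Referer header. `build_request` is a hypothetical helper, and nothing is sent over the network here.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request

def build_request(url, post=False, spoof=None):
    """Build (but do not send) a request for a scraper URL."""
    headers = {"Referer": spoof} if spoof else {}   # assumption: spoof = referer
    if not post:
        return Request(url, headers=headers)        # no post: ordinary GET
    parts = urlsplit(url)
    # Assumption: the query string moves from the URL into the POST body.
    bare_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return Request(bare_url, data=parts.query.encode("ascii"), headers=headers)

req = build_request("http://example.com/search?title=Excalibur", post=True)
print(req.get_method(), req.full_url, req.data)
# POST http://example.com/search b'title=Excalibur'
```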