Unable to connect to remote server
#1
I'm trying to write a scraper to get movie information from the Grindhouse Database. While I like the site, this project is mainly so I can learn scraper development. Whenever I have the scraper try to scan a directory I keep getting an error about it not being able to connect to the remote server, so I'm wondering if I overlooked something.

Here is the code for grindhousedb.xml
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<scraper date="2013-06-16" framework="1.1">

    <NfoUrl dest="3">
        <RegExp input="$$1" output="&lt;url&gt;http://www.grindhousedatabase.com/index.php/\1&lt;/url&gt;" dest="3">
            <expression noclean="1">grindhousedatabase.com/index.php/(.*?)</expression>
        </RegExp>
    </NfoUrl>

    <CreateSearchUrl dest="3">
        <RegExp input="$$1" output="&lt;url&gt;http://www.grindhousedatabase.com/index.php/Special:Search?search=\1&amp;fulltext=Search&lt;/url&gt;" dest="3">
            <expression noclean="1" />
        </RegExp>
    </CreateSearchUrl>

    <GetSearchResults dest="8">
        <RegExp input="$$5" output="&lt;results&gt;\1&lt;/url&gt;" dest="8">
            <RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\2&lt;/title&gt;&lt;url&gt;http://www.grindhousedatabase.com/index.php/\1&lt;/url&gt;&lt;/entity&gt;" dest="5">
                <expression repeat="yes">&lt;div class=&apos;mw-search-result-heading&apos;&gt;&lt;a href=&quot;index.php/(.*?)&quot; title=&quot;(.*?)&quot;&gt;(.*?)&lt;/a&gt;</expression>
            </RegExp>
            <expression clear="yes" noclean="1" />
        </RegExp>
    </GetSearchResults>

    <GetDetails dest="3">
        <RegExp input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">

        <!-- Title -->
            <RegExp input="$$1" output="&lt;title&gt;\1&lt;/title&gt;" dest="5">
                <expression>&lt;h1 class=&quot;firstHeading&quot;&gt;(.*?)&lt;/h1&gt;</expression>
            </RegExp>

            <expression clear="yes" noclean="1" />
        </RegExp>
    </GetDetails>
</scraper>

I only have the title under details at the moment because I want to make sure I can get it to connect to the server first. After that hurdle is passed, then I'll work on the rest of the details.

The code for the addon.xml file is below:
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<addon id="metadata.grindhousedatabase.com"
       name="Grindhouse Database"
       version="0.0.2"
       provider-name="lafnlab">
  <requires>
    <import addon="xbmc.metadata" version="1.0"/>
  </requires>
  <extension point="xbmc.metadata.scraper.movies"
             language="en"
             library="grindhousedb.xml"/>
  <extension point="xbmc.addon.metadata">
    <summary lang="en">Grindhouse Database Movie Scraper</summary>
    <description lang="en">Download movie information from the Grindhouse Database. The database is primarily concerned with films that would have played in "grindhouse" theaters in the U.S. during the 60s, 70s, and 80s - a time considered to be a "golden age" of filmmaking.</description>
    <platform>all</platform>
  </extension>
</addon>
#2
(2013-06-16, 19:50)lafnlab Wrote: <RegExp input="$$5" output="&lt;results&gt;\1&lt;/url&gt;" dest="8">

The output opens with <results> but closes with </url>, so the tags don't match. (Might be the source of your problems.)
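
For reference, a sketch of the block with the closing tag matched to the opening one. Only the output attribute of the outer RegExp changes; everything else is copied from your post, and I haven't tested it against the site:

```xml
<GetSearchResults dest="8">
    <!-- Outer RegExp: wraps the collected entities; closing tag is now </results> -->
    <RegExp input="$$5" output="&lt;results&gt;\1&lt;/results&gt;" dest="8">
        <RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\2&lt;/title&gt;&lt;url&gt;http://www.grindhousedatabase.com/index.php/\1&lt;/url&gt;&lt;/entity&gt;" dest="5">
            <expression repeat="yes">&lt;div class=&apos;mw-search-result-heading&apos;&gt;&lt;a href=&quot;index.php/(.*?)&quot; title=&quot;(.*?)&quot;&gt;(.*?)&lt;/a&gt;</expression>
        </RegExp>
        <expression clear="yes" noclean="1" />
    </RegExp>
</GetSearchResults>
```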
#3
The best way to debug an error like this is to turn on debug logging in the system settings.

You can then find the error in ~/.xbmc/temp/xbmc.log
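
If the settings screen is awkward to reach, debug logging can also be switched on from a settings file. This is a minimal sketch, assuming the usual Linux profile location (~/.xbmc/userdata/advancedsettings.xml) and that a loglevel of 1 means debug logging in your XBMC version; check the wiki for your release before relying on it:

```xml
<!-- ~/.xbmc/userdata/advancedsettings.xml (assumed location on Linux) -->
<advancedsettings>
    <!-- 1 is assumed to enable debug-level logging; restart XBMC after editing -->
    <loglevel>1</loglevel>
</advancedsettings>
```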
#4
Thanks for the catch, scudlee. I changed the closing tag to </results>, but it didn't help; I got the same error.

According to the log, for each video it says:

ERROR: Parse: Could not find scraper function CreateSearchUrl
ERROR: Run: Unable to parse web site
WARNING: No information found for item '/home/gagarin/Videos/Grindhouse/name', it won't be added to the library.
#5
Double-check the name of your scraper xml. Make sure it matches exactly what you put in your addon.xml (e.g. no hidden .txt extension).
#6
I found it. Part of the problem was that I had two directories, one under ~/.xbmc/addons and one on the Desktop. I was working from the Desktop folder, but not copying the changes to the addons folder. I've since deleted the folder from the Desktop and replaced it with a link to the one in addons.

Aside from that, the second line at the beginning of the grindhousedb.xml file (in the .xbmc/addons dir) had an extraneous / at the end, which messed up parsing. I just ran the corrected scraper without any errors. Now, I'll have to finish the scraper to get the movie details.
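
To illustrate that second problem: an extra / at the end of the opening scraper tag makes it a self-closing, empty element, so the function elements that follow are no longer inside it. That would be consistent with the "Could not find scraper function CreateSearchUrl" error. The broken line below is my reconstruction for illustration only (the faulty copy was never posted); the corrected line matches the file in the first post.

```xml
<!-- Broken (assumed reconstruction): the trailing / closes the scraper element immediately -->
<!-- <scraper date="2013-06-16" framework="1.1"/> -->

<!-- Corrected: the element stays open until </scraper> at the end of the file -->
<scraper date="2013-06-16" framework="1.1">
    <!-- NfoUrl, CreateSearchUrl, GetSearchResults and GetDetails go here -->
</scraper>
```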

Thanks for your help, everyone.