Filmweb scraper

smuto Offline
Senior Member
Posts: 241
Joined: Sep 2004
Reputation: 2
Post: #31
can someone review this and then commit it to SVN?

smuto Offline
Senior Member
Posts: 241
Joined: Sep 2004
Reputation: 2
Post: #32
i get a script in front of every location url,

e.g.
details.1.html

how can i force the scraper to skip it?

smuto

smuto Offline
Senior Member
Posts: 241
Joined: Sep 2004
Reputation: 2
Post: #33
i added "spoof" to the url, maybe this helps.
you can test my wip scraper:

filmweb.xml_test

spiff Offline
Grumpy Bastard Developer
Posts: 12,187
Joined: Nov 2003
Reputation: 82
Post: #34
spoof is for setting the referer. it probably does the trick indeed. sorry for the late response
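
For readers unfamiliar with the term, "setting the referer" means sending a Referer header along with the HTTP request. Below is a minimal sketch of what such a request looks like on the wire. It is not how XBMC implements "spoof"; `BuildGetRequest` is a hypothetical helper, and the host and referer values in the usage note are only illustrative.

```cpp
#include <string>

// Hypothetical helper (not part of XBMC): compose a plain HTTP/1.1 GET request
// that carries a Referer header when one is given.
std::string BuildGetRequest(const std::string& host,
                            const std::string& path,
                            const std::string& referer)
{
  std::string request = "GET " + path + " HTTP/1.1\r\n";
  request += "Host: " + host + "\r\n";
  if (!referer.empty())
    request += "Referer: " + referer + "\r\n";  // the header that "spoof" would set
  request += "Connection: close\r\n\r\n";       // blank line ends the header section
  return request;
}
```

Calling `BuildGetRequest("www.filmweb.pl", "/details.1.html", "http://www.filmweb.pl/")` would produce a request whose headers include `Referer: http://www.filmweb.pl/`, which is presumably what lets the site serve the page directly.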

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
smuto Offline
Senior Member
Posts: 241
Joined: Sep 2004
Reputation: 2
Post: #35
maybe it's not an xbmc problem, but maybe you can help

recently, accented characters in the movie info from the filmweb scraper are shown as entities

ex.
latin small letter o with acute
ó -> &oacute;

is there a way to fix this?
smuto

spiff Offline
Grumpy Bastard Developer
Posts: 12,187
Joined: Nov 2003
Reputation: 82
Post: #36
hmm, it should convert those tags when you load the xml?
if not, make sure cleaning is performed on the field. the latter would remove them, though

smuto Offline
Senior Member
Posts: 241
Joined: Sep 2004
Reputation: 2
Post: #37
with or without "noclean" i still get the same result

ex
xbmc shows
który -> ktoacute;ry

in .xml from scrap.exe
który -> który

i really don't know what to do

i need to update SVN (a small fix to the url function link)
but for now, this one is good for testing entities:
filmweb.xml_test

a good test case is "Kingdom of the Crystal Skull":
the title tag is OK
the outline and plot tags are wrong

smuto Offline
Senior Member
Posts: 241
Joined: Sep 2004
Reputation: 2
Post: #38
for myself, i edited the source file HTMLUtil.cpp:
edited HTMLUtil.cpp
Code:
  strReturn.Replace("&ndash;", "-");
  strReturn.Replace("&oacute;", "ó");

it's working, but i hope you can help fix this for all polish users

smuto
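
The per-entity replaces above could be generalised into a small lookup table. The following is only a sketch under stated assumptions: it uses plain `std::string` rather than XBMC's own string class, it assumes the output should be UTF-8, and `ReplaceAll`, `DecodeEntities`, and the table contents are hypothetical and deliberately incomplete.

```cpp
#include <string>
#include <utility>
#include <vector>

// Replace every occurrence of `from` in `text` with `to`.
static void ReplaceAll(std::string& text, const std::string& from, const std::string& to)
{
  if (from.empty()) return;
  std::string::size_type pos = 0;
  while ((pos = text.find(from, pos)) != std::string::npos)
  {
    text.replace(pos, from.size(), to);
    pos += to.size();  // continue after the text that was just inserted
  }
}

// Hypothetical helper: decode a hand-picked set of named entities.
// Assumption: the caller wants UTF-8 output.
std::string DecodeEntities(std::string text)
{
  static const std::vector<std::pair<std::string, std::string>> table = {
    {"&ndash;",  "-"},
    {"&oacute;", "\xC3\xB3"},  // ó encoded as UTF-8
    {"&Oacute;", "\xC3\x93"},  // Ó encoded as UTF-8
    {"&amp;",    "&"},         // kept last so "&amp;oacute;" is not decoded twice
  };
  for (const auto& entry : table)
    ReplaceAll(text, entry.first, entry.second);
  return text;
}
```

With this table, `DecodeEntities("kt&oacute;ry")` yields "który" in UTF-8. Adding the remaining Polish letters would mean extending the table, not adding further replace calls.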

smuto Offline
Senior Member
Posts: 241
Joined: Sep 2004
Reputation: 2
Post: #39
i added fanart to the filmweb scraper

i use the polish wikipedia to map the filmweb id to the imdb id

we still have the problem with entities, i hope spiff finds time to help us

you can test the new scraper from here:
filmweb.xml_test_scraper

smuto
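
As a rough illustration of the filmweb-to-IMDb mapping idea, the sketch below pulls an IMDb title id out of a page's HTML. It assumes the Wikipedia article contains a link of the form `imdb.com/title/tt…`; that assumption, the helper name `FindImdbId`, and the id used in the usage note are all hypothetical. The actual scraper does this with regular expressions in its XML definition, not in C++.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: return the first IMDb title id found in `page`,
// or an empty string when the page contains no IMDb title link.
std::string FindImdbId(const std::string& page)
{
  // An IMDb title id is "tt" followed by digits.
  static const std::regex pattern("imdb\\.com/title/(tt[0-9]+)");
  std::smatch match;
  if (std::regex_search(page, match, pattern))
    return match[1].str();
  return std::string();
}
```

For example, a page containing `http://www.imdb.com/title/tt0123456/` (a made-up id) would give `tt0123456`, which could then be used to look up fanart.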

spiff Offline
Grumpy Bastard Developer
Posts: 12,187
Joined: Nov 2003
Reputation: 82
Post: #40
hi.

i see nothing wrong, nor any other way to handle this, so i just committed your replaces along with the new scraper. please use trac in the future :)

spiff
