Filmweb scraper

haken Offline
Member
Posts: 59
Joined: Jan 2008
Reputation: 0
Location: Poland - Krakow
Post: #41
@smuto: There are some problems with titles that start with numbers, e.g. "1410" or "27 Dresses": the leading numbers are cut off. Fanart support is really great.
I hope an XBMC build with the edited HTMLutil.cpp will be ready soon. In the meantime, could you put your compiled XBMC default.xbe at smuto.w.interia.pl? That would be great for me, because I want to rescan my movie library, and Polish plots with no entity problems are exactly what I am looking for.
haken Offline
Member
Posts: 59
Joined: Jan 2008
Reputation: 0
Location: Poland - Krakow
Post: #42
Even though entities were fixed in changeset 15625, it seems that the "oacute problem" still exists (I checked the filmweb scraper on XBMC builds 15640 and 15728). Smuto, do you agree?
smuto Offline
Senior Member
Posts: 241
Joined: Sep 2004
Reputation: 2
Post: #43
@haken - you just need to update the scraper
filmweb.xml

@spiff
Quote: I see nothing wrong, nor any other way to handle this, so I just committed your replaces.
But this is not a good idea: "oacute" and "ndash" are only the most common ones.
This means I would have to add every entity to the replaces.
Next in my queue are:
Code:
strReturn.Replace("&nbsp;", "");
strReturn.Replace("&rsquo;", "'");
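
Instead of one Replace call per entity, the named entities could be looked up in a table. The following is only a minimal sketch in standard C++, not the actual HTMLutil.cpp code from XBMC. The function name DecodeNamedEntities and the table contents are hypothetical, the output is assumed to be UTF-8, and only a handful of entities are listed for illustration.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper, not part of XBMC: decodes a few named HTML entities
// into UTF-8 using a lookup table instead of one Replace call per entity.
std::string DecodeNamedEntities(const std::string& in)
{
  // Illustrative subset only; a complete table would cover every named entity.
  static const std::unordered_map<std::string, std::string> table = {
    {"amp", "&"},
    {"quot", "\""},
    {"nbsp", " "},              // assumption: map to a plain space
    {"oacute", "\xC3\xB3"},     // U+00F3, o with acute
    {"ndash", "\xE2\x80\x93"},  // U+2013, en dash
    {"rsquo", "'"},             // assumption: map to an ASCII apostrophe
  };

  std::string out;
  out.reserve(in.size());
  std::size_t i = 0;
  while (i < in.size())
  {
    if (in[i] == '&')
    {
      // Look for the terminating ';' and try the text in between as a name.
      const std::size_t semi = in.find(';', i + 1);
      if (semi != std::string::npos)
      {
        const auto it = table.find(in.substr(i + 1, semi - i - 1));
        if (it != table.end())
        {
          out += it->second;
          i = semi + 1;
          continue;
        }
      }
    }
    // Not a known entity: copy the character through unchanged.
    out += in[i++];
  }
  return out;
}
```

With this sketch, DecodeNamedEntities("Kr&oacute;l") gives "Król" encoded as UTF-8, while unknown names such as "&unknown;" and a bare "&" pass through untouched, so adding an entity means adding one table row rather than another Replace line.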
smuto

smuto Offline
Senior Member
Posts: 241
Joined: Sep 2004
Reputation: 2
Post: #44
I don't know why, but sometimes the Wikipedia search doesn't work.

I changed the way the link is scraped after the search - please test
filmweb.xml

Is there a way to show a custom label in the skin?

I need something like this for testing:
ListItem.IMDbID or ListItem.FilmwebID

smuto

haken Offline
Member
Posts: 59
Joined: Jan 2008
Reputation: 0
Location: Poland - Krakow
Post: #45
@smuto: I think there have been some changes to the filmweb.pl website: descriptions can no longer be scraped, and neither can high-res posters. I looked inside the scraper, but it is too complicated for me ;)

Update: The scraper is OK! It was something else, and now everything works perfectly. I was surprised because until now the scraper had either worked completely or not at all... Sorry!
(This post was last modified: 2008-11-01 20:16 by haken.)
Neku Offline
Member
Posts: 79
Joined: Feb 2009
Reputation: 0
Post: #46
Any chance of fixing this scraper? I really love it :D But it has stopped working for me. It finds the title but doesn't download any cover or any info about the movie. :(
Neku Offline
Member
Posts: 79
Joined: Feb 2009
Reputation: 0
Post: #47
Neku Wrote: Any chance of fixing this scraper? I really love it :D But it has stopped working for me. It finds the title but doesn't download any cover or any info about the movie. :(

It must have been something with the site, because now it's working.
smuto Offline
Senior Member
Posts: 241
Joined: Sep 2004
Reputation: 2
Post: #48
The problem seems to be that the website sometimes forces users to see a welcome page. I tried to fix this by spoofing, but it will be harder than expected. So if someone wants to work on that, she/he is welcome.

smuto

waszka Offline
Junior Member
Posts: 8
Joined: Feb 2009
Reputation: 0
Post: #49
smuto Wrote: The problem seems to be that the website sometimes forces users to see a welcome page. I tried to fix this by spoofing, but it will be harder than expected. So if someone wants to work on that, she/he is welcome.

smuto

It looks like the site always forces the welcome page :/ I refresh, and every time I see this message:
Tool.doNotEscapeHTML ($simpleMarkupTool.renderMarkup($desc.description

Any ideas what is wrong?

---added
I googled the phrase doNotEscapeHTML and found many pages with this message (e.g. http://www.meg.ryan.filmweb.pl/FilmRevie...ew.id=5968). It looks like there is a problem with the filmweb.pl site :/
(This post was last modified: 2009-02-24 00:08 by waszka.)
nightman Offline
Junior Member
Posts: 1
Joined: Mar 2009
Reputation: 0
Post: #50
waszka Wrote: It looks like the site always forces the welcome page :/

I googled the phrase doNotEscapeHTML and found many pages with this message (e.g. http://www.meg.ryan.filmweb.pl/FilmRevie...ew.id=5968).

Any ideas what is wrong?

Try using the URL with "&amp;" changed to "&". For me it doesn't show the welcome page and, even better, it doesn't show the "doNotEscapeHTML" error.
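
For anyone who wants to apply this change in code rather than by hand, here is a minimal sketch in standard C++. UnescapeAmpersands is a hypothetical helper name, not something from XBMC or the scraper, and the URL in the usage note is made up purely for illustration.

```cpp
#include <string>

// Hypothetical helper: replaces every "&amp;" in a URL with a bare "&".
// It makes a single pass, so "&amp;amp;" becomes "&amp;", not "&".
std::string UnescapeAmpersands(std::string url)
{
  const std::string from = "&amp;";
  const std::string to = "&";
  std::size_t pos = 0;
  while ((pos = url.find(from, pos)) != std::string::npos)
  {
    url.replace(pos, from.size(), to);
    pos += to.size();  // continue searching after the inserted "&"
  }
  return url;
}
```

For example, UnescapeAmpersands("http://example.invalid/Search?a=1&amp;b=2") returns "http://example.invalid/Search?a=1&b=2".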