Filmweb scraper

Neku Offline
Member
Posts: 79
Joined: Feb 2009
Reputation: 0
Post: #46
Any chance of fixing this scraper? I really love it :D But it has stopped working for me. It finds the title but doesn't download any cover or any info about the movie. :(
Neku Offline
Member
Posts: 79
Joined: Feb 2009
Reputation: 0
Post: #47
Neku Wrote:Any chance of fixing this scraper? I really love it :D But it has stopped working for me. It finds the title but doesn't download any cover or any info about the movie. :(

This must have been something with the site, because now it's working.
smuto Offline
Senior Member
Posts: 242
Joined: Sep 2004
Reputation: 2
Post: #48
The problem seems to be that the website sometimes forces users to see a welcome page. I tried to fix this by spoofing, but it is harder than expected. So if someone wants to work on that, she/he is welcome.

smuto

waszka Offline
Junior Member
Posts: 8
Joined: Feb 2009
Reputation: 0
Post: #49
smuto Wrote:The problem seems to be that the website sometimes forces users to see a welcome page. I tried to fix this by spoofing, but it is harder than expected. So if someone wants to work on that, she/he is welcome.

smuto

It looks like the site is always forcing the welcome page :/ I refresh and every time I see this message:
Tool.doNotEscapeHTML ($simpleMarkupTool.renderMarkup($desc.description

Any ideas what is wrong?

---added
I googled the phrase doNotEscapeHTML and found many pages with this message (e.g. http://www.meg.ryan.filmweb.pl/FilmRevie...ew.id=5968). It looks like there is a problem with the filmweb.pl site itself :/
(This post was last modified: 2009-02-24 00:08 by waszka.)
nightman Offline
Junior Member
Posts: 1
Joined: Mar 2009
Reputation: 0
Post: #50
waszka Wrote:It looks like the site is always forcing the welcome page :/

I googled the phrase doNotEscapeHTML and found many pages with this message (e.g. http://www.meg.ryan.filmweb.pl/FilmRevie...ew.id=5968)

Any ideas what is wrong?

Try using the URL with "&amp;" changed to "&". For me it doesn't show the welcome page and, even better, it doesn't show the "doNotEscapeHTML" error.
zxcvbn1971 Offline
Junior Member
Posts: 4
Joined: Mar 2009
Reputation: 0
Post: #51
I'm afraid this scraper does not work anymore. Only 30% of movies are correctly added to the XBMC database. The remaining movies are added with errors (no title or description - the famous doNotEscapeHTML) or not added at all....
:(
(This post was last modified: 2009-05-15 10:19 by zxcvbn1971.)
wojak Offline
Junior Member
Posts: 10
Joined: Jul 2009
Reputation: 0
Post: #52
Hi,
Anything new about this scraper? Does it work for you or not?
plebann Offline
Junior Member
Posts: 2
Joined: Mar 2009
Reputation: 0
Post: #53
Hi.
I have a question:
What about the Filmweb scraper in the new version? Will it be available?
(This post was last modified: 2010-04-13 16:24 by plebann.)
smuto Offline
Senior Member
Posts: 242
Joined: Sep 2004
Reputation: 2
Post: #54
For now I have no time - maybe I'll try in the summer.

As of today there is a new filmweb portal - my old filmweb scraper is not working anymore.

smuto

smuto Offline
Senior Member
Posts: 242
Joined: Sep 2004
Reputation: 2
Post: #55
I've just started working on the new filmweb portal - a basic scraper is ready for testing:

metadata.filmweb.pl.zip

But it seems that the "oacute problem" is back: accented characters are shown as entities.
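
To illustrate what the "oacute problem" looks like, here is a minimal Python sketch. It is not how XBMC or the scraper handles it; the sample title and the `decode_entities` helper are made up for the example. It only shows that a named entity such as `&oacute;` has to be decoded back into the character it stands for before the title is displayed.

```python
import html


def decode_entities(text: str) -> str:
    """Replace named and numeric HTML entities with the characters they stand for."""
    return html.unescape(text)


# Hypothetical scraped title, as it might arrive with the entity still escaped.
raw_title = "Kr&oacute;l Lew"
print(decode_entities(raw_title))
```

Text without any entities passes through unchanged, so a step like this should be safe to apply to every scraped field.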

smuto

spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #56
you need to specify the fixchars attribute, see http://trac.xbmc.org/changeset/31124
smuto Offline
Senior Member
Posts: 242
Joined: Sep 2004
Reputation: 2
Post: #57
filmweb is in UTF-8, so I flag the outputted XML as utf-8.

After setting fixchars="1" there is no "oacute problem", but I have an encoding problem - it looks like I'm getting UTF-8 encoded results which are passed through a UTF-8 converter yet again (which doesn't play nice).
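
A small Python sketch of what this kind of double conversion does to the text. It assumes the second pass treats the UTF-8 bytes as Latin-1 before re-encoding them; which charset XBMC really assumes there is not known from this thread, and both helper names are made up for the example.

```python
def double_encode(text: str) -> str:
    """Simulate the bug: UTF-8 bytes are misread as Latin-1 and converted again."""
    return text.encode("utf-8").decode("latin-1")


def undo_double_encode(garbled: str) -> str:
    """Reverse the extra conversion, under the same Latin-1 assumption."""
    return garbled.encode("latin-1").decode("utf-8")


garbled = double_encode("Król")
print(garbled)                      # each accented letter becomes two characters
print(undo_double_encode(garbled))  # back to the original text
```

Plain ASCII survives the double pass unchanged, which would explain why only the accented characters look broken.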

smuto

spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #58
oh bugger. yeah, fixchars only works on ascii shit. give me some time to think of a solution, i'll get it in.
mako777 Offline
Junior Member
Posts: 5
Joined: Jul 2010
Reputation: 0
Post: #59
Hi,
The new scraper is working pretty well except for two problems/bugs that I have.
First, in the movie list, some movies show something like "Hitman (2007) - Filmweb" instead of only the title, e.g. "Hitman".
Second, if the scraper doesn't find a movie, it's not added to the library, and then it's hard to find which movie was not added (I have to search for it manually). This would be easier if such a movie were added to the library with, e.g., an unknown title.
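
For the first problem, a rough Python sketch of stripping the trailing year and site name from a page title. The pattern is only a guess based on the single example "Hitman (2007) - Filmweb"; the real scraper is written as XBMC scraper XML, and `clean_title` is a hypothetical helper, not part of it.

```python
import re

# Matches a trailing " (YYYY) - Filmweb" suffix, based on the one example above.
_TITLE_SUFFIX = re.compile(r"\s*\(\d{4}\)\s*-\s*Filmweb\s*$")


def clean_title(page_title: str) -> str:
    """Return the title with a trailing '(year) - Filmweb' suffix removed, if present."""
    return _TITLE_SUFFIX.sub("", page_title)


print(clean_title("Hitman (2007) - Filmweb"))
```

Titles that do not end with that suffix are returned unchanged.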
smuto Offline
Senior Member
Posts: 242
Joined: Sep 2004
Reputation: 2
Post: #60
I just made some fixes for weekend tests - please try it.

Quote:And Second is that if scraper doesn't find the movie

Can you be more specific? Do you mean the title of the movie or the file name?

Or just give me an example.

smuto
