[RELEASE] Data18.com Web Content Scraper - Adult Movie Web Downloads

user136 Online
Junior Member
Posts: 3
Joined: Aug 2014
Reputation: 0
Post: #46
Does this still work? I use it to search without Google, because Google blocks me all the time.

If I search on their website I don't see the file. Did they change the security?
Every time, I renamed the files using the name of the person and the name of the update.

How do I need to rename the files? I just can't figure this out, and I have tried for four whole days.

Thanks!
DoctorD Offline
Member
Posts: 55
Joined: Apr 2013
Reputation: 3
Post: #47
It works OK for me when I try it, even without using the Google search mode. Obviously the Google search mode is more lenient about what you name the file, but if you don't want to use it, try naming your file exactly the same as the title shown on the file's page on the site. The title is found on that page, above the spot that has the thumbnail image and says something like:

Date: January 01, 2013 | ! Report errors

The problem with this is that it's kind of annoying to name your files this way, because then you can't include the site name or actors in the filename. So really, my suggestion is to use Google and stagger your searches so you don't get blocked, because data18's search is terrible.
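As a rough illustration of what "staggering" searches could look like, here is a minimal sketch that pauses for a random interval between lookups. The function name `staggered` and the 5 to 15 second delay range are assumptions made for this example; they are not settings taken from the scraper.

```python
import random
import time


def staggered(items, min_delay=5.0, max_delay=15.0, sleep=time.sleep):
    """Yield items one at a time, pausing a random interval between them.

    The delay bounds are illustrative guesses, not values from the scraper.
    """
    for index, item in enumerate(items):
        if index > 0:
            # Wait before every item except the first one.
            sleep(random.uniform(min_delay, max_delay))
        yield item
```

Looping with `for name in staggered(filenames): ...` would then space the lookups out instead of sending them back to back.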
user136 Online
Junior Member
Posts: 3
Joined: Aug 2014
Reputation: 0
Post: #48
OK, thanks, I have it now.
(This post was last modified: 2014-08-28 09:23 by user136.)
DoctorD Offline
Member
Posts: 55
Joined: Apr 2013
Reputation: 3
Post: #49
I've released a few updates (1.5.4 and 1.5.5) to address some issues with date content not being scraped.
Chuck Bartowski Offline
Member
Posts: 76
Joined: May 2011
Reputation: 0
Post: #50
Thanks DoctorD - will test it tonight
Chuck Bartowski Offline
Member
Posts: 76
Joined: May 2011
Reputation: 0
Post: #51
Hi DoctorD
It's great, but it's still not picking up fanart. Is this a known issue?
Thanks for all your work
DoctorD Offline
Member
Posts: 55
Joined: Apr 2013
Reputation: 3
Post: #52
It's a problem with XBMC itself. XBMC doesn't support the spoof attribute for fanart, and some of the servers that data18 uses require the referrer to be data18 so that people don't rip off their images. Oddly, this requirement is not consistent across the servers that data18 uses, so sometimes the fanart will work and sometimes it won't.

I've actually filed a bug / feature request for XBMC to add this (see earlier post in this thread), but it hasn't been implemented yet.

To get around this, you can use my standalone scraper program which supports both data18 content and Japanese movies.

Here's the link to it:

https://github.com/DoctorD1501/JAVMovieScraper
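For anyone wondering what the referrer requirement means in practice, here is a minimal sketch of an image request that carries a `Referer` header, using only Python's standard library. The helper name `image_request` and the example.com URLs are placeholders invented for this illustration; this is not code from the XBMC scraper or from the standalone program.

```python
from urllib.request import Request


def image_request(image_url, referer):
    """Build a GET request for an image that carries a Referer header."""
    # Some hosts refuse image requests whose Referer header is missing
    # or names a different site.
    return Request(image_url, headers={"Referer": referer})


# Hypothetical URLs, for illustration only.
req = image_request(
    "http://images.example.com/fanart/1.jpg",
    "http://www.example.com/content/1",
)
print(req.get_header("Referer"))
```

The request is only built here, not sent. Without a way to attach that header to fanart requests, a server that checks it would be expected to reject them, which would match the inconsistent behaviour described above.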
Chuck Bartowski Offline
Member
Posts: 76
Joined: May 2011
Reputation: 0
Post: #53
Thanks DoctorD, I will try out the JAV scraper and let you know how it goes. Again, thank you.
str1567 Offline
Junior Member
Posts: 2
Joined: Oct 2014
Reputation: 0
Post: #54
Hi DoctorD!

I appreciate your scraper very much. However, I have run into several issues and have tried to fix some of them myself:
1. Google blocks me very quickly, after just 10-20 files have been scraped, so I have replaced it with Bing. Bing works without any problems.
2. Added support for the 'Official poster' image. Page example: http://www.data18.com/content/178762
3. The video preview image with alt="Scene Preview" is now supported. Page example: http://www.data18.com/content/174268
4. Fanart is not loaded from the image gallery for pages like this: http://www.data18.com/content/179454 . This is probably because of the XBMC issue discussed above.
The worst thing is that the broken fanart replaces the 'This is fallback in case there is no image gallery.' image and, as a result, there is no fanart at all! So I have temporarily commented out the logic for image gallery fanart.

Could you please consider merging my changes into your code (except the last one - I think there should be a better fix for it)?

The updated 1.5.5 version can be found here: http://www.mediafire.com/download/gjy31v...m.bing.zip
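To illustrate the idea behind item 3, here is a minimal sketch that picks out an image by its alt="Scene Preview" attribute using Python's standard HTML parser. The XML scraper itself does not work this way, and the class name, function name, and sample markup are all assumptions made up for the example; only the alt text comes from the list above.

```python
from html.parser import HTMLParser


class ScenePreviewFinder(HTMLParser):
    """Remember the src of the first <img> whose alt is 'Scene Preview'."""

    def __init__(self):
        super().__init__()
        self.preview_url = None

    def handle_starttag(self, tag, attrs):
        if tag != "img" or self.preview_url is not None:
            return
        attributes = dict(attrs)
        if attributes.get("alt") == "Scene Preview":
            self.preview_url = attributes.get("src")


def find_scene_preview(html):
    """Return the preview image URL, or None if the page has none."""
    finder = ScenePreviewFinder()
    finder.feed(html)
    return finder.preview_url


# Invented sample markup, not copied from a real page.
sample = (
    '<div><img src="/thumb.jpg" alt="Other">'
    '<img src="/preview.jpg" alt="Scene Preview"></div>'
)
print(find_scene_preview(sample))
```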
DoctorD Offline
Member
Posts: 55
Joined: Apr 2013
Reputation: 3
Post: #55
Hi str1567,

Thanks for posting your updated version. Data18 has so many types of pages that it's easy for me to miss one or two, so your additions are greatly appreciated. I mostly use the standalone scraper program I wrote for Data18 these days, so I'm not always aware when things break or change in the XML version. I think my standalone scraper might have a few of the same issues you reported, so I'll have to check that as well.

I'm going to check out your code and see what I can do to get it merged. Maybe I can add an option to let you switch between Bing and Google? I initially wrote Bing support before the first release, but it seemed to match poorly named files less often than Google did, so I switched it out. I'll also see what I can do to fix item number 4. Maybe I can add the fallback image as the last fanart item every single time, so you'll at least have something?
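The fallback idea could look roughly like the sketch below: keep the gallery images, then always add the fallback image at the end so the list is never empty. The function name `build_fanart` and the sample values are hypothetical, and this is only one possible way to do it, not the code that ended up in the scraper.

```python
def build_fanart(gallery_urls, fallback_url):
    """Return the gallery images followed by the fallback image.

    Empty entries and duplicates are skipped, so a fallback that is
    already in the gallery is not added a second time.
    """
    fanart = []
    for url in list(gallery_urls) + [fallback_url]:
        if url and url not in fanart:
            fanart.append(url)
    return fanart


# Hypothetical file names, for illustration only.
print(build_fanart(["gallery1.jpg", "gallery2.jpg"], "fallback.jpg"))
print(build_fanart([], "fallback.jpg"))
```

Even if every gallery image fails to load, the last entry would still give the player something to show.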

I'll make another post with my progress when I have made my update. It may take a few days since I'm fairly busy this week.

Thanks again!
(This post was last modified: 2014-10-07 17:38 by DoctorD.)
str1567 Offline
Junior Member
Posts: 2
Joined: Oct 2014
Reputation: 0
Post: #56
Thanks!
ScrappingFTW Offline
Junior Member
Posts: 1
Joined: Oct 2014
Reputation: 0
Post: #57
Hi DoctorD, thanks for your efforts.

Here's a small patch to fix the path separator character on Linux.

Regards.
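The patch itself is not reproduced in this thread, so the following is only a general sketch of the kind of problem a path separator fix addresses: a hard-coded backslash works on Windows but not on Linux, whereas joining path parts through the standard library uses the right separator for each platform. The function name `artwork_path` and the sample folder names are invented for the example.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath


def artwork_path(movie_dir, filename):
    """Join a folder and a file name with the running platform's separator."""
    # os.path.join avoids hard-coding "\\" or "/" in the path.
    return os.path.join(movie_dir, filename)


# The same parts rendered with each platform's rules.
print(PureWindowsPath("movies", "title", "fanart.jpg"))
print(PurePosixPath("movies", "title", "fanart.jpg"))
```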
DoctorD Offline
Member
Posts: 55
Joined: Apr 2013
Reputation: 3
Post: #58
Thanks ScrappingFTW. I implemented your fix and posted a new build with it. Hopefully it works OK for you on Linux now.