[RELEASE] Data18.com Web Content Scraper - Adult Movie Web Downloads

karipu Offline
Junior Member
Posts: 2
Joined: Nov 2013
Reputation: 0
Post: #16
Thanks DoctorD for this amazing scraper. Everything works fine (actors, genres, ...) except for the fanart. I tried several videos, but it doesn't seem to scrape fanart even though the images appear on the data18 website.
DoctorD Offline
Junior Member
Posts: 38
Joined: Apr 2013
Reputation: 1
Post: #17
I've noticed problems with the fanart as well from time to time. Is anyone else noticing that it sometimes works on some scenes, but not others? That seems to be what happens with me when I try things.

What I think is happening is that data18 has different image servers they store the gallery images on. You can tell this because you won't always get the same IP address for the same image file depending on the scene.

Anyways, for some of these servers, they don't care about the referrer for the image and just let you download it no matter what. However, on several of the servers, they enforce the referrer and if you try to hotlink the image, it won't download.
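To illustrate the hotlink check described above, here is a minimal Python sketch of a request that sends a Referer header along with the image URL. The URLs are made-up placeholders, and the idea that the server simply compares this header against its own site is an assumption, not something taken from the scraper.

```python
import urllib.request


def build_image_request(image_url, referer):
    """Build a GET request that presents `referer` as the referring page.

    A server that enforces the referrer would presumably inspect this
    header before deciding whether to serve the image.
    """
    request = urllib.request.Request(image_url)
    request.add_header("Referer", referer)
    return request


# Hypothetical URLs, for illustration only.
request = build_image_request(
    "http://img.example.com/gallery/01.jpg",
    "http://www.example.com/content/scene.html",
)
print(request.get_header("Referer"))
```

Passing the request to urllib.request.urlopen would perform the actual download; it is left out here so the sketch does not touch the network.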

Now normally, this wouldn't be a problem because XBMC has a spoof attribute you can fill in on <thumb> in the .nfo. Indeed, this works for the poster thumbs: those images download fine even when they need a spoofed referrer. However, the spoof attribute doesn't seem to work on fanart thumbs, and I don't know why. It might be a bug in XBMC, or it might be a bug in my scraper. If anybody has any ideas on how to fix this, please let me know!
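As a rough sketch of what such an entry looks like, the Python snippet below builds a <thumb> element carrying a spoof attribute. The attribute name comes from the description above; the URLs are placeholders, and the elements XBMC expects around the <thumb> are deliberately not shown.

```python
import xml.etree.ElementTree as ET


def make_thumb(image_url, spoof_url):
    """Return a <thumb> element as text, with `spoof_url` as its spoof attribute."""
    thumb = ET.Element("thumb", {"spoof": spoof_url})
    thumb.text = image_url
    return ET.tostring(thumb, encoding="unicode")


# Hypothetical URLs, for illustration only.
thumb_xml = make_thumb(
    "http://img.example.com/fanart/01.jpg",
    "http://www.example.com/",
)
print(thumb_xml)
```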

Thank you!
(This post was last modified: 2013-11-30 03:16 by DoctorD.)
DoctorD Offline
Junior Member
Posts: 38
Joined: Apr 2013
Reputation: 1
Post: #18
I've submitted a new bug ticket for XBMC about the spoof attribute not working for fanart, as I think it is a problem with XBMC itself. Here's the URL for the ticket:

http://trac.xbmc.org/ticket/14722
jiggsaw Offline
Member
Posts: 84
Joined: Aug 2012
Reputation: 0
Post: #19
(2013-11-27 06:16)DoctorD Wrote:  Hi All,

I've released a new version. Please use the new version from the direct download link in the first post of the thread until it gets merged into xbmc-adult. It fixes issues with actors not showing up. I've also noticed some of the fanart issues seem to have gone away when I was doing development. Anyone else still having issues with fanart in this scraper?

Recently it seems Data18 has vastly improved their search, in that you can now search their content database natively. I still think Google is going to be the most accurate way of finding the scene, as there's no standard naming convention for files and Google can sometimes sort it out anyway, but it might now be possible to let the scraper search without using Google. This would stop the problem with the scraper sometimes not working due to being blocked temporarily by Google. If I can figure out how settings work in scrapers and I have some free time, I may try to make this part of the scraper so you can optionally use it. If anyone else knows of another scraper that has a Settings feature that allows you to pick between two different search functions, that could greatly speed along development since I can examine how it was done there. XBMC scraper documentation is pretty out of date :(

Let us know when you figure out the google block issue. I'm getting blocked and it's a PITA. I can't really even use the scraper since it never goes through all my files.

(This post was last modified: 2013-12-06 07:03 by jiggsaw.)
nikman Offline
Junior Member
Posts: 28
Joined: Dec 2013
Reputation: 0
Post: #20
Can I make this scraper only get info for files and show it when in Files, but not write it into the common movie library? It's not funny when regular movies get shuffled with XXX movies :)
(This post was last modified: 2013-12-06 14:24 by nikman.)
DoctorD Offline
Junior Member
Posts: 38
Joined: Apr 2013
Reputation: 1
Post: #21
(2013-12-06 14:24)nikman Wrote:  Can I make this scraper only get info for files and show it when in Files, but not write it into the common movie library? It's not funny when regular movies get shuffled with XXX movies :)

Hi Nikman,

What I do for this is I set up two separate XBMC profiles. One for regular movies and one for adult movies.

This page has more info:

http://wiki.xbmc.org/index.php?title=profiles

Another option is to create two smart lists, one for each of your movie directories (Regular and Adult). Base the smart list on "Contains" and then put in the root directory of your movie folder. That will get all files under that directory and its sub-directories. Then use a skin like Aeon Nox (or another one) that allows you to set a home screen icon for each smart list.

Still, I prefer the separate profile option as it keeps things really separated and you're not going to accidentally have your guests seeing your movies you probably don't want them watching. You can also password protect profiles, which may also be a nice thing.
(This post was last modified: 2013-12-07 05:23 by DoctorD.)
DoctorD Offline
Junior Member
Posts: 38
Joined: Apr 2013
Reputation: 1
Post: #22
(2013-12-06 07:00)jiggsaw Wrote:  Let us know when you figure out the google block issue. I'm getting blocked and it's a PITA. I can't really even use the scraper since it never goes through all my files.

Hi Jiggsaw,

I just released a new version you can download from the first post that gives you a configurable option to use Data18's own search instead of Google. It's really pretty strict and your files need to be named almost exactly the same as the title on data18, but it should prevent you from getting blocked. I don't really like naming my files the same as the title that data18 has, so I personally don't use this method, but hopefully some people find it useful. One option I thought of to maybe make this a little better is that we come up with a standardized naming convention for Data18 content files like "Network - Actor1, Actor 2 (...) - Scene Title As It Appears on Data 18 (Optional Year or Date)" or something like that. Then we can just look at the last part of the file name before the date to search on data18. This gives us the advantage of not having just the scene title as our file name, which honestly is not very useful for these kinds of movies.
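To make the proposed convention concrete, here is a small Python sketch of pulling the scene title out of a file name laid out that way. The " - " separator, the trailing "(YYYY)" or "(YYYY-MM-DD)" form, and the function name are all assumptions made for this example; nothing here is taken from the scraper.

```python
import os
import re

# Matches an optional trailing "(2013)" or "(2013-12-08)" at the end of the name.
_TRAILING_DATE = re.compile(r"\s*\(\d{4}(-\d{2}-\d{2})?\)\s*$")


def extract_scene_title(file_name):
    """Return the last " - " separated part of the name, minus any trailing date."""
    stem, _extension = os.path.splitext(file_name)
    last_part = stem.split(" - ")[-1]
    return _TRAILING_DATE.sub("", last_part).strip()


# Made-up file name following the proposed layout.
print(extract_scene_title("My Network - Jane Doe, John Roe - My Scene Title (2013).mp4"))
```

A title that itself contains " - " would be cut short by this split, so a real convention would need to pick a separator that never appears in titles.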

Does anyone else have thoughts on a file name convention for this?
jiggsaw Offline
Member
Posts: 84
Joined: Aug 2012
Reputation: 0
Post: #23
(2013-12-08 07:29)DoctorD Wrote:  
Hi Jiggsaw,

I just released a new version you can download from the first post that gives you a configurable option to use Data18's own search instead of Google. It's really pretty strict and your files need to be named almost exactly the same as the title on data18, but it should prevent you from getting blocked. I don't really like naming my files the same as the title that data18 has, so I personally don't use this method, but hopefully some people find it useful. One option I thought of to maybe make this a little better is that we come up with a standardized naming convention for Data18 content files like "Network - Actor1, Actor 2 (...) - Scene Title As It Appears on Data 18 (Optional Year or Date)" or something like that. Then we can just look at the last part of the file name before the date to search on data18. This gives us the advantage of not having just the scene title as our file name, which honestly is not very useful for these kinds of movies.

Does anyone else have thoughts on a file name convention for this?

Thanks DoctorD, I'm going to try it out now.

What I've been doing is "Actress-scene name-site". I used the actual site instead of the network, but with abbreviations to keep the file name from getting so long. I would say it worked pretty well, with just a few hiccups here and there.

Ex: "mlib" instead of "Milfs Like It Big". I didn't use "Brazzers" at all.

I agree that Data18's search sucks. However, I'm willing to change for whatever works best and is the most accurate.

(This post was last modified: 2013-12-08 15:56 by jiggsaw.)
FISHMANPET Offline
Junior Member
Posts: 1
Joined: Jan 2012
Reputation: 0
Post: #24
So I tried to name my files based on what Data18 has, but some of the movies have colons in the names and I can't use that in a name, so the search doesn't match up.
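One way a matcher could tolerate this, shown purely as an illustration, is to strip the characters that are not allowed in Windows file names from both the site title and the file name before comparing them. The scraper is not known to do this; the function name and sample titles below are made up.

```python
import re

# Characters that Windows does not allow in file names.
_ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*]')


def normalize_for_match(text):
    """Drop file-name-illegal characters, collapse whitespace, ignore case."""
    without_illegal = _ILLEGAL_CHARS.sub(" ", text)
    return " ".join(without_illegal.split()).casefold()


site_title = "My Scene: Part Two"   # title as it might appear on the site
file_title = "My Scene Part Two"    # same title with the colon left out
print(normalize_for_match(site_title) == normalize_for_match(file_title))
```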

So if I'm going to use the Google route, what should I be naming my files? Or rather, if I put the filename into Google, what should a Google search return that would indicate it's working?
DoctorD Offline
Junior Member
Posts: 38
Joined: Apr 2013
Reputation: 1
Post: #25
If your scene shows up as the first result when you enter this in Google (replace file_name_goes_here with the name of your file without the file extension), then you should be good to go:

site:data18.com/content file_name_goes_here
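If you want to check a batch of files by hand, the same query can be put together with a few lines of Python. This is only a sketch of the manual test above: the function names are made up, and the way the scraper itself builds its Google request may well differ.

```python
import os
from urllib.parse import urlencode


def build_google_query(file_name):
    """Return the site-restricted query for a file name, without its extension."""
    stem, _extension = os.path.splitext(file_name)
    return "site:data18.com/content " + stem


def build_search_url(query):
    """URL-encode the query into a Google search URL you can paste into a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_google_query("My Site - My Scene (Feb 2010) - Jane Doe.mp4")
print(query)
print(build_search_url(query))
```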

Typically, you'll get the most accurate result if you name your file with the full name of the website as it appears on data18, the episode "name", and any actresses or actors that appear in that scene (you don't need both actors and actresses if you'd prefer to have just one of them in the file name). I've noticed that including the date the scene was released in the file name is generally a bad idea unless it is also part of the episode name, such as (Feb 2010) or something.

So for an imaginary site called "My Site", an episode name of "My Scene (Feb 2010)" starring the actress "Jane Doe", you can name your file:

My Site - My Scene (Feb 2010) - Jane Doe.mp4

or

Jane Doe - My Site - My Scene (Feb 2010).mp4

or any other variation of that order. Google doesn't really care about the order which is nice. The dashes are also optional in the above example, so you could leave them out or use something else if you want.

However, I've noticed that naming your file something like this:

My Site - My Scene (Feb 2010) - Jane Doe (2010-02-26)

sometimes causes problems with the search, so don't append dates in that format. It might be that the minuses in the date are being treated as boolean operators? I'm not quite sure, but it's something I'll eventually look into.
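Until that is sorted out, a workaround along these lines could drop a trailing "(YYYY-MM-DD)" from the name before it is searched. This is a sketch only, with a made-up function name; a "(Feb 2010)" that is part of the episode name is left untouched.

```python
import re

# Matches a trailing ISO-style date in parentheses, e.g. "(2010-02-26)".
_TRAILING_ISO_DATE = re.compile(r"\s*\(\d{4}-\d{2}-\d{2}\)\s*$")


def drop_trailing_iso_date(name):
    """Remove a trailing "(YYYY-MM-DD)" from a file name, if there is one."""
    return _TRAILING_ISO_DATE.sub("", name)


cleaned = drop_trailing_iso_date("My Site - My Scene (Feb 2010) - Jane Doe (2010-02-26)")
print(cleaned)
```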
worldroll Offline
Junior Member
Posts: 2
Joined: Jan 2014
Reputation: 0
Post: #26
I've been doing some scraping for College Rules, and for some of the scenes, rather than putting
<set>College Rules</set>
the scraper puts
<set>You guys are awesome, we have been receiving so many submissions from all you crazy college kids, but this one here takes the cake! ... </set>

(repeating the same description as in the plot)
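A quick way to spot affected files is to compare the <set> and <plot> text in each .nfo. The Python sketch below assumes the .nfo has both elements directly under its root element and that the two texts match exactly when the bug occurs; the sample XML is made up.

```python
import xml.etree.ElementTree as ET


def set_repeats_plot(nfo_xml):
    """Return True when <set> is non-empty and identical to <plot>."""
    root = ET.fromstring(nfo_xml)
    set_text = (root.findtext("set") or "").strip()
    plot_text = (root.findtext("plot") or "").strip()
    return bool(set_text) and set_text == plot_text


# Made-up samples: the first looks right, the second shows the problem.
good_nfo = "<movie><set>My Site</set><plot>Some description.</plot></movie>"
bad_nfo = "<movie><set>Some description.</set><plot>Some description.</plot></movie>"
print(set_repeats_plot(good_nfo), set_repeats_plot(bad_nfo))
```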

eg http://www.data18.com/content/update_62580.html

Also, data18's search seems to be beyond broken for finding scenes, i.e. searching for "Scavenger Hunt" brings up the above as a possible link, and yet searching for "Fucken Scavenger Hunt" brings up nothing at all.

Must admit, for all their faults, it does make me appreciate thetvdb and themoviedb so much more.

P.S. Why is there not something like theadultdb?
jiggsaw Offline
Member
Posts: 84
Joined: Aug 2012
Reputation: 0
Post: #27
(2014-01-21 03:08)worldroll Wrote:  I've been doing some scraping for College Rules, and for some of the scenes, rather than putting
<set>College Rules</set>
the scraper puts
<set>You guys are awesome, we have been receiving so many submissions from all you crazy college kids, but this one here takes the cake! ... </set>

(repeating the same description as in the plot)

eg http://www.data18.com/content/update_62580.html

Also, data18's search seems to be beyond broken for finding scenes, i.e. searching for "Scavenger Hunt" brings up the above as a possible link, and yet searching for "Fucken Scavenger Hunt" brings up nothing at all.

Must admit, for all their faults, it does make me appreciate thetvdb and themoviedb so much more.

P.S. Why is there not something like theadultdb?

Yeah theadultdb would be awesome lol

Chuck Bartowski Offline
Member
Posts: 65
Joined: May 2011
Reputation: 0
Post: #28
(2014-01-26 19:04)jiggsaw Wrote:  
Yeah theadultdb would be awesome lol

+1
Chuck Bartowski Offline
Member
Posts: 65
Joined: May 2011
Reputation: 0
Post: #29
For some reason this scraper has stopped picking up the cast pictures.
DoctorD Offline
Junior Member
Posts: 38
Joined: Apr 2013
Reputation: 1
Post: #30
Hello Chuck Bartowski,

I've fixed the problem with cast pictures. You can download the new version from the link in the first post.