[RELEASE] Data18.com Web Content Scraper - Adult Movie Web Downloads

user136 Offline
Junior Member
Posts: 3
Joined: Aug 2014
Reputation: 0
Post: #46
Does this still work? I use it to search without Google, because Google keeps blocking me.

When I search on their website I don't see the file. Did they change the security?
Every time, I renamed the file to the name of the person and the name of the update.

How do I need to rename the files? I just can't figure this out, and I have tried for four whole days.

thanks!
DoctorD Offline
Member
Posts: 91
Joined: Apr 2013
Reputation: 4
Post: #47
It works ok for me when I try it, even without using the google search mode. Obviously using the google search mode is more lenient on what you name the file, but if you don't want to use it, try naming your file exactly the same as the title of the file from the page on the site. This is found on the page of the file above where it has the thumbnail image and says something like:

Date: January 01, 2013 | ! Report errors

The problem with this is that it's kind of annoying to name your files this because then you can't include the sitename or actors in the filename. So really, my suggestion is to use google and stagger your searches to not get blocked, because data18's search is terrible.
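Staggering searches, as suggested above, could be sketched roughly like this. This is only an illustration and not part of the scraper: the function name `staggered` and the 20 to 60 second delay range are assumptions, and the delay that actually avoids a block is not known.

```python
import random
import time


def staggered(items, min_delay=20.0, max_delay=60.0, sleep=time.sleep):
    """Yield items one at a time, pausing a random interval between them.

    The delay bounds are illustrative guesses, not values known to
    avoid a search-engine block.
    """
    for index, item in enumerate(items):
        if index > 0:
            # Random jitter so lookups do not arrive at a fixed rhythm.
            sleep(random.uniform(min_delay, max_delay))
        yield item
```

A caller would loop over `staggered(list_of_files)` and run one search per file, so that lookups are spread out instead of arriving in a burst.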
user136 Offline
Junior Member
Posts: 3
Joined: Aug 2014
Reputation: 0
Post: #48
ok thanks, have it
(This post was last modified: 2014-08-28 09:23 by user136.)
DoctorD Offline
Member
Posts: 91
Joined: Apr 2013
Reputation: 4
Post: #49
I've released a few updates (1.5.4 and 1.5.5) to address some issues with date content not scraping.
Chuck Bartowski Offline
Member
Posts: 81
Joined: May 2011
Reputation: 0
Post: #50
Thanks DoctorD - will test it tonight
Chuck Bartowski Offline
Member
Posts: 81
Joined: May 2011
Reputation: 0
Post: #51
Hi DoctorD
It's great but still not picking up fanart. Is this a known issue?
Thanks for all your work
DoctorD Offline
Member
Posts: 91
Joined: Apr 2013
Reputation: 4
Post: #52
It's a problem with XBMC itself. XBMC doesn't support the spoof attribute with fanart and some servers that data18 uses require that the referrer be from data18 so people don't try to rip off their images. It's weird because this requirement is not consistent across servers that data18 uses, so sometimes the fanart will work and sometimes it won't.

I've actually filed a bug / feature request for XBMC to add this (see earlier post in this thread), but it hasn't been implemented yet.

To get around this, you can use my standalone scraper program which supports both data18 content and Japanese movies.

Here's the link to it:

https://github.com/DoctorD1501/JAVMovieScraper
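
The referrer requirement described above can be illustrated outside XBMC with a short sketch. It assumes that sending a `Referer` header naming data18 is what the stricter image servers check for. The function names and the default referer value are hypothetical, and this is not code from either scraper.

```python
import urllib.request


def build_image_request(image_url, referer="http://www.data18.com/"):
    """Build a request for image_url that carries a Referer header.

    The default referer value is an assumption for illustration.
    """
    return urllib.request.Request(image_url, headers={"Referer": referer})


def fetch_image(image_url, referer="http://www.data18.com/", timeout=30):
    """Download the image bytes; raises urllib.error.URLError on failure."""
    request = build_image_request(image_url, referer)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

If a server rejects requests that lack the header, a fetch without it would be expected to fail or return an empty file, which would be consistent with fanart working on some servers and not on others.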
Chuck Bartowski Offline
Member
Posts: 81
Joined: May 2011
Reputation: 0
Post: #53
Thanks DoctorD will try out the JAV scraper and let you know how it goes. Again, thank you
str1567 Offline
Junior Member
Posts: 6
Joined: Oct 2014
Reputation: 1
Post: #54
Hi DoctorD!

I appreciate your scraper very much. However, I have run into several issues, and I have tried to fix some of them myself:
1. Google blocks me very quickly, after just 10-20 scraped files. So I have replaced it with Bing, which works without any problem.
2. Added support for the 'Official poster' image. Page example: http://www.data18.com/content/178762
3. The video preview image with alt="Scene Preview" is now supported. Page example: http://www.data18.com/content/174268
4. Fanart is not loaded from the image gallery for pages like this: http://www.data18.com/content/179454 . Probably because of the XBMC issue discussed above.
The worst thing is that the broken fanart replaces the 'This is fallback in case there is no image gallery.' image, and as a result there is no fanart at all! So I have temporarily commented out the logic for image gallery fanart.

Could you please consider merging my changes into your code (except the last one - I think there should be a better fix for it)?

The updated 1.5.5 version can be found here: http://www.mediafire.com/download/gjy31v...m.bing.zip
DoctorD Offline
Member
Posts: 91
Joined: Apr 2013
Reputation: 4
Post: #55
Hi str1567,

Thanks for posting your updated version. Data18 has so many types of pages, so it's easy for me to miss one or two here, so your additions are greatly appreciated. I'm mostly using my standalone scraper program I wrote for Data18 these days, so I'm not always aware of when things break or change in the XML version anymore. I think my standalone might actually have a few of the same issues you reported with it too, so I'll have to check that out as well.

I'm going to check out your code and see what I can do to get it merged (maybe I can add an option to let you switch between Bing and Google? I initially wrote in Bing support back before the first release, but it seemed to find a match of a poorly named file less often than Google so I switched it out) and also see what I can do to fix item number 4 (maybe I can add the fallback image to the last item of the fanart every single time so you'll at least have something...?)
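
The fallback idea floated above (always adding the fallback image as the last fanart entry) might look something like the following sketch. The function and parameter names are hypothetical, and the actual XBMC scraper is XML rather than Python, so this only shows the ordering logic.

```python
def fanart_candidates(gallery_urls, fallback_url):
    """Return gallery images first, with the fallback image always last.

    Appending the fallback unconditionally means at least one entry is
    present even when every gallery image fails to download.
    """
    candidates = list(gallery_urls)
    if fallback_url and fallback_url not in candidates:
        candidates.append(fallback_url)
    return candidates
```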

I'll make another post with my progress when I have made my update. It may take a few days since I'm fairly busy this week.

Thanks again!
(This post was last modified: 2014-10-07 17:38 by DoctorD.)
str1567 Offline
Junior Member
Posts: 6
Joined: Oct 2014
Reputation: 1
Post: #56
Thanks!
ScrappingFTW Offline
Junior Member
Posts: 1
Joined: Oct 2014
Reputation: 0
Post: #57
Hi DoctorD, thanks for your efforts.

Here's a small patch to fix path separator character on Linux.

Regards.
DoctorD Offline
Member
Posts: 91
Joined: Apr 2013
Reputation: 4
Post: #58
Thanks ScrappingFTW. I implemented your fix and posted a new build with it. Hopefully it should work OK now for you on Linux.
sharpenednoodle Offline
Junior Member
Posts: 1
Joined: Nov 2014
Reputation: 0
Post: #59
Sorry to crash the thread. I was playing around with this scraper because I wanted to see how it handles nfo files. Most of my collection keeps whatever name the file originally had, but each video also has an nfo file with the URL of the scene. For content from sites like Brazzers, the nfo contains a URL pointing to the page on Brazzers for that specific scene. I updated a few of the nfo files for non-Brazzers content to point to their relevant scene pages on data18. After running the scraper, all of the Brazzers content scraped perfectly, but none of the files with nfo files pointing to data18 URLs worked at all. Is there something I'm missing here?

To clarify, the nfo files are named the same as the video file, and they only contain a URL relevant to the scene
EDIT:
This is what works

btas_alexis_ford_720p_8000.mp4
btas_alexis_ford_720p_8000.nfo

nfo contains
"http://www.brazzers.com/scenes/view/id/6794/teaching-mr-sins/"

This doesn't work
zzs_bw_paris04_720p_8000.mp4
zzs_bw_paris04_720p_8000.nfo

nfo contains
"http://www.data18.com/content/190242"

Any help greatly appreciated.
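
For checking the setup described above, a small sketch like the following can confirm which nfo files sit next to a video and whether they hold a data18 content URL. It does not reproduce how XBMC or the scraper parses nfo files. The regular expression and the function name are assumptions based only on the URL shape quoted in this post.

```python
import re
from pathlib import Path

# Matches URLs shaped like the example in this post (an assumption).
CONTENT_URL = re.compile(r"https?://(?:www\.)?data18\.com/content/\d+")


def find_content_url(video_path):
    """Return the data18 content URL from the sibling .nfo file, or None."""
    nfo_path = Path(video_path).with_suffix(".nfo")
    if not nfo_path.is_file():
        return None
    text = nfo_path.read_text(encoding="utf-8", errors="replace")
    # search() ignores surrounding quotes or whitespace around the URL.
    match = CONTENT_URL.search(text)
    return match.group(0) if match else None
```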
(This post was last modified: 2014-11-01 03:51 by sharpenednoodle.)
str1567 Offline
Junior Member
Posts: 6
Joined: Oct 2014
Reputation: 1
Post: #60
Hi Doctor!

I have tried your JAVMovieScraper and it looks promising!
Some feedback:
1. The "Select movie to scrape from Data18 movie" dialog appears even if only one movie is found. It can be tiresome to select each movie while indexing several hundred movies. Is it possible to provide an option to autoselect a single match? Ideally there would be a batch mode so that a large number of movies can be processed without manual actions.
2. The "Pick Poster" window appears even when the "Write fanart and posters file" option is disabled. This again requires some manual action. Is it possible to have an option not to download the poster, or to autoselect it?
3. I have noticed that the scraper is sensitive to file name delimiters. For example, "x-art_capri_tyler_green_eyes_1080.mov" is not found on data18.com, but if I rename it to "x-art capri tyler green eyes 1080.mov" it is found successfully. I have noticed the "Rename settings" option. Can I use it to preprocess the file name? Could you please give a clue how to use it?
4. There is a problem with fanart downloading from data18.com. Currently I see black boxes instead of images, and when the poster and fanart are written to disk the files have zero size. XBMC cannot download images like <thumb>http://78.157.200.234/301/2768/119028/01.jpg</thumb> either. The issue can be reproduced with the "x-art capri tyler green eyes 1080.mov" movie.
As a workaround I suggest the following: add an option to write only the data18.com content URL to the nfo file, without film metadata. After that the movies can simply be scraped by the existing data18.com XBMC scraper (which currently can download most of the fanart).
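
The delimiter problem in item 3 could be worked around by preprocessing file names before searching. The sketch below is only an illustration of that idea, not a description of the "Rename settings" option; the function name is hypothetical, and the choice to keep hyphens is an assumption based on the example name above.

```python
import re
from pathlib import Path


def filename_to_query(filename):
    """Turn a video file name into a space-separated search query.

    The extension is dropped; underscores, dots and whitespace are
    treated as word breaks, while hyphens are kept as they are.
    """
    stem = Path(filename).stem
    words = re.split(r"[_.\s]+", stem)
    return " ".join(word for word in words if word)


print(filename_to_query("x-art_capri_tyler_green_eyes_1080.mov"))
# prints: x-art capri tyler green eyes 1080
```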

Regarding Bing vs Google search...
I have double-checked. You are right: Google gives more accurate results than Bing when searching data18.com, especially for older movies (e.g. released in 2009-2011).
(This post was last modified: 2014-11-07 00:00 by str1567.)