Release JAV Movie Scraper
#61
(2014-11-12, 02:44)DoctorD Wrote: As for the Tokyo247 data showing the name as "Bookmark", I actually noticed this same problem when I was writing the code to get it working! I don't really know Japanese, but from what I can gather, this is happening because the name for this particular person is all in hiragana: "しおり". In English letters, this is the correct name, Shiori. For whatever reason, Google Translate gives "Bookmark" as the translation for しおり and so that's why the scraper shows this. I don't know why Google Translate gives such a bad translation (Or maybe it's actually a good translation? I don't know, I don't speak Japanese), but it does!

A possible fix that I could write: on name elements, instead of going straight to Google Translate, if the scraper detects that every character is hiragana, it tries a phonetic transliteration into English letters. For example, "し" becomes "Shi". I might also be able to do the same if the name contains katakana, as I think it follows a similar one-to-one structure. Do you think that will work?

Ha, I didn't even realize that "しおり" would mean "bookmark." I thought it was some kind of programming error and that it was the name of a function or library or something like that. While I understand the problem, I don't know Japanese either, so I can't offer my two cents on hiragana or katakana.

The fix you propose for scanning names sounds a bit complicated and may add more time to the scraping process. But if you can figure it out, it will, as I understand it, preserve not just the name but also any other specific word that shouldn't be translated. Another idea: if the scraper sees the actress name in Japanese as well as English, maybe you can have it cross-check the title against the actress name so it won't translate it. I don't know. It's just an idea.

Thanks!

Edit: Just saw your update now. Impressively FAST! So you figured out the Hiragana/Katakana thing. Awesome!! I'll test around with some files and will let you know if there's any bug.
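
For anyone curious what the hiragana/katakana idea discussed above might look like, here is a minimal Python sketch. It is not the scraper's actual code. The names `romanize_name`, `TABLE`, and `SMALL_Y` are made up for illustration, the romanization is a simplified Hepburn-style mapping, and it deliberately ignores the small っ and the long-vowel mark ー. When it meets any character it does not know (kanji, spaces, and so on) it returns `None`, on the assumption that the caller would then fall back to machine translation.

```python
# Hypothetical sketch: romanize a name that is written entirely in kana.
VOWELS = "aiueo"
ROWS = {
    "": "あいうえお", "k": "かきくけこ", "s": "さしすせそ", "t": "たちつてと",
    "n": "なにぬねの", "h": "はひふへほ", "m": "まみむめも", "r": "らりるれろ",
    "g": "がぎぐげご", "z": "ざじずぜぞ", "d": "だぢづでど", "b": "ばびぶべぼ",
    "p": "ぱぴぷぺぽ",
}
# Regular consonant + vowel combinations.
TABLE = {kana: cons + vowel
         for cons, row in ROWS.items()
         for kana, vowel in zip(row, VOWELS)}
# Irregular readings and the kana that do not fit the grid above.
TABLE.update({
    "し": "shi", "ち": "chi", "つ": "tsu", "ふ": "fu",
    "じ": "ji", "ぢ": "ji", "づ": "zu",
    "や": "ya", "ゆ": "yu", "よ": "yo",
    "わ": "wa", "を": "o", "ん": "n",
})
# Small ya/yu/yo combine with the preceding "-i" kana (e.g. きょ -> kyo).
SMALL_Y = {"ゃ": "a", "ゅ": "u", "ょ": "o"}


def to_hiragana(ch):
    """Map a katakana character onto hiragana via the fixed 0x60 offset."""
    code = ord(ch)
    return chr(code - 0x60) if 0x30A1 <= code <= 0x30F6 else ch


def romanize_name(name):
    """Return a romanized name, or None if any character is not plain kana."""
    kana = [to_hiragana(c) for c in name]
    parts = []
    i = 0
    while i < len(kana):
        if kana[i] not in TABLE:
            return None  # caller should fall back to normal translation
        roma = TABLE[kana[i]]
        nxt = kana[i + 1] if i + 1 < len(kana) else ""
        if nxt in SMALL_Y and len(roma) > 1 and roma.endswith("i"):
            glide = "" if roma in ("shi", "chi", "ji") else "y"
            roma = roma[:-1] + glide + SMALL_Y[nxt]
            i += 1  # the small kana has been consumed
        parts.append(roma)
        i += 1
    return "".join(parts).capitalize()


print(romanize_name("しおり"))
```

Because the function refuses to guess on anything outside the table, titles that mix kana with kanji would still go through the existing translation path.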
#62
Hi Doctor D - I know it's not a JAV site, but a scraper for Excalibur Films would be fantastic if that could be built in.
#63
Hi Doctor D,

Thank you for creating this scraper. It's a real time saver!

As I was editing some files, I noticed that although this scraper downloads the fanart and poster pics, the nfo file does not link to them. The nfo still links to the images at the respective websites. Is there any way I can link them to the local fanart and poster pics directly? In my case, I normally use fanart.jpg and poster.jpg in individual folders.
#64
The new update for Caribbean works great, I can scrape in Japanese now. Thanks DoctorD.
#65
(2014-11-14, 09:50)yellie Wrote: Hi Doctor D,

Thank you for creating this scraper. It's a real time saver!

As I was editing some files, I noticed that although this scraper downloads the fanart and poster pics, the nfo file does not link to them. The nfo still links to the images at the respective websites. Is there any way I can link them to the local fanart and poster pics directly? In my case, I normally use fanart.jpg and poster.jpg in individual folders.

If you are using XBMC, it is better that way. XBMC always uses local files first. The beauty of having the others in the NFO is so that if you ever decide you want to change the poster or NFO, you can do so from within XBMC and it will know where to go get the replacements.

-Pr.
[4 Kodi Clients + 4 Norco RPC-4224 Media Servers w/376 TB HDD Space]
#66
I guess I can add Excalibur films once I work through some of the backlog of other things. Next up on my list is getting the other specific parser scrapers to work in Japanese first, however.

Do you find the data better / more complete than data18? If so, what type of data do you tend to find better on it (such as movie posters, selection of titles, actor thumbnails, plot descriptions, etc)? The reason I am asking is if some type of data tends to be better on excalibur, maybe I can create some kind of automatic amalgamation between various adult sites like I'm doing with JAV DVD releases.
#67
(2014-11-14, 16:38)Pr.Sinister Wrote:
(2014-11-14, 09:50)yellie Wrote: Hi Doctor D,

Thank you for creating this scraper. It's a real time saver!

As I was editing some files, I noticed that although this scraper downloads the fanart and poster pics, the nfo file does not link to them. The nfo still links to the images at the respective websites. Is there any way I can link them to the local fanart and poster pics directly? In my case, I normally use fanart.jpg and poster.jpg in individual folders.

If you are using XBMC, it is better that way. XBMC always uses local files first. The beauty of having the others in the NFO is so that if you ever decide you want to change the poster or NFO, you can do so from within XBMC and it will know where to go get the replacements.

-Pr.

Yes, I think Pr.Sinister is correct about this. I actually don't think there is any way to point to a local file in an XBMC-coded nfo file, even if I wanted to, because it expects a URL in the <thumb> element. Fortunately, it will always look at the local file first, so you should be OK. In some cases, you HAVE to use the local file my scraper creates anyway to get things to look right, because my program actually creates the image it saves to disk by cropping or joining images.
#68
hi guys,

excuse my ignorance,

but I get what a site scraper is, and I like the idea of them

I have got data18 on my Android XBMC box, I just don't know what to do with it now..

any pointers, YouTube vids, or threads that make it easy for a simpleton like myself to get going?

thnx keith
#69
(2014-11-14, 23:27)keithc8765 Wrote: hi guys,

excuse my ignorance,

but I get what a site scraper is, and I like the idea of them

I have got data18 on my Android XBMC box, I just don't know what to do with it now..

any pointers, YouTube vids, or threads that make it easy for a simpleton like myself to get going?

thnx keith

Hello Keith,

I suspect you're posting in the wrong thread. This is a thread about a standalone program which does scraping for you, not the other scraper which integrates into XBMC directly (I suspect this is what you already have). You can download this program from https://github.com/DoctorD1501/JAVMovieScraper.

Anyway, the basic idea of the program in this thread is that you scrape your media before you even start up XBMC. The program creates an "nfo" file, which contains the title, plot, actors, and other information, as well as a few jpg files which contain the poster and fanart images. By scraping outside XBMC, you get a little more control over what the final scraped file looks like and can confirm it got the right one. Then, once you have scraped the file and created the nfo and jpg files in the same directory as your movie file, pointing XBMC to that directory as a source will allow XBMC to read in the scraped information.
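
To make the nfo idea a bit more concrete, here is a rough Python sketch of building a minimal movie nfo. It is only an illustration, not what the program in this thread actually writes: the function name `build_movie_nfo` and the example.com URL are placeholders, and only a handful of elements (`<title>`, `<plot>`, `<thumb>`, `<actor>`/`<name>`) are shown. The `<thumb>` element carries a URL, while the local poster.jpg and fanart.jpg sit next to the movie file.

```python
import xml.etree.ElementTree as ET


def build_movie_nfo(title, plot, actors, poster_url):
    """Return a minimal XBMC-style movie nfo as an XML string (sketch only)."""
    movie = ET.Element("movie")
    ET.SubElement(movie, "title").text = title
    ET.SubElement(movie, "plot").text = plot
    # The thumb element holds a remote URL, not a local file path.
    thumb = ET.SubElement(movie, "thumb", aspect="poster")
    thumb.text = poster_url
    for name in actors:
        actor = ET.SubElement(movie, "actor")
        ET.SubElement(actor, "name").text = name
    return ET.tostring(movie, encoding="unicode")


nfo_text = build_movie_nfo(
    "Example Title",
    "Example plot.",
    ["Actor A", "Actor B"],
    "https://example.com/poster.jpg",  # placeholder URL
)
print(nfo_text)
```

The resulting text would be saved alongside the video file, next to the poster and fanart images.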

If you prefer not to do this, you can instead use the scraper in XBMC that you already have. Go to videos -> files and set up a new source. Set the source type as Data 18 Web Content and point it to the folder where you have your movies. You need to be kinda specific with file names for this scraper, so see this thread http://forum.kodi.tv/showthread.php?tid=...pid1426612 for more tips on how to use that scraper.

For more general info about using XBMC scrapers, this tutorial has some info on scraping regular movies (released in theaters):

https://gbatemp.net/threads/xbmc-a-tutor...rs.350372/

The process is more or less the same to use the data18 web content scraper, just change the scraper used to the data 18 web content scraper instead of using something like TMDB or IMDB since we want to use an adult movie database instead of one made for regular Hollywood releases.
#70
(2014-11-14, 19:52)DoctorD Wrote: I guess I can add Excalibur films once I work through some of the backlog of other things. Next up on my list is getting the other specific parser scrapers to work in Japanese first, however.

Do you find the data better / more complete than data18? If so, what type of data do you tend to find better on it (such as movie posters, selection of titles, actor thumbnails, plot descriptions, etc)? The reason I am asking is if some type of data tends to be better on excalibur, maybe I can create some kind of automatic amalgamation between various adult sites like I'm doing with JAV DVD releases.

Hi DoctorD

I find the plot description better on Excalibur, and the posters seem to be of higher quality, without watermarks. The only disadvantage is that it seems to add "Adult" after each title. Actor thumbs are about equal, although there are occasions where they don't have everything that data18 does.
#71
(2014-11-14, 20:03)DoctorD Wrote:
(2014-11-14, 16:38)Pr.Sinister Wrote:
(2014-11-14, 09:50)yellie Wrote: Hi Doctor D,

Thank you for creating this scraper. It's a real time saver!

As I was editing some files, I noticed that although this scraper downloads the fanart and poster pics, the nfo file does not link to them. The nfo still links to the images at the respective websites. Is there any way I can link them to the local fanart and poster pics directly? In my case, I normally use fanart.jpg and poster.jpg in individual folders.

If you are using XBMC, it is better that way. XBMC always uses local files first. The beauty of having the others in the NFO is so that if you ever decide you want to change the poster or NFO, you can do so from within XBMC and it will know where to go get the replacements.

-Pr.

Yes, I think Pr.Sinister is correct about this. I actually don't think there is any way to point to a local file in an XBMC-coded nfo file, even if I wanted to, because it expects a URL in the <thumb> element. Fortunately, it will always look at the local file first, so you should be OK. In some cases, you HAVE to use the local file my scraper creates anyway to get things to look right, because my program actually creates the image it saves to disk by cropping or joining images.

Thanks for the assistance, Pr.Sinister and Doctor D! I use XBMC, but most of the time I use it on my iPad instead of the PC or Mac version. There are some differences for sure, but I'm not sure what they are. Pr.Sinister's post reminded me of the feature where I can choose my own poster and fanart lol. Previously I was always editing the nfo directly, so I had completely forgotten about changing them in XBMC itself.

In my case, the XBMC on my iPad doesn't seem to pick up the poster.jpg, and I have to edit each movie's info to get them right. Maybe it's an XBMC issue? Thanks again!
#72
Hi again.

Problem:
  • I still have a problem with TOKYO247. Somehow, "TOKYO134" still returns the title "しおり" as "Bookmark" instead of "Shiori." The actress name now returns correctly though. This is no big deal since I can just put in the name myself, but if you have time please look into it again.
  • Also, has anybody had a problem with Rename Settings? I don't see any report on this. The rename settings seem to allow only one pattern: %TITLE% %[ACTORS]% %(YEAR)% %[ID]%. Anything other than this will break the auto-generated example, making it generate something like "C:\Temp\1999A1999c1999t1999o1999r1999 1999A1999,1999...1999B1999.avi" (full name here http://pastebin.com/BtdFNwX4). When writing data, it returns an "Unhandled Exception" error and the application stops responding.
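
A side note on the rename pattern problem above: the garbled example looks like what a string replace produces when it is given an empty search string, since the replacement then gets inserted between every character. That is only a guess at the cause, nothing confirmed in this thread. Below is a small Python sketch of one way to expand such a pattern defensively. The function `render_name`, the regex, and the reading of the token syntax (brackets or parentheses inside the percent signs are literal wrappers around the value) are all assumptions for illustration, not the program's real logic.

```python
import re

# An empty search string makes replace() insert the value between every
# character, which resembles the broken file name reported above.
print("Actor".replace("", "1999"))

# Tokens such as %TITLE%, %[ACTORS]%, %(YEAR)%, %[ID]% (assumed syntax).
TOKEN = re.compile(r"%([\[(]?)(TITLE|ACTORS|YEAR|ID)([\])]?)%")


def render_name(pattern, values):
    """Expand known tokens; drop a token (and its wrapper) if it has no value."""
    def substitute(match):
        opener, key, closer = match.groups()
        value = values.get(key, "")
        return f"{opener}{value}{closer}" if value else ""

    rendered = TOKEN.sub(substitute, pattern)
    # Collapse the double spaces left behind by dropped tokens.
    return " ".join(rendered.split())


values = {"TITLE": "Movie", "ACTORS": "Actor A, Actor B",
          "YEAR": "1999", "ID": "ABC-123"}
print(render_name("%[ID]% %TITLE% %(YEAR)%", values))
```

Matching whole tokens with one regex means the tokens can appear in any order, and a missing value never turns into an empty search string.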

Suggestion/Request:
  • Any way to add an option to put the Movie ID at the front of the movie title (and have it written into the nfo file)? That way, in Plex or XBMC/Kodi the movies will be sorted by the ID in the title, which, to me, makes it much easier and faster to find the movie I want.
  • Please add 10musume scraper
  • Please add 1000giri scraper
#73
Hi DoctorD,

Let me start by thanking you for all your hard work on this. People are quick to ask for new scrapers and new features, but I know it is not that easy!

I have been extremely busy these past few months and I finally got around to testing the data18 movie scraping. I needed to make space on my download drive by moving movies to their final destination and wanted to see if that would help in expediting this.

First off, I need to tell you I am very particular about my collections and I hate having bad metadata. That is why my usual workflow is scraping all my adult movies with Ember Media Manager's IMDB scraper and then downloading the poster manually and copying and pasting the plot from one of the buying links found on IAFD. The reason is that IMDB usually has the REAL original release year instead of the DVD release year, and they usually have the right cast. A quick example of that is The Adventures of Buttman... It was released in 1989 but data18 puts it at 2003.

Most people don't know that IMDB has 95% of all Adult Movies and 100% of the "classic" ones. The trick is to search IMDB from Google to find the titles you need. I even created a Custom Google Search to make finding the titles easier. I then copy the IMDB ID into Ember and it scrapes the movie. It's time-consuming but I know it's accurate.

Anyway, all this babbling to get to my point of suggesting a couple of features for all scrapers:
  • Ability to pick which fields to scrape
  • Ability to append to existing NFO
  • Ability to only scrape images and not the NFO
  • Ability to not download fanart (only poster)
  • Ability to set the title to the folder name

Explanations of features:

Ability to pick which fields to scrape & Ability to append to existing NFO

These go hand in hand because, basically, I want to pick and choose which data comes from where. I would get as much as I can from IMDB, and whatever is missing I can get from data18 (usually plot or IAFD actors) without losing what I got from IMDB.
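
As a rough illustration of the "pick fields" plus "append to existing NFO" combination, here is a small Python sketch. It treats the metadata as plain dictionaries, and the function `merge_metadata` and the field names are hypothetical, not anything the scraper currently exposes. Existing values always win, and a newly scraped value is used only where the field is empty and, optionally, only for the fields the user picked.

```python
def merge_metadata(existing, scraped, fields=None):
    """Fill gaps in `existing` from `scraped` without overwriting anything.

    `fields`, if given, limits which scraped fields may be used at all.
    """
    merged = dict(existing)
    for key, value in scraped.items():
        if fields is not None and key not in fields:
            continue  # the user did not select this field
        if not merged.get(key) and value:
            merged[key] = value  # only fill fields that are missing or empty
    return merged


from_imdb = {"title": "Example Title", "year": "1989", "plot": ""}
from_data18 = {"year": "2003", "plot": "Example plot.", "actors": ["Actor A"]}

# Keep the existing year, take only the plot from the second source.
print(merge_metadata(from_imdb, from_data18, fields={"plot"}))
```

With `fields=None` every missing field would be filled, which corresponds to a plain "append to existing NFO" without any field selection.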

Ability to only scrape images and not the NFO

If my NFO is pretty much perfect IMO, then I just need the images.

Ability to not download fanart (only poster)

Just like the whole jacket for JAV scraper at the beginning, you have the back cover set as the fanart for data18 movies. I'd rather have no fanart than the back cover.

Ability to set the title to the folder name

This one is JAV specific. As you know, some of the translated titles of JAV movies are pretty shocking. Stuff about r@pe, big sister, mother-son, etc... I'd rather not have my list cluttered with that stuff. If we could have an option to scrape all the JAV info but set the title to the folder name, it would be great. I only use the ID in my folder name.

Sorry for the long-winded post... I tend to ramble on a lot...

Once again, thanks for your tremendous work... I have a few things I'd like to discuss with regards to the data18 webcontent scraper but I will leave that for later :)

-Pr.
[4 Kodi Clients + 4 Norco RPC-4224 Media Servers w/376 TB HDD Space]
#74
Quick Bug Report...

I enabled using IAFD for Actors instead of Data18.com but that didn't work. The Cast and Pictures were from data18.com

Movie: Racquel Darrian: Best Ass On the Planet
[4 Kodi Clients + 4 Norco RPC-4224 Media Servers w/376 TB HDD Space]
#75
How can I get the scraped JAV information to display in Japanese?
When I scrape MKBD-Sxx in AVE, all the information is in English.