Release JAV Movie Scraper
#46
Since you had it open in Eclipse, can you post or PM me the full stack trace (that is, the error message Eclipse gives you)? I can't reproduce the error on my own machine, but if I can find the offending line of code I may be able to fix it anyway.

Thanks!
#47
DoctorD,

Here you go:

Gui Initialized
setMainGuIEnabled with value false
fileBaseName = Summer Brielle - [MilfNextDoor] Thrilling Threesome - 20140830
searchString = http://www.data18.com/search/?k=Summer+B...140830&t=0
fileBaseName = Summer Brielle - [MilfNextDoor] Thrilling Threesome - 20140830
Scraping this webpage for movie: http://www.data18.com/content/1143099
poster 1
Data18 Scrape results: Movie [title=Title [title=Thrilling Threesome], originalTitle=OriginalTitle [originalTitle=], sortTitle=SortTitle [sortTitle=], set=Set [set=Milf Next Door], rating=Rating [maxRating=0.0, rating=], year=Year [year=2014], top250=Top250 [top250=], trailer = Trailer [trailer=], votes=Votes [votes=], outline=Outline [outline=], plot=Plot [plot=Bree invited her gal pals Cherie and Summer for an afternoon delight. She was bored and lonely just hanging out all by herself while her hubby was away on business, so she figured she would invite them over for lunch and some fun. After they ate, they retired to the backyard gazebo to relax and have a couple drinks. Play time started as soon as they noticed Summer s nipples standing at attention, due to the twilight breeze and evening chill beginning to settle in. They greeted the night completely satiated sexually. Summer and Cherie made sure they thanked their hostess for such a wonderful afternoon.], tagline=Tagline [tagline=], studio=Studio [Studio=Reality Kings], runtime=Runtime [runtime=], posters=[Thumb [thumbURL=http://94.229.67.74/4/27/143099/01.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/02.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/03.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/04.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/05.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/06.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/07.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/08.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/09.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/10.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/11.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/12.jpg]], fanart=[Thumb [thumbURL=http://94.229.67.74/4/27/143099/01.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/02.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/03.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/04.jpg], 
Thumb [thumbURL=http://94.229.67.74/4/27/143099/05.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/06.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/07.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/08.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/09.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/10.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/11.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/12.jpg]], extrafanart = [Thumb [thumbURL=http://94.229.67.74/4/27/143099/01.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/02.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/03.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/04.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/05.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/06.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/07.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/08.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/09.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/10.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/11.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/12.jpg]], mpaa=MPAARating [MPAARating=XXX], id=ID [id=], genres=[Genre [genre=Lesbian], Genre [genre=Milf], Genre [genre=Ass Licking], Genre [genre=Threesome], Genre [genre=3 Women], Genre [genre=Caucasian], Genre [genre=Blondes], Genre [genre=Outdoor]], actors=[Actor [role=null, toString()=Person [name=Brianna Ray, thumb=Thumb [thumbURL=http://img.data18.com/images/stars/120/9517.jpg]]], Actor [role=null, toString()=Person [name=Cherie Deville, thumb=Thumb [thumbURL=http://img.data18.com/images/stars/120/23120.jpg]]], Actor [role=null, toString()=Person [name=Summer Brielle Taylor, thumb=Thumb [thumbURL=http://img.data18.com/images/stars/120/18022.jpg]]]], directors=[]]
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
at javax.swing.ImageIcon.<init>(ImageIcon.java:228)
at moviescraper.doctord.Thumb.getThumbImage(Thumb.java:279)
at moviescraper.doctord.GUI.GUIMain.updateAllFieldsOfFileDetailPanel(GUIMain.java:1392)
at moviescraper.doctord.GUI.GUIMain$ScrapeMovieAction$1.done(GUIMain.java:2139)
at javax.swing.SwingWorker$5.run(SwingWorker.java:737)
at javax.swing.SwingWorker$DoSubmitAccumulativeRunnable.run(SwingWorker.java:832)
at sun.swing.AccumulativeRunnable.run(AccumulativeRunnable.java:112)
at javax.swing.SwingWorker$DoSubmitAccumulativeRunnable.actionPerformed(SwingWorker.java:842)
at javax.swing.Timer.fireActionPerformed(Timer.java:312)
at javax.swing.Timer$DoPostEvent.run(Timer.java:244)
at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:312)
at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:733)
at java.awt.EventQueue.access$200(EventQueue.java:103)
at java.awt.EventQueue$3.run(EventQueue.java:694)
at java.awt.EventQueue$3.run(EventQueue.java:692)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:703)
at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:242)
at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:161)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:146)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:91)
#48
Thanks for the detailed error message.

I think I fixed the error in my newest build. Can you redownload the jar (or compile yourself, I suppose) and see if it fixes the problem for you?

Even using the same filename as you, I couldn't reproduce the error, but the problem occurred when the program failed to read an image from a URL (for example, because the server was down or you got blocked somehow). Since servers go up and down all the time, that could be why it worked for me when I tried it. Either way, it was obvious when I looked at that line of code that I should have added some extra checks to make sure the read from the URL succeeded before proceeding. I also added a few similar checks in other parts of the program.
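
For anyone following along, here is a minimal sketch of the kind of check described above. It is not the actual code from the project: the class and method names (SafeImageLoader, readOrNull, toIconOrNull) are made up for illustration. It relies on two facts about the standard library: ImageIO.read returns null when no reader can decode the data, and new ImageIcon(image) throws a NullPointerException when handed a null image, which matches the first stack trace.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.net.URL;
import javax.imageio.ImageIO;
import javax.swing.ImageIcon;

// Hypothetical helper, not taken from the scraper's source.
final class SafeImageLoader {

    /** Returns the decoded image, or null if the URL is missing, unreachable, or not an image. */
    static BufferedImage readOrNull(URL url) {
        if (url == null) {
            return null;
        }
        try {
            // ImageIO.read returns null when no registered reader can decode the stream.
            return ImageIO.read(url);
        } catch (IOException e) {
            // Server down, connection refused, blocked request, and so on.
            return null;
        }
    }

    /** Only builds an icon when there is an image; the caller must handle a null result. */
    static ImageIcon toIconOrNull(BufferedImage image) {
        return image == null ? null : new ImageIcon(image);
    }
}
```

The caller then decides what to show when the result is null (for example, an empty label) instead of letting the exception escape on the event dispatch thread.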
#49
Thanks DoctorD,
I pulled your latest build. Something changed: now I get a different exception. Here is the stack trace:
Gui Initialized
setMainGuIEnabled with value false
fileBaseName = Summer Brielle - [MilfNextDoor] Thrilling Threesome - 20140830
searchString = http://www.data18.com/search/?k=Summer+B...140830&t=0
fileBaseName = Summer Brielle - [MilfNextDoor] Thrilling Threesome - 20140830
Scraping this webpage for movie: http://www.data18.com/content/1143099
poster 1
Data18 Scrape results: Movie [title=Title [title=Thrilling Threesome], originalTitle=OriginalTitle [originalTitle=], sortTitle=SortTitle [sortTitle=], set=Set [set=Milf Next Door], rating=Rating [maxRating=0.0, rating=], year=Year [year=2014], top250=Top250 [top250=], trailer = Trailer [trailer=], votes=Votes [votes=], outline=Outline [outline=], plot=Plot [plot=Bree invited her gal pals Cherie and Summer for an afternoon delight. She was bored and lonely just hanging out all by herself while her hubby was away on business, so she figured she would invite them over for lunch and some fun. After they ate, they retired to the backyard gazebo to relax and have a couple drinks. Play time started as soon as they noticed Summer s nipples standing at attention, due to the twilight breeze and evening chill beginning to settle in. They greeted the night completely satiated sexually. Summer and Cherie made sure they thanked their hostess for such a wonderful afternoon.], tagline=Tagline [tagline=], studio=Studio [Studio=Reality Kings], runtime=Runtime [runtime=], posters=[Thumb [thumbURL=http://94.229.67.74/4/27/143099/01.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/02.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/03.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/04.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/05.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/06.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/07.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/08.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/09.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/10.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/11.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/12.jpg]], fanart=[Thumb [thumbURL=http://94.229.67.74/4/27/143099/01.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/02.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/03.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/04.jpg], 
Thumb [thumbURL=http://94.229.67.74/4/27/143099/05.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/06.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/07.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/08.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/09.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/10.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/11.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/12.jpg]], extrafanart = [Thumb [thumbURL=http://94.229.67.74/4/27/143099/01.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/02.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/03.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/04.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/05.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/06.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/07.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/08.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/09.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/10.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/11.jpg], Thumb [thumbURL=http://94.229.67.74/4/27/143099/12.jpg]], mpaa=MPAARating [MPAARating=XXX], id=ID [id=], genres=[Genre [genre=Lesbian], Genre [genre=Milf], Genre [genre=Ass Licking], Genre [genre=Threesome], Genre [genre=3 Women], Genre [genre=Caucasian], Genre [genre=Blondes], Genre [genre=Outdoor]], actors=[Actor [role=null, toString()=Person [name=Brianna Ray, thumb=Thumb [thumbURL=http://img.data18.com/images/stars/120/9517.jpg]]], Actor [role=null, toString()=Person [name=Cherie Deville, thumb=Thumb [thumbURL=http://img.data18.com/images/stars/120/23120.jpg]]], Actor [role=null, toString()=Person [name=Summer Brielle Taylor, thumb=Thumb [thumbURL=http://img.data18.com/images/stars/120/18022.jpg]]]], directors=[]]
Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Width (0) and height (0) cannot be <= 0
at java.awt.image.DirectColorModel.createCompatibleWritableRaster(DirectColorModel.java:1016)
at java.awt.image.BufferedImage.<init>(BufferedImage.java:331)
at moviescraper.doctord.ImageCache.getImageFromCache(ImageCache.java:35)
at moviescraper.doctord.Thumb.getThumbImage(Thumb.java:278)
at moviescraper.doctord.GUI.GUIMain.updateAllFieldsOfFileDetailPanel(GUIMain.java:1409)
at moviescraper.doctord.GUI.GUIMain$ScrapeMovieAction$1.done(GUIMain.java:2156)
at javax.swing.SwingWorker$5.run(SwingWorker.java:737)
at javax.swing.SwingWorker$DoSubmitAccumulativeRunnable.run(SwingWorker.java:832)
at sun.swing.AccumulativeRunnable.run(AccumulativeRunnable.java:112)
at javax.swing.SwingWorker$DoSubmitAccumulativeRunnable.actionPerformed(SwingWorker.java:842)
at javax.swing.Timer.fireActionPerformed(Timer.java:312)
at javax.swing.Timer$DoPostEvent.run(Timer.java:244)
at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:312)
at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:733)
at java.awt.EventQueue.access$200(EventQueue.java:103)
at java.awt.EventQueue$3.run(EventQueue.java:694)
at java.awt.EventQueue$3.run(EventQueue.java:692)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:703)
at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:242)
at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:161)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:146)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:91)

I modified my copy to change the size from 0 to 1 when no image is returned by the server. Now I don't get images, but the program no longer dies.
Now I'm looking into how to append the video resolution (720p, 1080p, etc.) to the end of the file name when renaming.
It is very difficult for me at my current level... :)
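
A rough sketch of the renaming half of that idea, assuming the frame height is already known. Reading the height out of the video file needs a media library and is not shown here. The class name ResolutionSuffix, the " - " separator, and the height buckets are all assumptions for illustration, not part of the scraper.

```java
// Hypothetical helper, not taken from the scraper's source.
final class ResolutionSuffix {

    /** Maps a frame height in pixels to a common label; small heights fall back to "<height>p". */
    static String label(int heightPx) {
        if (heightPx >= 2160) return "2160p";
        if (heightPx >= 1080) return "1080p";
        if (heightPx >= 720) return "720p";
        if (heightPx >= 480) return "480p";
        return heightPx + "p";
    }

    /** Inserts the label before the extension, or appends it when there is no extension. */
    static String appendToFileName(String fileName, int heightPx) {
        String suffix = " - " + label(heightPx);
        int dot = fileName.lastIndexOf('.');
        if (dot <= 0) {
            return fileName + suffix;
        }
        return fileName.substring(0, dot) + suffix + fileName.substring(dot);
    }
}
```

For example, appendToFileName("Movie - 20140830.mkv", 720) gives "Movie - 20140830 - 720p.mkv".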
#50
Hello DoctorD!

English isn't my native language, so forgive me if I make any mistakes. First of all, thank you for your hard work creating an awesome tool. I have a few pieces of feedback: for Tokyo Hot metadata you can use this site, which has all the information you need; for Caribbeancom you can use this site instead of caribbeancompr, since the two sites are different. If you can add 1pondo and Heyzo too, my dream will come true. Once again, thank you for your hard work :P
#51
OK, I got the fix in, using a 1x1 pixel image instead of a 0x0 pixel one.
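
Roughly, the idea looks like the sketch below. This is not the actual change in ImageCache; the class and method names are hypothetical. BufferedImage rejects a width or height of 0 with the IllegalArgumentException shown in the stack trace above, so the sketch clamps both dimensions to at least 1.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not taken from the scraper's source.
final class PlaceholderImages {

    /** Creates a transparent image, never smaller than 1x1 so the constructor cannot throw. */
    static BufferedImage blankImage(int width, int height) {
        int w = Math.max(1, width);
        int h = Math.max(1, height);
        return new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
    }
}
```

The trade-off is that a failed download now shows up as an invisible 1x1 image rather than an error, so logging the failure separately is probably still worthwhile.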
#52
I've added initial support for 1Pondo. See the note in the readme on the github page for naming conventions for 1Pondo files and let me know how this goes!

1Pondo is kind of a mess of a site with a bunch of different formats for releases, so let me know if you notice any weirdness or missing info with specific scenes. It's worth noting that really old releases are simply missing a lot of information, such as actor information. I also couldn't find a good way to get genre information, as it's not listed on the movie's own page. The site does have a page for each genre that lists every movie in it, but I'd rather not scrape 100+ pages every time I want the genres for one movie: I would have to visit each genre, page through 10 or so pages on each, and build up the list of movies. Does anyone have a better idea of how to do this?
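
One possible approach, sketched under assumptions: crawl the genre pages once, invert the result into a movie-to-genres map, and reuse that map (in memory or saved to disk) for later scrapes. The crawl itself is not shown; the sketch takes its output as a plain map from genre name to movie IDs. The class name GenreIndex and the sample data in the usage note are made up.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not taken from the scraper's source.
final class GenreIndex {

    /** Turns "genre -> movies in that genre" into "movie -> genres of that movie". */
    static Map<String, Set<String>> invert(Map<String, List<String>> moviesByGenre) {
        Map<String, Set<String>> genresByMovie = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : moviesByGenre.entrySet()) {
            for (String movieId : entry.getValue()) {
                // TreeSet keeps each movie's genres sorted and free of duplicates.
                genresByMovie.computeIfAbsent(movieId, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return genresByMovie;
    }
}
```

With this, looking up the genres for one movie is a single map lookup, and the 100+ page crawl only has to be repeated when the index is considered stale.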

Anyway, I'll try to add the other requested JAV sites when I have time.
#53
I've also added support now for Heyzo.com under the "Specific Scraper" section.
#54
I've also now added support for my.tokyo-hot.
#55
Caribbeancom support has now been added. Caribbeancom Premium support is also still there, so you can use the right site at the right time. :)

That should wrap up shunobi's request for the web release JAV sites. Anyone know of any other JAV sites which still need a scraper written for them?
#56
OMG, thank you so much. You're my hero :X

Edit: Is there any way to scrape metadata in Japanese? These sites are better with Japanese metadata. I ticked the "scrape JAV movies in Japanese" option, but it still scrapes metadata in English.
#57
Hi DoctorD! First of all, thank you for developing and maintaining this application. It is the only one (that I know of) that can scrape JAV movies, and it does an awesome job! I'm happy, and getting happier with every update that adds more supported sites.

Some requests/feedback:
1. If you have time, maybe a lot of time, please add this site: http://en.xxx-av.com/home/index.html. I have yet to figure out their naming system, because they seem to use more than one naming scheme for their files, apparently at random. Movies from this studio that I got from elsewhere were named like xsh0272_01, xxs0002_02, or something along those lines. The worst part is that there's no search functionality on their site (as far as I can see). So this is a tough one if you want to give it a try.

2. (Shunobi posted above while I was writing this reply and asked roughly the same thing.) In case you haven't noticed, some sites have totally different designs for their Japanese and English versions. The Japanese version is clearly newer, cleaner, much more usable, and has more complete movie information. For example, for Caribbeancom movie ID 052214-004, the English page is http://en.caribbeancom.com/eng/moviepage...index.html, while the Japanese page is http://www.caribbeancom.com/moviepages/052214-004/. Compared to the Japanese page, the English page is missing "title", "description", and "genre" for the majority of movies. 1pondo and HEYZO also have different designs for each language version, but the difference in information is not as bad as on Caribbeancom.

The main problem I have with the Caribbeancom scraper is that it returns only Studio, Actors, and Posters, regardless of whether I tell it to scrape in Japanese or English; no title, no genre. This is understandable, since the English page rarely has a title for the movie and the plot and genre are not provided (but the release date (year) is provided, so maybe something is not working as it should?). But the Japanese page has all the information, doesn't it? Then I realized it's because of the difference in web design between the versions.

So the request here is to make the scraper scrape the real Japanese page when this option is checked. If this option wasn't originally intended to cover the specific scrapers, please make it so!

3. If you haven't done so already, I suggest also including www.r18.com in the basic scraper, as it is the official English version of dmm.co.jp. Sorry if it's already included; it isn't mentioned anywhere on the GitHub page.

4. With the basic scraper, I have a problem scraping movies from the Tokyo247 studio, whose content ID is "tokyo###", with no dash in the middle. For example, tokyo134: [ http://www.r18.com/videos/vod/amateur/de...=tokyo134/ ] [ http://www.dmm.co.jp/digital/videoc/-/de...=tokyo134/ ] I named the folder/filename accordingly. However, the scraper returned an error saying it couldn't find the movie. Both the manual and the automatic scraper returned the same error. Adding a dash or using the title as the folder/filename didn't help.

Well, sorry for the wall of text. And thank you again for taking requests. Cheers
#58
Hello Shunobi and Ixohoxi,

You guys are right that the menu preference "Scrape JAV Movies in Japanese instead of English" currently does nothing for anything in the specific scraper list. It only works for DVD JAV releases. I plan to add support for it in the site-specific scrapers. It may take a while or arrive in pieces, because I have to write new code for each specific scraper.

As for issues with caribbeancom, make sure you're on the newest version of the program (by redownloading the jar file) because when I scraped the movie that you linked I got English info on title, plot, genres, actors, year, poster, fanart, studio, etc. You're right that this info is missing on the English page, but when I wrote the scraper for this site, I noticed this and told it to go to the Japanese page to get this info and then perform a machine translation on the data.

Good idea on r18.com - I hadn't ever really looked at that site, but the English info on that site for the plot and title seem to be much better than I'm getting from other sites, so I can add that into my amalgamation algorithm to get higher quality data.

I'll check out what is happening with Tokyo247. The bug is probably happening because I was expecting there to always be a dash for releases from DMM and Tokyo247's naming convention wasn't compatible with my assumption.
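
To illustrate the dash assumption, here is a small sketch of an ID pattern where the dash is optional. This is not the scraper's real parsing code; the class name ContentIdParser is hypothetical, and "ABC-123" in the usage note is a made-up ID used only to show the dashed form.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the scraper's source.
final class ContentIdParser {

    // Letters, then an optional dash, then digits: matches both "ABC-123" and "tokyo134".
    private static final Pattern ID = Pattern.compile("([A-Za-z]+)-?(\\d+)");

    /** Returns {prefix, number} for the first ID-like token in the text, or null if none is found. */
    static String[] parse(String text) {
        Matcher m = ID.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }
}
```

Here parse("tokyo134") yields "tokyo" and "134", and parse("ABC-123") yields "ABC" and "123", so both naming styles end up in the same shape before the search is run.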

For the site "en.xxx-av.com", it could be tough, but I'll see what I can do. Maybe I can do a Google search on their site or something.

Thanks for all the feedback and don't forget you can check the commits page every now and then to see all the small little updates I'm making as I don't always post information about those here.
#59
Wow, that was super fast: a fix and an update in less than 12 hours! Thank you!

More feedback:
1. Caribbeancom - I downloaded today's update and the issue I was having with the Caribbeancom scraper in my previous post is gone. However, a new issue has appeared. The scraper works flawlessly for the first movie I scrape. After that, it returns the same TITLE, ORIGINAL TITLE, PLOT, and GENRE as the first movie scraped, no matter which files I select. ACTOR and POSTER don't seem to be affected. In short, it only works the first time. I'm guessing something is wrong with the cache and/or memory?

2. Tokyo247 - It's working now. However, some files return the TITLE and ACTOR as "bookmark". One example is "tokyo134"; "tokyo135" returned all the correct info.

An extra request:
1. The S-Cute studio releases DVDs that are already scrapeable with the basic scraper, but it also has web releases in a different format. Each DVD release bundles multiple actresses together, while the web releases don't; they look more like episodes released per actress. Check it out here: http://www.s-cute.com/contents/

On download sites (e.g. trackers), searching for "scute" or "s-cute" returns the web releases. If you want the DVD releases, you have to search for "sqte" instead. The web releases have a simple naming scheme like this: ###_name_## (e.g. http://www.s-cute.com/contents/363_azusa_04/), while the DVD releases look like this: sqte-### (e.g. http://www.s-cute.com/dvd/omnibus/sqte-070/). So it would be great if you could support these S-Cute web releases.
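
The two naming schemes described above could be told apart with something like the sketch below. The class name SCuteNaming and the exact patterns are assumptions based only on the two examples given (363_azusa_04 and sqte-070); real file names may vary more than this.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the scraper's source.
final class SCuteNaming {

    enum Kind { WEB, DVD, UNKNOWN }

    // Web release: digits, underscore, name, underscore, digits (for example 363_azusa_04).
    private static final Pattern WEB =
            Pattern.compile("\\d+_[a-z]+_\\d+", Pattern.CASE_INSENSITIVE);

    // DVD release: "sqte", dash, digits (for example sqte-070).
    private static final Pattern DVD =
            Pattern.compile("sqte-\\d+", Pattern.CASE_INSENSITIVE);

    /** Classifies a file or folder base name (without extension) by its naming scheme. */
    static Kind classify(String baseName) {
        if (WEB.matcher(baseName).matches()) return Kind.WEB;
        if (DVD.matcher(baseName).matches()) return Kind.DVD;
        return Kind.UNKNOWN;
    }
}
```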


Wall of text again, sorry about that. And thank you again! :)
#60
Thanks for the bug report, ixohoxi. I believe I have addressed the issue with batch scraping in my newest build. Let me know if you still notice any issues with it.

As for the Tokyo247 data showing the name as "Bookmark", I actually noticed this same problem when I was writing the code to get it working! I don't really know Japanese, but from what I can gather, this is happening because the name for this particular person is written entirely in hiragana: "しおり". In English letters, the correct name is Shiori. For whatever reason, Google Translate gives "Bookmark" as the translation for しおり, and that's why the scraper shows this. I don't know why Google Translate gives such a bad translation (or maybe it's actually a good translation? I don't know, I don't speak Japanese), but it does!

A possible fix I could write: for name elements, instead of going straight to Google Translate, if the scraper detects that every character is hiragana, it does a phonetic transliteration into English letters. For example, "し" becomes "Shi". I might also be able to do the same if the name contains katakana, as I think it follows a similar one-to-one structure. Do you think that will work?
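
A minimal sketch of that idea, with some loud assumptions: the table below is deliberately partial (vowels plus the s-row and r-row, just enough for the しおり example), the class name KanaRomanizer is made up, and the kana are written as Unicode escapes so the source stays plain ASCII. The mapping is also not strictly one-to-one in real names: small や/ゆ/よ combinations, っ, and the katakana long-vowel mark need extra rules that are not handled here. Katakana is folded onto hiragana by subtracting 0x60, since the two Unicode blocks are laid out in parallel.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not taken from the scraper's source.
final class KanaRomanizer {

    private static final Map<Character, String> ROMAJI = new HashMap<>();
    static {
        // Partial table: a i u e o, sa shi su se so, ra ri ru re ro.
        String[][] pairs = {
            {"\u3042", "a"},  {"\u3044", "i"},   {"\u3046", "u"},  {"\u3048", "e"},  {"\u304A", "o"},
            {"\u3055", "sa"}, {"\u3057", "shi"}, {"\u3059", "su"}, {"\u305B", "se"}, {"\u305D", "so"},
            {"\u3089", "ra"}, {"\u308A", "ri"},  {"\u308B", "ru"}, {"\u308C", "re"}, {"\u308D", "ro"}
        };
        for (String[] p : pairs) {
            ROMAJI.put(p[0].charAt(0), p[1]);
        }
    }

    /** Returns the capitalized romanization, or null if any character is not in the table. */
    static String romanize(String name) {
        StringBuilder out = new StringBuilder();
        for (char c : name.toCharArray()) {
            // Katakana U+30A1..U+30F6 sits exactly 0x60 above the matching hiragana.
            if (c >= 0x30A1 && c <= 0x30F6) {
                c -= 0x60;
            }
            String r = ROMAJI.get(c);
            if (r == null) {
                return null; // Unknown character: caller should fall back to machine translation.
            }
            out.append(r);
        }
        if (out.length() == 0) {
            return null;
        }
        return Character.toUpperCase(out.charAt(0)) + out.substring(1);
    }
}
```

Returning null for anything outside the table means kanji names still go through the existing translation path, and only all-kana names take the phonetic route.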

Of course, once I get Japanese mode working for specific scrapers, you can just try to scrape in Japanese if you prefer.

I'll add S-Cute to the "todo list". :)