Work in Progress - IMVDb Music Video Scraper

tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Post: #16
Hey guys, thanks for your posts. They have renewed my interest in this scraper.

I have gone back through the scraper and, due to the lack of info in the API, I have changed it to use the main site to pull info. In doing so I believe the issue with the accents should be fixed (the API had the formatted versions of the characters). I have also been able to ensure that only the directors are scraped in the director field (an issue I think might have been there previously, not too sure though).

And I've added Studio (taken from production company instead of record label, which I think suits better in terms of the music video). Then I looked at the options for viewing music videos (which include genre and tag), so I found the genre and tag links on the music video pages and added them as well.

Also by using the main site I think I have solved some of the mismatching issues that I had. Still no way that I know of to solve the issue of the music video not being listed on imvdb.com (which is another issue I guess...)

Feel free to go to GitHub and download v0.4, and let me know how it runs.

EDIT: The only issue I seem to be having now is with this video http://imvdb.com/video/sigur-rós/saeglopur, no thumbnail is being scraped. Getting the error:
Code:
17:48:55 T:8988   ERROR: CCurlFile::Stat - Failed: HTTP response code said error(22) for http://images.imvdb.com/video/310237994574-sigur-rós-saeglopur_music_video_ov.jpg?v=2
17:48:55 T:8988   DEBUG: CTextureCacheJob::GetImageHash - unable to stat url http://images.imvdb.com/video/310237994574-sigur-rós-saeglopur_music_video_ov.jpg?v=2
All other thumbs have been scraped fine, and I can load the image from that url in my browser. Any thoughts?
(This post was last modified: 2015-01-11 12:32 by tphoenix.)
RockerC Offline
Posting Freak
Posts: 1,514
Joined: May 2011
Reputation: 30
Post: #17
Thanks for this! Please submit to Kodi's official add-ons repository too, have really been missing this in the repo!

Submitting_Add-ons (wiki)

Sad that other larger music db's like TheAudioDB.com and MusicBrainz.org don't cater for music videos as well
tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Post: #18
(2015-01-11 15:29)RockerC Wrote:  Thanks for this! Please submit to Kodi's official add-ons repository too, have really been missing this in the repo!

Submitting_Add-ons (wiki)

Sad that other larger music db's like TheAudioDB.com and MusicBrainz.org don't cater for music videos as well

Thanks for the support RockerC. I read through the submission guidelines and made some changes (added a fanart image and tweaked the icon.png file).

I'll probably wait for some more feedback from any users out there before submitting it.

Also, there is an existing Music Video scraper that uses TheAudioDB.com, however it links music videos to tracks from albums. I prefer IMVDb.com as it sees music videos as their own media (also why I chose to scrape production companies instead of record labels). Anyway, thanks again!
zag Offline
Retired Team-Kodi Member
Posts: 4,006
Joined: Oct 2007
Reputation: 75
Location: UK
Post: #19
(2015-01-11 15:29)RockerC Wrote:  Thanks for this! Please submit to Kodi's official add-ons repository too, have really been missing this in the repo!

Submitting_Add-ons (wiki)

Sad that other larger music db's like TheAudioDB.com and MusicBrainz.org don't cater for music videos as well

TADB supports music videos and has a working scraper. It even grabs info from the IMVDb music video site as well.

I'd be interested in any comparisons of the 2 scrapers.
(This post was last modified: 2015-01-11 18:11 by zag.)
hhtitan72 Offline
Junior Member
Posts: 3
Joined: Jan 2015
Reputation: 0
Post: #20
Quote:EDIT: The only issue I seem to be having now is with this video http://imvdb.com/video/sigur-rós/saeglopur, no thumbnail is being scraped. Getting the error:
Code:
17:48:55 T:8988 ERROR: CCurlFile::Stat - Failed: HTTP response code said error(22) for http://images.imvdb.com/video/3102379945...ov.jpg?v=2
17:48:55 T:8988 DEBUG: CTextureCacheJob::GetImageHash - unable to stat url http://images.imvdb.com/video/3102379945...ov.jpg?v=2
All other thumbs have been scraped fine, and I can load the image from that url in my browser. Any thoughts?

I have checked the link, and it seems the website is currently providing a YouTube link for a video that has since become private. I have changed the link that the site provides to an accessible one, so hopefully the problem is solved.
tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Post: #21
(2015-01-12 05:53)hhtitan72 Wrote:  I have checked the link, and it seems the website is currently providing a YouTube link for a video that has since become private. I have changed the link that the site provides to an accessible one, so hopefully the problem is solved.

That could be the issue but I'm not 100% sure. The image is hosted on the IMVDb site, and it's accessible in my browser, so it should be able to be scraped. I'll wait until the site updates with the new video and see what happens.

(2015-01-11 18:09)zag Wrote:  I'd be interested in any comparisons of the 2 scrapers.

TADb Scraper: downloads album artwork.
IMVDb Scraper: downloads a video thumbnail.

TADb Scraper: scrapes album.
IMVDb Scraper: does not scrape album.
(I'm not a fan of linking music videos to albums, or at least I don't think it's necessary, however that means you can't search music videos by album or by artist (because the next level down is album). I was thinking of just scraping the song title into the <album> tag. Either that or the artist name. Any other suggestions, or preferences to that?)

TADb Scraper: scrapes album year.
IMVDb Scraper: scrapes video year.

TADb Scraper: scrapes album genre.
IMVDb Scraper: scrapes genre listed on the video page (if there is one).

TADb Scraper: doesn't scrape tags.
IMVDb Scraper: scrapes hashtags listed on the video page (if any).

TADb Scraper: requires relatively clean naming convention.
IMVDb Scraper: no strict naming conventions.

IMVDb Scraper processed:
Code:
Music Videos\ACDC\ACDC - Play Ball.avi
Music Videos\Childish Gambino - Sober\ Childish Gambino - Sober.mp4
Music Videos\Ed Sheeran\ Ed Sheeran - Don't.avi
Music Videos\Ed Sheeran\ Ed Sheeran - Thinking Out Loud [Official Video].avi
Music Videos\Blink_182-After_Midnight-DDC-720p-x264-2012-FRAY_INT.avi
Music Videos\Sia - Elastic Heart feat. Shia LaBeouf & Maddie Ziegler (Official Video).avi
Music Videos\Sigur Ros - Sæglópur.avi
While with the same set, TADb Scraper only processed:
Code:
Music Videos\Ed Sheeran\ Ed Sheeran - Don't.avi
Music Videos\Ed Sheeran\ Ed Sheeran - Thinking Out Loud [Official Video].avi
Music Videos\Sigur Ros - Sæglópur.avi
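
To illustrate what tolerating loose naming involves, here is a minimal Python sketch that reduces a file name in the common "Artist - Title" form to a search query. It is not how either scraper is implemented; the function name `guess_artist_title` and the clean-up rules are hypothetical.

```python
import re
from pathlib import PureWindowsPath


def guess_artist_title(path):
    """Return (artist, title) guessed from the file name, or None.

    Hypothetical helper: assumes the name looks like "Artist - Title",
    optionally followed by bracketed notes such as "[Official Video]".
    """
    # Keep only the file name without its extension.
    stem = PureWindowsPath(path).stem
    # Drop bracketed or parenthesised notes, e.g. "(Official Video)".
    stem = re.sub(r"\s*[\[(][^\])]*[\])]", "", stem)
    artist, sep, title = stem.partition(" - ")
    if not sep:
        return None  # no " - " separator, so no safe guess
    return artist.strip(), title.strip()


print(guess_artist_title(
    r"Music Videos\Ed Sheeran\ Ed Sheeran - Thinking Out Loud [Official Video].avi"
))
```

A scene-style name such as the Blink_182 file has no " - " separator, so this sketch returns None for it. Handling names like that would need extra rules that are not shown here.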

Running these files through TADb scraper only picks up Ed Sheeran and Sigur Ros (I think Childish Gambino isn't picked up because it's not on TheAudioDb yet). So on the plus side my scraper should be able to scan whatever files you have. On the down side...

TADb Scraper: will ignore the file if no match is found.
IMVDb Scraper: will select the closest match if the video isn't listed on IMVDb.
This could be a big issue if you are trying to import a large library all at once. Any thoughts/suggestions/recommendations for this?
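
One possible middle ground is to accept the closest match only when it is similar enough to the search text, and to skip the file otherwise. The Python sketch below shows the idea with the standard library's `difflib`. The function name `best_match` and the 0.6 threshold are assumptions for illustration, not part of either scraper.

```python
from difflib import SequenceMatcher


def best_match(query, candidates, threshold=0.6):
    """Return the candidate most similar to query, or None if all are too different.

    Hypothetical helper: the threshold value is a guess and would need tuning.
    """
    best, best_score = None, 0.0
    for candidate in candidates:
        # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
        score = SequenceMatcher(None, query.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


results = ["Madness - Baggy Trousers", "Madness - Our House"]
print(best_match("Madness Baggy Trousers", results))
print(best_match("xyzzy", ["Play Ball"]))
```

With a threshold like this, a video that is not listed would be left unscraped rather than being given the details of an unrelated video, at the cost of occasionally rejecting a correct match with an unusual file name.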

The only other things I could think to add would be the extra credits for each music video. For example, the Sigur Ros video lists Producers, Record Label, Editorial, Art Department, Post-Production Department, and Visual Effects. Maybe also scraping an artist thumbnail, or some fanart from somewhere. Not sure about these though.

Thanks again for everyone's support!
Bridgwater Offline
Member
Posts: 59
Joined: Jan 2015
Reputation: 0
Post: #22
@tphoenix sadly I cannot get this scraper to work at all on my system. I have tried a few times with a few videos, I have this file structure

Music Videos
  - M
    - Madness
      - Baggy Trousers.MP4

I have tried Music Videos/Baggy Trousers.MP4 I have also tried leaving the file name the same as downloaded from YouTube but sadly with no joy.

Any advice on what I am doing wrong would be gratefully appreciated as the IMVDb is a fantastic website which I am sure will grow with time, so this scraper fully working would be excellent.

---------------------
To add to your questions

1) Album title is important for most people, more so than directors.

I have never used directors to search for a movie before, simply because I only know a few, such as Ron Howard, Clint Eastwood, Michael Bay, Tim Burton and maybe one or two more. I personally find this a wasted feature and really only for a true movie buff.

Regards

Mark
tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Post: #23
(2015-01-13 00:13)Bridgwater Wrote:  @tphoenix sadly I cannot get this scraper to work at all on my system. I have tried a few times with a few videos, I have this file structure

Music Videos
  - M
    - Madness
      - Baggy Trousers.MP4

I have tried Music Videos/Baggy Trousers.MP4 I have also tried leaving the file name the same as downloaded from YouTube but sadly with no joy.

Any advice on what I am doing wrong would be gratefully appreciated as the IMVDb is a fantastic website which I am sure will grow with time, so this scraper fully working would be excellent.

I just tried adding a dummy file to "\Music Videos\M\Madness\Baggy Trousers.avi" and it scraped fine. If you can submit a debug log I'll have a look at it and see what's going on on your end.

(2015-01-13 00:13)Bridgwater Wrote:  To add to your questions

1) Album title is important for most people, more so than directors.

I have never used directors to search for a movie before, simply because I only know a few, such as Ron Howard, Clint Eastwood, Michael Bay, Tim Burton and maybe one or two more. I personally find this a wasted feature and really only for a true movie buff.

Regards

Mark

Not all tracks from an album will have a music video, and some albums may not have any videos. Then there are single releases (which usually do have a music video), and artists who have made videos but have not released an album. In my opinion, while music videos are videos for a song that may come from an album, they are their own individual media. I understand where you are coming from (I've never searched for movies by director or studio before, or even by genre, I don't think), but music videos do have directors and production companies, and I think it's great to scrape that info. I personally would like to see music videos broken down by artist and then sorted by year. Of course I do value your feedback; this is still a WIP after all. Right now TheAudioDb scraper pulls all the information this one does, and then some, but it is based on artists, albums and tracks, with music videos added on. IMVDb is more focused on just the music videos (which is why I like it).

UPDATE
Scraper has been updated to v0.5

- Changed source of info back to the API (found a way to get more information out of it).
- Removed support for Genres and Tags (these are not available in the API).
- Added Album data (which is just the song title again). - This is to allow for drilling down by artist.
- Known issues: Possible issue when scraping thumbs that have an accented character in the URL. - could be the issue with the Sigur Ros videos.

I had to take out support for genre and tags as these are not listed on the API (at least not yet. I have made contact with the lead developer at IMVDb and he seems keen to get some feedback on how to improve it).
Also the issue I had before with accented characters (one of the other reasons why I changed scraping to the website directly) was because I didn't realise there was a fixchars flag. I enabled this and accented characters are showing fine.
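
For readers unfamiliar with this kind of conversion, the Python sketch below shows the general idea, assuming that the "formatted versions" of the characters returned by the API were HTML character entities. That is an assumption, and the sample strings are made up for illustration. The scraper itself relies on the fixchars flag rather than on any Python code.

```python
import html


def decode_entities(text):
    """Turn HTML character entities into the characters they stand for.

    Illustrative only: assumes the API text uses entities such as &oacute;.
    """
    return html.unescape(text)


# Made-up sample strings in the assumed entity-encoded form.
print(decode_entities("Sigur R&oacute;s"))
print(decode_entities("S&aelig;gl&oacute;pur"))
```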

I think that's about it for now. Again, all feedback is welcome.
Bridgwater Offline
Member
Posts: 59
Joined: Jan 2015
Reputation: 0
Post: #24
@tphoenix - I have worked out what was happening. When I set up the video folder I had the option to use folder names for lookups turned on, and this produced garbage results. I have since tried it with that option turned off and it scrapes almost every video correctly, though there are a few dodgy results which probably need tweaking at the user end. I hope that you carry on working on this scraper and also IMVDb, as for music video fans this is the best thing since MTV.

*** Update *** Just out of interest, I don't suppose you could post a picture of what you are actually scraping compared to what I am scraping could you please?

I am scraping directors, studios and year (although some show the wrong year). Apart from that I am not getting any data regarding the single/singer, such as a bio. I am not 100% sure what it is meant to scrape.

Regards

Mark
(This post was last modified: 2015-01-14 00:08 by Bridgwater.)
Bridgwater Offline
Member
Posts: 59
Joined: Jan 2015
Reputation: 0
Post: #25
Hi, tphoenix

I am just adding music videos and I have noticed one thing that you should stop straight away. You are scraping the artist name, but you are also using the song name for the album title. I have noticed that when you do this and go to an artist such as

Tina Turner

You see the songs, but then you have to click into the folder that it has created for each song. In my view the album name should be left blank if you are not actually going to put the album title in there, as you really do not want to have to double-click to play a song, and it is a hassle for playing the entire song list.

Regards

Mark
Hedda Offline
Fan
Posts: 676
Joined: Feb 2013
Reputation: 12
Post: #26
IMVDb.com's metadata itself (and any scraper for it) could be enhanced a lot if they only added artist and song id (identifier tag) for MusicBrainz (or FreeDB)

https://musicbrainz.org https://musicbrainz.org/doc/MusicBrainz_Identifier


That way it would be easy to gather additional information about the artists and songs in the music videos from these more populated sites.

TheMusicDB.com also uses those ID tags, I believe, which means that the scraper could also gather additional information from it too, if it only knew those ID tags.


MusicBrainz even have a FreeDB Gateway (mb2freedb) service that allows FreeDB clients to access MusicBrainz data through the FreeDB protocol

https://musicbrainz.org/doc/FreeDB_Gateway

http://www.freedb.org
(This post was last modified: 2015-01-16 11:48 by Hedda.)
tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Post: #27
(2015-01-13 20:26)Bridgwater Wrote:  Just out of interest, I don't suppose you could post a picture of what you are actually scraping compared to what I am scraping could you please?
I am scraping directors, studios and year (although some show the wrong year). Apart from that I am not getting any data regarding the single/singer, such as a bio. I am not 100% sure what it is meant to scrape.

The only information I am scraping from IMVDb is the track title, artist name, release year of the video (which may be different to the release year of the song), director/s, studio (listed on the site as production company) and a thumbnail. If any of this information is not listed then it won't be scraped. Currently the IMVDb site does not list any biography/plot/summary of any sort so this scraper cannot grab that information.

(2015-01-14 13:37)Bridgwater Wrote:  You are scraping the artist name, but you are also using the song name for the album title. I have noticed that when you do this and go to an artist such as Tina Turner, you see the songs, but then you have to click into the folder that it has created for each song. In my view the album name should be left blank if you are not actually going to put the album title in there, as you really do not want to have to double-click to play a song, and it is a hassle for playing the entire song list.

I have made a change. I am now also scraping the artist name a second time and putting it in the <album> tag to help with sorting. You will still need to click through an artist's name twice, however if you go to view music videos by album, it will simply list your artists and then their videos. It is not the nicest workaround, but it is better than it was before. Thanks for your feedback.
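
As a rough picture of the result, the Python sketch below builds an XML fragment with the fields listed above, repeating the artist name in the <album> element and leaving out anything that is not listed. The root element name `details`, the function name `build_details` and the sample values are assumptions for illustration. The scraper's real output may be laid out differently.

```python
import xml.etree.ElementTree as ET


def build_details(title, artist, year=None, directors=(), studio=None, thumb=None):
    """Build an XML string for one music video (hypothetical layout)."""
    root = ET.Element("details")  # root element name is an assumption
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "artist").text = artist
    # The artist name is reused as the album, as described in the post above.
    ET.SubElement(root, "album").text = artist
    if year is not None:
        ET.SubElement(root, "year").text = str(year)
    for name in directors:
        ET.SubElement(root, "director").text = name
    # Fields that are not listed for a video are simply left out.
    if studio is not None:
        ET.SubElement(root, "studio").text = studio
    if thumb is not None:
        ET.SubElement(root, "thumb").text = thumb
    return ET.tostring(root, encoding="unicode")


# Placeholder values, not real IMVDb data.
print(build_details("Example Song", "Example Artist", year=2015,
                    directors=["Director One", "Director Two"]))
```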

(2015-01-15 11:46)Hedda Wrote:  IMVDb.com's metadata itself (and any scraper for it) could be enhanced a lot if they only added artist and song id (identifier tag) for MusicBrainz (or FreeDB)
That way it would be easy to gather additional information about the artists and songs in the music videos from these more populated sites.
TheMusicDB.com also uses those ID tags, I believe, which means that the scraper could also gather additional information from it too, if it only knew those ID tags.

Thanks for sharing Hedda. Currently I am going to ask if Genre and Tags can be added to the API so that I can scrape these, and I'll ask about including MusicBrainz identifiers to artists/tracks. They currently have links to buying music on iTunes and Amazon, maybe adding a section for MusicBrainz ID won't be a problem. Can't hurt to ask.

Also, does anyone know if music videos can have cast/crew information as well, or even featured artists? Is there support for any information like that being scraped? This kind of information is currently available on the API.

I have updated the scraper to v0.6 as per the above change (albums are now artist name, no longer track name). There is still just the issue of scraping thumbs with accented characters in the URL. Anyone have an idea on how to fix this?

Again, thanks for everyone's support!
zag Offline
Retired Team-Kodi Member
Posts: 4,006
Joined: Oct 2007
Reputation: 75
Location: UK
Post: #28
(2015-01-16 09:32)tphoenix Wrote:  Also, does anyone know if music videos can have cast/crew information as well, or even featured artists? Is there support for any information like that being scraped? This kind of information is currently available on the API.
I have updated the scraper to v0.6 as per the above change (albums are now artist name, no longer track name). There is still just the issue of scraping thumbs with accented characters in the URL. Anyone have an idea on how to fix this?

No cast/crew support in kodi at the moment but it would be a nice new feature. Especially to sort by director.

Not sure about the accented characters. You might want to check that the API supports UTF-8.
Hedda Offline
Fan
Posts: 676
Joined: Feb 2013
Reputation: 12
Post: #29
(2015-01-16 11:17)zag Wrote:  
(2015-01-16 09:32)tphoenix Wrote:  Also, does anyone know if music videos can have cast/crew information as well, or even featured artists?
No cast/crew support in kodi at the moment but it would be a nice new feature. Especially to sort by director.
Would be fun to have "Cameo appearance" (often shortened to just cameo) too as those are always funny to see in music videos.

http://en.wikipedia.org/wiki/Cameo_appearance

Cameos in music videos are not the same as featured artists, as the person doing a cameo does not actually sing the song; they are only appearing as an actor in the music video.

Google using keywords like "music video cameo" or "cameos music videos" and you find a bunch.
(This post was last modified: 2015-01-16 11:53 by Hedda.)
tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Post: #30
(2015-01-16 11:17)zag Wrote:  No cast/crew support in kodi at the moment but it would be a nice new feature. Especially to sort by director.

Not sure about the accented characters. You might want to check that the API supports UTF-8.

You can't sort by director as such, but you can look at videos by director (instead of artist). You can also search through studios, year, genre and tags (these last two I hope to get implemented in the API so I can scrape them).

The URL of the thumbnail is being scraped correctly, but Kodi doesn't seem to be able to download it. As per the error code I have been getting:
Code:
17:48:55 T:8988   ERROR: CCurlFile::Stat - Failed: HTTP response code said error(22) for http://images.imvdb.com/video/310237994574-sigur-rós-saeglopur_music_video_ov.jpg?v=2
17:48:55 T:8988   DEBUG: CTextureCacheJob::GetImageHash - unable to stat url http://images.imvdb.com/video/310237994574-sigur-rós-saeglopur_music_video_ov.jpg?v=2
That's the URL that is in the <thumb> tag, and it loads fine in a browser, but for some reason it isn't being downloaded by Kodi.
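
One thing that may be worth testing: error(22) is libcurl's code for an HTTP status of 400 or above, and a browser percent-encodes non-ASCII characters in a URL before sending it. If Kodi sends the "ó" as-is, the server could reject the request. That is only a guess at the cause. The Python sketch below shows what the percent-encoded form of the thumbnail URL would look like, assuming UTF-8 encoding. The function name `percent_encode_url` is hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_url(url):
    """Percent-encode non-ASCII characters in the path of a URL (UTF-8)."""
    parts = urlsplit(url)
    # Keep "/" and already-encoded "%" sequences untouched.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


url = ("http://images.imvdb.com/video/"
       "310237994574-sigur-rós-saeglopur_music_video_ov.jpg?v=2")
print(percent_encode_url(url))
```

If the encoded URL downloads where the raw one fails, encoding the value before it goes into the <thumb> tag might be a workaround. Whether the server expects UTF-8 or some other encoding would still need checking.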

(2015-01-16 11:47)Hedda Wrote:  Would be fun to have "Cameo appearance" (often shortened to just cameo) too as those are always funny to see in music videos.

Cameos in music videos are not the same as featured artists, as the person doing a cameo does not actually sing the song; they are only appearing as an actor in the music video.

IMVDb pretty much has this covered in their cast section (similar to their crew section). See Beastie Boys - Make Some Noise as an example. Given all the information that IMVDb supports for cast/crew/featured artists, it would be great to get some of these things implemented.