Work in Progress - IMVDb Music Video Scraper

tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Video  IMVDb Music Video Scraper
Post: #1

IMVDb Music Video Scraper

Current version: 0.6

Scrapes: Artist, Title, Year, Director/s, Thumbnail, Studio, Genre/s, Tag/s, Album (as Artist).

v0.6 (2015-01-16)
- Changed album name to be scraped from artist name instead of track title. This allows for all music videos by an artist to be viewed at once.

v0.5 (2015-01-13)
- Changed source of info back to the API (found a way to get more information out of it).
- Removed support for Genres and Tags (these are not available in the API).
- Added Album data (which is just the song title again).
- Known issues: Possible issue when scraping thumbs that have an accented character in the URL.

v0.4 (2015-01-11)
- Changed source of info to the main site, as there is limited information on the API.
- Added support for Studio (scraped from Production Company).
- Also added Genre and Tag support.

v0.3 (2014-10-09)
- Added ability to scrape thumbnail.

v0.2 (2014-10-09)
- Added ability to scrape director/s.

v0.1 (2014-10-04)
- Beta release: Currently scrapes artist, title, and release year.
I recently stumbled across The Internet Music Video Database, or IMVDb, when looking for some music videos to watch. It seems like a great source of information for music videos in XBMC, so I decided to start putting together a scraper for it. I've only dabbled in bits and pieces of coding before so any help or advice is appreciated.

(2014-10-04 15:46)tphoenix Wrote:  Currently this scraper only retrieves artist, track and year information. I wasn't sure what else to include, or what else can be displayed by XBMC. Some videos have a director listed, or other production companies, and every video seems to have thumbs. Should the scraper also be pulling this information?

I have been testing this with all of my video files in one folder, and your music video files do not have to be in the 'artist - track' naming format. I have been able to scrape videos using their YouTube names (e.g. "David Guetta - Play Hard ft. Ne-Yo, Akon (Official Video)" or "Ariana Grande feat. Big Sean - Right There"). The only video I had trouble with (I haven't tried a huge range though) was "Jessie J, Ariana Grande, Nicki Minaj - Bang Bang", possibly because of the commas; this now works. Originally the only change the scraper made to file names was to replace - and ! with spaces: the - is treated as a minus (excluding search terms) and the ! ends the search string (or something like that, I'm not entirely sure). The scraper now replaces commas, dashes, exclamation marks, and spaces with %20.
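
As a rough illustration of that replacement step, here is a minimal Python sketch. It is not the scraper's own code; `build_search_query` is a hypothetical name, and collapsing a run of separators (such as " - ") into a single %20 is an assumption made here to keep the query tidy.

```python
import re


def build_search_query(filename):
    """Turn a music video file name into a %20-separated search string."""
    # Replace every run of commas, dashes, exclamation marks, and
    # whitespace with a single %20 (run collapsing is an assumption).
    return re.sub(r"[,\-!\s]+", "%20", filename.strip())


if __name__ == "__main__":
    print(build_search_query("Jessie J, Ariana Grande, Nicki Minaj - Bang Bang"))
```

Note that a hyphen inside a name is treated like any other dash, so "Ne-Yo" becomes "Ne%20Yo".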

For the most part the scraper is working. If anyone wants to try it out and give me some feedback, I would really appreciate it. You can get v0.4 here (earlier releases: v0.1, v0.3).

Special thanks to pko66 for his how-to on writing media scrapers, budswell for his MMA scraper (I used it as an example and for comparison), UsagiYojimbo for his Java tool ScraperEdit, and the folks over at IMVDb.
(This post was last modified: 2015-07-01 09:29 by tphoenix.)
zag Offline
Retired Team-Kodi Member
Posts: 4,006
Joined: Oct 2007
Reputation: 75
Location: UK
Post: #2
Nice! was meaning to suggest this earlier.

Take a look at theaudiodb mvid scraper for reference. It should be very similar, as they both have a JSON API.

That scraper returns either a video screenshot, or the album cover as the image.

Also, I started collecting the IMVDb IDs on TADB for the future. Not sure how much use it will be.

I know from working on previous scrapers that removing the ft. and feat. is sometimes useful, but if it's working without this manipulation then great :)
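
For anyone who does need that manipulation, here is a minimal Python sketch of stripping a featured-artist clause from a name. It is only an illustration, not code from either scraper; `strip_featured_artists` and `FEAT_PATTERN` are hypothetical names, and the rule that the clause ends at " - ", an opening bracket, or the end of the string is an assumption.

```python
import re

# Matches "ft.", "feat." or "featuring" plus everything up to the next
# " - ", the next "(", or the end of the string (assumed boundaries).
FEAT_PATTERN = re.compile(
    r"\s*\b(?:ft|feat|featuring)\b\.?\s+.*?(?=\s+-\s+|\s*\(|$)",
    re.IGNORECASE,
)


def strip_featured_artists(name):
    """Remove a featured-artist clause from an artist/title string."""
    return FEAT_PATTERN.sub("", name).strip()


if __name__ == "__main__":
    print(strip_featured_artists("Ariana Grande feat. Big Sean - Right There"))
```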

Can you get it up on github?
(This post was last modified: 2014-10-06 13:41 by zag.)
tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Post: #3
I went ahead and put it up on github. I haven't used the site before, so hopefully I did it right. Here's the link: https://github.com/tphoenix/imvdb-scraper

I have updated the scraper to version 0.2, adding the ability to scrape director information if available.

I'll have a look into adding thumb support as well.

EDIT: I have thumb support! Updated on github to v0.3.

Now the only concern I have is the way it grabs the director info. Currently it will just grab anyone listed as working on the video under entity_name (this may not always be a director). I haven't seen anything other than a director so far, but if you do see someone being scraped by mistake, let me know.
(This post was last modified: 2014-10-09 05:19 by tphoenix.)
zag Offline
Retired Team-Kodi Member
Posts: 4,006
Joined: Oct 2007
Reputation: 75
Location: UK
Post: #4
Nice one, will try to give it a test this weekend.

I've only ever used:

directors[0]->entity_name

for the director and it seems to work
zag Offline
Retired Team-Kodi Member
Posts: 4,006
Joined: Oct 2007
Reputation: 75
Location: UK
Post: #5
Also just a small thing, but could you add this forum thread url and github source url into the addon.xml file?
tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Post: #6
I've added the links to the addon.xml file.

Here's part of the music video info page. The scraper looks for entity_name when scraping director info, and I turned on the repeat flag for videos like this one with multiple directors.

Code:
"directors": [
        {
            "position_name": "Director",
            "position_code": "dir",
            "entity_name": "Grady Hall",
            "entity_slug": "grady-hall",
            "entity_id": 59838,
            "position_notes": "",
            "position_id": 59838,
            "entity_url": "http:\/\/imvdb.com\/n\/grady-hall"
        },
        {
            "position_name": "Director",
            "position_code": "dir",
            "entity_name": "Mark Kudsi",
            "entity_slug": "mark-kudsi",
            "entity_id": 59839,
            "position_notes": "",
            "position_id": 59839,
            "entity_url": "http:\/\/imvdb.com\/n\/mark-kudsi"
        }
    ],

Now, if a different position is listed under this it will also get picked up (I assume it will follow the same format but with a different position code). However, as I said, I haven't seen any as yet.
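
One way to guard against that would be to check position_code as well as entity_name. The Python sketch below only illustrates the idea and is not how the scraper itself is written. The field names come from the JSON above, but `director_names` is a hypothetical helper and the third credit in the sample (code "prod") is invented purely to show the filter working.

```python
import json

# Trimmed-down sample. The first two credits mirror the JSON above; the
# third is a made-up non-director entry used only for illustration.
SAMPLE = """
{"directors": [
    {"position_code": "dir", "entity_name": "Grady Hall"},
    {"position_code": "dir", "entity_name": "Mark Kudsi"},
    {"position_code": "prod", "entity_name": "Example Person"}
]}
"""


def director_names(video):
    """Return entity_name for every credit whose position_code is 'dir'."""
    return [
        credit["entity_name"]
        for credit in video.get("directors", [])
        if credit.get("position_code") == "dir" and credit.get("entity_name")
    ]


if __name__ == "__main__":
    print(director_names(json.loads(SAMPLE)))
```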

Now that it scrapes Artist, Title, Year, Director/s and thumbnail, is there anything else that would be good to scrape? The only other info I can see available that might be good is this:
Code:
"release_date_string": "September 5, 2013"

Is that any different/worth having as well as Year?
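
For comparison, the string carries a month and day on top of the year. A minimal Python sketch of parsing it follows; `parse_release_date` is a hypothetical name, the "Month D, YYYY" layout is assumed from the single example above, and `%B` expects English month names under the default locale.

```python
from datetime import datetime


def parse_release_date(release_date_string):
    """Parse a string such as 'September 5, 2013' into a date object."""
    # Assumes the "Month D, YYYY" layout seen in the example above.
    return datetime.strptime(release_date_string, "%B %d, %Y").date()


if __name__ == "__main__":
    released = parse_release_date("September 5, 2013")
    print(released.year, released.isoformat())
```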

Thanks for your feedback again, appreciate it.
zag Offline
Retired Team-Kodi Member
Posts: 4,006
Joined: Oct 2007
Reputation: 75
Location: UK
Post: #7
Just gave this a test on my complete music video library.

Works well, but I get a lot of duplicate entries. When a song is not found, it seems to match the first one by that artist.
tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Post: #8
Hmm, I didn't think of that.

If the music video isn't listed on imvdb.com then it must just grab the closest thing (first one from the artist).

I'm guessing there is no possible workaround using just imvdb as the source. Ideally their database would be quite complete; however, given how open theaudiodb.com is, it may be the better option in this instance.

Hmm...
WhiteSpy Offline
Junior Member
Posts: 9
Joined: Jan 2012
Reputation: 0
Post: #9
How do you install or use this scraper?
tphoenix Offline
Junior Member
Posts: 26
Joined: Nov 2011
Reputation: 1
Post: #10
If you go here: https://github.com/tphoenix/imvdb-scraper, on the right-hand side there will be a button that says Download ZIP.

Download the zip file, extract it somewhere, copy the folder metadata.musicvideos.imvdb to your XBMC addons folder and restart XBMC. When you add/edit your music video source folder you should be able to choose the IMVDb scraper.

As zag pointed out though, if the music video you are trying to scrape is not listed on the imvdb site then it will not work properly.
phil65 Offline
Team-Kodi Developer
Posts: 6,686
Joined: Mar 2009
Reputation: 147
Location: Cologne, Germany
Post: #11
Perhaps you could try to extract a clean version of the song title from the filename and compare it with the scraped title to see if it is a proper match.
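
A minimal Python sketch of that idea, for illustration only. All function names are hypothetical, and both the assumption that the title follows the first " - " in the file name and the 0.8 similarity threshold are arbitrary choices rather than anything the scraper does.

```python
import re
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case the text and keep only word characters, space-separated."""
    return " ".join(re.findall(r"\w+", text.lower()))


def title_from_filename(filename):
    """Guess the song title from an 'artist - title (notes)' style name."""
    # Assumes the title follows the first " - "; otherwise use the whole name.
    _, _, title = filename.partition(" - ")
    # Drop bracketed notes such as "(Official Video)" or "[HD]".
    title = re.sub(r"\([^)]*\)|\[[^\]]*\]", "", title or filename)
    return normalise(title)


def is_probable_match(filename, scraped_title, threshold=0.8):
    """Return True when the file name's title resembles the scraped title."""
    ratio = SequenceMatcher(
        None, title_from_filename(filename), normalise(scraped_title)
    ).ratio()
    return ratio >= threshold


if __name__ == "__main__":
    print(is_probable_match("Ariana Grande feat. Big Sean - Right There", "Right There"))
```

A result that fails the check could then be skipped instead of being stored as a duplicate. A "ft." clause left in the title part of the file name lowers the score, so it may need stripping before the comparison.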

hhtitan72 Offline
Junior Member
Posts: 3
Joined: Jan 2015
Reputation: 0
Post: #12
Thank you for creating this add-on, it works really well. However, I recently stumbled upon a problem: it doesn't seem to display letters with accents correctly, as you're presented with a bunch of letters and numbers where the accent should be. As an example, the actual track title should be Fierté, but you're presented with this instead: http://i.imgur.com/i7c0DNa.jpg?1
Jeroen Offline
Skilled Skinner
Posts: 3,127
Joined: Feb 2008
Reputation: 49
Location: The Netherlands
Post: #13
(2014-10-04 15:46)tphoenix Wrote:  Currently this scraper only retrieves artist, track and year information. I wasn't sure what else to include, or what else can be displayed by XBMC. Some videos have a director listed, or other production companies, and every video seems to have thumbs. Should the scraper also be pulling this information?
It would be great if the scraper could also fetch that additional information. Way back, XBMC's music video scrapers at least pulled this information in. It was never really complete, but IMVDb looks promising.

As does your scraper. Best results I have had with music video scrapers in a long time, really nice job!

From a skin perspective, the infolabels available for music videos are:

ListItem.Artist
ListItem.Album
ListItem.Title
ListItem.Studio (this could be used for the record label)
ListItem.Date
ListItem.Duration
ListItem.Writer (could be used for composer if that's available on IMVDb)
ListItem.Director
ListItem.Plot / ListItem.PlotOutline (could be used for biography)

I think that's all of them. Most of the above are also available with the VideoPlayer prefix, like:

VideoPlayer.Album (to show the info on the video osd)

Furthermore I experienced the same problem as hhtitan72 with special characters. Instead of the artist name "Sigur Rós" it displays "Sigur R\U00F3S" for me.
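
If the stray letters and numbers are JSON \uXXXX escapes from the API response that never get decoded (an assumption, not something confirmed here), the fix amounts to turning each escape back into its character. A full JSON parser does this automatically; the minimal Python sketch below shows the substitution on its own. `decode_unicode_escapes` is a hypothetical name, and characters outside the Basic Multilingual Plane (surrogate pairs) are not handled.

```python
import re


def decode_unicode_escapes(text):
    """Replace literal \\uXXXX sequences with the characters they encode."""
    return re.sub(
        r"\\[uU]([0-9a-fA-F]{4})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


if __name__ == "__main__":
    # The raw string keeps the backslash, as it would appear in the API text.
    print(decode_unicode_escapes(r"Sigur R\u00f3s"))
```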

Keep it up :)
(This post was last modified: 2015-01-09 23:51 by Jeroen.)
LongMan Offline
Senior Member
Posts: 128
Joined: Apr 2013
Reputation: 0
Post: #14
@Jeroen

Does ListItem.Date actually work? I have not been able to get it to show the date. Any help would be greatly appreciated.

Cheers
Jeroen Offline
Skilled Skinner
Posts: 3,127
Joined: Feb 2008
Reputation: 49
Location: The Netherlands
Post: #15
(2015-01-10 22:49)LongMan Wrote:  @Jeroen

Does ListItem.Date actually work? I have not been able to get it to show the date. Any help would be greatly appreciated.

Cheers

Yeah, should work. At least it does with TV episodes, for example. I don't see why it wouldn't work with music videos, although it is up to the scraper to get the dates and "assign" them to the infolabel.