Kodi Community Forum

[RELEASE] Texture Cache Maintenance utility - Milhouse - 2013-03-05

It's annoying having to delete the entire Textures13.db+Thumbnails folder to try and correct scraping errors and/or corrupted images that just never go away, so I've written a very trivial Python script that can be used to interrogate the texture cache database, and can also be used to remove - with absolute precision - those database rows and cached files that relate to problematic artwork.

The script has been tested with local installations of XBMC Frodo on OpenELEC R-Pi and Ubuntu 12.10, and also Windows with Python 2.7.3 running against a remote Frodo installation of XBMC. A separate properties file can be used to specify non-default configurations (see property file details at the end of this post).

This is not an addon, and it requires that the webserver is enabled in XBMC on port 8080 (unless another port is specified in the properties file). It also assumes you are comfortable working at the command line (ssh in Linux, a large CMD window may also work OK for Windows users).
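
To illustrate what the webserver requirement means in practice, here is a minimal sketch of the kind of JSON-RPC request XBMC accepts over HTTP. It only builds the request and does not send it. The helper name build_jsonrpc_request, the default host and the chosen properties are assumptions for illustration; texturecache.py may construct its requests differently.

```python
import json

def build_jsonrpc_request(host="localhost", port=8080,
                          method="VideoLibrary.GetMovies",
                          params=None, request_id=1):
    """Return (url, body) for a JSON-RPC 2.0 call to the XBMC webserver.

    Hypothetical helper for illustration only.
    """
    url = "http://%s:%d/jsonrpc" % (host, port)
    payload = {"jsonrpc": "2.0", "method": method, "id": request_id}
    if params is not None:
        payload["params"] = params
    return url, json.dumps(payload)

# Ask for the title and artwork of every movie in the library.
url, body = build_jsonrpc_request(params={"properties": ["title", "art"]})
print(url)
print(body)
```

The body would then be sent as an HTTP POST with a Content-Type of application/json, which is why the webserver must be enabled and reachable on the configured port.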

Summary of features
  • [c, C] Automatically cache missing artwork (c), or with option C force the re-caching of existing artwork by removing first then downloading. Can use multiple threads (default is 2)
  • [nc] Identify those items that require caching (and would be cached by c option)
  • [lc, lnc] Same as c and nc, but only considers those media (movies, tvshows/episodes) added since the modification timestamp of the file identified by the property lastrunfile
  • [p, P] Prune texture cache by identifying (p) or removing (P) accumulated cruft such as image previews, previously deleted movies/tv shows/music whose artwork remains in the texture cache even after cleaning the database. Essentially, remove any cached file that is no longer associated with an entry in the media library and is therefore just wasting disk space
  • [s, S] Search texture cache database for specific entries and view database content - can help explain reasons for incorrect artwork. S will return only results for items that no longer exist in the Thumbnails folder (but are still in the database).
  • [x, X] Extract rows from texture cache database, with optional SQL filter. X will return only those database results for items that no longer exist in the Thumbnails folder
  • [Xd] Remove those rows from the texture cache database that do not have corresponding files in the Thumbnails folder (ie. remove same rows identified by X)
  • [d] Delete specific database rows and corresponding files from the texture cache using database row identifier (see s/S/x/X)
  • [r, R] Reverse query cache, identifying "orphaned" files that are no longer referenced in the texture cache database. Use R option to auto-delete these files.
  • [j, J, jd, Jd] Query media library using JSON API, and export content as JSON (suitable for further external processing). J and Jd will include additional user defined fields; jd and Jd will "decode" (unquote) all URLs
  • [qa] Perform QA check on media library recently added items, identifying missing properties (eg. plot, mpaa certificate, artwork etc.). Default QA period is previous 30 days. Add property qa.file = yes to verify file exists during QA. Add additional QA fields using qa.art.*, qa.blank.* and qa.zero.* properties.
  • [qax] Like the qa option, but performs a remove and then rescan of any media found to fail the QA tests
  • [set, testset] Set values on movie, set, tvshow, episode, musicvideo, artist, album and song. Pass parameters on the command line, or as a batch of data read from stdin. testset will perform a dry-run. See [setting fields](#setting-fields-in-the-media-library) for more details.
  • [remove] Remove specified library item from media library, ie. remove movie 123
  • [imdb] Update IMDb fields on movies and tvshows. Pipe output into set option to apply changes to the media library using JSON. Uses http://www.omdbapi.com to query latest IMDb details based on imdbnumber (movies) or title and season/episode # (tvshows). Movies without imdbnumber will not be updated. Specify movie fields to be updated with the property @imdb.fields.movies from the following: top250, title, rating, votes, year, runtime, genre, plot and plotoutline - the default value is: rating, votes, top250. Specify tvshow fields to be updated with the property @imdb.fields.tvshows from the following: title, rating, votes, year, runtime, genre, plot and plotoutline. Specify additional fields to the default by prefixing the list with +, eg. @imdb.fields.movies=+year,genre to update movie ratings, votes, top250, year and genre. See log for old/new values. See announcement for further important details.
  • [purge hashed|unhashed|all] Delete cached artwork containing specified patterns, restricted to rows with (hashed) or without (unhashed) a lasthashcheck, or all rows if it doesn't matter, eg. purge unhashed youtube iplayer imdb.com
  • [purgetest hashed|unhashed|all] Dry-run version of purge - will show what would be removed during an actual purge
  • [watched] Backup and restore movie and tvshow watched lists to a text file. Watched list will be restored keeping more recent playcount, lastplayed and resume points unless property watched.overwrite=yes is specified, in which case the watched list will be restored exactly as per the backup.
  • [duplicates] List movies that appear more than once in the media library with the same imdb number
  • [missing] Locate media files missing from the specified media library and source label, eg. missing movies "My Movies"
  • [ascan, vscan] Initiate audio/video library scan, either entire library or a specific path (see sources). The exit status is the number of items added during each scan, ie. 0 or +n.
  • [aclean, vclean] Clean audio/video library
  • [directory] Obtain directory listing for a specific path (see sources)
  • [sources] List of sources for a specific media class (video, music, pictures, files, programs)
  • [status] Display status of client - ScreenSaver active, IsIdle (default period 600 seconds, or user specified) and active Player type (audio or video), plus title of any media currently being played.
  • [monitor] Display client event notifications as they occur
  • [power] Set power state of client - suspend, hibernate, shutdown or reboot
  • [wake] Use Wake-on-LAN to wake a suspended/hibernating remote client. Specify the MAC address of the remote client in the network.mac property (ie. network.mac=xx:xx:xx:xx:xx:xx). When the client is no longer required, suspend or hibernate it with the relevant power option
  • [exec, execw] Execute the specified addon, with optional parameters. eg. exec script.artwork.downloader silent=true mediatype=tvshow. Use execw to wait, but this rarely has any effect (possibly not implemented by JSON?)
  • [setsetting] Set the value of the named setting, eg. setsetting locale.language English
  • [getsetting] Get the current value of the named setting, eg. getsetting locale.language
  • [getsettings] View details of all settings, or those where pattern is contained within id, eg. getsettings debug to view details of all debug-related settings
  • [debugon, debugoff] Enable/Disable debugging
  • [play, playw] Play specified item (local file, playlist, internet stream etc.), optionally waiting until playback is ended
  • [stop] Stop playback
  • [pause] Toggle pause/playback of currently playing media
  • [config] View current configuration
  • [version] View current installed version
  • [update] Manually update to latest available version if not already installed. Only required if checkupdate or autoupdate properties are set to no as by default the script will automatically update itself (if required) to the latest version whenever it is executed.
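
Regarding the [wake] option above: a Wake-on-LAN "magic packet" is simply six 0xFF bytes followed by the target MAC address repeated sixteen times, normally sent as a UDP broadcast. The following is a minimal Python 3 sketch of that standard technique, not the script's own code; the function names, the broadcast address and the choice of UDP port 9 are assumptions.

```python
import socket

def build_magic_packet(mac):
    """Build a Wake-on-LAN magic packet: 6 x 0xFF, then the MAC 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("expected a 6-byte MAC address, got %r" % mac)
    mac_bytes = bytes.fromhex(hex_digits)
    return b"\xff" * 6 + mac_bytes * 16

def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the packet over UDP (address and port are assumed defaults)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
    finally:
        sock.close()
```

The MAC passed in would be the same value as the network.mac property, eg. send_magic_packet("xx:xx:xx:xx:xx:xx") with the real address substituted.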

Installation instructions

For instructions on how to run a Python script, see here.

Windows users will need to install Python 2.7.x or Python 3.3.x, if not already installed (see how-to).

Download the single Python script file from GitHub (use "Save as" in your browser). A default properties file is also available on GitHub; rename it to texturecache.cfg in order to use it, although in many cases it's not required and should just be considered a template - pick and choose the options you wish to override.

Alternatively, non-Windows users can download the script directly at the command line using curl:
Code:
curl https://raw.githubusercontent.com/MilhouseVH/texturecache.py/master/texturecache.py -o texturecache.py
chmod +x ./texturecache.py

ATV2 (iOS) users

Python 2.6+ is required to run this script, and although Python can be installed on iOS using "apt-get install python", the version installed (typically v2.5.1 - check with "python --version") is very old and lacks language features required by the script. It is possible to install a more recent Python 2.7.3 package as follows:
Code:
ssh <user>@<atv2-address>
rm -f python*.deb
wget http://yangapp.googlecode.com/files/python_2.7.3-3_iphoneos-arm.deb
dpkg -i python*.deb
rm python*.deb

Example usage

Let's say the poster image for my "Dr. No" movie is corrupted, and I want to delete it so that XBMC will automatically re-cache it (hopefully correctly) next time it is displayed:

1) Execute: ./texturecache.py s "Dr. No" to search for my Dr. No related artwork

2) Several rows should be returned, relating to different cached artwork - one row will be for the poster, the other fanart, and there may also be rows for other image types too (logo, clearart etc.). This is what I get:
Code:
000226|5/596edd13.jpg|0720|1280|0011|2013-03-05 02:07:40|2013-03-04 21:27:37|nfs://192.168.0.3/mnt/share/media/Video/Movies/James Bond/Dr. No (1962)[DVDRip]-fanart.jpg
000227|6/6f3d0d94.jpg|0512|0364|0003|2013-03-05 02:07:40|2013-03-04 22:26:38|nfs://192.168.0.3/mnt/share/media/Video/Movies/James Bond/Dr. No (1962)[DVDRip].tbn
Matching row ids: 226 227

3) Since I want to remove only the poster (.tbn) I can execute ./texturecache.py d 227 and both the database row and cached poster image will be deleted. If I wanted to remove both images, I would simply execute ./texturecache.py d 226 227 and the two rows and their corresponding cached images will be removed.

Alternatively, ./texturecache.py C movies "Dr. No" would achieve the same result including re-caching the deleted items.

Format of "dumped" records

When displaying rows from the database, the following fields (columns) are shown:

Code:
rowid, cachedurl, height, width, usecount, lastusetime, lasthashcheck, url
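
If you want to post-process dumped rows in your own script, the field list above maps directly onto a split on the separator. This is a minimal sketch assuming the default "|" separator and the eight fields in the order listed; the names TextureRow and parse_row are made up for the example.

```python
from collections import namedtuple

FIELDS = ["rowid", "cachedurl", "height", "width",
          "usecount", "lastusetime", "lasthashcheck", "url"]
TextureRow = namedtuple("TextureRow", FIELDS)

def parse_row(line, sep="|"):
    """Split one dumped row into named fields.

    The split is limited to 7 separators so that any separator
    characters inside the trailing URL are left intact.
    """
    parts = line.rstrip("\n").split(sep, len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    row = TextureRow(*parts)
    # Numeric columns are zero-padded in the dump, so convert them to int.
    return row._replace(rowid=int(row.rowid), height=int(row.height),
                        width=int(row.width), usecount=int(row.usecount))

line = ("000227|6/6f3d0d94.jpg|0512|0364|0003|2013-03-05 02:07:40|"
        "2013-03-04 22:26:38|nfs://192.168.0.3/mnt/share/media/Video/"
        "Movies/James Bond/Dr. No (1962)[DVDRip].tbn")
row = parse_row(line)
print(row.rowid, row.cachedurl, row.height, row.width, row.usecount)
```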

Additional usage examples

To view your most recently accessed artwork:
Code:
./texturecache.py x | sort -t"|" -k6
-- or --
./texturecache.py x "order by lastusetime asc"

or your top 10 accessed artwork...
Code:
./texturecache.py x | sort -t"|" -k5r | head -10
-- or --
./texturecache.py x "order by usecount desc" 2>/dev/null | head -10

Use texturecache.py to identify artwork for deletion, then cut and paste the matched ids into the "d" option, or automate it with a script:

For example, to delete those small remote thumbnails you might have viewed when downloading artwork (and which still clutter up your cache):
Code:
./texturecache.py s "size=thumb"
<then cut & paste the ids as an argument to ./texturecache.py d>

and the same, but automatically:
Code:
IDS=$(./texturecache.py s "size=thumb" 2>&1 1>/dev/null | sed "s/.*: //")
[ -n "$IDS" ] && ./texturecache.py d $IDS
-- or --
./texturecache.py P
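
The sed expression above just strips everything up to the colon on the "Matching row ids:" summary line. For anyone scripting this in Python instead, a rough equivalent is sketched below; it assumes the summary line looks exactly like the one in the Dr. No example, and extract_row_ids is a hypothetical helper name.

```python
import re

def extract_row_ids(output):
    """Return the ids listed on a 'Matching row ids:' line, or [] if absent."""
    match = re.search(r"^Matching row ids:[ \t]*(.*)$", output, re.MULTILINE)
    if not match:
        return []
    return [int(token) for token in match.group(1).split()]

print(extract_row_ids("Matching row ids: 226 227\n"))
```

An empty result should be treated like the [ -n "$IDS" ] test in the shell version: only call the d option when there is at least one id.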

Identify any artwork that has not been accessed since a particular date (the x option only lists the rows; pass the resulting ids to the d option to delete them):
Code:
./texturecache.py x "where lastusetime <= '2013-03-05'"

or hasn't been accessed more than once:
Code:
./texturecache.py x "where usecount <= 1"
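
One thing to be aware of with date filters like the one above: the timestamps are shown as "YYYY-MM-DD HH:MM:SS" text, so (assuming they are also stored as text) the comparison is a plain string comparison. The sketch below demonstrates this with an in-memory SQLite table. The table name demo and its three columns are a simplified stand-in invented for the example, not the real Textures13.db schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table holding only the columns used in the filters.
conn.execute("CREATE TABLE demo (id INTEGER, usecount INTEGER, lastusetime TEXT)")
conn.executemany("INSERT INTO demo VALUES (?, ?, ?)", [
    (1, 1, "2013-03-04 21:27:37"),
    (2, 11, "2013-03-05 02:07:40"),
    (3, 3, "2013-03-06 09:00:00"),
])

def ids_where(clause):
    """Apply a filter clause to the demo table and return the matching ids."""
    cur = conn.execute("SELECT id FROM demo " + clause + " ORDER BY id")
    return [r[0] for r in cur.fetchall()]

print(ids_where("WHERE lastusetime <= '2013-03-05'"))  # row 2 is NOT matched
print(ids_where("WHERE lastusetime < '2013-03-06'"))   # rows 1 and 2
print(ids_where("WHERE usecount <= 1"))                # row 1 only
```

Because "2013-03-05 02:07:40" sorts after the shorter string "2013-03-05", a filter of <= '2013-03-05' skips anything used during that day. To include the whole day, compare with < against the following date.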

Query the media library returning JSON results:

First, just the default fields for a particular class (movies), filtering for a specific item (Avatar):

Code:
./texturecache.py j movies "Avatar"
[
  {
    "movieid": 22,
    "title": "Avatar",
    "art": {
      "fanart": "image://nfs%3a%2f%2f192.168.0.3%2fmnt%2fshare%2fmedia%2fVideo%2fMovies%2fAvatar%20(2009)%5bDVDRip%5d-fanart.jpg/",
      "discart": "image://http%3a%2f%2fassets.fanart.tv%2ffanart%2fmovies%2f19995%2fmoviedisc%2favatar-4eea31049147b.png/",
      "poster": "image://nfs%3a%2f%2f192.168.0.3%2fmnt%2fshare%2fmedia%2fVideo%2fMovies%2fAvatar%20(2009)%5bDVDRip%5d.tbn/",
      "clearart": "image://http%3a%2f%2fassets.fanart.tv%2ffanart%2fmovies%2f19995%2fmovieart%2favatar-4f803992128b8.png/",
      "clearlogo": "image://http%3a%2f%2fassets.fanart.tv%2ffanart%2fmovies%2f19995%2fhdmovielogo%2favatar-503e0262ba196.png/"
    },
    "label": "Avatar"
  }
]
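
The art URLs in this output are percent-encoded and wrapped in image://.../. The jd and Jd options "decode" them; a minimal sketch of what such decoding could look like is shown below. Whether the script's own output matches this exactly is an assumption, and decode_image_url is a made-up name.

```python
from urllib.parse import unquote

def decode_image_url(url):
    """Strip the image:// wrapper and percent-decode the inner URL."""
    if url.startswith("image://"):
        url = url[len("image://"):]
        if url.endswith("/"):
            url = url[:-1]
    return unquote(url)

poster = ("image://nfs%3a%2f%2f192.168.0.3%2fmnt%2fshare%2fmedia%2fVideo"
          "%2fMovies%2fAvatar%20(2009)%5bDVDRip%5d.tbn/")
print(decode_image_url(poster))
```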

With "extrajson.movies = trailer, streamdetails, file" in the properties file, the same query but returning the extra fields too:

Code:
./texturecache.py J movies "Avatar"
[
  {
    "movieid": 22,
    "title": "Avatar",
    "label": "Avatar",
    "file": "nfs://192.168.0.3/mnt/share/media/Video/Movies/Avatar (2009)[DVDRip].m4v",
    "art": {
      "fanart": "image://nfs%3a%2f%2f192.168.0.3%2fmnt%2fshare%2fmedia%2fVideo%2fMovies%2fAvatar%20(2009)%5bDVDRip%5d-fanart.jpg/",
      "discart": "image://http%3a%2f%2fassets.fanart.tv%2ffanart%2fmovies%2f19995%2fmoviedisc%2favatar-4eea31049147b.png/",
      "poster": "image://nfs%3a%2f%2f192.168.0.3%2fmnt%2fshare%2fmedia%2fVideo%2fMovies%2fAvatar%20(2009)%5bDVDRip%5d.tbn/",
      "clearart": "image://http%3a%2f%2fassets.fanart.tv%2ffanart%2fmovies%2f19995%2fmovieart%2favatar-4f803992128b8.png/",
      "clearlogo": "image://http%3a%2f%2fassets.fanart.tv%2ffanart%2fmovies%2f19995%2fhdmovielogo%2favatar-503e0262ba196.png/"
    },
    "trailer": "",
    "streamdetails": {
      "video": [
        {
          "duration": 9305,
          "width": 720,
          "codec": "avc1",
          "aspect": 1.7779999971389771,
          "height": 576
        }
      ],
      "audio": [
        {
          "channels": 6,
          "codec": "aac",
          "language": "eng"
        }
      ],
      "subtitle": []
    }
  }
]

Optional Properties File

By default the script will run fine on distributions where the .xbmc/userdata folder is within the user's home folder (ie. userdata=~/.xbmc/userdata). To override this default, specify a properties file with a different value for the userdata property.

The properties file should be called texturecache.cfg, and be in the same directory as the texturecache.py script. What follows is an example properties file with the default values shown - DO NOT paste this file as your own properties, set only the properties you require in your own file:

Code:
sep = |
userdata = ~/.kodi/userdata
dbfile = Database/Textures13.db
thumbnails = Thumbnails
xbmc.host = localhost
webserver.port = 8080
webserver.username =
webserver.password =
rpc.port = 9090
download.threads = 2
extrajson.albums =
extrajson.artists =
extrajson.songs =
extrajson.movies =
extrajson.sets =
extrajson.tvshows.tvshow =
extrajson.tvshows.season =
extrajson.tvshows.episode =
qaperiod = 30
qa.file = no
qa.fail.urls = ^video, ^music
qa.warn.urls =
cache.castthumb = no
cache.hideallitems = no
cache.ignore.types = ^video
prune.retain.types =
prune.retain.previews = yes
prune.retain.pictures = no
logfile =
logfile.verbose = yes
checkupdate = yes
lastrunfile =
orphan.limit.check = yes
nonmedia.filetypes =
watched.overwrite = no
network.mac =
imdb.fields.movies = rating, votes, top250
imdb.fields.tvshows = rating, votes
purge.minlen = 5
bin.tvservice = /usr/bin/tvservice
hdmi.force.hotplug = no
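
The file format is just one "key = value" pair per line, and anything you leave out keeps its default. The sketch below shows one simple way such a file could be read and layered over defaults. It is not the parser texturecache.py uses; the function name and the treatment of lines starting with # as comments are assumptions.

```python
def parse_properties(text):
    """Parse 'key = value' lines into a dict, skipping blank and # lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key/value line
        props[key.strip()] = value.strip()
    return props

# A few defaults from the listing above, overridden by a two-line user file.
defaults = {"xbmc.host": "localhost", "webserver.port": "8080", "download.threads": "2"}
overrides = parse_properties("xbmc.host = 192.168.0.8\nwebserver.port = 8081\n")

config = dict(defaults)
config.update(overrides)
print(config)
```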

The dbfile and thumbnails properties are paths (a database file and a folder, respectively) that are both relative to the userdata property.

Specify webserver.username and webserver.password if you require webserver authentication.

extrajson.* properties allow the specification of additional JSON audio/video fields to be returned by the J query option.

Cast thumbnails will not be updated by default, so specify cache.castthumb = yes if you require cast artwork to be re-cached.

Ignore specific URLs when pre-loading the cache (c/C options), by specifying comma delimited regex patterns for the cache.ignore.types property. Default value is image://video. Set to None to process all URLs. Any matching URL will not be cached (c/C), nor will it be deleted when forcing a reload (option C).
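
As a rough illustration of how a comma delimited list of regex patterns like this can be applied, here is a small sketch. It assumes the patterns are tested against the URL with its image:// prefix removed (which would make ^video equivalent to "starts with image://video"); that detail, the helper names and the sample URLs are assumptions rather than the script's confirmed behaviour.

```python
import re

def compile_patterns(value):
    """Turn a comma-delimited property value into compiled regexes."""
    if not value or value.strip().lower() == "none":
        return []
    return [re.compile(p.strip()) for p in value.split(",") if p.strip()]

def is_ignored(url, patterns):
    """True if any pattern matches the URL (image:// prefix stripped first)."""
    if url.startswith("image://"):
        url = url[len("image://"):]
    return any(p.search(url) for p in patterns)

patterns = compile_patterns("^video, ^music")
# Illustrative URLs only.
print(is_ignored("image://video/example.mkv/", patterns))
print(is_ignored("image://http%3a%2f%2fassets.fanart.tv%2fexample.png/", patterns))
```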

Specify a filename for the logfile property, to log detailed processing information.

See github for full changelog and further details in README.

Run the script without arguments for usage and current configuration information.

Hopefully someone else will find this useful! :)

For mklocal.py, see here
For explanation of dropped/skipped/ignored, see here


RE: Texture Cache interrogation utility - Milhouse - 2013-03-08

New version (0.0.5) - see installation instructions in first post for download details.

Added "c" option to re-cache all artwork. It uses JSON to query the media library for all artwork, and then retrieves each item of artwork so that it is cached locally. Items that are already in the cache will be ignored. Use the "d" option to remove items from the cache, and then refresh the cache.

This is much easier than laboriously browsing through your media library while the Pi slowly re-caches all the missing items!

To run:
Code:
./texturecache.py c

-- or --

./texturecache.py c movies
./texturecache.py c sets
./texturecache.py c tvshows
./texturecache.py c artists
./texturecache.py c albums
./texturecache.py c songs

-- or --

./texturecache.py c tvshows "big bang theory"
./texturecache.py c movies "harry potter"

If no additional arguments are supplied when using the "c" command, all but songs will be refreshed (ie. movies+sets+tvshows+albums+artists).

A third argument can be used to filter queries, partially matching on movie/set/tvshow/album/song title, or artist name.

As items of artwork are processed, their type (poster, fanart, etc.) will be displayed. A "+" alongside the artwork type will indicate that the item was not found in the cache, and has now been cached. The lack of a "+" indicates that the item was found in the cache, and it has been skipped (not retrieved).

This has been tested with a media library hosted on a remote MySQL server and NFS content, and is working well. Should work with local databases and other media solutions.


RE: Texture Cache interrogation utility - leechguy - 2013-03-10

I haven't tested the script yet, but this sure is going to be useful! Thanks!


RE: Texture Cache interrogation utility - Milhouse - 2013-03-10

(2013-03-10, 02:19)leechguy Wrote: I haven't tested the script yet, but this sure is going to be useful! Thanks!

Well it's been useful for me, so I figured it might be useful for others too! Hopefully it works for you, if not let me know and I'll try to fix.

I've lost count of the number of times I've had to drop the texture cache database and remove Thumbnails, only to then have to re-cache items one by one... but no more now that I've got the re-caching option working! :)

I've also added another caching option ("C") which will remove any existing artwork before re-caching new items, eg.

Code:
./texturecache.py C tvshows "big bang theory"

will automatically remove any existing artwork from the cache (both database and filesystem) before retrieving new artwork items, thus ensuring the local texture cache is updated come what may (this is just a more convenient way of deleting items - option "d" - before refreshing them with option "c").


RE: Texture Cache interrogation utility - nickr - 2013-03-12

@Milhouse - could you please clarify the licensing of your code? I would like to play with the code a bit, and would hope that in the spirit of xbmc you might release it under GPL2?


RE: Texture Cache interrogation utility - Milhouse - 2013-03-12

(2013-03-12, 10:55)nickr Wrote: @Milhouse - could you please clarify the licensing of your code? I would like to play with the code a bit, and would hope that in the spirit of xbmc you might release it under GPL2?

Good question. I've just uploaded an updated version 0.0.8 using the GPLv2 license so you can do what you like with it now, though a small credit or attribution would be nice on any derived work! :)


RE: Texture Cache interrogation utility - nickr - 2013-03-12

Excellent. Possibly my interest will come to nothing, but a license is good :)


RE: Texture Cache interrogation utility - Milhouse - 2013-03-13

Version 0.1.0.

Small update, now outputs JSON results (j), optionally with additional fields (J) as specified in the properties file. Potentially JSON output could be parsed by another script for further processing.


RE: Texture Cache interrogation utility - popcornmix - 2013-03-13

@MilhouseVH
One annoyance with XBMC is that some tv episodes have no screenshot/plot/rating, presumably because the episode was scanned before the scraping site was updated.
This never gets fixed without a manual episode info/refresh.

Is it possible for your script to trigger a refresh on episodes with missing content?

(or does this feature exist anywhere else?)


RE: Texture Cache interrogation utility - Milhouse - 2013-03-13

(2013-03-13, 14:50)popcornmix Wrote: @MilhouseVH
One annoyance with XBMC is that some tv episodes have no screenshot/plot/rating, presumably because the episode was scanned before the scraping site was updated.
This never gets fixed without a manual episode info/refresh.

Is it possible for your script to trigger a refresh on episodes with missing content?

As long as you aren't extracting thumbnails (which will create a thumbnail), then yes, I guess so... you'd need to execute the extended JSON query and determine when there is a missing thumbnail or plot from the returned results.

ie, if you add "plot" to "extrajson.tvshows.episode" in the properties file, you could run something like "./texturecache.py J tvshows" and then parse the JSON results for any missing items.
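
A minimal sketch of that kind of parsing is given below. Since the exact layout of the tvshows JSON isn't shown in this thread, the walker makes no assumption about nesting and simply reports every object, at any depth, whose field is present but empty. The sample data, its key names and the find_missing helper are all invented for the example.

```python
import json

def find_missing(node, field, path=()):
    """Yield (path, item) for every dict whose `field` is present but empty."""
    if isinstance(node, dict):
        if field in node and node[field] in ("", None, [], {}):
            yield path, node
        for key, value in node.items():
            for hit in find_missing(value, field, path + (key,)):
                yield hit
    elif isinstance(node, list):
        for index, value in enumerate(node):
            for hit in find_missing(value, field, path + (index,)):
                yield hit

# Invented sample; real output would be loaded with json.load() from a file.
sample = json.loads("""
[{"title": "Example Show", "plot": "A show.",
  "seasons": [{"episodes": [{"label": "1x01", "plot": ""},
                            {"label": "1x02", "plot": "Something happens."}]}]}]
""")

for path, item in find_missing(sample, "plot"):
    print(path, item.get("label"))
```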

The trouble with triggering a refresh of items that have missing content is that you normally need to delete the item first - and not just from the texture cache but deletion from the media library. Definitely possible though, either by extending texturecache.py or by a script that parses the JSON results.

I'll give it some thought, though not entirely sure how to fix the problem once having identified there are missing items.

(2013-03-13, 14:50)popcornmix Wrote: (or does this feature exist anywhere else?)

I load all my meta-data from NFO files and locally stored artwork, so I wrote a script that I run on my NAS which parses the NFO files to check for missing plots, and to identify missing artwork.. not aware of any other solution, other than querying the media library database direct.


RE: Texture Cache interrogation utility - Milhouse - 2013-03-13

@popcornmix: Try the version I just uploaded (version 0.1.1) which now includes a "qa" option for movies and tvshows, and should identify missing artwork (poster and fanart for movies, thumb for tvshows) and plots. The formatting of the output could do with some work, but as an initial step is this of any use to you?

To run it, execute:
Code:
./freenas/data/texturecache.py qa movies
-- or --
./freenas/data/texturecache.py qa movies avatar
or
Code:
./freenas/data/texturecache.py qa tvshows
-- or --
./freenas/data/texturecache.py qa tvshows "big bang"



RE: Texture Cache interrogation utility - bonelifer - 2013-03-13

Now if someone could do this in an addon that was crossplatform.


RE: Texture Cache interrogation utility - Milhouse - 2013-03-13

(2013-03-13, 16:58)bonelifer Wrote: Now if someone could do this in an addon that was crossplatform.

Off you go then. :)

Sorry, I've no interest in writing an addon. I'm much happier at the command line, and the UI for this as an addon would be quite horrific. That said, this command line script should work cross platform, with a bit of jiggery-pokery.


RE: Texture Cache interrogation utility - Milhouse - 2013-03-13

Version 0.1.2 now up... when performing a "qa" check, only movies or tvshow episodes added during the previous 30 days will be considered. Increase/reduce this period by specifying a value (in days) for qaperiod in the properties file, eg. "qaperiod=9999" would effectively restore v0.1.1 behaviour which didn't restrict by period.
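
In other words, the check is a simple cutoff: anything added before "now minus qaperiod days" is skipped. A small sketch of that logic follows. How the script actually obtains each item's added date is not described here, so the "YYYY-MM-DD HH:MM:SS" timestamp format and the within_qa_period name are assumptions.

```python
from datetime import datetime, timedelta

def within_qa_period(dateadded, qaperiod_days=30, now=None):
    """True if an item's added timestamp falls inside the QA window."""
    now = now or datetime.now()
    added = datetime.strptime(dateadded, "%Y-%m-%d %H:%M:%S")
    return added >= now - timedelta(days=qaperiod_days)

now = datetime(2013, 3, 13, 18, 0, 0)
print(within_qa_period("2013-03-01 12:00:00", 30, now))    # inside the window
print(within_qa_period("2013-01-01 12:00:00", 30, now))    # too old
print(within_qa_period("2013-01-01 12:00:00", 9999, now))  # qaperiod=9999
```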

Also added additional qa tests: rating and mpaa certification (movies), plot/fanart/banner/poster (tv show) and rating (tv show episode).


RE: Texture Cache interrogation utility - popcornmix - 2013-03-13

(2013-03-13, 17:54)MilhouseVH Wrote: Version 0.1.2 now up... when performing a "qa" check, only movies or tvshow episodes added during the previous 30 days will be considered. Increase/reduce this period by specifying a value (in days) for qaperiod in the properties file, eg. "qaperiod=9999" would effectively restore v0.1.1 behaviour which didn't restrict by period.

Also added additional qa tests: rating and mpaa certification (movies) and rating (tv show episodes).

Sounds very interesting. I'll have a play with this when I get the chance.