Beta - Database Pre-Wash Scrub - remove old junk links and privacy issues

  Thread Rating:
  • 5 Vote(s) - 5 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Post Reply
BatterPudding Offline
Posting Freak
Posts: 888
Joined: Jan 2015
Reputation: 52
Post: #1
B3TA DISCLAIMER - This addon directly edits your database. It is a BETA release. We make a backup of the SQLite database. We also advise you to make your own backups too. MYSQL Users need to make your own backups. But you already have backups don't you?

This is a BETA so we are making assumptions on what can and cannot be removed so let us know if there is something you want to keep that we didn't think about. This is why this is a BETA TEST THREAD!!

We have tested in a selection of environments but now it is your turn. You are the mugs willing volunteers that will shake out the issues we hadn't thought of.


PROGRAM ADDON - VIDEO DATABASE CLEANER
script.database.cleaner
Created By: black_eagle and BatterPudding


What Is This Addon?
This addon will do a pre-wash clean-up of your KODI Video database removing references to old paths and files not listed in your KODI sources.

KODI's video library fills up over the years with a record of everything you have ever watched. A side effect of features in KODI means this data is never removed.

Every Internet and UPnP stream you watch, every YouTube video, each time File Manager is used to load up a video to watch once - records of these are all kept. (Yes - including that pr0n you have hidden on that network folder...)

A general assumption was that KODI's database only held the files scanned into the library. This is not the case. If you watch it, KODI remembers it.


This leads to a number of issues. One is privacy. There is nothing in the main KODI application that will clean out the database. Most other applications can clean their "recent files" lists, KODI can't. Yet.

KODI does have a built in "Clean Library" function, but this only works with the files scanned into the Media Library. It is not aware of the streams and one-off videos so it does nothing about them.


A second issue regularly rears its head where the old stream paths in the videos database cause numerous problems to other functions within KODI. Even KODI's own built in Clean Library functions spit dozens of errors into the debug logs when attempting to work with these old stream paths. For some people the Clean Library function could stall and crash due to these bad paths. (As well as the slightly worrying case that every one of these paths were being loaded up and interrogated causing some unusual network traffic)

Our script will do a pre-scrub of the database allowing KODI's own Clean Library function to do a far better job.


As a third point, running our add-on will also help make the KODI's database access more efficient. Getting rid of the old crud allows KODI to get the real data quicker. This will also help many other add-ons which don't realise this mess is kept.


The original thread that started the creation of this add-on
http://forum.kodi.tv/showthread.php?tid=269002

After the initial question from pr0xZen, an investigation kicked off between the forumites. Work has been done by black_eagle and BatterPudding to find a solution. 99% of the code is Black_eagle's work with SQL, Testing, Ideas, and generally trouble making from BatterPudding. (He also hosts the repo)


How Does It Work
Our addon aims to remove rtmp://, rtmpe://, http:// and plugin references from the files table. If it is run using the default settings, it builds an SQL query that excludes all the path sources defined in your sources.xml file and deletes everything else. This behaviour can be changed as some people don't have any sources defined and just use filemanager to navigate with. When the add-on finds an empty sources.xml it uses a default set of rules to select and delete the aforementioned paths.

When the addon has removed the required paths from the files table, the built in 'clean library' routine is called which scans the file table and removes any associated references in the rest of the tables to the links we have just deleted. This then clears up the Paths table and many other references.

The addon will prompt you before deleting anything by showing you a summary page of what paths are to be kept and confirmation of other settings. This prompt can be turned off if required. Users are encouraged to check that all their defined sources are indeed listed. Clicking on 'CLEAN' will execute the SQL clean-up, clicking on 'ABORT' will abort the script with no changes made to the database.

A backup is also kept of the database before every clean for SQLite users. VERY important in a beta test. MySQL Users need to do their own backups.

A list of remove files is stashed in KODI's temp folder (Linux: ~KODI\TEMP\database-cleaner.log, Windows: %APPDATA%\KODI\Cache\database-cleaner.log)

Users can also choose (via the add-on settings) to retain or remove old PVR information and to retain or remove bookmarks.


If Kodi's debugging is enabled, the script will write a few things in there. If the script's debugging is also enabled in the settings, it will be quite verbose as to what it is doing.


Settings
We have some optional settings to be configured.

Keep any PVR Information - tick this to keep all your PVR recordings in the database.
Keep any bookmarked files - tick this to keep bookmarks for videos not in the library. You may have bookmarked a YouTube video and this option will keep those details.
Automatically trigger clean library - strongly advise to always leave this ON. Our cleanup works best when worked in tandem with the main library clean up.
Backup local database before cleaning - strongly advice to keep backups. Especially during beta testing.
- Backup database name - ability to change the name of your backups.
Debug Settings
Prompt before actually deleting anything
- will let you review the SQL statement that is about to be applied to your database. Good idea to check this as we are testing (Did I mention this is a BETA TEST!!)
Enable Debugging - dumps a pile of extra notes into your debug log to help us work out where things went wrong.

Sources
Uses sources.xml to determine files to keep - STRONGLY recommended. Otherwise the database clean will remove only rtmp:// rtmpe:// addon:// http:// references
Uses sources.xml on this machine - lets you supply a list of sources from a different location (needed for MySQL users, shared sources.xml, network locations, etc)
Path to remote sources.xml file - Supply a custom path to a different sources.xml to protect.

Advanced
Force Database Name
- allows user to supply different name of a SQLite database file. The addon assume the "highest number" database is the current one. This may not be the case for people who upgraded and then downgraded again. This option lets you pick which database the addon will clean.

Note: MySQL users should back up their database using their favourite MySQL manager. If you don't have anything in place, phpMyAdmin can be used. (Discussions on how to backup an MySQL database are outside of this post really... if you don't know how to backup, then DON'T run our addon...)


The Exclusions
This add-on is pretty brutal in clearing out everything that is not listed as a source. But some people like to keep their You Tube viewing history. Or maybe they have an addon to keep a trailer with their movies. This Exclusions list is designed to handle those uses.

During testing we have found some people run KODI with a sources list that does not actually match their sources. For example - the database may be full of UNC paths from when the videos where initially scanned ( nfs://192.168.1.7/movies/ ) but the user has since changed their KODI source to mapped drive letters instead ( X: ).


To make the exclusions list, firstly the user has to decide where it goes. If this is a standard KODI install without profiles, then it goes in "KODI userdata/addon_data/script.database.cleaner". If profiles are in use then this excludes file goes in "KODI userdata/profiles/<profile_name>/addon_data/script.database.cleaner.

This means that when using profiles, different exclusions can be applied to the different databases. Note - there will already be a 'settings.xml' file in there and this one goes next to it.

The filename is "excludes.xml" and the contents are as follows :-
Code:
<excludes>
    <exclude>plugin://plugin.video.youtube</exclude>
    <exclude>plugin://plugin.video.iplayerwww</exclude>
    <exclude>nfs://192.168.1.7/Movies</exclude>
    <exclude>smb://192.168.1.7/一万年以后</exclude>
</excludes>

Any number of <exclude></exclude> tags can be added and can be as precise or vague as is required by the user eg "plugin://plugin.video" would exclude all video plugin data from being deleted. "http://" would keep all the http info, whereas "http://my_favourite_streaming_site.com" would retain just the info for that site.

Database Backups
Even time a scrub is performed, we make a backup of the SQLite database. These are stored in a backups folder underneath the KODI database folder. These backups can be disabled in the settings but we STRONGLY suggest you leave the backup enabled.

In our addon's settings it is possible to change the name of the backups if you want to make a specific test. This backup name will have a date and timestamp attached and then copied to a sub folder. testclean1_2016-04-26_1345.db If no name is supplied then the backups use the standard name of the backup and attach a datestamp MyVideos99_2016-04-26_1345.db. When you first install the add-on, it defaults to no defined string in the name, so the add-on by default uses the SQLite db name.

MySQL users need to make your own backups. But as you have moved into the advanced world of MySQL I assume you already know all about backups.


How To Use
Okay... so you are ready. If you've decided you want to keep anything not listed in the library, then go write that excludes.xml file as explained above. (Yeah yeah... version 2.0 may make a GUI for this) Check the settings if you want to keep anything bookmarked or from the pvr.

Now just run the addon. You'll find it listed under Program Addons as Video Database Cleaner. When you click Run you'll be shown the confirmation page of what will be cleaned based on your settings.. Check it looks right and then hit OK. The clean will then happen, followed by KODI's own Clean Library function.

That's it. Done. Lots of waffle, but simple cleaning. Now KODI will stop trying to look at stuff long gone, normal library cleaning will be quicker, and your logs cleaner.

GIVE US FEEDBACK. What do we need to add? We know it is a bit geeky at the moment, but this is a BETA. We are open to suggestions.

If you press <Thank User> also thank black_eagle too as he did most of the coding
(This post was last modified: 2016-05-30 17:47 by BatterPudding.)
find quote
BatterPudding Offline
Posting Freak
Posts: 888
Joined: Jan 2015
Reputation: 52
Post: #2
Get your lovely Add-On here.

Black Pudding Repo: http://batteredbits.co.uk/KODI/repo/

The original repository currently only holds 5.4.0, and is left here purely for audit purposes; the latest version is now available from the following repo:

https://github.com/the-black-eagle/repos...-0.1.0.zip

... with master code here:

https://github.com/the-black-eagle/scrip...se.cleaner

In theory, a newer version should overwrite an earlier one without problem, however it was installed (e.g. from the other repo). However, if you see anything different, please report issues back in this thread.

---

Changelog

Version 0.5.7-1 - 26/Nov/2016

[Change] Cosmetic changes to GUI window to allow skins to support the add-on with their own skin

Version 0.5.7 - 30/Oct/2016

[Change] Re-skin the GUI window - Now includes scrollable path list, better buttons etc etc
[Remove] Defunct code previously used to generate the GUI
[Change] Some cosmetic changes to various messages/warnings

Version 0.5.6 - 15/Oct/2016

[Fix] Apostrophes in paths causing SQL error
[Change] Tweak to information dialog

Version 0.5.5 - 30/May/2016

[Add] Ability to rename a path inside the database
[Add] Better documentation
[Add] Logging options - Move debug setting to log settings

Version 0.5.4 - 26/May/2016

[Add] Ability to remove a specific path from the path table in the database

Version 0.5.3 - 23/May/2016

[Change] Don't back-up local database unless actually cleaning
[Add] New dialog window detailing EXCLUDED paths, settings etc
[Fix] Handle mouse clicks in dialog window
[Add] Logging of removed paths to 'database-cleaner.log' in Kodi's 'temp' directory

Version 0.5.2 - 04/May/2016

[Fix] Turning off 'prompt before delete' in settings caused an error
[Fix] Incorrect variable name in debugging routine
[Remove] Un-needed code section and unused variables
[Add] More error handling when reading sources and excludes xml files
[Change] Don't display the SQL statement, rather display a list of paths
and other items set to be retained

Version 0.5.1 - 02/May/2016

[Fix] Handle multiple paths for one source entry in sources.xml
[Change] Change OK dialog to notification to allow script to run silently

Version 0.5.0 - 01/May/2016

Initial beta testing release

Edit - Updated Changelog and added new repo - ProfYaffle 27/Nov
(This post was last modified: 2016-11-27 13:51 by Prof Yaffle.)
find quote
BatterPudding Offline
Posting Freak
Posts: 888
Joined: Jan 2015
Reputation: 52
Post: #3
reserved
find quote
black_eagle Offline
Posting Freak
Posts: 1,815
Joined: Sep 2013
Reputation: 79
Location: East Yorkshire, UK
Post: #4
If the add-on accidentally nukes stuff in the database that you weren't expecting please provide the relevant part of a debug log showing the output of the script when it was run, as well as the relevant part of your sources.xml.

We don't need the full log, just any lines starting with "script.database.cleaner". This will help us resolve issues much faster instead of having to try to reproduce them ourselves.

To be clear, debugging needs to be enabled in both the add-on and Kodi itself before you run the script.

I know this may be a pain if your database just got nuked and you have to restore a back-up and nuke it again with debugging turned on, but without the debug log it's much harder to identify what went wrong !

Thanks.

Learning Linux the hard way !!
(This post was last modified: 2016-05-03 08:16 by black_eagle.)
find quote
black_eagle Offline
Posting Freak
Posts: 1,815
Joined: Sep 2013
Reputation: 79
Location: East Yorkshire, UK
Post: #5
reserved

Learning Linux the hard way !!
find quote
ReplayHarry Offline
Donor
Posts: 113
Joined: Feb 2014
Reputation: 0
Location: California, USA
Post: #6
Ok, ran quickly on my Jarvis 16.1 test machine. And the clean now took about 4 seconds.
Will run it on my Gotham "production" machine on Monday.
find quote
BatterPudding Offline
Posting Freak
Posts: 888
Joined: Jan 2015
Reputation: 52
Post: #7
Thanks ReplayHarry for letting us know. Should be no problems with Gotham as the concept is the same for older KODI\XBMC versions.

If you know how to look into your database, the main difference to look for are in the Paths and Files tables. My "production" machine was full of so much old addon and old stream crud that it caused errors in a normal Clean Library. Things are so much nicer now.


We're expecting a few common addons we need to look for and build some exception rules for. So any suggestions, we are here to listen.
find quote
OTinley Offline
Super Moderator
Posts: 610
Joined: Jun 2015
Reputation: 35
Location: Pangaea
Post: #8
Ran with no issues on Win7 running 16.1.
Took about 14 minutes, successfully created backup and shrunk the db by about 20%.
The only little thing is, that at one point, it looked stuck on 34% complete with absolutely no progress indication for about 5 minutes (it basically looked frozen).
Besides that, I think this is excellent. I can't thank you enough.

Always search the forum before posting. Read/follow the forum rules (wiki).


 
  
  
 
find quote
black_eagle Offline
Posting Freak
Posts: 1,815
Joined: Sep 2013
Reputation: 79
Location: East Yorkshire, UK
Post: #9
(2016-05-02 04:08)OTinley Wrote:  Ran with no issues on Win7 running 16.1.
Took about 14 minutes, successfully created backup and shrunk the db by about 20%.
The only little thing is, that at one point, it looked stuck on 34% complete with absolutely no progress indication for about 5 minutes (it basically looked frozen).
Besides that, I think this is excellent. I can't thank you enough.

Good to know it's worked fine for you with no issues. Smile

The progress box is part of Kodi's built in 'clean library' routine, which is called after the file table is cleaned up. This is so that anything that was associated with those file paths that we just removed is also removed from any other tables in the database. Depending upon how many files are removed, this can take a while to remove any associated artwork etc that Kodi puts in there automatically.

Many thanks for your feedback !

Learning Linux the hard way !!
find quote
Paranoidjack Offline
Senior Member
Posts: 105
Joined: Jul 2010
Reputation: 3
Post: #10
Just ran it on my SQL set-up and everything went fine.

Going to use this with the Library Auto Update addon to schedule weekly cleans on my headless system.
find quote
BatterPudding Offline
Posting Freak
Posts: 888
Joined: Jan 2015
Reputation: 52
Post: #11
(2016-05-02 13:04)Paranoidjack Wrote:  Just ran it on my SQL set-up and everything went fine.

Going to use this with the Library Auto Update addon to schedule weekly cleans on my headless system.
If you are adding it to your schedule, note that you can run this addon instead of the normal Clean Library routine as we call that straight after the pre-scrub.

Also notice that the settings let you turn off the various message boxes allowing it to run almost silently. As I have just been reminded, KODI's own Clean Library function ends with an OK button that you'll have to press even if we suppress all feedback from our addon.
(This post was last modified: 2016-05-02 17:44 by BatterPudding.)
find quote
bonelifer Offline
Member
Posts: 71
Joined: Sep 2011
Reputation: 2
Location: USA
Post: #12
Original size(Before running): 65.2 MiB
After initial run: 62.4 MiB
After phpmyadmin optimize DB: 65.2 MiB

Ran the addon again after the phpmyadmin optimize db routine, and it stayed at 65.2 MiB. Everything seems to be running fine still, so far.
find quote
jeffski10 Offline
Member
Posts: 62
Joined: Nov 2012
Reputation: 2
Location: London
Post: #13
Tried to run on my windows box. I have my sources.xml on a network share via advanced settings. The addon fails to run as I think its looking for this in the local userdata folder:

15:55:02 T:10764 ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
- NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
Error Type: <type 'exceptions.IOError'>
Error Contents: (2, 'No such file or directory', u'D:\\Kodi\\Kodi Live\\portable_data\\userdata\\sources.xml')
Traceback (most recent call last):
File "D:\Kodi\Kodi Live\portable_data\addons\script.database.cleaner\default.py", line 219, in <module>
tree = ET.parse(source_file)
File "D:\Kodi\Kodi Live\system\python\Lib\xml\etree\ElementTree.py", line 1182, in parse
tree.parse(source, parser)
File "D:\Kodi\Kodi Live\system\python\Lib\xml\etree\ElementTree.py", line 647, in parse
source = open(source, "rb")
IOError: (2, 'No such file or directory', u'D:\\Kodi\\Kodi Live\\portable_data\\userdata\\sources.xml')
-->End of Python script error report<--
find quote
black_eagle Offline
Posting Freak
Posts: 1,815
Joined: Sep 2013
Reputation: 79
Location: East Yorkshire, UK
Post: #14
(2016-05-02 16:57)jeffski10 Wrote:  Tried to run on my windows box. I have my sources.xml on a network share via advanced settings. The addon fails to run as I think its looking for this in the local userdata folder:

15:55:02 T:10764 ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
- NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
Error Type: <type 'exceptions.IOError'>
Error Contents: (2, 'No such file or directory', u'D:\\Kodi\\Kodi Live\\portable_data\\userdata\\sources.xml')
Traceback (most recent call last):
File "D:\Kodi\Kodi Live\portable_data\addons\script.database.cleaner\default.py", line 219, in <module>
tree = ET.parse(source_file)
File "D:\Kodi\Kodi Live\system\python\Lib\xml\etree\ElementTree.py", line 1182, in parse
tree.parse(source, parser)
File "D:\Kodi\Kodi Live\system\python\Lib\xml\etree\ElementTree.py", line 647, in parse
source = open(source, "rb")
IOError: (2, 'No such file or directory', u'D:\\Kodi\\Kodi Live\\portable_data\\userdata\\sources.xml')
-->End of Python script error report<--

You need to go into the add-on settings and under sources uncheck 'use sources.xml on this machine'. You can then select the setting below to navigate to the network sources.xml. Once this is set correctly in the add-on settings, the script should run correctly.

Learning Linux the hard way !!
find quote
BatterPudding Offline
Posting Freak
Posts: 888
Joined: Jan 2015
Reputation: 52
Post: #15
(2016-05-02 16:57)jeffski10 Wrote:  I have my sources.xml on a network share via advanced settings.
Interesting... what setting are you using for that? Or is it using path substitution?

Path_substitution (wiki)
find quote
Post Reply