Kodi Community Forum

Full Version: song.dateAdded is a TEXT field
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
The "dateAdded" field of the "song" table is a TEXT field.

Why? What is going on here? Surely it should be a datetime (or perhaps a date)?

This is doubly bad, since the "song" table is likely to be the largest table in either MyMusic or MyVideos, and a TEXT field adds a minimum of 1KB to the row size.
Do you also realise that "dateadded" does not contain the date that the song was added Sad

Because of the delete/insert approach to library update we don't actially know when something was first added to the library. So this field contains the file datetime stamp.

I'm not happy with any of this.
Sure, we discussed this here - http://forum.kodi.tv/showthread.php?tid=...pid2380524 - and I'm happy to hear you're not happy about the fact that there's no way of kodi really knowing when an album first was added to the library.

However, this is a completely different issue altogether - the fact that a column that stores datetimes is a TEXT field and not a DATETIME field - and has huge consequences, as I outlined in my original post. Many Kodi users will have music collections with hundreds of thousands of songs. The effects of having this field as a TEXT field is therefore immense.

I know it's not possible to replace all DELETE/INSERTs with UPDATEs for the next release, however a db upgrade script to alter the table and change the TEXT field to a DATETIME field would just be a few lines long:

Code:
ALTER TABLE `MyMusic56` ADD COLUMN new_dateAdded DATETIME;
UPDATE `MyMusic56` SET `new_dateAdded` = STR_TO_DATE(`dateAdded`,'%Y-%M-%d %h:%i:%s');
ALTER TABLE `MyMusic56` DROP COLUMN dateAdded;
ALTER TABLE `MyMusic56` CHANGE new_dateAdded dateAdded DATETIME;

And done, right?
Changing the table column is trivial, but a number of other things would have to change wherever that column is accessed both by GUI and JSON API. Sure it could be done, but does need a bit more thought than just 4 lines of SQL.

I also wonder while my predecessors used TEXT not DATETIME for this field (and the other datetimes)? Was it an accident, or by design? I don't know, but some research is worthwhile as I beleive they generally did good work getting Kodi this far. Other datetimes like lastplayed and lastscraped are varchar (20), I assume for good reasons.

What I have not heard much about is users running out of space for their musicdb. I admit I was worried when I first put Kodi onto a RPi, did I need a big SD card etc. but it turned out to be no problem. So, apart from keeping data to a minimum as a principle, are there users with space issues such that we need to look at making the database smaller? This is importamnt, because I am busy else where in the library increasing the data that is stored e.g. composer etc. My belief is that art work is so bulky, the db sizes are irrelevent by comparison, but I could be wrong.
(2016-08-30, 15:38)ventolin Wrote: [ -> ]...and a TEXT field adds a minimum of 1KB to the row size.

Where do you get that from? Reading up it seems that VARCHAR and TEXT in both MySQL and SQLite are stored variably using only the length of the data they contain. Now this length could be more than taken by a numeric DATETIME field but it is not 1KB.
(2016-09-02, 17:25)DaveBlake Wrote: [ -> ]
(2016-08-30, 15:38)ventolin Wrote: [ -> ]...and a TEXT field adds a minimum of 1KB to the row size.

Where do you get that from? Reading up it seems that VARCHAR and TEXT in both MySQL and SQLite are stored variably using only the length of the data they contain. Now this length could be more than taken by a numeric DATETIME field but it is not 1KB.

Sorry, 256 bytes, not 1KB:

Quote:TEXT and BLOB columns are implemented differently in the NDB storage engine, wherein each row in a TEXT column is made up of two separate parts. One of these is of fixed size (256 bytes), and is actually stored in the original table. The other consists of any data in excess of 256 bytes, which is stored in a hidden table. The rows in this second table are always 2,000 bytes long. This means that the size of a TEXT column is 256 if size <= 256 (where size represents the size of the row); otherwise, the size is 256 + size + (2000 − (size − 256) % 2000).

from http://dev.mysql.com/doc/refman/5.7/en/s...ments.html
Thanks for clarifying and references.

So there could be some space saving on MySQL databases to change dateadded to varchar, consistent with the other date/time fields. In SQLite there really is no saving between varchar and text.

Use DATETIME would need to be a consistent change to all date/time fields in both music and video not just song.dateadded field. I would guess that a string was used originally instead of DATETIME because of differences between MySQL and SQLite handling that the db wrapper could not handle, using own string format rather than local OS etc. giving Kodi control over what is stored. I don't feel a desire to change that approach even if it does take more space. Maybe someone else will?

But I can only repeat that is database size is an actual user issue, rather than a principle, then I am interested to hear about it.
(2016-09-06, 17:52)DaveBlake Wrote: [ -> ]Thanks for clarifying and references.

So there could be some space saving on MySQL databases to change dateadded to varchar, consistent with the other date/time fields. In SQLite there really is no saving between varchar and text.

Use DATETIME would need to be a consistent change to all date/time fields in both music and video not just song.dateadded field. I would guess that a string was used originally instead of DATETIME because of differences between MySQL and SQLite handling that the db wrapper could not handle, using own string format rather than local OS etc. giving Kodi control over what is stored. I don't feel a desire to change that approach even if it does take more space. Maybe someone else will?

But I can only repeat that is database size is an actual user issue, rather than a principle, then I am interested to hear about it.

Hey,

thanks for the reply.

One side of this argument is the efficiency - I chose to emphasise this in my original posts, as this is the most concrete argument against using a TEXT field to store what is, in effect, a DATETIME.

The other side of the argument is the principle - the fact that the database schema is "just wrong" if we're using a TEXT field to store a DATETIME. This may be a little less concrete, but is just as obvious and persuasive, in my opinion, as the efficiency argument.

As far as I'm concerned, if there were differences between how DATETIMEs are handled in mysql and sqlite, then this should be abstracted away at a very low layer (e.g. db connection layer), so that the database schema can be correct, and Kodi - at the application layer - needs know nothing about the peculiarities of mysql and/or sqlite. I would've expected that Kodi was using an ORM (or something similar) that would do exactly this job for it, so I'd be surprised if this were the reason why a TEXT field were chosen.

Could you perhaps ping a few other Kodi core devs who might be able to comment on this and explain the thinking behind it?

I totally understand that you guys are busy working on other necessary refactorings, bugfixes and features without some guy coming along and complaining about the bloody database schema. However, this was something so strange that I'd really love to know the reason for it, and, if it can be changed, would be willing to help out to the best of my efforts on fixing.

Cheers!
(2016-09-06, 22:16)ventolin Wrote: [ -> ]As far as I'm concerned, if there were differences between how DATETIMEs are handled in mysql and sqlite, then this should be abstracted away at a very low layer (e.g. db connection layer), so that the database schema can be correct, and Kodi - at the application layer - needs know nothing about the peculiarities of mysql and/or sqlite. I would've expected that Kodi was using an ORM (or something similar) that would do exactly this job for it, so I'd be surprised if this were the reason why a TEXT field were chosen.
You are mistaken in your assumptions. No ORM (as yet), just a self-rolled db wrapper created by people now long gone.

Quote:Could you perhaps ping a few other Kodi core devs who might be able to comment on this and explain the thinking behind it?

I totally understand that you guys are busy working on other necessary refactorings, bugfixes and features without some guy coming along and complaining about the bloody database schema. However, this was something so strange that I'd really love to know the reason for it, and, if it can be changed, would be willing to help out to the best of my efforts on fixing.

I have chatted to the devs around now (original authors no longer around), and the exact reasoning is unknown. Honestly the schema holdng dates as strings, and any changes to it, are the least of Kodi's problems. There are many far stranger things!

If you want to get involved then great, but it is a huge learning curve and there is little time for mentoring or support. The code is open source, if you have the skills then have a look at how things have been implemented in detail, read the Git history and see if you can figure it out. It is what I do.
A historical search on the forums for "dateAdded" will tell you most of the history. JMarshall used to develop this bit of the database many years ago so that may help filtering down the issues.

I always thought the chronological order sorting was done by the database id.
(2016-09-07, 16:58)zag Wrote: [ -> ]I always thought the chronological order sorting was done by the database id.

Yes it is. This is because with the delete/insert approach to updates we don't know the date a song was added, ID is the nearest way to get something like that oder.
(2016-09-07, 17:15)DaveBlake Wrote: [ -> ]
(2016-09-07, 16:58)zag Wrote: [ -> ]I always thought the chronological order sorting was done by the database id.

Yes it is. This is because with the delete/insert approach to updates we don't know the date a song was added, ID is the nearest way to get something like that oder.

That's a dire way to deal with it (as I'm sure you're aware!) and in theory you can't assume the ID is sequential / correlates with order added...

Whilst it's not an ideal solution, you could possibly create another table with the hash of the file path os a key, along side the 'date added' for that file / track. Albums would probably have to use the date added of the first track found - unless there's a way to avoid deletes and inserts, and just use updates instead, but I could envisage that being quite difficult.

P.S. Appreciate all the work you're putting in here - I can imagine it's nightmare in places - but at least someone's looking at it, and making improvements and tidy-ups!
Not a nightmare, just the usual problems when looking at legacy software with no design documentation. I always start from the assumption that other devs did good job, and it is worth trying to discover their reasoning. But there just isn't enough hours in a day, and the ratio of devs to users is mind blowing.

The way forward is fixed IDs and a proper update approach rather than delete/insert. Any other tinkering with this in the meantime simply isn't worthwhile.