2021-03-11, 00:47
(2021-03-10, 19:13) Montellese Wrote:
(2021-03-09, 20:29) jbinkley60 Wrote:
(2021-03-08, 23:40) Montellese Wrote: After spending several hours trying to improve the whole code responsible for adding / writing media items retrieved from a media provider into Kodi's database, I realized that I only had to adjust the SQL transaction handling to improve the performance by a factor of up to 50. I only tested this on SQLite (so no idea what the performance with MySQL is), and the improvements vary depending on whether the SQLite database is located on an HDD or an SSD, but either way the performance improvement should be significant. On my dev computer, importing 60 movies to an HDD took ~80s and to an SSD ~9s. With the improvements this went down to ~1.4s. Now when importing a large library the main time is spent on retrieving the media items from the media provider - adding the retrieved items to the database is very fast.
And last but not least, these new test builds are based on Kodi v19 (final).
Downloads: https://github.com/Montellese/xbmc/wiki/...#downloads
Changelog: https://github.com/Montellese/xbmc/wiki/...erformance

I found the same thing with regard to transactions, specifically commits. A commit with each record insert (or similar) will be very slow, and as you observed, the severity of the slowness depends on the storage media. I batch up inserts and then do periodic commits. It made a huge difference in performance, similar to what you've seen. On my Intel NUC with an SSD I can get around 200 inserts a second - 13K records in just over a minute.
Jeff

I'm not there yet. Initially I also tried to improve the performance by using batch inserts, but in the current state of the code it's very difficult, because before every INSERT statement the code first performs a SELECT statement to check whether the item already exists. This can be refactored up to a certain point, but it's not possible to get rid of these checks completely, because a lot of the relationship data (e.g. actors / cast or genre) requires them to avoid unique constraint violations and to get the ID of the item so that it can be used as a foreign key.
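The SELECT-before-INSERT pattern described above can be sketched roughly as follows. This is a minimal illustration in Python's sqlite3 module, not Kodi's actual code; the table and column names (genre, genre_id, name) are placeholders:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE genre (genre_id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

def get_or_create_genre(con, name):
    # SELECT first: the row may already exist, and either way we need its id
    # so it can be used as a foreign key elsewhere.
    row = con.execute("SELECT genre_id FROM genre WHERE name = ?", (name,)).fetchone()
    if row is not None:
        return row[0]
    # Not found: insert it and use the generated rowid as the foreign key.
    cur = con.execute("INSERT INTO genre (name) VALUES (?)", (name,))
    return cur.lastrowid

a = get_or_create_genre(con, "Drama")
b = get_or_create_genre(con, "Drama")  # second call hits the SELECT, no duplicate row
```

Because the function always returns an id, it can be called unconditionally for every movie's genre list; the UNIQUE constraint is never violated, but each call costs an extra round-trip to the database, which is why these checks are hard to batch away.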
But from my performance tests with the latest release the limiting factor is by far the time it takes to retrieve all the metadata from the media provider. I just imported ~9K movies from a Plex media server and it took 95 minutes of which 85 minutes were spent on retrieving the items and 10 minutes were spent on writing the items to the database.
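The transaction-handling change discussed in this thread boils down to wrapping the whole batch of inserts in one explicit transaction instead of committing after every statement. A minimal sketch in Python's sqlite3 (the movie table is a placeholder, and an in-memory database stands in for the real library file):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT)")

# "with con:" opens a transaction, commits once on success, and rolls back
# on error - one fsync for the whole batch instead of one per row.
with con:
    con.executemany(
        "INSERT INTO movie (title) VALUES (?)",
        ((f"Movie {i}",) for i in range(60)),
    )

count = con.execute("SELECT COUNT(*) FROM movie").fetchone()[0]
```

On a real on-disk database the per-row-commit version forces a journal sync for every insert, which is where the HDD-versus-SSD difference reported above comes from; the single-transaction version pays that cost once.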
I do all of that in my code and still do batch commits. I'll do an insert, then a select to see if it inserted, and grab a value for the next piece. I do a path table check/insert, a file table check/insert, then an episode/movie table check/insert, then genre, actor and artwork tables, then stream details, etc. With Kodi 19 I am getting around 30 records at a time from Mezzmo; in Kodi 18 I could get well over 1,000 in a pull. All of that gets written to the database and then a commit. Like you, I saw that committing at each insert was terribly slow. The same code running on a Raspberry Pi 4 gets around 14 inserts a second, and around 42 a second on a Vero 4K+. All of these times include the fetch time from Mezzmo. One thing I did was ask the Mezzmo developers to create a special query where I can pull any or all (by index) records from Mezzmo in a single request, sorted newest to oldest.
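The periodic-commit approach described here can be sketched as follows, again using Python's sqlite3 with placeholder names; the batch size is arbitrary and would be tuned per device (NUC vs. Pi 4 vs. Vero 4K+):

```python
import sqlite3

BATCH_SIZE = 200  # arbitrary; tune for the target hardware

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT)")

pending = 0
for i in range(1000):
    con.execute("INSERT INTO item (title) VALUES (?)", (f"Item {i}",))
    pending += 1
    if pending >= BATCH_SIZE:
        con.commit()  # periodic commit: amortizes the sync cost over the batch
        pending = 0
con.commit()          # flush the final partial batch
```

Compared to one giant transaction for the whole import, periodic commits bound how much work is lost if the import is interrupted, at the cost of a few extra syncs.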
To optimize my code I started with commits after everything. Then, starting at the innermost loop, I removed the commits one level at a time, verified that the data was still written properly, and moved on to the next loop. I am happy to help with the SQL stuff.
Jeff