Kodi's Dirty Regions - ARM GPU Tech - improving Kodi rendering performance
#1
It's funny what you discover when trying to fix seemingly unrelated Kodi Krypton GPU rendering issues, like the (now fixed) stuttering video playback with subtitles on LibreELEC Kodi Krypton for the AMLogic S912.

Some background reading:

HOW-TO:Modify dirty regions

Kodi - NEWS/DEV JOURNAL - Working with dirty regions [2011]

The XBMC/Kodi Dirty Regions rendering code is fairly old. It was needed back when media players had relatively slow CPUs and weak GPUs with constrained, slower GPU memory bandwidth. RPis likely still need DR.

Newer ARM GPUs like the Mali-T820 in the AMLogic S912 employ rendering technologies such as ARM's ASTC (Adaptive Scalable Texture Compression), AFBC (ARM Frame Buffer Compression), SC (Smart Composition) and TE (Transaction Elimination) to reduce the memory bandwidth needed for data transfer and display rendering. ASTC is also present in Nvidia's Maxwell-based Tegra SoCs and in Intel GPUs from Skylake onwards.

ARM Graphics and Multimedia - Mali Technologies

After disabling Dirty Regions, rendering performance on LibreELEC AML S912 Kodi Krypton increased enough that CPU software decoding and smooth playback of 1080p 11Mbps 10-bit H264 (aka Hi10P anime) is now possible. The Kodi GUI is a lot smoother too. :)

LibreELEC developers discussed this in the Slack chat and found that even modern LE/Linux Kodi on Intel hardware now benefits from disabling Dirty Regions completely, resulting in a smoother Kodi GUI there as well.

For Kodi users on ARM devices with newer GPU hardware, such as the AMLogic S912, and I would think the Maxwell microarchitecture GPU in the NVIDIA Shield and very likely the Imagination PowerVR GPUs in the Apple TV 4/4K, it might be worth adding a Kodi advancedsettings.xml file and disabling Kodi's Dirty Regions functionality completely there as well.

Copy the following code to a plain text file named advancedsettings.xml, save it to a USB stick, and use Kodi's File Manager to copy the .xml file into the "Profile" directory.

Code:
<advancedsettings>
  <gui>
    <algorithmdirtyregions>0</algorithmdirtyregions>
  </gui>
</advancedsettings>
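
If you would rather generate the file than type it by hand, something like the script below should produce the same XML. Treat it as a sketch only: build_advancedsettings is a made-up helper name, not anything that ships with Kodi, and you still have to copy the resulting file into the "Profile" directory yourself.

```python
import xml.etree.ElementTree as ET


def build_advancedsettings(algorithm: int = 0) -> str:
    """Return advancedsettings.xml text that sets <algorithmdirtyregions>.

    build_advancedsettings is a hypothetical helper, not part of Kodi.
    0 is the value used in this post to disable Dirty Regions.
    """
    root = ET.Element("advancedsettings")
    gui = ET.SubElement(root, "gui")
    ET.SubElement(gui, "algorithmdirtyregions").text = str(algorithm)
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode") + "\n"


if __name__ == "__main__":
    # Print the XML; redirect the output to advancedsettings.xml to save it.
    print(build_advancedsettings(0), end="")
```

Run it as `python script.py > advancedsettings.xml` (the script name is arbitrary) and then copy the file across as described above.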
W.
#2
Interesting...

Thanks for sharing. I'll surely give it a try
#3
On my Mi Box I've noticed that the very first time an external subtitle appears, the image freezes (stutters) for a second or two... I don't know if the DR feature is the source of this, but I will give it a try :)

Thanks so much for sharing the info! :)
#4
I'd like to see a dev chime in here. From what was described back then, I don't understand how disabling this could improve performance. Even more so when video playback performance is mentioned as being influenced.

Furthermore, if this is the case (I'm not saying it isn't, OP has clearly devoted some time to reasoning this out), it would be interesting to know which GPUs (not just SoCs) would benefit from disabling dirty region detection.
#5
(2018-04-06, 15:41)ashlar Wrote: From what was described back then, I don't understand how disabling this could improve performance. Even more so when video playback performance is mentioned as being influenced.
Calculating a dirty region needs CPU time, and its purpose is to reduce GPU load. A modern GPU is fast enough to always render the full frame, so the CPU has better things to do.
I also agree it's not clear why it improves video rendering, as this feature is described as only working for the UI overlay.
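
To make the trade-off a bit more concrete, here is a rough sketch of the idea. It is not Kodi's actual code: the Rect class, the function names and the example rectangles are all made up, and it only models a simple "union of dirty rectangles" strategy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x1: int
    y1: int
    x2: int
    y2: int

    def area(self) -> int:
        return max(0, self.x2 - self.x1) * max(0, self.y2 - self.y1)


def union(rects):
    """Bounding box covering every dirty rectangle (this is the CPU-side work)."""
    return Rect(min(r.x1 for r in rects), min(r.y1 for r in rects),
                max(r.x2 for r in rects), max(r.y2 for r in rects))


def pixels_to_draw(screen, dirty, use_dirty_regions):
    """Pixels the GPU is asked to redraw for one frame."""
    if not use_dirty_regions:
        return screen.area()      # always redraw the full frame
    if not dirty:
        return 0                  # nothing changed, nothing to draw
    return union(dirty).area()    # redraw only the merged dirty area


screen = Rect(0, 0, 1920, 1080)
# Hypothetical frame: a clock in one corner and a small label in the opposite one.
dirty = [Rect(1700, 20, 1900, 60), Rect(100, 900, 500, 1000)]

print(pixels_to_draw(screen, dirty, use_dirty_regions=False))  # 2073600
print(pixels_to_draw(screen, dirty, use_dirty_regions=True))   # 1764000
```

In this made-up frame the merged region is still about 85% of the screen, so the CPU does the bookkeeping and the GPU saves very little. With an idle GUI the dirty list is empty and nothing is drawn at all.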

I'm curious what impact it has considering the use of libhybris?
#6
(2018-04-06, 15:41)ashlar Wrote: From what was described back then, I don't understand how disabling this could improve performance. Even [...]
Quote: I'm curious what impact does it have considering usage of libhybris ?
LibreELEC on an S912 uses libhybris and the Android gralloc Mali driver to provide OpenGL ES for Kodi (i.e. not optimal Linux-only drivers).

Disabling Dirty Regions has these results:
- Video playback no longer stutters when displaying subtitles.
- CPU software decoding and smooth video playback of 1080p 11Mbps 10-bit H264 (aka Hi10P anime) is now possible.
- A bug-free S912 LE Kodi Leia might now be able to software decode 1080p Netflix with 5.1 DD+ audio and dynamic refresh switching for smooth video playback.
- The Kodi GUI is smoother too.

Remember that ARM Tech I linked to above - all that memory bandwidth saving and rendering optimisation is far faster than what Kodi can do these days in software with Dirty Regions.



On a related matter this is what the Kodi dev. @peak3d is doing in Kodi Leia:

Improve GUI message handling / gui:: smartredraw #12213

Quote:@wrxtasy smartredraw 's goal is quite simple: prevent CPU (!) usage during idle, walk through the GUI to detect changes and if nothing changed do nothing.
So in short it is same as DR 0 but saves CPU during idle...
...Currently CPU is the issue that complex GUI parts are not ready during 2 frames, and there is really a huge optimisation potential.
#7
Is this beneficial for Vero 4K users @Sam.Nazarko ?
#8
(2018-04-07, 01:26)wrxtasy Wrote: [...] Remember that ARM Tech I linked to above - all that memory bandwidth saving and rendering optimisation is far faster than what Kodi can do these days in software with Dirty Regions. [...]

Will software decoded video go through the ARM texture compression etc. that you linked to in earlier posts? Some of it, like ASTC, sounds pretty hefty: the compression is optimised for speed and bandwidth, not image quality...
#9
Quote:The lossless compression ratios achievable with AFBC are comparable with other leading standards
#10
(2018-04-09, 13:17)wrxtasy Wrote:
Quote:The lossless compression ratios achievable with AFBC are comparable with other leading standards
Ah - so is this using the AFBC compression and not ASTC? (I guess you are saying it does!)
#11
I'm only ever seeing AFBC messages in the S912 kernel log.
Not sure about ASTC to be honest. It does look developer-adjustable, to fine-tune the trade-off between quality and texture size / upload bandwidth.
#12
Disabling dirty regions results in greatly increased CPU usage on AppleTV 4K, with zero effect on making the GUI 'look' smoother.

Not convinced smartredraw will solve anything. Checking for changes (DoProcess) and/or doing a redraw (DoRender) needs to happen every frame or animations will look bad. Every 500ms is just too slow. It's possible there are some good things in there (smartredraw) that might help other aspects of guilib.

The whole point of dirty regions was not to take the load off the GPU but rather reduce the CPU usage. Reducing load on GPU was a side effect. It was added (2011'ish) to address user complaints of XBMC/Kodi using lots of CPU when nothing was changing on the GUI. The forum was filled with users complaining about this, even on desktop boxes.

In refactoring the tvOS focus engine, one thing I noticed was that there were lots of calls to various controls' DoProcess that were required to compute that nothing changed. This could be improved. But the focusability tracking had to be done in DoRender or one would miss critical changes, like a dialog going away. Which means that you need to handle both DoProcess and DoRender to render the GUI smoothly and correctly.

If one has issues regarding the GPU, they tend to be memory bandwidth related. With almost all embedded SoCs, there is a finite memory bandwidth that has to be shared between CPU and GPU because they actually share the same RAM. The AppleTV2 had such an issue that showed up when playing full 1080p h264, even though the hardware decoder was outputting IOSurfaces that were already in so-called GPU memory space. This was solved by checking the video size and having the hardware decoder resize on output to something smaller. No one ever noticed :)

The AppleTV 4 has a similar issue with 4K content. Since the AppleTV 4 only handled up to 1080p displays, a similar game is played and the hardware decoder is configured to resize to 1080p when handling 4K content. Again, no one noticed as they could not because of the display limits.
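
As a back-of-envelope illustration of why resizing in the decoder helps, the sketch below compares the write rate of decoded frames at 4K and 1080p. The assumptions are mine, not measured figures: 8-bit 4:2:0 output (1.5 bytes per pixel), 60 frames per second, and only a single write of each decoded frame is counted. Real traffic is higher once the GPU reads the frame back and composites it. The function names are made up.

```python
def yuv420_frame_bytes(width: int, height: int) -> int:
    """Bytes in one 8-bit 4:2:0 frame: full-res luma plus two quarter-res chroma planes."""
    return width * height * 3 // 2


def write_rate_mb_per_s(width: int, height: int, fps: int) -> float:
    """Megabytes per second needed just to write the decoded frames to RAM."""
    return yuv420_frame_bytes(width, height) * fps / 1_000_000


uhd = write_rate_mb_per_s(3840, 2160, 60)
fhd = write_rate_mb_per_s(1920, 1080, 60)

print(f"4K:    {uhd:.1f} MB/s")   # 4K:    746.5 MB/s
print(f"1080p: {fhd:.1f} MB/s")   # 1080p: 186.6 MB/s
```

Under these assumptions, scaling 4K down to 1080p in the decoder cuts that part of the shared memory traffic to a quarter.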

The AppleTV 4K does not have this issue with 4K content be it SDR or HDR. It has lots of memory bandwidth as well as a CPU with lots of ponies.

If you really want to resolve GUI drawing, dump the entire guilib for Qt. It is a modern GUI drawing API that handles these sorts of things.
#13
Is there a recommended dirty regions setting for the S905X?
#14
(2018-04-10, 03:11)davilla Wrote: The whole point of dirty regions was not to take the load off the GPU but rather reduce the CPU usage. Reducing load on GPU was a side effect. It was added (2011'ish) to address user complaints of XBMC/Kodi using lots of CPU when nothing was changing on the GUI. The forum was filled with users complaining about this, even on desktop boxes.
 
1.) If there are animations, smartredraw draws them at the full refresh rate. The 500ms interval applies only when idle (e.g. the clock).
That means that while there are animations, smartredraw does nothing.

SmartRedraw is only good for reducing GPU / CPU load during idle, to save power and reduce heat overnight.

2.) DR does not help to reduce CPU load; the CPU time is spent in strcmp in the skin rule engine.
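
The scheduling described in point 1 can be sketched roughly as below. This is not the code from the smartredraw pull request: the function names are invented, and the only number taken from this thread is the 500ms idle interval.

```python
IDLE_CHECK_INTERVAL_MS = 500.0  # idle interval quoted in this thread


def next_redraw_delay_ms(animating: bool, refresh_hz: float) -> float:
    """Time to wait before the next redraw (hypothetical helper)."""
    if animating:
        # Animations are drawn every frame, so smartredraw changes nothing here.
        return 1000.0 / refresh_hz
    # Idle GUI: only wake up occasionally, e.g. to update the clock.
    return IDLE_CHECK_INTERVAL_MS


def redraws_per_second(animating: bool, refresh_hz: float) -> float:
    return 1000.0 / next_redraw_delay_ms(animating, refresh_hz)


print(redraws_per_second(True, 50.0))   # 50.0
print(redraws_per_second(False, 50.0))  # 2.0
```

So on a 50Hz display an animating GUI is still drawn 50 times a second, while an idle one is only touched twice a second.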
#15
Funny, DR seems to drop CPU usage a lot when the GUI is idle, despite the strcmp calls in the skin rule engine. The strcmp load is nothing; the real load comes from texture upload and render, as the display is drawn from scratch each frame.