v17 change/addition to volume amplification
#1
Hello,
Regarding the topic "volume amplification":
It is of course good that Kodi offers something in this direction. Movie audio is often globally too low, or there are imbalances, or simply personal preferences to account for; with "volume amplification" the user has a little tool to influence that. In my opinion, however, this option is implemented in a way that only covers the "user has personal preferences" part, but NOT the (in my view more important) part of correcting upstream errors while keeping the originally intended balance.
What's my point?
In a home cinema, many people first try to establish standard conditions (e.g. video calibration with a 3DLUT, correct contrast, master EQing of the audio, ...) and only then, if desired, apply personal preferences (a certain video gamma value, more bass boost, ...).
In practice, "volume amplification" works like a compressor: loud parts stay loud (you can't boost them further without distortion), while quieter parts get amplified, so the dynamic range is reduced. At least, that is the behavior whenever the boost is set high enough that clipping would otherwise occur.
A compressor is the tool for personal preferences (e.g. explosions seem too loud, voices seem too low, ...).
Of course it is good to cover that use-case.
What I am missing is an implementation that does not alter the dynamic range while still giving consistent volume. What I want is called peak volume normalization, and it is very cheap to achieve.
When making an audio CD, for example, you can normalize the audio files: scan each file, detect the maximum volume, and scale the volume so that the loudest part sits at 0 dBFS.
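As a rough illustration (not from any particular player; Python used purely as pseudocode, with samples assumed to be floats in the range -1.0 ... 1.0), that offline normalization is just a two-pass scan and scale:

```python
def normalize_peak(samples):
    """Offline peak normalization: scale the signal so that the
    loudest sample lands exactly at 0 dBFS (i.e. at magnitude 1.0)."""
    peak = max(abs(s) for s in samples)   # pass 1: find the maximum volume
    if peak == 0.0:
        return list(samples)              # pure silence: nothing to scale
    return [s / peak for s in samples]    # pass 2: scale everything uniformly

# Example: a file peaking at 0.25 (about -12 dBFS) gets scaled by 4,
# so the relative levels -- the dynamic range -- stay exactly the same.
out = normalize_peak([0.1, -0.25, 0.2])
```

Because every sample is multiplied by the same factor, the dynamic range is untouched; only the overall level changes.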
Ok, but we have video. We don't have the time to scan whole files, and with internet streams we can't scan the whole file at the start of playback anyway.
No problem.
MPC-HC has solved this in a very simple, correct and practical way:
The boost setting means "maximum allowed boost". Say we set it to 20 dB. Now what does the implementation do? The movie starts and sound comes in at -40 dB. Up to +20 dB is allowed, so the sound is boosted to -20 dB. Then a louder part arrives at -20 dB: -20 dB + 20 dB = 0 dB, everything is fine. Then a sound follows at -10 dB: +20 dB would be allowed, but -10 + 20 = +10 dB means clipping, so only 10 dB is applied: -10 + 10 = 0. So when the movie is already fairly loud, less boost is applied than for a very quiet movie; that is what "maximum allowed boost" means.
What does this mean? This is in fact normalization (and certainly not compression), so the dynamic range is preserved. When you play back ordinary movies with differing peak volume, you get consistent results. You can allow e.g. 40 dB of boost (which should be sufficient for all cases); there is never clipping and the dynamic range is not altered. With this implementation every movie reaches 0 dBFS without the audio being compressed (given sufficient allowed boost).
There is only one disadvantage, which in practice isn't really one: the effective volume is not correct until the loudest part of the movie has been reached. Normally a movie doesn't start with a big explosion, so until the point where normal/loud material occurs, the effective volume is too high. But that is not a restriction in practical use, because you won't find a movie with an extremely low sound level for an extended period at the beginning, followed by an eternal wait for a loud part. Of course a movie has quiet and loud parts (the dynamic range), but they are certainly not distributed in such a polarized way. So in the worst case, some nature sounds like wind or water will be a bit too loud at the beginning of certain movies, but after a short time there is definitely normally loud material such as dialogue or technical sounds.
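The "maximum allowed boost" behavior described above can be sketched roughly like this (Python purely as illustration; the names are my own, and samples are assumed to be floats in -1.0 ... 1.0, processed block by block):

```python
def make_peak_normalizer(max_boost_db=20.0):
    """Streaming peak normalization: the gain starts at the maximum
    allowed boost and is only ever reduced -- never raised again --
    whenever a block of samples would otherwise clip."""
    state = {"gain": 10.0 ** (max_boost_db / 20.0)}  # dB -> linear factor

    def process(block):
        peak = max((abs(s) for s in block), default=0.0)
        if peak * state["gain"] > 1.0:       # would exceed 0 dBFS ...
            state["gain"] = 1.0 / peak       # ... so cut the gain, permanently
        return [s * state["gain"] for s in block]

    return process

norm = make_peak_normalizer(20.0)
norm([0.01])  # -40 dB material: the full +20 dB is applied -> peaks at 0.1
norm([0.5])   # louder material: gain drops so the output peaks at exactly 1.0
norm([0.01])  # gain stays reduced afterwards: no compression, no "pumping"
```

This reproduces the walkthrough above: quiet material gets the full boost, louder material permanently lowers the applied gain, and the dynamic range of everything after that point is preserved.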
MPC-HC additionally has a checkbox named "regain volume". This setting leads to the compressor behavior: when checked, the volume is boosted up again after loud parts. But the advantage in MPC-HC is that you have the choice.
Not much work would be needed to achieve this in Kodi, because half of the function (or more) is already implemented. The GUI would only need one additional checkbox, and the algorithm additionally needs a variant that does not compress the audio.

This functionality is needed because movies don't have consistent volume (especially streamed sources), and normalization is part of establishing standard conditions (so you can leave the volume knob of the amp in a defined position).
The problem is that these two approaches are frequently mixed up. Compression, or DRC, or night mode, etc. is a matter of personal taste; normalization is a matter of standards. It seems a little strange to me to offer the setting for personal taste but not the one for standard conditions.
Kodi has a massive amount of "super special" features, in my opinion, yet such a basic one is missing...

I hope that someone will find the time to "fix" that.

Hannes.
#2
Sounds like a plan for an adsp addon...
AppleTV4/iPhone/iPod/iPad: HowTo find debug logs and everything else which the devs like so much: click here
HowTo setup NFS for Kodi: NFS (wiki)
HowTo configure avahi (zeroconf): Avahi_Zeroconf (wiki)
READ THE IOS FAQ!: iOS FAQ (wiki)
#3
Quote:Sounds like a plan for an adsp addon...
Hm. I think a complete add-on is a bit of overkill when one checkbox plus a very simple function behind it would do. And the audio DSP add-on: there have been rumors about that for many years, and I don't see it coming. It seems nobody is really interested in audio quality; most of the effort goes into video. I understand that, since the visual part of a movie is arguably more important than the audio part, but on the other hand most of us don't mainly watch 1920s silent movies, so good audio quality has its importance as well...
#4
The perfect proof that this fits an add-on: you are the first guy who has ever asked for this. You are free to hack it in yourself in whatever way you see fit if the ADSP schedule does not fit your needs.
#5
Quote:The perfect proof that this fits an add-on: you are the first guy who has ever asked for this. You are free to hack it in yourself in whatever way you see fit if the ADSP schedule does not fit your needs.
Ah, I understand. You have the perfect overview over thousands of threads and posts from the last years, and of course you know that I am the only person who wants such a feature.
I posted in this section to give a hint towards a different/better/alternative audio normalization approach, nothing else.
If other people don't ask, that could simply mean they are lazy, can't describe the problem in proper words, take the software as is, or don't even know that something better is possible here...
By chance or for whatever reason, the MPC-HC guys have, in my opinion, made the right decision and offer a properly working solution for the audio normalization problem. And by chance or for whatever reason, the Kodi guys have left this solution out.
I'm not posting here in order to get a personalized solution for my problems.
I can use MPC-HC as an external player in Kodi. And if I really want audio peak normalization within Kodi, I'll have to make some code changes (I don't fear that) and compile Kodi on my computer (there I have more fears; it is a huge package, and the first successful compilation will probably take some time and effort).
My post was more in the spirit of well-meant hints. I always find workarounds for myself to get the solutions I want. It was meant in the sense that a dev interested in Kodi's audio part might read this, get inspired, and realize parts of it.
Maybe my approach is not very welcome; in that case I won't post such hints anymore. But then you could probably close the "feature request" section, because an answer like "you are free to hack it in yourself in whatever way you see fit" is ridiculous from my point of view. Translated, it means: take everything as is (it is perfect as is), and if you think otherwise, make your own software.
I wish you a whole bunch of success with that attitude.
#6
You've misread, I think - the proposal to hack it in was only for the case that you are unhappy with the ADSP add-on schedule...
#7
Oh, if it was a misunderstanding, then I definitely apologize for perhaps using some harsh words...
If ADSP add-ons are really coming with Kodi 18 and the functionality described above can be implemented, I would be more than happy with that.
It wasn't important to me to have this feature instantly at the moment of asking for it; I would rather see a missing feature find its way into Kodi eventually.
#8
Yeah, it definitely fits into an AudioDSP mode. I guess my new AudioDSP implementation will officially not be available for v18, but I have started to implement this feature for 17.1. I currently can't say when it will be ready.

What I miss in your description: how would you estimate the loudness in a block of samples? And what do you do with a block of noise?
Latest news about AudioDSP and my libraries is available on Twitter.

Developers can follow me on Github.
#9
Quote:Yeah, it definitely fits into an AudioDSP mode. I guess my new AudioDSP implementation will officially not be available for v18, but I have started to implement this feature for 17.1. I currently can't say when it will be ready.
Oh, nice to hear that there is some work in progress in this field!

Quote:What I miss in your description: how would you estimate the loudness in a block of samples? And what do you do with a block of noise?
I'm not quite sure what you mean exactly. The "algorithm" I described does not estimate loudness (and I'm deliberately not talking about loudness, because loudness is perceived volume; I'm talking about absolute or peak volume).
My approach (or rather, the approach of the MPC-HC implementation I'm referring to) works simply with gain plus clipping detection.
I don't know at which exact point of the playback chain the AudioDSP docks in, but I assume you have some sort of small buffer. The buffer only needs to be large enough that you can react in time for clipping protection.
A "block of noise" is not a special case.
Maybe my wording was not appropriate to describe the algorithm, so I'll try once again:
1. Apply a fixed amount of gain to all audio channels. The amount could be predefined with a useful value or made user-definable.
2. Look into the audio buffer. If the applied gain would lead to clipping for an upcoming audio sample, permanently reduce the applied gain to a value that prevents clipping (i.e. peaks at most at 0 dBFS).

With this approach the dynamic range is not touched; it is simple peak normalization. The "algorithm" is intended for audio material whose peak volume is below 0 dBFS (which is very common). It is not intended to alter audio according to personal preferences (compression, changed perceived loudness, ...); it simply makes use of the technically available dynamic range.
The goal is to leave the dynamic range of the audio material as is and to leave the volume knob of the final device (e.g. the audio amp) in a defined position.
Or in other words: "play back the audio as loud as possible without changing its dynamic range and without clipping".
The algorithm only reacts to audio that is too loud (the clipping protection); it does nothing when the audio is quiet or gets quieter (apart from the permanent fixed gain, of course).
The algorithm is not intelligent and couldn't be any simpler, but it is very effective.
I think there is no other way to do this correctly: you can't scan a whole audio file before playback, because that would take time, and in the case of streaming sources the audio material isn't fully available before playback anyway.

Hannes.
#10
Thanks for describing it again in different words. I found a library with VST and Winamp implementations. It's also possible to use it in an AudioDSP add-on, because it's a C++ library.

https://github.com/lordmulder/DynamicAud.../README.md

Could this fill the gap?
#11
Hmm...
I've carefully read the whole of the readme.
What this library offers is in reality dynamic range compression. They deny it, but it is. Their "trick" is to work with small chunks of the audio data. A compressor, for example, works in real time and doesn't need a buffer, so its chunk size is infinitesimally small. What I would like to have is the simplest version of normalization, with a chunk size of "the whole file". The Dynamic Audio Normalizer (the name is misleading; this is NOT normalization) uses a chunk size in between: not very small, but not as big as possible either. That is how they arrive at their so-called "local regions".
You can see the outcome at the bottom of the readme. This is compression, what else? The louder parts stay loud and the quieter parts get louder; that is the result of compression, not of normalization.
The result is better than with a simple real-time compressor, no question. So this would be the right tool for someone who wants sophisticated compression without the danger of heavy pumping, i.e. for people who want similar perceived volume between different files plus compression within a file.
Why is this tool not what I want (what I want is far, far simpler)? Take a normal movie as material. It can have a quiet part with nature sounds, such as soft water or wind, lasting a minute or more. And it can have action scenes with gunfire and car chases lasting several minutes.
With the Dynamic Audio Normalizer (think of the bottom picture in its readme), you know what happens: the car chase stays as loud as it is, and the nature sounds get pumped up.
But the movie's sound editor may have made the nature sounds exactly as quiet as they are on purpose, so it is simply wrong to pump them up (when the goal is to preserve the given dynamic range).
If the tool has parameters that allow configuring it to only apply gain and protect against clipping, then it could serve the purpose I described. If not, it is a tool with a rather sophisticated algorithm for a kind of "soft" dynamic range compression (compared to a "normal" real-time compressor).
With correct normalization, only one initial gain factor is allowed, and only one direction of change is allowed: downward, for clipping protection.
As soon as there are different gain factors rising and falling over time, the process can't be called normalization anymore. That is compression. And compression is a matter of personal taste (when playing back already mastered material).
Normalization is good for eliminating overall gain errors accumulated along the distribution/playback chain. Often headroom is reserved to avoid possible clipping, and sometimes this is done several times, so the reductions accumulate.
I often watch TV programs through the station's online portal (or through a Kodi add-on if one exists for the station), and very often the peak volume of a program is in the region of -20 ... -10 dB.
The "algorithm" I described turns all of these programs into 0 dBFS peak volume without clipping and without changing the given dynamic range.
The "algorithm" I describe is so simple that there is no library for it out there; it is a matter of a few lines of code:

gain = 40; // in dB; can be a useful hardcoded value or user-definable
while (true)
{
    buffer = audioinput; // current audio sample's level, given in dB
    if ((buffer + gain) > 0) gain = buffer * (-1); // clipping protection
    audiooutput = buffer + gain; // normalization
}

It's as simple as that.

And the Dynamic Audio Normalizer could cover the "personal taste" part of the story.
#12
(2017-03-30, 22:10)hannes69 Wrote: gain = 40; // in dB; can be a useful hardcoded value or user-definable
while (true)
{
    buffer = audioinput; // current audio sample's level, given in dB
    if ((buffer + gain) > 0) gain = buffer * (-1); // clipping protection
    audiooutput = buffer + gain; // normalization
}

It's as simple as that.

And the Dynamic Audio Normalizer could cover the "personal taste" part of the story.

If it is so simple, you should start reading the AudioDSP API headers and wrap that algorithm into an add-on.

But how would that work?
Code:
if ((buffer + gain) > 0) gain = buffer * (-1); //clipping protection
From a first look, this is the implementation of your algorithm.

Quote:I've carefully read the whole of the readme.
What this library offers is in reality dynamic range compression. They deny it, but it is. Their "trick" is to work with small chunks of the audio data. A compressor, for example, works in real time and doesn't need a buffer, so its chunk size is infinitesimally small. What I would like to have is the simplest version of normalization, with a chunk size of "the whole file". The Dynamic Audio Normalizer (the name is misleading; this is NOT normalization) uses a chunk size in between: not very small, but not as big as possible either. That is how they arrive at their so-called "local regions".
You can see the outcome at the bottom of the readme. This is compression, what else? The louder parts stay loud and the quieter parts get louder; that is the result of compression, not of normalization.
I don't know if it is compression, because what I learned about compression is different: loud parts would be compressed and the dynamic range would be decreased. That is what is done here.
#13
Quote:If it is so simple, you should start reading the AudioDSP API headers and wrap that algorithm into an add-on.
The algorithm itself should be that simple. Of course there is always a lot of coding around it for numeric/math purposes, error handling, ...
If I find some time, I'll look into the AudioDSP headers. The problem is that I'm not really a programmer. I had some courses in C and the basics of C++ during my studies, and I've written some small tools in C#. I'm an electrical engineer and have used some coding skills at work and for personal free-time tools. But I have real difficulties reading other people's code, and I can't really work in teams in that respect (sharing code, conventions, APIs, ...). I consider my own coding very goal-oriented: the main purpose is that it works, no matter whether it is coded cleanly or consists of "dirty hacks". I like programming the main functionality, but everything around it is very annoying to me. My main technique is programming with .NET, typing the "." in Visual Studio and choosing the function I need.
So in short: I didn't mean to say that the work to be done is trivial or that anyone could do it. I only wanted to say that the basic concept of normalization is simpler than compression, and the core functionality should not be very complex.

Quote:From a first look this is the implementation for your algorithm.
Yes. This looks a little more complex than what I suggested, but: a) a large part of this code covers the audio peak value calculation for the different sample formats (I took that as given in my pseudocode snippet), and b) if I read the code correctly (which is difficult for me) and consider MPC-HC's audio switcher functionality, the code covers two use cases. Within the audio switcher there is a checkbox called "Regain volume". When it is checked, you get a sort of compression. The behavior I described and would like to see implemented in Kodi (or an add-on) is the scenario when the checkbox is NOT checked: unchecked leads to the described normalization. Checked means that after a loud part (when the gain has been reduced due to clipping protection), the gain value climbs up again when quieter material is played back.
So this implementation covers the normalization function and a sort of compression, and the user can choose between the two with a checkbox. Because of the "Regain volume" part, this code is considerably more complex than offering only the normalization part would be. Additionally, the code covers a user-selectable "boost factor", which makes it possible to add extra boost for very quiet material when the normalization gain factor isn't sufficient, or to skip the normalization function and apply plain boost only.
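For illustration only (my own names and an arbitrary recovery rate; nothing here is MPC-HC's actual code), the two modes differ in a single branch: with "Regain volume" off, the gain only ever falls; with it on, the gain creeps back up after loud passages, which is what produces the compressor-like behavior:

```python
def make_gain_control(max_gain=10.0, regain_volume=False, recovery=1.01):
    """Block-wise gain control; samples are floats in -1.0 ... 1.0."""
    state = {"gain": max_gain}

    def process(block):
        peak = max((abs(s) for s in block), default=0.0)
        if peak * state["gain"] > 1.0:
            state["gain"] = 1.0 / peak   # clipping protection (both modes)
        elif regain_volume:
            # "Regain volume" checked: slowly climb back toward max_gain
            state["gain"] = min(max_gain, state["gain"] * recovery)
        return [s * state["gain"] for s in block]

    return process
```

With `regain_volume=False` this is exactly the pure normalization argued for in this thread; with `True` it turns into the slow compressor described above.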
Because this code is publicly available (I didn't know that), you now have my wanted algorithm already coded.

Quote:I don't know if it is compression, because what I learned about compression is different. Loud parts would be compressed and the dynamic range is decreased. This is what we did here.
Compression means the dynamic range is compressed, i.e. decreased. Normal compression algorithms work like this: you define a threshold, say -20 dB. Everything below it is not processed; everything above it is. Then you define a ratio, say 3:1, so for every part louder than -20 dB the overshoot is reduced to 1/3 (e.g. -11 dB becomes -17 dB: -11 dB is 9 dB too loud, 9 dB / 3 = 3 dB, and -20 dB + 3 dB = -17 dB). Finally you define the makeup gain, which is applied after the compression. When configured correctly, the loud parts stay the same (first compressed, then lifted back up by the makeup gain) and the quieter parts become louder by the makeup gain.
Technically the compression affects the louder parts first, but when you look at the whole concept of a 'compressor' including all its steps, you see that in the end the quieter parts become louder relative to the louder parts. The result is a reduced dynamic range, and the perceived volume goes up.
Without the makeup gain, the louder parts become quieter, the quieter parts stay the same, the perceived volume is reduced, and the dynamic range is still reduced. You can of course use a compressor like that, but it is not the usual way; you always want to use as much of the available dynamic range as possible.
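The threshold/ratio/makeup arithmetic above can be written down as a tiny worked example (Python purely as illustration; levels in dB):

```python
def compress_db(level_db, threshold_db=-20.0, ratio=3.0, makeup_db=0.0):
    """Classic downward compressor acting on a level in dB:
    below the threshold nothing happens; above it, the overshoot is
    divided by the ratio; makeup gain is added at the very end."""
    if level_db > threshold_db:
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db

# The example from the text: -11 dB is 9 dB over the -20 dB threshold,
# 9 dB / 3 = 3 dB, so the compressed level is -20 dB + 3 dB = -17 dB.
# With 6 dB of makeup gain that -17 dB comes back up to -11 dB, while
# a -30 dB part (below the threshold) simply rises by the full 6 dB.
```

Running `compress_db(-11.0)` gives -17.0, and `compress_db(-30.0)` returns -30.0 unchanged, matching the description of the quieter parts being left alone before makeup gain.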
Not applying the makeup gain is a perfect example of my normalization use case: someone in the mastering/broadcast chain used a compressor and forgot the makeup gain. The whole signal is now reduced by a fixed amount, maybe 10 dB. With normalization I get those missing 10 dB back, land at 0 dB, and use the full technically possible dynamic range without altering the source signal's dynamic range.
Of course you can leave out normalization and use the volume knob of the final analog amp. But that would mean adjusting the knob for every file, and lifting the signal in the analog domain instead of digitally also lifts the amp's noise floor.
The goal is always to feed the signal to the amp with a maximum of 0 dBFS. At the beginning and in the middle of the signal chain you can leave some headroom for later stages (so that the signal isn't clipped by accident or by some tool like an EQ), but at the last stage (the audio renderer should be the last stage) you should have consistent and maximal peak volume. I'm still talking about the technically correct peak volume here, not about personal taste.
#14
Was something changed with AAC? Because with all my AAC files I need to turn my amp up to maximum (!!!) volume to hear anything, and then when I play a different TV show I nearly blow my speakers out.
