(2014-04-20, 00:33)alwinus Wrote: have not so much time now to make a big answer, but one question, have you looked before to following links? https://github.com/xbmc/xbmc/pull/4402
Yes, I did. There is a problem: you don't take the sampling frequency into account when entering the audio DSP pipeline. IMO, a more practical approach is to explicitly manage the sampling frequency before entering the audio DSP pipeline. Compare your proposed pipeline with my proposed pipeline. What you propose is only good if you ask ffmpeg to do a sample rate conversion, like converting the 48 kHz audio coming from a movie to the 44.1 kHz that Windows is using. We don't want this. We want ffmpeg to remain "bit exact" and deliver "bit exact" audio to the audio DSP pipeline. Or, if we are lazy and don't want to design IIR biquads, FIR filters and delays for 44.1 kHz, 48 kHz, 96 kHz and 192 kHz, we must ask ffmpeg to do a sample rate conversion, possibly a high quality one that lets us tune the CPU load / data precision tradeoff. If ffmpeg cannot do the required sample rate conversion, the first stage (Stage 1) of the audio DSP pipeline must do it.
In the first place, consider XBMC as a sophisticated GUI for ffmpeg :
- asking ffmpeg to do things
- grabbing data streams coming out from ffmpeg.
You can see the ffmpeg audio options here :
https://ffmpeg.org/ffmpeg.html#Audio-Options.
That's not all.
XBMC also takes care of interfacing with the driver of the audio hardware (Realtek HD Audio, USB audio, HDMI). Actually, this is more complicated, because there is always an audio server deployed on the platform. Basically this is Windows DirectSound (DS), which mixes different audio sources having different sample rates and sends the resulting 44.1 kHz audio to the hardware through its driver. Oops, this means that for movies there is a 48 kHz -> 44.1 kHz resampling occurring within Windows DirectSound. This is thus not "bit exact". We are kaput. What to do ?
There are more sophisticated audio servers like JACK, able to take a Windows sound source (like a stereo audio stream coming from ffmpeg) and route it to ASIO. Starting from there, if you have a VST host installed, you can stack a few VSTs to implement any kind of audio DSP pipeline. The last VST of the pipeline will talk to the ASIO driver of your hardware to output the sound. Currently there are ASIO drivers for Realtek HD Audio on the motherboard (see ASIO4ALL) and for multichannel USB audio attachments, but I fear there is no ASIO driver for the HDMI 8-channel LPCM modality. Will you check that ?
Let's assume that there will be a software utility called ASIO4HDMI. From that moment, everything after the last VST becomes simple. Let me explain. If you want to output multichannel audio on the Realtek HD Audio motherboard, you will rely on the ASIO4ALL driver. If you want to output multichannel audio on a high quality multichannel USB audio attachment, you will rely on the ASIO driver that's supplied with it. And, if you want to output multichannel audio on HDMI 8-channel LPCM, you will rely on the ASIO4HDMI driver.
Thus, it is better to leave XBMC unaware of, and not in charge of, the sound that's coming out of the VST pipeline.
Well, it is a little bit more complicated than this: when you choose the HDMI 8-channel LPCM modality, you need to deliver the 8-channel LPCM audio back to XBMC, so that XBMC can embed it with the video on HDMI.
So, in a nutshell, the ultimate, most flexible and easiest to understand audio DSP manager for XBMC would consist of :
- a selection box to choose between "disable advanced audio" and "enable advanced audio"
- when "advanced audio" gets enabled, a selection box for the audio peripheral hardware (typically : Realtek HD Audio motherboard, USB audio attachment, HDMI 8-channel LPCM)
Actually, even in the "advanced audio" modality, XBMC can remain in charge of almost nothing regarding audio.
XBMC will talk to ffmpeg for :
- asking ffmpeg to do things
- grabbing the audio stream coming out from ffmpeg
XBMC would NOT route the audio CD, DVD or Bluray sound to the Windows sound mixer.
Instead, XBMC would route the audio CD, DVD or Bluray sound to the ASIO subsystem. Actually there are two sound domains :
- the XBMC sound domain (audio CD, DVD, Blu-ray, MP3, high quality streamed audio)
- the Windows sound domain (YouTube, security camera, Spotify, etc.)
How to deal with those two audio domains, potentially exhibiting different sample rates ?
The "bit-perfect" strategy would avoid any sample rate conversion of the XBMC audio.
This means that the ASIO sample rate must change on the fly, like when you go from listening to a CD (44.1 kHz) to watching a movie (48 kHz).
Don't know if this is feasible. Will you check ?
Say it is feasible.
Then comes the need to mix the XBMC sound with the sound coming from the Windows mixer ("beeps" from Windows, sound from a security camera, sound from YouTube, ...).
You thus need to resample the Windows sound to match the XBMC sound sample rate.
There can be a button labelled "disable Windows sound", in which case only the XBMC sound domain gets processed, hence no mixing and no resampling.
For each VST in the pipeline, you need to prepare four sets of filter coefficients and delays (44.1 kHz, 48 kHz, 96 kHz, 192 kHz), and select the right set on the fly.
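To make the "four sets of coefficients and delays" idea concrete, here is a minimal Python sketch. It is only an illustration, not anything XBMC or a VST host actually does: the function names, the 2 kHz low-pass corner and the 2.5 ms delay are arbitrary values I picked for the example. The biquad formula is the usual "Audio EQ Cookbook" low-pass.

```python
import math

# The four sample rates mentioned above.
SAMPLE_RATES = (44100, 48000, 96000, 192000)

def lowpass_biquad(fs, f0, q=math.sqrt(0.5)):
    """Audio EQ Cookbook low-pass; returns normalized (b0, b1, b2, a1, a2)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b0 = (1.0 - cos_w0) / 2.0 / a0
    b1 = (1.0 - cos_w0) / a0
    b2 = b0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0
    return (b0, b1, b2, a1, a2)

def delay_in_samples(fs, delay_ms):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(fs * delay_ms / 1000.0)

def build_tables(f0, delay_ms):
    """Precompute one coefficient/delay set per supported sample rate."""
    return {
        fs: {
            "biquad": lowpass_biquad(fs, f0),
            "delay": delay_in_samples(fs, delay_ms),
        }
        for fs in SAMPLE_RATES
    }

# Hypothetical example values: 2 kHz corner frequency, 2.5 ms delay.
TABLES = build_tables(f0=2000.0, delay_ms=2.5)

def select(fs):
    """Pick the precomputed set for the current stream sample rate."""
    return TABLES[fs]

if __name__ == "__main__":
    for fs in SAMPLE_RATES:
        print(fs, select(fs)["delay"], select(fs)["biquad"])
```

The same filter needs different coefficients and a different delay count at each rate, which is why every VST must carry all four sets. Here `select()` simply raises `KeyError` for a rate that has no table; a real pipeline would have to fall back to resampling in that case.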
The "ffmpeg 44.1 kHz resample" strategy would rely on ffmpeg to deliver audio at a sample rate that matches the nominal Windows sample rate, which is 44.1 kHz.
This means that the ASIO sample rate is always equal to the Windows sample rate.
Mixing in other sounds is easy (a simple addition), like the "beeps" from Windows, or like allowing your security camera to announce a visitor while you are watching a movie.
There can be a button labelled "disable Windows sound", in which case only the XBMC sound gets played, hence no mixing.
All VSTs in the pipeline will select the 44.1 kHz coefficients and delays.
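The "simple addition" used for mixing in this strategy can be sketched as below. This is a hedged illustration only: it assumes both streams are already at the same sample rate, are represented as floats in the range -1.0 to 1.0, and that hard clipping is an acceptable way to handle overflow. The function name `mix` and the `windows_enabled` flag (standing in for the "disable Windows sound" button) are hypothetical.

```python
def mix(xbmc, windows, windows_enabled=True):
    """Mix two sample buffers that share the same sample rate.

    With Windows sound disabled, the XBMC samples pass through untouched.
    Otherwise the buffers are added sample by sample and hard-clipped
    to the -1.0 .. 1.0 range. The shorter buffer is padded with silence.
    """
    if not windows_enabled:
        return list(xbmc)
    n = max(len(xbmc), len(windows))
    out = []
    for i in range(n):
        a = xbmc[i] if i < len(xbmc) else 0.0
        b = windows[i] if i < len(windows) else 0.0
        out.append(max(-1.0, min(1.0, a + b)))
    return out

if __name__ == "__main__":
    movie = [0.25, 0.5, -0.5]
    beep = [0.25, 0.75]
    print(mix(movie, beep))
    print(mix(movie, beep, windows_enabled=False))
```

Note that the addition only works because both domains run at 44.1 kHz here. In the other two strategies, the Windows buffer has to be resampled before it reaches a function like this.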
The "ffmpeg 96 kHz resample" strategy would rely on ffmpeg to deliver audio at a sample rate that is always equal to 96 kHz.
Then comes the need to mix the XBMC sound with the 44.1 kHz sound coming from the Windows mixer ("beeps" from Windows, sound from a security camera, sound from YouTube, ...).
You thus need to resample the Windows sound to match the 96 kHz XBMC sound sample rate.
There can be a button labelled "disable Windows sound", in which case only the XBMC sound domain gets processed.
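To give an idea of what resampling the Windows sound involves, here is a small Python sketch that only computes the rational conversion ratio between two sample rates. It does no filtering, and the function names are made up for the example. The ratio tells a polyphase resampler by how much to upsample and then decimate.

```python
from fractions import Fraction

def resample_ratio(src_hz, dst_hz):
    """Return the conversion ratio dst/src as a reduced fraction (up/down)."""
    return Fraction(dst_hz, src_hz)

def output_length(n_in, src_hz, dst_hz):
    """Number of output samples for n_in input samples (rounded up)."""
    r = resample_ratio(src_hz, dst_hz)
    return -(-n_in * r.numerator // r.denominator)

if __name__ == "__main__":
    # Windows sound (44.1 kHz) brought up to the 96 kHz XBMC domain.
    print(resample_ratio(44100, 96000))   # 320/147
    # Movie audio (48 kHz) brought down to the 44.1 kHz Windows domain.
    print(resample_ratio(48000, 44100))   # 147/160
    print(output_length(441, 44100, 96000))  # 960
```

Going from 44.1 kHz to 96 kHz means upsampling by 320 and decimating by 147, so almost every output sample is an interpolated value. That is acceptable for Windows "beeps", and it is the same reason why the 48 kHz -> 44.1 kHz conversion (147/160) can never be "bit exact".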