2013-11-22, 04:47
Hello,
With XBMC becoming a de facto HTPC standard, I guess many users will be frustrated if they cannot select the audio DSP of their choice.
Think about Winamp.
Winamp has built-in audio DSP features such as equalization, but many users want more.
If you want more, you open the "DSP Studio" this way: Options - Preferences - Plugins - Effect/DSP.
"DSP Studio" is the list of audio plugins to be loaded, with their parameters, to be applied to the audio that's playing.
From the Winamp website, you can download dozens of audio DSP plugins.
Some are exceptional and deserve their own website, like Hans van Zutphen's Stereo Tool 3.0: http://www.hansvanzutphen.com/stereo_tool/
Think about the diyAudio community.
The diyAudio community is looking for an audio player that can be trusted: "bit-accurate" when feeding audio DACs, and using 32 bit audio precision when applying DSP in between.
Think about Android tablets vs. x86 HTPCs
Progressively, the diyAudio community became persuaded that audio streaming from a USB key, from a local hard disk, or from a NAS is the way to go. Products and services like The Beatles Stereo USB files (24 bit FLAC) and www.hdtracks.com acted as triggers. In general, diyAudio people don't trust x86 HTPCs because of the way Windows (XP, Seven, 8) deals with audio: most of the time they suspect resampling, and they assume that Windows can't deliver "bit-accurate" quality. Regarding x86 hardware, the diyAudio community fears electromagnetic pollution caused by motherboards consuming tens of watts. On top of this comes the audio clock jitter issue when the audio comes from a USB audio attachment, unless it relies on the fairly recent asynchronous USB audio protocol.
Think about audiophile DSP.
We know how devastating room effects can be for a given loudspeaker. More and more, the diyAudio community is looking in the direction of room correction, based on devices called "convolvers", which are FIR filters. The idea is to clean up the impulse response in order to attain a flat, wide frequency bandwidth and a linear phase. Coding a 32 bit resolution FIR filter only takes a few minutes: it is a repetitive multiply-accumulate within a do-while loop, done on a row of audio samples. Most of the time you will be more than happy if your stereo FIR can deal with 4096 audio samples at a time. This is not a fatal workload for GHz-class processors nowadays.
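To make the "repetitive multiply-accumulate" concrete, here is a minimal sketch of a direct-form FIR filter in plain Python. The function name and the 4-tap example are mine, for illustration only. A real convolver would use 4096 coefficients per channel and an optimized (SIMD or FFT-based) implementation.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: every output sample is a multiply-accumulate
    over the most recent len(coeffs) input samples."""
    history = [0.0] * len(coeffs)  # delay line, newest sample first
    out = []
    for x in samples:
        history.insert(0, x)       # push the new sample in
        history.pop()              # drop the oldest one
        acc = 0.0
        for h, s in zip(coeffs, history):
            acc += h * s           # the multiply-accumulate
        out.append(acc)
    return out


# Illustrative 4-tap moving average applied to a step input.
print(fir_filter([1.0, 1.0, 1.0, 1.0, 1.0], [0.25, 0.25, 0.25, 0.25]))
# -> [0.25, 0.5, 0.75, 1.0, 1.0]
```

Feeding a single 1.0 followed by zeros returns the coefficient list itself, which is why the filter coefficients and the filter impulse response are the same thing.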
We know how devastating lobing can be. Lobing is a problem mostly caused by multiway speakers, which exhibit radically different frequency responses (and impulse responses) depending on the angle from which you listen to them. In a non-concentric 2-way speaker, lobing comes from the midbass driver negatively or positively interfering with the tweeter when they operate in the frequency overlap zone. When you are listening slightly off-axis, the midbass-ear and tweeter-ear distances are not the same. If the path difference is equal to 1/2 sound wavelength, there will be a giant "dip" in the perceived frequency response, at the frequency corresponding to that wavelength. Basically, knowing that 80% of the sound energy is made of reflected sound, even if you are sitting "on axis", what you will hear is a sound that's 80% corrupt. Amazing, isn't it?
In order to annihilate most of those annoyances at the source, the diyAudio community agrees that multiway speakers should be treated for what they are, in the digital domain. Thus, if you are listening in stereo using 3-way speakers, your diyAudio system should generate 6 audio signals in the digital domain, each tailor-made for the driver it will feed. This implies using 6 power amplifiers and, in between, a 6-channel volume potentiometer.
A nice bonus is that speaker crossovers operating in the digital domain can deliver better performance than any analog crossover.
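The half-wavelength condition can be put into numbers. The sketch below assumes a speed of sound of 343 m/s, two point-source drivers stacked vertically, and a listener position given as a horizontal distance plus a height offset from the midpoint between the drivers. The geometry, names and example values are illustrative assumptions, not measurements.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at about 20 degrees C)


def path_difference(driver_spacing_m, distance_m, height_offset_m):
    """Difference between the midbass-ear and tweeter-ear distances.
    Tweeter sits spacing/2 above the midpoint, midbass spacing/2 below."""
    to_tweeter = math.hypot(distance_m, height_offset_m - driver_spacing_m / 2)
    to_midbass = math.hypot(distance_m, height_offset_m + driver_spacing_m / 2)
    return abs(to_midbass - to_tweeter)


def first_dip_frequency(path_difference_m):
    """Frequency whose half wavelength equals the path difference."""
    if path_difference_m == 0.0:
        return math.inf  # on axis: no cancellation from path difference
    return SPEED_OF_SOUND / (2.0 * path_difference_m)


# Example: drivers 15 cm apart, listener 2 m away and 50 cm above the midpoint.
delta = path_difference(0.15, 2.0, 0.5)
print(round(delta, 4), "m path difference")
print(round(first_dip_frequency(delta)), "Hz first dip")
```

If the dip frequency lands inside the crossover overlap zone, both drivers radiate it at similar levels and the cancellation is deep. That is the case described above.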
There are three reasons explaining this.
The first reason is that digital enables the implementation of perfect delays, while in analog, delays can only be approximated by many phase shifters (allpass filters) put in series. Thus, only in digital can you easily implement Transient Perfect crossover schemes like the Lipshitz-Vanderkooy Delay Compensated, presented at the AES in 1983. You can implement it using a Butterworth lowpass kernel (what L-V showed at the AES), or using a high-order Bessel lowpass kernel (approximating a Berchin Gaussian Lowpass - AES 1999). As a bonus, the high-order Bessel lowpass kernel provides no preshoot or ringing in the time domain, and no relative phase shift between the speaker drivers in the transition frequency band.
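Here is a minimal sketch of the delay-compensated (subtractive) idea: the highpass branch is a delayed copy of the input minus the lowpass output, so the two branches always sum back to a pure delay. For simplicity I assume a short symmetric FIR lowpass, whose delay is exactly (N-1)/2 samples. This is not the Butterworth or Bessel kernel from the AES papers, and all names are illustrative.

```python
def fir(samples, coeffs):
    """Plain FIR convolution, output truncated to the input length."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def delay(samples, d):
    """Exact d-sample delay: trivial in digital, only approximable in analog."""
    return ([0.0] * d + list(samples))[:len(samples)]


def subtractive_crossover(samples, lowpass_coeffs):
    """Return (low, high, d) with low + high == input delayed by d samples.
    Assumes a symmetric lowpass with an odd number of taps."""
    d = (len(lowpass_coeffs) - 1) // 2
    low = fir(samples, lowpass_coeffs)
    delayed = delay(samples, d)
    high = [x - l for x, l in zip(delayed, low)]
    return low, high, d


signal = [0.0, 1.0, 0.0, -0.5, 0.25, 0.0, 0.0, 0.0]
low, high, d = subtractive_crossover(signal, [0.25, 0.5, 0.25])
summed = [l + h for l, h in zip(low, high)]
print(summed)            # the input, delayed by d samples
print(delay(signal, d))
```

The acoustic sum of the two drivers is therefore a delayed copy of the input, which is what "transient perfect" means, provided the drivers themselves behave.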
The second reason is that in digital, you can specify a target amplitude and phase response in the frequency domain, compare it to the actual amplitude and phase response of your driver (the DFT, or FFT, of its measured impulse response), then apply the inverse DFT (or inverse FFT) to compute the required filter impulse response, and - happy surprise - you can view that impulse response as the list of FIR filter coefficients that you need. Truly amazing.
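As a rough sketch of that frequency-domain design, the code below divides a target response by a driver response bin by bin and transforms the result back into FIR coefficients. It uses a naive O(N^2) DFT so that it stays self-contained. The 8-point length and the toy "driver" impulse response are assumptions for illustration. A real design would use measured data, a much longer FFT, and some regularization where the driver response is close to zero.

```python
import cmath


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def correction_fir(driver_impulse, target_impulse):
    """FIR coefficients c such that driver (*) c == target (circular convolution).
    Fails with ZeroDivisionError if the driver has an exact spectral null."""
    h_driver = dft(driver_impulse)
    h_target = dft(target_impulse)
    correction = [t / d for t, d in zip(h_target, h_driver)]
    return [c.real for c in idft(correction)]


def circular_convolve(a, b):
    n = len(a)
    return [sum(a[k] * b[(t - k) % n] for k in range(n)) for t in range(n)]


# Toy driver with a small "echo"; target is a perfect impulse (flat, zero phase).
driver = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
target = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
coeffs = correction_fir(driver, target)
print([round(c, 4) for c in coeffs])
print([round(v, 6) for v in circular_convolve(driver, coeffs)])  # close to target
```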
The third reason is that in digital, instead of cleaning up the impulse response of your driver using a quite long FIR filter, you can pre-process the signal using a few IIR BiQuad filters (they consume almost no processing power) to shorten the impulse response. This is worth remembering when dealing with midbass drivers exhibiting a strong resonance at 5 kHz or so, as some Kevlar, fiberglass, carbon or aluminium membranes can. The IIR BiQuad filters rub out most of the resonance, and consequently the impulse response of the driver becomes a lot shorter. This way, you only need a dozen IIR BiQuad filters plus four 128-tap FIR filters when dealing with a crossover operating at 2 kHz.
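For illustration, here is a sketch of one BiQuad aimed at a 5 kHz resonance. I assume the widely used "Audio EQ Cookbook" notch formulas, a 48 kHz sample rate and Q = 5. In practice a peaking cut matched to the measured resonance would usually be a gentler choice than a full notch, and all parameter values here are placeholders.

```python
import cmath
import math


def notch_coefficients(f0_hz, q, sample_rate_hz):
    """Normalized (a0 == 1) notch BiQuad coefficients."""
    w0 = 2.0 * math.pi * f0_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def biquad(samples, b, a):
    """Direct form I: five multiplies per sample, hence the tiny CPU cost."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def magnitude(b, a, f_hz, sample_rate_hz):
    """Magnitude of the BiQuad frequency response at f_hz."""
    z1 = cmath.exp(-2j * math.pi * f_hz / sample_rate_hz)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return abs(num / den)


b, a = notch_coefficients(5000.0, 5.0, 48000.0)
for f in (100.0, 2000.0, 5000.0, 10000.0):
    print(f, "Hz ->", round(magnitude(b, a, f, 48000.0), 4))
```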
The net result of all this is that you can "focus" your multiway speaker in the digital domain, achieving a wide polar diagram and a quasi-perfect amplitude and phase response for a particular listening point. The whole thing represents almost no CPU load when running on 32 bit multicore GHz-class machines. Now, if the user wants to apply a room correction in stereo, this means two 4096-tap FIR filters, and that's all. Such a "room decorrelator" can represent a 10% load on a GHz-class machine.
What I am suggesting here is to leave all options open for the diyAudio community. How can this be done in XBMC?
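A back-of-the-envelope check of the workload, as a sketch. It assumes direct (time-domain) convolution at 48 kHz and a hypothetical CPU that sustains 4 multiply-accumulates per clock cycle at 1 GHz, for example through SIMD. Both figures are my assumptions, the BiQuads are ignored as negligible, and an FFT-based convolver would need far fewer operations.

```python
def macs_per_second(taps, channels, sample_rate_hz):
    """Multiply-accumulates per second for direct-form FIR filtering."""
    return taps * channels * sample_rate_hz


def cpu_load(macs, clock_hz, macs_per_cycle):
    """Fraction of one core used, under the stated throughput assumption."""
    return macs / (clock_hz * macs_per_cycle)


SAMPLE_RATE = 48_000      # Hz
CLOCK = 1_000_000_000     # 1 GHz, assumed
MACS_PER_CYCLE = 4        # assumed SIMD throughput

room = macs_per_second(4096, 2, SAMPLE_RATE)      # two 4096-tap FIR filters
crossover = macs_per_second(128, 4, SAMPLE_RATE)  # four 128-tap FIR filters

print("room correction:", room, "MAC/s, load", cpu_load(room, CLOCK, MACS_PER_CYCLE))
print("crossover FIRs: ", crossover, "MAC/s, load", cpu_load(crossover, CLOCK, MACS_PER_CYCLE))
```

Under these assumptions the room correction comes out at roughly a tenth of one core, and the crossover FIR filters at well under one percent.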
In XBMC AudioEngine, this time we need to specify that the audio must always arrive as stereo, which means that for a multichannel source, we want the XBMC AudioEngine to downmix the audio to stereo 24 bit (fixed point) at 48 kHz. Possibly 24 bit (fixed point) at 96 kHz if this is feasible.
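To illustrate what such a stereo downmix involves, here is a sketch for one 5.1 frame. The channel order (FL, FR, C, LFE, SL, SR), the -3 dB gains for center and surrounds, the dropped LFE and the normalization are common conventions that I am assuming here. This is not AudioEngine's actual code.

```python
import math

CENTER_GAIN = math.sqrt(0.5)    # -3 dB, assumed
SURROUND_GAIN = math.sqrt(0.5)  # -3 dB, assumed


def downmix_51_to_stereo(frame):
    """Fold one 5.1 frame (FL, FR, C, LFE, SL, SR) down to (left, right).
    The LFE channel is discarded in this sketch."""
    fl, fr, c, _lfe, sl, sr = frame
    left = fl + CENTER_GAIN * c + SURROUND_GAIN * sl
    right = fr + CENTER_GAIN * c + SURROUND_GAIN * sr
    # Normalize so that full-scale input on every channel cannot clip.
    norm = 1.0 / (1.0 + CENTER_GAIN + SURROUND_GAIN)
    return left * norm, right * norm


print(downmix_51_to_stereo((0.5, 0.5, 0.2, 0.9, 0.1, -0.1)))
```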
In XBMC AudioEngine, we need to output the stereo downmix to a virtual Jack (2-channel) called "audio DSP-in".
In XBMC AudioEngine, we need to grab the audio coming from a virtual Jack (6 channel) called "audio DSP-out".
In XBMC AudioEngine, we need to tell where to route the "audio DSP-out". This time, we'll ask AudioEngine to send that 6-channel audio over HDMI using the HDMI multichannel LPCM modality. This way our 6 audio channels will go over HDMI, and reach a 6-channel amp that we will wire in a special mode, with three channels going to the left multiway speaker, and three channels going to the right multiway speaker.
In XBMC AudioEngine, we need to specify whether XBMC can manage the listening volume using HDMI CEC.
Clearly, within XBMC, we need a structure (and a well documented API) for dealing with the "Jack-in" and "Jack-out" concepts.
There can be various situations.
1) No audio DSP : in such case, XBMC drops the concept of Jack-in and Jack-out. It behaves just as usual, and multichannel, of course.
2) Naming the DSP module to be executed : it will thus receive the 2-channel audio DSP-in, and output 8-channel audio DSP-out
3) For the sake of simplicity, don't allow chaining DSP modules.
4) An exciting possibility would be to execute a VST plugin (2 channel input, 8 channel output) as compiled by Flowstone. Could you have a look at this?
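To make the "Jack-in" / "Jack-out" idea more tangible, here is a sketch of what a minimal contract could look like: a module declares its input and output channel counts, the host hands it stereo frames and collects multichannel frames, and with no module selected the audio passes through untouched. Every name below (DspModule, route, the placeholder splitter) is hypothetical. None of this exists in XBMC.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class DspModule:
    """Hypothetical plugin contract: fixed channel counts, frame in, frame out."""
    name: str
    in_channels: int
    out_channels: int
    process: Callable[[Sequence[float]], Sequence[float]]


def route(frames: Sequence[Sequence[float]],
          module: Optional[DspModule] = None) -> List[List[float]]:
    """Situation 1: no module, audio passes through unchanged.
    Situation 2: one named module sits between DSP-in and DSP-out.
    Situation 3: no chaining, so at most one module is accepted."""
    if module is None:
        return [list(frame) for frame in frames]
    out = []
    for frame in frames:
        if len(frame) != module.in_channels:
            raise ValueError("DSP-in expects %d channels" % module.in_channels)
        result = list(module.process(frame))
        if len(result) != module.out_channels:
            raise ValueError("DSP-out expects %d channels" % module.out_channels)
        out.append(result)
    return out


# Placeholder module: copies L and R to three outputs each. A real module
# would apply the per-driver crossover filters here.
splitter = DspModule(
    name="stereo-to-3way",
    in_channels=2,
    out_channels=6,
    process=lambda f: [f[0], f[0], f[0], f[1], f[1], f[1]],
)

print(route([[0.1, -0.2]]))            # no DSP: unchanged
print(route([[0.1, -0.2]], splitter))  # DSP: six channels out
```

Keeping the channel counts as data in the module description, rather than hard-coding 2-in and 6-out, is what would let the same API serve a per-speaker 5.1 or 7.1 mode later.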
Avoid doing more at the beginning.
I definitely know this is only the beginning.
For instance, when the source is multichannel audio, we would like to have the choice between two behaviors:
1) Do what's described above (stereo active multiway speakers with digital crossover).
2) Deliver a genuine 5.1 or 7.1 (depending on the source) sound, with some DSP being applied on each loudspeaker.
My best recommendation is that you design the DSP API right now so that it can support both modalities later on.
JRiver Media Center already has built-in DSP features.
JRiver Media Center already has the nuts and bolts for an external DSP module, including splitting channels into more channels.
Unfortunately, JRiver Media Center is designed for x86 hardware, whether Windows, Mac, or Linux.
See here http://www.diyaudio.com/forums/digital-l...ost3712144
Best regards,
Steph