Mixing multiple WaveStreams

Jan 25, 2012 at 2:43 PM

I'm trying to output multiple audio streams from a server at the same time, but because my WaveStream is getting its data live, it doesn't implement Position or Length, which WaveMixerStream32 requires to function. Each stream also starts and stops at different times from the rest, since each stream is actually a person talking on a mic.

Any suggestions on how to mix multiple streams together?

Thanks!

Coordinator
Jan 25, 2012 at 3:04 PM

I usually implement my own mixer for things like this, probably implementing IWaveProvider rather than deriving from WaveStream, since Length and Position are not really important. You can easily base one on WaveMixerStream32. In the future I will be encouraging use of the ISampleProvider interface, which already has a MixingSampleProvider class that can be used.
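Something along these lines, for example (an untested sketch - the 8 kHz mono format and the two input variables are just placeholders for however you expose your decoded streams as ISampleProviders):

using NAudio.Wave;
using NAudio.Wave.SampleProviders;

// untested sketch: mix any number of live ISampleProvider inputs
// (all inputs must share the same sample rate and channel count)
var mixer = new MixingSampleProvider(WaveFormat.CreateIeeeFloatWaveFormat(8000, 1));
mixer.AddMixerInput(firstUserSampleProvider);   // placeholder: one user's decoded stream
mixer.AddMixerInput(secondUserSampleProvider);  // placeholder: another user's decoded stream
IWaveProvider output = new SampleToWaveProvider(mixer); // wrap back to IWaveProvider if needed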

Mark

Jan 25, 2012 at 4:08 PM

Since you mentioned you usually implement your own, do you have any samples besides the ones that are there now?

I've been trying to roll my own for a few days now, but the main issue I keep getting stuck on is how to know where to mix the streams. MixingSampleProvider looks like it would work, but it seems to still rely on position/length.

WaveMixerStream32 is all about position and length too. If all 10 streams are different lengths and start and stop at different times, the position couldn't be the same, right?

Thanks again,

Tyler

Coordinator
Jan 25, 2012 at 4:13 PM

You can just ignore the position and length of the input streams in your mixing class. IWaveProvider doesn't have Position and Length properties - you just need to implement Read. In Read, simply read from all your source streams and mix together whatever they return. If some of the inputs have already ended, they will just return 0 from their Read method. You can either always return count from Read, making a never-ending stream, or return the maximum number of bytes read from any of your inputs.
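As a rough, untested sketch of the shape that Read method can take (assuming the sources are kept in a field called inputs and have all already been converted to the same 32 bit IEEE float format):

// untested sketch: mix 32 bit IEEE float inputs sample by sample
public int Read(byte[] buffer, int offset, int count)
{
    int maxBytesRead = 0;
    byte[] readBuffer = new byte[count];
    Array.Clear(buffer, offset, count);                   // start from silence
    foreach (IWaveProvider input in this.inputs)          // inputs: List<IWaveProvider> field
    {
        int bytesRead = input.Read(readBuffer, 0, count); // ended inputs just return 0
        maxBytesRead = Math.Max(maxBytesRead, bytesRead);
        for (int n = 0; n < bytesRead; n += 4)            // 4 bytes per float sample
        {
            float mixed = BitConverter.ToSingle(buffer, offset + n) +
                          BitConverter.ToSingle(readBuffer, n);
            BitConverter.GetBytes(mixed).CopyTo(buffer, offset + n);
        }
    }
    return maxBytesRead; // or return count for a never-ending stream
}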

Jan 25, 2012 at 4:54 PM

Thanks Mark for your prompt replies,

I think I'm starting to figure this out, but I'm looking at WaveMixerStream32 and MixingSampleProvider - why do both require IEEE float as the encoding? I'm using NSpeex and it appears to use PCM. Speex is also 16 bit, whereas WaveMixerStream32 mixes in 32 bit.

As an audio noob, what all would I need to do to mix this?

Thanks!

Coordinator
Jan 26, 2012 at 9:16 AM

The reason for mixing in 32 bit instead of 16 bit is headroom. If you mix in 16 bit, you are adding together Int16s, which can overflow, causing distortion. When you go to 32 bit IEEE float, 1.0 becomes full scale, so if the mixed total goes above 1.0 you can simply scale the overall output level back down before converting back to 16 bit to send to the speakers.
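A toy example of the difference (just the arithmetic, not NAudio code):

// two loud 16 bit samples: their sum does not fit in an Int16 (max 32767)
short a = 30000, b = 20000;
// short bad = (short)(a + b);                  // wraps around to a large negative value
float fa = a / 32768f, fb = b / 32768f;         // roughly 0.92 and 0.61 in IEEE float
float mixed = fa + fb;                          // roughly 1.53 - over full scale, but no wrap-around
mixed = Math.Max(-1.0f, Math.Min(1.0f, mixed)); // clip (or scale the whole mix down instead)
short result = (short)(mixed * 32767);          // back to 16 bit for the sound card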

NAudio includes adapter classes to turn 16 bit audio into 32 bit (and vice versa if you need it).
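For example (an untested sketch of how those adapters are meant to be chained; the wrapper class is just for illustration):

using NAudio.Wave;

// untested sketch: go up to 32 bit IEEE float before the mixer, back to 16 bit after it
static class FormatAdapters
{
    // 16 bit PCM (e.g. a decoded Speex stream) -> 32 bit IEEE float, ready for mixing
    public static IWaveProvider ToFloat(IWaveProvider pcm16Source)
    {
        return new Wave16ToFloatProvider(pcm16Source);
    }

    // mixed 32 bit IEEE float -> 16 bit PCM, ready for the output device
    public static IWaveProvider ToPcm16(IWaveProvider ieeeFloatMix)
    {
        return new WaveFloatTo16Provider(ieeeFloatMix);
    }
}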

Jan 26, 2012 at 3:14 PM

Thanks Mark!

I was able to convert the audio to 32 bit and mix it together and convert it back to 16 bit. The audio quality has improved greatly.

There is one issue when two people talk at the same time, though: when both streams mix, you hear a horrible screeching sound.

Coordinator
Jan 26, 2012 at 3:18 PM

Are you able to tell if any clipping occurred (i.e. did the summed audio go over 1.0 at any point)? The conversion back to 16 bit ought to clip to plus or minus 1.0 before multiplying up by 32,768 to get back to a 16 bit number.

Also, a good way to debug this kind of problem is to save your inputs to WAV files, and then pass those WAV files through the mixer into another WAV file. Then you can look at the results in an audio editor and get a good idea of what the problem is.
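Something like this, for example (untested; MyMixerProvider stands in for whatever mixer class you have written):

using NAudio.Wave;

// untested sketch: run previously captured WAV files through the mixer offline
// so the result can be inspected in an audio editor
using (var input1 = new WaveFileReader("user1.wav"))
using (var input2 = new WaveFileReader("user2.wav"))
{
    var mixer = new MyMixerProvider(input1, input2); // your own mixer class
    // note: the mixer must stop returning data when its inputs end, or this never finishes
    WaveFileWriter.CreateWaveFile("mixed.wav", mixer);
}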

Mark

Jan 26, 2012 at 4:40 PM

It seems to be clipping correctly (I put some breakpoints in and there were numbers over 1.0, but it set them to 1.0).

So right now I have:

  • Main voice provider that has a mixer provider object.
  • Mixer provider that takes all the 16 bit PCM (Speex) streams and converts them to 32 bit IEEE float format (using the Wave16ToFloatProvider code).
  • A jitter buffer provider for each user - this decodes that user's speech.
  • The main voice provider takes the 32 bit IEEE float data from the mixer's Read and converts it back to 16 bit PCM (using WaveFloatTo16Provider).
  • Then I convert my main voice provider to an IWaveProvider object.

I based my providers on ISampleProvider.
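In rough, untested pseudo-code the wiring is something like this (simplified - MyMixerProvider and UserJitterBufferProvider are my own classes):

var mixer = new MyMixerProvider();                              // mixes 32 bit IEEE float
foreach (var user in connectedUsers)
{
    IWaveProvider decoded = new UserJitterBufferProvider(user); // 16 bit PCM (Speex)
    mixer.AddInput(new Wave16ToFloatProvider(decoded));         // up to 32 bit IEEE float
}
IWaveProvider mainVoice = new WaveFloatTo16Provider(mixer);     // back down to 16 bit PCM
var output = new DirectSoundOut();
output.Init(mainVoice);
output.Play();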

Jan 26, 2012 at 4:51 PM

Could it be because I'm using DirectSoundOut as my player?

Coordinator
Jan 26, 2012 at 6:47 PM

OK, if it is clipping, then you will be introducing some distortion. It may be that your input streams are at a high level (perhaps using some kind of automatic gain). Try reducing the volume of the mixed stream a little (e.g. multiply every sample by 0.75) before it gets converted back to 16 bit.
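If you are using WaveFloatTo16Provider for that final conversion, I believe it has a Volume property you can set for this (untested):

// back the whole mix off a little before it goes back to 16 bit
var converter = new WaveFloatTo16Provider(mixerOutput); // mixerOutput = your 32 bit float mixer
converter.Volume = 0.75f;                               // scale every sample by 0.75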

I don't think DirectSoundOut should be causing a problem, but it should be very easy to swap in WaveOut or WaveOutEvent to see if that makes a difference.
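e.g. (untested):

// same pipeline, different output device
var waveOut = new WaveOutEvent();        // instead of new DirectSoundOut()
waveOut.Init(mainVoiceProvider);         // mainVoiceProvider = your final 16 bit provider
waveOut.Play();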

Mark

Jan 26, 2012 at 7:53 PM

Okay, that didn't change the result.

Something is obviously not right in the conversion between 16 bit and 32 bit, as this didn't happen before.

I've checked the wave formats and set the volume at both the conversion from 16 bit and the conversion back to 16 bit, and I still get the same result.

Coordinator
Jan 26, 2012 at 8:27 PM

Are you able to write out to a WAV file at any stage in your pipeline? I should probably add a utility wave provider to NAudio that does this, but it's not too hard. Create a class that implements IWaveProvider. In the constructor, take a source stream and open a WaveFileWriter. In the Read method, read from the source, but before returning the bytes read, also write them to the writer. Then make sure you Dispose the writer when you are done, to make a valid WAV file.

You can then insert this wave provider at different points in your pipeline to see at what stage the audio is getting corrupted.

 

public int Read(byte[] buffer, int offset, int count)
{
    int bytesRead = this.source.Read(buffer, offset, count);
    this.writer.Write(buffer, offset, bytesRead);
    return bytesRead;
}
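Putting that into a complete class, it would look something like this (untested, and the class name is just a suggestion):

using System;
using NAudio.Wave;

// untested sketch: pass audio through unchanged while writing a copy to a WAV file
class WaveFileWriterProvider : IWaveProvider, IDisposable
{
    private readonly IWaveProvider source;
    private readonly WaveFileWriter writer;

    public WaveFileWriterProvider(IWaveProvider source, string fileName)
    {
        this.source = source;
        this.writer = new WaveFileWriter(fileName, source.WaveFormat);
    }

    public WaveFormat WaveFormat { get { return this.source.WaveFormat; } }

    public int Read(byte[] buffer, int offset, int count)
    {
        int bytesRead = this.source.Read(buffer, offset, count);
        this.writer.Write(buffer, offset, bytesRead);   // tap a copy off to the WAV file
        return bytesRead;
    }

    public void Dispose()
    {
        this.writer.Dispose();   // finalises the header so the WAV file is valid
    }
}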