
NAudio source code modifications

Apr 29, 2014 at 1:45 PM
Hi,

We have recently been using the NAudio library to write some audio processing and RTP send and receive applications. We have had to make a few modifications to the source code that we downloaded from CodePlex (version 1.7.0.14) to achieve certain things.

We are submitting these code changes for a couple of reasons. The first is to see if you think there is in fact a better way of achieving what we were aiming for. The second is that, if you think these changes are acceptable, we would like you to consider adding them to the NAudio source code so they are available to us the next time the library is released through NuGet, if and when you are planning a library update.

MediaFoundationTransform.cs:

This class was hardcoded to read in one second at a time, which produced too much latency. We have changed the constructor from this:
    /// <summary>
    /// Constructs a new MediaFoundationTransform wrapper
    /// Will read one second at a time
    /// </summary>
    /// <param name="sourceProvider">The source provider for input data to the transform</param>
    /// <param name="outputFormat">The desired output format</param>
    public MediaFoundationTransform(IWaveProvider sourceProvider, WaveFormat outputFormat)
    {
        this.outputWaveFormat = outputFormat;
        this.sourceProvider = sourceProvider;
        sourceBuffer = new byte[sourceProvider.WaveFormat.AverageBytesPerSecond];
        outputBuffer = new byte[outputWaveFormat.AverageBytesPerSecond + outputWaveFormat.BlockAlign]; // we will grow this buffer if needed, but try to make something big enough
    }
...to this:
    /// <summary>
    /// Constructs a new MediaFoundationTransform wrapper
    /// </summary>
    /// <param name="sourceProvider">The source provider for input data to the transform</param>
    /// <param name="outputFormat">The desired output format</param>
    public MediaFoundationTransform(IWaveProvider sourceProvider, WaveFormat outputFormat)
    {
        this.outputWaveFormat = outputFormat;
        this.sourceProvider = sourceProvider;

        // This class has been modified so audio is taken from the source provider and added to the output buffer in 10-millisecond byte arrays.
        // Before this change, you would have to wait for 1 second of audio to be buffered before it would be converted.
        // Calculation: 1000 ms (i.e. 1 second) / 10 ms = 100, which is why the source and output buffers' average bytes per second are divided by 100.
        var bufferDivisor = 100;
        sourceBuffer = new byte[sourceProvider.WaveFormat.AverageBytesPerSecond / bufferDivisor];
        outputBuffer = new byte[(outputWaveFormat.AverageBytesPerSecond / bufferDivisor) + outputWaveFormat.BlockAlign]; // we will grow this buffer if needed, but try to make something big enough
    }
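As a worked example of that calculation (assuming a 48 kHz, 16-bit, stereo input, a typical shared-mode mix format; the numbers are illustrative only):

    var format = new WaveFormat(48000, 16, 2); // hypothetical input format
    // format.AverageBytesPerSecond = 48000 samples * 2 channels * 2 bytes = 192000
    // sourceBuffer length = 192000 / 100 = 1920 bytes, i.e. 10 ms of audio per read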
There were also a couple of comments dotted around the class, which we have removed for clarity:
        // strategy will be to always read 1 second from the source, and give it to the resampler
        // we always read a full second
WasapiCapture.cs:

We made a change here to be able to specify the length of audio, in milliseconds, that is output from the class each time audio becomes available.

We added a private variable:
    private int audioBufferMillisecondsLength;
We changed the constructor from this:
    /// <summary>
    /// Initialises a new instance of the WASAPI capture class
    /// </summary>
    /// <param name="captureDevice">Capture device to use</param>
    public WasapiCapture(MMDevice captureDevice)
    {
        syncContext = SynchronizationContext.Current;
        audioClient = captureDevice.AudioClient;
        ShareMode = AudioClientShareMode.Shared;

        waveFormat = audioClient.MixFormat;
        var wfe = waveFormat as WaveFormatExtensible;
        if (wfe != null)
        {
            try
            {
                waveFormat = wfe.ToStandardWaveFormat();
            }
            catch (InvalidOperationException)
            {
                // couldn't convert to a standard format
            }
        }
    }
...to this:
    /// <summary>
    /// Initialises a new instance of the WASAPI capture class
    /// </summary>
    /// <param name="captureDevice">Capture device to use</param>
    public WasapiCapture(MMDevice captureDevice) :
        this(captureDevice, 100)
    {
    }

    /// <summary>
    /// Initialises a new instance of the WASAPI capture class
    /// </summary>
    /// <param name="captureDevice">Capture device to use</param>
    /// <param name="audioBufferMillisecondsLength">Length of the audio buffer in milliseconds</param>
    public WasapiCapture(MMDevice captureDevice, int audioBufferMillisecondsLength)
    {
        this.audioBufferMillisecondsLength = audioBufferMillisecondsLength;
        syncContext = SynchronizationContext.Current;
        audioClient = captureDevice.AudioClient;
        ShareMode = AudioClientShareMode.Shared;

        waveFormat = audioClient.MixFormat;
        var wfe = waveFormat as WaveFormatExtensible;
        if (wfe != null)
        {
            try
            {
                waveFormat = wfe.ToStandardWaveFormat();
            }
            catch (InvalidOperationException)
            {
                // couldn't convert to a standard format
            }
        }
    }
In the InitializeCaptureDevice function, we then changed the following line from this:
        long requestedDuration = REFTIMES_PER_MILLISEC * 100;
... to this:
        long requestedDuration = REFTIMES_PER_MILLISEC * this.audioBufferMillisecondsLength;
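With those changes in place, the capture buffer length can be chosen per application. A minimal usage sketch, assuming the two-argument constructor above (device selection and the RTP hand-off are illustrative):

    using NAudio.CoreAudioApi;

    var enumerator = new MMDeviceEnumerator();
    var device = enumerator.GetDefaultAudioEndpoint(DataFlow.Capture, Role.Communications);
    var capture = new WasapiCapture(device, 20); // raise DataAvailable roughly every 20 ms
    capture.DataAvailable += (s, e) =>
    {
        // e.Buffer holds roughly 20 ms of audio in capture.WaveFormat;
        // hand e.Buffer / e.BytesRecorded to the RTP packetiser here
    };
    capture.StartRecording();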
Thanks
Sep 25, 2014 at 7:07 AM
Hello,

Any update on this issue/solution? I have the same problem: I need to send RTP packets, and instead of the default 100 milliseconds I need to send 20 milliseconds, and this seems the easiest way to do it. Is there another solution to this issue, or is this the best way to do it?
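(For context, assuming 8 kHz G.711 audio, a common RTP telephony format: 8000 bytes/s × 0.020 s = 160 bytes of payload per 20 ms packet, so a 100 ms buffer delivers five packets' worth of audio at once and adds that much latency.)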

Regards,
Istvan
Sep 25, 2014 at 9:45 AM
Hi,

No, I haven't had any updates on this. We are still using a modified NAudio library with the changes above in order to resolve our issues.

Regards,
cuffindall
Sep 26, 2014 at 9:17 AM
Hi,

Thanks for the update and the code modifications. It seems that we will use this as well. It works great for me :) Hopefully one day it will get into a build.

Regards,
Istvan
Coordinator
Sep 29, 2014 at 4:18 PM
Yes, sorry, I haven't had time to process this contribution yet. I would probably want to make the MFT buffer size configurable (maybe as a number of milliseconds).
Dec 21, 2014 at 5:44 PM
@cuffindall: Thanks for sharing your mods.

@markheath: Yes. For the MediaFoundationTransform constructor change, I've done the following so that the application can control the buffer size.

In NAudio\MediaFoundation\MediaFoundationTransform.cs:
    /// <summary>
    /// Constructs a new MediaFoundationTransform wrapper
    /// </summary>
    /// <param name="sourceProvider">The source provider for input data to the transform</param>
    /// <param name="outputFormat">The desired output format</param>
    /// <param name="bufferLenMs">Source buffer size in Miliseconds</param>
    public MediaFoundationTransform(IWaveProvider sourceProvider, WaveFormat outputFormat, int bufferLenMs = 10)
    {
        this.outputWaveFormat = outputFormat;
        this.sourceProvider = sourceProvider;

        int bufferDivisor = 1000 / bufferLenMs;
        sourceBuffer = new byte[sourceProvider.WaveFormat.AverageBytesPerSecond / bufferDivisor];
        outputBuffer = new byte[(outputWaveFormat.AverageBytesPerSecond / bufferDivisor) + outputWaveFormat.BlockAlign];
    }
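One caveat worth flagging: 1000 / bufferLenMs is integer division, so a bufferLenMs of zero throws immediately, and values above 1000 give a divisor of zero and a divide-by-zero on the following lines. A minimal guard, as a sketch (not part of the change above), could be:

    // Hypothetical guard, not in the original change: keep bufferLenMs in a usable range
    if (bufferLenMs <= 0 || bufferLenMs > 1000)
        throw new ArgumentOutOfRangeException("bufferLenMs", "Expected a value between 1 and 1000 milliseconds");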
I also applied the new parameter to the constructors for MediaFoundationResampler in NAudio\Wave\WaveProviders\MediaFoundationResampler.cs.
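The pass-through might look something like this; the exact MediaFoundationResampler overloads aren't shown in this thread, so treat the signature as an assumption:

    /// <summary>
    /// Creates a resampler with the specified target output format
    /// </summary>
    /// <param name="sourceProvider">Source provider</param>
    /// <param name="outputFormat">Desired output format</param>
    /// <param name="bufferLenMs">Source buffer size in milliseconds (forwarded to the transform)</param>
    public MediaFoundationResampler(IWaveProvider sourceProvider, WaveFormat outputFormat, int bufferLenMs = 10)
        : base(sourceProvider, outputFormat, bufferLenMs)
    {
        // remainder of the existing constructor body unchanged
    }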