
Convert WaveStream to double-array (or apply FFT directly)

Jan 28, 2013 at 4:15 PM
Edited Jan 28, 2013 at 4:17 PM

I'm reading an MP3 file and converting it to a WaveStream:

Mp3FileReader mp3 = new Mp3FileReader(open.FileName);
WaveStream pcm = WaveFormatConversionStream.CreatePcmStream(mp3);

Now I'd like to convert that stream to a double-array.

I've read about WaveBuffer but have no idea how to use it for that purpose.

Can you help me out?

Jan 28, 2013 at 5:03 PM

I've answered your question on StackOverflow. Also, unless you are using a really old version of NAudio, there is no need for the WaveFormatConversionStream. Simply reading out of the Mp3FileReader will give you PCM. It is most likely 16-bit stereo, so you'd want to split out the channels before you do an FFT.

WaveBuffer saves you using BitConverter. Simply bind a WaveBuffer to your byte array containing PCM samples, and then you can access each sample using the ShortBuffer property. But your FFT probably wants float/double anyway, so you still need to divide by 32,768 and put it into your separate left and right floating point arrays.
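
A minimal sketch of that split, assuming 16-bit little-endian stereo PCM in a byte array. It uses BitConverter instead of WaveBuffer so that it compiles without NAudio; `PcmSplitter` and `SplitStereo16` are hypothetical names, not part of NAudio.

```csharp
using System;

public static class PcmSplitter
{
    // Hypothetical helper: splits interleaved 16-bit stereo PCM into two
    // float arrays scaled to roughly [-1, 1).
    // Assumes a little-endian machine, which matches the PCM byte order.
    public static void SplitStereo16(byte[] pcm, int byteCount,
                                     out float[] left, out float[] right)
    {
        const int bytesPerFrame = 4; // 2 bytes left + 2 bytes right
        int frames = byteCount / bytesPerFrame;
        left = new float[frames];
        right = new float[frames];

        for (int n = 0; n < frames; n++)
        {
            int offset = n * bytesPerFrame;
            left[n] = BitConverter.ToInt16(pcm, offset) / 32768f;
            right[n] = BitConverter.ToInt16(pcm, offset + 2) / 32768f;
        }
    }
}
```

Pass the number of bytes actually returned by Read as `byteCount`, since the last read of a file is usually shorter than the buffer.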

Jan 28, 2013 at 5:12 PM

Thanks for the quick reply, Mark.

How can I split the channels, and why do I need to anyway?
My FFT (the one by Lomont) asks for interleaved pairs of the real and imaginary parts.

Up until now I thought the real part was the left channel and the imaginary part was the right channel.

Jan 28, 2013 at 5:13 PM

The left and right samples will be interleaved, so just throw every other one away if you only need to do the FFT of one channel. Audio is entirely real, so the imaginary component of the input signal will always be zero.
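
As a rough illustration, assuming the FFT routine expects a double array of interleaved (real, imaginary) pairs as described above, one channel can be packed like this. `FftInput` and `ToInterleavedComplex` are made-up names for this sketch.

```csharp
public static class FftInput
{
    // Hypothetical helper: packs real-valued samples into an array of
    // interleaved (real, imaginary) pairs with every imaginary part zero.
    public static double[] ToInterleavedComplex(double[] samples)
    {
        var data = new double[samples.Length * 2];
        for (int n = 0; n < samples.Length; n++)
        {
            data[2 * n] = samples[n]; // real part = audio sample
            // data[2 * n + 1] is left at 0.0: the imaginary part
        }
        return data;
    }
}
```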

Jan 28, 2013 at 5:20 PM
Edited Jan 28, 2013 at 5:20 PM

Ok, very well :)

Is there an easy way to check whether the FFT worked correctly?

Before applying the FFT I split the audio into chunks of 4096 bytes (to keep some time resolution).

I created a 100 Hz sine wave, applied the FFT, and looked in the double array for the high magnitude, but I couldn't find it. To find the right entry I calculated the bandwidth of each array entry (44100/4096 ≈ 10.8 Hz) and looked in the part of the array where I expected the 100 Hz peak.

I'm explaining this because I hope you see my error.

Jan 29, 2013 at 11:39 AM

4096 is the number of bytes, not the number of samples. To work out the bin sizes, N is the number of samples (i.e. complex numbers).

Jan 29, 2013 at 11:43 AM

I'm sorry, I don't quite get what you are trying to say. Can you explain it in a different way?

Jan 29, 2013 at 11:45 AM

Say you record 4096 bytes of audio. If it is 16-bit, that means you have 2048 samples. If it is stereo, you have 1024 sample pairs (and you should only pass one channel into the FFT). So your bin resolution is not 44100/4096 but 44100/1024 ≈ 43.1 Hz.
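
The arithmetic can be written down as a small sketch. The class and method names are invented for illustration, and the 44100 Hz sample rate is the one assumed in this thread.

```csharp
using System;

public static class FftBins
{
    // Samples per channel contained in a buffer of raw PCM bytes.
    public static int SamplesPerChannel(int byteCount, int bitsPerSample, int channels)
        => byteCount / (bitsPerSample / 8) / channels;

    // Width of one FFT bin in Hz for an FFT over fftSize samples.
    public static double BinWidthHz(int sampleRate, int fftSize)
        => (double)sampleRate / fftSize;

    // Index of the bin whose centre frequency is closest to frequencyHz.
    public static int NearestBin(double frequencyHz, int sampleRate, int fftSize)
        => (int)Math.Round(frequencyHz * fftSize / sampleRate);
}
```

With 4096 bytes of 16-bit stereo, `SamplesPerChannel(4096, 16, 2)` gives 1024 and `NearestBin(100, 44100, 1024)` gives 2. A 100 Hz tone sits between bin 2 (about 86 Hz) and bin 3 (about 129 Hz), so expect its energy to be spread over neighbouring bins rather than in one sharp peak.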

Feb 11, 2013 at 3:04 PM

Ah, I see.

I think I'm already doing something wrong when reading the bytestream from the mp3-file. Can you confirm that instead of byte[8] I have to use byte[16] as buffer since I'm reading 16 bit audio?

    using (Mp3FileReader mp3 = new Mp3FileReader(filename))
    {
        // Convert byte array to double array
        double[] real = new double[mp3.Length / 8];
        byte[] buffer = new byte[8];
        int read;
        int count = 0;

        while ((read = mp3.Read(buffer, 0, buffer.Length)) > 0)
        {
            real[count] = BitConverter.ToInt16(buffer, 0) / 32768.0;
            count++;
        }

        // Separate to mono
        double[] mono = new double[real.Length / 2];
        for (int i = 0; i < real.Length; i += 2)
        {
            mono[i / 2] = real[i];
        }
    }

Feb 12, 2013 at 3:12 PM

No, a byte is 8 bits. So your 8-byte buffer is actually going to have space for 4 samples (left, right, left, right), and your code only converts the first of them. So you will end up with only a quarter of the samples in the input file.
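
One way to keep every left-channel sample is sketched below, assuming the stream delivers 16-bit little-endian stereo PCM. It takes a plain Stream so it compiles without NAudio (Mp3FileReader derives from Stream, so it can be passed in directly); `PcmReader` and `ReadLeftChannel` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PcmReader
{
    // Hypothetical helper: reads 16-bit stereo PCM from a stream and returns
    // the left channel as doubles scaled to roughly [-1, 1).
    // Assumes a little-endian machine, which matches the PCM byte order.
    public static double[] ReadLeftChannel(Stream pcm)
    {
        const int bytesPerFrame = 4; // 2 bytes left + 2 bytes right
        var left = new List<double>();
        byte[] buffer = new byte[bytesPerFrame * 1024];

        int filled;
        while ((filled = Fill(pcm, buffer)) >= bytesPerFrame)
        {
            // Step one whole frame at a time; skip the right sample.
            for (int i = 0; i + bytesPerFrame <= filled; i += bytesPerFrame)
            {
                left.Add(BitConverter.ToInt16(buffer, i) / 32768.0);
            }
        }
        return left.ToArray();
    }

    // Reads until the buffer is full or the stream ends, because a single
    // Read call may return fewer bytes than requested.
    private static int Fill(Stream source, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = source.Read(buffer, total, buffer.Length - total);
            if (read == 0) break;
            total += read;
        }
        return total;
    }
}
```

The buffer size only has to be a multiple of the frame size (4 bytes here); 4096 bytes is an arbitrary choice.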

Feb 12, 2013 at 4:38 PM

Ah, thanks.
I should've seen that.

One more question though (sorry for all the asking).
After applying my FFT (http://www.lomont.org/Software/Misc/FFT/LomontFFT.html), a lot of the magnitude values are negative.
AFAIK they should only be positive.

Am I right on this?

Feb 13, 2013 at 3:50 PM

I am certainly not a DSP or FFT expert, but as I understand it, you calculate the magnitude of the FFT output by taking the square root of the sum of squares of the real and imaginary parts of the output, which will always be positive. The output is like coordinates, so you have to calculate phase and magnitude yourself.
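
A small sketch of that calculation, assuming the FFT leaves its result in the same interleaved (real, imaginary) layout as its input. `FftOutput`, `Magnitudes` and `Phase` are names made up for this example.

```csharp
using System;

public static class FftOutput
{
    // Magnitude of each complex bin: sqrt(re^2 + im^2), never negative.
    public static double[] Magnitudes(double[] interleaved)
    {
        int bins = interleaved.Length / 2;
        var magnitudes = new double[bins];
        for (int k = 0; k < bins; k++)
        {
            double re = interleaved[2 * k];
            double im = interleaved[2 * k + 1];
            magnitudes[k] = Math.Sqrt(re * re + im * im);
        }
        return magnitudes;
    }

    // Phase of one complex bin in radians, in the range (-pi, pi].
    public static double Phase(double re, double im) => Math.Atan2(im, re);
}
```

Negative numbers in the raw output are expected: they are the real and imaginary coordinates, not magnitudes.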