Jan 28, 2013 at 4:15 PM
Edited Jan 28, 2013 at 4:17 PM

I'm reading an MP3 file and converting it to a WaveStream:
Mp3FileReader mp3 = new Mp3FileReader(open.FileName);
WaveStream pcm = WaveFormatConversionStream.CreatePcmStream(mp3);
Now I'd like to convert that stream to a double array.
I've read about WaveBuffer but have no idea how to use it for that purpose.
Can you help me out?



I've answered your question on StackOverflow. Also, unless you are using a really old NAudio, there is no need for the WaveFormatConversionStream. Simply reading out of the Mp3FileReader will give you PCM. It is most likely 16-bit stereo, so you'd want to split out the channels before you do an FFT.
WaveBuffer saves you from using BitConverter. Simply bind a WaveBuffer to your byte array containing PCM samples, and then you can access each sample using the ShortBuffer property. But your FFT probably wants float/double anyway, so you still need to divide each sample by 32,768 and put it into separate left and right floating point arrays.
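To make the split concrete, here is a minimal sketch, assuming the buffer holds 16-bit little-endian stereo PCM and the code runs on a little-endian machine. PcmSplitter and SplitStereo16 are made-up names for illustration, not part of NAudio. It uses BitConverter so that it has no dependencies; with a WaveBuffer bound to the array, reading from ShortBuffer would take the place of the BitConverter calls.

```csharp
using System;

static class PcmSplitter
{
    // Splits interleaved 16-bit stereo PCM (L, R, L, R, ...) into two
    // double arrays scaled to the range [-1, 1).
    // Assumes little-endian samples on a little-endian host.
    public static void SplitStereo16(byte[] pcm, int byteCount,
                                     out double[] left, out double[] right)
    {
        const int bytesPerFrame = 4; // 2 bytes left + 2 bytes right
        int frames = byteCount / bytesPerFrame;
        left = new double[frames];
        right = new double[frames];

        for (int i = 0; i < frames; i++)
        {
            int offset = i * bytesPerFrame;
            left[i] = BitConverter.ToInt16(pcm, offset) / 32768.0;
            right[i] = BitConverter.ToInt16(pcm, offset + 2) / 32768.0;
        }
    }
}
```

Pass the number of bytes actually returned by Read as byteCount, since the last read from a file is usually shorter than the buffer.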



Thanks for the quick reply, Mark.
How can I split the channels, and why do I need to do that anyway?
My FFT (the one by Lomont) asks for interleaved pairs of the real and imaginary parts.
Up until now I thought the real part was the left channel and the imaginary part was the right channel.



The left and right samples will be interleaved, so just throw every other one away if you only need to do the FFT of one channel. Audio is entirely real, so the imaginary component of the input signal will always be zero.
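As a rough sketch of that layout, assuming the samples have already been converted to doubles and the FFT takes interleaved (real, imaginary) pairs as described above: FftInput and LeftChannelToComplex are hypothetical names used only for this example.

```csharp
using System;

static class FftInput
{
    // Takes interleaved stereo samples (L, R, L, R, ...) and returns
    // interleaved complex pairs (re, im, re, im, ...) built from the
    // left channel only. The right channel is discarded.
    public static double[] LeftChannelToComplex(double[] interleavedStereo)
    {
        int frames = interleavedStereo.Length / 2;
        double[] complex = new double[frames * 2];

        for (int i = 0; i < frames; i++)
        {
            complex[2 * i] = interleavedStereo[2 * i]; // real part = left sample
            complex[2 * i + 1] = 0.0;                  // imaginary part is zero for audio
        }
        return complex;
    }
}
```

The output array has the same length as the stereo input, but every second slot now holds a zero imaginary part instead of a right-channel sample.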


Jan 28, 2013 at 5:20 PM
Edited Jan 28, 2013 at 5:20 PM

Ok, very well :)
Is there an easy way to check whether the FFT worked correctly?
Before applying the FFT I split the audio material into chunks of 4096 bytes (to keep some kind of time domain).
I created a sine wave at 100 Hz and, after applying the FFT, looked in the double array for the high magnitude, but I couldn't find it. To find the correct entry in the array I calculated the frequency band of every array entry (44100/4096 ≈ 10.8 Hz) and looked in the corresponding part of the array where I expected to find the high magnitude at 100 Hz.
I'm explaining this because I hope you can see my error.



4096 is the number of bytes, not the number of samples. To work out the bin sizes, N is the number of samples (i.e. complex numbers) passed to the FFT.



I'm sorry, I don't quite get what you are trying to say. Can you explain it in a different way?



Say you record 4096 bytes of audio. If it is 16-bit, that means you have 2048 samples. If it is stereo, you have 1024 sample pairs (and you should only pass one channel into the FFT). So your bin resolution is not 44100/4096 but 44100/1024.
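The same arithmetic, written out as a small sketch. FftBins and its method names are invented for this example, and it assumes the FFT size equals the number of samples per channel in the chunk.

```csharp
using System;

static class FftBins
{
    // Number of samples per channel contained in a block of PCM bytes.
    public static int SamplesPerChannel(int byteCount, int bitsPerSample, int channels)
    {
        int bytesPerFrame = (bitsPerSample / 8) * channels;
        return byteCount / bytesPerFrame;
    }

    // Width of one FFT bin in Hz.
    public static double BinWidthHz(int sampleRate, int fftSize)
    {
        return (double)sampleRate / fftSize;
    }

    // Index of the bin whose centre frequency is closest to frequencyHz.
    public static int NearestBin(double frequencyHz, int sampleRate, int fftSize)
    {
        return (int)Math.Round(frequencyHz * fftSize / sampleRate);
    }
}
```

For 4096 bytes of 16-bit stereo at 44100 Hz this gives 1024 samples per channel and a bin width of about 43.07 Hz, so a 100 Hz sine lands nearest to bin 2. Because 100 Hz falls between bins 2 and 3, expect the energy to be spread over both rather than a single sharp peak.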



Ah, I see.
I think I'm already doing something wrong when reading the byte stream from the MP3 file. Can you confirm that instead of byte[8] I have to use byte[16] as the buffer since I'm reading 16-bit audio?
using (Mp3FileReader mp3 = new Mp3FileReader(filename))
{
    // Convert bytes to double array
    double[] real = new double[mp3.Length / 8];
    byte[] buffer = new byte[8];
    int read;
    int count = 0;
    while ((read = mp3.Read(buffer, 0, buffer.Length)) > 0)
    {
        real[count] = BitConverter.ToInt16(buffer, 0) / 32768.0;
        count++;
    }
    // Separate to mono
    double[] mono = new double[real.Length / 2];
    for (int i = 0; i < real.Length; i += 2)
    {
        mono[i / 2] = real[i];
    }
}



No, a byte is 8 bits, so each 16-bit sample takes 2 bytes. Your 8-byte buffer therefore has space for 4 samples (left, right, left, right). Since you only convert the first sample of each buffer, your code will end up with only a quarter of the samples in the input file.
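One way the read loop could be restructured so that no samples are lost is sketched below. It is written against a plain Stream so it can be tried without an MP3 file; since a WaveStream is a Stream, an Mp3FileReader could be passed in the same way. It assumes 16-bit little-endian stereo PCM, and PcmReader and ReadLeftChannel are made-up names for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class PcmReader
{
    // Reads 16-bit stereo PCM from a stream and returns every left-channel
    // sample scaled to [-1, 1). Right-channel samples are skipped.
    public static double[] ReadLeftChannel(Stream pcm)
    {
        const int bytesPerFrame = 4;                    // L (2 bytes) + R (2 bytes)
        byte[] buffer = new byte[bytesPerFrame * 1024]; // room for 1024 frames
        var left = new List<double>();
        int filled = 0;
        int read;

        while ((read = pcm.Read(buffer, filled, buffer.Length - filled)) > 0)
        {
            filled += read;
            int usable = filled - (filled % bytesPerFrame);

            // Convert the first sample of every complete frame.
            for (int offset = 0; offset < usable; offset += bytesPerFrame)
            {
                left.Add(BitConverter.ToInt16(buffer, offset) / 32768.0);
            }

            // Keep any partial frame for the next read.
            int leftover = filled - usable;
            Array.Copy(buffer, usable, buffer, 0, leftover);
            filled = leftover;
        }
        return left.ToArray();
    }
}
```

Collecting into a List avoids having to guess the array size up front from the stream length.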






I am certainly not a DSP or FFT expert, but as I understand it, you calculate the magnitude of the FFT output by taking the square root of the sum of squares of the real and imaginary parts of the output, which is never negative. The output is like coordinates, so you have to calculate phase and magnitude yourself.
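A small sketch of that calculation, assuming the FFT leaves its result as interleaved (real, imaginary) pairs in a double array. FftOutput, Magnitudes and Phases are hypothetical names for illustration.

```csharp
using System;

static class FftOutput
{
    // Magnitude of each bin: sqrt(re^2 + im^2).
    public static double[] Magnitudes(double[] complex)
    {
        int bins = complex.Length / 2;
        double[] magnitudes = new double[bins];
        for (int k = 0; k < bins; k++)
        {
            double re = complex[2 * k];
            double im = complex[2 * k + 1];
            magnitudes[k] = Math.Sqrt(re * re + im * im);
        }
        return magnitudes;
    }

    // Phase of each bin in radians: atan2(im, re).
    public static double[] Phases(double[] complex)
    {
        int bins = complex.Length / 2;
        double[] phases = new double[bins];
        for (int k = 0; k < bins; k++)
        {
            phases[k] = Math.Atan2(complex[2 * k + 1], complex[2 * k]);
        }
        return phases;
    }
}
```

To look for a peak, scan the magnitudes array rather than the raw real or imaginary values, which can be negative.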

