Audio pattern recognition

Sep 30, 2010 at 2:07 PM

Hi there, and congratulations on this great library.

I'm writing software that has to recognize certain patterns in MP3 files.

I extract two samples, 2 seconds apart, from a one-hour MP3 (32 kbps, 32 kHz, single-channel CBR) and then search for those "patterns" in other MP3s. If I find both patterns exactly 2 seconds apart, I consider the pattern found.

My first idea was, since MP3 frames are compressed, to convert those samples to WAV. I'm doing it like this:


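// Requires: using System.IO; using NAudio.Wave;
public static byte[] convertMP3toPCM(byte[] frames)
{
    using (Stream mp3Stream = new MemoryStream(frames))
    using (Mp3FileReader reader = new Mp3FileReader(mp3Stream))
    using (WaveStream pcmStream = WaveFormatConversionStream.CreatePcmStream(reader))
    using (MemoryStream pcmData = new MemoryStream())
    {
        // Read about one second of PCM per call. Read may return fewer bytes
        // than requested, so loop until the stream is exhausted.
        byte[] buffer = new byte[pcmStream.WaveFormat.AverageBytesPerSecond];
        int bytesRead;
        while ((bytesRead = pcmStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            pcmData.Write(buffer, 0, bytesRead);
        }
        // Return only the bytes actually decoded, with no zero padding at the end.
        return pcmData.ToArray();
    }
}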

Then I search for those samples in the MP3s using an L2 norm. I've used the same approach in video recognition with good results.
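
As a minimal sketch of the search described above: it assumes the audio has already been decoded to 16-bit mono PCM and unpacked into short[] arrays. The class and method names (PatternSearch, SquaredL2, FindPair) and the threshold parameter are hypothetical and not part of NAudio. At 32 kHz, a 2-second gap corresponds to gapSamples = 64000.

```csharp
using System;

public static class PatternSearch
{
    // Squared L2 distance between the pattern and the signal window
    // that starts at the given sample offset.
    public static double SquaredL2(short[] signal, int offset, short[] pattern)
    {
        double sum = 0;
        for (int i = 0; i < pattern.Length; i++)
        {
            double d = signal[offset + i] - pattern[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the first offset where patternA matches within the threshold
    // and patternB matches exactly gapSamples later, or -1 if there is none.
    public static int FindPair(short[] signal, short[] patternA, short[] patternB,
                               int gapSamples, double threshold)
    {
        for (int offset = 0;
             offset + patternA.Length <= signal.Length &&
             offset + gapSamples + patternB.Length <= signal.Length;
             offset++)
        {
            if (SquaredL2(signal, offset, patternA) <= threshold &&
                SquaredL2(signal, offset + gapSamples, patternB) <= threshold)
            {
                return offset;
            }
        }
        return -1;
    }
}
```

This brute-force scan compares every window against the pattern, so its cost grows with the signal length times the pattern length. It also only reports a match when the decoded samples are numerically close at the same sample alignment, so the threshold would need tuning for audio that has been encoded separately.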

However, right now I can only find the pattern in the file the samples were taken from, not in others. Any suggestions? Do you think my idea is correct?

Oct 12, 2010 at 6:21 PM

Hi there,

This is a very difficult signal processing problem. IIRC there is a question about this on Stack Overflow that links to a technical paper.