Concatenation of streams

Jun 30, 2009 at 8:49 AM

Hi, Yet Another Naive Sound Programmer here.

I am happily bumbling around getting things working on a little app (imagine it as an audio phrasebook) where I am wanting to capture audio and then store it in an SQLServer database as blobs. I have used SoundCapture to get hold of the input from a mic, and I have used NAudio to stream it, with a couple of hacks to get MemoryStreams to be the working form of my internal file handling.

What I want to do are a couple more things, and I thought I'd check that I am heading off in the right direction:

1. I want to concatenate the various sound snippets I have captured into a single wave file, so I presume I simply use the WaveStreams and the reads to read the sounds together (small amounts of padding are not critical, say less than 0.5 seconds) and add in silence, which I am guessing can be taken from reading a null stream or simply generating lots of zeros (but I am trying to be ignorant in my code about wave internals - it's not hard for me!). Essentially the task is to read several phrases and then create a single wave file which I can then turn into an mp3 for use on an iPod (I think I can sort that bit out with using soundcapture code which can convert wavs into mp3s - so how hard can that be!!!).

2. I want to cut and paste from existing files - I am assuming I will read in the existing mp3s into a wave and use the NAudio display, select and store the selection in a MemoryStream. I haven't really looked into that too closely, but your waveviewer is just that. Presumably, for a wave, I can pick up a start and end point in a file, view that (a bit of zoom and stuff would help as is typical) and with a byte count and start position I can create a WAV file - am I right in thinking I can steal the RIFF from the whole file and stick it on the front of the bytes, assuming I do something with block alignment of the data?

Anyway, not expecting you to write my whole app, just want some confirmation and clues that I am heading in the right direction.

 

Cheers, Spenny

Coordinator
Jun 30, 2009 at 11:39 AM

Hi Spenny,

Concatenation is nice and easy so long as all the things you are concatenating are in the same format. Simply make a wave file writer, read all the contents from your first WAV and write them into it. Then you can call WriteData with a zero-filled array to add some silence (for 16-bit PCM, zeros are silence). Then add your next file.

// targetFile, sourceFiles and waveFormat are assumed to be defined elsewhere;
// every source file must already be in waveFormat.
byte[] buffer = new byte[1024];
using (WaveFileWriter waveFileWriter = new WaveFileWriter(targetFile, waveFormat))
{
    foreach (string sourceFile in sourceFiles)
    {
        using (WaveFileReader reader = new WaveFileReader(sourceFile))
        {
            int read;
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                waveFileWriter.WriteData(buffer, 0, read);
            }
        }
        // Write some silence: one zeroed buffer is a very short pause,
        // so repeat the write for a longer one
        Array.Clear(buffer, 0, buffer.Length);
        waveFileWriter.WriteData(buffer, 0, buffer.Length);
    }
}

To select part of a file you just need to get the locations of the first and last samples to copy. As for the waveform display, you can make use of the existing waveform drawing code, but you would likely need to enhance it a bit to display the whole wav file and allow selection, so this part of your task will be a bit longer.

WaveFileWriter will do the RIFF header for you so there is no need to do it yourself.
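
For the selection step, the start and end times have to be turned into byte positions that fall on whole sample frames. Here is a minimal sketch of that arithmetic, assuming PCM data. `SelectionMath` and `TimeToAlignedOffset` are hypothetical names, not part of NAudio; the `bytesPerSecond` and `blockAlign` arguments would come from the `AverageBytesPerSecond` and `BlockAlign` properties of the reader's WaveFormat.

```csharp
using System;

// Hypothetical helper for illustration; not part of NAudio.
public static class SelectionMath
{
    // Converts a time offset into a byte offset, rounded down to a whole block
    // so that a copy never starts or ends in the middle of a sample frame.
    public static long TimeToAlignedOffset(TimeSpan time, int bytesPerSecond, int blockAlign)
    {
        // Integer arithmetic on ticks avoids floating-point rounding.
        long raw = time.Ticks * bytesPerSecond / TimeSpan.TicksPerSecond;
        return raw - (raw % blockAlign);
    }
}
```

With the two offsets in hand, you could set the reader's Position to the start offset and run the same read/write loop as above, stopping once (end - start) bytes have been written.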

Mark

Jun 30, 2009 at 1:24 PM
Mark

Thanks for that. With a good ol' Google I have pieced together the code for the DB and aside from tripping over using the buffer length of the memory stream (via the byte array) rather than the length of the stream, getting blobs in and out of a SQLServer CE (using the Image type) was a breeze. For short phrases which end up in a single block (100kB), the response time is really good.

Initially I am going to keep my code in WAV files, but eventually I expect I'll switch to mp3 within the DB to get the compression. Again, hopefully with short phrases, response time will be acceptable.

If I come up with something that is not too scruffy on the cut and paste, I'll forward it on so you can integrate it into the project.

For my purposes, I am expecting to have someone speak a few phrases, but what I am finding is that there seems to be some "bounce" somewhere on my hardware so there is a noticeable noise on starting and ending recording. I do hear this elsewhere on other kit, but it seems particularly bad on this PC. I am pondering whether to design in a little trim to the capture, or whether to have a little phrase trimmer. It makes the other function I intended to do, programmatic trimming of silence, a nuisance.

Thanks for the prompt response.


Spenny

Jul 9, 2009 at 7:07 PM

Happy naive programmer here. I have strung together all sorts of stuff to come up with a nifty application. As I had to string things together from a few places, I thought I'd gather together some pointers here.

The app uses memory streams internally as it is moving mainly WAV files around.

Input from a mic -> wave memory stream -> mp3 (soundcapture)

Hacked, not very nicely, the wavefilereader/writers to create a WaveMemoryReader from a straight copy of WaveFileReader. The only significant change was that the memory stream is not closed (for what I hope are obvious reasons!) - comment out the close in Dispose. The proper fix would be to make a base class and then have the slight differences for files and memory streams.

WaveMemoryWriter. Being lazy again, I simply copied the WaveFileWriter, altered the constructor slightly to remove the file name stuff, and then did not do a close in the Dispose.


Storing into a database record. Simple:

Extract from DB (SQL Server CE Image field):

// The BLOB is read into a byte array, which is then used to fill a MemoryStream.
// Note: as posted, the null check reads table "sounds" but the data comes from "responses".
if (ds.Tables["sounds"].Rows[c - 1]["prompt"] != DBNull.Value)
{
    byte[] bytePromptData = (byte[])ds.Tables["responses"].Rows[c - 1]["prompt"];
    promptstream = new MemoryStream();
    promptstream.Write(bytePromptData, 0, bytePromptData.Length);
    promptstream.Position = 0; // rewind so a reader starts at the beginning
}
else
{
    promptstream = null;
}

Input into DB

 

if (promptstream != null)
{
    // GetBuffer returns the whole underlying buffer, which can be longer than the
    // data written, so the parameter size is taken from the stream length instead.
    byte[] promptData = promptstream.GetBuffer();
    SqlCeParameter prompt_parameter = new SqlCeParameter(
        "@prompt_data", System.Data.SqlDbType.Image, (int)promptstream.Length);
    prompt_parameter.Value = promptData;
    sql.Parameters.Add(prompt_parameter);
}
else
{
    sql.Parameters.Add("@prompt_data", DBNull.Value);
}
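
The buffer-length trap mentioned earlier in the thread can be seen with plain .NET streams. This is a small sketch for illustration only; `BlobBytes` and `ExactBytes` are hypothetical names. It shows that `MemoryStream.ToArray()` returns exactly `Length` bytes, whereas `GetBuffer()` hands back the underlying array, which may include unused capacity.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration.
public static class BlobBytes
{
    // Returns a copy of exactly the bytes written to the stream,
    // regardless of how much spare capacity the stream has allocated.
    public static byte[] ExactBytes(MemoryStream stream)
    {
        return stream.ToArray();
    }

    public static void Main()
    {
        MemoryStream stream = new MemoryStream();
        stream.Write(new byte[100], 0, 100);

        byte[] raw = stream.GetBuffer();    // underlying buffer, possibly oversized
        byte[] exact = ExactBytes(stream);  // exactly stream.Length bytes

        Console.WriteLine(raw.Length >= exact.Length);     // True
        Console.WriteLine(exact.Length == stream.Length);  // True
    }
}
```

Using ToArray costs one extra copy, which should be negligible for short phrases; passing the stream length as the parameter size, as in the code above, avoids the copy.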


Importing various mp3's and waves:

1) Had to update the mp3reader code to cope with some mono stuff in the constructor:

switch (mp3Frame.ChannelMode)
{
    case ChannelMode.Mono:
        channels = 1;
        break;

    default:
        channels = 2;
        break;
}

and also created an alternative constructor for memory streams. Converted to WAV via a wobbly stack that seemed to do the trick:

WaveFormat standard_format = new WaveFormat(22050, 16, 1); // Decide on internal format
Mp3FileReader mp3Reader = new Mp3FileReader(FileInputDialog.FileName);
WaveStream pcmStream = WaveFormatConversionStream.CreatePcmStream(mp3Reader);
WaveStream bastream = new BlockAlignReductionStream(pcmStream);
diskfile = new WaveFormatConversionStream(standard_format, bastream);

then use it as a standard WaveStream.


Constructed the output from multiple memory streams (the code for this step appears after the mp3 writer code below).

 

Created an output mp3stream by doing this (I've included some example code on configuration as it is not well documented):

            WaveStream wm = new WaveMemoryReader(outputmemorystream);

            int sr = 22050;
            if (sample_rate == OptionsDialog.SampleRateEnum.sp44)
            {
                sr = 44100;
                WaveFormat high_format = new WaveFormat(sr, 16, 1);
                wm = new WaveFormatConversionStream(high_format, wm);
            }
            WaveLib.WaveFormat fmt = new WaveLib.WaveFormat(sr, 16, 1);
  
            Mp3WriterConfig cfg = new Mp3WriterConfig(fmt);
            switch (data_rate)
            {
                case OptionsDialog.DataRateEnum.kbs64:
                    {

                        cfg.Mp3Config.format.lhv1.dwBitrate = 64;
                        break;
                    }
                case OptionsDialog.DataRateEnum.kbs96:
                    {
                        cfg.Mp3Config.format.lhv1.dwBitrate = 96;
                        break;
                    }
                case OptionsDialog.DataRateEnum.kbs128:
                    {
                        cfg.Mp3Config.format.lhv1.dwBitrate = 128;
                        break;
                    }
            }

            cfg.Mp3Config.format.lhv1.bEnableVBR = vbr?1:0;
            cfg.Mp3Config.format.lhv1.nQuality = (ushort)Yeti.Lame.LAME_QUALITY_PRESET.LQP_VOICE;
            cfg.Mp3Config.format.lhv1.nVbrMethod = VBRMETHOD.VBR_METHOD_ABR;
            cfg.Mp3Config.format.lhv1.nMode = MpegMode.MONO;
   
            Mp3Writer writer = new Mp3Writer(new FileStream(default_directory + "\\" + outputfilename + ".mp3",
                                                FileMode.Create), cfg);
            try
            {
                byte[] buff = new byte[writer.OptimalBufferSize];
                int read = 0;
                long progress = 0;
                long length = wm.Length;
                StatusProgressBar1.Minimum = 0;
                StatusProgressBar1.Maximum = 100;
                
                while ((read = wm.Read(buff, 0, buff.Length)) > 0)
                {
                    writer.Write(buff, 0, read);
                    progress += read;
                    StatusProgressBar1.Value = (int)(progress * 100/length);
                }
            }
            finally
            {
                writer.Close();
                StatusProgressBar1.Value = 0;
            }

The code that builds the combined WAV memory stream from the individual sounds (the start of the method is not shown):

 

 

            MemoryStream player = new MemoryStream();
            using (WaveMemoryWriter wmw = new WaveMemoryWriter(player, standard_format))
            {
                int progress = 0;
                StatusProgressBar1.Minimum = 0;
                StatusProgressBar1.Maximum = 100;


                foreach (SoundListItem item in sound_list)
                {
                    progress++;
                    StatusProgressBar1.Value = (int)(100 * progress / sound_list.Count);
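                    MemoryStream p;
                    GetResponseRecord(item.id, out p);
                    if (p != null)
                    {
                        AddToWaveWriter(wmw, p);
                        int bps = BPS(wmw.WaveFormat);
                        // Note: r1 is not defined in this excerpt; it is presumably the
                        // stream just added, so that the pause scales with the phrase length.
                        WriteSilenceToWaveWriter(wmw, ResponsePause((int)(r1.Length / bps)));
                    }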
                }
                wmw.Close();
            }

            return player;

        }

private int AddToWaveWriter(WaveMemoryWriter wmw, MemoryStream stream)
        {
            long len = 0;
            if (stream != null)
            {
                byte[] buffer = new byte[1024];

                using (WaveMemoryReader reader = new WaveMemoryReader(stream))
                {
                    int read;

                    len = reader.Length;
                    while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        wmw.WriteData(buffer, 0, read);
                    }
                }

                return (int)len;
            }
            return 0;
        }

 

 

private void WriteSilenceToWaveWriter(WaveMemoryWriter wmw, int seconds)
        {
            int bytesps = BPS(wmw.WaveFormat);
            int bytescount = bytesps * seconds;  // Will come out block aligned
            byte[] buffer = new byte[1024];      // new arrays are already zero-filled
            while (bytescount > 0)
            {
                int len = Math.Min(bytescount, buffer.Length);
                wmw.WriteData(buffer, 0, len);
                bytescount -= len;
            }
        }


I did hit a snag in using the WaveOffsetStream: it assumes that you want something of a particular length, so if you use it in a copy-bytes loop (say, as in AddToWaveWriter above), the memory stream grows by lots of gigabytes before it crashes!
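
One way to guard a copy loop against a source that keeps returning data is to cap the number of bytes copied. The following is only a sketch using plain System.IO streams; `BoundedCopy` and `CopyAtMost` are hypothetical names, and the cap has to come from a length you trust (for instance, the length of the original phrase). For a wave writer, the `Write` call would be swapped for `WriteData`.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of NAudio.
public static class BoundedCopy
{
    // Copies at most maxBytes from source to destination and returns the
    // number of bytes actually copied. Stops early if the source runs out.
    public static long CopyAtMost(Stream source, Stream destination, long maxBytes)
    {
        byte[] buffer = new byte[1024];
        long copied = 0;
        while (copied < maxBytes)
        {
            int toRead = (int)Math.Min(buffer.Length, maxBytes - copied);
            int read = source.Read(buffer, 0, toRead);
            if (read <= 0)
            {
                break; // source exhausted before the cap was reached
            }
            destination.Write(buffer, 0, read);
            copied += read;
        }
        return copied;
    }
}
```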

 

Spenny

Aug 19, 2010 at 2:17 PM

Hi, I have the same problem.. see this thread plz

http://naudio.codeplex.com/Thread/View.aspx?ThreadId=79028

Aug 20, 2010 at 11:54 PM

> For my purposes, I am expecting to have someone speak a few phrases, but what I am finding is that there seems to be some "bounce" somewhere on my hardware so there is a noticeable noise on starting and ending recording.

It seems like your recording doesn't start and end at a zero crossing. Usually these kinds of problems are addressed with a noise gate. Alternatively, you can implement a short fade-in and fade-out to get rid of the clicks.
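
The fade suggestion could look something like the sketch below, which assumes 16-bit little-endian mono PCM held in a byte array (the internal format used earlier in the thread). `FadeSketch` and `ApplyFades` are hypothetical names, and a linear ramp is just one simple choice of fade curve.

```csharp
using System;

// Hypothetical helper for illustration; assumes 16-bit little-endian mono PCM.
public static class FadeSketch
{
    // Applies a linear fade-in at the start and a linear fade-out at the end,
    // modifying the buffer in place. fadeSamples is the ramp length in samples.
    public static void ApplyFades(byte[] pcm, int fadeSamples)
    {
        int totalSamples = pcm.Length / 2;
        // Keep the two ramps from overlapping on very short clips.
        fadeSamples = Math.Min(fadeSamples, totalSamples / 2);
        for (int i = 0; i < fadeSamples; i++)
        {
            double gain = (double)i / fadeSamples;   // 0.0 at the clip edge, rising inward
            Scale(pcm, i, gain);                     // fade in
            Scale(pcm, totalSamples - 1 - i, gain);  // fade out
        }
    }

    // Multiplies one 16-bit sample by gain (0.0 to 1.0).
    private static void Scale(byte[] pcm, int sampleIndex, double gain)
    {
        int offset = sampleIndex * 2;
        short sample = (short)(pcm[offset] | (pcm[offset + 1] << 8));
        short scaled = (short)(sample * gain);
        pcm[offset] = (byte)(scaled & 0xFF);
        pcm[offset + 1] = (byte)((scaled >> 8) & 0xFF);
    }
}
```

At 22050 Hz, a ramp of roughly 110 samples is about 5 ms, which is a reasonable starting point to experiment with; the right length depends on how long the click at the start and end of the recording lasts.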