Playback plays at 50% speed with AAC files

Jan 12 at 1:41 AM
Edited Jan 12 at 6:20 PM
I have some AAC files streamed from the internet, and playback runs at half speed for all 50 or so files. Yet the playback speed is normal in WMP, Winamp, and other audio apps.

MediaInfo shows: 47Kbps, 44.1KHz / 22.05 KHz, 2 channels, AAC (Version 2) (HE-AAC / LC)

In NAudio (using MediaFoundationReader via AudioFileReader) the waveformat is 22050Hz 2ch IeeeFloat 16bps. It should be 44100Hz.


Here is a small sample (direct 'slow' download, no sign up):
http://www.fileswap.com/dl/1lgYhPgX0/
Coordinator
Jan 13 at 11:22 AM
hi, that file seems to contain a media type change early on which NAudio is not handling correctly. I'd need to do a bit more investigation before saying what exactly the issue is.
Jan 13 at 12:18 PM
Ok thanks.
Jan 16 at 12:59 PM
Any update regarding this issue?

(no rush, just wondering)
Coordinator
Jan 16 at 1:12 PM
the trouble is, what do we do with files like this that change WaveFormat midway through? It is a real headache, as the output playback device will already have been initialized at the wrong sample rate.
Jan 16 at 1:20 PM
Is it possible that the headers in the frames are reporting the wrong sample rate?
Coordinator
Jan 16 at 1:25 PM
yes, we get this with MP3 files sometimes as well, junk frames at the start reporting a different sample rate. It's a real pain to work out which one to decide is the first "real" frame.
Jan 16 at 2:06 PM
I know what you mean, I recall seeing a function where it ignores all previous frame changes and ends up using the last frame as a reference.

I've been meaning to 'dissect' the AAC format since I need to split AAC files at a lossless frame level, I'll provide the code/class if successful.
Coordinator
Jan 16 at 2:23 PM
I'd be very interested in seeing what you come up with

One thing I have been contemplating is for a WaveFormatChanged event to be raised by MediaFoundationReader (I've actually added one in a recent commit). Then it would be up to the user whether they want to handle it and re-open the playback device at a new sample rate to keep playing, or even to resample the audio themselves.
Jan 17 at 1:56 AM
Edited Jan 19 at 1:01 PM
I have created an AacFileReader class based on Mp3FileReader and got it working. 22050 is consistent through the whole file.

Detect an MP4 container (looks for ftyp box). Throws Unsupported exception.
Reads ID3v1 and ID3v2 tags if present
Seeks the frame start signature (0xFFF)
Reads each frame header, fills AacFrame class, jumps to the next frame using the frame length
ACM decompressor works.
ADTS format works 100%
ADIF format header done, frame parsing not done yet.

Dumping a range of frames to an aac file works, but the audio is still half speed as expected.
NAudio doesn't seem to play aac files with an ID3v2 tag (HRESULT 0xC00D36B4), but Winamp plays them fine. Perhaps it will work if the ACM decompressor is used instead of MF.

Here are the first few frames of that aac file which plays at half speed.
Frame: Len=331,Samples=1024,BitRate=57019,SampleRate=22050,Channels=2
Frame: Len=298,Samples=1024,BitRate=51335,SampleRate=22050,Channels=2
Frame: Len=261,Samples=1024,BitRate=44961,SampleRate=22050,Channels=2
Frame: Len=289,Samples=1024,BitRate=49784,SampleRate=22050,Channels=2
Frame: Len=248,Samples=1024,BitRate=42721,SampleRate=22050,Channels=2
Frame: Len=342,Samples=1024,BitRate=58914,SampleRate=22050,Channels=2
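
For anyone following along, here is a minimal sketch of how the fields in the listing above can be pulled out of a 7-byte ADTS header. The AdtsHeader class and its member names are hypothetical (this is not the AacFrame class mentioned above), and it assumes one raw data block per frame, i.e. 1024 samples. The BitRate figure is simply Len * 8 * SampleRate / Samples, which reproduces the values shown in the listing.

```csharp
// Hypothetical stand-in for illustration; not the AacFrame class from this thread.
public sealed class AdtsHeader
{
    // Sample rates indexed by the 4-bit sampling_frequency_index field.
    private static readonly int[] SampleRates =
    {
        96000, 88200, 64000, 48000, 44100, 32000, 24000,
        22050, 16000, 12000, 11025, 8000, 7350
    };

    public const int HeaderLength = 7;        // without the optional 2-byte CRC
    public const int SamplesPerFrame = 1024;  // assumes one raw data block per frame

    public int SampleRate { get; private set; }
    public int Channels { get; private set; }     // 0 means "described in-band"
    public int FrameLength { get; private set; }  // header + payload, in bytes

    // Bit rate implied by this single frame (integer division, as in the listing).
    public int BitRate
    {
        get { return (int)((long)FrameLength * 8 * SampleRate / SamplesPerFrame); }
    }

    // Returns null when the bytes at 'offset' do not look like an ADTS header.
    public static AdtsHeader TryParse(byte[] data, int offset)
    {
        if (data == null || offset < 0 || data.Length - offset < HeaderLength) return null;

        // 12-bit syncword 0xFFF; the two layer bits must be zero.
        if (data[offset] != 0xFF || (data[offset + 1] & 0xF6) != 0xF0) return null;

        int rateIndex = (data[offset + 2] >> 2) & 0x0F;
        if (rateIndex >= SampleRates.Length) return null;

        // 3-bit channel_configuration straddles bytes 2 and 3.
        int channelConfig = ((data[offset + 2] & 0x01) << 2) | (data[offset + 3] >> 6);

        // 13-bit frame length straddles bytes 3, 4 and 5.
        int frameLength = ((data[offset + 3] & 0x03) << 11)
                        | (data[offset + 4] << 3)
                        | (data[offset + 5] >> 5);
        if (frameLength < HeaderLength) return null;

        return new AdtsHeader
        {
            SampleRate = SampleRates[rateIndex],
            Channels = channelConfig == 7 ? 8 : channelConfig,
            FrameLength = frameLength
        };
    }
}
```

Feeding it the bytes FF F1 5C 80 29 7F FC gives SampleRate=22050, Channels=2, FrameLength=331 and BitRate=57019, the same numbers as the first frame in the listing. Note the header only carries the core sample rate, so it says nothing about what the decoder will actually output.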
Jan 22 at 7:38 AM
Edited Jan 22 at 8:23 AM
btw I found a bug in MP3FileReader.cs, the ID3v1 check only reads 3 bytes instead of 128
I tried pushing it using TortoiseHG but it won't accept my CodePlex/Hotmail username and password
        // try for an ID3v1 tag as well
        mp3Stream.Position = mp3Stream.Length - 128;
        byte[] tag = new byte[128];
        mp3Stream.Read(tag, 0, 3); // <- Change this to mp3Stream.Read(tag, 0, 128);
        if (tag[0] == 'T' && tag[1] == 'A' && tag[2] == 'G')
        {
            id3v1Tag = tag;
            this.mp3DataLength -= 128;
        }
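
A standalone sketch of the same idea, for reference. The Id3v1 class and TryReadTag name are made up for illustration and are not part of NAudio. It reads all 128 bytes (looping, since Stream.Read is allowed to return fewer bytes than requested) before checking the "TAG" marker, so the returned array holds the complete tag.

```csharp
using System.IO;

// Hypothetical helper for illustration; not part of NAudio.
public static class Id3v1
{
    public const int TagLength = 128;

    // Returns the full 128-byte ID3v1 tag, or null if the stream does not end with one.
    // The stream position is restored before returning.
    public static byte[] TryReadTag(Stream stream)
    {
        if (stream == null || !stream.CanSeek || stream.Length < TagLength) return null;

        long savedPosition = stream.Position;
        try
        {
            stream.Position = stream.Length - TagLength;
            var tag = new byte[TagLength];
            int total = 0;
            while (total < TagLength)
            {
                int read = stream.Read(tag, total, TagLength - total);
                if (read == 0) return null; // unexpected end of stream
                total += read;
            }
            return (tag[0] == 'T' && tag[1] == 'A' && tag[2] == 'G') ? tag : null;
        }
        finally
        {
            stream.Position = savedPosition;
        }
    }
}
```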
Jan 22 at 10:30 AM
Edited Jan 22 at 12:48 PM
I found out why that halfspeed.aac file was playing at half speed in NAudio. The AAC file uses SBR (spectral band replication), which is used for low-bitrate AAC files. SBR halves the sample rate of a 44100 Hz source to the 22050 Hz reported in the ADTS header; the decoder decodes the 22050 Hz samples, processes the SBR data and outputs at 44100 Hz.

Since we are reading 22050 at the start, we are setting the wave out device to 22050, which is wrong. We need to set it to 44100 if SBR is detected. The AacFrame.cs class will contain an SBR flag, and the AacWaveFormat sample rate AND sample count will be doubled. I figured out how to detect SBR (and Parametric Stereo) by calculating various values in the ADTS header, then doubling the WaveFormat sample rate to 44100 and the sample count accordingly. I have confirmed it works using the ACM decoder, but we need to make adjustments to the MediaFoundation side. Initially parsing all AAC files through the AacADTSParser.cs class instead of MF should do the trick. The AacFileReader class defaults to ACM like Mp3FileReader, but passing in an MF decompressor should work fine in theory. I'll do some testing to be sure.

The ADIF AAC storage format is mostly done (The ADIF header is a complicated beast btw), the SBR information is in the ADIF header. Frame grabbing may be an issue, we may need to decompress the whole file at once and store the buffer since the frames have no header.

Output samples per frame are commonly 1024 but apparently they can be 960 in some situations. Something to look into, or at least be aware of. The sample count doubles to 2048 per frame if SBR is used, so the decompressor buffer has been doubled as well.
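
To make the arithmetic concrete, here is a small sketch of the adjustment. SbrFormat and its members are hypothetical names, not code from AacFileReader. The point is that the frame duration does not change (1024 / 22050 is the same as 2048 / 44100, roughly 46.4 ms), which is why a device opened at 22050 Hz plays the 44100 Hz decoder output at half speed.

```csharp
// Hypothetical helper for illustration; not part of NAudio or AacFileReader.
public static class SbrFormat
{
    // With SBR the decoder emits twice as many samples at twice the rate.
    public static void GetOutputFormat(int coreSampleRate, int coreSamplesPerFrame, bool sbrPresent,
                                       out int outputSampleRate, out int outputSamplesPerFrame)
    {
        int factor = sbrPresent ? 2 : 1;
        outputSampleRate = coreSampleRate * factor;
        outputSamplesPerFrame = coreSamplesPerFrame * factor;
    }

    // Duration of one frame in milliseconds.
    public static double FrameDurationMs(int sampleRate, int samplesPerFrame)
    {
        return 1000.0 * samplesPerFrame / sampleRate;
    }
}
```

The same doubling applies to the 960-sample variant, which would give 1920 output samples per frame.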

LATM and LOAS streaming formats seem a little trickier, I haven't gotten to those yet.

So far the ACM decompressor can play 100% of the 'odd' ADTS files I've collected, MF plays 70% or so. I'm using the ACM codec from http://www.free-codecs.com/download/AAC_ACM_Codec.htm
Coordinator
Jan 23 at 10:25 AM
wow, sounds like you are making some great progress. Your pull request had no commits in it, but I'll try to get your MP3 tag fix in anyway.

thanks
Jan 23 at 11:08 AM
I got TortoiseHG working so the MP3 fork should be up now.

Regarding AAC, none of the AACACM decoder, CoreAAC DS filter, and MediaFoundation MS codec support the ADIF format so I may just leave it out. It was the original format and after some googling no one seems to care about it anyway. I'll leave the ADIF header reader in anyway just in case. LATM and LOAS don't seem popular either so I may just support ADTS for now, which is probably 95% of AAC files out there.

MP3FileReader defaults to ACM but I think we should default to MediaFoundation for AAC since Windows 7/8/(Vista?) have an AAC codec built in. Win8 doesn't ship with an AAC ACM codec. However I'm not quite sure how to replace ACM with MF code. I'll have a look at MediaFoundationReader.cs
Coordinator
Jan 23 at 11:19 AM
You'd need to inherit from MediaFoundationTransform, in the same way that MediaFoundationResampler does.
Jan 23 at 12:16 PM
Edited Jan 23 at 12:43 PM
Ok thanks I'll look into it.

MFTransform wants IWaveProvider and WaveFormat, which somewhat confuses things, I'll work it out..
    public MfAacFrameDecompressor(WaveFormat sourceFormat, IWaveProvider sourceProvider, WaveFormat outputFormat)
        : base(sourceProvider, outputFormat)
Jan 23 at 1:22 PM
Edited Jan 24 at 7:25 AM
EDIT... ok it's not that bad after all after removing all the resampler code.
Coordinator
Jan 23 at 2:01 PM
yes MediaFoundation is painfully complicated. You don't need any of the stuff that is causing compile errors though. Your MFT won't have IWMResamplerProps or resampler quality. And you replace ResamplerMediaComObject with a new COM object that is the AAC decoder MFT. So you'll need its GUID (either hard-code it, or find it by enumerating the MFTs)
Jan 23 at 2:21 PM
Why MS complicate COM objects like this is beyond me (I wish COM would gracefully disappear to be honest) but anyway I'll give it a try.
Jan 24 at 7:29 AM
Is there any benefit to allowing the frame decompressor to be passed in to the constructor?

Mp3FileReader(string mp3FileName, FrameDecompressorBuilder frameDecompressorBuilder)

It complicates things in AacFileReader since MF needs to access the SourceProvider in the WaveStream interface. I've removed it for now.
Coordinator
Jan 24 at 8:08 AM
with MP3 file reader, it gets used in the constructor, so to enable users to override it, it had to be a constructor param. If you can avoid it with AacFileReader, that is fine.
Jan 24 at 8:15 AM
Edited Jan 24 at 8:16 AM
Ok no worries. I'll try to put it back in once I get it all working to keep things consistent. I may have it attempt ACM first and then fall back to MF to simplify things for the user, and perhaps add a 'bool PreferMediaFoundation' as an optional constructor parameter.

I found some AAC MF playback code in cpp which I'm basing my code on. Hopefully it's the correct way to do it.
http://stackoverflow.com/questions/16565292/imftransformprocessoutput-returns-mf-e-transform-stream-change-for-he-aac-p
Jan 24 at 1:22 PM
Not to pick nits, but I think the "PreferMediaFoundation" option would be better suited to a static property... My chi doesn't like it as a constructor parm. :-)

Seriously, I can't imagine someone wanting to be able to change it on an instance-by-instance basis, so a global setting makes a lot of sense. Just my US$0.02...
Jan 24 at 1:24 PM
Edited Jan 24 at 2:57 PM
Edit.. no worries just realised MFTransform is an abstract class
Jan 24 at 1:26 PM
Edited Jan 24 at 4:14 PM
ioctlLR wrote:
Not to pick nits, but I think the "PreferMediaFoundation" option would be better suited to a static property... My chi doesn't like it as a constructor parm. :-)

Seriously, I can't imagine someone wanting to be able to change it on an instance-by-instance basis, so a global setting makes a lot of sense. Just my US$0.02...
That's fine, I could do both I suppose.
.... Actually come to think of it a static Property is the best way to do it. I'll do that instead.
Jan 24 at 4:10 PM
Edited Jan 25 at 12:17 AM
EDIT nevermind, MediaFoundationTransform is designed too much around the resampler. Rather than modify MediaFoundationTransform I'll create a MediaFoundationDecoder class that is more in tune with the AcmStream class. This also keeps things a lot neater/uniform in AacFileReader.
Jan 25 at 4:33 PM
Edited Jan 26 at 3:23 AM
I submitted a patch to fix the half speed aac playback in MediaFoundationReader.cs. Basically I check for SBR in the extra codec data within the AAC WaveFormatEx that MFReader provides and double the sample rate, so that should be good to go. All the AAC ADTS files I have play normally now.

I got the MF Decoder working in the same way the ACM decoder works, just need to clean everything up now.

Basically the features AacFileReader will provide are:

Add ACM decoding if an AAC ACM codec is present (XP+Vista users would appreciate this feature).
Add Lossless Frame based cutting of AAC ADTS files.
Improve MediaFoundation compatibility by removing the ID Tags before sending the stream data to MF (MF throws an exception if an ID3v2 tag is present).
A general MediaFoundationDecoder class which should come in handy if MediaFoundationReader fails and the input data needs pre-processing (rip out tags, correct inconsistencies such as stream changes, corrupt frames etc..)
Jan 31 at 3:48 PM
Edited Feb 1 at 10:05 AM
MediaFoundationTransform has proven to be troublesome, it plays normal AAC files fine but if it runs into an oddly formatted AAC file it throws an exception. For example a mono 16000hz file fails to transform, but MFReader plays it perfectly.

So obviously MediaFoundationReader (Source Reader) is doing extra work to get these files playing as they are supposed to.

I tried to init the AAC file with MediaFoundationReader and grab its MediaType, then pass it into MFTransform.SetInputType, hoping it would accept the MediaType and work as expected..... no dice. It fails (MF.SetInputType accepts it, but when it decodes the data it says 'not accepting any more data'). I even tried replicating the exact bytes of the MFReader WaveFormat into my own AacWaveFormat as per the AAC WaveFormatEx specs, but MFTransform still won't decode it, even with the fail-safe WaveFormatExtraData with no byte[] array.

My main goal is to provide max compatibility so I have decided to dump the MFDecoder and ghost MediaFoundationReader within AacFileReader instead (Similar to what AudioFileReader.cs does).
Compatibility has improved but MediaFoundationReader intermittently crashes the whole app with certain AAC files, even with try..catches everywhere. The crash must be happening in unmanaged code. I can replicate this with the NAudio WPF Demo app (same binaries as from the CodePlex website, totally unmodified). So the bug appears to be in the NAudio library, Something to look into... EDIT: It seems to only crash when the debugger is running. On its own it is fine.

Anyway the ACM AAC decoder works perfectly, if there is no ACM AAC codec it falls back to MFReader. I'll submit the code once I have cleaned it up.
Jan 31 at 4:36 PM
Edited Jan 31 at 4:38 PM
btw is there a trick to getting Property changes recognised in the Constructor without an extra Init() function?

For example:

I have this property:
        public bool PreferMediaFoundation
        {
            get { return preferMediaFoundation; }
            set { preferMediaFoundation = value; }
        }   
The user calls it via:
_waveStream = new AacFileReader(sAudioFile) 
{ 
    PreferMediaFoundation = true 
};
... but "PreferMediaFoundation = true" is processed AFTER the constructor is done.

I need to take care of PreferMediaFoundation in the constructor itself if possible. Otherwise I would need to add an extra constructor variable.
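
For what it's worth, C# object initializers always run after the constructor has returned, so the constructor cannot see a value assigned that way. A minimal sketch of the two usual workarounds, a static default (as suggested earlier in this thread) and an optional constructor parameter, follows. DemoReader and every member on it are hypothetical and only stand in for AacFileReader; the "MF"/"ACM" strings stand in for whatever decoder setup the real constructor does.

```csharp
// Hypothetical stand-in for AacFileReader; illustration only.
public class DemoReader
{
    // Global default, consulted by every constructor call.
    public static bool PreferMediaFoundationDefault { get; set; }

    // Instance property: an object initializer sets this only AFTER the constructor ran.
    public bool PreferMediaFoundation { get; set; }

    // Records the decision the constructor actually made.
    public string DecoderChosenInConstructor { get; private set; }

    public DemoReader(string fileName, bool? preferMediaFoundation = null)
    {
        // Explicit argument wins; otherwise fall back to the static default.
        bool useMf = preferMediaFoundation ?? PreferMediaFoundationDefault;
        PreferMediaFoundation = useMf;
        DecoderChosenInConstructor = useMf ? "MF" : "ACM";
    }
}
```

With this shape, new DemoReader(path) { PreferMediaFoundation = true } still picks "ACM" inside the constructor when the static default is false, while new DemoReader(path, true), or setting DemoReader.PreferMediaFoundationDefault = true beforehand, picks "MF".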
Feb 11 at 12:36 PM
Edited Feb 11 at 1:31 PM
Well I finally got reliable SBR detection working. It actually plays more oddly formatted AAC files than Winamp now, including the LTP profile.

For those who are trying to do the same thing and googling to no end, I ended up having to read the actual raw AAC data and parse/jump through the Syntactic Elements, and enable SBR if the sample rate is <= 24000, channels = 2, and an ID_CPE element is used in the raw frame. (There are no complex table/trig lookups or anything, just a few bitstream reads)

Assuming SBR by checking if the sample rate is <= 24000 is wrong (a common occurrence in other people's code), there are non-SBR AAC files that are lower than 24000hz.
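
A heavily simplified sketch of the rule described above, for readers who want a starting point. SbrHeuristic and LooksLikeSbr are hypothetical names, and this version only inspects the first syntactic element of the raw data block (the 3-bit id_syn_ele right after the ADTS header) instead of walking through every element as the real code does, so treat it as an illustration of the checks rather than a reliable detector.

```csharp
// Hypothetical, simplified illustration of the rule described in this post.
public static class SbrHeuristic
{
    // id_syn_ele value for a channel pair element (ID_SCE is 0, ID_CPE is 1).
    private const int IdCpe = 1;

    // frame:        one complete ADTS frame, header included
    // headerLength: 7 without CRC, 9 when the header carries a CRC
    public static bool LooksLikeSbr(int sampleRate, int channels, byte[] frame, int headerLength)
    {
        // A low sample rate alone is not enough; stereo is required as well.
        if (sampleRate > 24000 || channels != 2) return false;
        if (frame == null || headerLength < 0 || frame.Length <= headerLength) return false;

        // The top three bits of the first payload byte identify the first element.
        int firstElementId = frame[headerLength] >> 5;
        return firstElementId == IdCpe;
    }
}
```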

Retrieving the SBR info from the ADTS header is impossible. Decoding a few frames using a decoder should work, but MediaFoundation Source Reader doesn't update GetCurrentMediaType after the first few ReadSample calls when it really should, like other decoders do. (A recent Mozilla patch does it this way but I doubt it works properly; I tried using the exact technique, no luck.) GetNativeMediaType also doesn't work, and reading the AudioSpecificConfig from GetCurrentMediaType is also useless as it returns a 2 byte config, not the expected 5 byte config with SBR info. As for ACM, I couldn't find a suitable function to retrieve the actual output samplerate.


Anyway one more task to go, remove the ID3v2 tag and give MF a raw ADTS stream so it doesn't reject the file...
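
A rough sketch of how the tag length can be worked out so the stream can be started at the first ADTS frame. Id3v2 and GetTagSize are hypothetical names, not NAudio API. An ID3v2 tag starts with a 10-byte header: "ID3", two version bytes, one flags byte, and a 4-byte "syncsafe" size (7 bits per byte) that excludes the header itself.

```csharp
// Hypothetical helper for illustration; not part of NAudio.
public static class Id3v2
{
    private const int HeaderLength = 10;

    // Returns the total number of bytes occupied by a leading ID3v2 tag,
    // or 0 when the data does not start with one.
    public static int GetTagSize(byte[] data)
    {
        if (data == null || data.Length < HeaderLength) return 0;
        if (data[0] != 'I' || data[1] != 'D' || data[2] != '3') return 0;

        // Syncsafe size bytes never have their top bit set.
        if ((data[6] | data[7] | data[8] | data[9]) >= 0x80) return 0;

        int size = (data[6] << 21) | (data[7] << 14) | (data[8] << 7) | data[9];

        // ID3v2.4 may append a 10-byte footer, signalled by flag bit 0x10.
        bool hasFooter = (data[5] & 0x10) != 0;
        return HeaderLength + size + (hasFooter ? HeaderLength : 0);
    }
}
```

The returned value is the offset at which the search for the 0xFFF frame signature would begin; everything before it can be dropped before the data is handed to the decoder.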

There is one AAC file that fails to play, for some reason MediaFoundation thinks it is a FLAC file and uses the mfFLAC decoder then crashes miserably, but that's beyond my control. I added a subtype check to handle it. ACM plays it fine though.
Coordinator
Feb 11 at 6:40 PM
sounds like you are making good progress. I haven't had a lot of time for NAudio stuff recently, but hopefully get round to looking at your AacFileReader soon. I'm also very interested in what you've done with media foundation from a stream. I've been trying a couple of different approaches to that, and it is one of my most wanted features for NAudio 1.8
Feb 12 at 11:10 AM
I basically mirrored the functionality of the MediaFoundation File and kept it as simple as possible. Much cleaner than using an Abstract class over the top.

I'm using the MF Stream in AacFileReader and found no problems with it at all.