When you pass `nil` to `AVAssetReaderAudioMixOutput`, like this:

```objc
self.audioOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:audioTracks audioSettings:nil];
```
You're telling AVFoundation:
“Just give me the original audio format, exactly as stored in the asset. No conversion. I’ll handle it myself.”
When you're working with spatial audio, here's what happens:
The source audio is stored in a complex multichannel format — not just stereo.
It might be:

- 4 channels (from mic arrays)
- Ambisonic B-format
- a custom layout (like mic A + mic B + directional data)
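If you're not sure what the source actually contains, you can inspect the track's native format descriptions before choosing reader settings. A minimal sketch, assuming an `AVAsset` named `asset`:

```objc
#import <AVFoundation/AVFoundation.h>

// Print the native channel count and sample rate of the first audio track
// (`asset` is an assumed variable from your own setup code).
AVAssetTrack *track = [[asset tracksWithMediaType:AVMediaTypeAudio] firstObject];
for (id desc in track.formatDescriptions) {
    CMAudioFormatDescriptionRef fmt = (__bridge CMAudioFormatDescriptionRef)desc;
    const AudioStreamBasicDescription *asbd =
        CMAudioFormatDescriptionGetStreamBasicDescription(fmt);
    if (asbd != NULL) {
        // Spatial captures often report something like 4 channels at 48000 Hz.
        NSLog(@"Native format: %u channels at %.0f Hz",
              (unsigned)asbd->mChannelsPerFrame, asbd->mSampleRate);
    }
}
```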
When you pass `nil` settings to the reader, AVFoundation says:
“Alright, here's your raw multichannel format (e.g. 4ch at 48kHz). Have fun!”
But your writer input is expecting:
```objc
AVNumberOfChannelsKey: @1,  // or @2
AVSampleRateKey: @44100
```
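For context, a typical writer input in this kind of pipeline looks something like the sketch below; the exact AAC settings here are assumptions rather than something taken from your code:

```objc
// A sketch of a stereo AAC writer input (all settings are assumptions).
NSDictionary *writerSettings = @{
    AVFormatIDKey: @(kAudioFormatMPEG4AAC),
    AVSampleRateKey: @(44100),
    AVNumberOfChannelsKey: @(2),
    AVEncoderBitRateKey: @(128000)
};
AVAssetWriterInput *input =
    [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio
                                       outputSettings:writerSettings];
input.expectsMediaDataInRealTime = NO;
```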
So when you do:
```objc
[input appendSampleBuffer:sampleBuffer];
```
It fails with:
-11800 (AVErrorUnknown, "the operation could not be completed") / -12780 (format mismatch)
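You can confirm which side is unhappy by checking both objects' status after a failed append. A minimal sketch, assuming `reader` and `writer` are your `AVAssetReader` and `AVAssetWriter`:

```objc
// Surface the underlying error when an append fails
// (`reader`, `writer`, `input`, and `sampleBuffer` are assumed names).
if (![input appendSampleBuffer:sampleBuffer]) {
    if (writer.status == AVAssetWriterStatusFailed) {
        NSLog(@"Writer failed: %@", writer.error);  // the -11800 typically shows up here
    }
    if (reader.status == AVAssetReaderStatusFailed) {
        NSLog(@"Reader failed: %@", reader.error);
    }
}
```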
Because the reader is handing the writer raw multichannel spatial buffers, while the writer input was configured to encode mono or stereo. The formats simply don't match.

The fix: provide explicit settings for the reader (e.g. downmix to 2-channel PCM), like this:
```objc
NSDictionary *audioReaderSettings = @{
    AVFormatIDKey: @(kAudioFormatLinearPCM),
    AVSampleRateKey: @(44100),
    AVNumberOfChannelsKey: @(2),
    AVLinearPCMBitDepthKey: @(16),
    AVLinearPCMIsFloatKey: @(NO),
    AVLinearPCMIsBigEndianKey: @(NO),
    AVLinearPCMIsNonInterleaved: @(NO)
};

self.audioOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:audioTracks audioSettings:audioReaderSettings];
```
Then AVFoundation knows:
“Ah, okay, I’ll decode and downmix this spatial audio into regular stereo for you.”
Now the writer is happy because it gets standard 2-channel PCM and can encode it to AAC smoothly.
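Putting it all together, the read/write loop itself doesn't change; only the reader settings do. A condensed sketch of the audio pass, with error handling trimmed and with `reader`, `writer`, and `input` assumed from the earlier snippets:

```objc
// Condensed audio pass: the reader decodes and downmixes to stereo PCM,
// the writer encodes it to AAC (names follow the assumed snippets above).
[reader addOutput:self.audioOutput];
[writer addInput:input];

[reader startReading];
[writer startWriting];
[writer startSessionAtSourceTime:kCMTimeZero];

dispatch_queue_t queue = dispatch_queue_create("audio.pass", DISPATCH_QUEUE_SERIAL);
[input requestMediaDataWhenReadyOnQueue:queue usingBlock:^{
    while (input.isReadyForMoreMediaData) {
        CMSampleBufferRef buffer = [self.audioOutput copyNextSampleBuffer];
        if (buffer == NULL) {
            // End of stream (or a reader failure): close out the writer.
            [input markAsFinished];
            [writer finishWritingWithCompletionHandler:^{ /* done */ }];
            break;
        }
        [input appendSampleBuffer:buffer];  // now receives standard 2-channel PCM
        CFRelease(buffer);
    }
}];
```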