Multi-Channel Audio for the Raspberry Pi - Part 1
Now for something completely different...
I've been spending quite a bit of time lately writing audio applications on the Raspberry Pi using Juce. An RPi 4 with the full desktop Raspberry Pi OS, Juce and the Code::Blocks IDE is actually quite a nice platform for embedded audio development. Once compiled, applications can be moved to a more minimal Linux configuration, or even a Compute Module 4, for deployment. Not to take away from the WAV Trigger and Tsunami, but the RPi is way, way faster and has orders of magnitude more memory. Plus Ethernet and Bluetooth.
There are several very good, high quality HATs available for stereo audio in/out. However, if you want more than 2 channels of output, you likely have to go to a USB Audio Class 2 interface. While these devices are class compliant and need no additional drivers on Linux, interfaces with more than 2 outputs are typically designed for studio use and therefore come in their own enclosure with things like mic preamps, phantom power and lots of bells and whistles. They are expensive and impractical for embedded projects.
The issue is that the I2S port available on the RPi's expansion header (the interface used by all of the stereo HATs) only supports I2S stereo modes. At least that's what my research has shown. (If anyone knows this to be untrue, please let me know!) Multichannel audio codecs (those with more than 2 DAC and/or ADC channels) use a modified version of I2S known as TDM or DSP mode. This protocol uses the same signals as I2S, but instead of sending just 2 channels per frame, it can accommodate up to 16 sequential channels per frame. Tsunami uses an ADAU1328 codec with 8 DAC channels and 2 ADC channels. The TDM interface between the MCU and the codec transmits 8 "slots" per frame in both directions.
It occurred to me recently that there might be a way to convert stereo I2S into TDM on the fly. If the RPi sends pairs of channels via I2S at 4 times the actual sample rate, and the channel pairs are sequentially interleaved (1/2, 3/4, 5/6, 7/8), then the bit stream looks very close to the required TDM for 8 channels. If you can somehow divide the I2S frame clock by 4, then it actually is TDM.
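A quick way to convince yourself that the rates line up is to compute the bit clock for both formats. The sketch below is only arithmetic, and it assumes 32-bit slots and a 44.1 kHz sample rate; the slot width and the function name `bitClockHz` are illustrative assumptions, not something taken from the hardware.

```cpp
#include <cassert>
#include <cstdint>

// Bit clock (Hz) of a serial audio link:
// frames per second x slots per frame x bits per slot.
std::uint64_t bitClockHz(std::uint64_t frameRateHz,
                         unsigned slotsPerFrame,
                         unsigned bitsPerSlot)
{
    assert(slotsPerFrame > 0 && bitsPerSlot > 0);
    return frameRateHz * slotsPerFrame * bitsPerSlot;
}

// Assumption for illustration: 32-bit slots, 44.1 kHz audio.
const unsigned      kBitsPerSlot  = 32;
const std::uint64_t kSampleRateHz = 44100;

// Stereo I2S running at 4x the audio sample rate (2 slots per frame).
std::uint64_t i2sBitClockHz()
{
    return bitClockHz(4 * kSampleRateHz, 2, kBitsPerSlot);
}

// 8-slot TDM running at the audio sample rate.
std::uint64_t tdmBitClockHz()
{
    return bitClockHz(kSampleRateHz, 8, kBitsPerSlot);
}
```

Under those assumptions both functions return 11,289,600 Hz, so the two links carry exactly the same number of bits per second. The only difference is how often the frame clock toggles.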
I'm certainly not the first to have this thought, but the general wisdom is that, due to start-up issues, you can't know where to divide the frame clock. The divide-by-4 can begin on any of four I2S frames, so what the RPi thinks is the first channel is not always the first channel of the TDM frame. This results in random channel swapping each time you start up. Not very useful.
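The start-up problem is easy to see in a small model. The sketch below groups a stream of stereo frames into 8-slot TDM frames, with a `phase` parameter standing in for wherever the divided frame clock happens to start. The types and the function `groupIntoTdm` are hypothetical and exist only to illustrate the idea; real hardware does this at the bit level.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using StereoFrame = std::array<std::int32_t, 2>;  // one I2S frame: left, right
using TdmFrame    = std::array<std::int32_t, 8>;  // one TDM frame: 8 slots

// Concatenate every four consecutive stereo frames into one TDM frame.
// `phase` (0..3) is the stereo frame on which the divide-by-4 frame clock
// happens to start; incomplete groups at the end are dropped.
std::vector<TdmFrame> groupIntoTdm(const std::vector<StereoFrame>& i2s,
                                   std::size_t phase)
{
    assert(phase < 4);
    std::vector<TdmFrame> out;
    for (std::size_t i = phase; i + 4 <= i2s.size(); i += 4) {
        TdmFrame frame{};
        for (std::size_t pair = 0; pair < 4; ++pair) {
            frame[2 * pair]     = i2s[i + pair][0];
            frame[2 * pair + 1] = i2s[i + pair][1];
        }
        out.push_back(frame);
    }
    return out;
}
```

If the stream repeats the pairs (1,2), (3,4), (5,6), (7,8), then a phase of 0 produces TDM frames in the order 1..8, while a phase of 1 produces 3,4,5,6,7,8,1,2. That rotation is exactly the channel swapping described above.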
So... my idea was for a HAT that could be synchronized to the I2S stream each time the Linux ALSA sound device is opened and audio is started. This works via an I2C command to put the HAT into a mode where it can identify channel 1 after audio has started and remap the channels to the correct order for the outgoing TDM. Both interfaces - I2S from the RPi and TDM to the codec - have to use the same bit clock in order to remain completely synchronized. This requires that the HAT be the audio clock master, with the RPi I2S slaved to the HAT.
I decided to hack up a Tsunami and see if I could at least get the RPi to output 4 sequential I2S frames slaved to Tsunami at 176.4 kHz. I was able to modify an existing device tree overlay so that my Tsunami "HAT" was recognized as a sound device, and using Juce, I wrote a mixer that takes blocks of audio data and creates buffers of 8 interleaved channels to send to the ALSA device driver as if it were a large stereo buffer. With Tsunami driving BCLK and LRCK at 4 times 44.1 kHz, it worked just fine.
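The interleaving step in that mixer can be sketched roughly as follows. This is not the actual Juce code; it is a minimal standalone illustration that assumes 32-bit integer samples, and the names `interleaveForStereoDevice` and `stereoFrameCount` are made up for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kNumChannels = 8;

// Interleave 8 per-channel blocks into one buffer laid out as
// ch1, ch2, ..., ch8, ch1, ch2, ... A stereo device reads this same
// memory as consecutive (left, right) frames: (1,2), (3,4), (5,6), (7,8).
std::vector<std::int32_t> interleaveForStereoDevice(
    const std::array<std::vector<std::int32_t>, kNumChannels>& channels)
{
    const std::size_t numSamples = channels[0].size();
    for (const auto& ch : channels)
        assert(ch.size() == numSamples);  // all blocks must be the same length

    std::vector<std::int32_t> out(numSamples * kNumChannels);
    for (std::size_t n = 0; n < numSamples; ++n)
        for (std::size_t c = 0; c < kNumChannels; ++c)
            out[n * kNumChannels + c] = channels[c][n];
    return out;
}

// Number of stereo frames the device sees for a block of `numSamples`
// 8-channel samples: four stereo frames per sample period.
std::size_t stereoFrameCount(std::size_t numSamples)
{
    return numSamples * (kNumChannels / 2);
}
```

A block of 256 samples per channel therefore becomes 1024 stereo frames as far as the sound device is concerned, which is why the device has to run at 4 times the audio sample rate.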
Next, I was able to get Tsunami to output those 8 channels to the codec via TDM, which also worked. So far so good, albeit with the aforementioned random channel assignments on startup. Then I modified my Juce mixer on the RPi so that when it starts up, it forces all the channels except channel 1 to output 0 and at the same time sends a sequence of commands to Tsunami via I2C to tell it to "sync". Using this method, I'm now able to identify channel 1 and correctly format the outgoing TDM so that there is no channel swapping.
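The detection side of this "sync" step could look something like the sketch below. To be clear, this is a guess at the logic and not the Tsunami firmware: it assumes the application puts a non-zero signal on channel 1 while every other channel is silent, and the functions `findChannelOnePosition` and `tdmSlotFor` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using StereoFrame = std::array<std::int32_t, 2>;  // left, right

// While syncing, only channel 1 (the left slot of one frame in every group
// of four) should be non-zero. Returns the position 0..3 of that frame within
// the receiver's current grouping, or -1 if nothing was seen or the stream is
// inconsistent with "only channel 1 active".
int findChannelOnePosition(const std::vector<StereoFrame>& frames)
{
    int found = -1;
    for (std::size_t i = 0; i < frames.size(); ++i) {
        if (frames[i][1] != 0)
            return -1;                    // a right slot is never channel 1
        if (frames[i][0] != 0) {
            const int pos = static_cast<int>(i % 4);
            if (found != -1 && found != pos)
                return -1;                // more than one active left slot
            found = pos;
        }
    }
    return found;
}

// Outgoing TDM slot (0..7) for a sample received at `groupPos` (0..3) on
// `side` (0 = left, 1 = right), once channel 1 is known to arrive at
// `channelOnePos`.
std::size_t tdmSlotFor(std::size_t groupPos, std::size_t side,
                       std::size_t channelOnePos)
{
    assert(groupPos < 4 && side < 2 && channelOnePos < 4);
    return ((groupPos + 4 - channelOnePos) % 4) * 2 + side;
}
```

Once the position is known, the remapping is just a rotation. If channel 1 turns up at position 2, the left sample at position 2 goes to TDM slot 0, the right sample at position 3 goes to slot 3, and so on around the group.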
This was all very promising, but having taken my Tsunami hack about as far as I could, I decided it was time to try a real prototype board. So I have the following PCB on order, which has 8 unbalanced line-level outputs and 2 unbalanced inputs. I'm not 100% certain about getting the inputs working, since that involves converting TDM to I2S and I don't have the hardware resources to do this on the board itself, but I believe that since all the clocks are synchronized, the RPi will just interpret the incoming 8 channels as 4 stereo pairs. We'll see...
The next post will be about the software requirements for the Raspberry Pi side of things. I don't claim to be a Linux device driver or ALSA expert, so I'd love to get some knowledgeable feedback about how to structure things. At the moment, my device presents itself as a stereo sound device, so the application has to manage the interleaved ALSA buffer as well as send the I2C commands for channel synchronization. This is not difficult, but I'm assuming it's theoretically possible to build a driver that makes this actually look like an 8 channel device so that it plays nicely with existing apps.