I’m also facing a similar issue in Unity. At first, I thought the audio I was sending to ElevenLabs might be corrupted or invalid. But after testing the API in a Node.js web app, I encountered the same problem.
I also tried using ElevenLabs’ WebSocket Explorer. Interestingly, I received a valid response from the first audio chunk, but after that, the connection got stuck in a continuous ping/pong loop with no agent response.
Additionally, I tested with a .wav file encoded to base64 using Base64.Guru. That worked and returned a response, but only for the first message after opening the wss connection. However, when I tried sending audio using ElevenLabs’ built-in input encoder, the connection was closed immediately. It seems the audio chunk might be in an incompatible or unsupported format, but I’m still trying to pinpoint the exact cause.
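In case it helps anyone narrow down the format question, here is a rough sketch of how the audio could be inspected before sending. It reads the WAV header (channels, sample rate, bit depth) and base64-encodes the raw PCM frames in small chunks, without the WAV header. The function names, the 250 ms chunk size, and the 16 kHz / 16-bit / mono test values are my own assumptions for illustration, not something confirmed by ElevenLabs; compare the printed values against whatever input format your agent is configured for.

```python
import base64
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Read the WAV header and report the properties a server is likely to care about."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "frames": wav.getnframes(),
        }


def pcm_chunks_base64(data: bytes, chunk_ms: int = 250) -> list[str]:
    """Split the PCM payload (header excluded) into base64 chunks of about chunk_ms each."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames_per_chunk = max(1, wav.getframerate() * chunk_ms // 1000)
        chunks = []
        while True:
            pcm = wav.readframes(frames_per_chunk)
            if not pcm:
                break
            chunks.append(base64.b64encode(pcm).decode("ascii"))
        return chunks


def make_test_wav(seconds: float = 1.0, sample_rate: int = 16000) -> bytes:
    """Build a silent 16-bit mono WAV in memory (hypothetical test input)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


if __name__ == "__main__":
    wav_bytes = make_test_wav()
    print(describe_wav(wav_bytes))
    print(len(pcm_chunks_base64(wav_bytes)), "chunks")
```

If the header shows something other than what the agent expects (for example stereo, or a different sample rate), that would be one plausible explanation for a response to the first chunk followed by silence, though I have not confirmed this is the cause.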
For comparison, I also tested this repo: https://github.com/mapluisch/OpenAI-Realtime-API-for-Unity. It's a real-time WebSocket implementation for OpenAI, and it follows a similar architecture. Interestingly, it works perfectly, which makes the ElevenLabs behavior even more puzzling.