It's probably best to use only one of them, depending on what exactly you're trying to achieve. Most platforms don't support audio_url and voice_id together: they assume you're either using TTS (voice_id) or playing a file (audio_url).
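
If it helps, here is a rough sketch of how you could guard against sending both fields before the request goes out. I don't know which platform you're on, so `build_payload` and the `text` field are hypothetical names made up for illustration; only `voice_id` and `audio_url` come from your question. Adjust it to whatever your API actually expects.

```python
from typing import Optional


def build_payload(text: Optional[str] = None,
                  voice_id: Optional[str] = None,
                  audio_url: Optional[str] = None) -> dict:
    """Build a request body that uses exactly one of voice_id / audio_url.

    Hypothetical helper: the "text" key is an assumption, not a known API field.
    """
    # Reject the two invalid cases: both fields set, or neither set.
    if (voice_id is None) == (audio_url is None):
        raise ValueError("Set exactly one of voice_id or audio_url")

    if voice_id is not None:
        # TTS mode: assumed to need some text to synthesize.
        if not text:
            raise ValueError("TTS mode (voice_id) needs text to speak")
        return {"text": text, "voice_id": voice_id}

    # File playback mode: only the audio URL is sent.
    return {"audio_url": audio_url}
```

With this, `build_payload(text="Hello", voice_id="abc")` gives a TTS-style body, `build_payload(audio_url="https://example.com/a.mp3")` gives a playback-style body, and passing both raises a `ValueError` instead of sending a request the platform may reject.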