I think your suspicion is correct. When you pass a SpeechSynthesisUtterance to speechSynthesis.speak(), the browser sends the synthesized audio straight to the system audio output, and the API provides no option to redirect that audio stream to a JavaScript handler.
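
To illustrate, here is a minimal TypeScript sketch of what script code can actually observe. The helper `speakWithLogging` and the `UtteranceLike` / `SynthLike` types are hypothetical names introduced only for this example; the event names are the ones the Web Speech API defines for an utterance. As far as I know, these events carry timing and text-position information, never audio samples.

```typescript
// Events that a SpeechSynthesisUtterance can fire (Web Speech API).
const UTTERANCE_EVENTS = [
  "start", "end", "error", "pause", "resume", "mark", "boundary",
] as const;

type UtteranceEvent = (typeof UTTERANCE_EVENTS)[number];

// Minimal structural types (assumptions for this sketch) so it also
// compiles outside a browser.
interface UtteranceLike {
  addEventListener(type: string, listener: (ev: unknown) => void): void;
}
interface SynthLike {
  speak(utterance: UtteranceLike): void;
}

// Hypothetical helper: logs every utterance event, then starts speaking.
function speakWithLogging(
  synth: SynthLike,
  utterance: UtteranceLike,
  log: (name: UtteranceEvent) => void,
): void {
  for (const name of UTTERANCE_EVENTS) {
    utterance.addEventListener(name, () => log(name));
  }
  // The audio goes to the output device; nothing is returned to the script.
  synth.speak(utterance);
}

// Browser usage; skipped where the API is unavailable (e.g. Node).
const g = globalThis as any;
if (g.speechSynthesis && g.SpeechSynthesisUtterance) {
  speakWithLogging(
    g.speechSynthesis,
    new g.SpeechSynthesisUtterance("Hello"),
    (name) => console.log("utterance event:", name),
  );
}
```

So the most you get from the utterance is a set of notifications about when speech starts, pauses, crosses word or sentence boundaries, and ends; there is no callback or property that hands you the audio itself.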