79657026

Date: 2025-06-07 14:14:36
Score: 0.5

Instead of trying to make the partial results render the way final results might, I started thinking about how I could render them like closed captions so that I could stop worrying about timestamps. The problem was that my partial text was getting processed ahead of the audio, because the code was throwing buffers at the transcriber, so as-is I did not have a workable solution with that angle.

I briefly considered calculating buffer timing information and carefully handing buffers to the transcriber only as needed. That got me thinking about the various examples you can find online that tap the mic and render the transcription text. Sure enough, I could tap the node I'm playing my audio stream on, and it works reasonably well. It lags a bit because the processing takes time, so I might still try the "time the buffers" approach to see if I can get the timing right, but even if that doesn't work out, this tap-the-stream solution is reasonably decent.
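For anyone curious what the "time the buffers" idea might look like, here is a rough, platform-neutral sketch in Python. It is not the code from my project. The names `buffer_offsets` and `feed_paced` are made up for illustration, each buffer is assumed to be a plain list of mono samples, and `feed` stands in for whatever call hands a buffer to your transcriber.

```python
import time
from typing import Callable, Iterable, List, Sequence, Tuple


def buffer_offsets(frame_counts: Iterable[int], sample_rate: float) -> List[Tuple[float, float]]:
    """Return (start_seconds, duration_seconds) for each buffer, in playback order."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    offsets = []
    frames_so_far = 0  # accumulate whole frames so rounding error does not drift
    for frames in frame_counts:
        offsets.append((frames_so_far / sample_rate, frames / sample_rate))
        frames_so_far += frames
    return offsets


def feed_paced(
    buffers: Sequence[Sequence[float]],
    sample_rate: float,
    feed: Callable[[Sequence[float]], None],
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Hand each buffer to `feed` no earlier than the moment it would start playing.

    Assumption: playback starts when this function is called, and one sample
    equals one frame (mono audio).
    """
    offsets = buffer_offsets((len(b) for b in buffers), sample_rate)
    started = now()
    for buf, (start, _duration) in zip(buffers, offsets):
        wait = start - (now() - started)
        if wait > 0:
            sleep(wait)  # hold the buffer back until its playback position
        feed(buf)
```

The `now` and `sleep` parameters are only there so the pacing can be checked with a fake clock. This only keeps the transcriber from running ahead of the audio; it does nothing about the lag that comes from the processing time itself.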

Reasons:
  • Whitelisted phrase (-1): solution is
  • Long answer (-0.5):
  • No code block (0.5):
  • Self-answer (0.5):
  • Low reputation (1):
Posted by: colourmebrad