Date: 2025-07-20 19:01:40

A solution that does not require looking at the transcription strings and guessing whether the words actually represent a new piece of speech:

I have only tested this on macOS, not on iOS, so take it with a grain of salt, but I have found that bestTranscription is generally emptied/reset after a result whose speechRecognitionMetadata field is non-nil.

That means gathering the complete transcription is simply a matter of concatenating the transcriptions whenever speechRecognitionMetadata is present:

import Speech

var combinedResult = ""

func combineResults(result: SFSpeechRecognitionResult) {
    if result.speechRecognitionMetadata != nil {
        // Non-nil metadata marks the end of a segment; the recognizer
        // will reset bestTranscription after this, so commit it now.
        combinedResult += ". " + result.bestTranscription.formattedString
    } else {
        // I still want to print intermediate results; you might not want this.
        let intermediate = combinedResult + ". " + result.bestTranscription.formattedString
        print(intermediate)
    }
}
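The accumulation logic itself does not depend on the Speech framework, so it can be sketched and tested in isolation. Here `TranscriptionCombiner` and `ingest` are hypothetical names of mine; `text` stands in for `bestTranscription.formattedString` and `segmentEnded` for the `speechRecognitionMetadata != nil` check:

```swift
// Framework-free sketch of the accumulation idea; all names here are
// hypothetical, not part of the Speech API.
struct TranscriptionCombiner {
    private(set) var combined = ""

    // Returns the full transcript as it would read if this result were
    // final; commits it only when the segment has actually ended.
    mutating func ingest(text: String, segmentEnded: Bool) -> String {
        let candidate = combined.isEmpty ? text : combined + ". " + text
        if segmentEnded {
            // Mirrors the metadata check: the recognizer is about to
            // reset its transcription, so keep this segment.
            combined = candidate
        }
        return candidate
    }
}

var combiner = TranscriptionCombiner()
_ = combiner.ingest(text: "hello there", segmentEnded: false) // intermediate guess
_ = combiner.ingest(text: "hello world", segmentEnded: true)  // segment finished
let full = combiner.ingest(text: "second sentence", segmentEnded: true)
print(full) // "hello world. second sentence"
```

In the real app you would call this from the recognition task's result handler, passing the fields from each SFSpeechRecognitionResult.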
Posted by: Matt