Did you try doing what it said and padding the audio with at least 100 ms of silence (preferably more, around 1 second)? Also, how long are your files? Whisper doesn't work very well on shorter files.
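
In case it helps, here's a rough sketch of one way to do the padding using only Python's standard `wave` module. The function names `silence_bytes` and `pad_wav` are made up for this example, and I'm assuming an uncompressed PCM WAV file and padding on both ends (the message doesn't say which end it wants, so adjust as needed).

```python
import wave


def silence_bytes(sample_rate, channels, sample_width, seconds):
    """Build `seconds` of PCM silence as raw bytes (hypothetical helper)."""
    n_frames = int(round(sample_rate * seconds))
    # 8-bit WAV PCM is unsigned, so silence is 0x80; wider samples are signed, so 0x00.
    zero = b"\x80" if sample_width == 1 else b"\x00"
    return zero * (n_frames * channels * sample_width)


def pad_wav(src_path, dst_path, pad_seconds=1.0):
    """Copy a PCM WAV file, adding `pad_seconds` of silence before and after.

    Assumption: padding both ends is acceptable; drop one side if not.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    pad = silence_bytes(
        params.framerate, params.nchannels, params.sampwidth, pad_seconds
    )

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # The wave module corrects the frame count in the header on close.
        dst.writeframes(pad + frames + pad)
```

You'd call it like `pad_wav("input.wav", "padded.wav", pad_seconds=1.0)` (placeholder file names) and then feed the padded file to Whisper. If your audio isn't WAV, you'd need to convert it first or use a different tool for the padding.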