Short answer: you can’t tap other participants’ audio from the Teams client. Microsoft only supports two paths:
- After-the-meeting transcripts (official API):
  - If transcription was turned on during the meeting, you can fetch the transcript via Microsoft Graph.
  - Endpoints (delegated or application permission, e.g., OnlineMeetingTranscript.Read.All; with application permissions, replace /me with /users/{userId}):
    - Find the meeting: GET /me/onlineMeetings?$filter=JoinWebUrl eq '{joinUrl}'
    - List transcripts: GET /me/onlineMeetings/{onlineMeetingId}/transcripts
    - Download content as VTT: GET /me/onlineMeetings/{onlineMeetingId}/transcripts/{id}/content?$format=text/vtt
  - This is not real time; the transcript is available only after Teams finishes processing it.
- Real-time audio (build a meeting bot):
  - Create a Teams calling/meeting bot (Microsoft Graph Communications SDK / Application-hosted media).
  - The bot joins the meeting as a participant and receives raw audio (PCM) over the AudioSocket.
  - Pipe that audio to a Speech-to-Text service (e.g., Azure Speech) for live captions.
  - Docs to search: “Microsoft Graph real-time media calls and meetings,” “Register a calling bot for Teams,” “AudioSocket.”
  - There is no public API to subscribe to Teams’ own live captions stream; you must run STT on the media your bot receives.
  - Ensure your tenant policies allow bots to join meetings and inform participants for compliance.
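
For the after-the-meeting path, here is a rough Python sketch of building the Graph request URLs and parsing the downloaded VTT. It uses the /users/{userId} form of the endpoints, which is what you would call with application permissions. The helper names (`find_meeting_url`, `transcript_content_url`, `graph_get`, `parse_vtt`) are my own, token acquisition is left out, and the parser assumes the `<v Speaker>text</v>` cue style that Teams transcripts typically use, so treat it as a starting point rather than a complete client.

```python
import re
import urllib.request
from urllib.parse import quote

GRAPH = "https://graph.microsoft.com/v1.0"


def find_meeting_url(user_id, join_url):
    """URL for looking up an online meeting by its join link."""
    # OData string literals escape a single quote by doubling it.
    literal = join_url.replace("'", "''")
    flt = quote(f"JoinWebUrl eq '{literal}'", safe="")
    return f"{GRAPH}/users/{user_id}/onlineMeetings?$filter={flt}"


def transcript_content_url(user_id, meeting_id, transcript_id):
    """URL for downloading one transcript as WebVTT."""
    return (
        f"{GRAPH}/users/{user_id}/onlineMeetings/{meeting_id}"
        f"/transcripts/{transcript_id}/content?$format=text/vtt"
    )


def graph_get(url, token):
    """Hypothetical helper: GET a Graph URL with a bearer token you already hold."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def parse_vtt(vtt_text):
    """Return (start, end, speaker, text) tuples from a WebVTT transcript."""
    cues = []
    normalized = vtt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        # The timing line is the one containing "-->"; skip header/note blocks.
        idx = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if idx is None:
            continue
        start, end = [part.strip() for part in lines[idx].split("-->")]
        end = end.split()[0]  # drop any cue settings after the end time
        body = " ".join(lines[idx + 1:]).strip()
        match = re.match(r"<v ([^>]+)>(.*?)(?:</v>)?$", body)
        speaker, text = (match.group(1), match.group(2)) if match else (None, body)
        cues.append((start, end, speaker, text.strip()))
    return cues
```

Typical use would be `parse_vtt(graph_get(transcript_content_url(user_id, meeting_id, transcript_id), token))`, where the IDs come from the "find the meeting" and "list transcripts" calls.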
Notes:
- Azure Communication Services can interop with Teams and handle recordings; whether your app can get real-time media depends on the scenario and feature, so check the current ACS docs.
If you need a WebSocket fan-out for live transcript chunks to your UI, our ChatterBox meeting bot API shows a practical pattern for streaming to clients. Disclosure: I work on ChatterBox (https://chatter-box.io/).
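
To make the fan-out idea concrete, here is a generic in-process sketch using only `asyncio`. It is not the ChatterBox API or any vendor's implementation; `TranscriptHub` and its methods are hypothetical names. Each connected client gets its own bounded queue, and a slow client loses its oldest chunks instead of stalling everyone else. A WebSocket handler would call `subscribe()` on connect, forward items from the queue to the socket, and call `unsubscribe()` on disconnect.

```python
import asyncio


class TranscriptHub:
    """Fan out transcript chunks to many subscribers (hypothetical helper)."""

    def __init__(self, max_pending=100):
        self._subscribers = set()
        self._max_pending = max_pending

    def subscribe(self):
        """Register a client and return the queue it should read from."""
        queue = asyncio.Queue(maxsize=self._max_pending)
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue):
        """Stop delivering chunks to this client's queue."""
        self._subscribers.discard(queue)

    def publish(self, chunk):
        """Deliver a chunk to every subscriber without blocking."""
        for queue in list(self._subscribers):
            if queue.full():
                queue.get_nowait()  # drop the oldest chunk for a slow client
            queue.put_nowait(chunk)


async def demo():
    hub = TranscriptHub(max_pending=2)
    client = hub.subscribe()
    for chunk in ["first", "second", "third"]:
        hub.publish(chunk)
    # Only the two most recent chunks remain for this client.
    return [client.get_nowait(), client.get_nowait()]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Dropping the oldest chunk is one possible policy; for captions it is usually acceptable because stale text has little value, but you could disconnect slow clients instead.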