Jul 2024 • Full Stack • 2 min read
AssemblyAI Live Transcriber
Live transcription and highlights for bilingual stakeholder calls.
WebRTC • Edge Functions • Bilingual
TypeScript • Next.js • WebRTC • AssemblyAI
Outcomes
- End-to-end streaming latency of about 450 ms
- Automatic summaries ready within 2 minutes of call end
Problem
Distributed product teams needed live transcripts and summaries for bilingual (ES/EN) roadmap reviews. Off-the-shelf tools struggled with code-switching (speakers alternating between Spanish and English mid-conversation) and required transcripts to be exported by hand.
Approach
- Built a Next.js edge function that proxies AssemblyAI live transcription with auth token rotation.
- Created a WebRTC capture widget that streams microphone audio via MediaRecorder chunks over WebSockets.
- Implemented bilingual speaker diarization with language hints and automatic glossary injection.
- Added highlight and action-item extraction using a hybrid of AssemblyAI summarization and custom prompts run through the Vercel AI SDK.
- Stored sessions in Supabase with row-level security for invite-only access and auto-cleanup after 14 days.
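The token rotation mentioned in the first step can be illustrated with a small cache that hands out a short-lived token and fetches a replacement shortly before it expires. This is a minimal sketch, not the project's actual code: `RotatingTokenCache`, `TempToken`, `refreshMarginMs`, and the injected `fetchToken` function are all hypothetical names, and the call that actually requests a token from the transcription provider is deliberately left to the caller.

```typescript
// Hypothetical shape of a short-lived streaming token; field names are assumptions.
interface TempToken {
  value: string;
  expiresAtMs: number;
}

// Caller-supplied function that requests a fresh token from the provider.
type TokenFetcher = () => Promise<TempToken>;

class RotatingTokenCache {
  private current: TempToken | null = null;
  private inFlight: Promise<TempToken> | null = null;
  private readonly fetchToken: TokenFetcher;
  private readonly refreshMarginMs: number;
  private readonly now: () => number;

  constructor(
    fetchToken: TokenFetcher,
    refreshMarginMs: number = 30_000,
    now: () => number = () => Date.now(),
  ) {
    this.fetchToken = fetchToken;
    this.refreshMarginMs = refreshMarginMs;
    this.now = now;
  }

  // Returns a usable token, fetching a replacement only when the cached one
  // is missing or expires within the refresh margin.
  async get(): Promise<string> {
    const cached = this.current;
    if (cached !== null && cached.expiresAtMs - this.now() > this.refreshMarginMs) {
      return cached.value;
    }
    // Share a single request among concurrent callers.
    let pending = this.inFlight;
    if (pending === null) {
      pending = this.fetchToken().finally(() => {
        this.inFlight = null;
      });
      this.inFlight = pending;
    }
    const token = await pending;
    this.current = token;
    return token.value;
  }
}
```

In an edge function, a route handler could call `get()` on a shared instance before opening each upstream connection, so that clients never see the long-lived API key. Sharing the in-flight request keeps a burst of new sessions from triggering one token request per caller.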
Results
- End-to-end streaming latency around 450 ms, validated with synthetic tests.
- Meeting summaries and action items ready within 2 minutes of the call ending.
- Boosted asynchronous participation; PMs reuse transcripts in their Linear and Notion automations.
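For the latency figure above, a synthetic test needs some way to turn per-probe timings into a summary. The sketch below shows one possible approach, assuming each probe records when a known audio marker was sent and when its transcript text came back; `LatencySample`, `percentile`, and `summarizeLatency` are hypothetical names, and the nearest-rank percentile is an illustrative choice rather than the method used in the project.

```typescript
// One synthetic probe: when an audio marker was sent and when its transcript arrived.
interface LatencySample {
  sentAtMs: number;
  receivedAtMs: number;
}

interface LatencySummary {
  p50: number;
  p95: number;
  max: number;
}

// Nearest-rank percentile over an ascending-sorted array; p is in (0, 100].
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) {
    throw new Error("percentile requires at least one sample");
  }
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  const index = Math.min(sortedAsc.length, Math.max(1, rank)) - 1;
  return sortedAsc[index];
}

// Converts raw probe timings into median, 95th percentile, and worst case.
function summarizeLatency(samples: LatencySample[]): LatencySummary {
  const latencies = samples
    .map((s) => s.receivedAtMs - s.sentAtMs)
    .sort((a, b) => a - b);
  return {
    p50: percentile(latencies, 50),
    p95: percentile(latencies, 95),
    max: percentile(latencies, 100),
  };
}
```

Reporting the 95th percentile next to the median helps show whether a typical-case number hides occasional slow responses.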