Insights & Use Cases

December 2, 2025

How to remove or reduce background noise from audio for (stt) transcription

By

Kelsey Foster

,

Growth

December 2, 2025

How does context (like names spoken) influence automatic speaker labeling?

By

Kelsey Foster

,

Growth

November 24, 2025

Speaker identification and diarization with AssemblyAI

By

Kelsey Foster

,

Growth

November 24, 2025

AI medical transcription

By

Kelsey Foster

,

Growth

November 24, 2025

AI in customer service: Top use cases for 2026

By

Kelsey Foster

,

Growth

November 21, 2025

Why evals in voice AI are so hard (and how to fix them)

By

Ryan Seams

,

VP, Customer Solutions

November 20, 2025

Gemini 3 Pro vs GPT-5 vs Claude 4.5: Which model wins for audio workflows?

By

Meredith Rauch

,

Growth

November 12, 2025

Voice agents in healthcare: Automating phone interactions for scheduling, billing, and more

By

Kelsey Foster

,

Growth

October 29, 2025

Build a real-time medical transcription analysis app with AssemblyAI and LLM Gateway

By

Kelsey Foster

,

Growth

October 29, 2025

Beyond transcription: Combining speech-to-text with AI analysis

By

Kelsey Foster

,

Growth

October 22, 2025

How to transcribe (stt) audio with timestamps for captions with AssemblyAI

By

Kelsey Foster

,

Growth

October 22, 2025

Video transcription made simple: From segments to timestamps

By

Kelsey Foster

,

Growth

October 22, 2025

Large-scale audio transcription: Handling hours of content efficiently

By

Kelsey Foster

,

Growth

October 22, 2025

What is real-time speech to text?

By

Kelsey Foster

,

Growth

October 15, 2025

5 Deepgram alternatives in 2025

By

Kelsey Foster

,

Growth

October 15, 2025

5 Speechmatics alternatives in 2025

By

Kelsey Foster

,

Growth

October 15, 2025

5 Google Cloud Speech-to-Text alternatives in 2025

By

Kelsey Foster

,

Growth

October 15, 2025

5 Amazon Transcribe alternatives in 2025

By

Kelsey Foster

,

Growth

September 30, 2025

Best medical speech recognition software and APIs in 2025

By

Kelsey Foster

,

Growth

September 22, 2025

Voice agents take center stage: Highlights from the SF Voice Agent Hackathon

By

Devon Malloy

,

Staff Growth Manager

September 17, 2025

Top 8 open source STT options for voice applications in 2025

By

Kelsey Foster

,

Growth

September 17, 2025

Speech-to-text AI: A complete guide to modern speech recognition technology

By

Kelsey Foster

,

Growth

September 16, 2025

AI notetakers beyond transcription: How leading companies turn meetings into measurable business value

By

Kelsey Foster

,

Growth

November 18, 2025

Top 9 AI notetakers in 2026: Compare features, pricing, and accuracy

By

Kelsey Foster

,

Growth

October 23, 2025

Build voice AI apps with LLM Gateway

By

Kelsey Foster

,

Growth

August 28, 2025

How intelligent turn detection (endpointing) solves the biggest challenge in voice agent development

By

Martin Schweiger

,

Senior API Support Engineer

August 27, 2025

How accurate is speech-to-text in 2025?

By

Kelsey Foster

,

Growth

August 27, 2025

The complete guide to speaker diarization APIs and tools

By

Kelsey Foster

,

Growth

August 26, 2025

How does real-time agent assist work? An implementation guide

By

Kelsey Foster

,

Growth

August 20, 2025

The voice AI stack for building agents in 2025

By

Kelsey Foster

,

Growth

August 20, 2025

The conversation intelligence value machine: How AI transforms every customer interaction

By

Kelsey Foster

,

Growth

August 15, 2025

Build a call center analytics pipeline in Python with AssemblyAI

By

Kelsey Foster

,

Growth

August 14, 2025

How to perform speaker diarization in JavaScript

By

Kelsey Foster

,

Growth

August 14, 2025

The conversational AI evolution: How agentic systems are rewriting contact center operations

By

Kelsey Foster

,

Growth

August 11, 2025

Build and deploy real-time AI voice agents using LiveKit and AssemblyAI

By

Kelsey Foster

,

Growth

August 7, 2025

These 7 voice AI projects just blew us away

By

Meredith Rauch

,

Growth

August 7, 2025

Offline speech recognition with Whisper: Browser + Node.js implementations

By

Tema Bolshakov

,

Contributor

August 7, 2025

How to use Whisper API to transcribe audio in JavaScript

By

Tema Bolshakov

,

Contributor

August 11, 2025

How to build and deploy a voice agent using Pipecat and AssemblyAI

By

Kelsey Foster

,

Growth

July 21, 2025

How to choose the best speech-to-text API for voice agents

By

Kelsey Foster

,

Growth

July 14, 2025

How to build the lowest latency voice agent in Vapi: Achieving ~465ms end-to-end latency

By

Daniel Ince

,

Applied API Engineer

July 14, 2025

Top APIs and models for real-time speech recognition and transcription in 2025

By

Kelsey Foster

,

Growth

July 10, 2025

Conversational AI in healthcare: maturity model and 7 use cases

By

Jesse Sumrak

,

Featured writer

July 9, 2025

OpenAI Whisper for developers: Choosing between API, local, or server-side transcription

By

Tema Bolshakov

,

Contributor

November 25, 2025

Medical voice recognition: How AI solves terminology problems

By

,

July 4, 2025

29 questions to ask when building AI voice agents

By

Jesse Sumrak

,

Featured writer

June 20, 2025

6 best orchestration tools to build AI voice agents in 2025

By

Jesse Sumrak

,

Featured writer

June 30, 2025

Build your first AI voice agent: 3 step-by-step examples

By

Kelsey Foster

,

Growth

October 15, 2025

AI voice agents: what they are and how they work in 2025

By

Jesse Sumrak

,

Featured writer

May 4, 2025

AI call centers: How AI voice agents are transforming contact centers

By

Jesse Sumrak

,

Featured writer

April 28, 2025

How to build an MCP voice agent with OpenAI and LiveKit Agents

By

Juan Luis Ruiz-Tagle

,

Contributor

April 24, 2025

Transformative use cases of AI in contact centers

By

Kelsey Foster

,

Growth

April 22, 2025

Model Context Protocol (MCP) - What it is, how it works, and why it matters

By

Ryan O'Connor

,

Senior Developer Educator

July 18, 2025

AI in Sales Calls: Ways speech AI helps sales teams win more deals

By

Kelsey Foster

,

Growth

April 7, 2025

Conversation intelligence in contact centers

By

Jesse Sumrak

,

Featured writer

July 11, 2025

Build an AI Voice Agent with DeepSeek R1, AssemblyAI, and ElevenLabs

By

Smitha Kolan

,

Developer Educator

November 15, 2023

7 best practices for product teams to consider when building with AI

By

Kelsey Foster

,

Growth

April 7, 2025

8 best revenue intelligence platforms using AI in 2025

By

Jesse Sumrak

,

Featured writer

March 12, 2025

Biggest challenges in building AI voice agents (and how AssemblyAI & Vapi are solving them)

By

Smitha Kolan

,

Developer Educator

March 5, 2025

Monitor your SpeechAI app with OpenLIT

By

Ryan O'Connor

,

Senior Developer Educator

July 12, 2024

How to Create a Real-Time Language Translation Service with AssemblyAI and DeepL in JavaScript

By

,

March 2, 2022

Differentiable Programming - A Simple Introduction

By

Ryan O'Connor

,

Senior Developer Educator

February 28, 2025

Top 7 meeting intelligence platforms in 2025

By

Jesse Sumrak

,

Featured writer

October 28, 2025

How to summarize meetings with LLMs

By

Ryan O'Connor

,

Senior Developer Educator

September 4, 2025

Conversation Intelligence: The complete guide for 2025

By

,

February 21, 2025

Summarize meetings in 5 minutes with Python

By

Ryan O'Connor

,

Senior Developer Educator

February 21, 2023

Why every Fortune 500 business needs a chief AI officer

By

Dylan Fox

,

Founder, CEO

March 15, 2022

Built with AssemblyAI - YouTube Transcripts

By

Kelsey Foster

,

Growth

March 15, 2022

Transcribe Audio Files in an S3 Bucket with AssemblyAI

By

Ryan O'Connor

,

Senior Developer Educator

May 2, 2023

Everything you need to know about Generative AI

By

Ryan O'Connor

,

Senior Developer Educator

August 21, 2024

Decoding Strategies: How LLMs Choose The Next Word

By

Marco Ramponi

,

August 14, 2023

Customer Stories: Conformer-2 in Action

By

Kelsey Foster

,

Growth

July 8, 2024

Create Multi-Lingual Subtitles with AssemblyAI and DeepL

By

Aniket Bhattacharyea

,

September 22, 2023

What AI Music Generators Can Do (And How They Do It)

By

Marco Ramponi

,

December 1, 2025

Is Word Error Rate Useful?

By

Dylan Fox

,

Founder, CEO

July 9, 2025

How to convert voice to text in real time using JavaScript

By

Patrick Loeber

,

Senior Developer Advocate

November 11, 2024

Universal in Action: Transforming Conversational Data Across Industries

By

Ryan O'Connor

,

Senior Developer Educator

November 25, 2024

How to transcribe Zoom participant recordings (multichannel)

By

Ryan O'Connor

,

Senior Developer Educator

November 10, 2025

Python Speech-to-Text with Punctuation, Casing, and Formatting

By

Matt Makai

,

December 11, 2024

Top Speech AI projects and winners at 2024 AssemblyAI Hackathon

By

Whitney DeGraaf

,

Program Manager

November 7, 2024

The race to AI integration

By

Chelsea Weber

,

November 30, 2022

Stable Diffusion in Keras - A Simple Tutorial

By

Ryan O'Connor

,

Senior Developer Educator

August 26, 2025

Text Summarization for NLP: 5 Best APIs, AI Models, and AI Summarizers in 2025

By

Kelsey Foster

,

Growth

November 15, 2024

Talk to ChatGPT on a Phone Call

By

Artem Oppermann

,

Featured writer

October 3, 2024

AI-powered meeting company Supernormal launches customizable Voice Agents

By

Kelsey Foster

,

Growth

December 6, 2022

Stable Diffusion 1 vs 2 - What you need to know

By

Ryan O'Connor

,

Senior Developer Educator

September 10, 2024

How to perform Speaker Diarization in Python

By

Ryan O'Connor

,

Senior Developer Educator

October 16, 2025

Speech recognition in the browser using Web Speech API

By

Patrick Loeber

,

Senior Developer Advocate

July 1, 2021

Speaker Diarization - Speaker Labels for Mono Channel Files

By

Joe Zaghloul

,

August 22, 2023

RLHF vs RLAIF for language model alignment

By

Ryan O'Connor

,

Senior Developer Educator

September 24, 2021

Review - Text-Free Prosody-Aware Generative Spoken Language Modeling

By

Steven Hillis

,

Featured writer

September 26, 2023

Retrieval Augmented Generation on audio data with LangChain and Chroma

By

Ryan O'Connor

,

Senior Developer Educator

June 12, 2024

Redact Personally Identifiable Information (PII) from audio with Node.js

By

Niels Swimberghe

,

July 27, 2023

Recent developments in Generative AI for Audio

By

Marco Ramponi

,

April 23, 2024

Redact PII in Audio with Make and AssemblyAI

By

Niels Swimberghe

,

March 16, 2022

React Speech Recognition with React Hooks

By

Stefan Rosanitsch

,

Contributor

March 20, 2022

React Text to Speech - Simplified!

By

Stefan Rosanitsch

,

Contributor

July 20, 2021

Python Speech Recognition in 30 Lines of Code

By

Yujian Tang

,

Contributor

May 30, 2024

Node.js Speech-to-Text with Punctuation, Casing, and Formatting

By

Niels Swimberghe

,

February 25, 2025

Modern Generative AI for images

By

Ryan O'Connor

,

Senior Developer Educator