EXTENSION

Real-Time Speech to Text

Create a better user experience and integrate with large language models (LLMs) using the most accurate cloud-based live transcription and subtitling.
A live video conference interface showing a woman presenting with real-time transcription and meeting notes displayed, including participant thumbnails and a transcription panel with highlighted key points.
Supported Platforms
RESTful API

Features

Cloud-based live transcription
Cloud-based transcription converts audio to text for active or selected hosts in real time. Text can be distributed as live captions to all participants in the channel.
LLM integration
Integrate speech to text with LLMs for further processing, without impacting RTC performance. Upload transcription text as .vtt files to LLMs like GPT to generate summaries, notes, and more.
Transcribing and labeling simultaneous speakers
Easily label who said what—even with up to 3 simultaneous speakers. Separate transcription for each host ensures accuracy and allows you to choose to transcribe for one specific host.
Captioning for cloud recordings
Transcribe audio to text on video or audio recordings to enable closed captions (CC) on playback or review important discussion items in the transcript.
Multi-language support
Real-time transcription supports all major languages and dialects, and each channel can support audio-to-text transcription for up to two languages simultaneously. 
Enterprise-grade security and compliance
Agora is ISO and SOC 2 certified and meets compliance standards for regional privacy laws and industry regulations, including GDPR, CCPA, and HIPAA. Live captions and transcription can be encrypted in the same way as encrypted RTC audio or video.
One real-time view for the metrics that matter the most
Use a single dashboard to monitor every active session around the world. Track the metrics that are most important to you, from concurrent users and channels to network latency and so much more.
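
As a rough illustration of the LLM integration feature above, the sketch below turns the text of a .vtt transcript into a plain-text prompt that could be sent to an LLM. It assumes the file follows the standard WebVTT layout (a WEBVTT header, then cues with a timing line containing "-->"). The function names and the prompt wording are hypothetical, and the call to the LLM itself is left out because it depends on the provider you choose.

```python
import re

def vtt_to_text(vtt: str) -> str:
    """Return the spoken text of a WebVTT document, one cue per line."""
    cues = []
    # Header, notes, and cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            # The timing line marks a cue; blocks without one are skipped.
            if "-->" in line:
                payload = " ".join(part.strip() for part in lines[i + 1:])
                # Drop inline markup such as <b> or <v Speaker>.
                payload = re.sub(r"<[^>]+>", "", payload).strip()
                if payload:
                    cues.append(payload)
                break
    return "\n".join(cues)

def build_summary_prompt(vtt: str) -> str:
    """Hypothetical prompt builder; adjust the instruction to your use case."""
    instruction = "Summarize the following transcript and list action items:"
    return instruction + "\n\n" + vtt_to_text(vtt)

SAMPLE_VTT = """WEBVTT

1
00:00:01.000 --> 00:00:04.000
Welcome to the <b>weekly</b> sync.

00:00:04.500 --> 00:00:07.000
Let's review
the action items.
"""

print(build_summary_prompt(SAMPLE_VTT))
```

Stripping timestamps and markup before sending the text keeps the prompt short. If the LLM needs timing information, pass the cue timing lines through instead of discarding them.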

Instantly transcribe speech to text for live audio and video

Agora’s Real-Time Speech to Text provides accurate live transcription and subtitling services at a low cost.
A live news session with an anchor announcing breaking news and demonstrating speech-to-text.

Reduce cost and increase efficiency

More efficient and cost-effective than traditional client-side live transcription, Agora’s solution uses advanced technology to remove silence, reduce Word Error Rate (WER), and distribute live captions to all participants in a channel.
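
For readers unfamiliar with the metric, Word Error Rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The sketch below computes it with a word-level edit distance. It illustrates the standard definition only and is not a description of how Agora measures accuracy. The function name and the lowercase, whitespace-split tokenization are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

# One substituted word out of five reference words.
print(word_error_rate("the score is now 68-65", "the score is not 68-65"))
```

A lower value means a more accurate transcript, and a value of 0 means the transcript matches the reference word for word.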

Get the most accurate results at scale

Cutting-edge AI ensures the highest accuracy even with overlapping speech, regional accents, and poor network conditions. Scale from one-to-one meetings to millions of participants with the same accuracy.

Integrate with ease

Agora’s Real-Time Speech to Text is highly integrated with Agora’s network (SD-RTN™), providing global user transcription and real-time text distribution even in poor network environments.

Recording options for:

Cloud recording
Store, retrieve and share recordings in the cloud.
Go to Docs
On-premise recording
Store on a local server for security and confidentiality.
Go to Docs
Webpage recording
Record the entire web browser screen experience.
Go to Docs

Agora Media Services

Recording
Record audio streams, video streams and web pages for archive, review, or distribution.
Media Gateway
Directly push media streams into Agora voice and video channels using the RTMP/SRT protocol and enable advanced transcoding processing on media streams to facilitate distribution.
Media Pull
Add engagement to your Agora sessions by pulling live or recorded video and audio content and ingesting it directly into your Agora channel.
Media Push
Expand your audience with hybrid engagement experiences by pushing audio and video streams from Agora channels to Content Delivery Networks (CDN).
Real-Time Speech to Text
Enable the power of voice to create a better user experience with fast, accurate, automated transcription and subtitling services. Powered by cutting-edge AI that ensures the highest accuracy even with crosstalk and poor network conditions.

Your vision, unrestricted.

With Interactive Whiteboard, you can build a collaborative app fast—with custom branding and full of features. Our platform makes it easy to create a customized and engaging learning environment.
  • Flexible APIs support custom branding and extensive digital whiteboard features.
  • Easily integrate real-time voice and video calling, interactive streaming and signaling.
  • Save users’ bandwidth by preloading, sharing, and annotating files, and retain all the dynamic content.
And have peace of mind with HIPAA, GDPR, and CCPA compliance.

Made for developers

Your Code

Agora SDK

Customize your experience from the start with our flexible SDK.
Your Code

Agora SDK

Build and integrate real-time video into your app with the most flexibility and customization using Agora's Video SDK.
NO CODE

App Builder

Agora’s App Builder is the fastest and easiest way to add real-time video to your product using our no-code visual designer.
Go to Docs
low code

Agora UI Kit

Add real-time video to your app with only a few lines of code using low-code UI Kit libraries.
Go to Docs
your code

Agora SDK

Customize your experience from the start with our flexible SDK.
RESTful API
Go to Docs
low code

Agora UI Kit

Integrate real-time communication and streaming using only a few lines of code with low-code UIKit libraries.
Go to Docs

Documentation

This project provides a set of API examples to help you understand how to use Agora APIs.
Platform-agnostic RESTful APIs make it easy to add highly accurate and cost-effective real-time speech-to-text capabilities.
RESTful API
Go to Docs

Activate the AI Noise Suppression extension on the Agora Console.

Activate the Real-Time Speech to Text extension in the Agora Console.

your code

Agora SDK

Build and integrate Voice Calling with the most flexibility and full customization using Agora's Voice SDK.
RESTful API
Go to Docs
NO code

App Builder

Agora’s App Builder is the fastest and easiest way to add real-time voice chat, video chat, and live streaming into your product.
Go to Docs
your code

Agora SDK

Build and integrate real-time visual collaboration features into your application with the most flexibility and full customization using Agora's Interactive Whiteboard SDK.
RESTful API
Go to Docs
LOW code

Fastboard

Build real-time visual collaboration faster with a pre-built UI and the ability to include custom plug-ins.
Try it Now
Security, privacy and compliance
Agora is certified to the ISO/IEC 27001, 27017, 27018, 27701 and SOC 2 security standards and meets privacy regulations like GDPR, CCPA, COPPA, and HIPAA. Agora doesn’t collect or store any end-user data aside from Internet Protocol (IP) addresses and operational information necessary for providing our services.
ISO 27001:2022
ISO 27017:2015
ISO 27018:2019
ISO 27701:2019
HIPAA
GDPR
SOC 2 Type 1 & 2
CCPA
COPPA
Use cases

Transcribe speech to text for any real-time application

Securely transcribe and record real-time audio or video and organize recordings and transcripts to speed up workflows.
An online classroom with real-time captioning powered by speech-to-text transcription and subtitling.
Education
Give faculty and students real-time captions and analyze them with an LLM to provide lesson summaries and suggestions for further learning.
A live video call with a doctor and speech-to-text transcription services.
Telehealth
Keep secure records of virtual appointments for Minimum Effective Response (MER) and cross-reference telehealth knowledge bases.
A live basketball game showing a player soaring through the air and making a slam dunk in front of a packed arena. Overlay text via speech-to-text reads "Unbelievable move! The score is now 68-65."
Events
Empower your event with real-time, accurate notes, ensuring a more accessible, searchable, and engaging event experience.
A speech-to-text enriched live shopping session with a woman detailing a veggie basket product offering.
Live shopping
Use virtual assistants to improve accessibility and reach a wider audience by offering detailed product information, personalized recommendations, and guiding customers through the purchasing process.
A virtual meeting between four people with real-time automated notes and documented outstanding questions and action items via an LLM.
Virtual meetings
Provide real-time automated notes in meetings and document outstanding questions and action items via an LLM.
An influencer on a social channel sharing a review of a sandwich with speech-to-text translations into Vietnamese.
Social & metaverse
Eliminate communication barriers for people with different languages or disabilities. Extract conversation for business optimization, advertising, and moderation.
Fastboard

Easily build and integrate Agora’s Interactive Whiteboard with our newest Fastboard SDK, which delivers all the same whiteboard features with a pre-built UI and the ability to include custom plug-ins.
Try it Now
“Agora’s Real-Time Speech to Text enabled us to integrate with AI to automate translation and feedback, providing substantial improvements in the overall language learning experience.”
Zackery Ngai
CEO, HelloTalk