PRODUCT

Convo AI Device Kit

Easily add conversational voice AI into IoT devices and accelerate time-to-market with an integrated hardware chipset and module.
Customers building with Agora and OpenAI
grepp, WYZE, kileon, kumu, Scaler, Parallel, JorJin, AnotherBall, Ellie, zigbang

Turnkey solution for adding voice AI to any device

Human-like conversational AI

Ultra-low latency, clear audio, smart speaker recognition, and customizable wake-up for seamless, lifelike interactions. 
Global 4G connectivity

Native support for Nano-SIM and eSIM across China, Southeast Asia, Middle East, India, North America, Latin America, and Europe.

Real-time AI conversations

AI-based noise suppression, voice activity detection (VAD), and intelligent interruption handling ensure accuracy and smooth, natural conversation flow.

Visual understanding 

Built-in camera gives AI “vision” with the ability to recognize and interpret images for smarter, more immersive interactions.

Dynamic eye displays

Dynamic Eyes bring expressions to life, changing in real time with the conversation, with options for dual-screen or single-screen display.

Broad compatibility

Supports integration with mainstream AI services and custom large language models. Compatible with a wide range of Wi-Fi, Cat.1, and ISP chipsets for flexible deployment.

Rapid prototyping

Get a working demo in 1 hour and a production-ready prototype in 1 day. With open hardware and software, accelerate the entire journey from design to mass production.

Talk to a voice agent powered by the Conversational AI Engine

Try it now

Bring connected devices to life

Transform smart devices—from educational robots to connected toys—into intelligent, interactive companions with dynamic, real-time AI-driven conversations.

Build with maximum flexibility

Leverage an open ecosystem with customizable AI development and broad compatibility with leading ASR, LLM, TTS services and mainstream chipsets to create unique, responsive, and lifelike devices. 

Multimodal interaction 

Enable your device to hear, see, speak, think, and touch with versatile inputs including voice, touchscreen, gestures, and dual-screen display for seamless AI interaction and engagement.

Accelerate time to market

Prototype in hours and scale in months with rapid prototyping, open SDKs, and flexible hardware-software integration. Deploy confidently across worldwide markets with full compliance certifications and support for 45+ languages.

Quickstart guide

View the quickstart guide to get up and running with Agora and OpenAI.

How the Conversational AI Engine works

Your Code

Agora SDK

Customize your experience from the start with our flexible SDK.
Go to Docs

Activate the Convo AI Device Kit extension in the Agora Console.

Security, privacy and compliance
Agora is certified to the ISO/IEC 27001, 27017, 27018, 27701 and SOC 2 security standards and meets privacy regulations like GDPR, CCPA, COPPA, and HIPAA. Agora doesn’t collect or store any end-user data aside from Internet Protocol (IP) addresses and operational information necessary for providing our services.
ISO 27001:2022
ISO 27017:2015
ISO 27018:2019
ISO 27701:2019
HIPAA
GDPR
SOC 2 Type 1 & 2
CCPA
COPPA
HOW TO INTEGRATE?
Streamlined 3-step integration process:
01
Activate Agora Conversational AI Engine
Unlock real-time Speech-to-Text (STT) and Text-to-Speech (TTS) capabilities, enabling seamless conversational interactions. 
02
Integrate Agora Edge Chip on Hardware
Optimize microphone, speaker, and system efficiency to ensure ultra-low-latency and high-fidelity conversations.
03
Deploy AI Voice Agents
Enable interactive, multilingual, and user-customized conversations for a wide range of IoT applications.

Integrated chipset and module

Agora's Conversational AI technology is built into RiseLink's high-performance IoT chip modules, giving you a turnkey solution that makes it easy to integrate voice AI into any connected toy.
“With Agora’s conversational AI technology and our optimized AI hardware, we’re enabling the next generation of toys to think, respond, and interact naturally. We are excited to usher in the future of robotics and toys, ones that can react to the environment around them and interact fluently with users.” 
Pengfei Zhang
CEO, RiseLink
Use cases

Build conversational AI into any smart device or toy

Agora’s conversational AI platform powers a diverse range of use cases across industries.

Pocket AI robot

Voice-interactive robots with expressive "Dynamic Eyes" and animation, ready to engage anywhere with 4G.

AI companions or pets

Emotionally responsive AI that offers real-time interaction, care, and personalized companionship for lasting bonds. 

Talking figurines

Integrate conversational AI into any type of toy or figurine to bring IP-based characters to life with voice interaction.

Educational devices

Photograph an object with the built-in camera and instantly access cloud LLMs for educational insights, bilingual stories, and interactive quizzes—anytime, anywhere, with 4G connectivity.

Smart home

Add advanced, LLM-based conversational AI to any smart home system or smart device.

Always-on AI assistant

Create wearable AI devices so users can stay connected via 4G for live translation, search, and navigation.
Robopoet's Fuzzoo, an AI companion robot, leverages Agora's ConvoAI Device Kit to deliver real-time emotional support and personalized interaction.
"Agora’s AI technology enables toys and robots to interact in a way that feels natural and engaging. With real-time voice processing, emotional AI, and advanced speech capabilities, Agora makes seamless human-machine interaction possible and ensures exceptional performance and reliability." 
Yuna Pan
Co-Founder and CTO
Request more information
Connect with our experts to get answers to your questions, discuss your requirements, and learn more about the ConvoAI Device Kit.

Frequently asked questions

How does Agora improve the experience in comparison with other solutions for voice interaction with AI?

Agora enables more natural voice conversations with AI, thanks to low-latency responses and real-time interruption handling. Agora’s built-in background noise suppression, echo cancelation, and selective attention locking allow AI to hear the user clearly in any environment. Agora’s global real-time network ensures connectivity and performance in any location.

What LLMs can be connected to Agora’s conversational AI platform?

Agora's Conversational AI Engine offers support for a wide range of large language models (LLMs), including:

  • OpenAI
  • OpenAI Realtime API
  • Azure OpenAI
  • Google Gemini
  • Google Vertex AI
  • Anthropic Claude
  • Dify
  • Custom LLM

Review our documentation on connecting LLMs here: https://docs.agora.io/en/conversational-ai/models/llm/overview

What automatic speech recognition (ASR) / speech-to-text (STT) models are supported?

Agora’s Conversational AI Engine currently supports the following ASR providers:

  • ARES (default)  
  • Microsoft Azure
  • Deepgram

Review our documentation on connecting ASR models here: https://docs.agora.io/en/conversational-ai/models/asr/overview

What text-to-speech (TTS) models are supported?

Agora’s Conversational AI Engine currently supports the following TTS providers:

  • Microsoft Azure
  • ElevenLabs
  • Cartesia (Beta)
  • OpenAI (Beta)
  • Hume AI (Beta)

Review our documentation on connecting TTS models here: https://docs.agora.io/en/conversational-ai/models/tts/overview

What avatar providers are supported?

Agora’s Conversational AI Engine currently supports the following AI avatar providers:

  • Akool (Beta)
  • HeyGen (Alpha)

Review our documentation on connecting avatar providers here: https://docs.agora.io/en/conversational-ai/models/avatar/overview

What additional technology is required to implement a voice AI agent?

To implement a voice AI agent, you need to connect an LLM and a text-to-speech service to Agora’s Conversational AI Engine. This enables full customization of the experience, with the LLM and voice of your choice.

What is a “chained” or “cascade” model in relation to conversational voice AI?

The chained or cascade model refers to a processing flow with three stages: automatic speech recognition (ASR) converts the user’s speech to text, the LLM processes that text, and text-to-speech (TTS) technology converts the LLM’s response into the AI agent’s voice response.
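
The cascade flow described above can be pictured as three functions called in sequence. The following is a minimal illustrative sketch, not Agora's implementation or API: the `CascadePipeline` class, the stage type aliases, and the `fake_*` stand-in functions are all hypothetical names invented for this example, and a real deployment would replace each stand-in with calls to the chosen ASR, LLM, and TTS providers.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions for illustration only);
# real ASR, LLM, and TTS providers expose their own APIs.
Transcriber = Callable[[bytes], str]   # audio in, text out
Responder = Callable[[str], str]       # user text in, reply text out
Synthesizer = Callable[[str], bytes]   # reply text in, audio out


@dataclass
class CascadePipeline:
    """Chains the three stages: ASR, then LLM, then TTS."""
    asr: Transcriber
    llm: Responder
    tts: Synthesizer

    def handle_turn(self, user_audio: bytes) -> bytes:
        text = self.asr(user_audio)   # 1. speech to text
        reply = self.llm(text)        # 2. text to LLM response
        return self.tts(reply)        # 3. response to speech


# Stand-in stages so the sketch runs without any external service.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = CascadePipeline(asr=fake_asr, llm=fake_llm, tts=fake_tts)
agent_audio = pipeline.handle_turn(b"hello")
```

Because each stage is an independent callable, any one of them can be swapped for a different provider without touching the other two, which is the flexibility the chained model is meant to offer.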

Does Agora’s Conversational AI Engine enable the creation of an AI model or LLM?

No, Agora’s Conversational AI Engine requires an existing AI model or LLM. The Engine enables customized voice interaction with the LLM but is not capable of creating or training an LLM.