svara-global-v1 streams the first byte of audio in 187ms from any of 9 regions — voice agents that interrupt cleanly, phone calls that feel native, in-app companions that don't make you wait. One voice across 50+ languages, in one stream.
Sub-200ms first byte. Bi-directional streaming, server-side barge-in support, sentence-level interrupt. Pay per second, not per character.
OpenAI-compatible endpoint with chunked output. Drop-in replacement for /audio/speech — same SDK, different latency profile.
Native SIP trunk. Plug into Twilio, Vonage, Daily, or your own carrier. Outbound + inbound calls; same voice across web, app, and phone.
Three actual voice-agent transcripts, each showing what the model sounds like under load. Tap one to play the recorded turn.
OpenAI-compatible. Drop-in WebSocket. Same voice across all 50+ languages — no per-locale switch, no per-emotion model.
[lang=ja] tags work mid-utterance.# pip install svara · 2 lines, realtime voice from svara import Svara ws = Svara().realtime.connect(voice="aria") ws.send("[warm] Welcome back. [lang=ja] " "今日は素晴らしい一日ですね。") async for chunk in ws.audio: speakers.write(chunk) # first byte: 187ms
Free realtime tier — 60 minutes per month, no card. Open weights, Apache 2.0. Self-host on a single A10 if you want to.
Start a live session →