SVARA × YOUR INDUSTRY · 2026

One voice model.
Six industries.
Same voice — every time.

svara-global-v1 is a 780M-parameter voice model designed to fit how each team actually works. The same voice that carries an ad spot through 14 languages also handles your phone agent, your audiobook narrator, and your game NPCs. One model. One API. One voice. Many industries.

50+ languages, single voice

187ms first-byte, P50

Apache 2.0 open weights

See it in your industry ↓ · Talk to sales

▸ FOR · MARKETING & ADS

Localize a video to 14 markets in an afternoon.

The same voice your audience has heard for 18 months — now in Spanish, Japanese, Portuguese, and 47 more. Same brand. Same script. New languages, in one API call.

Aurora Coffee
Spring Drop · 0:14

What your team gets

Same voice: Across every language. Brand voice stays brand voice.
Inline tags: [lang=es] mid-script switches language without re-prompting.
Stem export: WAV, MP3, FLAC for your editor; alignments for captions.
Brand approval: Lock a voice; reviewers approve once, ship globally.

Book a brand demo · Aurora case study →

▸ FOR · AUDIOBOOKS & PUBLISHING

One narrator. Twenty-six chapters. Seventeen languages.

Cast a single narrator and let them carry your title from cover to cover, in every language your distributor will take. Studio-grade voice, no booking, no re-recording for re-edits.

The Lantern Keepers

Devika Rao

Chapter 12 — A long letter home

Narrated by Eleanor · 17 languages, one voice

EN · ES · FR · DE · HI · JA · AR · ZH · + 9 more

Built for publishers

Long-form: Tested on 60-hour books; voice identity stays stable chapter-to-chapter.
Pronunciation: Custom lexicon for character names, places, made-up words.
Re-narration: Edit a paragraph, regenerate just that paragraph, stitch automatically.
Distribution: Audible-, Spotify-, Storytel-ready masters out of the box.

Talk to publishing team · Tessera Press case study →

▸ FOR · EDUCATION & TRAINING

Every course, every language.

Author once in your authoring tool. Pick a narrator. The same voice teaches the same course in every language your students speak — with built-in pronunciation hints for technical terms.

Module 4 · Photosynthesis

Plants take in carbon dioxide through tiny pores called stomata. Inside the leaf, chlorophyll absorbs sunlight and powers the conversion to glucose.

Glucose travels through phloem to nourish the rest of the plant, while oxygen exits through the same pores.

AUTO-NARRATED · 5:12 · EN · ES · HI · AR · ZH · + 11

Choose a narrator

Aria: Calm · journalistic

Mateo: Warm · teacher

Yuki: Steady · documentary

Built for learning teams

Authoring: Plug-ins for Articulate, Rise, Captivate, and SCORM authoring tools.
Pronunciation: Term-level overrides for jargon, equations, and chemistry/biology terms.
Accessibility: Captions, transcripts, and chapter markers generated alongside audio.
FERPA / GDPR: Self-host on your own infra with the open-weights release.

Talk to learning team · Keystone EDU case study →

▸ FOR · VOICE AGENTS & SUPPORT

Phone-call quality. Browser-fast.

Stream replies with a first byte under 200ms, interrupt cleanly when the customer talks, and switch language mid-call. Plug into Twilio, Vonage, or Daily and keep the same voice.

CALL · LIVE · agent="aria" · P50 184ms

CALLER

Hi, I think I was double-charged on my last invoice — can you check?

SVARA · ARIA

[warm] Of course — I see the duplicate from May 3rd. I'll refund it now and email you a confirmation.

CALLER

¿Pueden mandármelo en español?

SVARA · ARIA

[lang=es] Claro — el reembolso se procesará en 2-3 días hábiles. ¿Algo más?

184ms

First-byte, P50

$0.014

Per minute, all-in

SIP

Twilio · Vonage · Daily

Built for support & sales teams

Streaming: WebSocket + SSE; first byte under 200ms from any region.
Barge-in: Customer interruptions cut audio cleanly without artifacts.
Code-switch: One agent answers in EN and switches to ES if the customer does, in the same voice.
Compliance: SOC 2, HIPAA self-host options, watermarked output.

Book an agents demo · Atlas Bank case study →

▸ FOR · GAMES & INTERACTIVE FICTION

200 NPCs. One voice budget.

Cast a handful of voices, tag your dialog with emotion and language, and ship in 12 markets without a re-record. Write [whisper], [shout], and [laugh] inline as part of your script.

DIALOG · LINE 12
RTF 0.18 · A10

NPC · TIER-2 · MERCHANT

Brann the Cartographer

[whisper]Keep your voice down, friend. The maps you seek are not for everyone's eyes.

[stern]Twenty gold. Not a coin less. [laugh] No, I'm not haggling — that face won't work on me.

[lang=ja][playful]こちらの古い地図、欲しい？金貨二十枚で君のものだ。

[lang=fr][stern]Vingt pièces d'or. Pas une de moins.
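
For teams that want to inspect tagged scripts before sending them for synthesis, the inline tags in the lines above can be split out client-side. The sketch below is illustrative only and is not part of any Svara SDK: the function name `split_tagged_script`, the `Segment` type, and the rule that an emotion tag applies only to the text that follows it (while `lang=` persists until changed) are all assumptions made for this example.

```python
import re
from dataclasses import dataclass, field

# Matches one bracketed inline tag such as [whisper] or [lang=ja].
TAG = re.compile(r"\[([^\[\]]+)\]")


@dataclass
class Segment:
    text: str
    lang: str
    styles: list[str] = field(default_factory=list)


def split_tagged_script(script: str, default_lang: str = "en") -> list[Segment]:
    """Split a tagged script into runs of text with their language and styles.

    Assumed semantics (hypothetical, for illustration):
    - [lang=xx] sets the language for all following text until changed.
    - Any other tag is a style that applies to the next run of text only.
    """
    segments: list[Segment] = []
    lang = default_lang
    styles: list[str] = []
    pos = 0
    for match in TAG.finditer(script):
        text = script[pos:match.start()].strip()
        if text:
            segments.append(Segment(text, lang, styles))
            styles = []  # start a fresh list so earlier segments are not mutated
        tag = match.group(1).strip()
        if tag.startswith("lang="):
            lang = tag[len("lang="):]
        else:
            styles.append(tag)
        pos = match.end()
    tail = script[pos:].strip()
    if tail:
        segments.append(Segment(tail, lang, styles))
    return segments


if __name__ == "__main__":
    line = "[stern]Twenty gold. Not a coin less. [laugh] No, I'm not haggling."
    for seg in split_tagged_script(line):
        print(seg.lang, seg.styles, seg.text)
```

A pass like this can feed a review spreadsheet or a per-locale line count before any audio is generated.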

Built for game studios

Engines: Unity, Unreal, Godot SDKs; runtime synthesis or pre-bake to FMOD/Wwise.
Localization: Same voice carries from EN to FR, JA, PT-BR, with no re-cast per locale.
Performance: RTF 0.18 on A10, 0.31 on T4, INT8 quantized 420MB.
Self-host: Open weights, so you can ship offline-capable single-player games with the model bundled.
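
To make the RTF figures concrete: real-time factor is synthesis time divided by the duration of the audio produced, so lower is faster. The sketch below turns the quoted numbers into rough planning estimates. The helper names are hypothetical, and the stream estimate is an idealized upper bound that assumes RTF stays constant under load and ignores batching, scheduling, and network overhead.

```python
import math

# RTF figures quoted in the performance line above.
RTF = {"A10": 0.18, "T4": 0.31}


def synthesis_seconds(audio_seconds: float, rtf: float) -> float:
    """Estimated compute time needed to generate `audio_seconds` of speech."""
    return audio_seconds * rtf


def max_realtime_streams(rtf: float) -> int:
    """Idealized count of streams one GPU could keep at or ahead of real time."""
    return math.floor(1.0 / rtf)


if __name__ == "__main__":
    for gpu, rtf in RTF.items():
        prebake = synthesis_seconds(3600, rtf)  # one hour of dialog
        print(f"{gpu}: ~{prebake:.0f}s to pre-bake 1h of audio, "
              f"up to {max_realtime_streams(rtf)} real-time streams")
```

Treat these as back-of-envelope numbers for choosing between runtime synthesis and pre-baking; measured throughput on your own hardware should drive the final decision.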

Get game-studio license · Northstar Games case study →

▸ FOR · IVR & TELEPHONY

An IVR that doesn't sound like one.

Brand-matched voice. 14 languages on day one. Containment up, escalation down — because callers actually understand what they're being asked, in the language they speak.

▸ "Welcome to Atlas Bank" · EN · ES · ZH · +11

For account balance, press 1

To report a lost card, press 2

To speak with an agent, press 0

▸ Tone: warm, professional, brand-matched

▸ Latency: under 200ms first-byte, anywhere

▸ Drop-in: SIP, Twilio Flex, Genesys, Five9

LIVE · LAST 60S · ATLAS BANK

Calls handled: 1,247

Avg first-byte: 184ms

Languages served: 14

Containment: 78.4%

Cost / call: $0.018

Built for ops & contact centers

Drop-in: Replace existing IVR text-to-speech without changing your call-flow tool.
Multilingual: 14 languages out of the box; the same voice carries across all.
Brand voice: Lock a tone (warm / formal / sales); 100% deterministic across regenerations.
Cost: $0.018 per call typical. Self-host for sub-cent cost per call at scale.

Get an IVR pilot · Atlas Bank case study →

One model. Six teams. Same answer.

svara-global-v1 is small enough to self-host on a single A10, expressive enough for audiobook narration, fast enough for live phone agents, and broad enough to serve every market your team works in.

780M

Parameters · sub-billion · single-GPU deployable

50+

Languages, one voice identity per cast

187ms

First-byte streaming latency, P50 from 9 regions

Apache 2.0

Open weights · self-host · no per-seat lock-in
