Next-Generation Indic Voice AI.

An LP in nineteen Indian languages, pressed for developers. Open-source, low-latency, expressive — speech in milliseconds.

Side A · Languages: 19

Side B · Voices: 50+

Pressing · Parameters: 3 billion

RPM · Latency: <200 ms

Edition · Downloads: 400,000+

License: Apache 2.0

The Tracklist

12 selected · 19 total

Liner Notes

on the model

Svara is an open-source text-to-speech model trained on a corpus of nineteen Indian languages, including the under-resourced Magahi, Maithili, Bhojpuri, Bodo, and Dogri. It is licensed under Apache 2.0. Self-host the weights, or call our edge network and skip the GPUs altogether.

The model runs at an average end-to-end latency under 200 milliseconds, comfortably below the threshold at which a conversation feels like natural turn-taking rather than a transaction. Six emotion tags (happy, sad, anger, fear, surprise, clear) can be requested at the prompt level without retraining.
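A minimal sketch of how the six emotion tags named above might be attached to a prompt. The tag names come from these notes. The `<tag> text` prefix syntax and the `with_emotion` helper are hypothetical placeholders, since the actual prompt format is not stated here.

```python
# The six tag names come from the liner notes; the "<tag> text" prompt
# syntax below is a hypothetical placeholder, not the documented format.
EMOTION_TAGS = ("happy", "sad", "anger", "fear", "surprise", "clear")


def with_emotion(text: str, tag: str) -> str:
    """Return `text` prefixed with an emotion tag (hypothetical syntax)."""
    if tag not in EMOTION_TAGS:
        raise ValueError(f"unknown emotion tag: {tag!r}")
    return f"<{tag}> {text}"
```

Validating the tag on the client side keeps a typo from being read aloud as ordinary text. Swap the prefix for whatever format the model card specifies.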

Fine-tuning is supported with a few hours of audio. Voice cloning will arrive in the v2 pressing later this year. The model has 3B parameters, is distilled from a Llama backbone, and uses discrete audio tokens from the Orpheus tokenizer.

Built by Kenpath Labs in Bengaluru. Trained on SYSPIN, RASA, IndicTTS, and SPICOR datasets. Free for the first ten thousand characters per month — no card on file.
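The free tier above can be budgeted with simple arithmetic. The sketch below assumes usage is counted per Unicode code point, which is what Python's `len()` returns. The service may count characters differently, and the `remaining_free_chars` helper is illustrative only.

```python
FREE_CHARS_PER_MONTH = 10_000  # free tier stated in the liner notes


def remaining_free_chars(texts, used=0):
    """Characters left in the free tier after synthesizing `texts`.

    Assumes billing per Unicode code point (len() in Python);
    the service may count differently.
    """
    total = used + sum(len(t) for t in texts)
    return max(FREE_CHARS_PER_MONTH - total, 0)
```

Devanagari text with combining marks has more code points than visible glyphs, so treat the result as an estimate.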

Appendix

how to play

A.1 — Python · openai-compatible

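# pip install openai
from openai import OpenAI

# Point the OpenAI client at the Svara endpoint.
client = OpenAI(
    base_url="https://api.svara.ai/v1",
    api_key="sk-svara-...",
)

# Stream the synthesized speech to disk as it arrives.
# The input is Hindi for "Today really is a very special day".
with client.audio.speech.with_streaming_response.create(
    model="svara-v1-indic",
    voice="asha",
    input="आज का दिन तो सच में बहुत ख़ास है",
) as response:
    response.stream_to_file("hello.wav")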
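To check the latency figure from your own machine, a rough timing harness along these lines may help. It measures wall-clock time around any callable, so the result includes your network round trip and is not the same number as the model's own latency. The `average_latency_ms` helper and its parameters are illustrative, not part of any Svara SDK.

```python
import time
from statistics import mean


def average_latency_ms(synthesize, text, runs=5):
    """Call `synthesize(text)` `runs` times; return mean wall-clock time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        synthesize(text)
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)
```

Pass in a small function that wraps the speech request from A.1, and discard the first run if you want to exclude connection setup.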

Order a pressing →

Free 10K chars/month
No card required
github.com/kenpath-labs · huggingface.co/kenpath