Rumik Launches Silk Mulberry 1.5: 'Describe a Voice Into Existence' with Plain-Language Prompts, Matching Commercial TTS Giants at 95% Lower Cost
Technology | June 19, 2026 | FreeReadText Team

Indian AI startup Rumik releases Silk Mulberry 1.5, a text-to-speech model that replaces preset voice menus with plain-language voice descriptions, achieving MOS scores competitive with ElevenLabs and Google at roughly $0.0046 per minute.

On June 19, 2026, Indian AI startup Rumik launched Silk Mulberry 1.5, a text-to-speech model built around a radical premise: instead of picking a voice from a dropdown menu, users describe the voice they want in plain language, specifying age, gender, accent, pitch, emotional tone, and delivery style, and the model synthesizes it on demand. The release positions Rumik as India's most ambitious entrant in the global TTS race, competing on quality while undercutting established players on cost by roughly 95%.

Under the hood, Silk Mulberry 1.5 is built as an audio language model rather than a conventional TTS system bolted onto a neural vocoder: a transformer backbone generates speech tokens, which are then decoded into waveforms. The architecture delivers sub-200ms time-to-first-chunk on a single NVIDIA H100 GPU, even under 80 concurrent requests. In benchmark evaluations, the model achieved a MOS (Mean Opinion Score, a listener rating on a 1-to-5 scale) of 4.23, placing it within competitive range of ElevenLabs v3 and Google Gemini 3.1 Flash TTS. On the InstructTTS Eval benchmark, it scored 74% overall consistency, rising to 82.7% on descriptive voice prompts, its strongest category.
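For readers unfamiliar with the metric, a MOS is simply the average of listener ratings on the 1-to-5 scale, usually reported with a confidence interval. The sketch below is a minimal illustration of that arithmetic; the function name `mean_opinion_score` and the sample ratings are invented for this example and are not Rumik's evaluation data or protocol.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for ratings on the 1-5 scale."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    mos = statistics.fmean(ratings)
    # Normal-approximation interval: 1.96 * standard error of the mean.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Made-up listener ratings, for illustration only.
sample = [5, 4, 4, 5, 3, 4, 5, 4, 4, 5]
mos, ci = mean_opinion_score(sample)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With only a handful of ratings the interval is wide, which is why small MOS differences between systems are hard to interpret without knowing the number of listeners and samples.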

Rumik also published mechanistic interpretability research alongside the launch, using linear probes on internal model activations to study how Silk Mulberry handles code-switching between English and Hindi. The research found that the model does not maintain a persistent 'language mode' but reacts token by token to lexical content: probe confidence spikes on Hindi tokens and falls on English ones, suggesting a more fluid, context-sensitive internal representation than traditional multilingual TTS systems use. This transparency-oriented approach is unusual for a commercial TTS launch and signals Rumik's intent to build credibility in both the academic and developer communities.
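A linear probe is a small classifier, typically logistic regression, trained to read a property off a model's hidden states. The sketch below shows the general technique on synthetic vectors, under loud assumptions: the "activations" are random numbers generated by a hypothetical `fake_activation` helper, the language signal is planted in two dimensions by construction, and none of this reflects Rumik's actual model, data, or probing setup.

```python
import math
import random


def probe_confidence(w, b, x):
    """Sigmoid of a linear readout: the probe's P(token is Hindi)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_linear_probe(xs, ys, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent."""
    dim = len(xs[0])
    w, b, n = [0.0] * dim, 0.0, len(xs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            err = probe_confidence(w, b, x) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fake_activation(is_hindi, rng, dim=8):
    """Synthetic stand-in for a hidden state; NOT real model activations."""
    shift = 1.0 if is_hindi else -1.0
    # Assumption: language is encoded in the first two dimensions only.
    return [rng.gauss(0.0, 0.5) + (shift if i < 2 else 0.0) for i in range(dim)]


rng = random.Random(0)
train_labels = [rng.randint(0, 1) for _ in range(400)]
train_xs = [fake_activation(y, rng) for y in train_labels]
w, b = train_linear_probe(train_xs, train_labels)

# A code-switched utterance: 0 = English token, 1 = Hindi token.
utterance = [0, 0, 1, 1, 0, 1, 0, 0]
confidences = [probe_confidence(w, b, fake_activation(y, rng)) for y in utterance]
for label, conf in zip(utterance, confidences):
    print("hi" if label else "en", f"{conf:.2f}")
```

In this toy each token's vector is generated independently, so the probe reacts token by token by design. The finding described in the article is that a real model's activations appear to behave similarly, which the toy illustrates but does not demonstrate.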

Pricing is aggressively positioned for India and other emerging markets: ₹0.40 per minute, or approximately $0.0046 per minute, roughly 95% below typical commercial TTS API rates. Analysts note that if Rumik can sustain its quality claims at scale, the pricing could put downward pressure on TTS costs across South Asia, Southeast Asia, and Africa, where cost sensitivity has historically limited adoption of premium voice AI services. The model is available through Rumik's API playground, with enterprise deployment options expected later in 2026.
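The pricing figures above can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the article; the helper names are invented for illustration, and the "implied baseline" is a back-calculation from the 95% claim, not a published competitor price.

```python
# Figures quoted in the article.
PRICE_INR_PER_MIN = 0.40
PRICE_USD_PER_MIN = 0.0046
DISCOUNT = 0.95  # "roughly 95% below typical commercial TTS API rates"


def implied_exchange_rate(inr, usd):
    """INR per USD implied by the two quoted per-minute prices."""
    return inr / usd


def implied_baseline(price, discount):
    """Baseline price such that `price` sits `discount` below it."""
    return price / (1.0 - discount)


def cost_for_hours(price_per_min, hours):
    """Total cost of synthesizing `hours` of audio."""
    return price_per_min * 60.0 * hours


rate = implied_exchange_rate(PRICE_INR_PER_MIN, PRICE_USD_PER_MIN)
baseline = implied_baseline(PRICE_USD_PER_MIN, DISCOUNT)
print(f"Implied exchange rate: {rate:.1f} INR/USD")
print(f"Implied typical rate:  ${baseline:.3f} per minute")
print(f"1,000 hours of audio:  ${cost_for_hours(PRICE_USD_PER_MIN, 1000):.2f}")
```

Under these figures, the two quoted prices imply an exchange rate of about 87 rupees to the dollar, and the 95% claim implies a comparison rate of about $0.092 per minute.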

Tags: Rumik, Silk Mulberry, Indian AI, Text-to-Speech, Voice Design, Code-Switching
