Industry News

Stay up to date on the latest developments in AI voice technology, speech synthesis, and the evolving regulatory landscape.

Technology

Microsoft Unveils MAI-Voice-1: Hyper-Realistic Speech Generation from Just One Minute of Audio

Microsoft launches three new foundational AI models including MAI-Voice-1, which delivers hyper-realistic voice synthesis and custom brand voice creation, marking a major leap in enterprise TTS capabilities.

👤 FreeReadText Team · 📅 April 2, 2026

Microsoft, MAI-Voice-1, Enterprise TTS, Voice Synthesis, Foundry

Read more →

Business

ElevenLabs Reaches $11 Billion Valuation, Eyes IPO as Voice AI Becomes Enterprise Standard

AI voice startup ElevenLabs raises $500 million at an $11 billion valuation, tripling its worth in just over a year while forging major partnerships with IBM and planning a potential IPO.

👤 FreeReadText Team · 📅 February 4, 2026

ElevenLabs, Funding, IPO, IBM Partnership, Enterprise AI

Read more →

Regulation

Global AI Voice Regulation Tightens: EU AI Act Deepfake Rules Take Effect as Voice Cloning Crosses 'Indistinguishable Threshold'

As voice cloning technology reaches human-level quality, regulators worldwide respond with new laws — the EU AI Act's deepfake labeling rules, the US ELVIS Act, and emerging biometric voice data protections reshape the industry landscape.

👤 FreeReadText Team · 📅 March 15, 2026

EU AI Act, ELVIS Act, Voice Cloning, Deepfake, Biometric Data, Compliance

Read more →

Technology

OpenAI Launches Voice Engine to the Public: Real-Time Conversational TTS Now Available to All Developers

After over a year of limited preview, OpenAI opens Voice Engine to all API developers, introducing real-time streaming TTS with emotional awareness and 40+ language support at significantly reduced pricing.

👤 FreeReadText Team · 📅 March 28, 2026

OpenAI, Voice Engine, Real-Time TTS, API, Conversational AI

Read more →

Technology

Google DeepMind Brings Studio-Quality TTS to Smartphones with SoundStorm 2 Edge — No Internet Required

Google DeepMind announces SoundStorm 2 Edge, a compact on-device TTS model that runs entirely on mobile hardware, delivering studio-quality voice synthesis without cloud connectivity and opening new possibilities for offline accessibility.

👤 FreeReadText Team · 📅 March 20, 2026

Google DeepMind, SoundStorm 2, On-Device AI, Mobile TTS, Accessibility

Read more →

Business

AI Dubbing Market Surges Past $2 Billion as Hollywood, Streaming Giants, and Game Studios Embrace Automated Localization

The AI-powered dubbing and localization market crosses the $2 billion mark in Q1 2026, driven by adoption from Netflix, Disney+, and major game publishers seeking to reach global audiences at a fraction of traditional costs.

👤 FreeReadText Team · 📅 April 5, 2026

AI Dubbing, Localization, Netflix, Gaming, Voice Acting, Streaming

Read more →

Technology

Apple Unveils 'Personal Voice 2.0' in iOS 20: On-Device Voice Cloning Creates Your Digital Twin in 3 Minutes

Apple announces Personal Voice 2.0 at its spring event, allowing users to create a highly realistic clone of their own voice in just 3 minutes of recording — all processed entirely on-device with Apple Silicon, positioning it as the privacy-first alternative to cloud-based voice AI.

👤 FreeReadText Team · 📅 April 8, 2026

Apple, iOS 20, Personal Voice, On-Device AI, Privacy, Voice Cloning

Read more →

Business

Spotify Rolls Out AI Voice Translation for Podcasts Globally: Your Favorite Hosts Now Speak 40 Languages in Their Own Voice

Spotify launches its AI-powered podcast translation feature worldwide, using voice cloning technology to automatically dub podcasts into 40 languages while preserving each host's unique voice characteristics — opening 100,000+ shows to global audiences overnight.

👤 FreeReadText Team · 📅 April 7, 2026

Spotify, Podcast Translation, Voice Cloning, Localization, Streaming Audio, Creator Economy

Read more →

Technology

FDA Clears First AI Voice Assistant for Clinical Use: Voice-Based Patient Screening Enters the Hospital

The FDA grants its first clearance for an AI voice assistant designed for clinical patient interaction, allowing automated voice-based symptom screening and triage in emergency departments — marking a historic milestone for voice AI in healthcare.

👤 FreeReadText Team · 📅 April 10, 2026

Healthcare AI, FDA, Voice Assistant, Clinical AI, Patient Screening, Hippocratic AI

Read more →

Technology

Meta Releases Llama-Voice: First Fully Open-Source TTS Model to Match Commercial Giants in 50+ Languages

Meta drops Llama-Voice under an Apache 2.0 license, delivering near state-of-the-art voice synthesis, zero-shot voice cloning from 10 seconds of audio, and 52-language coverage — all runnable on a single consumer GPU.

👤 FreeReadText Team · 📅 April 12, 2026

Meta, Llama-Voice, Open Source, Multilingual TTS, Voice Cloning, Hugging Face

Read more →

Technology

NVIDIA Launches Voice Foundry NIM: Blackwell-Optimized Microservices Cut Real-Time TTS Costs by 70%

NVIDIA unveils Voice Foundry, a dedicated suite of NIM inference microservices for TTS and STT optimized for Blackwell GB200 hardware, promising sub-80ms first-token latency and 70% lower per-character costs for enterprise voice applications.

👤 FreeReadText Team · 📅 April 15, 2026

NVIDIA, Voice Foundry, NIM, Blackwell, Enterprise Infrastructure, TensorRT

Read more →

Business

Audible Opens AI-Narrated Audiobook Catalog to 400,000 Backlist Titles — Narrators Split on Landmark Royalty Model

Amazon's Audible launches the industry's largest AI-narrated audiobook catalog, adding 400,000 previously unnarrated titles using voice clones of consenting narrators, with a first-of-its-kind per-listen residual model that splits the narration community.

👤 FreeReadText Team · 📅 April 17, 2026

Audible, Amazon, AI Narration, Audiobooks, Voice Acting, Royalties, SAG-AFTRA

Read more →

Technology

Google Launches Gemini 3.1 Flash TTS: 70+ Languages, Multi-Speaker Dialogue, and a Top Spot on the Artificial Analysis Leaderboard

Google introduces Gemini 3.1 Flash TTS, a new text-to-speech model with audio tags for fine-grained vocal control, native multi-speaker dialogue, and 70+ language support — landing in the 'most attractive quadrant' of the Artificial Analysis TTS leaderboard with an Elo of 1,211.

👤 FreeReadText Team · 📅 April 15, 2026

Google, Gemini, Flash TTS, Multi-Speaker, SynthID, Vertex AI

Read more →

Technology

OpenAI Launches GPT-Realtime-2: Voice Models with GPT-5-Class Reasoning, Live Translation, and Streaming Transcription

OpenAI introduces three new Realtime API voice models — GPT-Realtime-2 with GPT-5-class reasoning, GPT-Realtime-Translate covering 70+ input languages, and GPT-Realtime-Whisper for live transcription — quadrupling the context window to 128K tokens and bringing voice agents closer to production-ready workflows.

👤 FreeReadText Team · 📅 May 7, 2026

OpenAI, GPT-Realtime-2, Realtime API, Live Translation, Speech-to-Text, Voice Agents

Read more →

Technology

Microsoft Launches MAI-Voice-2 at Build 2026: Expressive Speech and Zero-Shot Voice Cloning Across 15 Languages

Microsoft unveils MAI-Voice-2, calling it the most expressive and natural-sounding text-to-speech model it has built, expanding from English-only to 15 languages with granular emotion control, code-switching, and zero-shot voice prompting from a few seconds of audio.

👤 FreeReadText Team · 📅 June 2, 2026

Microsoft, MAI-Voice-2, Multilingual TTS, Voice Cloning, Build 2026, Azure

Read more →

Business

Wispr Hits ~$2 Billion Valuation as AI Voice Dictation Becomes a Workplace Standard

Wispr, the startup behind the AI dictation tool Wispr Flow, is raising roughly $260 million at a near-$2 billion valuation led by Menlo Ventures — nearly tripling its worth in six months as voice-to-text moves from novelty to everyday workplace productivity tool.

👤 FreeReadText Team · 📅 May 13, 2026

Wispr, Wispr Flow, Funding, Menlo Ventures, Voice Dictation, Speech-to-Text

Read more →

Regulation

FTC Begins Enforcing the TAKE IT DOWN Act: Platforms Face $53,088-Per-Violation Penalties for AI Deepfakes

The FTC's civil enforcement of the TAKE IT DOWN Act took effect on May 19, 2026, requiring platforms to remove nonconsensual intimate imagery — including AI-generated deepfakes — within 48 hours, with penalties of $53,088 per violation. The agency promptly sent warning letters to major platforms and 'nudify' websites.

👤 FreeReadText Team · 📅 May 19, 2026

TAKE IT DOWN Act, FTC, Deepfake, Voice Cloning, Synthetic Media, Compliance

Read more →

Business

Poland Government Takes Stake in ElevenLabs, Launches AI Lab to Build Voice AI from Europe

The Government of Poland invests in ElevenLabs through its Vinci/BGK Group, joining Andreessen Horowitz and Sequoia as a strategic backer, while launching AI Lab Poland to nurture the next generation of voice AI companies with global ambition.

👤 FreeReadText Team · 📅 June 18, 2026

ElevenLabs, Poland, Government Investment, AI Lab, Europe, Voice AI

Read more →

Technology

ElevenLabs Launches Dubbing v2: Emotion-Preserving AI Dubbing Across 90+ Languages

ElevenLabs releases Dubbing v2, a breakthrough AI dubbing model that preserves the original speaker's emotion, tone, and pacing across 90+ languages by conditioning directly on the performance rather than just transcripts.

👤 FreeReadText Team · 📅 May 28, 2026

ElevenLabs, Dubbing v2, AI Dubbing, Translation, Localization, Audio AI

Read more →

Business

ElevenLabs Partners with UK Government to Bring Voice AI to Public Services, Doubles London Headquarters

ElevenLabs signs a Memorandum of Understanding with the UK's Department for Science, Innovation and Technology to deploy voice AI in public services, focusing on accessibility for the visually impaired, elderly, and linguistically diverse communities.

👤 FreeReadText Team · 📅 June 8, 2026

ElevenLabs, UK Government, Public Services, Accessibility, AI Policy, DSIT

Read more →

Technology

Rumik Launches Silk Mulberry 1.5: 'Describe a Voice Into Existence' with Plain-Language Prompts, Matching Commercial TTS Giants at 95% Lower Cost

Indian AI startup Rumik releases Silk Mulberry 1.5, a text-to-speech model that replaces preset voice menus with plain-language voice descriptions, achieving MOS scores competitive with ElevenLabs and Google at roughly $0.0046 per minute.

👤 FreeReadText Team · 📅 June 19, 2026

Rumik, Silk Mulberry, Indian AI, Text-to-Speech, Voice Design, Code-Switching

Read more →

Business

Michael Caine's AI Voice Narrates 13-Hour 'The Odyssey' Audiobook — 20 AI Characters, Original Score, Built by 4 Producers in 6 Weeks

ElevenLabs releases a cinematic audiobook of Homer's The Odyssey narrated by an authorized AI replica of Sir Michael Caine's voice, featuring ~20 AI-generated character voices, original music, and sound design — all produced by a four-person team in six weeks.

👤 FreeReadText Team · 📅 June 23, 2026

ElevenLabs, Michael Caine, AI Audiobook, Voice Cloning, Iconic Marketplace, Hollywood

Read more →

Technology

Five9 Launches Voice AI Agents with ElevenLabs, Deepgram, and OpenAI Under the Hood — Targeting Legacy IVR Replacement

Five9 unveils Voice AI Agents at Customer Contact Week 2026, combining ElevenLabs TTS, Deepgram ASR, and OpenAI reasoning in a proprietary three-model architecture built to replace scripted IVR systems with natural, human-like voice self-service.

👤 FreeReadText Team · 📅 June 23, 2026

Five9, Voice AI Agents, Contact Center, Agentic AI, Enterprise, Customer Experience

Read more →

Technology

xAI Launches Voice Agent Builder: No-Code Platform Harnesses Grok Voice to Beat GPT and Gemini in Telephony Benchmarks

Elon Musk's xAI enters the voice AI market with Voice Agent Builder, a no-code platform powered by Grok Voice Think Fast 1.0 that scores 67.3% on the τ-voice Bench — far outpacing Google Gemini 3.1 Flash Live (43.8%) and OpenAI GPT Realtime 1.5 (35.3%) — with pricing starting at $0.05 per minute.

👤 FreeReadText Team · 📅 July 1, 2026

xAI, Grok Voice, Voice Agent Builder, No-Code, Speech-to-Speech, Call Center AI

Read more →

Business

Bland.ai Raises $50M Series C After 180 Investor Rejections, Now Powers 3.5 Million Voice Calls Per Week

San Francisco voice AI startup Bland.ai closes a $50 million Series C led by Dell Technologies Capital, bringing total funding past $100 million — after founders were rejected by 180 investors who told them 'phone calls won't exist in a year.'

👤 FreeReadText Team · 📅 June 16, 2026

Bland.ai, Series C, Funding, Dell Technologies Capital, Voice Agents, Enterprise AI

Read more →

Technology

NetEase Youdao Releases Confucius4-TTS: Open-Source 14-Language Voice Cloning from Just 3 Seconds of Audio

Chinese edtech giant NetEase Youdao open-sources Confucius4-TTS under Apache 2.0, a 1.3B-parameter voice cloning model achieving 85%+ voice similarity from 3 seconds of audio across 14 languages — with no reference text needed for cross-lingual cloning.

👤 FreeReadText Team · 📅 June 23, 2026

NetEase Youdao, Confucius4-TTS, Open Source, Voice Cloning, Multilingual TTS, Flow Matching

Read more →

Regulation

NO FAKES Act Unanimously Passes Senate Judiciary Committee, Creating Federal Voice and Likeness Protection

The bipartisan NO FAKES Act clears the Senate Judiciary Committee by unanimous voice vote, creating a federal intellectual property right over AI-generated digital replicas of voice and visual likeness — with platform liability, 70-year post-mortem protections, and DMCA-style takedown provisions.

👤 FreeReadText Team · 📅 June 18, 2026

NO FAKES Act, Senate Judiciary Committee, Digital Replica, Voice Rights, Deepfake Regulation, Federal IP Law

Read more →

Business

Kotoba Technologies Raises $10 Million to Bring Real-Time Voice AI to East Asian Languages

San Francisco and Tokyo-based Kotoba Technologies raises an additional $10 million in seed funding led by Kindred Ventures, with Salesforce Ventures and Sony Innovation Fund participating, to expand its Koto voice AI model optimized for Japanese, Korean, and Chinese — languages spoken by roughly 1.6 billion people.

👤 FreeReadText Team · 📅 June 24, 2026

Kotoba Technologies, East Asian AI, Speech-to-Speech, Voice AI, Seed Funding, On-Device AI, Multilingual

Read more →

Technology

ViiTorVoice-NAR Goes Open Source: First TTS Model That Edits Single Words Inside Finished Audio

Chinese startup Yunshang Qulv releases ViiTorVoice-NAR under Apache 2.0, introducing word-level audio editing that replaces individual words without regenerating surrounding content — alongside sub-60ms latency and benchmark-leading accuracy on both English and Chinese.

👤 FreeReadText Team · 📅 July 1, 2026

ViiTorVoice, Open Source TTS, Word-Level Editing, Chinese AI, NAR Architecture, Apache 2.0

Read more →

Technology

OpenAI Launches GPT-Live: Full-Duplex Voice Model Lets ChatGPT Listen and Speak Simultaneously

OpenAI rolls out GPT-Live-1 and GPT-Live-1 mini globally, introducing full-duplex architecture that enables ChatGPT to listen and speak at the same time — with background task delegation to GPT-5.5 for complex reasoning, marking voice AI's shift from turn-based chat to continuous conversation.

👤 FreeReadText Team · 📅 July 8, 2026

OpenAI, GPT-Live, Full-Duplex, ChatGPT Voice, Real-Time AI, GPT-5.5

Read more →

Business

Gradium Raises $100M Seed Round Backed by Nvidia to Build Ultra-Low-Latency Voice AI

Paris-based voice AI startup Gradium, spun out of French research lab Kyutai, extends its seed round to over $100 million with Nvidia joining as a strategic investor — signaling that the race to eliminate latency in AI voice conversations is attracting infrastructure-level capital.

👤 FreeReadText Team · 📅 July 8, 2026

Gradium, Nvidia, Seed Funding, Kyutai, Voice AI, Paris, Ultra-Low Latency

Read more →

Business

Tencent Cloud Partners with Inworld AI to Deliver One-Stop Real-Time Voice AI with Sub-130ms Latency Across 100+ Languages

Tencent Cloud and Inworld AI announce a strategic partnership integrating Inworld's top-ranked TTS models into Tencent RTC's global infrastructure, creating a production-grade voice AI solution with sub-130ms first-chunk latency, 100+ language support, and voice cloning — backed by 3,200+ global edge nodes.

👤 FreeReadText Team · 📅 June 16, 2026

Tencent Cloud, Inworld AI, Real-Time Voice, TTS, Partnership, Enterprise AI, Tencent RTC

Read more →