Unmute turns a text LLM into a voice AI. At unmute.sh, that LLM is Mistral AI's Mistral-Small-3.2-24B, which makes the whole stack fully open-source. Play a quiz game with a snarky host, catch up on tech news, or just hang out and talk. Or modify it to do anything you want!
Kyutai
Technology, Information and Internet
Build and democratize Artificial General Intelligence through open science.
About us
Build and democratize Artificial General Intelligence through open science.
- Website
- https://kyutai.org/
- Industry
- Technology, Information and Internet
- Company size
- 11-50 employees
- Type
- Nonprofit
- Founded
- 2023
Employees at Kyutai
- Guillaume Rouzaud: Part-time HR Director | Kyutai 🧠🤖 | Join us
- Jennifer Coscas: General Counsel | Legal & Compliance | Data Privacy | Transformative technologies
- Alexandre Défossez: Chief exploration officer at Kyutai, formerly RS at FAIR Paris
- Aude Durand: Deputy CEO, iliad Group - #free #Scaleway #Play #iliaditalia
Updates
-
Kyutai TTS and Unmute are now open source! The text-to-speech is natural, customizable, and fast: it can serve 32 users with a 350ms latency on a single L40S. Try it out and get started on the project page: https://lnkd.in/eHmfhdaK We’re also releasing the code for unmute.sh, the modular voice AI system. Make your own personal assistant, make it role-play, give a voice to your agent, or connect it to external tools. You can make it fit onto a single GPU. https://lnkd.in/eVBNzFJS
-
Kyutai Speech-To-Text is now open-source! It’s streaming, supports batched inference, and runs blazingly fast: perfect for interactive applications. Check out the details here: https://lnkd.in/gUk_3-8G Today we are releasing two models. The first one is a 2.6B English-only model that beats Whisper Large v3 on benchmarks even though it’s a streaming model that doesn’t process all the audio at once. It can process 400 sequences in parallel on a single H100. The other model is a lightweight English/French 1B model optimized for real-time voice chat apps like https://unmute.sh . It comes with a semantic voice activity detector that predicts if you’re done talking or just pausing mid-sentence. The open-source releases of Kyutai Text-To-Speech and unmute.sh will follow soon!
-
Kyutai's memories from a busy Vivatech 2025:
1️⃣ Patrick Pérez with Grégoire Genest (Albert School) & Gilles Moyse (reciTAL) on KPMG's podcast hosted by Albane Liger-Belair (KPMG France)
2️⃣ Live demo of our voice open models (Unmute, Hibiki, MoshiVis)
3️⃣ Alexandre Défossez talking about our contribution to CMA Media & La Provence's Media Audio Experience -MAX- solution, with our new real-time speech generation technology
4️⃣ Patrick discussing with Yoav Shoham (AI21 Labs), Eliot Andres (Photoroom) & Laurent Sifre (H Company) at NVIDIA's GTC panel hosted by Meriem Bendris
5️⃣ Patrick on Olivier Frigara's OnRefaitVivatech (https://lnkd.in/eAcd-v8W) broadcast, with Guillaume Avrin
6️⃣ Sarah Hôte, Amélie Royer, Alexandre Défossez and Patrick at Scaleway booth
-
Kyutai reposted this
We will be at VivaTech with:
- Interactive demos of our groundbreaking voice AI technologies (Scaleway's booth, Wednesday and Thursday from 3pm to 5pm). Register on site or here: https://lnkd.in/eHgVSK2b
- Alexandre Défossez talking about our contribution to CMA Media & La Provence's Media Audio Experience -MAX- solution, with our new real-time speech generation technology (CMA CGM's booth, Wednesday at 4pm)
- Patrick Pérez discussing the challenges of building and sharing multimodal foundation models (panel at NVIDIA's GTC, Wednesday at 4pm, and at KPMG's booth, Thursday at 10am)
-
Talk to https://unmute.sh, the most modular voice AI around. Empower any text LLM with voice, instantly, by wrapping it with our new speech-to-text and text-to-speech. Any personality, any voice. Interruptible, smart turn-taking. We’ll open-source everything within the next few weeks. “But what about Moshi?” Last year we unveiled Moshi, the first audio-native model. While Moshi provides unmatched latency and naturalness, it doesn’t yet match the extended abilities of text models, such as function calling, stronger reasoning, and in-context learning. Unmute allows us to directly bring all of these from text to real-time voice conversations. More details in our blog post: https://lnkd.in/dV3xYJnP
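For readers who want to picture the "wrapping" idea, here is a minimal sketch of a cascaded voice pipeline: speech-to-text, then any text LLM, then text-to-speech. It is not Unmute's code. The `VoiceWrapper` class and the `fake_stt`, `fake_llm`, and `fake_tts` stubs are hypothetical names invented for illustration, and the sketch handles one whole turn at a time, whereas the system described above is streaming and interruptible.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: audio is represented as raw bytes purely for illustration.
Audio = bytes


@dataclass
class VoiceWrapper:
    """Hypothetical wrapper placing a text LLM between STT and TTS stages."""

    transcribe: Callable[[Audio], str]   # speech-to-text stage
    generate: Callable[[str], str]       # any text-in, text-out LLM
    synthesize: Callable[[str], Audio]   # text-to-speech stage

    def respond(self, user_audio: Audio) -> Audio:
        # One non-streaming turn: audio -> text -> reply text -> audio.
        text_in = self.transcribe(user_audio)
        text_out = self.generate(text_in)
        return self.synthesize(text_out)


# Toy stubs so the sketch runs without any model (all hypothetical).
def fake_stt(audio: Audio) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> Audio:
    return text.encode("utf-8")


bot = VoiceWrapper(fake_stt, fake_llm, fake_tts)
reply = bot.respond(b"hello")

# Swapping the LLM leaves the speech stages untouched.
shouty = VoiceWrapper(fake_stt, str.upper, fake_tts)
shouty_reply = shouty.respond(b"hi")
```

Because the LLM is just a text-to-text callable in this sketch, anything a text model can do (function calling, in-context learning) would sit inside `generate` without changes to the speech stages.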
-
Kyutai's CEO just appeared on Matthieu Stefani's "Generation DIY" podcast. If you speak French and want to learn more about Kyutai, modern AI, multimodal foundation models, voice-based interaction with machines, open science, robotaxis, and much more, listen to this insightful discussion between two enthusiasts. Big thanks to the GDIY team! https://lnkd.in/eRhvc632