Show HN: Three new Kitten TTS models – smallest less than 25MB
Kitten TTS has released three new open-source text-to-speech models with 80M, 40M, and 14M parameters, designed to run on-device without a GPU. The 80M model offers the highest quality, while the 14M model is claimed to reach a new state of the art in expressivity among similarly sized models. The release aims to narrow the gap between on-device and cloud TTS, so that production-ready voice applications can run on low-resource hardware such as Raspberry Pi boards, low-end smartphones, wearables, and browsers.

Kitten TTS (https://github.com/KittenML/KittenTTS) is an open-source series of tiny and expressive text-to-speech models for on-device applications. (We had a thread here last year: https://news.ycombinator.com/item?id=44807868)

Today we're releasing three new models with 80M, 40M and 14M parameters. The largest model (80M) has the highest quality. The 14M variant reaches a new SOTA in expressivity among similarly sized models, despite being less than 25MB. Here's a short demo: https://www.youtube.com/watch?v=ge3u5qblqZA

Most models are quantized to int8 + fp16, and they use ONNX at runtime. Our models are designed to run anywhere, e.g. Raspberry Pi, low-end smartphones, wearables, browsers, etc. No GPU required! This release aims to bridge the gap between on-device and cloud models for TTS applications. A multilingual model release is coming soon.

On-device AI is bottlenecked by one thing: a lack of tiny models that actually perform. Our goal is to open-source more models to run production-ready voice agents and apps entirely on-device.

We would love your feedback!

Comments URL: https://news.ycombinator.com/item?id=47441546
Points: 512
# Comments: 172
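
For a rough sense of how the parameter counts relate to file size under int8 + fp16 quantization, here is a back-of-the-envelope sketch. It only counts bytes for the weights; it is not a measurement of the released files, and the 50/50 split between int8 and fp16 weights is a hypothetical chosen for illustration, since the post does not say how the precisions are mixed.

```python
# Rough weight-storage estimate from parameter count and numeric precision.
# This is arithmetic only, not a measurement of the actual Kitten TTS files.

PARAM_COUNTS = {"80M": 80_000_000, "40M": 40_000_000, "14M": 14_000_000}
BYTES_PER_PARAM = {"int8": 1, "fp16": 2, "fp32": 4}


def weight_size_mb(num_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in MB (10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1_000_000


def mixed_size_mb(num_params: int, int8_fraction: float) -> float:
    """Size when a fraction of the weights is int8 and the rest is fp16.

    int8_fraction is a hypothetical knob; the post does not give the real split.
    """
    int8_params = num_params * int8_fraction
    fp16_params = num_params - int8_params
    total_bytes = (int8_params * BYTES_PER_PARAM["int8"]
                   + fp16_params * BYTES_PER_PARAM["fp16"])
    return total_bytes / 1_000_000


if __name__ == "__main__":
    for name, n in PARAM_COUNTS.items():
        print(
            f"{name}: int8 ~{weight_size_mb(n, 'int8'):.0f} MB, "
            f"fp16 ~{weight_size_mb(n, 'fp16'):.0f} MB, "
            f"assumed 50/50 mix ~{mixed_size_mb(n, 0.5):.0f} MB"
        )
```

By this estimate, 14M parameters take about 14 MB at int8 and about 28 MB at fp16, so a mixed-precision file under 25MB is plausible for the smallest model. Real files also carry graph structure and metadata, so actual sizes will differ.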