A Pittsburgh-area man is not just giving people living with ALS a voice, he's giving them their own voice.
The best audio processing library built on Apple's MLX framework, providing fast and efficient text-to-speech (TTS), speech-to-text (STT), and speech-to-speech (STS) on Apple Silicon. Kokoro Fast, ...
Abstract: Recent advances in deep learning technology have enabled high-quality speech synthesis, and text-to-speech models are widely used in a variety of applications. However, even state-of-the-art ...
Small and fast: only 123M parameters. High-quality voice cloning: state-of-the-art performance in speaker similarity, intelligibility, and naturalness. Multi-lingual: support Chinese and English.
Abstract: Recent advances in automatic speech recognition (ASR) have led to substantial improvements in system accuracy and robustness, particularly in converting speech signals into text sequences.
Some results have been hidden because they may be inaccessible to you
Show inaccessible results