Audio Classification Model Python

DePasqualeOrg/mlx-audio-plus

The best audio processing library built on Apple's MLX framework, providing fast and efficient text-to-speech (TTS), speech-to-text (STT), and speech-to-speech (STS) on Apple Silicon. Kokoro Fast, ...

IEEE

Meta-Learning with Pretrained Audio Representations Enables One-Shot Acoustic Signal Classification

Few-shot acoustic signal classification remains a challenging problem due to the high diversity and variability of acoustic data and limited availability of labeled samples. While pretrained audio ...

WinBuzzer

Google Launches Gemini 3.1 Flash-Lite for Enterprise Scale

Google launches Gemini 3.1 Flash-Lite, its fastest and most cost-efficient AI model, at $0.25 per million tokens and 2.5x faster than Gemini 2.5 Flash.

IEEE

SHMamba: Structured Hyperbolic State Space Model for Audio-Visual Question Answering

Abstract: The Audio-Visual Question Answering (AVQA) task holds significant potential for applications. Compared to traditional unimodal approaches, the multi-modal input of AVQA makes feature ...
