Audio Dataset Manager
Audio Dataset Manager is a toolkit for preparing audio datasets for TTS (Text-to-Speech) training and voice cloning projects.
It grew out of the need to process dozens of hours of audiobook recordings for voice cloning, and it takes raw audio files through to training-ready datasets.
Features:
- Audio analysis and silence detection
- Format conversion (MP3/WAV)
- Intelligent splitting into training-ready clips (0.6–11 s)
- Auto-transcription powered by OpenAI Whisper
- Interactive dataset review UI
- JSON management with automatic backup and versioning
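
To give a feel for how silence detection and clip splitting can fit together, here is a minimal sketch in plain Python. It is not the tool's actual implementation: the function names (`frame_rms`, `find_silences`, `plan_clips`), the RMS threshold, the 20 ms frame size, and the 0.3 s minimum silence are all assumptions made for illustration. Only the 0.6–11 s clip bounds come from the feature list above.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS level of each non-overlapping frame of samples."""
    levels = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return levels


def find_silences(samples, sample_rate, frame_ms=20, threshold=0.01,
                  min_silence_s=0.3):
    """Return (start_s, end_s) spans where the frame RMS stays below threshold.

    threshold and min_silence_s are illustrative defaults, not tuned values.
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    frame_s = frame_len / sample_rate
    silences = []
    run_start = None
    # The trailing infinity is a sentinel that closes a silent run at the end.
    for i, level in enumerate(frame_rms(samples, frame_len) + [float("inf")]):
        if level < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if (i - run_start) * frame_s >= min_silence_s:
                silences.append((run_start * frame_s, i * frame_s))
            run_start = None
    return silences


def plan_clips(total_s, silences, min_s=0.6, max_s=11.0):
    """Cut at silence midpoints while keeping every clip within [min_s, max_s].

    A trailing fragment shorter than min_s is discarded.
    """
    cuts = [(a + b) / 2 for a, b in silences] + [total_s]
    clips = []
    start = 0.0
    for cut in cuts:
        # Hard-split long stretches that contain no usable silence.
        while cut - start > max_s:
            clips.append((start, start + max_s))
            start += max_s
        # Skip a cut that would leave a too-short clip; it merges with the next.
        if cut - start >= min_s:
            clips.append((start, cut))
            start = cut
    return clips


if __name__ == "__main__":
    rate = 1000  # toy sample rate, chosen to keep the example small
    loud = [0.5, -0.5]
    signal = loud * 1000 + [0.0] * 500 + loud * 1500  # 2 s, 0.5 s gap, 3 s
    gaps = find_silences(signal, rate)
    for clip_start, clip_end in plan_clips(len(signal) / rate, gaps):
        print(f"{clip_start:.2f}s -> {clip_end:.2f}s")
```

Cutting at the midpoint of each silence leaves a little padding on both sides of a clip, which tends to be gentler on the audio than cutting exactly where the speech ends. A real pipeline would read samples from decoded audio files and would probably need a threshold tuned to the recording's noise floor.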