GPT-SoVITS: AI Cloning with 1-Minute Voice Samples
GPT-SoVITS is a model for voice conversion and text-to-speech timbre cloning that needs only a small voice sample. It supports inference in Mandarin, English, and Japanese. According to the developers' tests, a sample as short as five seconds is enough to produce a voice clone with 80% to 95% similarity. A one-minute sample improves quality considerably: the output closely resembles a real human voice and can be used to build a higher-quality text-to-speech model.