Zonos uses 128-float embeddings for voices, and it seems so much nicer because you can just mix and match voices without changing the model.
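
To make "mix and match" concrete, here is a rough sketch of what blending two voices could look like if each voice is just a 128-float vector. This is not Zonos's actual API: `blend_voices` is a made-up helper, the embeddings are random stand-ins, and I'm assuming a plain weighted average gives a usable in-between voice, which may or may not hold for a given model.

```python
import random

EMBED_DIM = 128  # size mentioned above; not verified against Zonos itself


def blend_voices(embeddings, weights):
    """Return the weighted average of several same-length voice embeddings.

    Hypothetical helper for illustration; not part of any library.
    """
    if not embeddings or len(embeddings) != len(weights):
        raise ValueError("need one weight per embedding")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("embeddings must share one dimension")
    # Normalise by the weight sum so the result stays on the same scale.
    return [
        sum(w * e[i] for e, w in zip(embeddings, weights)) / total
        for i in range(dim)
    ]


# Random stand-ins for two real speaker embeddings (assumption).
rng = random.Random(0)
voice_a = [rng.gauss(0, 1) for _ in range(EMBED_DIM)]
voice_b = [rng.gauss(0, 1) for _ in range(EMBED_DIM)]

# 70% voice A, 30% voice B. The model weights are never touched.
mixed = blend_voices([voice_a, voice_b], [0.7, 0.3])
```

The point is that the new voice is only a new input vector, so the same model can be fed `mixed` in place of either original embedding without any retraining.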