English

Acoustic Models

| Model | Sample rate (kHz) | Tokenizer | Dataset |
|---|---|---|---|
| FastSpeech2 EN v2 | 22.05 | Character-based | Azure |
| FastSpeech2 EN v3 | 44.10 | g2p_en (ARPA) | Azure |
| FastSpeech2 MFA EN v2 | 22.05 | g2p_en (ARPA) | Azure |
| FastSpeech2 MFA EN v3 | 44.10 | gruut (IPA) | Azure |
| FastSpeech2 MFA EN v4 | 44.10 | gruut (IPA) | Azure (Mastered) |
| FastSpeech2 MFA EN ESD Angry | 44.10 | gruut (IPA) | Emotional Speech Dataset - Angry |
| LightSpeech MFA EN | 44.10 | gruut (IPA) | Azure (Mastered) |
| LightSpeech MFA EN v2 | 44.10 | gruut (IPA) | Azure (Mastered) |
| LightSpeech MFA EN v3 | 44.10 | gruut (IPA) | Azure (Mastered) |
| LightSpeech MFA EN ESD | 44.10 | gruut (IPA) | Emotional Speech Dataset - 0013 |

Vocoder Models

| Model | Sample rate (kHz) | Dataset |
|---|---|---|
| MB-MelGAN EN | 22.05 | Azure |
| MB-MelGAN HiFi EN | 22.05 | Azure |
| MB-MelGAN HiFi PostNets EN | 22.05 | Azure |
| MB-MelGAN HiFi PostNets EN v2 | 22.05 | Azure |
| MB-MelGAN HiFi PostNets EN v3 | 44.10 | Azure |
| MB-MelGAN HiFi PostNets EN v5 | 44.10 | Azure |
| MB-MelGAN HiFi PostNets EN v6 | 44.10 | Azure (Mastered) |
| MB-MelGAN HiFi PostNets EN v7 | 44.10 | Azure (Mastered) |
| MB-MelGAN HiFi PostNets EN v8 | 44.10 | Azure (Mastered) |
| MB-MelGAN HiFi PostNets EN ESD Angry | 44.10 | Emotional Speech Dataset - Angry |
| MB-MelGAN HiFi PostNets EN ESD | 44.10 | Emotional Speech Dataset - 0013 |