Audio samples - QuickVC: Any-To-Many Voice Conversion Using Inverse Short-Time Fourier Transform for Faster Conversion
Voice conversion (VC) results for different methods; QuickVC-nosr and QuickVC-sr are our proposed methods. Code: https://github.com/quickvc/QuickVC-VoiceConversion
All source utterances are from the LibriSpeech dataset and were unseen during training:
Target reference utterance: p226_006
No. | Source | QuickVC-nosr (proposed) | QuickVC-sr (proposed) | Diff-VCTK | BNE-PPG-VC | VQMIVC |
1 | 7433-89646-0027 | | | | | |
2 | 7607-89899-0000 | | | | | |
3 | 7769-99397-0053 | | | | | |
4 | 8200-278197-0010 | | | | | |
Target reference utterance: p229_006
No. | Source | QuickVC-nosr (proposed) | QuickVC-sr (proposed) | Diff-VCTK | BNE-PPG-VC | VQMIVC |
1 | 572-126495-0022 | | | | | |
2 | 807-124223-0065 | | | | | |
3 | 1168-134958-0091 | | | | | |
4 | 1691-142296-0022 | | | | | |
Target reference utterance: p246_011
No. | Source | QuickVC-nosr (proposed) | QuickVC-sr (proposed) | Diff-VCTK | BNE-PPG-VC | VQMIVC |
1 | 265-136853-0007 | | | | | |
2 | 428-125879-0020 | | | | | |
3 | 727-124443-0119 | | | | | |
4 | 1691-142296-0075 | | | | | |
Target reference utterance: p312_019
No. | Source | QuickVC-nosr (proposed) | QuickVC-sr (proposed) | Diff-VCTK | BNE-PPG-VC | VQMIVC |
1 | 811-130148-0027 | | | | | |
2 | 876-126411-0093 | | | | | |
3 | 1094-157767-0044 | | | | | |
4 | 6978-92936-0012 | | | | | |