🎙️ Benchmarking Spoof-SUPERB Classifiers Built on S3PRL Embeddings
Overview. Comparison of models across all datasets. A lower equal error rate (EER) is better.
wav2vec 2.0 Large | 17.409 | 11.693 | 14.096 | 11.527 | 14.394 | 20.073 | 45.392 | 29.598 | 12.089 |
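For reference, the EER is the operating point at which the false-acceptance rate and the false-rejection rate are equal. The sketch below shows one simple way to estimate it from raw detector scores. It is not the benchmark's own scoring code: the function name, the percent scaling, and the convention that a higher score means "more likely bona fide" are all assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate the EER in percent (assumes higher score = more likely bona fide)."""
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a bona fide utterance scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: a spoofed utterance scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return 100.0 * eer


# Toy scores (hypothetical, not taken from the benchmark).
toy_bonafide = [0.9, 0.8, 0.7, 0.4]
toy_spoof = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(toy_bonafide, toy_spoof))
```

With only a handful of scores the estimate is coarse; evaluation toolkits typically interpolate between thresholds instead of picking the closest one.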
TTS stress-test. A lower TNR means the attack is harder to detect; a higher TNR means it is easier.
Avg Mean per attack | ASVSpoof 5 Eval | tts_models_bn_custom_vits-male | 0.0982 | ASVSpoof 2019LA | STYLETTS2 | 1 |
APC | MLAAD | facebook_mms-tts-swe | 0.197 | ASVSpoof 2019LA | A13 | 1 |
Audio Albert | Famous Figures | MASKGCT | 0.0982 | ASVSpoof 2019LA | A09 | 1 |
Avg Mean per attack | ASVSpoof 5 Eval | A31 | 0.93 | ASVSpoof 2019LA | A07 | 1 |
Byol-Audio | MLAAD | tts_models_bg_cv_vits | 0.3949 | ASVSpoof 2019LA | A07 | 1 |
Data2Vec | ASVSpoof 2019LA | A19 | 0.8458 | ASVSpoof 2019LA | A09 | 1 |
DeCoAR 2.0 | ASVSpoof 2019LA | A19 | 0.2257 | ASVSpoof 2019LA | A13 | 1 |
FBANK | Famous Figures | XTTSV2 | 0.7425 | ASVSpoof 2019LA | A07 | 1 |
HuBERT Base | MLAAD | MatchaTTS | 0.18 | ASVSpoof 2019LA | A09 | 1 |
HuBERT Large | MLAAD | tts_models_bn_custom_vits-male | 0.459 | ASVSpoof 2019LA | A07 | 1 |
MAE_AST_FRAME | ASVSpoof 2019LA | A18 | 0.9607 | ASVSpoof 2019LA | A09 | 1 |
MR - HUBERT | MLAAD | SSRSPEECH | 0.386 | ASVSpoof 2019LA | A09 | 1 |
Mockingjay | ASVSpoof 2021LA | A18 | 0.2419 | SpoofCeleb | A15 | 1 |
NPC | ASVSpoof 2019LA | A19 | 0.0975 | Famous Figures | STYLETTS2 | 1 |
SSAST | ASVSpoof 2019LA | A18 | 0.8505 | ASVSpoof 2019LA | A17 | 1 |
TERA | ASVSpoof 2019LA | A19 | 0.1313 | ASVSpoof 2019LA | A13 | 1 |
Unispeech-SAT | MLAAD | E2TTS | 0.492 | ASVSpoof 2019LA | A07 | 1 |
VQ-APC | ASVSpoof 2019LA | A19 | 0.267 | ASVSpoof 2019LA | A09 | 1 |
WAVLABLM | ASVSpoof 2019LA | A07 | 0.9512 | ASVSpoof 2019LA | A07 | 1 |
WavLM Large | MLAAD | microsoft_speecht5_tts | 0.448 | ASVSpoof 2019LA | A09 | 1 |
wav2vec | MLAAD | tts_models_bn_custom_vits-male | 0.149 | ASVSpoof 2019LA | A13 | 1 |
wav2vec 2.0 Base | MLAAD | optispeech | 0.094 | ASVSpoof 2019LA | A09 | 1 |
wav2vec 2.0 Large | MLAAD | tts_models_en_jenny_jenny | 0.002 | ASVSpoof 2019LA | A13 | 1 |
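The rows above appear to pair each model with its lowest-TNR and highest-TNR attack. A minimal sketch of that lookup follows, reading TNR as the true negative rate with spoofed audio as the negative class, i.e. the fraction of spoofed utterances from one attack that the detector rejects. That reading, the function names, and the record layout are assumptions rather than anything confirmed by the dashboard.

```python
from collections import defaultdict


def tnr_per_attack(records):
    """records: iterable of (attack_label, predicted_spoof) for spoofed utterances only."""
    rejected = defaultdict(int)
    total = defaultdict(int)
    for attack, predicted_spoof in records:
        total[attack] += 1
        # A spoofed utterance flagged as spoof counts as a true negative (assumed convention).
        rejected[attack] += bool(predicted_spoof)
    return {attack: rejected[attack] / total[attack] for attack in total}


def hardest_and_easiest(tnr):
    """Return ((attack, tnr), (attack, tnr)) for the lowest and highest TNR."""
    hardest = min(tnr, key=tnr.get)
    easiest = max(tnr, key=tnr.get)
    return (hardest, tnr[hardest]), (easiest, tnr[easiest])


# Hypothetical predictions for two attack labels.
toy_records = [
    ("A07", True), ("A07", True),
    ("A19", True), ("A19", False), ("A19", False), ("A19", False),
]
toy_tnr = tnr_per_attack(toy_records)
print(hardest_and_easiest(toy_tnr))
```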
Codec robustness. Compare models across compression codecs and bitrates.
wav2vec 2.0 Large | 31.340054 | 49.625579 | 49.781484 | 50.316419 | 49.912558 | 49.887509 | 37.222036 | 49.594901 | 49.712076 | 49.529384 | 30.543694 | 49.80951727 |
FBANK | 49.49707 | 49.625579 | 49.781484 | 50.316419 | 49.912558 | 49.887509 | 49.97767 | 49.594901 | 49.712076 | 49.529384 | 50.07004 | 49.80951727 |
APC | 31.340054 | 33.188461 | 34.305018 | 36.543327 | 31.858528 | 31.215447 | 37.222036 | 33.110337 | 32.18822 | 34.301134 | 30.543694 | 33.25602327 |
VQ-APC | 28.655992 | 30.029592 | 30.772724 | 32.306456 | 29.061595 | 29.089432 | 33.165765 | 30.078582 | 30.676141 | 33.080142 | 29.150769 | 30.55156273 |
NPC | 36.876927 | 37.923511 | 38.32099 | 39.29323 | 37.017766 | 37.453546 | 39.14647 | 37.144912 | 38.479985 | 38.718392 | 37.317 | 37.97206627 |
Mockingjay | 38.913719 | 40.655588 | 40.43468 | 41.37652 | 40.472461 | 39.259936 | 41.728594 | 39.352952 | 40.517112 | 40.804753 | 39.518609 | 40.27590218 |
mockingjay_960hr | 36.817624 | 38.118549 | 37.979941 | 39.222939 | 37.428867 | 37.288646 | 39.280355 | 37.353949 | 38.850529 | 38.94321 | 37.069719 | 38.03221164 |
Audio Albert | 35.325287 | 35.86034 | 36.003732 | 37.182635 | 35.438535 | 35.396253 | 38.075301 | 34.677749 | 35.6688 | 36.430656 | 34.120705 | 35.83454482 |
TERA | 33.607549 | 31.034993 | 34.014535 | 35.604541 | 29.321341 | 32.018628 | 36.588002 | 34.019623 | 33.611353 | 37.930085 | 31.019404 | 33.52455036 |
DeCoAR 2.0 | 28.529431 | 28.31857 | 29.680875 | 32.264786 | 26.557057 | 27.061129 | 32.921556 | 32.169792 | 32.154674 | 33.040949 | 28.581392 | 30.11638282 |
wav2vec | 29.998023 | 30.307099 | 30.349986 | 34.026447 | 26.833724 | 27.938603 | 34.755441 | 34.265228 | 30.931286 | 32.782108 | 28.851757 | 31.00360927 |
modified CPC | 48.355427 | 50.40452 | 49.619413 | 49.355483 | 48.322884 | 48.909254 | 48.84481 | 48.828152 | 49.868115 | 49.529384 | 47.736571 | 49.07036482 |
wav2vec 2.0 Base | 16.928347 | 15.815289 | 19.939844 | 22.719462 | 13.318978 | 14.538094 | 25.141123 | 23.548239 | 17.958425 | 21.275621 | 15.313364 | 18.77243509 |
wav2vec 2.0 Large | 20.043501 | 15.142413 | 20.269212 | 22.306708 | 13.28765 | 15.220144 | 25.848676 | 25.038288 | 17.84456 | 23.095327 | 15.227672 | 19.39310464 |
HuBERT Base | 22.79217 | 21.235014 | 24.028436 | 26.111305 | 18.671342 | 21.367636 | 27.584063 | 28.025267 | 24.896198 | 27.458965 | 22.089328 | 24.02361127 |
HuBERT Large | 19.794356 | 18.293455 | 20.804762 | 24.729829 | 14.980429 | 17.568071 | 26.691438 | 26.280078 | 22.259837 | 25.121576 | 19.183824 | 21.42796864 |
MR - HUBERT | 22.267483 | 20.530716 | 23.513635 | 28.698476 | 14.946459 | 17.549257 | 31.363388 | 27.091603 | 24.637987 | 28.989773 | 18.695907 | 23.48042582 |
XLS-R | 15.532257 | 10.036913 | 13.935981 | 20.297657 | 4.40485 | 8.990686 | 24.755176 | 19.904461 | 9.660012 | 15.165494 | 5.850531 | 13.50309255 |
Unispeech-SAT | 13.45858 | 9.684763 | 14.240682 | 20.479909 | 5.891419 | 8.367183 | 24.961253 | 20.648158 | 10.896627 | 17.698967 | 7.696082 | 14.00214755 |
Data2Vec | 25.623887 | 24.963341 | 25.890528 | 27.409441 | 22.933958 | 24.253064 | 28.420297 | 31.222611 | 29.040037 | 30.640804 | 27.473697 | 27.07924227 |
WAVLABLM | 20.387567 | 17.454317 | 20.31071 | 25.318366 | 14.517115 | 16.553153 | 28.554181 | 25.313826 | 19.643795 | 24.469242 | 17.21729 | 20.88541473 |
WavLM Large | 15.732645 | 14.719593 | 17.465663 | 23.164785 | 11.131556 | 12.455848 | 25.692462 | 23.559102 | 18.734018 | 21.937092 | 14.368833 | 18.08741791 |
SSAST | 28.782553 | 29.003244 | 29.744428 | 33.458746 | 25.977552 | 22.303658 | 34.588724 | 28.602049 | 27.864956 | 31.975527 | 24.723138 | 28.82041591 |
Byol-Audio | 32.422394 | 33.022198 | 34.386707 | 37.376621 | 31.093698 | 31.460778 | 37.980777 | 35.37667 | 34.578063 | 35.996648 | 32.822775 | 34.22884809 |
MAE_AST_FRAME | 25.382637 | 25.730484 | 27.663168 | 29.863815 | 22.850417 | 23.438173 | 30.780541 | 29.067554 | 26.628341 | 30.476098 | 25.280308 | 27.01468509 |
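In the rows checked (XLS-R, Unispeech-SAT, FBANK), the last value equals the arithmetic mean of the eleven values before it. The sketch below recomputes that mean and ranks the models by it. The column meanings are not labelled in this extract, so treating the values as EER percentages where lower is better is an assumption, and `rank_by_mean` is a hypothetical helper name.

```python
from statistics import mean

# Per-condition values copied from three rows of the table above
# (the trailing mean column is left out so it can be recomputed).
codec_rows = {
    "XLS-R": [15.532257, 10.036913, 13.935981, 20.297657, 4.40485, 8.990686,
              24.755176, 19.904461, 9.660012, 15.165494, 5.850531],
    "Unispeech-SAT": [13.45858, 9.684763, 14.240682, 20.479909, 5.891419, 8.367183,
                      24.961253, 20.648158, 10.896627, 17.698967, 7.696082],
    "FBANK": [49.49707, 49.625579, 49.781484, 50.316419, 49.912558, 49.887509,
              49.97767, 49.594901, 49.712076, 49.529384, 50.07004],
}


def rank_by_mean(rows):
    """Return (model, mean) pairs sorted from lowest to highest mean."""
    return sorted(((model, mean(values)) for model, values in rows.items()),
                  key=lambda pair: pair[1])


ranking = rank_by_mean(codec_rows)
for model, avg in ranking:
    print(f"{model:15s} {avg:.8f}")
```

Ranking by the mean hides how much a model degrades on its worst condition, so the per-condition columns are still worth reading alongside it.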
Per-attack winners. Shows top-performing model per attack.
ASVSpoof_2019LA | tts_models_multilingual_multi-dataset_xtts_v1.1 | Unispeech-SAT | 0.9998 |
ASVSpoof_2019LA | A07 | FBANK | 1 |
ASVSpoof_2019LA | A08 | FBANK | 1 |
ASVSpoof_2019LA | A09 | FBANK | 1 |
ASVSpoof_2019LA | A10 | FBANK | 1 |
ASVSpoof_2019LA | A11 | FBANK | 1 |
ASVSpoof_2019LA | A12 | FBANK | 1 |
ASVSpoof_2019LA | A13 | FBANK | 1 |
ASVSpoof_2019LA | A14 | FBANK | 1 |
ASVSpoof_2019LA | A15 | FBANK | 1 |
ASVSpoof_2019LA | A16 | FBANK | 1 |
ASVSpoof_2019LA | A17 | FBANK | 1 |
ASVSpoof_2019LA | A18 | FBANK | 0.9998 |
ASVSpoof_2019LA | A19 | FBANK | 1 |
ASVSpoof_2021DF | A07 | FBANK | 1 |
ASVSpoof_2021DF | A08 | FBANK | 1 |
ASVSpoof_2021DF | A09 | FBANK | 1 |
ASVSpoof_2021DF | A10 | FBANK | 1 |
ASVSpoof_2021DF | A11 | FBANK | 1 |
ASVSpoof_2021DF | A12 | FBANK | 1 |
ASVSpoof_2021DF | A13 | FBANK | 1 |
ASVSpoof_2021DF | A14 | FBANK | 1 |
ASVSpoof_2021DF | A15 | FBANK | 1 |
ASVSpoof_2021DF | A16 | FBANK | 1 |
ASVSpoof_2021DF | A17 | FBANK | 1 |
ASVSpoof_2021DF | A18 | FBANK | 1 |
ASVSpoof_2021DF | A19 | FBANK | 1 |
ASVSpoof_2021DF | HUB-B00 | FBANK | 1 |
ASVSpoof_2021DF | HUB-B01 | FBANK | 1 |
ASVSpoof_2021DF | HUB-D01 | FBANK | 1 |
ASVSpoof_2021DF | HUB-D02 | FBANK | 1 |
ASVSpoof_2021DF | HUB-D03 | FBANK | 1 |
ASVSpoof_2021DF | HUB-D04 | FBANK | 1 |
ASVSpoof_2021DF | HUB-D05 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N03 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N04 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N05 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N06 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N07 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N08 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N09 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N10 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N11 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N12 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N13 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N14 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N15 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N16 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N17 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N18 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N19 | FBANK | 1 |
ASVSpoof_2021DF | HUB-N20 | FBANK | 1 |
ASVSpoof_2021DF | SPO-B00 | FBANK | 1 |
ASVSpoof_2021DF | SPO-B01 | FBANK | 1 |
ASVSpoof_2021DF | SPO-N03 | FBANK | 1 |
ASVSpoof_2021DF | SPO-N04 | FBANK | 1 |
ASVSpoof_2021DF | SPO-N05 | FBANK | 1 |
ASVSpoof_2021DF | SPO-N06 | FBANK | 1 |
ASVSpoof_2021DF | SPO-N10 | FBANK | 1 |
ASVSpoof_2021DF | SPO-N11 | FBANK | 1 |
ASVSpoof_2021DF | SPO-N12 | FBANK | 1 |
ASVSpoof_2021DF | SPO-N13 | FBANK | 1 |
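One way such a winners list could be produced is an argmax over models for each (dataset, attack) pair. The sketch below is an illustration under assumptions: the data layout, the function name, the model names, and the "higher is better" direction are hypothetical, and the tie-breaking rule (first model encountered) is a choice of this sketch, not a documented behavior of the dashboard.

```python
def winners_per_attack(scores):
    """scores: {(dataset, attack): {model: value}}; higher value is assumed better."""
    result = {}
    for key, by_model in scores.items():
        # max() returns the first maximal key, so ties go to the model listed first.
        best = max(by_model, key=by_model.get)
        result[key] = (best, by_model[best])
    return result


# Hypothetical scores for two attacks and two placeholder models.
toy_scores = {
    ("ASVSpoof_2019LA", "A07"): {"model_a": 1.0, "model_b": 1.0},
    ("ASVSpoof_2019LA", "A18"): {"model_a": 0.75, "model_b": 0.9998},
}
print(winners_per_attack(toy_scores))
```

When several models reach the same top score, as in the first toy entry, the reported winner depends entirely on the tie-breaking rule.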
Pick a dataset, then optionally pick a specific label.
Rows = Models, Columns = ALL labels for that dataset (attacks, means, TTS names, etc.).
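The layout described above (models as rows, labels as columns, filtered by dataset and optionally by label) can be sketched with plain dictionaries. The record format `(model, dataset, label, value)`, the function name `pivot`, and the sample values are hypothetical; the dashboard's actual data source is not shown here.

```python
def pivot(rows, dataset, label=None):
    """Build {model: {label: value}} for one dataset, optionally for one label only."""
    table = {}
    for model, ds, lab, value in rows:
        if ds != dataset:
            continue  # keep only the selected dataset
        if label is not None and lab != label:
            continue  # optional second filter on a specific label
        table.setdefault(model, {})[lab] = value
    return table


# Hypothetical long-format records with placeholder model names.
toy_rows = [
    ("model_a", "MLAAD", "MatchaTTS", 0.5),
    ("model_a", "MLAAD", "E2TTS", 0.75),
    ("model_b", "MLAAD", "MatchaTTS", 0.25),
    ("model_b", "SpoofCeleb", "A15", 1.0),
]
print(pivot(toy_rows, "MLAAD"))
print(pivot(toy_rows, "MLAAD", label="MatchaTTS"))
```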