Model Evaluation

Generate Audio Samples
Hugging Face model ID. For Orpheus: a fine-tuned model repo. For Chatterbox: your fine-tuned repo (with t3_finetuned.pt) or ResembleAI/chatterbox for the base model. For Router TTS: the model ID is filled in automatically (requires a Router API key). For Kokoro: enter kokoro.
Pre-trained voices from the Trelis Piper collection. Select "Custom" to enter your own model ID (e.g. a fine-tuned model).
79 voices across 9 languages. See VOICES.md for samples.

One prompt per line. If left empty, prompts come from the dataset or the default test prompts.
Dataset with a text (or transcription) column. If the Prompts field is filled, it takes priority.
Limited to 50 samples. Sign in for more.
Used for TTS generation (Chatterbox/Piper) and for ASR round-trip normalization.
Orpheus only. Must match the speaker name used during training.

Router ASR model for round-trip evaluation. Transcribes generated audio back to text and computes WER/CER. Leave empty to skip.
Dataset column containing the ASR transcription of the ground-truth voice recording. CER/WER is computed against this instead of the 'text' column — useful when the speaker didn't read the prompt verbatim. Auto-detected if a reference_asr column exists.
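WER and CER are standard edit-distance metrics over words and characters respectively. A minimal sketch of how they are typically computed (the app's actual round-trip evaluation likely normalizes text first, so exact scores may differ):

```python
def edit_distance(ref, hyp):
    # Levenshtein distance over two token sequences (one-row DP).
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            # prev holds the old dp[j-1] (substitution cell).
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution
    return dp[-1]

def wer(reference: str, hypothesis: str) -> float:
    # Word error rate: edits over word tokens, divided by reference length.
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp) / max(len(ref), 1)

def cer(reference: str, hypothesis: str) -> float:
    # Character error rate: same distance, over characters.
    return edit_distance(list(reference), list(hypothesis)) / max(len(reference), 1)
```

With a reference_asr column present, the reference string is that column's transcription rather than the original prompt text.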
ElevenLabs only. Controls whether numbers, units, and abbreviations are expanded to spoken words before synthesis.

Controls maximum audio length (~84 tokens per second): 2560 tokens ≈ 30s, 5040 ≈ 60s.
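The ~84 tokens/sec figure implies a simple conversion between token budget and audio duration. A quick sketch (the rate is approximate and model-dependent):

```python
TOKENS_PER_SECOND = 84  # approximate audio-token rate from the note above

def max_seconds(max_tokens: int) -> float:
    # Approximate audio duration a given token budget allows.
    return max_tokens / TOKENS_PER_SECOND

def tokens_for(seconds: float) -> int:
    # Token budget needed for a target duration.
    return round(seconds * TOKENS_PER_SECOND)
```

For example, tokens_for(60) gives 5040, matching the 60s preset.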

Required for private models or pushing results. Get token
Saves generated audio samples as a playable HF dataset
Current Job
No job

No evaluation running. Submit a job to see progress here.

Evaluation History
Time | Model | Samples | WER / CER | Output | Cost | Status
No evaluations yet