This error occurs because you selected the "clone" role for TTS (voice cloning) but did not provide a reference audio file containing the voice to clone.
Solution:
Check your TTS settings: Go to the "TTS Settings" tab and select your TTS channel (e.g., F5-TTS, Qwen-TTS(local), CosyVoice, GPT-SoVITS).
For "clone" role: If you want to use the "clone" role to mimic the original video's voice, you must provide a reference audio file.
Method A (Recommended for video translation): In the main translation task interface, under "Advanced Options" -> "TTS Parameters", ensure the "Use original video audio as reference for cloning" option is checked. This will automatically use a segment of the original video's audio.
Method B (Manual): In the TTS channel settings, manually select a reference audio file (WAV format, 3-10 seconds of clear speech) in the designated field.
If you don't have a reference audio: Do not select the clone role. Instead, choose a specific built-in voice provided by your TTS service (e.g., "Vivian" for Qwen-TTS, or any other named voice for Edge-TTS/F5-TTS).
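Before choosing a file for Method B, it can help to confirm that it really is a WAV file within the suggested 3-10 second range. The following is a minimal sketch using only Python's standard library. The function names and default thresholds are illustrative assumptions and are not part of the application. It checks container format and duration only, not whether the speech is clear.

```python
import wave


def reference_audio_duration(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(path, min_seconds=3.0, max_seconds=10.0):
    """Check that the file is a readable WAV within the suggested length.

    The 3-10 second defaults mirror the guidance in Method B; the
    function itself is a hypothetical helper, not an application API.
    """
    try:
        duration = reference_audio_duration(path)
    except (wave.Error, EOFError, OSError):
        # Not a WAV file, truncated, or unreadable.
        return False
    return min_seconds <= duration <= max_seconds
```

For example, is_usable_reference("my_voice.wav") returns False for an MP3 renamed to .wav or for a clip that is too short or too long. Here "my_voice.wav" is a placeholder path.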
Important Note for Qwen-TTS(local):
The clone role requires you to have started the Voice Clone model (Qwen3-TTS-12Hz-0.6B-Base or Qwen3-TTS-12Hz-1.7B-Base). Starting the Custom Voice model or Voice Design model will cause this error when using clone.
If you get this error while testing in the TTS settings panel, clear the "Reference Audio" text box and test again. With the box empty, the test uses the built-in voice "Vivian" and connects to the Custom Voice model.