#4668 RuntimeError: No speech detected. Check audio & language.

2a02:8071* Posted at: 2 months ago

No speech detected. Check audio & language.
Traceback (most recent call last):
File "videotrans\task\_only_one.py", line 50, in run
File "videotrans\task\trans_create.py", line 390, in recogn
File "videotrans\recognition\__init__.py", line 293, in run
File "videotrans\recognition\_base.py", line 145, in run
RuntimeError: No speech detected. Check audio & language.

=
system:Windows-10-10.0.22631-SP0
version:v3.99
frozen:True
language:en
root_dir:C:/videotranspy

1 Replies

Floor #1 DeepSeek reply at 2 months ago

This error means the speech recognition engine did not find any human voice in the audio. The usual causes are a language mismatch, silent or empty audio, or overly aggressive VAD (voice activity detection) settings.

Solutions (try in order):

1. Verify source language – Make sure "Pronunciation Language" matches the language actually spoken in the video. Do not rely on "Auto" for video translation; select the correct language manually (e.g., English, Chinese). If you are unsure, use the "Batch audio/video to subtitle" panel, which supports automatic detection.
2. Check video audio – Confirm the video has an audible audio track by playing the file in a media player and checking that it is not muted or silent. If the audio is very quiet or noisy, enable the "Noise reduction" option on the main interface.
3. Adjust ASR parameters – Go to Menu → Tools/Options → Advanced Options → Speech Recognition Parameters and set:
   - Voice threshold: 0.5
   - Min speech duration (ms): 3000
   - Max speech duration (s): 6
   - Silent split duration (ms): 140
   - Enable "Merge short subs into neighbors" (optional)
4. Switch ASR model – Try a larger model such as large-v3 (if GPU memory allows) or medium. Avoid .en and distil- models unless the video is entirely in English.
5. Try "Separate voice and background" – Check this option on the main interface. It separates the voice from background noise before recognition.
6. Reduce GPU memory usage – In Advanced Options, set CUDA data type to int8 or float16, beam_size to 1, and best_of to 1, and enable "Pre-split audio" for whisper models.
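
The "Check video audio" step can also be done from a script instead of a media player. The following is a minimal sketch, not part of pyvideotrans: it assumes the ffmpeg and ffprobe executables are on PATH, and the function names and the -50 dB silence threshold are illustrative choices rather than values the application uses.

```python
import re
import subprocess


def audio_codecs(path):
    """Return the codec names of all audio streams ffprobe finds in `path`."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "a",
         "-show_entries", "stream=codec_name", "-of", "csv=p=0", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def parse_volumedetect(stderr_text):
    """Extract mean/max volume (dB) from ffmpeg's volumedetect report."""
    levels = {}
    for key in ("mean_volume", "max_volume"):
        match = re.search(rf"{key}:\s*(-?\d+(?:\.\d+)?) dB", stderr_text)
        if match:
            levels[key] = float(match.group(1))
    return levels


def measure_volume(path):
    """Run ffmpeg's volumedetect filter over the audio and parse its report."""
    proc = subprocess.run(
        ["ffmpeg", "-hide_banner", "-nostats", "-i", path,
         "-vn", "-af", "volumedetect", "-f", "null", "-"],
        capture_output=True, text=True,
    )
    # ffmpeg writes the filter report to stderr, not stdout.
    return parse_volumedetect(proc.stderr)


def looks_silent(levels, threshold_db=-50.0):
    """Treat the file as silent if its loudest sample is below the threshold.

    The threshold is an assumed, illustrative value; a missing reading
    is treated as silence.
    """
    return levels.get("max_volume", float("-inf")) < threshold_db
```

An empty list from `audio_codecs("input.mp4")` means the file has no audio track at all, and `looks_silent(measure_volume("input.mp4"))` returning `True` suggests the track exists but is effectively silent. Either result would explain the "No speech detected" error independently of the language setting.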

If none of the above works, please provide the last 30 lines of the log file in the logs/ folder (named with the current date) for further analysis.
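
To collect those lines quickly, something like the following sketch may help. It is not part of pyvideotrans. It assumes the logs folder sits under the installation root (for the report above that would be C:/videotranspy/logs, which is an inference from root_dir), and it picks the most recently modified file rather than guessing the exact file name.

```python
from collections import deque
from pathlib import Path


def newest_log(logs_dir):
    """Return the most recently modified file in `logs_dir`."""
    files = [p for p in Path(logs_dir).iterdir() if p.is_file()]
    if not files:
        raise FileNotFoundError(f"no log files found in {logs_dir}")
    return max(files, key=lambda p: p.stat().st_mtime)


def tail(path, n=30):
    """Return the last `n` lines of a text file without trailing newlines."""
    # errors="replace" keeps the script working if the log has odd bytes.
    with open(path, encoding="utf-8", errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]
```

For example, `print("\n".join(tail(newest_log("C:/videotranspy/logs"))))` prints the last 30 lines of the newest log, ready to paste into a reply.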
