#3191 TaskCfg(cache_folder='C:/Users/ROG/OneDrive/Desktop/Pyvd/tmp/23396/speech2text', target_dir='c:/users/rog/onedrive/deskt

82.121* Posted at: 9 hours ago 👁10

ASR Error:[faster-whisper (Local)] Runtime error: 84_Monday_Group_A_QA-MT_205min_P1.mp3No subs recognized. Check Audio/Lang.:
Traceback (most recent call last):
File "videotrans\task\job.py", line 113, in run
File "videotrans\task\_speech2text.py", line 161, in recogn
RuntimeError: 84_Monday_Group_A_QA-MT_205min_P1.mp3No subs recognized. Check Audio/Lang.

TaskCfg(cache_folder='C:/Users/ROG/OneDrive/Desktop/Pyvd/tmp/23396/speech2text', target_dir='c:/users/rog/onedrive/desktop/pyvd/output/recogn', remove_noise=False, is_separate=False, detect_language='zh-cn', subtitle_language=None, source_language=None, target_language=None, source_language_code=None, target_language_code=None, source_sub=None, target_sub='c:/users/rog/onedrive/desktop/pyvd/output/recogn/84_Monday_Group_A_QA-MT_205min_P1.srt', source_wav=None, source_wav_output=None, target_wav=None, target_wav_output=None, subtitles=None, novoice_mp4=None, noextname='84_Monday_Group_A_QA-MT_205min_P1', shibie_audio='C:/Users/ROG/OneDrive/Desktop/Pyvd/tmp/23396/speech2text/84_Monday_Group_A_QA-MT_205min_P1-1770641097.0838702.wav', targetdir_mp4=None, instrument=None, vocal=None, back_audio=None, background_music=None, app_mode='biaozhun', subtitle_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role=None, copysrt_rawvideo=False, clear_cache=False, translate_type=None, tts_type=None, recogn_type=0, model_name='large-v3-turbo', split_type=0, voice_autorate=False, video_autorate=False, cuda=True, name='C:/Users/ROG/OneDrive/Desktop/Doing sub/audio.7z/84_Monday_Group_A_QA-MT_205min_P1.mp3', basename='84_Monday_Group_A_QA-MT_205min_P1.mp3', ext='mp3', dirname='C:/Users/ROG/OneDrive/Desktop/Doing sub/audio.7z', shound_del_name=None, uuid='fde49fe3a4', remove_silent_mid=False, align_sub_audio=True, enable_diariz=False, nums_diariz=0, rephrase=0, only_out_mp4=False)


=
system:Windows-10-10.0.19045-SP0
version:v3.89
frozen:True
language:en
root_dir:C:/Users/ROG/OneDrive/Desktop/Pyvd

1 Reply

The error means that faster-whisper finished without recognizing any speech in your audio file, so no subtitle lines were produced. This is typically caused by an incorrect language setting, audio quality issues, or a model that failed to download or load.
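The settings relevant to these causes are all in the TaskCfg line you posted. As a convenience, here is a minimal sketch that pulls them out of such a dump. The function name `taskcfg_fields` and the choice of fields are my own for illustration and are not part of pyVideoTrans.

```python
import re

# Fields from a posted TaskCfg dump that bear on speech recognition.
FIELDS = ("detect_language", "model_name", "cuda", "remove_noise", "recogn_type")


def taskcfg_fields(dump: str, names=FIELDS) -> dict:
    """Pull simple key=value pairs out of a TaskCfg(...) repr string.

    Hypothetical helper: values are returned as plain strings.
    """
    found = {}
    for name in names:
        # Matches name='quoted value' or name=bare_value (True, None, 0, ...).
        m = re.search(rf"\b{re.escape(name)}=(?:'([^']*)'|([^,)\s]+))", dump)
        if m:
            found[name] = m.group(1) if m.group(1) is not None else m.group(2)
    return found


# Shortened sample built from the values in this thread.
sample = (
    "TaskCfg(remove_noise=False, detect_language='zh-cn', recogn_type=0, "
    "model_name='large-v3-turbo', cuda=True, basename='audio.mp3')"
)
for key, value in taskcfg_fields(sample).items():
    print(f"{key} = {value}")
```

In your case this shows the language fixed to zh-cn, noise removal off, CUDA on, and the large-v3-turbo model, which are the settings the steps below ask you to check.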

  1. Check the source language setting: The task configuration shows detect_language='zh-cn', so recognition is fixed to Chinese. Make sure the speech in 84_Monday_Group_A_QA-MT_205min_P1.mp3 is actually Chinese. If it is in a different language, select that language as the recognition language and run the task again.
  2. Verify audio content and quality: Open the MP3 file in a media player to confirm it contains clear speech and is not corrupted, silent, or mostly music/noise. If there is heavy background noise, try enabling the "Remove Noise" option in the software's advanced settings before recognition (your configuration shows remove_noise=False).
  3. Ensure the model is correctly downloaded: The error can occur if the large-v3-turbo model did not download completely. Check the models/ directory in your pyVideoTrans installation folder. The model folder must be named exactly as the software expects (e.g., models--Systran--faster-whisper-large-v3-turbo). If it is missing or incomplete, you may need to download the model manually from a HuggingFace mirror and place it in that directory.
  4. Try a different model or settings: As a test, switch to another model such as large-v2 or medium in the software's recognition settings. Also try disabling CUDA acceleration temporarily (your configuration shows cuda=True) to rule out GPU-related loading issues.
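Steps 2 and 3 can also be checked with a short script. The sketch below is only an illustration: the function names are my own, it uses the Python standard library only, it assumes the audio has been converted to a 16-bit PCM WAV (for example the temporary .wav under the tmp folder, if it still exists), and it assumes the model weights are stored in a file named model.bin somewhere inside the model folder, which is how faster-whisper models are normally packaged.

```python
import math
import struct
import wave
from pathlib import Path


def wav_rms_dbfs(path, max_seconds=600):
    """RMS level in dBFS of the first max_seconds of a 16-bit PCM WAV.

    Returns -inf for digital silence. Very low values (for example
    below about -60 dBFS) suggest there is little or no speech.
    """
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        channels = wf.getnchannels()
        remaining = int(wf.getframerate() * max_seconds)  # frames left to read
        total = 0
        count = 0
        while remaining > 0:
            frames = wf.readframes(min(65536, remaining))
            if not frames:
                break
            n = len(frames) // 2  # number of 16-bit samples in this chunk
            samples = struct.unpack(f"<{n}h", frames)
            total += sum(s * s for s in samples)
            count += n
            remaining -= n // channels
    if count == 0 or total == 0:
        return float("-inf")
    return 20 * math.log10(math.sqrt(total / count) / 32768)


def find_model_bins(models_dir, fragment="large-v3-turbo"):
    """List (path, size_in_bytes) for each model.bin found under folders
    in models_dir whose name contains fragment."""
    hits = []
    for folder in Path(models_dir).glob(f"*{fragment}*"):
        if folder.is_dir():
            for weights in folder.rglob("model.bin"):
                hits.append((weights, weights.stat().st_size))
    return hits
```

For example, `wav_rms_dbfs("your_audio.wav")` gives the level of the first ten minutes, and `find_model_bins("C:/Users/ROG/OneDrive/Desktop/Pyvd/models")` lists any weight files found, assuming the models/ directory sits directly under the root_dir shown in your report. An empty list, or a model.bin of only a few kilobytes, would point to a missing or incomplete download.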

