#4803 TaskCfgVTT(uuid='76acf79050', name='D:/Downloads/AEDS-152_restored.mp4', dirname='D:/Downloads', noextname='AEDS-152_res

2409:8a6a* Posted at: 2 hours ago

Error during the speech recognition stage [faster-whisper (local)]: No transcription results returned. Please check the original audio/video or model and try again.
Traceback (most recent call last):

File "videotrans\task\job.py", line 54, in run

File "videotrans\task\job.py", line 119, in process_task

File "videotrans\task\trans_create.py", line 320, in recogn

File "videotrans\recognition\__init__.py", line 190, in run

File "videotrans\recognition\_base.py", line 93, in run

File "videotrans\recognition\_whisper.py", line 35, in _exec

File "videotrans\recognition\_whisper.py", line 108, in _faster

File "videotrans\configure\base.py", line 252, in _new_process

videotrans.configure.excepts.VideoTransError: No transcription results returned. Please check the original audio/video or model and try again.
TaskCfgVTT(uuid='76acf79050', name='D:/Downloads/AEDS-152_restored.mp4', dirname='D:/Downloads', noextname='AEDS-152_restored', basename='AEDS-152_restored.mp4', ext='mp4', target_dir='D:/Downloads/_video_out/AEDS-152_restored-mp4', cache_folder='F:/pyvideotrans/tmp/12308/76acf79050', is_cuda=True, source_language='日语', source_language_code='ja', source_sub='D:/Downloads/_video_out/AEDS-152_restored-mp4/ja.srt', source_wav='F:/pyvideotrans/tmp/12308/76acf79050/ja.wav', source_wav_output='D:/Downloads/_video_out/AEDS-152_restored-mp4/ja.m4a', target_language='日语', target_language_code='ja', target_sub='D:/Downloads/_video_out/AEDS-152_restored-mp4/ja.srt', target_wav='F:/pyvideotrans/tmp/12308/76acf79050/target.wav', target_wav_output='D:/Downloads/_video_out/AEDS-152_restored-mp4/ja.m4a', translate_type=0, tts_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='No', voice_autorate=False, video_autorate=False, remove_silent_mid=False, align_sub_audio=True, detect_language='ja', recogn_type=0, model_name='large-v3', shibie_audio=None, remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False, subtitle_language=None, app_mode='tiqu', subtitles='', targetdir_mp4='D:/Downloads/_video_out/AEDS-152_restored-mp4/AEDS-152_restored.mp4', novoice_mp4='F:/pyvideotrans/tmp/12308/76acf79050/novoice.mp4', is_separate=False, embed_bgm=True, instrument=None, vocal='F:/pyvideotrans/tmp/12308/76acf79050/vocal.wav', clear_cache=True, background_music='', subtitle_type=0, only_out_mp4=False, recogn2pass=True, output_srt=0, copysrt_rawvideo=False, loop_backaudio=1, backaudio_volume=0.8)


=
system:Windows-10-10.0.26200-SP0
version:v4.00
frozen:True
language:zh
root_dir:F:/pyvideotrans

1 Replies

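The root cause of this error is that the faster-whisper model (large-v3) failed to load or run inference on the GPU, so no transcription results were returned. The most common causes are insufficient VRAM or an incompatible CUDA environment.

Solutions (try in order):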

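  1. Disable CUDA acceleration (temporary test)
    On the main screen, uncheck "CUDA acceleration" and rerun in CPU mode. If that succeeds, the problem lies in the GPU environment (insufficient VRAM, a CUDA version mismatch, or missing cuDNN).

    • If CPU mode works, check the CUDA version (must be ≥ 12.8) and the cuDNN version (must be ≥ 9.11), and confirm that the GPU driver is up to date.
    • If VRAM is insufficient (large-v3 needs at least 8 GB of free VRAM), switch to the medium or small model.
  2. Check the video's audio track and noise reduction

    • Confirm that the video contains an audio track with speech (preview it in a media player).
    • On the main screen, try checking "Noise reduction", or check "Separate vocals and background sound" under "Set more parameters", to rule out interference from background noise.
  3. Adjust the VAD parameters
    Under "Menu → Tools → Advanced options → Speech recognition parameters", set the speech threshold to 0.5, the minimum speech duration (ms) to 1000, and the silence split duration (ms) to 500, so that overly aggressive silence filtering does not leave you with no results.
  4. Switch models
    If it still fails, temporarily switch the recognition model from large-v3 to base or small to check whether the model size is the problem.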
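
To decide between steps 1 and 4 before rerunning the task, it can help to see how much VRAM is actually free. The following is only a rough sketch, not part of pyvideotrans: it assumes an NVIDIA GPU with `nvidia-smi` on the PATH, and the helper names (`parse_free_mib`, `query_free_vram_mib`, `suggest_model`) are made up for illustration. The 8 GB threshold is the figure quoted above; falling back to `medium` below that threshold is an assumed rule of thumb.

```python
import subprocess
from typing import List, Optional

# Figure quoted in the reply above: large-v3 needs about 8 GB of free VRAM.
LARGE_V3_MIN_FREE_MIB = 8 * 1024


def parse_free_mib(nvidia_smi_output: str) -> List[int]:
    """Parse one integer (free memory in MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def query_free_vram_mib() -> Optional[List[int]]:
    """Return free VRAM per GPU in MiB, or None if nvidia-smi cannot be used."""
    cmd = ["nvidia-smi", "--query-gpu=memory.free", "--format=csv,noheader,nounits"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return parse_free_mib(result.stdout)
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError):
        # No NVIDIA driver/tool installed, the call failed, or the output was not numeric.
        return None


def suggest_model(free_mib: int) -> str:
    """Assumed rule of thumb: keep large-v3 only when the 8 GB threshold is met."""
    return "large-v3" if free_mib >= LARGE_V3_MIN_FREE_MIB else "medium"


if __name__ == "__main__":
    free = query_free_vram_mib()
    if not free:
        print("Could not read VRAM via nvidia-smi; try CPU mode first.")
    else:
        for index, mib in enumerate(free):
            print(f"GPU {index}: {mib} MiB free, suggested model: {suggest_model(mib)}")
```

If the reported free memory is well below 8 GB while other programs are using the GPU, closing them or picking a smaller model is probably the quickest test.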

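If none of the steps above help, please post the last 30 lines of the most recent log file in the logs/ directory so the problem can be analyzed further.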
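
For collecting those log lines, a small script along these lines may save some scrolling. It is a sketch under assumptions: it treats every regular file in the logs/ directory as a log, picks the one with the newest modification time, and assumes UTF-8 text. The function name `tail_latest_log` is hypothetical and not something the application provides.

```python
from collections import deque
from pathlib import Path
from typing import List, Optional, Tuple


def tail_latest_log(log_dir: str, n: int = 30) -> Optional[Tuple[Path, List[str]]]:
    """Return (path, last n lines) of the most recently modified file in log_dir.

    Returns None when the directory is missing or contains no files.
    """
    directory = Path(log_dir)
    if not directory.is_dir():
        return None
    files = [p for p in directory.iterdir() if p.is_file()]
    if not files:
        return None
    latest = max(files, key=lambda p: p.stat().st_mtime)
    # A deque with maxlen keeps only the final n lines while streaming the file.
    with latest.open("r", encoding="utf-8", errors="replace") as handle:
        last_lines = deque(handle, maxlen=n)
    return latest, [line.rstrip("\n") for line in last_lines]


if __name__ == "__main__":
    # Run from the application's root directory, where logs/ is expected to live.
    found = tail_latest_log("logs", 30)
    if found is None:
        print("No log files found in logs/")
    else:
        path, lines = found
        print(f"--- last {len(lines)} lines of {path} ---")
        print("\n".join(lines))
```

Run it from the application's root directory (F:/pyvideotrans in this report) and paste the output into a reply.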
