#5079 TaskCfgSTT(uuid='5c1ad1ebbe', name='L:/花花姐说险/20221210_160000_200万入住的养老院适合你吗.mp4', dirname='L:/花花姐说险', noextname='2022121

183.227* Posted at: 4 days ago

Error during the speech recognition stage [faster-whisper (local)] No transcription results returned. Please check the original audio/video or model and try again.
Traceback (most recent call last):
  File "videotrans\task\job.py", line 35, in run
  File "videotrans\task\job.py", line 100, in process_task
  File "videotrans\task\speech2text.py", line 126, in recogn
  File "videotrans\recognition\__init__.py", line 190, in run
  File "videotrans\recognition\_base.py", line 94, in run
  File "videotrans\recognition\_whisper.py", line 36, in _exec
  File "videotrans\recognition\_whisper.py", line 109, in _faster
  File "videotrans\configure\base.py", line 253, in _new_process
videotrans.configure.excepts.VideoTransError: No transcription results returned. Please check the original audio/video or model and try again.
TaskCfgSTT(uuid='5c1ad1ebbe', name='L:/花花姐说险/20221210_160000_200万入住的养老院适合你吗.mp4', dirname='L:/花花姐说险', noextname='20221210_160000_200万入住的养老院适合你吗', basename='20221210_160000_200万入住的养老院适合你吗.mp4', ext='mp4', target_dir='M:/AItools/win-pyvideotrans-v4.01-0602/output/recogn', cache_folder='M:/AItools/win-pyvideotrans-v4.01-0602/tmp/20460/5c1ad1ebbe', is_cuda=True, source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='M:/AItools/win-pyvideotrans-v4.01-0602/output/recogn/20221210_160000_200万入住的养老院适合你吗.srt', target_wav=None, target_wav_output=None, detect_language='zh-cn', recogn_type=0, model_name='large-v3-turbo', shibie_audio='M:/AItools/win-pyvideotrans-v4.01-0602/tmp/20460/5c1ad1ebbe/20221210_160000_200万入住的养老院适合你吗-1780970290.3422587.wav', remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=True)


=
system:Windows-10-10.0.26200-SP0
version:v4.01
frozen:True
language:zh
root_dir:M:/AItools/win-pyvideotrans-v4.01-0602

1 Replies

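This error means the faster-whisper model could not extract any usable text from the audio. Common causes: the video file has no audio track, the audio contains no human speech, the model files are incomplete, or the selected source language does not match the audio.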

Troubleshooting steps

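  1. Confirm that the video contains spoken audio
    Play the file in a local player (such as VLC or PotPlayer) and listen for speech.

    • If it is completely silent, the video may have no audio track (in particular, some videos downloaded from YouTube need their audio and video merged manually).
    • If there is only background music, the software cannot produce subtitles, because this tool only processes human speech.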
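  2. Check that the model is complete
    The model in use is large-v3-turbo, and its folder should be at:
    <software directory>/models/models--mobiuslabsgmbh--faster-whisper-large-v3-turbo
    Confirm that this folder contains .bin, .json, .txt and similar files. If any are missing, download the complete model manually (see section 20 of the knowledge base).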
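  3. Temporarily switch to a smaller model
    In the speech recognition drop-down, select the base or small model (these need less VRAM) and try again. If the smaller model succeeds, the problem lies with the large-v3-turbo model itself.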
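  4. Specify the source language explicitly
    The detected language is currently zh-cn, but it is better to select "Simplified Chinese" (简体中文) manually under "Speech language" (发音语言) on the main screen rather than rely on auto detection.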
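  5. Try enabling noise reduction
    If the audio has a noisy background, tick the "Noise reduction" (降噪) option on the main screen, or click Set more parameters (设置更多参数) → Separate vocals and background (分离人声背景声). Either can improve recognition accuracy.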
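
Steps 1 and 2 above can also be checked from a script instead of by ear and by eye. The following is only a rough sketch, not part of pyVideoTrans: the function names are made up for illustration, it assumes the ffprobe executable from FFmpeg is available on PATH, and the suffix check only confirms that at least one file of each type mentioned in step 2 exists, not that the files are intact.

```python
import subprocess
from pathlib import Path

# File types that step 2 says should be present in the model folder.
EXPECTED_SUFFIXES = (".bin", ".json", ".txt")


def parse_audio_codecs(ffprobe_output: str) -> list[str]:
    """Turn ffprobe's one-codec-per-line output into a list of codec names."""
    return [line.strip() for line in ffprobe_output.splitlines() if line.strip()]


def audio_codecs(video_path: str) -> list[str]:
    """Return the codec of every audio stream in the file.

    An empty list means the file has no audio track at all.
    Assumption: `ffprobe` (shipped with FFmpeg) is on PATH.
    """
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-select_streams", "a",               # audio streams only
            "-show_entries", "stream=codec_name",
            "-of", "csv=p=0",                     # plain values, one per line
            video_path,
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_audio_codecs(result.stdout)


def missing_suffixes(model_dir: str) -> list[str]:
    """Return the expected suffixes for which no file exists under model_dir."""
    root = Path(model_dir)
    # Search recursively, because the files may sit in subfolders.
    found = {p.suffix.lower() for p in root.rglob("*") if p.is_file()}
    return [suffix for suffix in EXPECTED_SUFFIXES if suffix not in found]
```

If audio_codecs(...) returns an empty list for the .mp4, the file has no audio track and no model will produce text from it. If missing_suffixes(...) returns anything for the large-v3-turbo folder, download the model again as described in step 2. Neither check can tell speech from background music, so listening to the file is still needed.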

Please refer to the related documentation.
