#4981 TaskCfgVTT(uuid='724416a535', name='C:/Users/yue/Desktop/1000.mp4', dirname='C:/Users/yue/Desktop', noextname='1000', ba

240e:3a1* Posted at: 10 hours ago

No transcription results returned. Please check the original audio/video or model and try again. Error in speech recognition stage: faster-whisper (local), error in subtitle translation stage: Google (free), error in dubbing stage: Edge-TTS (free)
Traceback (most recent call last):
File "videotrans\task\only_one.py", line 47, in run
File "videotrans\task\trans_create.py", line 322, in recogn
File "videotrans\recognition\__init__.py", line 190, in run
File "videotrans\recognition\_base.py", line 94, in run
File "videotrans\recognition\_whisper.py", line 36, in _exec
File "videotrans\recognition\_whisper.py", line 109, in _faster
File "videotrans\configure\base.py", line 253, in _new_process
videotrans.configure.excepts.VideoTransError: No transcription results returned. Please check the original audio/video or model and try again.
TaskCfgVTT(uuid='724416a535', name='C:/Users/yue/Desktop/1000.mp4', dirname='C:/Users/yue/Desktop', noextname='1000', basename='1000.mp4', ext='mp4', target_dir='C:/Users/yue/Desktop/_video_out/1000-mp4', cache_folder='D:/pyVideoTrans/tmp/52584/724416a535', is_cuda=False, source_language='简体中文', source_language_code='zh-cn', source_sub='C:/Users/yue/Desktop/_video_out/1000-mp4/zh-cn.srt', source_wav='D:/pyVideoTrans/tmp/52584/724416a535/remove_noise.wav', source_wav_output='C:/Users/yue/Desktop/_video_out/1000-mp4/zh-cn.m4a', target_language='英语', target_language_code='en', target_sub='C:/Users/yue/Desktop/_video_out/1000-mp4/en.srt', target_wav='D:/pyVideoTrans/tmp/52584/724416a535/target.wav', target_wav_output='C:/Users/yue/Desktop/_video_out/1000-mp4/en.m4a', translate_type=0, tts_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='No', voice_autorate=True, video_autorate=False, remove_silent_mid=False, align_sub_audio=True, detect_language='zh-cn', recogn_type=0, model_name='base', shibie_audio=None, remove_noise=True, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False, subtitle_language=None, app_mode='biaozhun', subtitles='', targetdir_mp4='C:/Users/yue/Desktop/_video_out/1000-mp4/1000.mp4', novoice_mp4='D:/pyVideoTrans/tmp/52584/724416a535/novoice.mp4', is_separate=True, embed_bgm=True, instrument='D:/pyVideoTrans/tmp/52584/724416a535/instrument.wav', vocal='D:/pyVideoTrans/tmp/52584/724416a535/vocal.wav', clear_cache=True, background_music='', subtitle_type=1, only_out_mp4=False, recogn2pass=False, output_srt=0, copysrt_rawvideo=False, loop_backaudio=1, backaudio_volume=0.8)


=
system:Windows-10-10.0.26200-SP0
version:v4.01
frozen:True
language:zh
root_dir:D:/pyVideoTrans

1 Replies

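The speech recognition stage failed because faster-whisper returned an empty result. This usually happens in one of the following cases: the audio contains no usable speech, VAD filtered out every speech segment, or the audio file produced by vocal/background separation is empty.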
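To tell these cases apart, it can help to check whether an intermediate WAV file actually contains any signal. The following is a minimal sketch, not part of pyVideoTrans: `wav_rms`, `looks_silent`, and the `0.001` threshold are illustrative assumptions, and the code only handles 16-bit PCM WAV files.

```python
import math
import struct
import wave


def wav_rms(path):
    """Return the RMS level (0.0 to 1.0) of a 16-bit PCM WAV file.

    Returns 0.0 when the file contains no audio frames at all.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        raw = wf.readframes(wf.getnframes())
    count = len(raw) // 2
    if count == 0:
        return 0.0
    # Interpret the bytes as little-endian signed 16-bit samples.
    samples = struct.unpack("<%dh" % count, raw[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    return math.sqrt(mean_square) / 32768.0


def looks_silent(path, threshold=0.001):
    """Hypothetical check: treat the file as silent below an assumed RMS threshold."""
    return wav_rms(path) < threshold
```

It could be pointed at the files named in the task configuration, for example `vocal.wav` or `remove_noise.wav` in the cache folder. Note that the task has `clear_cache=True`, so those files may already have been deleted after the run.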

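Solutions (try them in the recommended order):

  1. Check that the video contains clear speech
    Play the video and confirm that someone is actually speaking and that the background noise is not too loud.
  2. Temporarily turn off noise reduction and vocal/background separation
    You have both "Noise reduction" and "Separate vocals and background" checked. According to the software's logic, when both are selected only the separation is performed. Uncheck both options first and run recognition directly on the original audio.
  3. Adjust the VAD parameters (set under Menu → Tools/Options → Advanced Options → Speech recognition parameters)

    • Speech threshold: 0.5
    • Minimum speech duration (ms): 1000 (you can set it to 500 first and observe the result)
    • Maximum speech duration (s): 6
    • Silence split duration (ms): 140
      Note: if you use the clone voice role, the minimum speech duration needs to be 3000, but your current dubbing role is No, so this setting is not needed.
  4. Switch to a larger model
    You are currently using the base model, whose accuracy is relatively low. Try medium or large-v3 (note that large-v3 is extremely slow without CUDA, so be patient).
  5. Enable CUDA acceleration (if your NVIDIA GPU has CUDA 12.8+ and cuDNN 9.x installed)
    Checking "CUDA acceleration" on the main interface can greatly speed up recognition.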
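To reproduce steps 3 and 4 outside the GUI, the sketch below shows how the VAD values above might be passed to faster-whisper directly. The mapping from the pyVideoTrans option names to faster-whisper's `vad_parameters` keys is an assumption, and `build_vad_parameters` and `transcribe_with_vad` are hypothetical helper names. It requires the third-party `faster-whisper` package.

```python
def build_vad_parameters(threshold=0.5, min_speech_ms=1000,
                         max_speech_s=6, min_silence_ms=140):
    """Map the GUI values (assumed mapping) onto faster-whisper VAD option names."""
    return {
        "threshold": threshold,
        "min_speech_duration_ms": min_speech_ms,
        "max_speech_duration_s": max_speech_s,
        "min_silence_duration_ms": min_silence_ms,
    }


def transcribe_with_vad(audio_path, model_name="medium", use_vad=True):
    """Run faster-whisper on CPU and return (start, end, text) tuples."""
    # Imported here so the helper above works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(
        audio_path,
        language="zh",
        vad_filter=use_vad,
        vad_parameters=build_vad_parameters() if use_vad else None,
    )
    # `segments` is a generator; transcription happens while iterating.
    return [(seg.start, seg.end, seg.text) for seg in segments]
```

If `transcribe_with_vad(path, use_vad=False)` returns segments while the default call returns none, that would suggest VAD is filtering out all of the speech rather than the audio being empty.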

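If none of the above helps, please provide the following information for further analysis:

  • The latest .log file in the logs/ folder under the software installation directory (copy roughly the last 50 lines).
  • The video file size and encoding format (you can check them with ffprobe).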
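The two items above could be collected with a small script along these lines. This is a sketch under assumptions: `newest_log`, `tail`, and `ffprobe_cmd` are hypothetical helper names, the log encoding is assumed to be UTF-8, and ffprobe must be available on the PATH for the command to run.

```python
import glob
import os
from collections import deque


def newest_log(log_dir):
    """Return the most recently modified .log file in log_dir, or None."""
    paths = glob.glob(os.path.join(log_dir, "*.log"))
    return max(paths, key=os.path.getmtime) if paths else None


def tail(path, n=50):
    """Return the last n lines of a text file (UTF-8 assumed)."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        return list(deque(fh, maxlen=n))


def ffprobe_cmd(video_path):
    """Build an ffprobe command that reports container, size, and codecs as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries",
        "format=format_name,size:stream=codec_type,codec_name",
        "-of", "json",
        video_path,
    ]
```

For this report the inputs would be `D:/pyVideoTrans/logs` (from `root_dir` above) and `C:/Users/yue/Desktop/1000.mp4`. The list returned by `ffprobe_cmd` can be passed to `subprocess.run(..., capture_output=True, text=True)`.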

Please refer to the relevant documentation:
