#4747 TaskCfgVTT(is_cuda=False, uuid='e1f96262ff', cache_folder='D:/CNTV/tmp/29660/e1f96262ff', target_dir='D:/B0DJQTDJD9-prod

38.135* Posted at: 1 day ago

Error during the speech recognition stage [faster-whisper (local)]
Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 489, in faster_whisper
File "videotrans\process\stt_fun.py", line 188, in _resegment
ZeroDivisionError: division by zero

Traceback (most recent call last):
File "videotrans\task\job.py", line 105, in run
File "videotrans\task\trans_create.py", line 390, in recogn
File "videotrans\recognition\__init__.py", line 293, in run
File "videotrans\recognition\_base.py", line 143, in run
File "videotrans\recognition\_overall.py", line 33, in _exec
File "videotrans\recognition\_overall.py", line 106, in _faster
File "videotrans\configure\_base.py", line 289, in _new_process
RuntimeError: Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 489, in faster_whisper
File "videotrans\process\stt_fun.py", line 188, in _resegment
ZeroDivisionError: division by zero
TaskCfgVTT(is_cuda=False, uuid='e1f96262ff', cache_folder='D:/CNTV/tmp/29660/e1f96262ff', target_dir='D:/B0DJQTDJD9-product-images/ssstik.io_1779869698520-mp4', source_language='英语', source_language_code='en', source_sub='D:/B0DJQTDJD9-product-images/ssstik.io_1779869698520-mp4/en.srt', source_wav='D:/CNTV/tmp/29660/e1f96262ff/en.wav', source_wav_output='D:/B0DJQTDJD9-product-images/ssstik.io_1779869698520-mp4/en.m4a', target_language='西班牙语', target_language_code='es', target_sub='D:/B0DJQTDJD9-product-images/ssstik.io_1779869698520-mp4/es.srt', target_wav='D:/CNTV/tmp/29660/e1f96262ff/target.wav', target_wav_output='D:/B0DJQTDJD9-product-images/ssstik.io_1779869698520-mp4/es.m4a', name='D:/mxg/ssstik.io_1779869698520.mp4', noextname='ssstik.io_1779869698520', basename='ssstik.io_1779869698520.mp4', ext='mp4', dirname='D:/mxg', shound_del_name=None, translate_type=0, tts_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='Ximena(Female/ES)', voice_autorate=True, video_autorate=False, remove_silent_mid=False, align_sub_audio=True, detect_language='en', recogn_type=0, model_name='base', shibie_audio=None, remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False, subtitle_language=None, app_mode='biaozhun', subtitles='', targetdir_mp4='D:/B0DJQTDJD9-product-images/ssstik.io_1779869698520-mp4/ssstik.io_1779869698520.mp4', novoice_mp4='D:/CNTV/tmp/29660/e1f96262ff/novoice.mp4', is_separate=False, embed_bgm=True, instrument=None, vocal='D:/CNTV/tmp/29660/e1f96262ff/vocal.wav', back_audio='', clear_cache=False, background_music=None, subtitle_type=0, only_out_mp4=True, recogn2pass=False, output_srt=0, copysrt_rawvideo=False, loop_backaudio=1, backaudio_volume=0.8)


=
system:Windows-10-10.0.26200-SP0
version:v3.99
frozen:True
language:zh
root_dir:D:/CNTV

1 Replies

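The root cause is a zero-duration calculation during audio segment splitting. This is usually caused by unsuitable VAD (voice activity detection) parameters, or by an audio file that is too short or contains no valid speech.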
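
The traceback does not show what line 188 of `_resegment` actually divides by, so the following is only a minimal sketch of the failure mode described above, assuming the divisor is a segment's duration. The `Segment` shape and both helper names are hypothetical and are not taken from the videotrans source.

```python
from typing import List, Optional, TypedDict


class Segment(TypedDict):
    # Hypothetical segment shape: times in seconds plus the recognized text.
    start: float
    end: float
    text: str


def chars_per_second(seg: Segment) -> Optional[float]:
    """Return the speaking rate, or None when the segment has no duration."""
    duration = seg["end"] - seg["start"]
    if duration <= 0:
        # Without this guard, a zero-length segment raises ZeroDivisionError.
        return None
    return len(seg["text"]) / duration


def drop_empty_segments(segments: List[Segment]) -> List[Segment]:
    """Keep only segments whose end time is strictly after their start time."""
    return [s for s in segments if s["end"] > s["start"]]
```

A segment whose start and end times coincide is exactly what a zero or very small VAD threshold can produce, which is why the settings below ask for non-zero values.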

Solutions:

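  1. Adjust the speech recognition parameters (the most common cause)
    Go to Menu → Tools/Options → Advanced Options → Speech Recognition Parameters and set the following two items to non-zero values:

    • Minimum speech duration (ms) → set to 1000 (or larger)
    • Silence split duration (ms) → set to 500 (must not be 0)
    • Also make sure the maximum speech duration (seconds) is not less than 3
  2. Check the audio file
    Confirm that the video contains real human speech and that the speech language on the main interface is correctly set to "English". If the audio is extremely short (for example, under 1 second), use a different video.
  3. Try disabling "Whisper pre-split audio"
    If the adjustments above do not help, uncheck "Whisper pre-split audio?" in the same Speech Recognition Parameters area, then reprocess.
  4. Switch to a larger recognition model
    The current model, base, is small; try small or medium for better stability.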
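
Step 2 can also be checked outside the application. The sketch below uses Python's standard `wave` module to measure the length of a PCM WAV file, such as the `en.wav` listed under `source_wav` in the log. The helper names and the 1-second threshold are assumptions for illustration. It only measures duration and cannot tell whether the audio contains speech.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
    # Guard the division so a malformed header cannot raise ZeroDivisionError.
    return frames / rate if rate > 0 else 0.0


def is_too_short(path: str, min_seconds: float = 1.0) -> bool:
    """Report whether the file is shorter than the (assumed) threshold."""
    return wav_duration_seconds(path) < min_seconds
```

For example, `is_too_short("D:/CNTV/tmp/29660/e1f96262ff/en.wav")` would report whether the extracted audio is under one second, provided the cache file still exists.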

Please refer to the related documentation:
