#3564 TaskCfgVTT(is_cuda=True, uuid='4eec5068f4', cache_folder='D:/project/Videotrans/win-pyvideotrans-v3.97-0304/tmp/14092/4e

113.108* Posted at: 13 hours ago 👁11

Error during the speech recognition stage [faster-whisper (local)] Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 390, in faster_whisper
File "faster_whisper\transcribe.py", line 460, in transcribe
File "faster_whisper\utils.py", line 123, in format_timestamp
AssertionError: non-negative timestamp expected

Traceback (most recent call last):
File "videotrans\task\job.py", line 105, in run
File "videotrans\task\trans_create.py", line 353, in recogn
File "videotrans\recognition\__init__.py", line 265, in run
File "videotrans\recognition\_base.py", line 143, in run
File "videotrans\recognition\_overall.py", line 33, in _exec
File "videotrans\recognition\_overall.py", line 105, in _faster
File "videotrans\configure\_base.py", line 288, in _new_process
RuntimeError: Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 390, in faster_whisper
File "faster_whisper\transcribe.py", line 460, in transcribe
File "faster_whisper\utils.py", line 123, in format_timestamp
AssertionError: non-negative timestamp expected
TaskCfgVTT(is_cuda=True, uuid='4eec5068f4', cache_folder='D:/project/Videotrans/win-pyvideotrans-v3.97-0304/tmp/14092/4eec5068f4', target_dir='C:/Users/savior/Desktop/1/_video_out/p23 11. Day 2 - Building and Testing an Ensemble Agent with Multiple LLM Calls-mp4', source_language='英语', source_language_code='en', source_sub='C:/Users/savior/Desktop/1/_video_out/p23 11. Day 2 - Building and Testing an Ensemble Agent with Multiple LLM Calls-mp4/en.srt', source_wav='D:/project/Videotrans/win-pyvideotrans-v3.97-0304/tmp/14092/4eec5068f4/remove_noise.wav', source_wav_output='C:/Users/savior/Desktop/1/_video_out/p23 11. Day 2 - Building and Testing an Ensemble Agent with Multiple LLM Calls-mp4/en.m4a', target_language='简体中文', target_language_code='zh-cn', target_sub='C:/Users/savior/Desktop/1/_video_out/p23 11. Day 2 - Building and Testing an Ensemble Agent with Multiple LLM Calls-mp4/zh-cn.srt', target_wav='D:/project/Videotrans/win-pyvideotrans-v3.97-0304/tmp/14092/4eec5068f4/target.wav', target_wav_output='C:/Users/savior/Desktop/1/_video_out/p23 11. Day 2 - Building and Testing an Ensemble Agent with Multiple LLM Calls-mp4/zh-cn.m4a', name='C:/Users/savior/Desktop/1/p23 11. Day 2 - Building and Testing an Ensemble Agent with Multiple LLM Calls.mp4', noextname='p23 11. Day 2 - Building and Testing an Ensemble Agent with Multiple LLM Calls', basename='p23 11. 
Day 2 - Building and Testing an Ensemble Agent with Multiple LLM Calls.mp4', ext='mp4', dirname='C:/Users/savior/Desktop/1', shound_del_name=None, translate_type=5, tts_type=10, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='aozi.wav', voice_autorate=True, video_autorate=False, remove_silent_mid=False, align_sub_audio=True, detect_language='en', recogn_type=0, model_name='large-v3-turbo', shibie_audio=None, remove_noise=True, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False, subtitle_language=None, app_mode='biaozhun', subtitles='', targetdir_mp4='C:/Users/savior/Desktop/1/_video_out/p23 11. Day 2 - Building and Testing an Ensemble Agent with Multiple LLM Calls-mp4/p23 11. Day 2 - Building and Testing an Ensemble Agent with Multiple LLM Calls.mp4', novoice_mp4='D:/project/Videotrans/win-pyvideotrans-v3.97-0304/tmp/14092/4eec5068f4/novoice.mp4', is_separate=False, instrument=None, vocal=None, back_audio='', clear_cache=False, background_music=None, subtitle_type=4, only_out_mp4=False, recogn2pass=True, output_srt=2, copysrt_rawvideo=False)


=
system:Windows-10-10.0.26200-SP0
version:v3.97
frozen:True
language:zh
root_dir:D:/project/Videotrans/win-pyvideotrans-v3.97-0304

1 Reply

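An `AssertionError: non-negative timestamp expected` during the speech recognition stage is usually caused by audio preprocessing (such as noise reduction) or by unsuitable voice activity detection (VAD) parameters, either of which can produce audio segments with negative timestamps.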
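To illustrate how such an assertion is triggered, here is a minimal sketch of a timestamp formatter that rejects negative input. It is not the faster-whisper source: `format_timestamp_sketch` and `first_negative_timestamp` are hypothetical names, and only the assertion message is taken from the traceback above.

```python
def format_timestamp_sketch(seconds: float) -> str:
    """Hypothetical stand-in for a timestamp formatter that rejects negatives."""
    # Same message as in the traceback; everything else here is illustrative.
    assert seconds >= 0, "non-negative timestamp expected"
    milliseconds = round(seconds * 1000.0)
    hours, milliseconds = divmod(milliseconds, 3_600_000)
    minutes, milliseconds = divmod(milliseconds, 60_000)
    secs, milliseconds = divmod(milliseconds, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{milliseconds:03d}"


def first_negative_timestamp(segments):
    """Return the first (start, end) pair with a negative bound, or None."""
    for start, end in segments:
        if start < 0 or end < 0:
            return (start, end)
    return None
```

A segment whose start lands slightly before zero, for example `(-0.02, 0.5)`, is enough to make the formatter fail, which is why changes to preprocessing or segmentation can make the error appear or disappear.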

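Solutions:

  1. Check the noise reduction setting: the log shows remove_noise=True. Try unchecking the Noise reduction (降噪) option in the main window, then rerun the task.
  2. Adjust the speech recognition parameters: if the problem persists with noise reduction disabled, adjust the VAD-related parameters.

    • Go to Menu -> Tools -> Advanced Options -> Speech Recognition Parameters (菜单 -> 工具 -> 高级选项 -> 语音识别参数).
    • Raise Speech threshold (语音阈值) moderately, for example from the default 0.5 to 0.6 or 0.7.
    • Increase Silence split duration in milliseconds (静音分割持续毫秒) moderately, for example from the default 500 to 800 or 1000.
    • Increase Minimum speech duration in milliseconds (最短语音持续毫秒) moderately, for example from the default 1000 to 2000.
  3. Check the audio file: confirm that the source video/audio file is not corrupted and contains a valid audio track.
  4. Clear the cache and retry: check the Clear generated (清理已生成) option in the main window, then reprocess the video so that all intermediate files are regenerated with the new parameters.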
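For step 3, a quick sanity check can be run on an intermediate WAV file, such as the `remove_noise.wav` listed in the task configuration. The sketch below uses only Python's standard `wave` module, which reads uncompressed PCM WAV files; the helper names `wav_summary` and `looks_usable` are hypothetical, and a passing check does not prove the audio is free of problems.

```python
import wave


def wav_summary(path: str) -> dict:
    """Read basic properties of a PCM WAV file using only the standard library."""
    # wave.open raises wave.Error if the file is not a readable PCM WAV.
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "frames": frames,
            "duration_s": frames / rate if rate else 0.0,
        }


def looks_usable(summary: dict) -> bool:
    """Heuristic only: at least one channel and a non-zero duration."""
    return summary["channels"] > 0 and summary["duration_s"] > 0
```

If `wave.open` raises `wave.Error`, or the reported duration is zero, the intermediate file is a likely suspect and regenerating it (step 4) is worth trying first.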

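If none of the steps above help, please post the last 30 or so lines of error output from the newest log file in the logs/ folder under the software installation directory so the problem can be analyzed further.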
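One way to collect those lines is sketched below. It assumes the log files are plain text readable as UTF-8 and that "newest" means most recently modified; `tail_of_latest_log` is a hypothetical helper, not part of the software.

```python
from pathlib import Path


def tail_of_latest_log(logs_dir: str, n_lines: int = 30) -> str:
    """Return the last n_lines of the most recently modified file in logs_dir."""
    files = [p for p in Path(logs_dir).iterdir() if p.is_file()]
    if not files:
        raise FileNotFoundError(f"no log files found in {logs_dir}")
    latest = max(files, key=lambda p: p.stat().st_mtime)
    # Assumption: UTF-8 text; errors="replace" avoids failing on stray bytes.
    lines = latest.read_text(encoding="utf-8", errors="replace").splitlines()
    return "\n".join(lines[-n_lines:]) if n_lines > 0 else ""
```

With the root_dir reported above, the call would look like `print(tail_of_latest_log("D:/project/Videotrans/win-pyvideotrans-v3.97-0304/logs"))`.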

