#4009 TaskCfgVTT(is_cuda=True, uuid='a5a832de62', cache_folder='D:/fanyi/tmp/44060/a5a832de62', target_dir='C:/Users/admin/Des

183.11* Posted at: 3 hours ago 👁8

Speech recognition stage failed [字节语音大模型极速版 (ByteDance speech large model, turbo edition)]: 静音音频 (silent audio)
Traceback (most recent call last):
File "videotrans\task\job.py", line 105, in run
File "videotrans\task\trans_create.py", line 360, in recogn
File "videotrans\recognition\__init__.py", line 240, in run
File "videotrans\recognition\_base.py", line 143, in run
File "videotrans\recognition\_zijiemodel.py", line 72, in _exec
RuntimeError: 静音音频
TaskCfgVTT(is_cuda=True, uuid='a5a832de62', cache_folder='D:/fanyi/tmp/44060/a5a832de62', target_dir='C:/Users/admin/Desktop/新建文件夹 (4)/20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao-mp4', source_language='简体中文', source_language_code='zh-cn', source_sub='C:/Users/admin/Desktop/新建文件夹 (4)/20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao-mp4/zh-cn.srt', source_wav='D:/fanyi/tmp/44060/a5a832de62/zh-cn.wav', source_wav_output='C:/Users/admin/Desktop/新建文件夹 (4)/20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao-mp4/zh-cn.m4a', target_language='马来西亚语', target_language_code='ms', target_sub='C:/Users/admin/Desktop/新建文件夹 (4)/20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao-mp4/ms.srt', target_wav='D:/fanyi/tmp/44060/a5a832de62/target.wav', target_wav_output='C:/Users/admin/Desktop/新建文件夹 (4)/20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao-mp4/ms.m4a', name='C:/Users/admin/Desktop/20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao.mp4', noextname='20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao', basename='20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao.mp4', ext='mp4', dirname='C:/Users/admin/Desktop', shound_del_name=None, translate_type=15, tts_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='No', voice_autorate=False, video_autorate=False, remove_silent_mid=False, align_sub_audio=True, detect_language='zh-cn', recogn_type=8, model_name='1.7B', shibie_audio=None, remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False, subtitle_language=None, app_mode='tiqu', subtitles='', targetdir_mp4='C:/Users/admin/Desktop/新建文件夹 (4)/20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao-mp4/20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao.mp4', novoice_mp4='D:/fanyi/tmp/44060/a5a832de62/novoice.mp4', is_separate=True, embed_bgm=True, instrument='D:/fanyi/tmp/44060/a5a832de62/instrument.wav', vocal='D:/fanyi/tmp/44060/a5a832de62/vocal.wav', back_audio='', clear_cache=False, background_music=None, subtitle_type=0, only_out_mp4=True, recogn2pass=True, output_srt=1, copysrt_rawvideo=False)


=
system:Windows-10-10.0.26200-SP0
version:v3.98
frozen:True
language:zh
root_dir:D:/fanyi

1 Replies

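This happens because, after the vocals were separated from the video, the audio passed to speech recognition was judged to be silent or to contain too little usable speech.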
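To see whether the separated audio really is near-silent, one option is to measure its peak level yourself. The following is only a rough sketch under stated assumptions: it handles 16-bit PCM WAV files only, the helper names `wav_peak_dbfs` and `looks_silent` and the -50 dBFS threshold are hypothetical, and it is not the check that the software or the recognition service actually performs.

```python
import math
import struct
import wave


def wav_peak_dbfs(path: str) -> float:
    """Return the peak level of a 16-bit PCM WAV file in dBFS.

    Returns -inf when the file is empty or every sample is zero.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = wf.readframes(wf.getnframes())

    count = len(frames) // 2  # number of 16-bit samples across all channels
    if count == 0:
        return float("-inf")

    samples = struct.unpack(f"<{count}h", frames[: count * 2])
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak / 32768)


def looks_silent(path: str, threshold_dbfs: float = -50.0) -> bool:
    """Heuristic: treat the file as silent if its peak is below the threshold.

    The default threshold is an assumption chosen for illustration.
    """
    return wav_peak_dbfs(path) < threshold_dbfs
```

If the cache files from the log still exist, pointing `looks_silent` at `D:/fanyi/tmp/44060/a5a832de62/vocal.wav` and at `zh-cn.wav` in the same folder could show whether the vocal separation step removed the speech, assuming those files are 16-bit PCM.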

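Solutions:

  1. Check the video's audio track: confirm that the original video actually contains clear speech. If the audio is noisy or the voice is faint, the separated vocal track may be unusable.
  2. Adjust the vocal separation setting: under "设置更多参数" (Set more parameters), try unchecking "分离人声背景声" (Separate vocals and background) so that recognition runs directly on the original mixed audio.
  3. Use noise reduction: if the background noise is heavy, check the "降噪" (Noise reduction) option at the speech recognition stage, which can improve recognition accuracy.
  4. Switch the speech recognition channel: if the problem persists, try another channel such as "faster-whisper(本地)" (local) or "Qwen-ASR(本地)" (local), and make sure "发音语言" (Spoken language) is set correctly (here, "简体中文", Simplified Chinese).
  5. Simplify the file path and name: the log shows that your video's path and file name are very long (C:/Users/admin/Desktop/20260127_2fdbd8ab15433531_551760379214_395470417824448_published_mp4_264_hd_taobao.mp4). This may cause unexpected problems during processing. Move the video to a simple path (such as D:\video), rename it to a short English name (such as test.mp4), and try again.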
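The last step can also be scripted if many files have long names. This is a minimal sketch, not part of the software itself: the helper name `copy_to_short_path` and its default short name are hypothetical, and it copies the file rather than moving it so the original stays in place.

```python
import shutil
from pathlib import Path


def copy_to_short_path(src: str, dest_dir: str, short_name: str = "test") -> Path:
    """Copy a video into dest_dir under a short ASCII name, keeping its extension."""
    src_path = Path(src)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing

    # Keep the original extension (lower-cased) so the file type is preserved.
    target = dest / f"{short_name}{src_path.suffix.lower()}"
    shutil.copy2(src_path, target)
    return target
```

For example, calling it with the long file name from the log and `D:/video` as the destination would produce `D:/video/test.mp4`, which can then be selected in the software instead of the original file.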

Please refer to the relevant documentation:
