#3589 Re-correcting the sentence segmentation results returned by the tiny model

2409:8a1e* Posted at: 3 days ago 👁26

Unknow error


=
system:Windows-10-10.0.26100-SP0
version:v3.98
frozen:True
language:zh
root_dir:G:/pyVideoTrans

[DEBUG] Final configuration: self.cfg=TaskCfgVTT(is_cuda=False, uuid='8dae837ea5', cache_folder='G:/pyVideoTrans/tmp/1164/8dae837ea5', target_dir='C:/Users/Administrator/Downloads/VidBee/Videos/_video_out/55-mp4', source_language='英语', source_language_code='en', source_sub='C:/Users/Administrator/Downloads/VidBee/Videos/_video_out/55-mp4/en.srt', source_wav='G:/pyVideoTrans/tmp/1164/8dae837ea5/en.wav', source_wav_output='C:/Users/Administrator/Downloads/VidBee/Videos/_video_out/55-mp4/en.m4a', target_language='简体中文', target_language_code='zh-cn', target_sub='C:/Users/Administrator/Downloads/VidBee/Videos/_video_out/55-mp4/zh-cn.srt', target_wav='G:/pyVideoTrans/tmp/1164/8dae837ea5/target.wav', target_wav_output='C:/Users/Administrator/Downloads/VidBee/Videos/_video_out/55-mp4/zh-cn.m4a', name='C:/Users/Administrator/Downloads/VidBee/Videos/55.mp4', noextname='55', basename='55.mp4', ext='mp4', dirname='C:/Users/Administrator/Downloads/VidBee/Videos', shound_del_name=None, translate_type=1, tts_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='No', voice_autorate=True, video_autorate=False, remove_silent_mid=False, align_sub_audio=True, detect_language='en', recogn_type=0, model_name='tiny', shibie_audio=None, remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False, subtitle_language=None, app_mode='biaozhun', subtitles='', targetdir_mp4='C:/Users/Administrator/Downloads/VidBee/Videos/_video_out/55-mp4/55.mp4', novoice_mp4='G:/pyVideoTrans/tmp/1164/8dae837ea5/novoice.mp4', is_separate=False, instrument=None, vocal=None, back_audio='', clear_cache=True, background_music=None, subtitle_type=1, only_out_mp4=False, recogn2pass=True, output_srt=0, copysrt_rawvideo=False)

[DEBUG] [recognition]__init__:kwargs={'detect_language': 'en', 'audio_file': 'G:/pyVideoTrans/tmp/1164/8dae837ea5/en.wav', 'cache_folder': 'G:/pyVideoTrans/tmp/1164/8dae837ea5', 'model_name': 'tiny', 'uuid': '8dae837ea5', 'is_cuda': False, 'subtitle_type': 1, 'recogn_type': 0, 'max_speakers': -1, 'llm_post': False, 'recogn2pass': False}
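[DEBUG] BaseRecogn initialized

[DEBUG] CPU process pool: max_workers=4
[DEBUG] In faster-whisper mode, re-correcting the sentence segmentation results returned by the tiny model
[DEBUG] [Speech recognition] channel 0, tiny: total time: 4s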

1 Reply

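This error is caused by an unsuitable choice of speech recognition model. The log shows that recognition used the tiny model of the faster-whisper (local) channel. This is the smallest and fastest model, but its recognition accuracy is very low, especially for non-English videos, which can lead to empty or garbled recognition results or to errors in later processing steps.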
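To check which recognition settings a run actually used, the relevant fields can be read out of the `TaskCfgVTT(...)` debug line. The snippet below is only a rough sketch: `extract_fields` is a hypothetical helper, not part of pyVideoTrans, and `sample` is an abridged copy of the log line above. It assumes values are either single-quoted strings or bare tokens without commas.

```python
import re


def extract_fields(log_line: str, names: list[str]) -> dict[str, str]:
    """Pull selected key=value pairs out of a TaskCfgVTT(...) debug line."""
    found = {}
    for name in names:
        # Match name='quoted value' or name=bare_value (up to a comma,
        # closing parenthesis, or whitespace). \b keeps "language" from
        # matching inside "source_language".
        pattern = rf"\b{re.escape(name)}=(?:'([^']*)'|([^,)\s]+))"
        m = re.search(pattern, log_line)
        if m:
            found[name] = m.group(1) if m.group(1) is not None else m.group(2)
    return found


# Abridged excerpt of the debug line from this post.
sample = (
    "self.cfg=TaskCfgVTT(is_cuda=False, uuid='8dae837ea5', "
    "detect_language='en', recogn_type=0, model_name='tiny', "
    "remove_noise=False)"
)

print(extract_fields(sample, ["model_name", "is_cuda", "detect_language", "recogn_type"]))
```

All values come back as strings, so `is_cuda` is the text `False` rather than a boolean; convert as needed.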

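Solutions:

  1. Switch to a different speech recognition model: in the "Speech recognition" drop-down list on the main window, change the model from tiny to a more accurate one. For English videos, use at least the base or small model; for the best results, large-v3 is recommended (make sure the computer has ≥16 GB of RAM or ≥10 GB of VRAM).
  2. Enable GPU acceleration: if you have an NVIDIA graphics card, make sure CUDA 12.8 and cuDNN are installed correctly, then tick "CUDA acceleration" on the main window to greatly speed up recognition.
  3. Set the spoken language explicitly: confirm that "Spoken language" is set to "English" (en), matching the video content.
  4. Check the advanced settings: if the problem persists, go to "Menu - Tools - Advanced options - Speech recognition parameters" and try adjusting parameters such as "Speech threshold" (default 0.5) and "Maximum speech duration (seconds)", or turn on the "Noise reduction" option.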
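For anyone who wants to try the same settings outside the GUI, here is a minimal sketch using the faster-whisper Python package directly. It is not pyVideoTrans's own code. The helper names `build_transcribe_kwargs` and `transcribe` are hypothetical, and mapping the GUI's speech threshold and maximum speech duration onto faster-whisper's `threshold` and `max_speech_duration_s` VAD parameters is an assumption.

```python
def build_transcribe_kwargs(language="en", vad_threshold=0.5, max_speech_s=None):
    """Collect keyword arguments for WhisperModel.transcribe()."""
    vad_parameters = {"threshold": vad_threshold}
    if max_speech_s is not None:
        # Assumed counterpart of the GUI's maximum speech duration setting.
        vad_parameters["max_speech_duration_s"] = max_speech_s
    return {
        "language": language,      # fix the spoken language instead of auto-detecting
        "vad_filter": True,        # enable voice activity detection
        "vad_parameters": vad_parameters,
    }


def transcribe(audio_path, model_name="small", use_cuda=False, **options):
    """Transcribe audio_path and return a list of (start, end, text) tuples."""
    # Imported lazily so build_transcribe_kwargs works without faster-whisper installed.
    from faster_whisper import WhisperModel

    device = "cuda" if use_cuda else "cpu"
    # float16 on GPU, int8 on CPU are common choices, not requirements.
    compute_type = "float16" if use_cuda else "int8"
    model = WhisperModel(model_name, device=device, compute_type=compute_type)
    segments, _info = model.transcribe(audio_path, **build_transcribe_kwargs(**options))
    # segments is a lazy generator; iterating it runs the actual transcription.
    return [(seg.start, seg.end, seg.text) for seg in segments]


# Inspect the options that would be passed for an English video.
print(build_transcribe_kwargs(language="en", vad_threshold=0.5, max_speech_s=6))

# Example call (requires faster-whisper and an existing audio file):
# transcribe("en.wav", model_name="small", language="en", max_speech_s=6)
```

Comparing the output of `model_name="tiny"` against `"small"` on the same file is a quick way to see whether the model size is really the cause.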

