#2942 TaskCfg(cache_folder='C:/win-pyvideotrans-v3.92/tmp/33664/3df55bd0da', target_dir='C:/Users/75638/Desktop/_video_out/hun

67.159* Posted at: 1 hour ago 👁6

Error during the speech recognition stage [faster-whisper (local)] Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 256, in faster_whisper
File "faster_whisper\transcribe.py", line 1851, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1213, in generate_segments
File "faster_whisper\transcribe.py", line 1446, in generate_with_fallback
RuntimeError: CUDA failed with error out of memory

Traceback (most recent call last):
File "videotrans\task\job.py", line 106, in run
File "videotrans\task\trans_create.py", line 358, in recogn
File "videotrans\recognition\__init__.py", line 276, in run
File "videotrans\recognition\_base.py", line 140, in run
File "videotrans\recognition\_overall.py", line 63, in _exec
File "videotrans\recognition\_overall.py", line 142, in _faster
File "videotrans\configure\_base.py", line 276, in _new_process
RuntimeError: Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 256, in faster_whisper
File "faster_whisper\transcribe.py", line 1851, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1213, in generate_segments
File "faster_whisper\transcribe.py", line 1446, in generate_with_fallback
RuntimeError: CUDA failed with error out of memory
TaskCfg(cache_folder='C:/win-pyvideotrans-v3.92/tmp/33664/3df55bd0da', target_dir='C:/Users/75638/Desktop/_video_out/huntb-484_vocals-wav', remove_noise=True, is_separate=False, detect_language='ja', subtitle_language=None, source_language='日语', target_language='简体中文', source_language_code='ja', target_language_code='zh-cn', source_sub='C:/Users/75638/Desktop/_video_out/huntb-484_vocals-wav/ja.srt', target_sub='C:/Users/75638/Desktop/_video_out/huntb-484_vocals-wav/zh-cn.srt', source_wav='C:/win-pyvideotrans-v3.92/tmp/33664/3df55bd0da/remove_noise.wav', source_wav_output='C:/Users/75638/Desktop/_video_out/huntb-484_vocals-wav/ja.m4a', target_wav='C:/win-pyvideotrans-v3.92/tmp/33664/3df55bd0da/target.wav', target_wav_output='C:/Users/75638/Desktop/_video_out/huntb-484_vocals-wav/zh-cn.m4a', subtitles='', novoice_mp4='C:/win-pyvideotrans-v3.92/tmp/33664/3df55bd0da/novoice.mp4', noextname='huntb-484_vocals', shibie_audio=None, targetdir_mp4='C:/Users/75638/Desktop/_video_out/huntb-484_vocals-wav/huntb-484_vocals.mp4', instrument=None, vocal=None, back_audio='', background_music=None, app_mode='tiqu', subtitle_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='No', copysrt_rawvideo=False, clear_cache=False, translate_type=4, tts_type=0, recogn_type=0, model_name='large-v3', voice_autorate=False, video_autorate=False, cuda=True, name='C:/Users/75638/Desktop/huntb-484_vocals.wav', basename='huntb-484_vocals.wav', ext='wav', dirname='C:/Users/75638/Desktop', shound_del_name=None, uuid='3df55bd0da', remove_silent_mid=False, align_sub_audio=True, enable_diariz=False, nums_diariz=0, rephrase=0, only_out_mp4=False, fix_punc=False, recogn2pass=True)


=
system:Windows-10-10.0.26200-SP0
version:v3.95
frozen:True
language:zh
root_dir:C:/win-pyvideotrans-v3.92

1 Replies

This is caused by insufficient GPU memory (VRAM). The GPU ran out of memory while running speech recognition with the faster-whisper large-v3 model.
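To confirm how much VRAM is actually free before starting a task, something like the following sketch may help. It assumes an NVIDIA driver with `nvidia-smi` on the PATH; the helper names `parse_gpu_memory` and `query_gpu_memory` are made up for this example and are not part of pyvideotrans or faster-whisper.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'memory.used, memory.total' rows (in MiB) into (used, total, free) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(part.strip()) for part in line.split(","))
        rows.append((used, total, total - used))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory usage, one row per GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


# Example use on a machine with an NVIDIA GPU (not executed here):
#   for index, (used, total, free) in enumerate(query_gpu_memory()):
#       print(f"GPU {index}: {used} MiB used of {total} MiB, {free} MiB free")
```

If the free figure is already low before recognition starts, another program is holding VRAM, and closing it is worth trying before changing any model settings.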

Solutions:

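  1. Use a smaller model: in the application, open the "Speech recognition parameters" settings and switch the model from large-v3 to a smaller one, such as medium or small.
  2. Enable quantization: if you still want to use large-v3, try turning on the int8 quantization option under "Speech recognition parameters"; this significantly reduces VRAM usage.
  3. Adjust the recognition parameters: lower beam_size (for example, to 1 or 2) to reduce the amount of computation.
  4. Close other GPU applications: make sure no other programs (games, browsers) are occupying a large amount of VRAM.
  5. Check the media length: if the audio/video file is very long, try splitting it into shorter segments first and processing them one by one.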
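For anyone calling faster-whisper directly from Python rather than through the GUI, the first three suggestions can be sketched as a fallback ladder that retries with lighter settings after an out-of-memory error. This is only an illustration: the function names `fallback_settings` and `transcribe_with_fallback` and the order of the steps are assumptions for this example, not how pyvideotrans does it internally.

```python
def fallback_settings(model_name="large-v3"):
    """Yield progressively lighter settings (the order is an illustrative assumption)."""
    smaller = {"large-v3": ["medium", "small"], "medium": ["small"]}
    # 1) half precision with faster-whisper's default beam size
    yield {"model": model_name, "compute_type": "float16", "beam_size": 5}
    # 2) same model with int8 weight quantization
    yield {"model": model_name, "compute_type": "int8_float16", "beam_size": 5}
    # 3) int8 plus greedy decoding
    yield {"model": model_name, "compute_type": "int8_float16", "beam_size": 1}
    # 4) finally, step down to smaller models
    for name in smaller.get(model_name, []):
        yield {"model": name, "compute_type": "int8_float16", "beam_size": 1}


def transcribe_with_fallback(audio_path, language="ja", model_name="large-v3"):
    """Try each setting in turn; move to the next one only on a CUDA out-of-memory error."""
    from faster_whisper import WhisperModel  # imported lazily; requires faster-whisper

    last_message = None
    for cfg in fallback_settings(model_name):
        model = None  # drop the previous model so its VRAM can be released
        try:
            model = WhisperModel(
                cfg["model"], device="cuda", compute_type=cfg["compute_type"]
            )
            segments, _info = model.transcribe(
                audio_path, language=language, beam_size=cfg["beam_size"]
            )
            # Segments are generated lazily, so an out-of-memory error can
            # surface here while iterating rather than in transcribe() itself.
            return cfg, [(seg.start, seg.end, seg.text) for seg in segments]
        except RuntimeError as err:
            if "out of memory" not in str(err).lower():
                raise
            last_message = str(err)  # keep only the text, not the traceback
    raise RuntimeError(f"every fallback setting ran out of memory: {last_message}")
```

Each step trades some accuracy or speed for lower VRAM use, so it is reasonable to stop at the first setting that completes and compare the resulting subtitles against expectations.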

Please consult the relevant documentation.
