#3294 TaskCfg(cache_folder='D:/win-pyvideotrans/tmp/6604/d53264c2a9', target_dir='D:/win-pyvideotrans/output/recogn', remove_n

103.151* Posted at: 4 months ago

Error during the speech recognition stage [faster-whisper (local)] Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 179, in faster_whisper
File "faster_whisper\transcribe.py", line 689, in __init__
RuntimeError: CUDA failed with error out of memory

Traceback (most recent call last):
File "videotrans\task\job.py", line 106, in run
File "videotrans\task\_speech2text.py", line 156, in recogn
File "videotrans\recognition\__init__.py", line 282, in run
File "videotrans\recognition\_base.py", line 141, in run
File "videotrans\recognition\_overall.py", line 63, in _exec
File "videotrans\recognition\_overall.py", line 142, in _faster
File "videotrans\configure\_base.py", line 276, in _new_process
RuntimeError: Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 179, in faster_whisper
File "faster_whisper\transcribe.py", line 689, in __init__
RuntimeError: CUDA failed with error out of memory
TaskCfg(cache_folder='D:/win-pyvideotrans/tmp/6604/d53264c2a9', target_dir='D:/win-pyvideotrans/output/recogn', remove_noise=True, is_separate=False, detect_language='zh-cn', subtitle_language=None, source_language=None, target_language=None, source_language_code=None, target_language_code=None, source_sub=None, target_sub='D:/win-pyvideotrans/output/recogn/魔尊大师姐驾到，统统闪开！.srt', source_wav=None, source_wav_output=None, target_wav=None, target_wav_output=None, subtitles='', novoice_mp4=None, noextname='魔尊大师姐驾到，统统闪开！', shibie_audio='D:/win-pyvideotrans/tmp/6604/d53264c2a9/removed_noise_1771494925.9226604.wav', targetdir_mp4=None, instrument=None, vocal=None, back_audio=None, background_music=None, app_mode='biaozhun', subtitle_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role=None, copysrt_rawvideo=False, clear_cache=False, translate_type=None, tts_type=None, recogn_type=0, model_name='large-v3', voice_autorate=False, video_autorate=False, cuda=True, name='U:/shuchu/魔尊大师姐驾到，统统闪开！.mp4', basename='魔尊大师姐驾到，统统闪开！.mp4', ext='mp4', dirname='U:/shuchu', shound_del_name=None, uuid='d53264c2a9', remove_silent_mid=False, align_sub_audio=True, enable_diariz=True, nums_diariz=0, rephrase=0, only_out_mp4=False, fix_punc=False, recogn2pass=False)

=
system:Windows-10-10.0.19045-SP0
version:v3.96
frozen:True
language:zh
root_dir:D:/win-pyvideotrans

Floor #1 DeepSeek reply at 4 months ago

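This error is caused by insufficient GPU memory (VRAM). The GPU ran out of VRAM while faster-whisper was loading the large-v3 model for speech recognition.

Solutions:

Lower the model precision: in the "Speech Recognition Parameters" settings, change "Model Precision" from float16 to int8. This significantly reduces VRAM usage but may slightly reduce recognition accuracy.
Use a smaller model: change "Model Name" from large-v3 to medium or small. The smaller the model, the lower the VRAM usage and the faster the recognition, but accuracy on Chinese drops accordingly.
Disable GPU acceleration: in the "Speech Recognition Parameters" settings, uncheck "Use CUDA Acceleration". This forces recognition to run on the CPU, which is slower but uses no VRAM.
Close other programs that use VRAM: make sure no other large programs (such as games or other AI software) are running in the background, so that their VRAM is freed.
Adjust how the video is processed: if the video is very long, try enabling "Overall Recognition" mode under "Advanced Options", which may change how memory and VRAM are allocated.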
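
For readers calling faster-whisper directly rather than through the GUI, the first three suggestions can be sketched as a fallback chain. This is only an illustration, not pyvideotrans's own code: `FALLBACKS`, `default_loader` and `load_with_fallback` are hypothetical names, and the order of candidates is an assumption. The only library call used is `WhisperModel(model_size_or_path, device=..., compute_type=...)`.

```python
from typing import Any, Callable, Sequence, Tuple

Candidate = Tuple[str, str, str]  # (model name, device, compute type)

# Hypothetical fallback order, from most to least VRAM-hungry.
FALLBACKS: Sequence[Candidate] = (
    ("large-v3", "cuda", "float16"),
    ("large-v3", "cuda", "int8"),   # lower precision
    ("medium", "cuda", "int8"),     # smaller model
    ("medium", "cpu", "int8"),      # no GPU at all
)


def default_loader(name: str, device: str, compute_type: str) -> Any:
    """Load a faster-whisper model; imported lazily so the module is optional."""
    from faster_whisper import WhisperModel

    return WhisperModel(name, device=device, compute_type=compute_type)


def load_with_fallback(
    loader: Callable[[str, str, str], Any] = default_loader,
    candidates: Sequence[Candidate] = FALLBACKS,
) -> Tuple[Any, Candidate]:
    """Try each candidate in order and return the first one that loads."""
    last_error = None
    for candidate in candidates:
        try:
            return loader(*candidate), candidate
        except RuntimeError as exc:
            # Only swallow out-of-memory failures; re-raise anything else.
            if "out of memory" not in str(exc).lower():
                raise
            last_error = exc
    raise RuntimeError("every candidate ran out of memory") from last_error


# Usage (requires faster-whisper to be installed):
#   model, chosen = load_with_fallback()
#   print("loaded", chosen)
```

The loader is passed in as a parameter so that the fallback logic can be exercised without a GPU. Matching on the text "out of memory" mirrors the message in the traceback above and may need adjusting for other error messages.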
