#5555 cfg=TaskCfgSTT(uuid='37205e5f3c', name='C:/Users/DAWN/Desktop/Temp/wwdc-Spatial Design.mp4', dirname='C:/Users/DAWN/Desk

60.28* Posted at: 15 hours ago

Error during the speech recognition stage [faster-whisper (local built-in)] CUDA failed with error out of memory:Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 306, in faster_whisper
File "faster_whisper\transcribe.py", line 1851, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1190, in generate_segments
File "faster_whisper\transcribe.py", line 1400, in encode
RuntimeError: CUDA failed with error out of memory

Traceback (most recent call last):
File "videotrans\task\job.py", line 35, in run
File "videotrans\task\job.py", line 100, in process_task
File "videotrans\task\speech2text.py", line 97, in recogn
File "videotrans\recognition\__init__.py", line 191, in run
File "videotrans\recognition\_base.py", line 90, in run
File "videotrans\recognition\_whisper.py", line 38, in _exec
File "videotrans\recognition\_whisper.py", line 124, in _faster
File "videotrans\configure\base.py", line 272, in _new_process
videotrans.configure.excepts.VideoTransError: CUDA failed with error out of memory:Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 306, in faster_whisper
File "faster_whisper\transcribe.py", line 1851, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1190, in generate_segments
File "faster_whisper\transcribe.py", line 1400, in encode
RuntimeError: CUDA failed with error out of memory
cfg=TaskCfgSTT(uuid='37205e5f3c', name='C:/Users/DAWN/Desktop/Temp/wwdc-Spatial Design.mp4', dirname='C:/Users/DAWN/Desktop/Temp', noextname='wwdc-Spatial Design', basename='wwdc-Spatial Design.mp4', ext='mp4', target_dir='D:/Programs/win-pyvideotrans-v4.03-0626/recogn', cache_folder='D:/Programs/win-pyvideotrans-v4.03-0626/tmp/16936/37205e5f3c', is_cuda=True, source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='D:/Programs/win-pyvideotrans-v4.03-0626/recogn/wwdc-Spatial Design.srt', target_wav=None, target_wav_output=None, detect_language='en', recogn_type=0, model_name='large-v3-turbo', shibie_audio='D:/Programs/win-pyvideotrans-v4.03-0626/tmp/16936/37205e5f3c/wwdc-Spatial Design-1782822423.9806714.wav', remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=0)


=
system:Windows-10-10.0.22621-SP0
version:v4.03
frozen:True
language:zh
root_dir:D:/Programs/win-pyvideotrans-v4.03-0626

1 Replies

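This is a CUDA out of memory error caused by insufficient GPU video memory (VRAM). With GPU acceleration, the large-v3-turbo model needs at least 6-8GB of free VRAM; if your graphics card has less (for example 4GB or 6GB), this error is triggered.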
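To check whether a card clears that threshold before starting a job, free VRAM can be read from `nvidia-smi`. The following is a minimal sketch, not part of pyvideotrans: the helper names are hypothetical, and the 6 GiB default is simply the lower bound quoted above.

```python
import subprocess


def parse_free_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one free-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def query_free_vram_mib() -> list[int]:
    """Ask nvidia-smi for the free memory of every visible GPU, in MiB."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.free", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_free_vram_mib(out)


def has_enough_vram(free_mib: list[int], required_gib: float = 6.0) -> bool:
    """True if at least one GPU has required_gib or more free (assumed threshold)."""
    return any(mib >= required_gib * 1024 for mib in free_mib)
```

Calling `has_enough_vram(query_free_vram_mib())` on a machine with an NVIDIA driver installed gives a quick yes/no before choosing the model.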

Solutions (try them in the recommended order):

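  1. Switch to a smaller model
    After selecting faster-whisper (local built-in) under ASR Channel, change the model to small or base. Both need far less VRAM, and small is accurate enough for most scenarios.
  2. Lower the precision and search depth
    Go to Menu → Tools → Advanced Options → Speech Recognition Parameters, set Compute Data Type to int8 (uses the least VRAM), and set both beam_size and best_of to 1.
  3. Turn off context awareness
    On the same settings page, uncheck Enable Context Awareness; this feature uses additional VRAM.
  4. Switch to CPU mode
    If the error persists after the adjustments above, uncheck CUDA Acceleration in the main window. The software will then run on the CPU, which is much slower but will not crash from insufficient VRAM.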
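For anyone calling faster-whisper directly from a script rather than through the GUI, the same steps can be sketched as a fallback ladder. This is an illustration under assumptions, not pyvideotrans code: `SttSettings`, `fallback_ladder`, `transcribe_with_fallback`, and `run_faster_whisper` are hypothetical names, and mapping "Enable Context Awareness" to faster-whisper's `condition_on_previous_text` parameter is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class SttSettings:
    model_name: str
    device: str                  # "cuda" or "cpu"
    compute_type: str = "int8"   # step 2: lowest-memory compute type
    beam_size: int = 1           # step 2: minimal search depth
    best_of: int = 1
    condition_on_previous_text: bool = False  # step 3 (assumed mapping)


def fallback_ladder() -> List[SttSettings]:
    """Settings to try in order: smaller models on the GPU, then the CPU."""
    return [
        SttSettings("small", "cuda"),  # step 1
        SttSettings("base", "cuda"),   # step 1, even smaller
        SttSettings("small", "cpu"),   # step 4
    ]


def run_faster_whisper(settings: SttSettings, audio_path: str):
    """Transcribe with faster-whisper; needs the faster-whisper package."""
    from faster_whisper import WhisperModel  # lazy import

    model = WhisperModel(
        settings.model_name,
        device=settings.device,
        compute_type=settings.compute_type,
    )
    segments, _info = model.transcribe(
        audio_path,
        beam_size=settings.beam_size,
        best_of=settings.best_of,
        condition_on_previous_text=settings.condition_on_previous_text,
    )
    # segments is lazy; consume it here so an OOM surfaces inside this call.
    return [(s.start, s.end, s.text) for s in segments]


def transcribe_with_fallback(
    audio_path: str,
    run: Callable[[SttSettings, str], list] = run_faster_whisper,
) -> Tuple[SttSettings, list]:
    """Try each rung; move to the next one only on an out-of-memory error."""
    last_error = None
    for settings in fallback_ladder():
        try:
            return settings, run(settings, audio_path)
        except RuntimeError as exc:
            if "out of memory" not in str(exc).lower():
                raise  # unrelated failure: do not mask it
            last_error = exc
    raise RuntimeError("all fallback settings ran out of memory") from last_error
```

The `run` parameter is injectable so the ladder logic can be exercised without a GPU; in real use, `transcribe_with_fallback("audio.wav")` returns the settings that worked together with the segments.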

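If your graphics card really has very little VRAM (for example 2-4GB), use the tiny or base model with CUDA acceleration turned off, or consider an online ASR channel (such as the OpenAI speech recognition API) to bypass the local VRAM limit.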

Please see the related documentation:
