#5538 Error in speech recognition stage [Alibaba FunASR (local) (built-in)] CUDA out of memory. Tried to allocate 4.11 GiB. GPU 0 has a total capacity of 7.96 GiB of w

38.207* Posted at: 1 hour ago

Error in speech recognition stage [Alibaba FunASR (local) (built-in)] CUDA out of memory. Tried to allocate 4.11 GiB. GPU 0 has a total capacity of 7.96 GiB of which 0 bytes is free. Of the allocated memory 11.89 GiB is allocated by PyTorch, and 83.73 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables):Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 636, in funasr_mlt
File "funasr\auto\auto_model.py", line 324, in generate

return self.inference(

File "funasr\auto\auto_model.py", line 381, in inference

res = model.inference(**batch, **kwargs)

File "D:\Tool/win-pyvideotrans-v4.03-0622/videotrans/codes\model.py", line 610, in inference

return self.inference_llm(

File "D:\Tool/win-pyvideotrans-v4.03-0622/videotrans/codes\model.py", line 628
......
line 90, in run

File "videotrans\recognition\_funasr.py", line 61, in _exec

File "videotrans\configure\base.py", line 272, in _new_process

videotrans.configure.excepts.VideoTransError: CUDA out of memory. Tried to allocate 4.11 GiB. GPU 0 has a total capacity of 7.96 GiB of which 0 bytes is free. Of the allocated memory 11.89 GiB is allocated by PyTorch, and 83.73 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables):Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 636, in funasr_mlt
File "funasr\auto\auto_model.py", line 324, in generate

return self.inference(

File "funasr\auto\auto_model.py", line 381, in inference

res = model.inference(**batch, **kwargs)

File "D:\Tool/win-pyvideotrans-v4.03-0622/videotrans/codes\model.py", line 610, in inference

return self.inference_llm(

File "D:\Tool/win-pyvideotrans-v4.03-0622/videotrans/codes\model.py", line 628, in inference_llm

inputs_embeds, contents, batch, source_ids, meta_data = self.inference_prepare(

File "D:\Tool/win-pyvideotrans-v4.03-0622/videotrans/codes\model.py", line 493, in inference_prepare

encoder_out, encoder_out_lens = self.encode(speech, speech_lengths)

File "D:\Tool/win-pyvideotrans-v4.03-0622/videotrans/codes\model.py", line 272, in encode

encoder_out, encoder_out_lens = self.audio_encoder(speech, speech_lengths)

File "torch\nn\modules\module.py", line 1751, in _wrapped_call_impl

return self._call_impl(*args, **kwargs)

File "torch\nn\modules\module.py", line 1762, in _call_impl

return forward_call(*args, **kwargs)

File "D:\Tool\win-pyvideotrans-v4.03-0622\_internal\funasr\models\sense_voice\model.py", line 568, in forward

encoder_outs = encoder_layer(xs_pad, masks)

File "torch\nn\modules\module.py", line 1751, in _wrapped_call_impl

return self._call_impl(*args, **kwargs)

File "torch\nn\modules\module.py", line 1762, in _call_impl

return forward_call(*args, **kwargs)

File "D:\Tool\win-pyvideotrans-v4.03-0622\_internal\funasr\models\sense_voice\model.py", line 388, in forward

self.self_attn(

File "torch\nn\modules\module.py", line 1751, in _wrapped_call_impl

return self._call_impl(*args, **kwargs)

File "torch\nn\modules\module.py", line 1762, in _call_impl

return forward_call(*args, **kwargs)

File "D:\Tool\win-pyvideotrans-v4.03-0622\_internal\funasr\models\sense_voice\model.py", line 232, in forward

att_outs = self.forward_attention(v_h, scores, mask, mask_att_chunk_encoder)

File "D:\Tool\win-pyvideotrans-v4.03-0622\_internal\funasr\models\sense_voice\model.py", line 200, in forward_attention

attn = torch.softmax(scores, dim=-1).masked_fill(

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.11 GiB. GPU 0 has a total capacity of 7.96 GiB of which 0 bytes is free. Of the allocated memory 11.89 GiB is allocated by PyTorch, and 83.73 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
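cfg=[TaskCfgVTT]Current working mode: transcribe and translate subtitles
Original input file name: G:/七天(1)/2020.11.11黄先生代班,风骚兔女郎,小作精黑丝诱惑淫荡尤物,激情啪啪淫声浪语撸管必备.mp4,
Output saved to folder: G:/七天(1)/_video_out/2020.11.11黄先生代班,风骚兔女郎,小作精黑丝诱惑淫荡尤物,激情啪啪淫声浪语撸管必备-mp4,
Temporary folder: D:/Tool/win-pyvideotrans-v4.03-0622/tmp/16364/9e053a4f73
Selected: clean up existing output
Selected: enable CUDA acceleration
Not selected: noise reduction
Speech recognition: Alibaba FunASR (local) (built-in), model_name: Fun-ASR-Nano-2512, spoken language: Simplified Chinese, sentence segmentation: default
Spoken language and target language are the same, so subtitles are not translated
Selected: copy the generated subtitles to the video directory
Output subtitle format: single subtitle
Proxy address: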


=
system:Windows-10-10.0.26200-SP0
version:v4.03
frozen:True
language:zh
root_dir:D:/Tool/win-pyvideotrans-v4.03-0622

1 Replies

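This is caused by insufficient GPU memory: the 8 GB card is exhausted. Either other processes are occupying the GPU, or the FunASR model needs more than 8 GB for this input. The traceback shows the failed 4.11 GiB request came from the torch.softmax call in the encoder's self-attention.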
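To give a feel for why a single softmax call can ask for several GiB, here is a rough back-of-the-envelope sketch. It assumes the scores tensor in forward_attention has the usual self-attention shape (batch, heads, frames, frames) in float32. The head count and frame counts below are made-up illustration values, not numbers read from Fun-ASR-Nano-2512.

```python
GIB = 2 ** 30


def attention_scores_bytes(batch: int, heads: int, frames: int,
                           bytes_per_elem: int = 4) -> int:
    """Bytes for one (batch, heads, frames, frames) attention-score tensor.

    Assumption: standard full self-attention with float32 scores
    (4 bytes per element). The real model may differ.
    """
    return batch * heads * frames * frames * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical values: 8 heads, one clip per batch, varying clip length.
    for frames in (5_000, 10_000, 20_000):
        size = attention_scores_bytes(batch=1, heads=8, frames=frames)
        print(f"{frames:>6} frames -> {size / GIB:.2f} GiB")
```

If that assumption holds, memory grows with the square of the audio length per inference call, so doubling the length of a segment roughly quadruples the request. That would explain why one long clip can fail where several shorter ones would fit.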

Solutions:

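  1. Close other programs that use GPU memory (browsers, other AI tools, games, etc.), then restart the software and try again.
  2. Switch the ASR channel: use faster-whisper (local built-in) and select the small or base model, which need less GPU memory.

    • Go to "Menu → Tools → Advanced Options → Speech Recognition Parameters" and change the compute data type to int8 (uses less GPU memory).
  3. Temporarily disable CUDA acceleration: uncheck CUDA acceleration on the main window so the software runs on the CPU (slower, but it avoids running out of GPU memory).
  4. If it still fails: move the video file to a directory without Chinese characters or spaces (for example D:\video\test.mp4), to rule out path problems in the intermediate FFmpeg steps.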
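Steps 1 and 3 above boil down to "check how much GPU memory is actually free, and fall back to the CPU if it is not enough". For anyone running from source rather than the packaged build, a minimal sketch of that check could look like this. The helper names free_gpu_bytes and choose_device are hypothetical and not part of pyVideoTrans; torch.cuda.mem_get_info is the only PyTorch call used, and the code degrades to "cpu" when PyTorch or CUDA is not present.

```python
from typing import Optional

GIB = 2 ** 30


def free_gpu_bytes() -> Optional[int]:
    """Return free bytes on the current CUDA device, or None if unavailable."""
    try:
        import torch  # optional dependency, imported lazily
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    free, _total = torch.cuda.mem_get_info()  # returns (free, total) in bytes
    return free


def choose_device(needed_bytes: int, free_bytes: Optional[int]) -> str:
    """Pick "cuda" only when the GPU reports enough free memory."""
    if free_bytes is None or free_bytes < needed_bytes:
        return "cpu"
    return "cuda"


if __name__ == "__main__":
    # 4.11 GiB is the size of the failed request in the log above.
    needed = int(4.11 * GIB)
    print("device:", choose_device(needed, free_gpu_bytes()))
```

Note that the log reports only 83.73 MiB reserved but unallocated, so the PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True hint in the error message is unlikely to help much here. That setting targets fragmentation, while this failure looks like plain exhaustion.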

