#2037 TaskCfg(cache_folder=

2409:8a55* Posted at: 12 hours ago 👁11

Error during the speech recognition stage: [faster-whisper (local)] Runtime error: Traceback (most recent call last):
File "videotrans\process\_overall.py", line 128, in run
File "faster_whisper\transcribe.py", line 876, in transcribe
File "faster_whisper\audio.py", line 69, in decode_audio
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 443. MiB for an array with shape (116071424,) and data type float32
:
Traceback (most recent call last):
File "videotrans\task\job.py", line 113, in run
File "videotrans\task\trans_create.py", line 365, in recogn
File "videotrans\recognition\__init__.py", line 239, in run
File "videotrans\recognition\_base.py", line 80, in run
File "videotrans\recognition\_overall.py", line 182, in _exec
RuntimeError: Traceback (most recent call last):
File "videotrans\process\_overall.py", line 128, in run
File "faster_whisper\transcribe.py", line 876, in transcribe
File "faster_whisper\audio.py", line 69, in decode_audio
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 443. MiB for an array with shape (116071424,) and data type float32

1 Reply

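This error is a typical out-of-memory (RAM/VRAM) problem. While decoding the audio, faster-whisper needs to allocate a large array (443 MiB here), and your system cannot provide it. Try the following steps: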

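  1. Switch to a smaller recognition model: in the "Speech Recognition" row (row 2) of the main window, change the model from large-v2/large-v3 to medium, small, or base. The smaller the model, the lower the memory usage.
  2. Adjust the advanced settings to reduce memory usage:

    • Open Tools -> Advanced Options from the menu bar.
    • In the Faster/OpenAI/Whisper speech recognition section, make the following changes:

      • CUDA data type: change from float32 to float16 or int8 (int8 uses the least memory).
      • Recognition accuracy (beam_size): change the value from the default 5 to 1.
      • Recognition accuracy (best_of): likewise change it to 1.
      • Enable context awareness: set to False.
  3. Switch the recognition mode: in row 2 of the main window, change the recognition mode from whole-file recognition to batch inference. Batch mode processes long audio in segments, which lowers the memory needed at any one time.
  4. Check the video/audio file: if the file is unusually long (for example, several hours), first split it into shorter segments (for example, 30 minutes each) with a third-party tool, then process the segments.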
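
For readers who call faster-whisper directly rather than through the GUI, the settings in steps 1 and 2 can be sketched roughly as follows. This is a minimal sketch, not the application's own code: the helper names `low_memory_settings` and `transcribe_low_memory` are hypothetical, and mapping "Enable context awareness" to `condition_on_previous_text` is an assumption about how the GUI option is wired.

```python
from typing import Any, Dict, List, Tuple


def low_memory_settings(
    model_size: str = "small",
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (model kwargs, transcribe kwargs) tuned for low memory use.

    Hypothetical helper; the values mirror steps 1 and 2 above.
    """
    model_kwargs = {
        "model_size_or_path": model_size,  # medium / small / base instead of large-v3
        "device": "auto",                  # let the library pick CPU or GPU
        "compute_type": "int8",            # smallest footprint of the three types
    }
    transcribe_kwargs = {
        "beam_size": 1,                        # down from the default 5
        "best_of": 1,
        # Assumption: this is what "Enable context awareness" controls.
        "condition_on_previous_text": False,
    }
    return model_kwargs, transcribe_kwargs


def transcribe_low_memory(audio_path: str, model_size: str = "small") -> List[tuple]:
    """Transcribe one file with the reduced settings (needs faster-whisper installed)."""
    from faster_whisper import WhisperModel  # imported lazily so the helper above works without it

    model_kwargs, transcribe_kwargs = low_memory_settings(model_size)
    model = WhisperModel(**model_kwargs)
    segments, _info = model.transcribe(audio_path, **transcribe_kwargs)
    # `segments` is a generator; iterating over it runs the actual decoding.
    return [(seg.start, seg.end, seg.text) for seg in segments]
```

These settings reduce the model's own memory use. They do not shrink the decoded waveform that faster-whisper loads in `decode_audio`, which is the allocation that failed in the traceback, so very long files may still need step 3 or step 4.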

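If the problem persists after these adjustments, please provide your computer's specifications (such as the amount of RAM and whether it has a dedicated graphics card) and the duration of the video file so the issue can be analyzed further.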
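
The array shape in the traceback already gives a rough idea of the file's duration, which helps when deciding how to split it. The sketch below assumes the audio was decoded to mono at 16 kHz (the default sampling rate of faster-whisper's `decode_audio`). The function names, the `input.mp4` path, and the output pattern are hypothetical, and the ffmpeg command is only one possible way to split a file.

```python
import math
from typing import List

SAMPLE_RATE_HZ = 16_000   # assumed: decode_audio's default sampling rate
BYTES_PER_FLOAT32 = 4


def array_size_mib(num_samples: int) -> float:
    """Size in MiB of a float32 array with `num_samples` elements."""
    return num_samples * BYTES_PER_FLOAT32 / 2**20


def duration_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE_HZ) -> float:
    """Audio duration implied by the sample count (mono assumed)."""
    return num_samples / sample_rate


def segment_count(duration_s: float, segment_s: int = 1800) -> int:
    """Number of segments of `segment_s` seconds needed to cover the file."""
    return math.ceil(duration_s / segment_s)


def ffmpeg_split_command(
    src: str, segment_s: int = 1800, pattern: str = "part_%03d.mp4"
) -> List[str]:
    """Build an ffmpeg command that splits `src` without re-encoding.

    With stream copy, cuts land on keyframes, so segment lengths are approximate.
    """
    return [
        "ffmpeg", "-i", src,
        "-c", "copy",
        "-f", "segment",
        "-segment_time", str(segment_s),
        "-reset_timestamps", "1",
        pattern,
    ]


if __name__ == "__main__":
    samples = 116_071_424  # shape reported in the traceback
    secs = duration_seconds(samples)
    print(f"array size: {array_size_mib(samples):.1f} MiB")
    print(f"duration:   {secs / 3600:.2f} h")
    print(f"segments:   {segment_count(secs)} x 30 min")
    print(" ".join(ffmpeg_split_command("input.mp4")))
```

Under these assumptions, the reported shape corresponds to about 443 MiB and roughly two hours of audio, which would split into five 30-minute segments.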
