#3866 TaskCfgSTT(is_cuda=True, uuid='15a41a886e', cache_folder='D:/win-pyvideotrans-v3.98-323/tmp/8/15a41a886e', target_dir='D

180.109* Posted at: 3 months ago

Error during the speech recognition stage [openai-whisper (local)]: something went wrong, possibly insufficient RAM or VRAM. Model:medium.en GPU0
Traceback (most recent call last):
File "videotrans\configure\_base.py", line 281, in _new_process
File "videotrans\process\signelobj.py", line 80, in submit_task_gpu
File "concurrent\futures\process.py", line 720, in submit
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore

Traceback (most recent call last):
File "videotrans\configure\_base.py", line 281, in _new_process
File "videotrans\process\signelobj.py", line 80, in submit_task_gpu
File "concurrent\futures\process.py", line 720, in submit
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "videotrans\task\job.py", line 105, in run
File "videotrans\task\_speech2text.py", line 159, in recogn
File "videotrans\recognition\__init__.py", line 272, in run
File "videotrans\recognition\_base.py", line 143, in run
File "videotrans\recognition\_overall.py", line 31, in _exec
File "videotrans\recognition\_overall.py", line 73, in _openai
File "videotrans\configure\_base.py", line 303, in _new_process
RuntimeError: Something went wrong, possibly insufficient RAM or VRAM. Model:medium.en GPU0
Traceback (most recent call last):
File "videotrans\configure\_base.py", line 281, in _new_process
File "videotrans\process\signelobj.py", line 80, in submit_task_gpu
File "concurrent\futures\process.py", line 720, in submit
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore
TaskCfgSTT(is_cuda=True, uuid='15a41a886e', cache_folder='D:/win-pyvideotrans-v3.98-323/tmp/8/15a41a886e', target_dir='D:/win-pyvideotrans-v3.98-323/output/recogn', source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='D:/win-pyvideotrans-v3.98-323/output/recogn/NaughtyAmericaVR - Wedding Orgy 9.srt', target_wav=None, target_wav_output=None, name='D:/Qwen3_ASR/Qwen3_ASR/QQQQQ/NaughtyAmericaVR - Wedding Orgy 9.wav', noextname='NaughtyAmericaVR - Wedding Orgy 9', basename='NaughtyAmericaVR - Wedding Orgy 9.wav', ext='wav', dirname='D:/Qwen3_ASR/Qwen3_ASR/QQQQQ', shound_del_name=None, detect_language='en', recogn_type=1, model_name='medium.en', shibie_audio='D:/win-pyvideotrans-v3.98-323/tmp/8/15a41a886e/NaughtyAmericaVR - Wedding Orgy 9-1774870990.5595922.wav', remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False)

=
system:Windows-10-10.0.26220-SP0
version:v3.98
frozen:True
language:zh
root_dir:D:/win-pyvideotrans-v3.98-323

Floor #1 DeepSeek reply at 3 months ago

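This is caused by insufficient VRAM or system memory. While running speech recognition with the medium.en model of openai-whisper (local), the GPU process pool crashed because resources were exhausted.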

Solutions:

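Use a smaller model: switch the speech recognition model from medium.en to base.en or small.en to reduce VRAM and memory usage.
Adjust the advanced settings: in the menu bar, go to Tools -> Advanced Options, open the "Speech Recognition Parameters" section, and make the following changes to reduce resource consumption:
- CUDA data type: change float32 to float16 or int8.
- Recognition accuracy beam_size: change 5 to 1.
- Recognition accuracy best_of: change 5 to 1.
- Enable context awareness: make sure this is false (unchecked).
Check the file path: make sure the path and name of the input file (for example NaughtyAmericaVR - Wedding Orgy 9.wav) contain no special symbols or emoji and are not overly long. It is best to move the file to a shallow directory (such as D:/videos) and rename it to a short English name (such as video.wav).
Select the correct language: confirm that the speech in the video is English and that "English" is selected under "Spoken language". Models ending in .en support English recognition only.
Close other programs that use the GPU: before processing, close other software that may consume a lot of GPU resources (such as games or other AI tools).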
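For readers calling openai-whisper directly rather than through the GUI, the settings above can be sketched roughly as follows. This is a minimal sketch, not pyvideotrans's own code: the helper names `low_memory_decode_options` and `transcribe_low_memory` are hypothetical, and the mapping from the GUI options to these `transcribe` arguments is an assumption. openai-whisper exposes an `fp16` flag, so only half precision is shown here.

```python
from typing import Any, Dict


def low_memory_decode_options() -> Dict[str, Any]:
    """Return transcribe() options mirroring the lighter settings listed above.

    Assumption: these correspond to the GUI's advanced options; the
    mapping is illustrative only.
    """
    return {
        "language": "en",  # .en models handle English only
        "fp16": True,  # half precision instead of float32 on the GPU
        "beam_size": 1,  # beam width used when temperature is 0
        "best_of": 1,  # number of candidates sampled when temperature > 0
        "condition_on_previous_text": False,  # do not prompt with the previous window's text
    }


def transcribe_low_memory(audio_path: str, model_name: str = "small.en") -> str:
    """Transcribe one file with a smaller model and the lighter options."""
    import whisper  # imported lazily so the options helper works without it

    model = whisper.load_model(model_name, device="cuda")
    result = model.transcribe(audio_path, **low_memory_decode_options())
    return result["text"]
```

Calling `transcribe_low_memory("D:/videos/video.wav")` would then run a single recognition pass with `small.en`. How much memory each option saves depends on the audio and the GPU, so it is worth changing one setting at a time.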

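If the steps above still do not solve the problem, the hardware resources may simply be insufficient. Consider using an online speech recognition API (such as the OpenAI speech recognition API) or pure CPU mode (uncheck "CUDA acceleration" on the main interface), although the latter will be very slow.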

Floor #2 180.109* reply at 3 months ago

A 5060 Ti 8G should be enough, shouldn't it? Sometimes it works, and sometimes the error pops up.

Floor #3 DeepSeek reply at 3 months ago

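With 8 GB of VRAM, a 5060 Ti may be borderline for the medium.en model, especially when processing long audio or running other tasks at the same time, which can lead to intermittent out-of-memory failures.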
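One way to make the "borderline" case concrete is to check free VRAM before choosing a model. The sketch below is an illustration under stated assumptions: the per-model figures are the approximate VRAM requirements listed in the openai-whisper README, while the 1.5 GB headroom and the helper names `pick_model` and `free_vram_gb` are hypothetical choices, not values taken from pyvideotrans.

```python
from typing import Optional

# Approximate VRAM needs in GB (openai-whisper README), largest model first.
APPROX_VRAM_GB = [("medium.en", 5.0), ("small.en", 2.0), ("base.en", 1.0)]


def pick_model(free_vram_gb: float, headroom_gb: float = 1.5) -> Optional[str]:
    """Return the largest listed model that fits, or None if none fits.

    headroom_gb is an assumed safety margin for activations, long audio,
    and other programs sharing the GPU.
    """
    usable = free_vram_gb - headroom_gb
    for name, needed in APPROX_VRAM_GB:
        if usable >= needed:
            return name
    return None


def free_vram_gb(device: int = 0) -> float:
    """Read currently free VRAM in GB via PyTorch (requires a CUDA build)."""
    import torch  # imported lazily so pick_model works without torch

    free_bytes, _total_bytes = torch.cuda.mem_get_info(device)
    return free_bytes / 1024**3
```

For example, `pick_model(free_vram_gb())` could be called just before starting a task. If a game or another AI tool already holds several gigabytes, the function falls back to a smaller model instead of letting the worker process crash.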

Solutions:

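Force a smaller model: downgrade the speech recognition model from medium.en to base.en or small.en so that it runs reliably.
Tune the advanced settings: under Menu -> Tools -> Advanced Options -> Speech Recognition Parameters, make the following adjustments to significantly reduce VRAM usage:
- CUDA data type: change to int8.
- Recognition accuracy beam_size: change to 1.
- Recognition accuracy best_of: change to 1.
- Enable context awareness: make sure it is unchecked (set to false).
Run one task at a time: under Menu -> Tools -> Advanced Options -> General Settings, confirm that "Concurrent GPU tasks" is set to 1, so that multiple recognition tasks do not compete for VRAM.
Clear the cache and retry: check "Clear generated" in the upper-left corner of the main interface, then process the video again to rule out interference from cached files.
Check the file path: make sure the path of the audio/video file is short and contains no special characters (such as ?* or emoji). It is best to rename the file to a short English name and place it in a shallow directory (such as D:/video).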
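The file path advice can be checked mechanically before a task is queued. The following is a minimal sketch: the helper name `path_problems`, the 120-character limit, and the allowed character set are all assumptions chosen to be conservative, not rules enforced by pyvideotrans.

```python
import re
from pathlib import PureWindowsPath
from typing import List

MAX_PATH_LEN = 120  # hypothetical threshold for "too long"
SAFE_PART = re.compile(r"^[A-Za-z0-9._-]+$")  # assumed conservative whitelist


def path_problems(path: str) -> List[str]:
    """Return human-readable warnings for a Windows-style path."""
    problems: List[str] = []
    if len(path) > MAX_PATH_LEN:
        problems.append(f"path is longer than {MAX_PATH_LEN} characters")
    p = PureWindowsPath(path)
    for part in p.parts:
        if part == p.anchor:
            continue  # skip the drive root such as 'D:\\'
        if not SAFE_PART.match(part):
            problems.append(f"unusual characters in {part!r}")
    return problems
```

An empty list means the path passes this check, as `D:/videos/video.wav` does. A name containing spaces, parentheses, or emoji is reported so that the file can be renamed before processing.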

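If the problem persists after these adjustments, a particular audio file may be too long or complex, pushing the momentary VRAM demand above 8 GB. Consider an online recognition API (such as the OpenAI speech recognition API) or pure CPU mode (uncheck "CUDA acceleration").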

Floor #4 Admin reply at 3 months ago

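If you are translating multiple videos at the same time, turn off "Secondary recognition" in the upper-right corner of the main interface.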
