#4068 TaskCfgSTT(is_cuda=True, uuid='437f3ba732', cache_folder='D:/Software/pyvideotrans/tmp/50340/437f3ba732', target_dir='D:

146.19* Posted at: 4 hours ago 👁11

Speech recognition stage failed [faster-whisper (local)]: an error occurred, possibly insufficient RAM or VRAM. Model:large-v3 GPU0
Traceback (most recent call last):
File "videotrans\configure\_base.py", line 285, in _new_process
File "concurrent\futures\_base.py", line 458, in result
File "concurrent\futures\_base.py", line 403, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

Traceback (most recent call last):
File "videotrans\configure\_base.py", line 285, in _new_process
File "concurrent\futures\_base.py", line 458, in result
File "concurrent\futures\_base.py", line 403, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "videotrans\task\job.py", line 105, in run
File "videotrans\task\_speech2text.py", line 159, in recogn
File "videotrans\recognition\__init__.py", line 292, in run
File "videotrans\recognition\_base.py", line 143, in run
File "videotrans\recognition\_overall.py", line 33, in _exec
File "videotrans\recognition\_overall.py", line 105, in _faster
File "videotrans\configure\_base.py", line 303, in _new_process
RuntimeError: An error occurred, possibly insufficient RAM or VRAM. Model:large-v3 GPU0
Traceback (most recent call last):
File "videotrans\configure\_base.py", line 285, in _new_process
File "concurrent\futures\_base.py", line 458, in result
File "concurrent\futures\_base.py", line 403, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
TaskCfgSTT(is_cuda=True, uuid='437f3ba732', cache_folder='D:/Software/pyvideotrans/tmp/50340/437f3ba732', target_dir='D:/Software/pyvideotrans/output/recogn', source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='D:/Software/pyvideotrans/output/recogn/402.srt', target_wav=None, target_wav_output=None, name='C:/Users/Acer/Music/402.mp3', noextname='402', basename='402.mp3', ext='mp3', dirname='C:/Users/Acer/Music', shound_del_name=None, detect_language='ru', recogn_type=0, model_name='large-v3', shibie_audio='D:/Software/pyvideotrans/tmp/50340/437f3ba732/402-1776183057.49953.wav', remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False)


=
system:Windows-10-10.0.22631-SP0
version:v3.98
frozen:True
language:zh
root_dir:D:/Software/pyvideotrans

1 Replies

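This is caused by insufficient VRAM or RAM. Speech recognition with the faster-whisper (local) large-v3 model needs a lot of VRAM (10 GB or more recommended) or RAM (16 GB or more recommended). The error shows that the process pool was terminated unexpectedly, which is a typical symptom of resource exhaustion.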
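To compare your own GPU against the 10 GB figure above, here is a minimal sketch that reads total and free VRAM from `nvidia-smi`. The function names and the threshold constant are illustrative and not part of pyVideoTrans; the 10 GB value is simply the recommendation quoted in this reply, not a measured requirement.

```python
import subprocess

# Threshold taken from the recommendation in this reply (assumption, not measured).
LARGE_V3_MIN_VRAM_MIB = 10 * 1024


def parse_gpu_memory(csv_text):
    """Parse lines of 'total, free' (MiB, no header, no units), one GPU per line."""
    gpus = []
    for line in csv_text.strip().splitlines():
        total, free = (int(part.strip()) for part in line.split(","))
        gpus.append({"total_mib": total, "free_mib": free})
    return gpus


def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory; requires the NVIDIA driver tools on PATH."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.total,memory.free",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


def enough_for_large_v3(gpu):
    """True when free VRAM meets the 10 GB recommendation quoted above."""
    return gpu["free_mib"] >= LARGE_V3_MIN_VRAM_MIB
```

Calling `query_gpu_memory()` and passing each entry to `enough_for_large_v3` gives a quick indication of whether GPU0 is likely to hold the model. Free VRAM changes as other programs run, so treat the result as a rough guide.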

Solutions:

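  1. Use a smaller model: to the right of the speech recognition channel on the main interface, change the model from large-v3 to medium, small, or base. The large-v3 model has the highest hardware requirements.
  2. Adjust the advanced settings to reduce resource usage: click Tools/Options -> Advanced Options (工具/选项 -> 高级选项) in the top menu, then make the following changes in the "Speech recognition parameters" (语音识别参数) area: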

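    • CUDA data type (CUDA数据类型): change float32 to float16 or int8.
    • Recognition accuracy beam_size: change 5 to 1.
    • Recognition accuracy best_of: change 5 to 1.
    • Enable context awareness (启用上下文感知): make sure this is false (unchecked).
  3. Check and limit concurrent tasks: in the "General settings" (通用设置) area of "Advanced Options":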

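    • Make sure "Simultaneous GPU tasks [takes effect after restart]" (GPU同时任务数[重启生效]) is set to 1, unless a single card has more than 20 GB of VRAM.
    • If you process several videos at once, you can check "Force serial processing for batch translation" (批量翻译时强制串行) so that multiple tasks do not compete for GPU resources.
  4. Turn off secondary recognition: if "Secondary recognition" (二次识别) is checked in the upper-right corner of the main interface, uncheck it. When several videos are translated at once, secondary recognition significantly increases VRAM demand.
  5. Make sure the CUDA environment is correct: although the error mentions GPU0, you still need to confirm that the CUDA version is 12.8 and that cuDNN is installed correctly. Run nvcc -V and nvidia-smi in CMD to verify.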
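For anyone running faster-whisper directly rather than through the GUI, the settings in step 2 can be sketched roughly as follows. This assumes the GUI options map onto the faster-whisper arguments `compute_type`, `beam_size`, `best_of`, and `condition_on_previous_text`; that mapping is an assumption and is not confirmed by the traceback. The helper names are hypothetical.

```python
def low_memory_settings():
    """Return (model kwargs, transcribe kwargs) mirroring the step 2 suggestions."""
    model_kwargs = {
        "device": "cuda",
        "compute_type": "float16",  # or "int8" for a further reduction
    }
    transcribe_kwargs = {
        "beam_size": 1,  # default is 5
        "best_of": 1,  # default is 5
        # Assumed equivalent of the "context awareness" checkbox.
        "condition_on_previous_text": False,
    }
    return model_kwargs, transcribe_kwargs


def transcribe_low_memory(audio_path, model_name="medium", language=None):
    """Transcribe with reduced memory use; needs the faster-whisper package."""
    # Imported lazily so the settings helper works without the package installed.
    from faster_whisper import WhisperModel

    model_kwargs, transcribe_kwargs = low_memory_settings()
    model = WhisperModel(model_name, **model_kwargs)
    segments, _info = model.transcribe(
        audio_path, language=language, **transcribe_kwargs
    )
    # segments is a lazy generator; materialise it into (start, end, text) tuples.
    return [(seg.start, seg.end, seg.text) for seg in segments]
```

For the file in this report, a call might look like `transcribe_low_memory("402.mp3", model_name="medium", language="ru")`. Lower beam_size and best_of values trade some accuracy for lower memory use and faster decoding.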

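If the problem persists after these adjustments, the current hardware simply cannot handle the large-v3 model, and you should switch to a smaller one.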
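If you script the recognition yourself, the "switch to a smaller model" advice can be automated with a fallback loop. The sketch below is hypothetical and not how pyVideoTrans behaves: `run_model` stands for any callable you supply that loads the named model and returns a result, and the failure is assumed to surface as a `RuntimeError` or `MemoryError`, as it does in the traceback above.

```python
# Model names from step 1, largest first.
FALLBACK_ORDER = ("large-v3", "medium", "small", "base")


def recognize_with_fallback(run_model, models=FALLBACK_ORDER):
    """Try each model in order and return (model_name, result) for the first success."""
    last_error = None
    for name in models:
        try:
            return name, run_model(name)
        except (RuntimeError, MemoryError) as error:
            # Likely out of RAM/VRAM; remember the error and try a smaller model.
            last_error = error
    raise RuntimeError("all models failed, see the chained error") from last_error
```

A crash that kills the whole interpreter cannot be caught this way, which is presumably why the application runs recognition in a separate worker process and reports `BrokenProcessPool` instead.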

Please consult the relevant documentation:
