#4320 TaskCfgVTT(is_cuda=True, uuid='12e409df79', cache_folder='F:/win-pyvideotrans-v3.99-418/tmp/30472/12e409df79', target_di

220.246* Posted at: 6 hours ago 👁10

Error during the speech recognition stage [openai-whisper (local)] Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 114, in openai_whisper
File "whisper\transcribe.py", line 295, in transcribe
File "whisper\transcribe.py", line 201, in decode_with_fallback
File "torch\utils\_contextlib.py", line 116, in decorate_context

return func(*args, **kwargs)

File "whisper\decoding.py", line 824, in decode
File "torch\utils\_contextlib.py", line 116, in decorate_context

return func(*args, **kwargs)

File "whisper\decoding.py", line 737, in run
File "whisper\decoding.py", line 687, in _main_loop
File "whisper\decoding.py", line 163, in logits
File "torch\nn\modules\module.py", line 1751, in _wrapped_call_impl

return self._call_impl(*args, **kwargs)

File "torch\nn\modules\module.py", line 1762, in _call_impl

return forward_call(*args, **kwargs)

File "whisper\model.py", line 242, in forward
File "torch\nn\modules\module.py", line 1751, in wrapped_call
......
_contextlib.py", line 116, in decorate_context

return func(*args, **kwargs)

File "whisper\decoding.py", line 824, in decode
File "torch\utils\_contextlib.py", line 116, in decorate_context

return func(*args, **kwargs)

File "whisper\decoding.py", line 737, in run
File "whisper\decoding.py", line 687, in _main_loop
File "whisper\decoding.py", line 163, in logits
File "torch\nn\modules\module.py", line 1751, in _wrapped_call_impl

return self._call_impl(*args, **kwargs)

File "torch\nn\modules\module.py", line 1762, in _call_impl

return forward_call(*args, **kwargs)

File "whisper\model.py", line 242, in forward
File "torch\nn\modules\module.py", line 1751, in _wrapped_call_impl

return self._call_impl(*args, **kwargs)

File "torch\nn\modules\module.py", line 1762, in _call_impl

return forward_call(*args, **kwargs)

File "whisper\model.py", line 167, in forward
File "torch\nn\modules\module.py", line 1751, in _wrapped_call_impl

return self._call_impl(*args, **kwargs)

File "torch\nn\modules\module.py", line 1762, in _call_impl

return forward_call(*args, **kwargs)

File "whisper\model.py", line 112, in forward
File "torch\nn\modules\module.py", line 1751, in _wrapped_call_impl

return self._call_impl(*args, **kwargs)

File "torch\nn\modules\module.py", line 1762, in _call_impl

return forward_call(*args, **kwargs)

File "whisper\model.py", line 46, in forward
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasLtMatmul with transpose_mat1 1 transpose_mat2 0 m 1280 n 1 k 1280 mat1_ld 1280 mat2_ld 1280 result_ld 1280 abcType 2 computeType 68 scaleType 0
TaskCfgVTT(is_cuda=True, uuid='12e409df79', cache_folder='F:/win-pyvideotrans-v3.99-418/tmp/30472/12e409df79', target_dir='F:/AAA/The Art of Negotiating the Best Deal/_video_out/17 - Managing Emotions and Psychological Traps-mp4', source_language='英语', source_language_code='en', source_sub='F:/AAA/The Art of Negotiating the Best Deal/_video_out/17 - Managing Emotions and Psychological Traps-mp4/en.srt', source_wav='F:/win-pyvideotrans-v3.99-418/tmp/30472/12e409df79/en.wav', source_wav_output='F:/AAA/The Art of Negotiating the Best Deal/_video_out/17 - Managing Emotions and Psychological Traps-mp4/en.m4a', target_language='简体中文', target_language_code='zh-cn', target_sub='F:/AAA/The Art of Negotiating the Best Deal/_video_out/17 - Managing Emotions and Psychological Traps-mp4/zh-cn.srt', target_wav='F:/win-pyvideotrans-v3.99-418/tmp/30472/12e409df79/target.wav', target_wav_output='F:/AAA/The Art of Negotiating the Best Deal/_video_out/17 - Managing Emotions and Psychological Traps-mp4/zh-cn.m4a', name='F:/AAA/The Art of Negotiating the Best Deal/17 - Managing Emotions and Psychological Traps.mp4', noextname='17 - Managing Emotions and Psychological Traps', basename='17 - Managing Emotions and Psychological Traps.mp4', ext='mp4', dirname='F:/AAA/The Art of Negotiating the Best Deal', shound_del_name=None, translate_type=13, tts_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='No', voice_autorate=False, video_autorate=False, remove_silent_mid=False, align_sub_audio=True, detect_language='en', recogn_type=1, model_name='large-v3', shibie_audio=None, remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False, subtitle_language=None, app_mode='biaozhun', subtitles='', targetdir_mp4='F:/AAA/The Art of Negotiating the Best Deal/_video_out/17 - Managing Emotions and Psychological Traps-mp4/17 - Managing Emotions and Psychological Traps.mp4', novoice_mp4='F:/win-pyvideotrans-v3.99-418/tmp/30472/12e409df79/novoice.mp4', 
is_separate=False, embed_bgm=True, instrument=None, vocal=None, back_audio='', clear_cache=False, background_music=None, subtitle_type=3, only_out_mp4=True, recogn2pass=True, output_srt=2, copysrt_rawvideo=False)


=
system:Windows-10-10.0.26200-SP0
version:v3.99
frozen:True
language:zh
root_dir:F:/win-pyvideotrans-v3.99-418

1 Reply

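This is an execution failure in the CUDA compute library (cuBLAS). It is usually caused by insufficient GPU memory, incompatible CUDA/cuDNN versions, or a driver fault. You are running the openai-whisper (local) large-v3 model with CUDA acceleration enabled (is_cuda=True), so the cublasLtMatmul call most likely failed because of GPU memory pressure or a library conflict.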
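To make the error message itself easier to read, here is a minimal sketch that pulls the numeric fields out of the RuntimeError text in the traceback. The helper name `parse_cublas_error` is made up for illustration, and the two enum lookup tables are my assumption about the values in the CUDA headers, so check them against your own toolkit before relying on them.

```python
import re

# Error text copied from the RuntimeError line of the traceback.
ERROR = (
    "CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasLtMatmul "
    "with transpose_mat1 1 transpose_mat2 0 m 1280 n 1 k 1280 "
    "mat1_ld 1280 mat2_ld 1280 result_ld 1280 "
    "abcType 2 computeType 68 scaleType 0"
)

# Assumed enum values (not taken from the post); verify in your CUDA headers.
CUDA_DATA_TYPES = {0: "CUDA_R_32F", 2: "CUDA_R_16F"}
CUBLAS_COMPUTE_TYPES = {64: "CUBLAS_COMPUTE_16F", 68: "CUBLAS_COMPUTE_32F"}


def parse_cublas_error(message: str) -> dict:
    """Return the 'name value' integer pairs that follow 'cublasLtMatmul with'."""
    _, _, tail = message.partition("cublasLtMatmul with")
    return {name: int(value) for name, value in re.findall(r"(\w+) (\d+)", tail)}


fields = parse_cublas_error(ERROR)

# Size of the m x k operand, assuming 2 bytes per element (half precision).
weight_bytes = fields["m"] * fields["k"] * 2

print("shape (m, n, k):", fields["m"], fields["n"], fields["k"])
print("data type:", CUDA_DATA_TYPES.get(fields["abcType"], "unknown"))
print("compute type:", CUBLAS_COMPUTE_TYPES.get(fields["computeType"], "unknown"))
print("operand size in bytes:", weight_bytes)
```

If those enum values are right, the failing call is a small half-precision multiplication (the operand is only a few megabytes, and n = 1 looks like a single decoding step). That would point toward overall memory pressure or a library/driver problem rather than this one multiplication being too large, though the message alone cannot confirm that.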

Please work through the following steps one at a time:

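  1. Reduce GPU memory use and compute load

    • Go to Menu → Tools → Advanced Options → Speech Recognition Parameters
    • Change CUDA data type to float16 or int8
    • Set recognition accuracy beam_size to 1 and recognition accuracy best_of to 1
    • Turn off Enable context awareness
    • Save and try again.
  2. Switch to a smaller recognition model

    • In the main window, change the speech recognition model from large-v3 to medium or small
  3. Temporarily turn off second-pass recognition

    • Uncheck Second-pass recognition in the upper-right corner of the main window to reduce GPU memory use.
  4. Check CUDA and the driver

    • Make sure CUDA 12.8 and cuDNN 9.11 (or later) are installed, and update the NVIDIA graphics driver to the latest recommended version. You can run nvidia-smi and nvcc -V on the command line to confirm.
  5. Try a different speech recognition channel

    • Prefer faster-whisper (local) with the same model; it is usually more GPU-memory efficient. If the error persists, temporarily turn off CUDA acceleration (uncheck CUDA acceleration; this runs on CPU only and is much slower) and test whether the task completes, which tells you whether the CUDA environment is at fault.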
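The steps above amount to a ladder of progressively lighter settings. As a rough sketch of that idea outside the GUI, the snippet below tries each configuration in turn and stops at the first one that works. Everything here is hypothetical: `ATTEMPTS`, `transcribe_with_fallback`, and the stand-in `fake_transcribe` are illustrative names, not part of pyVideoTrans or openai-whisper, and a real transcription call would have to be plugged in.

```python
# Hypothetical settings ladder, ordered from heaviest to lightest.
ATTEMPTS = [
    {"model": "large-v3", "device": "cuda", "beam_size": 1, "best_of": 1},
    {"model": "medium", "device": "cuda", "beam_size": 1, "best_of": 1},
    {"model": "small", "device": "cuda", "beam_size": 1, "best_of": 1},
    {"model": "small", "device": "cpu", "beam_size": 1, "best_of": 1},
]


def transcribe_with_fallback(transcribe, attempts):
    """Try each settings dict in order; return the first that succeeds.

    Returns (settings_used, result, errors_seen_before_success).
    Raises RuntimeError if every attempt fails.
    """
    errors = []
    for settings in attempts:
        try:
            return settings, transcribe(**settings), errors
        except RuntimeError as exc:
            # Record the failure and move on to the next, lighter setting.
            errors.append((settings, str(exc)))
    raise RuntimeError(f"all {len(errors)} attempts failed: {errors}")


def fake_transcribe(model, device, beam_size, best_of):
    """Stand-in for a real transcription call; always fails on CUDA."""
    if device == "cuda":
        raise RuntimeError("CUDA error: CUBLAS_STATUS_EXECUTION_FAILED")
    return f"transcribed with {model} on {device}"


used, text, errors = transcribe_with_fallback(fake_transcribe, ATTEMPTS)
print("succeeded with:", used)
print("failed attempts before success:", len(errors))
```

One caveat: after a real CUDA execution failure the process may be left in a bad state, so retrying inside the same process is not guaranteed to work. Running each attempt in a fresh process (or simply restarting the application between attempts) is the safer approach.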

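If the steps above do not resolve the problem, collect roughly the last 30 lines of error output from the newest log file in the logs/ directory and post them so there is more detail to work with.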
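If copying the log by hand is awkward, a short script can grab the tail for you. This is only a sketch: the function name `tail_of_newest_log` is made up, it assumes the logs are plain text files, and it picks the "newest" file by modification time.

```python
from collections import deque
from pathlib import Path


def tail_of_newest_log(log_dir, lines=30):
    """Return the last `lines` lines of the most recently modified file."""
    directory = Path(log_dir)
    if not directory.is_dir():
        return []
    files = [p for p in directory.iterdir() if p.is_file()]
    if not files:
        return []
    newest = max(files, key=lambda p: p.stat().st_mtime)
    # errors="replace" keeps the script from failing on odd encodings.
    with newest.open("r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the final `lines` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=lines)]


if __name__ == "__main__":
    # Run from the application's root directory so that "logs" resolves.
    for line in tail_of_newest_log("logs"):
        print(line)
```

Run it from the program's root directory and paste the printed lines into your reply.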

Please consult the related documentation:
