#3012 TaskCfg(cache_folder='/Users/mac/pyvideotrans/tmp/65591/cb4cd1bbc8', target_dir='/Users/mac/pyvideotrans/output/recogn',

23.177* Posted at: 16 days ago 👁33

Error during the speech recognition stage [openai-whisper (local)]
Traceback (most recent call last):
  File "/Users/mac/pyvideotrans/videotrans/process/stt_fun.py", line 61, in openai_whisper
    model = whisper.load_model(
  File "/Users/mac/pyvideotrans/.venv/lib/python3.10/site-packages/whisper/__init__.py", line 161, in load_model
    return model.to(device)
  File "/Users/mac/pyvideotrans/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1355, in to
    return self._apply(convert)
  File "/Users/mac/pyvideotrans/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1003, in _apply
    self._buffers[key] = fn(buf)
  File "/Users/mac/pyvideotrans/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1341, in convert
    return t.to(
NotImplementedError: Could not run 'aten::_sparse_coo_tensor_with_dims_and_tensors' with arguments from the 'SparseMPS' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selectiv
......
riableType_2.cpp:20142 [autograd kernel]
AutogradNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
Tracer: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:17801 [kernel]
AutocastCPU: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:322 [backend fallback]
AutocastMTIA: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:466 [backend fallback]
AutocastXPU: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:504 [backend fallback]
AutocastMPS: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:209 [backend fallback]
AutocastCUDA: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:165 [backend fallback]
FuncTorchBatched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:731 [backend fallback]
BatchedNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:758 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:27 [backend fallback]
Batched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:208 [backend fallback]
PythonTLSSnapshot: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:202 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:475 [backend fallback]
PreDispatch: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:206 [backend fallback]
PythonDispatcher: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:198 [backend fallback]
TaskCfg(cache_folder='/Users/mac/pyvideotrans/tmp/65591/cb4cd1bbc8', target_dir='/Users/mac/pyvideotrans/output/recogn', remove_noise=False, is_separate=False, detect_language='auto', subtitle_language=None, source_language=None, target_language=None, source_language_code=None, target_language_code=None, source_sub=None, target_sub='/Users/mac/pyvideotrans/output/recogn/1-15 关键词浓度的测试以及规范.srt', source_wav=None, source_wav_output=None, target_wav=None, target_wav_output=None, subtitles='', novoice_mp4=None, noextname='1-15 关键词浓度的测试以及规范', shibie_audio='/Users/mac/pyvideotrans/tmp/65591/cb4cd1bbc8/1-15 关键词浓度的测试以及规范-1769855230.533411.wav', targetdir_mp4=None, instrument=None, vocal=None, back_audio=None, background_music=None, app_mode='biaozhun', subtitle_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role=None, copysrt_rawvideo=False, clear_cache=False, translate_type=None, tts_type=None, recogn_type=1, model_name='medium', voice_autorate=False, video_autorate=False, cuda=False, name='/Users/mac/Desktop/seo/1-15 关键词浓度的测试以及规范.mp4', basename='1-15 关键词浓度的测试以及规范.mp4', ext='mp4', dirname='/Users/mac/Desktop/seo', shound_del_name=None, uuid='cb4cd1bbc8', remove_silent_mid=False, align_sub_audio=True, enable_diariz=False, nums_diariz=0, rephrase=0, only_out_mp4=False, fix_punc=False, recogn2pass=False)


=
system:macOS-26.2-arm64-arm-64bit
version:v3.96
frozen:False
language:zh
root_dir:/Users/mac/pyvideotrans
Python: 3.10.19 (main, Jan 14 2026, 23:38:29) [Clang 21.1.4 ]

1 Reply

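This error occurs because, when the openai-whisper (local) model runs on macOS, PyTorch's MPS (Metal Performance Shaders) backend does not support some of the sparse tensor operations the model requires. The traceback shows the failure inside model.to(device), at the point where a sparse model buffer is moved to the MPS device.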

Solutions:

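  1. Switch the speech recognition channel: this is the most direct fix. Under "Speech recognition channel" in the application UI, replace openai-whisper (local) with faster-whisper. faster-whisper runs on CTranslate2 rather than on PyTorch's MPS backend, so it does not go through the operator that fails here.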
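  2. Fall back to the CPU: if you must use openai-whisper, try setting an environment variable before launching the application so that PyTorch runs operations the MPS backend does not implement on the CPU instead. Run the following commands in a terminal to set the variable and start the application:

    export PYTORCH_ENABLE_MPS_FALLBACK=1
    cd /Users/mac/pyvideotrans
    uv run sp.py

    Note: recognition will be significantly slower. The fallback may not cover the sparse tensor error shown in the traceback; if the error persists, use option 1 or 3.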

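  3. Use an online recognition channel: if your network allows it, consider an online speech recognition channel (such as the OpenAI API or Alibaba Bailian Qwen3-ASR), which avoids local model compatibility problems entirely.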
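
For anyone calling openai-whisper from their own script rather than through the application UI, the CPU fallback idea in option 2 can also be sketched in code. This is only an illustration and not how pyvideotrans loads models: `load_with_cpu_fallback` is a hypothetical helper, and it assumes the loader accepts a `device` keyword the way `whisper.load_model` does. It tries each device in order and moves on when loading raises `NotImplementedError`, the exception type seen in the traceback.

```python
from typing import Any, Callable, Optional, Sequence, Tuple


def load_with_cpu_fallback(
    loader: Callable[..., Any],
    model_name: str,
    devices: Sequence[str] = ("mps", "cpu"),
) -> Tuple[Any, str]:
    """Try each device in order; return (model, device) for the first that loads.

    `loader` is any callable taking (name, device=...), for example
    whisper.load_model. This helper is hypothetical and for illustration only.
    """
    last_error: Optional[Exception] = None
    for device in devices:
        try:
            return loader(model_name, device=device), device
        except NotImplementedError as exc:
            # Raised when a backend lacks an operator the model needs,
            # as with the 'SparseMPS' error in the traceback.
            last_error = exc
    raise RuntimeError(
        f"could not load {model_name!r} on any of {list(devices)}"
    ) from last_error
```

With openai-whisper installed, `load_with_cpu_fallback(whisper.load_model, "medium")` would attempt MPS first and retry on the CPU if the sparse operator is missing. Passing `devices=("cpu",)` skips the MPS attempt altogether.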

