#1962 TaskCfg(cache_folder='D:/win-pyvideotrans-v3.86/tmp18332/speech2text', target_dir='d:/浏览器下载/win-pyvideotrans-v3.86/outpu

197.149* Posted at: 7 hours ago 👁7

Error during the speech recognition stage: [faster-whisper (local)] Runtime error: Traceback (most recent call last):
File "videotrans\process\_overall.py", line 149, in run
File "faster_whisper\transcribe.py", line 1851, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1213, in generate_segments
File "faster_whisper\transcribe.py", line 1446, in generate_with_fallback
RuntimeError: CUDA failed with error out of memory
:
Traceback (most recent call last):
File "videotrans\task\job.py", line 113, in run
File "videotrans\task\_speech2text.py", line 140, in recogn
File "videotrans\recognition\__init__.py", line 236, in run
File "videotrans\recognition\_base.py", line 78, in run
File "videotrans\recognition\_overall.py", line 193, in _exec
RuntimeError: Traceback (most recent call last):
File "videotrans\process\_overall.py", line 149, in run
File "faster_whisper\transcribe.py", line 1851, in restore_speech_timestamps
File "faster_whisper\transcribe.py", line 1213, in generate_segments
File "faster_whisper\transcribe.py", line 1446, in generate_with_fallback
RuntimeError: CUDA failed with error out of memory

TaskCfg(cache_folder='D:/win-pyvideotrans-v3.86/tmp18332/speech2text', target_dir='d:/浏览器下载/win-pyvideotrans-v3.86/output/recogn', remove_noise=False, is_separate=False, detect_language='fr', subtitle_language=None, source_language=None, target_language=None, source_language_code=None, target_language_code=None, source_sub=None, target_sub='d:/浏览器下载/win-pyvideotrans-v3.86/output/recogn/小王子 (1).srt', source_wav=None, source_wav_output=None, target_wav=None, target_wav_output=None, subtitles=None, novoice_mp4=None, noextname='小王子 (1)', shibie_audio='D:/win-pyvideotrans-v3.86/tmp18332/speech2text/小王子 (1)-1765464438.7408297.wav', targetdir_mp4=None, instrument=None, vocal=None, back_audio=None, background_music=None, app_mode='biaozhun', subtitle_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role=None, copysrt_rawvideo=False, clear_cache=False, translate_type=None, tts_type=None, recogn_type=0, model_name='large-v1', split_type=0, voice_autorate=False, video_autorate=False, cuda=True, name='D:/calibre_vault/le_petit_prince_zh_fr/audio/小王子 (1).mp3', basename='小王子 (1).mp3', ext='mp3', dirname='D:/calibre_vault/le_petit_prince_zh_fr/audio', shound_del_name=None, uuid='a2adf66d8c', remove_silent_mid=False, align_sub_audio=True, enable_diariz=False, nums_diariz=0, auto_fix=True)


=
system:Windows-10-10.0.26200-SP0
version:v3.86
frozen:True
language:zh
root_dir:D:/win-pyvideotrans-v3.86

1 Replies

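The error message clearly states "CUDA failed with error out of memory", meaning the GPU ran out of video memory (VRAM). This is a common problem when running speech recognition with a large faster-whisper model, such as the large-v1 you are using.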

Try the following steps one at a time:

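1. Lower the model precision (most effective)

  • Open the menu bar: Tools -> Advanced Options (工具 -> 高级选项)
  • In the "Faster/OpenAI/Whisper speech recognition fine-tuning" area, find "CUDA data type" (CUDA数据类型)
  • Change it from the default float32 to float16 or int8. int8 uses the least VRAM, so try it first.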

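2. Reduce decoding complexity (saves VRAM)

  • In the same area, change both beam_size and best_of (each labeled 识别准确度, "recognition accuracy") from the default 5 to 1. This slightly reduces recognition quality but significantly lowers VRAM usage.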

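3. Switch to a smaller recognition model

  • In the speech recognition model (语音识别模型) dropdown on row 2 of the main window, replace large-v1 with a smaller model such as medium, small, or base. The smaller the model, the less VRAM it uses and the faster it runs, at the cost of recognition accuracy.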

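4. Switch the recognition mode

  • In the recognition mode (识别模式) dropdown on row 2 of the main window, change 整体识别 (overall recognition) to 批量推理 (batch inference). Batch inference mode usually needs less VRAM.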

5. (Optional) Disable context awareness

  • In the same area of Advanced Options, set "Enable context awareness" (启用上下文感知) to False. This also saves some VRAM.
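
For anyone calling faster-whisper directly rather than through the GUI, the settings in steps 1, 2, 3 and 5 can be sketched roughly as below. This is only an illustration: `WhisperSettings` and `transcribe` are hypothetical names, and the idea that "Enable context awareness" maps to faster-whisper's `condition_on_previous_text` is an assumption, not something confirmed for pyvideotrans. Step 4 (recognition mode) is specific to pyvideotrans and is left out.

```python
from dataclasses import dataclass


@dataclass
class WhisperSettings:
    """Hypothetical container mirroring the GUI options named in the steps."""
    model_name: str = "large-v1"   # step 3: model dropdown
    compute_type: str = "float32"  # step 1: "CUDA data type"
    beam_size: int = 5             # step 2
    best_of: int = 5               # step 2
    # Step 5. The default is not stated in the reply; True is an assumption.
    context_aware: bool = True

    def model_kwargs(self) -> dict:
        # Keyword arguments for faster_whisper.WhisperModel(...)
        return {"device": "cuda", "compute_type": self.compute_type}

    def transcribe_kwargs(self) -> dict:
        # Keyword arguments for WhisperModel.transcribe(...)
        return {
            "beam_size": self.beam_size,
            "best_of": self.best_of,
            # Assumed mapping of the "context awareness" switch.
            "condition_on_previous_text": self.context_aware,
        }


def transcribe(audio_path: str, settings: WhisperSettings, language: str = "fr"):
    """Run faster-whisper with the given settings (needs faster-whisper installed)."""
    from faster_whisper import WhisperModel  # imported lazily on purpose

    model = WhisperModel(settings.model_name, **settings.model_kwargs())
    segments, _info = model.transcribe(
        audio_path, language=language, **settings.transcribe_kwargs()
    )
    return [(seg.start, seg.end, seg.text) for seg in segments]


# Steps 1, 2 and 5 applied together, keeping the large-v1 model.
low_vram = WhisperSettings(
    compute_type="int8", beam_size=1, best_of=1, context_aware=False
)
```

Calling `transcribe("audio.mp3", low_vram)` would then load large-v1 with int8 weights and greedy decoding; the file name is a placeholder.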

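Suggested order: try Step 1 and Step 2 first; if the problem persists, move on to Step 3 and Step 4. Adjusting the first two settings usually resolves most out-of-memory errors.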
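
The same order can be written as a retry loop if you script recognition yourself. The sketch below is not how pyvideotrans behaves internally: `FALLBACK_LADDER`, `is_cuda_oom`, `run_with_fallback` and `simulated_run` are all hypothetical names, and the out-of-memory check simply matches the message text from the traceback above.

```python
# Cheapest changes first, CPU as the last resort.
FALLBACK_LADDER = [
    {"model": "large-v1", "device": "cuda", "compute_type": "int8", "beam_size": 1},
    {"model": "medium", "device": "cuda", "compute_type": "int8", "beam_size": 1},
    {"model": "small", "device": "cuda", "compute_type": "int8", "beam_size": 1},
    {"model": "small", "device": "cpu", "compute_type": "int8", "beam_size": 1},
]


def is_cuda_oom(exc: BaseException) -> bool:
    """True if the exception looks like the CUDA out-of-memory error in the log."""
    return isinstance(exc, RuntimeError) and "out of memory" in str(exc).lower()


def run_with_fallback(run, ladder=FALLBACK_LADDER):
    """Call run(settings) for each entry until one does not run out of memory."""
    last_error = None
    for settings in ladder:
        try:
            return settings, run(settings)
        except RuntimeError as exc:
            if not is_cuda_oom(exc):
                raise  # unrelated errors should not be hidden
            last_error = exc
    raise RuntimeError("every configuration ran out of GPU memory") from last_error


def simulated_run(settings):
    # Stand-in for a real transcription call: pretends large-v1 never fits.
    if settings["model"] == "large-v1":
        raise RuntimeError("CUDA failed with error out of memory")
    return f"transcribed with {settings['model']}"


chosen, result = run_with_fallback(simulated_run)
```

In real use, `simulated_run` would be replaced by a function that loads the model and transcribes the file with the given settings.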

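If the error persists after trying all of the above, your GPU probably has too little VRAM to run the large-v1 model. In that case, the only options are to use a smaller model (such as tiny) or to run recognition on the CPU by unchecking "CUDA acceleration" (CUDA加速) on row 6 of the main window, although CPU recognition is very slow.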
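
To get a rough sense of whether a model can fit at all, you can estimate the size of its weights as parameter count times bytes per parameter. The sketch below uses the approximate parameter counts published in the OpenAI Whisper README. It covers the weights only, so actual usage will be higher once activations, decoding buffers and other processes on the GPU are included, and real int8 quantization may not shrink every tensor. `weight_gib` is a hypothetical helper name.

```python
# Approximate parameter counts (millions) from the OpenAI Whisper README.
PARAMS_MILLIONS = {"tiny": 39, "base": 74, "small": 244, "medium": 769, "large": 1550}

# Storage size of one parameter for each data type.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_gib(model: str, compute_type: str) -> float:
    """Lower-bound estimate of the weight memory in GiB (weights only)."""
    total_bytes = PARAMS_MILLIONS[model] * 1_000_000 * BYTES_PER_PARAM[compute_type]
    return total_bytes / 2**30


if __name__ == "__main__":
    for model in ("large", "medium", "small"):
        for dtype in ("float32", "float16", "int8"):
            print(f"{model:>6} {dtype:>8}: {weight_gib(model, dtype):.2f} GiB")
```

By this estimate the large model's weights alone need close to 6 GiB in float32 and about a quarter of that in int8, which is why step 1 tends to help the most.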
