#2382 TaskCfg(cache_folder='D:/BaiduNetdiskDownload/据说可以翻译字幕的软件/win-pyvideotrans-v3.92/tmp/28592/cba9f368f4', target_dir='D:/B

221.0* Posted at: 8 hours ago 👁13

Speech recognition stage failed [faster-whisper (local)] Traceback (most recent call last):
File "videotrans\process\_overall.py", line 208, in run
File "faster_whisper\transcribe.py", line 890, in transcribe
File "faster_whisper\vad.py", line 89, in get_speech_timestamps
File "faster_whisper\vad.py", line 335, in call
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 810. MiB for an array with shape (368560, 576) and data type float32

Traceback (most recent call last):
File "videotrans\task\job.py", line 113, in run
File "videotrans\task\_speech2text.py", line 146, in recogn
File "videotrans\recognition\__init__.py", line 245, in run
File "videotrans\recognition\_base.py", line 80, in run
File "videotrans\recognition\_overall.py", line 182, in _exec
RuntimeError: Traceback (most recent call last):
File "videotrans\process\_overall.py", line 208, in run
File "faster_whisper\transcribe.py", line 890, in transcribe
File "faster_whisper\vad.py", line 89, in get_speech_timestamps
File "faster_whisper\vad.py", line 335, in call
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 810. MiB for an array with shape (368560, 576) and data type float32

TaskCfg(cache_folder='D:/BaiduNetdiskDownload/据说可以翻译字幕的软件/win-pyvideotrans-v3.92/tmp/28592/cba9f368f4', target_dir='D:/BaiduNetdiskDownload/据说可以翻译字幕的软件/win-pyvideotrans-v3.92/output/recogn', remove_noise=False, is_separate=False, detect_language='ja', subtitle_language=None, source_language=None, target_language=None, source_language_code=None, target_language_code=None, source_sub=None, target_sub='D:/BaiduNetdiskDownload/据说可以翻译字幕的软件/win-pyvideotrans-v3.92/output/recogn/HOWS-003-uncensored-nyap2p.com.srt', source_wav=None, source_wav_output=None, target_wav=None, target_wav_output=None, subtitles=None, novoice_mp4=None, noextname='HOWS-003-uncensored-nyap2p.com', shibie_audio='D:/BaiduNetdiskDownload/据说可以翻译字幕的软件/win-pyvideotrans-v3.92/tmp/28592/cba9f368f4/HOWS-003-uncensored-nyap2p.com-1767144115.3077657.wav', targetdir_mp4=None, instrument=None, vocal=None, back_audio=None, background_music=None, app_mode='biaozhun', subtitle_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role=None, copysrt_rawvideo=False, clear_cache=False, translate_type=None, tts_type=None, recogn_type=0, model_name='distil-large-v2', split_type=0, voice_autorate=False, video_autorate=False, cuda=False, name='D:/迅雷下载/HOWS-003-uncensored-HD/HOWS-003-uncensored-nyap2p.com.mp4', basename='HOWS-003-uncensored-nyap2p.com.mp4', ext='mp4', dirname='D:/迅雷下载/HOWS-003-uncensored-HD', shound_del_name=None, uuid='cba9f368f4', remove_silent_mid=False, align_sub_audio=True, enable_diariz=False, nums_diariz=0, rephrase=0, only_out_mp4=False)


=
system:Windows-10-10.0.26120-SP0
version:v3.92
frozen:True
language:zh
root_dir:D:/BaiduNetdiskDownload/据说可以翻译字幕的软件/win-pyvideotrans-v3.92

1 Replies

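This is a typical out-of-memory error. The traceback shows NumPy failing to allocate about 810 MiB, which is system RAM rather than GPU memory. The allocation happens in faster-whisper's VAD (voice activity detection), which needs a lot of memory when processing long audio. Try the following steps: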
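
The size in the error message can be checked by hand: a float32 array of shape (368560, 576) takes 368560 × 576 × 4 bytes. The sketch below does that arithmetic. The duration estimate at the end rests on an assumption that is not stated in this thread: that each row holds one 512-sample VAD window (plus 64 context samples) of 16 kHz audio.

```python
# Reproduce the allocation size reported in the traceback.
rows, cols = 368560, 576      # array shape from the error message
bytes_per_float32 = 4         # float32 uses 4 bytes per element

size_bytes = rows * cols * bytes_per_float32
size_mib = size_bytes / 2**20
print(f"{size_mib:.1f} MiB")  # NumPy rounds this to "810. MiB"

# ASSUMPTION (not from the thread): one row = one 512-sample window of
# 16 kHz audio, with 64 extra context samples making up the 576 columns.
window_samples = 512
sample_rate = 16_000
duration_hours = rows * window_samples / sample_rate / 3600
print(f"roughly {duration_hours:.1f} hours of audio")
```

If that assumption holds, the input is a multi-hour recording, which fits the "long audio" explanation above.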

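  1. Reduce memory usage (try this first)

    • On the main window, change the speech recognition mode from whole-file recognition (整体识别) to batch inference (批量推理)
    • In the menu bar, choose Tools/Options (工具/选项) -> Advanced Options (高级选项)
    • In the "Faster/OpenAI/Whisper speech recognition fine-tuning" (Faster/OpenAI/Whisper 语音识别微调) section:

      • Change the CUDA data type (CUDA数据类型) from float32 to int8
      • Set both recognition accuracy values, beam_size and best_of (识别准确度), to 1
      • Set enable context awareness (启用上下文感知) to False
  2. Use a smaller model

    • In the speech recognition model selector on the main window, replace distil-large-v2 with a smaller model such as medium or small
  3. Check and enable CUDA (if available)

    • Make sure CUDA acceleration (CUDA加速) on row 6 of the main window is checked, if you have an NVIDIA GPU.
  4. Fix the file path

    • Your video path is long and contains Chinese characters (D:/迅雷下载/HOWS-003-uncensored-HD). Copy the video to a shorter path without Chinese or special characters (for example D:/tmp/video.mp4) and process it from there.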
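
For readers who call faster-whisper directly rather than through the GUI, the settings from steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not pyvideotrans's own code: the helper names `low_memory_settings` and `transcribe_low_memory` are hypothetical, and the mapping of "enable context awareness" to `condition_on_previous_text` is an assumption.

```python
from typing import Any


def low_memory_settings(model_name: str = "small") -> dict[str, dict[str, Any]]:
    """Collect the low-memory options from steps 1 and 2 as keyword arguments."""
    return {
        "model": {
            "model_size_or_path": model_name,  # step 2: smaller than distil-large-v2
            "device": "cpu",                   # the failing task ran with cuda=False
            "compute_type": "int8",            # step 1: int8 instead of float32
        },
        "transcribe": {
            "language": "ja",                  # detect_language from the task config
            "beam_size": 1,                    # step 1
            "best_of": 1,                      # step 1
            # ASSUMPTION: this is the option behind "enable context awareness".
            "condition_on_previous_text": False,
        },
    }


def transcribe_low_memory(audio_path: str, model_name: str = "small") -> list[str]:
    """Transcribe audio_path with the settings above and return the segment texts."""
    # Imported lazily so the settings helper works without faster-whisper installed.
    from faster_whisper import WhisperModel

    cfg = low_memory_settings(model_name)
    model = WhisperModel(**cfg["model"])
    segments, _info = model.transcribe(audio_path, **cfg["transcribe"])
    return [segment.text for segment in segments]
```

Calling `transcribe_low_memory("D:/tmp/video.wav", "medium")` would then use the medium model with these options; the path here is only an example.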

If the steps above do not solve the problem, please provide the latest log file from the D:/BaiduNetdiskDownload/据说可以翻译字幕的软件/win-pyvideotrans-v3.92/logs/ directory.
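
One way to locate the latest log file is to pick the most recently modified file in that directory. The sketch below assumes the logs are plain files directly inside `logs/`; `newest_log` is a hypothetical helper, not part of pyvideotrans.

```python
from pathlib import Path
from typing import Optional


def newest_log(logs_dir: str) -> Optional[Path]:
    """Return the most recently modified file in logs_dir, or None if there is none."""
    directory = Path(logs_dir)
    if not directory.is_dir():
        return None
    files = [p for p in directory.iterdir() if p.is_file()]
    # default=None covers an existing but empty directory.
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```

For example, `newest_log("D:/BaiduNetdiskDownload/据说可以翻译字幕的软件/win-pyvideotrans-v3.92/logs")` returns the path to attach to a follow-up post.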

