#3455 TaskCfgSTT(is_cuda=True, uuid='59f00a83a7', cache_folder='E:/ytrans/tmp/4868/59f00a83a7', target_dir='E:/ytrans/output/r

202.150* Posted at: 1 day ago 👁15

ASR Error [faster-whisper (Local)] Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 257, in faster_whisper
File "faster_whisper\transcribe.py", line 586, in _batched_segments_generator
File "faster_whisper\transcribe.py", line 120, in forward
File "faster_whisper\transcribe.py", line 209, in generate_segment_batched
File "faster_whisper\transcribe.py", line 1400, in encode
RuntimeError: CUDA failed with error out of memory

Traceback (most recent call last):
File "videotrans\task\job.py", line 106, in run
File "videotrans\task\_speech2text.py", line 153, in recogn
File "videotrans\recognition\__init__.py", line 266, in run
File "videotrans\recognition\_base.py", line 142, in run
File "videotrans\recognition\_overall.py", line 63, in _exec
File "videotrans\recognition\_overall.py", line 136, in _faster
File "videotrans\configure\_base.py", line 290, in _new_process
RuntimeError: Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 257, in faster_whisper
File "faster_whisper\transcribe.py", line 586, in _batched_segments_generator
File "faster_whisper\transcribe.py", line 120, in forward
File "faster_whisper\transcribe.py", line 209, in generate_segment_batched
File "faster_whisper\transcribe.py", line 1400, in encode
RuntimeError: CUDA failed with error out of memory
TaskCfgSTT(is_cuda=True, uuid='59f00a83a7', cache_folder='E:/ytrans/tmp/4868/59f00a83a7', target_dir='E:/ytrans/output/recogn', source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='E:/ytrans/output/recogn/2.srt', target_wav=None, target_wav_output=None, name='E:/PH/2.m4a', noextname='2', basename='2.m4a', ext='m4a', dirname='E:/PH', shound_del_name=None, detect_language='ko', recogn_type=0, model_name='large-v3-turbo', shibie_audio='E:/ytrans/tmp/4868/59f00a83a7/2-1772619698.5026176.wav', remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False)


=
system:Windows-10-10.0.26100-SP0
version:v3.97
frozen:True
language:en
root_dir:E:/ytrans

1 Reply

This error is caused by insufficient GPU memory (VRAM) when using the faster-whisper model with CUDA acceleration.
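To confirm how much VRAM is actually free before a run, something like the following sketch may help. It assumes an NVIDIA driver with `nvidia-smi` on the PATH; the helper names `parse_gpu_memory` and `query_gpu_memory` are made up for this illustration and are not part of pyvideotrans or faster-whisper.

```python
import subprocess

# nvidia-smi query: one CSV line per GPU, memory values in MiB (no header, no units).
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,memory.free",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse the CSV text produced by QUERY into one dict per GPU."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split from the right so an unusual GPU name cannot shift the numeric fields.
        name, total, free = [field.strip() for field in line.rsplit(",", 2)]
        gpus.append({"name": name, "total_mib": int(total), "free_mib": int(free)})
    return gpus


def query_gpu_memory():
    """Run nvidia-smi and return the parsed per-GPU memory figures."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpu_memory():
        print(f"{gpu['name']}: {gpu['free_mib']} MiB free of {gpu['total_mib']} MiB")
```

If the free figure is already low before the task starts, another program (or a second concurrent task) is probably holding VRAM.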

To resolve this, follow these steps:

  1. Reduce GPU memory usage:

    • Go to Menu -> Tools/Options -> Advanced Options.
    • In the [General Settings] section, set GPU Concurrent Tasks [restart required] (GPU同时任务数[重启生效]) to 1, if it is not already.
    • In the [Speech Recognition Parameters] section, lower Recognition Accuracy beam_size (识别准确度beam_size) and Recognition Accuracy best_of (识别准确度best_of) to 1.
    • In the same section, set CUDA Data Type (CUDA数据类型) to int8. This uses less memory but may slightly reduce accuracy.
    • Restart the software for the GPU task setting to take effect.
  2. Use a smaller model:

    • In the main interface, for the Speech Recognition channel, select faster-whisper (Local).
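    • In the model dropdown, choose a smaller model than large-v3-turbo, such as medium or small. Note that medium is only slightly smaller than large-v3-turbo, so small gives the larger saving. For reference, the OpenAI Whisper README lists approximate VRAM needs of about 10 GB for large, 6 GB for turbo, 5 GB for medium, and 2 GB for small; faster-whisper generally needs less than these figures.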
  3. Disable CUDA acceleration (as a last resort):

    • If the above steps fail, you can temporarily disable GPU acceleration. In the main interface, for the Speech Recognition channel, uncheck the CUDA checkbox next to the model selection. Recognition then runs on the CPU and uses system RAM instead of VRAM, which is much slower but avoids the CUDA out-of-memory error.
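For anyone calling faster-whisper directly rather than through the GUI, the three steps above can be sketched as a fallback ladder. This is only an illustration under stated assumptions: `ATTEMPTS`, `is_cuda_oom`, `run_faster_whisper`, and `transcribe_with_fallback` are hypothetical names, the order of the ladder is my own choice, and this is not how pyvideotrans itself handles the error. The only library calls used are `WhisperModel(...)` and `model.transcribe(...)` with the `device`, `compute_type`, `beam_size`, `best_of`, and `language` parameters.

```python
# Hypothetical ladder: lighter settings first, then smaller models, then CPU.
ATTEMPTS = [
    {"model": "large-v3-turbo", "device": "cuda", "compute_type": "int8", "beam_size": 1, "best_of": 1},
    {"model": "medium", "device": "cuda", "compute_type": "int8", "beam_size": 1, "best_of": 1},
    {"model": "small", "device": "cuda", "compute_type": "int8", "beam_size": 1, "best_of": 1},
    {"model": "small", "device": "cpu", "compute_type": "int8", "beam_size": 1, "best_of": 1},
]


def is_cuda_oom(exc):
    """True if the exception text matches the out-of-memory error in the traceback."""
    return "out of memory" in str(exc).lower()


def run_faster_whisper(audio_path, cfg, language=None):
    """Transcribe one file with the given settings (requires faster-whisper)."""
    from faster_whisper import WhisperModel  # imported lazily so the ladder can be tested without it

    model = WhisperModel(cfg["model"], device=cfg["device"], compute_type=cfg["compute_type"])
    segments, _info = model.transcribe(
        audio_path, beam_size=cfg["beam_size"], best_of=cfg["best_of"], language=language
    )
    # segments is a generator; consuming it here keeps any OOM inside this call.
    return [(seg.start, seg.end, seg.text) for seg in segments]


def transcribe_with_fallback(audio_path, attempts=ATTEMPTS, run=run_faster_whisper):
    """Try each configuration in order, moving on only when the failure is an OOM."""
    last_error = None
    for cfg in attempts:
        try:
            return cfg, run(audio_path, cfg)
        except RuntimeError as exc:
            if not is_cuda_oom(exc):
                raise  # unrelated errors should not be hidden by the fallback
            last_error = exc
    raise RuntimeError("every configuration ran out of memory") from last_error
```

A call such as `transcribe_with_fallback("E:/PH/2.m4a")` returns the configuration that succeeded together with the segments. One caveat: VRAM from a failed attempt may not be fully released inside the same process, so running each attempt in a fresh process is the safer design if retries keep failing.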

