#3401 TaskCfgSTT(is_cuda=True, uuid='bb6af76f14', cache_folder='E:/ytrans/tmp/7320/bb6af76f14', target_dir='E:/ytrans/output/r


ASR Error [faster-whisper (Local)] Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 257, in faster_whisper
File "faster_whisper\transcribe.py", line 586, in _batched_segments_generator
File "faster_whisper\transcribe.py", line 120, in forward
File "faster_whisper\transcribe.py", line 209, in generate_segment_batched
File "faster_whisper\transcribe.py", line 1400, in encode
RuntimeError: CUDA failed with error out of memory

Traceback (most recent call last):
File "videotrans\task\job.py", line 106, in run
File "videotrans\task\_speech2text.py", line 153, in recogn
File "videotrans\recognition\__init__.py", line 266, in run
File "videotrans\recognition\_base.py", line 142, in run
File "videotrans\recognition\_overall.py", line 63, in _exec
File "videotrans\recognition\_overall.py", line 136, in _faster
File "videotrans\configure\_base.py", line 290, in _new_process
RuntimeError: Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 257, in faster_whisper
File "faster_whisper\transcribe.py", line 586, in _batched_segments_generator
File "faster_whisper\transcribe.py", line 120, in forward
File "faster_whisper\transcribe.py", line 209, in generate_segment_batched
File "faster_whisper\transcribe.py", line 1400, in encode
RuntimeError: CUDA failed with error out of memory
TaskCfgSTT(is_cuda=True, uuid='bb6af76f14', cache_folder='E:/ytrans/tmp/7320/bb6af76f14', target_dir='E:/ytrans/output/recogn', source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='E:/ytrans/output/recogn/1.srt', target_wav=None, target_wav_output=None, name='E:/PH/1.m4a', noextname='1', basename='1.m4a', ext='m4a', dirname='E:/PH', shound_del_name=None, detect_language='ko', recogn_type=0, model_name='large-v3-turbo', shibie_audio='E:/ytrans/tmp/7320/bb6af76f14/1-1772206555.833591.wav', remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False)


system:Windows-10-10.0.26100-SP0
version:v3.97
frozen:True
language:en
root_dir:E:/ytrans

1 Reply

This error is caused by insufficient GPU memory (VRAM) when running the faster-whisper model with CUDA acceleration.

To resolve this issue, follow these steps:

  1. Reduce the model size or precision: In the software's settings for the faster-whisper recognition channel, switch to a smaller model (e.g., from large-v3-turbo to medium or small) or enable int8 quantization to lower memory usage.
  2. Adjust processing parameters: In "Advanced Options" -> "Speech Recognition Parameters", reduce the beam_size value (e.g., to 1 or 2). A smaller beam narrows the search space during decoding, which lowers memory consumption.
  3. Disable CUDA acceleration temporarily: If the above steps do not work, you can disable CUDA acceleration for the recognition task by setting is_cuda=False in the task configuration or selecting a CPU-only mode in the interface. Note that this will significantly slow down processing.
  4. Close other GPU-intensive applications: Ensure no other programs (like games, browsers with many tabs, or other AI tools) are using your GPU's memory concurrently.
  5. Consider hardware limitations: If your NVIDIA GPU has less than 8GB of VRAM, using large models like large-v3-turbo may be inherently problematic. Sticking to smaller models (medium, small) is recommended for such hardware.
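The steps above can be sketched as a single decision: given how much VRAM is free, choose the model size, quantization (compute_type), device, and beam_size together. This is a minimal illustration, not code from pyvideotrans: the helper name `pick_stt_config` and the VRAM thresholds are assumptions chosen for the example, while the `WhisperModel` and `transcribe` parameters named in the comments are the real faster-whisper settings the steps refer to.

```python
def pick_stt_config(free_vram_mb: int) -> dict:
    """Map free GPU memory to conservative faster-whisper settings.

    Thresholds are illustrative; tune them for your own GPU.
    """
    if free_vram_mb >= 8000:
        # >= 8 GB: large models in float16 are usually safe
        return {"model": "large-v3-turbo", "device": "cuda",
                "compute_type": "float16", "beam_size": 5}
    if free_vram_mb >= 5000:
        # 5-8 GB: int8 quantization roughly halves model memory (step 1)
        return {"model": "large-v3-turbo", "device": "cuda",
                "compute_type": "int8_float16", "beam_size": 2}
    if free_vram_mb >= 3000:
        # 3-5 GB: drop to a smaller model and beam_size=1 (steps 1-2)
        return {"model": "medium", "device": "cuda",
                "compute_type": "int8", "beam_size": 1}
    # Under ~3 GB: fall back to CPU -- slow, but avoids CUDA OOM (step 3)
    return {"model": "small", "device": "cpu",
            "compute_type": "int8", "beam_size": 1}

# The resulting dict maps onto the faster-whisper API like so:
#   from faster_whisper import WhisperModel
#   cfg = pick_stt_config(free_vram_mb)
#   model = WhisperModel(cfg["model"], device=cfg["device"],
#                        compute_type=cfg["compute_type"])
#   segments, info = model.transcribe(audio_path, beam_size=cfg["beam_size"])
```

For the GPU in this report, running large-v3-turbo on Korean audio, the int8_float16 tier is the first thing to try before giving up CUDA entirely.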

