#3405 TaskCfgSTT(is_cuda=True, uuid='1480fd8f81', cache_folder='E:/ytrans/tmp/704/1480fd8f81', target_dir='E:/ytrans/output/re

202.150* Posted at: 18 hours ago 👁18

ASR Error [faster-whisper (Local)] Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 257, in faster_whisper
File "faster_whisper\transcribe.py", line 586, in _batched_segments_generator
File "faster_whisper\transcribe.py", line 120, in forward
File "faster_whisper\transcribe.py", line 209, in generate_segment_batched
File "faster_whisper\transcribe.py", line 1400, in encode
RuntimeError: CUDA failed with error out of memory

Traceback (most recent call last):
File "videotrans\task\job.py", line 106, in run
File "videotrans\task\_speech2text.py", line 153, in recogn
File "videotrans\recognition\__init__.py", line 266, in run
File "videotrans\recognition\_base.py", line 142, in run
File "videotrans\recognition\_overall.py", line 63, in _exec
File "videotrans\recognition\_overall.py", line 136, in _faster
File "videotrans\configure\_base.py", line 290, in _new_process
RuntimeError: Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 257, in faster_whisper
File "faster_whisper\transcribe.py", line 586, in _batched_segments_generator
File "faster_whisper\transcribe.py", line 120, in forward
File "faster_whisper\transcribe.py", line 209, in generate_segment_batched
File "faster_whisper\transcribe.py", line 1400, in encode
RuntimeError: CUDA failed with error out of memory
TaskCfgSTT(is_cuda=True, uuid='1480fd8f81', cache_folder='E:/ytrans/tmp/704/1480fd8f81', target_dir='E:/ytrans/output/recogn', source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='E:/ytrans/output/recogn/6.srt', target_wav=None, target_wav_output=None, name='E:/PH/6.m4a', noextname='6', basename='6.m4a', ext='m4a', dirname='E:/PH', shound_del_name=None, detect_language='ko', recogn_type=0, model_name='large-v3-turbo', shibie_audio='E:/ytrans/tmp/704/1480fd8f81/6-1772250480.2670305.wav', remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False)


=
system:Windows-10-10.0.26100-SP0
version:v3.97
frozen:True
language:en
root_dir:E:/ytrans

1 Reply

The error is caused by insufficient GPU memory: faster-whisper ran out of VRAM (CUDA out of memory) while encoding audio in its batched transcription path, as the traceback shows.

To resolve this issue, you can try the following steps:

  1. Reduce the model size: In the ASR settings, switch from large-v3-turbo to a smaller model, such as medium, small, or base. Do not switch to large-v2 or large-v3: both are larger than large-v3-turbo and need more VRAM.
  2. Enable model quantization: In the ASR settings, select an int8 compute type (for example int8 or int8_float16) if the option is available. int8 weights take roughly half the memory of float16 weights, which lowers VRAM usage.
  3. Adjust processing parameters: In the "Advanced Options" -> "Speech Recognition Parameters", reduce the beam_size value (e.g., from 5 to 2 or 1).
  4. Close other GPU-intensive applications: Ensure no other programs are using your NVIDIA GPU to free up VRAM.
  5. Check video/audio length: If the source file (6.m4a) is very long, consider splitting it into shorter segments before processing.
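For step 5, one way to split a long file outside the application is ffmpeg's segment muxer. The following is a minimal sketch, assuming ffmpeg is installed and on PATH; the helper names, the output folder, and the 600-second chunk length are illustrative choices, not settings taken from pyVideoTrans.

```python
import subprocess
from pathlib import Path


def build_split_command(source: str, out_dir: str, chunk_seconds: int = 600) -> list:
    """Build an ffmpeg command that cuts `source` into fixed-length chunks.

    Uses stream copy (-c copy), so the audio is not re-encoded.
    """
    src = Path(source)
    # Output pattern such as 6_000.m4a, 6_001.m4a, ... (hypothetical naming)
    pattern = str(Path(out_dir) / f"{src.stem}_%03d{src.suffix}")
    return [
        "ffmpeg",
        "-i", str(src),
        "-f", "segment",                      # ffmpeg segment muxer
        "-segment_time", str(chunk_seconds),  # target chunk length in seconds
        "-c", "copy",                         # copy the stream, no re-encode
        "-reset_timestamps", "1",             # each chunk starts at time 0
        pattern,
    ]


def split_audio(source: str, out_dir: str, chunk_seconds: int = 600) -> None:
    """Create the output folder and run ffmpeg; raises if ffmpeg fails."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_split_command(source, out_dir, chunk_seconds), check=True)


# Example call (not executed here):
# split_audio("E:/PH/6.m4a", "E:/PH/chunks", chunk_seconds=600)
```

Because each chunk restarts at time 0, subtitles produced per chunk would need their timestamps shifted by the chunk's start offset before being merged.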

If the above steps do not work, you may need to process the file using the CPU instead of GPU, though this will be much slower. You can do this by unchecking the "Use GPU" option in the ASR settings for faster-whisper.
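For anyone scripting faster-whisper directly rather than using the GUI, the same order of attempts (lighter settings first, CPU last) can be sketched as a fallback ladder. This is an illustration under stated assumptions, not how pyVideoTrans works internally: `AsrSettings`, `fallback_ladder`, and `run_with_fallback` are hypothetical names, and the first rung (float16, beam_size 5) is an assumed starting point since the report does not show the compute type in use. The faster-whisper calls used are `WhisperModel(..., device=..., compute_type=...)` and `model.transcribe(..., beam_size=..., language=...)`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsrSettings:
    model_name: str
    device: str        # "cuda" or "cpu"
    compute_type: str  # e.g. "float16", "int8_float16", "int8"
    beam_size: int


def fallback_ladder() -> list:
    """Settings ordered from heaviest to lightest VRAM use (assumed order)."""
    return [
        AsrSettings("large-v3-turbo", "cuda", "float16", 5),       # assumed starting point
        AsrSettings("large-v3-turbo", "cuda", "int8_float16", 5),  # quantized weights
        AsrSettings("large-v3-turbo", "cuda", "int8_float16", 1),  # smaller beam
        AsrSettings("medium", "cuda", "int8_float16", 1),          # smaller model
        AsrSettings("small", "cuda", "int8_float16", 1),
        AsrSettings("large-v3-turbo", "cpu", "int8", 5),           # last resort: CPU
    ]


def is_out_of_memory(exc: Exception) -> bool:
    """Match messages like 'CUDA failed with error out of memory'."""
    return isinstance(exc, RuntimeError) and "out of memory" in str(exc).lower()


def run_with_fallback(transcribe, ladder):
    """Call transcribe(settings) for each rung until one does not run out of memory."""
    last_error = None
    for settings in ladder:
        try:
            return settings, transcribe(settings)
        except RuntimeError as exc:
            if not is_out_of_memory(exc):
                raise  # unrelated errors are not retried
            last_error = exc
    raise RuntimeError("every configuration ran out of memory") from last_error


def transcribe_file(audio_path: str, language: str = "ko"):
    """Return a transcribe(settings) callable backed by faster-whisper."""
    def _run(settings: AsrSettings):
        from faster_whisper import WhisperModel  # needs: pip install faster-whisper
        model = WhisperModel(
            settings.model_name,
            device=settings.device,
            compute_type=settings.compute_type,
        )
        segments, _info = model.transcribe(
            audio_path, beam_size=settings.beam_size, language=language
        )
        # segments is a lazy generator; consume it here so an out-of-memory
        # error is raised inside this call and can trigger the next rung.
        return [(s.start, s.end, s.text) for s in segments]
    return _run


# Example call (not executed here):
# used, result = run_with_fallback(transcribe_file("E:/PH/6.m4a"), fallback_ladder())
```

The ladder keeps the original model for as long as possible and only trades accuracy (smaller model) or speed (CPU) once the cheaper changes have failed.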

