#5369 cfg=TaskCfgSTT(uuid='32be0d0a93', name='D:/oo/fenhongjun/840860667b45.mp4', dirname='D:/oo/fenhongjun', noextname='84086

64.118* Posted at: 1 day ago

Error during the speech recognition stage [Alibaba FunASR (local) (local built-in)]: index out of range while processing a list or sequence: list index out of range
Traceback (most recent call last):
  File "videotrans\task\job.py", line 35, in run
  File "videotrans\task\job.py", line 100, in process_task
  File "videotrans\task\speech2text.py", line 97, in recogn
  File "videotrans\recognition\__init__.py", line 191, in run
  File "videotrans\recognition\_base.py", line 90, in run
  File "videotrans\recognition\_funasr.py", line 46, in _exec
  File "videotrans\recognition\_base.py", line 214, in cut_audio
IndexError: list index out of range
cfg=TaskCfgSTT(uuid='32be0d0a93', name='D:/oo/fenhongjun/840860667b45.mp4', dirname='D:/oo/fenhongjun', noextname='840860667b45', basename='840860667b45.mp4', ext='mp4', target_dir='D:/Tool/win-pyvideotrans-v4.03-0622/output/recogn', cache_folder='D:/Tool/win-pyvideotrans-v4.03-0622/tmp/12368/32be0d0a93', is_cuda=True, source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='D:/Tool/win-pyvideotrans-v4.03-0622/output/recogn/840860667b45.srt', target_wav=None, target_wav_output=None, detect_language='zh-cn', recogn_type=3, model_name='Fun-ASR-Nano-2512', shibie_audio='D:/Tool/win-pyvideotrans-v4.03-0622/tmp/12368/840860667b45-134309094-removed_noise.wav', remove_noise=True, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=0)


system:Windows-10-10.0.26200-SP0
version:v4.03
frozen:True
language:zh
root_dir:D:/Tool/win-pyvideotrans-v4.03-0622

1 Reply

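IndexError: list index out of range means that no valid list of speech segments was produced when the recognizer cut the audio into segments (the traceback ends in cut_audio, videotrans\recognition\_base.py, line 214). This usually happens because the audio is too short, contains too much silence, or was damaged by noise reduction.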
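The failure mode can be illustrated with a minimal sketch. This is not the code in videotrans\recognition\_base.py, which the report does not show; the function names and the (start_ms, end_ms) segment representation are hypothetical and serve only to show why an empty segment list produces this exact error, and how a guard would turn it into a clearer message.

```python
from typing import List, Tuple

# Hypothetical representation of a detected speech segment: (start_ms, end_ms)
Segment = Tuple[int, int]


def first_segment_unsafe(segments: List[Segment]) -> Segment:
    # Indexing an empty list raises "IndexError: list index out of range"
    return segments[0]


def first_segment_checked(segments: List[Segment]) -> Segment:
    # Fail early with a message that names the likely causes
    if not segments:
        raise ValueError(
            "no speech segments detected: the audio may be too short, "
            "mostly silent, or damaged by noise reduction"
        )
    return segments[0]
```

If voice activity detection returns nothing, the unchecked version fails with the same IndexError seen in the traceback, while the checked version reports what to look at.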

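Solutions:

  1. Check that the audio is valid: open D:/Tool/win-pyvideotrans-v4.03-0622/tmp/12368/840860667b45-134309094-removed_noise.wav in a media player and confirm that it contains clear speech and is long enough (more than 3 seconds is recommended).
  2. Turn off noise reduction: under "Set more parameters" in the main window, uncheck "Noise reduction" and try again.
  3. Adjust the speech recognition parameters: go to Menu -> Tools -> Advanced Options -> Speech Recognition Parameters
    • Lower "Maximum speech duration (seconds)" to 6
    • Lower "Minimum speech duration (milliseconds)" to 500
    • Check "Merge overly short subtitles into adjacent ones"
  4. Switch the recognition channel: try faster-whisper (local) with the large-v3 model and see whether it works.
  5. Re-download the model: delete the models/models--iic--funasr-* directory and restart the software so that the model is downloaded again.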
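Step 1 can also be checked from a script instead of a media player. The sketch below uses only the Python standard library and assumes the denoised file is 16-bit PCM WAV, which the report does not confirm. The function name inspect_wav is made up for illustration. It reports the duration and the RMS level; an RMS close to zero suggests the file is essentially silent. The reply gives no loudness threshold, so none is hard-coded here.

```python
import math
import struct
import wave


def inspect_wav(path: str, chunk_frames: int = 65536) -> dict:
    """Return the duration in seconds and the RMS level of a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        rate = wf.getframerate()
        n_frames = wf.getnframes()
        total_sq = 0
        count = 0
        # Read in chunks so long recordings do not have to fit in memory
        while True:
            raw = wf.readframes(chunk_frames)
            if not raw:
                break
            samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
            total_sq += sum(s * s for s in samples)
            count += len(samples)
    rms = math.sqrt(total_sq / count) if count else 0.0
    duration = n_frames / rate if rate else 0.0
    return {"duration_s": duration, "rms": rms}
```

Running inspect_wav on the removed_noise.wav path from the report and comparing duration_s against the suggested 3 seconds gives a quick first answer; a very low rms on the denoised file would point toward step 2 (turning noise reduction off).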

If it still fails, please provide the most recent log file from the logs/ directory (roughly the last 30 lines).
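Collecting those lines can be scripted. The sketch below is only a convenience, under two assumptions not confirmed by the thread: that the newest log is the file with the latest modification time, and that logs are UTF-8 text (undecodable bytes are replaced rather than raising). The function name tail_latest_log is hypothetical.

```python
from collections import deque
from pathlib import Path
from typing import List


def tail_latest_log(log_dir: str, n: int = 30) -> List[str]:
    """Return the last n lines of the most recently modified file in log_dir."""
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    if not files:
        return []
    latest = max(files, key=lambda p: p.stat().st_mtime)
    # A bounded deque keeps only the final n lines while streaming the file
    with latest.open("r", encoding="utf-8", errors="replace") as fh:
        return list(deque(fh, maxlen=n))
```

For the installation in this report, log_dir would presumably be D:/Tool/win-pyvideotrans-v4.03-0622/logs, and the returned lines can be pasted into a reply.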
