#5482 cfg=TaskCfgSTT(uuid='68ac55afdf', name='D:/张建光/姜胡说/2021年/@姜胡说_20211024_这是我理解的_蕞简单的投资_姜胡说_打造用的....mp4', dirname='D:/张建光/姜

106.35* Posted at: 1 day ago

Error in speech recognition stage [Ali FunASR (local) (local built-in)] 'sentence_info':Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 493, in paraformer
KeyError: 'sentence_info'

Traceback (most recent call last):
File "videotrans\task\job.py", line 35, in run
File "videotrans\task\job.py", line 100, in process_task
File "videotrans\task\speech2text.py", line 97, in recogn
File "videotrans\recognition\__init__.py", line 191, in run
File "videotrans\recognition\_base.py", line 90, in run
File "videotrans\recognition\_funasr.py", line 61, in _exec
File "videotrans\configure\base.py", line 272, in _new_process
videotrans.configure.excepts.VideoTransError: 'sentence_info':Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 493, in paraformer
KeyError: 'sentence_info'
cfg=TaskCfgSTT(uuid='68ac55afdf', name='D:/张建光/姜胡说/2021年/@姜胡说_20211024_这是我理解的_蕞简单的投资_姜胡说_打造用的....mp4', dirname='D:/张建光/姜胡说/2021年', noextname='@姜胡说_20211024_这是我理解的_蕞简单的投资_姜胡说_打造用的...', basename='@姜胡说_20211024_这是我理解的_蕞简单的投资_姜胡说_打造用的....mp4', ext='mp4', target_dir='D:/download/pyVideoTrans/output/recogn', cache_folder='D:/download/pyVideoTrans/tmp/7908/68ac55afdf', is_cuda=True, source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='D:/download/pyVideoTrans/output/recogn/@姜胡说_20211024_这是我理解的_蕞简单的投资_姜胡说_打造用的....srt', target_wav=None, target_wav_output=None, detect_language='zh-cn', recogn_type=3, model_name='paraformer-zh', shibie_audio='D:/download/pyVideoTrans/tmp/7908/68ac55afdf/@姜胡说_20211024_这是我理解的_蕞简单的投资_姜胡说_打造用的...-1782436576.0386875.wav', remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=0)


=
system:Windows-10-10.0.26200-SP0
version:v4.03
frozen:True
language:zh
root_dir:D:/download/pyVideoTrans

1 Replies

Error in speech recognition stage [Ali FunASR (local built-in)] KeyError: 'sentence_info'

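Cause: While processing this audio, the local FunASR model returned a data structure that lacks the sentence_info field. This usually happens because no valid speech segment was detected in the audio, the audio format is incompatible, or the model misbehaved.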
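To make the cause concrete: a KeyError like this appears when code reads result['sentence_info'] unconditionally and the key is absent. The sketch below is not the code in videotrans\process\stt_fun.py, which is not shown in this thread. It assumes the recognizer returns a list of dicts in which sentence_info is an optional list of entries with start, end, and text; the function name extract_segments and that result shape are assumptions for illustration only.

```python
from typing import Any


def extract_segments(result: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collect subtitle-like segments, tolerating a missing 'sentence_info' key.

    The input shape (list of dicts with optional 'sentence_info' and 'text')
    is an assumption for this sketch, not a documented contract.
    """
    segments: list[dict[str, Any]] = []
    for item in result:
        # .get() returns None instead of raising KeyError when the key is absent.
        sentences = item.get("sentence_info")
        if sentences:
            for s in sentences:
                segments.append(
                    {"start": s["start"], "end": s["end"], "text": s["text"]}
                )
        elif item.get("text", "").strip():
            # Fallback: keep the plain text, but without timing information.
            segments.append(
                {"start": None, "end": None, "text": item["text"].strip()}
            )
        # Items with neither sentences nor text are skipped (e.g. no speech found).
    return segments
```

With this kind of guard, an audio file with no detected speech would yield an empty list rather than a crash, and the caller could then report "no speech detected" explicitly.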

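Solutions:

  1. Check the audio quality: confirm that the video contains clear, continuous speech. If the audio/video file contains only background music or silence, try a different test file.
  2. Switch the ASR channel: temporarily switch to faster-whisper (local built-in) with the large-v3 model, or to Qwen-ASR (local built-in), to avoid this problem.
  3. Update the software: the current version is v4.03 and the latest is v4.03-0622, which may already fix this bug. Download the full package and install it over the existing one.
  4. Re-download the FunASR model: delete the models/models--iic--speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn folder (or one with a similar name), then run the software again so it downloads the model automatically.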
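For the first step, a rough way to rule out a silent file is to measure the loudness of the extracted WAV (the shibie_audio path in the log). The sketch below uses only the Python standard library and handles 16-bit PCM only. The function names and the 0.01 threshold are assumptions chosen for illustration. A low value suggests silence; a high value does not prove speech is present, since background music is loud too.

```python
import array
import math
import sys
import wave


def wav_rms(path: str) -> float:
    """Return the RMS amplitude of a 16-bit PCM WAV file, scaled to 0..1."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return 0.0
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / 32768.0


def looks_silent(path: str, threshold: float = 0.01) -> bool:
    """Heuristic only: True when the overall RMS is below the threshold."""
    return wav_rms(path) < threshold
```

If looks_silent returns True for the extracted audio, the missing sentence_info is most likely explained by the file itself rather than by the model.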

