#2887 TaskCfg(cache_folder='E:/ASR/VideoTrans3.95/tmp/7032/04365bbdd2', target_dir='E:/ASR/VideoTrans3.95/output/recogn', remo

61.144* Posted at: 1 day ago 👁12

Speech recognition stage failed [Alibaba FunASR (local)] Internal program error: 'float' object cannot be interpreted as an integer
Traceback (most recent call last):
File "videotrans\task\job.py", line 106, in run
File "videotrans\task\_speech2text.py", line 156, in recogn
File "videotrans\recognition\__init__.py", line 252, in run
File "videotrans\recognition\_base.py", line 140, in run
File "videotrans\recognition\_funasr.py", line 54, in _exec
File "videotrans\recognition\_base.py", line 308, in cut_audio
File "videotrans\recognition\_base.py", line 118, in _vad_split
File "videotrans\configure\_base.py", line 265, in _new_process
File "videotrans\process\signelobj.py", line 73, in submit_task_cpu
File "videotrans\process\signelobj.py", line 56, in get_executor_cpu
File "concurrent\futures\process.py", line 650, in __init__
File "concurrent\futures\process.py", line 165, in __init__
File "multiprocessing\queues.py", line 49, in __init__
File "multiprocessing\context.py", line 88, in BoundedSemaphore
File "multiprocessing\synchronize.py", line 145, in __init__
File "multiprocessing\synchronize.py", line 57, in __init__
TypeError: 'float' object cannot be interpreted as an integer
TaskCfg(cache_folder='E:/ASR/VideoTrans3.95/tmp/7032/04365bbdd2', target_dir='E:/ASR/VideoTrans3.95/output/recogn', remove_noise=False, is_separate=False, detect_language='zh-cn', subtitle_language=None, source_language=None, target_language=None, source_language_code=None, target_language_code=None, source_sub=None, target_sub='E:/ASR/VideoTrans3.95/output/recogn/45买点真相1029.srt', source_wav=None, source_wav_output=None, target_wav=None, target_wav_output=None, subtitles='', novoice_mp4=None, noextname='45买点真相1029', shibie_audio='E:/ASR/VideoTrans3.95/tmp/7032/04365bbdd2/45买点真相1029-1769397426.1303196.wav', targetdir_mp4=None, instrument=None, vocal=None, back_audio=None, background_music=None, app_mode='biaozhun', subtitle_type=0, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role=None, copysrt_rawvideo=False, clear_cache=False, translate_type=None, tts_type=None, recogn_type=2, model_name='Fun-ASR-Nano-2512', voice_autorate=False, video_autorate=False, cuda=False, name='E:/ASR/摩缠字幕/45买点真相1029.mp4', basename='45买点真相1029.mp4', ext='mp4', dirname='E:/ASR/摩缠字幕', shound_del_name=None, uuid='04365bbdd2', remove_silent_mid=False, align_sub_audio=True, enable_diariz=False, nums_diariz=0, rephrase=0, only_out_mp4=False, fix_punc=True, recogn2pass=False)


=
system:Windows-10-10.0.17763-SP0
version:v3.95
frozen:True
language:zh
root_dir:E:/ASR/VideoTrans3.95

1 Replies

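This happens because, when FunASR speech recognition splits the audio, the VAD (voice activity detection) parameter max_seconds was set to a floating-point value, while the program expects an integer.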
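To make the error class concrete, here is a minimal sketch of how a float triggers this TypeError wherever a true integer is required. The traceback in the question ends in ProcessPoolExecutor and multiprocessing's BoundedSemaphore, so whichever setting carries the float, the failure surfaces where a size or count must be an integer. The helper names `accepts_as_integer` and `pool_rejects` are illustrative only and are not part of the application.

```python
import operator
from concurrent.futures import ProcessPoolExecutor


def accepts_as_integer(value):
    """Return True if Python will accept value where a true integer is required."""
    try:
        operator.index(value)  # floats fail here, even whole ones like 2.0
    except TypeError:
        return False
    return True


def pool_rejects(max_workers):
    """Return True if building a process pool with this worker count raises TypeError."""
    try:
        pool = ProcessPoolExecutor(max_workers=max_workers)
    except TypeError:
        return True
    pool.shutdown()
    return False


if __name__ == "__main__":
    for candidate in (2, 2.0):
        print(repr(candidate), "usable as integer:", accepts_as_integer(candidate))
    # A float worker count follows the same path as the traceback in the question.
    print("pool rejects 2.0:", pool_rejects(2.0))
    print("pool rejects int(2.0):", pool_rejects(int(2.0)))
```

Casting with `int(...)` before the value reaches the pool or the semaphore is the usual remedy, provided the value is meant to be a whole number.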

Solution

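  1. Check and correct FunASR's VAD parameter max_seconds. It should be an integer (for example 600), not a float (such as 600.0).
  2. Where to change it: in the application, open the "Speech recognition parameters" or "Advanced options" settings, find the VAD options related to FunASR, and make sure the value of max_seconds is an integer.
  3. If the setting cannot be found in the interface, try looking for max_seconds in the configure directory under the software root, or in a related configuration file, and edit it by hand.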
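The manual check in step 3 can be sketched as a small script. This is only an illustration under stated assumptions: it assumes the settings are stored as JSON, and apart from max_seconds the key names (`worker_count`, `threshold`) are hypothetical, since the thread does not name the configuration file or its other options.

```python
import json

# Keys expected to hold whole numbers. "max_seconds" comes from the reply above;
# "worker_count" is a hypothetical example, not a known option name.
INTEGER_KEYS = {"max_seconds", "worker_count"}


def find_float_settings(config, integer_keys=INTEGER_KEYS):
    """Return the settings that should be integers but are stored as floats."""
    return {
        key: value
        for key, value in config.items()
        if key in integer_keys and isinstance(value, float)
    }


def coerce_integer_settings(config, integer_keys=INTEGER_KEYS):
    """Return a copy of config with whole-number floats converted to int."""
    fixed = dict(config)
    for key, value in find_float_settings(config, integer_keys).items():
        if not value.is_integer():
            # Refuse to silently truncate something like 2.5.
            raise ValueError(f"{key} must be a whole number, got {value!r}")
        fixed[key] = int(value)
    return fixed


# Sample data standing in for the real configuration file contents.
sample = json.loads('{"max_seconds": 600.0, "worker_count": 2.0, "threshold": 0.5}')
fixed = coerce_integer_settings(sample)

if __name__ == "__main__":
    print("floats found:", find_float_settings(sample))
    print("after fix:", json.dumps(fixed))
```

Settings that are legitimately fractional, such as the hypothetical `threshold`, are left untouched because they are not listed in `INTEGER_KEYS`.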
