#4220 TaskCfgVTT(is_cuda=False, uuid='fbc11c931a', cache_folder='E:/win-pyvideotrans-v3.99-420/tmp/19252/fbc11c931a', target_d

117.189* Posted at: 2 hours ago

Error during the speech recognition stage [OpenAI Speech Recognition API] error parsing multipart form: multipart: NextPart: bufio: buffer full (request id: 20260423101916765528875dopO2ugr)
Traceback (most recent call last):
File "videotrans\task\job.py", line 105, in run
File "videotrans\task\trans_create.py", line 380, in recogn
File "videotrans\recognition\__init__.py", line 250, in run
File "videotrans\recognition\_base.py", line 143, in run
File "videotrans\recognition\_openairecognapi.py", line 37, in _exec
File "videotrans\recognition\_openairecognapi.py", line 91, in _thrid_api
File "openai\_utils\_utils.py", line 286, in wrapper
File "openai\resources\audio\transcriptions.py", line 483, in create
File "openai\_base_client.py", line 1297, in post
File "openai\_base_client.py", line 1070, in request
openai.InternalServerError: Error code: 500 - {'error': {'message': 'error parsing multipart form: multipart: NextPart: bufio: buffer full (request id: 20260423101916765528875dopO2ugr)', 'type': 'new_api_error', 'param': '', 'code': 'convert_request_failed'}}
TaskCfgVTT(is_cuda=False, uuid='fbc11c931a', cache_folder='E:/win-pyvideotrans-v3.99-420/tmp/19252/fbc11c931a', target_dir='E:/screen shot/output/vocal-wav', source_language='日语', source_language_code='ja', source_sub='E:/screen shot/output/vocal-wav/ja.srt', source_wav='E:/win-pyvideotrans-v3.99-420/tmp/19252/fbc11c931a/ja.wav', source_wav_output='E:/screen shot/output/vocal-wav/ja.m4a', target_language='简体中文', target_language_code='zh-cn', target_sub='E:/screen shot/output/vocal-wav/zh-cn.srt', target_wav='E:/win-pyvideotrans-v3.99-420/tmp/19252/fbc11c931a/target.wav', target_wav_output='E:/screen shot/output/vocal-wav/zh-cn.m4a', name='E:/screen shot/output/02_リラクゼーション_24-wav/vocal.wav', noextname='vocal', basename='vocal.wav', ext='wav', dirname='E:/screen shot/output/02_リラクゼーション_24-wav', shound_del_name=None, translate_type=3, tts_type=1, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='No', voice_autorate=False, video_autorate=False, remove_silent_mid=False, align_sub_audio=True, detect_language='ja', recogn_type=5, model_name='large-v3-turbo', shibie_audio=None, remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=1, fix_punc=False, subtitle_language=None, app_mode='tiqu', subtitles='', targetdir_mp4='E:/screen shot/output/vocal-wav/vocal.mp4', novoice_mp4='E:/win-pyvideotrans-v3.99-420/tmp/19252/fbc11c931a/novoice.mp4', is_separate=False, embed_bgm=False, instrument=None, vocal='E:/win-pyvideotrans-v3.99-420/tmp/19252/fbc11c931a/vocal.wav', back_audio='', clear_cache=False, background_music=None, subtitle_type=0, only_out_mp4=False, recogn2pass=True, output_srt=2, copysrt_rawvideo=True)


=
system:Windows-10-10.0.26200-SP0
version:v3.99
frozen:True
language:zh
root_dir:E:/win-pyvideotrans-v3.99-420

1 Replies

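This is an internal server error (HTTP 500): while handling the audio file, the OpenAI speech recognition API failed to parse the request data on the server side. The message multipart: NextPart: bufio: buffer full indicates that the server's buffer filled up while it was processing the uploaded audio file.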
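To check whether the upload is simply too big, the file size and duration can be read before sending. This is a minimal sketch, assuming the audio is an uncompressed PCM WAV; `describe_wav` and `ASSUMED_LIMIT_BYTES` are hypothetical names, not part of pyVideoTrans, and the 25 MB ceiling is an assumption that a relay or proxy in front of the API may not share.

```python
import os
import wave

# Assumption: 25 MB is used as the upload ceiling for illustration only.
# The service actually receiving the request may enforce a different limit.
ASSUMED_LIMIT_BYTES = 25 * 1024 * 1024


def describe_wav(path, limit_bytes=ASSUMED_LIMIT_BYTES):
    """Return the size, duration, and an over-limit flag for a PCM WAV file."""
    size = os.path.getsize(path)
    with wave.open(path, "rb") as wav:
        # Duration in seconds = number of frames / frames per second.
        duration = wav.getnframes() / wav.getframerate()
    return {"bytes": size, "seconds": duration, "too_large": size > limit_bytes}
```

Calling `describe_wav` on the cached ja.wav from the log would show whether the request body is likely to exceed the assumed limit.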

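Solutions:

  1. Check the audio file size: confirm whether the audio file being recognized (vocal.wav) is too large. If it is very large, try splitting or compressing it first with the software's built-in "Audio/Video Format Conversion" tool.
  2. Adjust the advanced settings: from the menu bar, go to Tools/Options -> Advanced Options and, in the Speech Recognition Parameters area, try enabling the "Whisper pre-split audio?" option. This cuts long audio into segments before sending, which may keep any single request from being too large.
  3. Switch the speech recognition channel: if the problem persists, temporarily switch to another channel, such as faster-whisper (local) or qwen-asr (local), which works well for Chinese.
  4. Contact the service provider: this error may also stem from a temporary problem on the OpenAI server side. You can retry later or check OpenAI's service status.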
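If splitting by hand is preferred over the built-in tools, the idea behind steps 1 and 2 can be sketched with the Python standard library. This is only an illustration, not how pyVideoTrans pre-splits audio: `split_wav` is a hypothetical helper, it assumes an uncompressed PCM WAV, and it cuts at fixed intervals, so a cut can land in the middle of a word.

```python
import os
import wave


def split_wav(path, out_dir, chunk_seconds=300):
    """Split a PCM WAV file into consecutive chunks of at most chunk_seconds."""
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(chunk_seconds * src.getframerate())
        if frames_per_chunk <= 0:
            raise ValueError("chunk_seconds is too small for this sample rate")
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of the source file
            chunk_path = os.path.join(out_dir, f"chunk_{index:04d}.wav")
            with wave.open(chunk_path, "wb") as dst:
                # Copy channels, sample width, and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths
```

Each chunk could then be uploaded in its own request, and the timestamps of chunk number i shifted by i * chunk_seconds when the subtitles are merged.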

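Note: the log shows recogn2pass=True (second-pass recognition), which runs speech recognition again after dubbing completes. If several tasks are processed at once, this can significantly increase VRAM/RAM usage. If you hit out-of-resource errors, try turning this option off.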

