#4443 TaskCfgSTT(is_cuda=True, uuid='37366495ed', cache_folder='E:/win-pyvideotrans-v3.99-428/tmp/80556/37366495ed', target_di

129.128* Posted at: 2 hours ago 👁8

Error during the preprocessing stage (预处理阶段出错): opening output file E:/win-pyvideotrans-v3.99-428/tmp/80556/37366495ed/5-1778044749.0274868.wav. opening output files: Invalid argument
Traceback (most recent call last):
File "videotrans\util\help_ffmpeg.py", line 57, in runffmpeg
File "subprocess.py", line 526, in run
subprocess.CalledProcessError: Command '['ffmpeg', '-hide_banner', '-nostdin', '-ignore_unknown', '-threads', '0', '-y', '-i', 'F:/Zge/5.mp4', '-ac', '1', '-ar', '16000', '-c:a', 'pcm_s16le', '-af', 'volume=2.0,alimiter=limit=1.0', 'E:/win-pyvideotrans-v3.99-428/tmp/80556/37366495ed/5-1778044749.0274868.wav']' returned non-zero exit status 4294967274.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "videotrans\task\job.py", line 61, in run
File "videotrans\task\_speech2text.py", line 54, in prepare
File "videotrans\util\help_ffmpeg.py", line 467, in conver_to_16k
File "videotrans\util\help_ffmpeg.py", line 86, in runffmpeg
RuntimeError: opening output file E:/win-pyvideotrans-v3.99-428/tmp/80556/37366495ed/5-1778044749.0274868.wav. opening output files: Invalid argument
TaskCfgSTT(is_cuda=True, uuid='37366495ed', cache_folder='E:/win-pyvideotrans-v3.99-428/tmp/80556/37366495ed', target_dir='E:/output/recogn', source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='E:/output/recogn/5.srt', target_wav=None, target_wav_output=None, name='F:/Zge/5.mp4', noextname='5', basename='5.mp4', ext='mp4', dirname='F:/Zge', shound_del_name=None, detect_language='zh-cn', recogn_type=2, model_name='1.7B', shibie_audio='E:/win-pyvideotrans-v3.99-428/tmp/80556/37366495ed/5-1778044749.0274868.wav', remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False)


=
system:Windows-10-10.0.26200-SP0
version:v3.99
frozen:True
language:zh
root_dir:E:/win-pyvideotrans-v3.99-428

3 Replies

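ffmpeg failed to open the output file. This is usually caused by a path that is too long, special characters in the path, a target folder that does not exist, or insufficient permissions. During preprocessing, while the video was being converted to WAV, the target file 5-1778044749.0274868.wav could not be created.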
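
The exit status 4294967274 in the traceback is a negative 32-bit return code shown as an unsigned number. A minimal sketch of the conversion follows. The helper name `to_signed32` is made up for illustration, and reading the result as an errno value assumes ffmpeg returned a negated errno, which is consistent with the "Invalid argument" text in the message.

```python
import errno

def to_signed32(status: int) -> int:
    """Reinterpret an unsigned 32-bit exit status as a signed integer."""
    status &= 0xFFFFFFFF
    return status - (1 << 32) if status >= (1 << 31) else status

code = to_signed32(4294967274)
print(code)                         # -22
print(errno.errorcode.get(-code))   # EINVAL (assumes a negated errno value)
```

So the number itself carries no extra information beyond "Invalid argument"; the cause still has to be found in the output path or its folder.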

Solutions

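  1. Make sure the input video path is short and free of special characters: for example, move F:/Zge/5.mp4 to D:/video/5.mp4.
  2. Keep the output directory simple as well. The software should be installed in a path without Chinese characters or spaces (your current install at E:/win-pyvideotrans-v3.99-428 already meets this), but the temporary output path may still be nested too deeply. Try moving the whole software directory to a shallower location such as E:/pyvideotrans.
  3. Make sure the E:/output/recogn directory exists and is writable, and that the temporary folder can be created (you can create E:/win-pyvideotrans-v3.99-428/tmp by hand to test).
  4. Turn off antivirus software, or add this software's directory to its allowlist, so that file creation is not blocked.
  5. Run sp.exe as administrator and try again.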
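
The path-related checks above can also be scripted. The following is a minimal sketch, not part of pyVideoTrans: the function name `check_output_path` and the 240-character threshold are assumptions (the threshold is chosen to stay under the classic 260-character Windows path limit), and the write test simply creates and deletes a scratch file in the target folder.

```python
import os
import tempfile

def check_output_path(path, max_len=240):
    """Return a list of human-readable problems found for an output file path."""
    problems = []
    folder = os.path.dirname(path) or "."
    if len(os.path.abspath(path)) > max_len:
        problems.append("path is longer than %d characters" % max_len)
    if not os.path.isdir(folder):
        problems.append("folder does not exist: " + folder)
    else:
        try:
            # Creating and deleting a scratch file is a direct write test.
            with tempfile.NamedTemporaryFile(dir=folder):
                pass
        except OSError as exc:
            problems.append("cannot write to folder: %s" % exc)
    # Characters that Windows forbids in file names.
    bad = set('<>:"|?*') & set(os.path.basename(path))
    if bad:
        problems.append("file name contains reserved characters: " + "".join(sorted(bad)))
    return problems

# Path copied from the error message in this thread.
wav = "E:/win-pyvideotrans-v3.99-428/tmp/80556/37366495ed/5-1778044749.0274868.wav"
for problem in check_output_path(wav):
    print(problem)
```

An empty result only rules out these specific causes; antivirus interception or a file system that rejects the file for another reason would not be detected this way.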

If the error persists, please provide the last 30 lines of the newest log file in the logs/ directory.


[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1952,prev_diff=64,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1952,prev_diff=64,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1440,prev_diff=0,next_diff=736
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1440,prev_diff=0,next_diff=736
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1920,prev_diff=736,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1920,prev_diff=736,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1152,prev_diff=512,next_diff=576
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1152,prev_diff=512,next_diff=576
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1152,prev_diff=576,next_diff=9216
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1152,prev_diff=576,next_diff=9216
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1472,prev_diff=9216,next_diff=64
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1472,prev_diff=9216,next_diff=64
[WAR
......

[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1984,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1984,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=544,prev_diff=0,next_diff=352
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=544,prev_diff=0,next_diff=352
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1952,prev_diff=352,next_diff=2816
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1952,prev_diff=352,next_diff=2816
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1440,prev_diff=2816,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1440,prev_diff=2816,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=960,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=960,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1264,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1264,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=928,prev_diff=0,next_diff=352
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=928,prev_diff=0,next_diff=352
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1216,prev_diff=352,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1216,prev_diff=352,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=800,prev_diff=0,next_diff=1024
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=800,prev_diff=0,next_diff=1024
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=736,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=736,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1808,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1808,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=912,prev_diff=0,next_diff=3296
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=912,prev_diff=0,next_diff=3296
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=624,prev_diff=3296,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=624,prev_diff=3296,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1136,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1136,prev_diff=0,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1904,prev_diff=0,next_diff=1792
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1904,prev_diff=0,next_diff=1792
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1024,prev_diff=1792,next_diff=2976
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=1024,prev_diff=1792,next_diff=2976
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1328,prev_diff=2976,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1328,prev_diff=2976,next_diff=0
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=992,prev_diff=32,next_diff=None
[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=992,prev_diff=32,next_diff=None

[DEBUG] GPU进程池:max_workers=1

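This is normal internal preprocessing log output. The software detected audio segments shorter than 2000 ms and is automatically either extending the end time of the previous subtitle (需要前面字幕延长结束时间) or shifting the start time of the next subtitle to the left (需要下个字幕左移开始时间), so that overly short segments do not affect later steps.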
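
To gauge how many of these warnings a run produced, a small parser can tally them. This is a minimal sketch, assuming the log lines keep the `diff=...,prev_diff=...,next_diff=...` layout shown above. The function name `summarize_cut_audio` and the two category labels are made up for illustration and are not part of pyVideoTrans.

```python
import re
from collections import Counter

# Matches the numeric tail of a cut_audio warning; prev/next may be "None".
# Both ASCII and full-width commas are accepted as separators.
PATTERN = re.compile(
    r"cut_audio.*?diff=(\d+)[,,]prev_diff=(\d+|None)[,,]next_diff=(\d+|None)"
)

def summarize_cut_audio(lines):
    """Count cut_audio warnings by requested adjustment; track the shortest diff."""
    counts = Counter()
    shortest = None
    for line in lines:
        match = PATTERN.search(line)
        if not match:
            continue
        # The two Chinese phrases below are copied from the log itself.
        if "左移开始时间" in line:
            counts["shift_next_start"] += 1
        elif "延长结束时间" in line:
            counts["extend_prev_end"] += 1
        diff = int(match.group(1))
        shortest = diff if shortest is None else min(shortest, diff)
    return counts, shortest

sample = [
    "[WARNING] cut_audio 时长小于 2000ms 需要下个字幕左移开始时间,diff=1952,prev_diff=64,next_diff=0",
    "[WARNING] cut_audio 时长小于 2000ms 需要前面字幕延长结束时间,diff=992,prev_diff=32,next_diff=None",
    "[DEBUG] GPU进程池:max_workers=1",
]
counts, shortest = summarize_cut_audio(sample)
print(dict(counts), shortest)
```

In the log posted above every warning appears twice, so the raw counts would need to be halved to get the number of affected segments.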

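If there are too many of these warnings (which usually means the speech in your video is split into very short fragments), you can tune the VAD and segmentation parameters to reduce the number of short segments: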

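  1. In the menu Tools/Options -> Advanced Options -> Speech Recognition Parameters (工具/选项 -> 高级选项 -> 语音识别参数):

    • Raise Minimum speech duration in ms (最短语音持续毫秒) moderately, for example to 3000, to force short speech segments to merge.
    • Check Merge overly short subtitles into neighbors (合并过短字幕到邻近).
    • Check Whisper pre-split audio (Whisper预分割音频), especially if dubbing is needed.
  2. If you later need dubbing or voice cloning, the reference audio should be 3 to 10 s long, so you can also set:

    • Maximum speech duration in seconds (最长语音持续秒数) to 10.
    • Minimum speech duration in ms (最短语音持续毫秒) to 3000.
    • Also check Whisper pre-split audio (Whisper预分割音频) and Merge overly short subtitles into neighbors (合并过短字幕到邻近).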
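
To illustrate what merging overly short segments into a neighbor means in practice, here is a minimal sketch. It is not pyVideoTrans's actual implementation: the `(start_ms, end_ms, text)` tuple layout, the function name `merge_short_segments`, and the rule of always merging into the preceding segment are assumptions chosen for illustration.

```python
def merge_short_segments(segments, min_ms=3000):
    """Merge segments shorter than min_ms into the preceding segment.

    segments: list of (start_ms, end_ms, text) tuples sorted by start time.
    A short first segment is kept as-is because it has no predecessor.
    """
    merged = []
    for start, end, text in segments:
        if merged and (end - start) < min_ms:
            prev_start, _, prev_text = merged[-1]
            # Extend the previous segment to cover the short one.
            merged[-1] = (prev_start, end, prev_text + " " + text)
        else:
            merged.append((start, end, text))
    return merged

segments = [(0, 4000, "a"), (4100, 5000, "b"), (6000, 10000, "c")]
print(merge_short_segments(segments))
# [(0, 5000, 'a b'), (6000, 10000, 'c')]
```

Raising the minimum duration makes more segments fall under the threshold and get absorbed, which is why a higher value yields fewer, longer subtitles.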

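The software's default "extend/shift-left between subtitles" logic only adjusts timings on the fly during processing and does not affect the final result; it is working normally. If you still find the segmentation too fragmented, adjust the VAD thresholds and duration limits as described above.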
