#5283 TaskCfgVTT(uuid='2d58939cdd', name='E:/videoplayback.m4a', dirname='E:/', noextname='videoplayback', basename='videoplay

2a09:bac5* Posted at: 2 days ago

Batch size mismatch: audio=8, context=0:
Traceback (most recent call last):
  File "videotrans\process\stt_fun.py", line 559, in qwen3asr_fun
  File "torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "E:\A4\_internal\qwen_asr\inference\qwen3_asr.py", line 345, in transcribe
    raise ValueError(f"Batch size mismatch: audio={n}, context={len(ctxs)}")
ValueError: Batch size mismatch: audio=8, context=0
[Qwen-ASR (local built-in), Compatible AI/local model, Qwen3-TTS (local built-in)]
Traceback (most recent call last):
  File "videotrans\task\only_one.py", line 47, in run
  File "videotrans\task\trans_create.py", line 317, in recogn
  File "videotrans\recognition\__init__.py", line 190, in run
  File "videotrans\recognition\_base.py", line 94, in run
  File "videotrans\recognition\_qwenasrlocal.py", line 45, in _exec
  File "videotrans\configure\base.py", line 268, in _new_process
videotrans.configure.excepts.VideoTransError: Batch size mismatch: audio=8, context=0:
Traceback (most recent call last):
  File "videotrans\process\stt_fun.py", line 559, in qwen3asr_fun
  File "torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "E:\A4\_internal\qwen_asr\inference\qwen3_asr.py", line 345, in transcribe
    raise ValueError(f"Batch size mismatch: audio={n}, context={len(ctxs)}")
ValueError: Batch size mismatch: audio=8, context=0
TaskCfgVTT(uuid='2d58939cdd', name='E:/videoplayback.m4a', dirname='E:/', noextname='videoplayback', basename='videoplayback.m4a', ext='m4a', target_dir='E:/_video_out/videoplayback-m4a', cache_folder='E:/A4/tmp/24968/2d58939cdd', is_cuda=True, source_language='英语', source_language_code='en', source_sub='E:/_video_out/videoplayback-m4a/en.srt', source_wav='E:/A4/tmp/24968/2d58939cdd/en.wav', source_wav_output='E:/_video_out/videoplayback-m4a/en.m4a', target_language='简体中文', target_language_code='zh-cn', target_sub='E:/_video_out/videoplayback-m4a/zh-cn.srt', target_wav='E:/A4/tmp/24968/2d58939cdd/target.wav', target_wav_output='E:/_video_out/videoplayback-m4a/zh-cn.m4a', translate_type=8, tts_type=1, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='clone', voice_autorate=False, video_autorate=False, remove_silent_mid=False, align_sub_audio=True, detect_language='en', recogn_type=2, model_name='1.7B', shibie_audio=None, remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=0, subtitle_language=None, app_mode='biaozhun', subtitles='', targetdir_mp4='E:/_video_out/videoplayback-m4a/videoplayback.mp4', novoice_mp4='E:/A4/tmp/24968/2d58939cdd/novoice.mp4', is_separate=False, embed_bgm=True, instrument='E:/A4/tmp/24968/2d58939cdd/instrument.wav', vocal='E:/A4/tmp/24968/2d58939cdd/vocal.wav', clear_cache=False, background_music='', subtitle_type=0, only_out_mp4=False, recogn2pass=False, output_srt=0, copysrt_rawvideo=False, loop_backaudio=1, backaudio_volume=0.8)


=
system:Windows-10-10.0.26200-SP0
version:v4.02
frozen:True
language:zh
root_dir:E:/A4

1 Replies

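Qwen-ASR (local built-in) returned the error Batch size mismatch: audio=8, context=0. The recognition subprocess received a batch of 8 audio segments, but the context list was empty, so the two could not be paired one to one. This is usually caused by unsuitable audio pre-segmentation or VAD settings that produce segments which cannot be associated with a valid recognition context, or by the model having cached an incomplete state when it was loaded.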
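To make the mismatch concrete, here is a minimal sketch of the kind of length check that produces this message. Only the error text is taken from the traceback above. The helper names `normalize_contexts` and `safe_contexts`, and the rule that a single context is shared across the batch, are assumptions made for illustration and are not the actual qwen_asr code.

```python
from typing import List, Sequence, Union


def normalize_contexts(n_audio: int,
                       context: Union[None, str, Sequence[str]]) -> List[str]:
    """Return one context string per audio clip (hypothetical helper)."""
    if context is None:
        ctxs = [""]
    elif isinstance(context, str):
        ctxs = [context]
    else:
        ctxs = list(context)
    # Assumption: a single context is shared by every clip in the batch.
    if len(ctxs) == 1 and n_audio > 1:
        ctxs = ctxs * n_audio
    # An empty list cannot be shared, so 8 clips with [] fails here.
    if len(ctxs) != n_audio:
        raise ValueError(
            f"Batch size mismatch: audio={n_audio}, context={len(ctxs)}")
    return ctxs


def safe_contexts(n_audio: int,
                  context: Union[None, str, Sequence[str]]) -> List[str]:
    """Caller-side guard (hypothetical): treat an empty list as 'no context'."""
    if context is not None and not isinstance(context, str) and len(context) == 0:
        context = ""
    return normalize_contexts(n_audio, context)
```

Under these assumptions, `normalize_contexts(8, [])` raises the same message as in the report, while `safe_contexts(8, [])` yields eight empty context strings. Whether the real fix belongs in the caller or in the settings is not something the traceback alone can confirm.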

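Troubleshooting steps (please try them in order):

  1. Disable "Whisper pre-split audio"
    Go to Menu → Tools → Advanced Options → Speech Recognition Parameters, find "Whisper pre-split audio?" and uncheck it. This option cuts the audio into sentence-level segments in advance; if a segment is too short or the silence-based split goes wrong, the context can end up missing.
  2. Adjust the VAD parameters to produce more stable segments
    In the same advanced settings:

    • Set "Maximum speech duration (seconds)" to 6~10 (do not make it too large)
    • Set "Minimum speech duration (milliseconds)" to 3000~4000 (3~4 seconds)
    • Make sure "Merge too-short subtitles into adjacent ones" is checked
      These settings give Qwen-ASR more complete context information.
  3. Check that the audio file is valid

    • Confirm that E:/videoplayback.m4a contains clear, continuous speech and is at least a few seconds long.
    • If the audio is silent or extremely short, the model may be unable to generate a context.
  4. Temporarily test with another ASR channel
    Switch to faster-whisper (local) with the large-v3 model, or to Alibaba FunASR, and see whether the same error occurs. If the other channels work, the problem lies in the local Qwen-ASR model or its configuration.
  5. Re-download the Qwen-ASR model (1.7B)
    The model files may be partially corrupted. Delete the corresponding Qwen-ASR model folder under the models directory (named something like models--Qwen--Qwen-ASR-1.7B), then restart the software so that it downloads the model again automatically.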
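The intent of the duration settings in step 2 can be sketched as follows: segments shorter than a minimum are folded into the previous segment, as long as the result stays under a maximum. This is only an illustration of the idea. The function `merge_short_segments`, its parameters, and the merge rule are assumptions and do not reproduce how the software actually post-processes VAD output.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (start_ms, end_ms)


def merge_short_segments(segments: List[Segment],
                         min_ms: int = 3000,
                         max_ms: int = 10000) -> List[Segment]:
    """Fold too-short segments into the previous one (hypothetical rule)."""
    merged: List[Segment] = []
    for start, end in segments:
        if merged:
            prev_start, prev_end = merged[-1]
            prev_short = prev_end - prev_start < min_ms
            cur_short = end - start < min_ms
            # Merge only if one side is too short and the result fits the cap.
            if (prev_short or cur_short) and end - prev_start <= max_ms:
                merged[-1] = (prev_start, end)
                continue
        merged.append((start, end))
    return merged
```

With the defaults above, four raw segments such as `[(0, 1000), (1200, 2500), (3000, 9500), (9700, 10500)]` collapse into two, because the last one would push the merged span past the 10-second cap.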

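If none of the steps above solves the problem, please provide the latest complete log file from the logs/ folder in the software's root directory (roughly the last 20~30 lines), so the issue can be narrowed down further.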
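One way to collect those lines is a small script that picks the most recently modified file in the logs folder and prints its tail. This is a sketch using only the Python standard library. The function name `tail_latest_log` is hypothetical, and it assumes the log files are plain text sitting directly inside logs/.

```python
from collections import deque
from pathlib import Path
from typing import List


def tail_latest_log(logs_dir: str, n_lines: int = 30) -> List[str]:
    """Return the last n_lines of the newest file in logs_dir."""
    folder = Path(logs_dir)
    if not folder.is_dir():
        return []
    files = [p for p in folder.iterdir() if p.is_file()]
    if not files:
        return []
    # "Latest" here means the most recent modification time.
    latest = max(files, key=lambda p: p.stat().st_mtime)
    with latest.open("r", encoding="utf-8", errors="replace") as fh:
        # deque with maxlen keeps only the final n_lines while streaming.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n_lines)]


if __name__ == "__main__":
    # root_dir from the report is E:/A4, so the folder would be E:/A4/logs.
    for line in tail_latest_log("E:/A4/logs"):
        print(line)
```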


