#4366 ValueError: The elements of the batch contain different keys. Cannot batch them ({'is_last', 'input_features', 'attentio

2408:8262* Posted at: 6 hours ago 👁13

Traceback (most recent call last):
  File "videotrans\process\stt_fun.py", line 596, in pipe_asr
  File "transformers\pipelines\pt_utils.py", line 126, in __next__
    item = next(self.iterator)
  File "transformers\pipelines\pt_utils.py", line 271, in __next__
    processed = self.infer(next(self.iterator), **self.params)
  File "torch\utils\data\dataloader.py", line 733, in __next__
    data = self._next_data()
  File "torch\utils\data\dataloader.py", line 789, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "torch\utils\data\_utils\fetch.py", line 43, in fetch
    return self.collate_fn(data)
  File "transformers\pipelines\base.py", line 182, in inner
    raise ValueError(
ValueError: The elements of the batch contain different keys. Cannot batch them ({'is_last', 'input_features', 'attention_mask'} != {'num_frames', 'is_last', 'input_features', 'attention_mask'})

Traceback (most recent call last):
  File "videotrans\task\_mult_video.py", line 36, in run
  File "videotrans\task\trans_create.py", line 380, in recogn
  File "videotrans\recognition\__init__.py", line 280, in run
  File "videotrans\recognition\_base.py", line 143, in run
  File "videotrans\recognition\_huggingface.py", line 38, in _exec
  File "videotrans\recognition\_huggingface.py", line 63, in _pipe_asr
  File "videotrans\configure\_base.py", line 289, in _new_process
RuntimeError: Traceback (most recent call last):
  File "videotrans\process\stt_fun.py", line 596, in pipe_asr
  File "transformers\pipelines\pt_utils.py", line 126, in __next__
    item = next(self.iterator)
  File "transformers\pipelines\pt_utils.py", line 271, in __next__
    processed = self.infer(next(self.iterator), **self.params)
  File "torch\utils\data\dataloader.py", line 733, in __next__
    data = self._next_data()
  File "torch\utils\data\dataloader.py", line 789, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "torch\utils\data\_utils\fetch.py", line 43, in fetch
    return self.collate_fn(data)
  File "transformers\pipelines\base.py", line 182, in inner
    raise ValueError(
ValueError: The elements of the batch contain different keys. Cannot batch them ({'is_last', 'input_features', 'attention_mask'} != {'num_frames', 'is_last', 'input_features', 'attention_mask'})


=
system:Windows-10-10.0.26200-SP0
version:v3.99
frozen:True
language:zh
root_dir:E:/APP/win-pyvideotrans-v3.99-428

1 Reply

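This happens because the audio segments differ in length, so the items that the transformers pipeline tries to batch together do not share the same keys. As the error message shows, one item carries a num_frames key that the other lacks.

Solutions:

  1. In the menu bar, go to Tools/Options → Advanced Options → Speech Recognition Parameters and check "Whisper pre-split audio". This forces the model to process the audio sentence by sentence, so segments of different lengths are not mixed in one batch.
  2. Set "Maximum speech duration (seconds)" to a small value (for example, 6 seconds) to avoid very long segments, and set "Minimum speech duration (ms)" to 1000 ms or more to filter out very short fragments.
  3. If the problem persists, try switching to the faster-whisper (local) channel, which handles variable-length segments better.
  4. If you must use a HuggingFace model, make sure the software is up to date, then restart it and try again.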
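
For anyone curious why mixed segments trip the batching step, here is a minimal Python sketch of the kind of key check the error message describes. It is an illustration only, not the transformers source: check_same_keys, group_by_keys and the sample dictionaries are hypothetical names and data made up for this example, and the real values would be tensors rather than short lists.

```python
def check_same_keys(items):
    """Raise ValueError if the dicts in a batch do not all share one key set."""
    keys = set(items[0].keys())
    for item in items[1:]:
        if set(item.keys()) != keys:
            raise ValueError(
                "The elements of the batch contain different keys. "
                f"Cannot batch them ({set(item.keys())} != {keys})"
            )
    return keys


def group_by_keys(items):
    """Split items into groups that each share one key set (hypothetical helper)."""
    groups = {}
    for item in items:
        groups.setdefault(frozenset(item.keys()), []).append(item)
    return list(groups.values())


# Hypothetical stand-ins for two preprocessed audio segments:
# one carries num_frames, the other does not.
with_frames = {
    "is_last": True,
    "input_features": [0.1],
    "attention_mask": [1],
    "num_frames": 100,
}
without_frames = {
    "is_last": True,
    "input_features": [0.2],
    "attention_mask": [1],
}

mixed_batch = [with_frames, without_frames, dict(with_frames)]

# Batching the mixed items fails, just like in the traceback.
try:
    check_same_keys(mixed_batch)
    mixed_ok = True
except ValueError:
    mixed_ok = False

# Items that share a key set can be batched together without error.
groups = group_by_keys(mixed_batch)
for group in groups:
    check_same_keys(group)
```

With one item per batch there is nothing to compare, which is presumably why processing the audio segment by segment, as in solution 1, avoids the error.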
