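#2761 Batch speech recognition with Fun-ASR-Nano-2512 hangs with "Recognize speakers" and "Insert speaker" checked, and the log shows a looping message. What is the problem? Downloading Model from https://www.modelscope.cn to direc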

240e:3b7* Posted at: 5 months ago

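Batch speech recognition with Fun-ASR-Nano-2512 hangs. I checked the "Recognize speakers" and "Insert speaker" options, and the log shows a message that keeps looping. What is the problem?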
Downloading Model from https://www.modelscope.cn to directory: C:/SOFT/pyvideotrans/models\models\iic\punc_ct-transformer_cn-en-common-vocab471067-large
WARNING:root:trust_remote_code: False
Building prefix dict from the default dictionary ...
DEBUG:jieba:Building prefix dict from the default dictionary ...
Dumping model to file cache C:\TEMP\jieba.cache
DEBUG:jieba:Dumping model to file cache C:\TEMP\jieba.cache
Loading model cost 0.822 seconds.
DEBUG:jieba:Loading model cost 0.822 seconds.
Prefix dict has been built successfully.
DEBUG:jieba:Prefix dict has been built successfully.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:151645 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
end_idx=1,nu=3749
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:151645 for open-end generation.
end_idx=2,nu=3749
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:151645 for open-end generation.
end_idx=3,nu=3749

Floor #1 DeepSeek reply at 5 months ago

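According to the log, the problem occurs when FunASR runs batch speech recognition with "Recognize speakers" checked: the program stalls while downloading or loading the speaker recognition (VAD/ASR) model. The repeated message The attention mask and the pad token id were not set... is only a warning printed by the model, not the root cause.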

The most likely cause is a failed model download or a network timeout, which makes the program retry repeatedly. Troubleshoot as follows:

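Check the network and proxy:
- Make sure your network can reach modelscope.cn (the ModelScope community). If you use a proxy, enter it correctly in the "Network proxy" field on row 5 of the main window (for example http://127.0.0.1:7860).
- Try temporarily disabling the proxy or switching to a different network.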
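The reachability check above can also be done outside the application. The following is a minimal sketch using only the Python standard library; the helper names `proxy_map` and `can_reach` are hypothetical and are not part of pyVideoTrans, and the proxy address is just the example value from the step above.

```python
import http.client
import urllib.error
import urllib.request


def proxy_map(proxy_url=None):
    """Build the mapping urllib expects; an empty mapping disables proxies."""
    return {"http": proxy_url, "https": proxy_url} if proxy_url else {}


def can_reach(url, proxy_url=None, timeout=10.0):
    """Return True if the server answers at all, False on connection failure or timeout."""
    handler = urllib.request.ProxyHandler(proxy_map(proxy_url))
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded with an error status, so it is reachable.
        return True
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, or a broken response.
        return False
```

Calling `can_reach("https://www.modelscope.cn")` once without a proxy and once with `proxy_url="http://127.0.0.1:7860"` shows which of the two routes actually works from this machine.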
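Download the model manually:
- The log shows the model being downloaded to C:/SOFT/pyvideotrans/models\models\iic\punc_ct-transformer_cn-en-common-vocab471067-large.
- You can try downloading this model by hand:
  - Open the model page: https://www.modelscope.cn/models/iic/punc_ct-transformer_cn-en-common-vocab471067-large
  - Follow the instructions on the page to download the model files, then place them in the directory shown in the log (note that \models\ may be duplicated in the logged path; the actual path should be C:/SOFT/pyvideotrans/models/iic/punc_ct-transformer_cn-en-common-vocab471067-large).
- After placing the files, restart the software and try again.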
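Because the logged path may contain a doubled models segment, it can help to check both spellings of the directory before and after copying files. This is a minimal sketch, not pyVideoTrans code: `candidate_model_dirs` and `is_populated` are hypothetical helper names, and the only input is the path printed in the log.

```python
from pathlib import Path, PureWindowsPath


def candidate_model_dirs(logged_path: str) -> list[str]:
    """Return the path as logged, plus a variant with a doubled 'models' segment collapsed."""
    parts = PureWindowsPath(logged_path).parts
    collapsed: list[str] = []
    for part in parts:
        # Skip a 'models' segment that directly repeats the previous one.
        if collapsed and part.lower() == "models" and collapsed[-1].lower() == "models":
            continue
        collapsed.append(part)
    candidates = [
        PureWindowsPath(*parts).as_posix(),
        PureWindowsPath(*collapsed).as_posix(),
    ]
    return list(dict.fromkeys(candidates))  # drop duplicates, keep order


def is_populated(directory: str) -> bool:
    """True if the directory exists and contains at least one entry."""
    path = Path(directory)
    return path.is_dir() and any(path.iterdir())
```

Looping over `candidate_model_dirs(...)` and printing `is_populated(d)` for each entry shows which of the two directories, if any, actually holds model files.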
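Simplify the task by turning off speaker recognition:
- This is the most direct fix. If you do not need to distinguish between speakers, uncheck "Recognize speakers". This avoids loading the additional VAD/ASR model and usually resolves the problem immediately.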
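Use a different recognition mode:
- If speaker recognition is required, try switching "Recognition mode" from "Overall recognition" to "Batch inference". The latter may depend on model loading differently and can sometimes bypass the problem.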

If none of these steps help, please provide the more complete log files from the videotrans/logs/ directory so the problem can be analyzed further.
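When collecting fuller logs, it may also be worth checking whether the `end_idx=...,nu=...` counter in the log is still increasing, which would suggest slow progress rather than a hard hang. The sketch below assumes that `end_idx` counts processed segments and `nu` is the total; the thread does not document these fields, so treat that reading as a guess. The function names are hypothetical.

```python
import re

# Matches lines such as "end_idx=2,nu=3749" from the log above.
PROGRESS_RE = re.compile(r"end_idx=(\d+),nu=(\d+)")


def progress_entries(log_text: str) -> list[tuple[int, int]]:
    """Extract (end_idx, nu) pairs in the order they appear in the log."""
    return [(int(idx), int(total)) for idx, total in PROGRESS_RE.findall(log_text)]


def is_advancing(entries: list[tuple[int, int]]) -> bool:
    """True if there are at least two entries and end_idx strictly increases."""
    indices = [idx for idx, _ in entries]
    return len(indices) >= 2 and all(b > a for a, b in zip(indices, indices[1:]))
```

If `is_advancing` returns True on successive snapshots of the log, the warning lines are repeating once per segment and the job is moving; if the last `end_idx` never changes, the hang hypothesis above is more plausible.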
