#2420 starting inference...Traceback (most recent call last): File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site

2406:da18* Posted at: 2 months ago 👁125

starting inference...
Traceback (most recent call last):
File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/gradio/queueing.py", line 745, in process_events

response = await route_utils.call_process_api(

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/gradio/route_utils.py", line 349, in call_process_api

output = await app.get_blocks().process_api(

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 2123, in process_api

result = await self.call_function(

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 1630, in call_function

prediction = await anyio.to_thread.run_sync(  # type: ignore

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync

return await get_async_backend().run_sync_in_worker_thread(

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2476, in run_sync_in_worker_thread

return await future

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 967, in run

result = context.run(func, *args)

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/gradio/utils.py", line 915, in wrapper

response = f(*args, **kwargs)

File "/Volumes/Ventoy/video/index-tts/webui.py", line 170, in gen_single

output = tts.infer(spk_audio_prompt=prompt, text=text,

File "/Volumes/Ventoy/video/index-tts/indextts/infer_v2.py", line 372, in infer

return list(self.infer_generator(

File "/Volumes/Ventoy/video/index-tts/indextts/infer_v2.py", line 437, in infer_generator

audio_16k = torchaudio.transforms.Resample(sr, 16000)(audio)

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl

return self._call_impl(*args, **kwargs)

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl

return forward_call(*args, **kwargs)

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/torchaudio/transforms/_transforms.py", line 980, in forward

return _apply_sinc_resample_kernel(waveform, self.orig_freq, self.new_freq, self.gcd, self.kernel, self.width)

File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/torchaudio/functional/functional.py", line 1463, in _apply_sinc_resample_kernel

waveform = waveform.view(-1, shape[-1])

RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
Emo control mode:0,weight:0.65,vec:None

Floor #1 DeepSeek AI reply at 2 months ago

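Based on the error message, you hit a runtime error while synthesizing speech with Index-TTS. The traceback shows that torchaudio's resampler received a tensor with 0 elements, so the input audio was most likely empty or could not be decoded correctly, and the tensor could not be reshaped.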
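
To illustrate why the final `waveform.view(-1, shape[-1])` call fails, here is a minimal pure-Python sketch of how a single `-1` dimension has to be resolved. `infer_missing_dim` is a hypothetical helper written for this explanation, not PyTorch's actual implementation; it only mirrors the arithmetic (element count divided by the product of the known dimensions).

```python
from math import prod


def infer_missing_dim(numel: int, shape: list[int]) -> list[int]:
    """Resolve the single -1 entry in `shape` for a tensor holding `numel` elements.

    Illustrative only: this mimics the rule a reshape has to follow,
    it is not code taken from PyTorch.
    """
    if shape.count(-1) != 1:
        raise ValueError("expected exactly one -1 in the target shape")

    # Product of the dimensions that are already known.
    known = prod(d for d in shape if d != -1)

    if known == 0:
        if numel == 0:
            # 0 elements with a 0-sized known dimension: every value for -1
            # would satisfy the shape, so there is no unique answer.
            raise RuntimeError(
                f"cannot reshape {numel} elements into {shape}: -1 is ambiguous"
            )
        raise ValueError(f"cannot reshape {numel} elements into {shape}")

    if numel % known != 0:
        raise ValueError(f"cannot reshape {numel} elements into {shape}")

    return [numel // known if d == -1 else d for d in shape]
```

With a non-empty waveform, for example 32000 samples viewed as `[-1, 16000]`, the missing dimension resolves to 2. With 0 samples and a last dimension of 0 the division becomes 0/0, which is the situation reported in the traceback.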

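Work through the following steps to troubleshoot and resolve the problem:

Check the reference audio file: Make sure the reference audio you uploaded for voice cloning is a valid, non-empty audio file (for example WAV or MP3). Try a short (3-10 seconds), clear recording of a single speaker.
Check the audio sample rate: Index-TTS tries to resample the reference audio to 16 kHz. Confirm that torchaudio can read your reference file. You can also open and inspect the file in another audio tool such as Audacity.
Update or reinstall dependencies: Incompatible versions of torchaudio or related libraries can sometimes cause this kind of problem. Try updating or reinstalling torch and torchaudio in your Index-TTS project environment: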
```
pip install --upgrade torch torchaudio
```
Or, if the problem persists, try installing a specific version (one that is compatible with your PyTorch version):
```
pip install torchaudio==0.13.1  # example only (0.13.1 pairs with torch 1.13.1); choose the release that matches your installed PyTorch version
```
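Check the official bundle guide: If you are using the prepackaged Index-TTS Windows bundle, make sure you deployed it according to the official guide. Environment problems are fairly common with the bundle.
- Reference document: Index-TTS2 整合包win版 (Index-TTS2 Windows bundle)
Provide more information: If none of the steps above help, please provide:
- The Index-TTS version you are using and how it was deployed (from source or from the bundle).
- Details of the reference audio file (format, duration, sample rate).
- The complete error log file (if pyVideoTrans generated one).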
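
The audio checks in the steps above can be partly automated. The sketch below uses only Python's standard `wave` module to report the format details, sample rate, and duration of a file and to reject one that contains no audio frames. `describe_wav` is a hypothetical helper name; the `wave` module reads PCM WAV only, so MP3 or other formats would need a different reader, and this is not how Index-TTS itself loads audio.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic properties of a PCM WAV file, or raise if it has no audio."""
    # wave.open raises wave.Error if the file is not a readable PCM WAV file.
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        info = {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "frames": frames,
            "duration_s": frames / rate if rate else 0.0,
        }

    if frames == 0:
        # An empty waveform is what would produce a 0-element tensor downstream.
        raise ValueError(f"{path} contains no audio frames")

    return info
```

Calling `describe_wav("reference.wav")` (the file name is a placeholder) on the reference clip gives you the sample rate and duration to include in a follow-up post, and a `ValueError` here would point to an empty file before any model code runs.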

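Because this error occurs during Index-TTS inference, the root cause may lie in its internal handling of the input audio. If the problem persists, consider opening an Issue in the Index-TTS project repository.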
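
If you do open an Issue, including the installed package versions helps rule out a torch/torchaudio mismatch. Here is a minimal sketch using the standard library's `importlib.metadata`; `environment_report` is a hypothetical helper, and the default package list is an assumption based on the libraries that appear in the traceback.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def environment_report(packages=("torch", "torchaudio", "gradio")) -> dict:
    """Collect the Python version, platform string, and installed package versions."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # Record the absence instead of failing, so the report is always complete.
            report[name] = "not installed"
    return report
```

Run it with the interpreter from the project's `.venv` so that it reports the packages Index-TTS actually uses, then paste the resulting dictionary into the Issue.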
