#2420 starting inference...Traceback (most recent call last): File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site

2406:da18* Posted at: 5 hours ago 👁9
starting inference...
Traceback (most recent call last):
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/gradio/queueing.py", line 745, in process_events
    response = await route_utils.call_process_api(
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/gradio/route_utils.py", line 349, in call_process_api
    output = await app.get_blocks().process_api(
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 2123, in process_api
    result = await self.call_function(
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 1630, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2476, in run_sync_in_worker_thread
    return await future
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 967, in run
    result = context.run(func, *args)
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/gradio/utils.py", line 915, in wrapper
    response = f(*args, **kwargs)
  File "/Volumes/Ventoy/video/index-tts/webui.py", line 170, in gen_single
    output = tts.infer(spk_audio_prompt=prompt, text=text,
  File "/Volumes/Ventoy/video/index-tts/indextts/infer_v2.py", line 372, in infer
    return list(self.infer_generator(
  File "/Volumes/Ventoy/video/index-tts/indextts/infer_v2.py", line 437, in infer_generator
    audio_16k = torchaudio.transforms.Resample(sr, 16000)(audio)
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/torchaudio/transforms/_transforms.py", line 980, in forward
    return _apply_sinc_resample_kernel(waveform, self.orig_freq, self.new_freq, self.gcd, self.kernel, self.width)
  File "/Volumes/Ventoy/video/index-tts/.venv/lib/python3.10/site-packages/torchaudio/functional/functional.py", line 1463, in _apply_sinc_resample_kernel
    waveform = waveform.view(-1, shape[-1])
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
Emo control mode:0,weight:0.65,vec:None

1 Reply

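Based on the error message, you hit a runtime error while synthesizing speech with Index-TTS. The traceback shows `torchaudio.transforms.Resample` receiving a tensor with 0 elements, so the reference audio was most likely loaded as empty (or is malformed), and torchaudio cannot reshape it during resampling.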
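To see why the reshape is reported as "ambiguous", here is a minimal sketch of how a `-1` dimension gets resolved. It is a simplified model written for illustration, not PyTorch's actual implementation, and `infer_missing_dim` is a hypothetical name. The missing size is the element count divided by the product of the known sizes; with 0 elements and a known size of 0 that is 0/0, so every candidate size fits.

```python
from math import prod


def infer_missing_dim(numel, shape):
    """Resolve the single -1 in `shape` for a tensor holding `numel` elements.

    Simplified model for illustration only; not PyTorch's real code.
    """
    if shape.count(-1) != 1:
        raise ValueError("expected exactly one -1 in shape")
    known = prod(d for d in shape if d != -1)
    if known == 0:
        if numel == 0:
            # size * 0 == 0 holds for every size, so no unique answer exists.
            raise RuntimeError(
                f"cannot infer -1 for 0 elements in shape {list(shape)}: ambiguous"
            )
        raise RuntimeError(f"shape {list(shape)} is invalid for {numel} elements")
    if numel % known != 0:
        raise RuntimeError(f"shape {list(shape)} is invalid for {numel} elements")
    return tuple(numel // known if d == -1 else d for d in shape)


# A mono clip with 48000 samples: the -1 resolves to 1 channel.
print(infer_missing_dim(48000, (-1, 48000)))  # (1, 48000)

# An empty clip, as in the traceback above: view(-1, 0) on 0 elements.
try:
    infer_missing_dim(0, (-1, 0))
except RuntimeError as err:
    print(err)
```

In other words, the error is a symptom. The real question is why the audio tensor had zero samples by the time it reached the resampler.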

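Please troubleshoot with the following steps:

  1. Check the reference audio file: make sure the reference audio you uploaded for voice cloning is a valid, non-empty audio file (e.g. WAV or MP3). Try a short (3-10 seconds), clear recording of a single speaker.
  2. Check the audio sample rate: Index-TTS resamples the reference audio to 16 kHz while processing it. Confirm that torchaudio can read your reference file. You can also open the file in another audio tool (such as Audacity) to inspect it.
  3. Update or reinstall dependencies: an incompatible version of torchaudio or a related library can sometimes cause this kind of problem. Try updating or reinstalling torch and torchaudio in your Index-TTS project environment:

    pip install --upgrade torch torchaudio

    Or, if the problem persists, try installing a specific version (for example, one compatible with your PyTorch version):

    pip install torchaudio==0.13.1  # example only; choose the version that matches your PyTorch version
  4. Check the official bundle guide: if you are using the prepackaged Index-TTS Windows bundle, make sure it was deployed according to the official guide. Environment problems are fairly common with the bundle.

  5. Provide more information: if none of the steps above help, please provide:

    • The Index-TTS version you are using and how it was deployed (from source or from the bundle).
    • Details of the reference audio file (format, duration, sample rate).
    • The complete error log file (if pyVideoTrans generated one).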
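For steps 1 and 2, a quick check outside Index-TTS can confirm whether the reference file actually contains audio. The sketch below assumes the file is an uncompressed PCM WAV, because it uses only Python's standard `wave` module. MP3 or other formats would need a different tool, and this does not prove that torchaudio itself can decode the file. `describe_wav` is a hypothetical helper name.

```python
import wave


def describe_wav(path):
    """Return basic facts about a PCM WAV file, or raise if it holds no audio.

    `path` should be a str path to the reference audio file.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if frames == 0 or rate == 0:
        raise ValueError(f"{path}: no usable audio (frames={frames}, sample_rate={rate})")
    return {
        "channels": channels,
        "sample_rate": rate,
        "frames": frames,
        "duration_s": frames / rate,
    }
```

Calling `describe_wav("reference.wav")` (the file name is a placeholder) returns the channel count, sample rate, frame count, and duration, which are also the details requested in step 5. A `ValueError` here means the file is empty before Index-TTS ever touches it.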

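Because this error occurs during Index-TTS inference, the root cause may lie in how Index-TTS processes the input audio internally. If the problem persists, consider filing an issue in the Index-TTS project repository.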
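If you want a clearer message than the reshape error while debugging, one option is a small guard placed just before the resampling call shown in the traceback (`torchaudio.transforms.Resample(sr, 16000)(audio)` in `infer_v2.py`). This is a hedged sketch: `require_samples` is a hypothetical helper, not part of Index-TTS or torchaudio. It only inspects the `.shape` attribute, which tensors and arrays both expose.

```python
def require_samples(waveform, name="reference audio"):
    """Raise a readable error if `waveform` has no samples on its last axis.

    Works with any object exposing a `.shape` sequence (for example a
    torch.Tensor); the check itself needs no library import.
    """
    shape = tuple(waveform.shape)
    if len(shape) == 0 or shape[-1] == 0:
        raise ValueError(
            f"{name} is empty (shape {shape}); check the input file before resampling"
        )
    return waveform
```

Wrapping the input as `require_samples(audio)` before it is resampled would turn the ambiguous-reshape failure into an error that names the actual problem. Whether the empty tensor comes from the file itself or from how it was loaded still has to be determined separately.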

