#4227 Why doesn't dubbing acceleration take effect?

2409:8a55* Posted at: 21 days ago 👁73

Why doesn't dubbing acceleration take effect?

9 Replies

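Dubbing acceleration is triggered only when the generated dubbing is longer than the original speech it replaces. If the two durations are close, or the dubbing is shorter, nothing is accelerated.

Please check the following:

  1. Confirm the option is checked: "Dubbing acceleration" (配音加速) must be selected in the sync/alignment area of the main window.
  2. Understand the alignment logic: if both "Dubbing acceleration" and "Video slow-down" (视频慢速) are selected, the software tries "Video slow-down" first, and acceleration is not triggered when its conditions are met.
  3. Check the actual durations: in single-video interactive mode you can see whether each dubbed line is longer than the original speech.
  4. Adjust the maximum rate: under Menu - Tools - Advanced Options - Subtitle/Audio/Video Alignment (菜单-工具-高级选项-字幕声音画面对齐) you can change "Maximum audio speed-up factor" (音频加速最大倍数, default 100). A value that is too low also limits how much the dubbing can be accelerated.
  5. Cache effects: if the same video was processed before and "Clean generated" (清理已生成) was not checked, the cached result may be reused and the video will not be reprocessed.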
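
The trigger condition and the maximum-factor cap described above can be sketched as a small function. This is only an illustration: the function name, its parameters, and the exact clamping rule are assumptions, not pyVideoTrans code.

```python
def plan_speedup(dub_ms: int, slot_ms: int, max_rate: float = 100.0) -> float:
    """Return the playback rate for one dubbed line (1.0 means unchanged).

    dub_ms   -- duration of the generated dubbing, in milliseconds
    slot_ms  -- duration of the original speech slot, in milliseconds
    max_rate -- hypothetical stand-in for the "maximum audio speed-up factor"
    """
    # No acceleration when the dubbing already fits in the slot.
    if slot_ms <= 0 or dub_ms <= slot_ms:
        return 1.0
    # Rate needed to squeeze the dubbing into the slot, capped at max_rate.
    return min(dub_ms / slot_ms, max_rate)


if __name__ == "__main__":
    print(plan_speedup(4000, 4500))       # dubbing shorter than the slot
    print(plan_speedup(6000, 4000))       # dubbing overflows the slot
    print(plan_speedup(9000, 3000, 2.0))  # overflow limited by the cap
```

With a low cap, a line that needs a higher rate still overflows its slot after acceleration, which is why item 4 matters.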


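It still doesn't work. Most of the final dubbed lines clearly run over, yet the audio in the final video sounds normal, with none of the uneven fast-then-slow pacing I would expect after acceleration.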

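This usually happens because "Video slow-down" (视频慢速) is checked as well. Under the software's priority logic, when both "Dubbing acceleration" (配音加速) and "Video slow-down" are enabled, "Video slow-down" runs first: the video is stretched to match the dubbing duration. The dubbing itself is therefore not accelerated, so it sounds normal and has no uneven pacing.

Solution:

  1. Keep only "Dubbing acceleration": in the sync/alignment area, check "Dubbing acceleration" and uncheck "Video slow-down". The software will then be forced to accelerate the dubbing to fit the original video duration.
  2. Adjust the maximum rate: if the accelerated audio sounds noticeably distorted, go to Menu - Tools - Advanced Options - Subtitle/Audio/Video Alignment (菜单-工具-高级选项-字幕声音画面对齐) and lower "Maximum audio speed-up factor" (音频加速最大倍数), for example from 100 to 30.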
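
The priority rule described in this reply can be pictured as follows. This is a rough sketch under stated assumptions: the enum, the function, and the simplification that "Video slow-down being enabled" is the only condition for choosing it are all hypothetical, since the thread does not spell out the real conditions.

```python
from enum import Enum


class Strategy(Enum):
    NONE = "none"                  # dubbing fits, nothing to do
    SLOW_VIDEO = "slow_video"      # stretch the video to match the dubbing
    SPEED_UP_DUB = "speed_up_dub"  # accelerate the dubbing to match the video


def choose_strategy(dub_ms: int, slot_ms: int,
                    dub_speedup: bool, video_slowdown: bool) -> Strategy:
    """Pick an alignment strategy for one subtitle line (illustrative only)."""
    if dub_ms <= slot_ms:
        return Strategy.NONE
    # Assumed priority: video slow-down wins whenever it is enabled.
    if video_slowdown:
        return Strategy.SLOW_VIDEO
    if dub_speedup:
        return Strategy.SPEED_UP_DUB
    return Strategy.NONE


if __name__ == "__main__":
    print(choose_strategy(6000, 4000, dub_speedup=True, video_slowdown=True))
    print(choose_strategy(6000, 4000, dub_speedup=True, video_slowdown=False))
```

Under this model, acceleration is reached only when "Video slow-down" is off, which is what step 1 of the solution relies on.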

"Video slow-down" is not checked.

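Please provide the latest log file from the logs/ folder in the software's installation directory (copy roughly the last 30 lines, where the errors appear) so the cause can be analyzed.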
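
If picking out the newest log by hand is tedious, a short script can do it. This is a minimal sketch, assuming the logs are plain UTF-8 text files directly inside the logs/ folder; the function name and the example path are placeholders.

```python
from pathlib import Path


def tail_latest_log(log_dir: str, n: int = 30) -> list[str]:
    """Return the last n lines of the most recently modified file in log_dir."""
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    if not files:
        return []
    # "Latest" here means newest modification time.
    latest = max(files, key=lambda p: p.stat().st_mtime)
    lines = latest.read_text(encoding="utf-8", errors="replace").splitlines()
    return lines[-n:]


if __name__ == "__main__":
    # Placeholder path: point this at your own installation's logs folder.
    for line in tail_latest_log("logs", 30):
        print(line)
```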

[INFO] Starting call to 'videotrans.translator._chatgpt.ChatGPT.llm_segment.._send', this is the 2nd time calling it.
[DEBUG] 需要断句的:message=[{'role': 'system', 'content': 'Role:\nYou are an expert Multi-Language ASR Post-Editor. Your goal is to losslessly repair, correct, and re-segment subtitles from the given SRT.\n\n


\n\n# GOLDEN RULE — ABSOLUTE REQUIREMENTS\n1. Do NOT delete any meaningful information.\n2. Removing meaningless fillers ("uh", "umm", "erm", “えー”, “嗯”) is allowed and does NOT violate rule #1.\n3. All semantic content must appear in the final output. No summarizing. No shortening.\n\n
\n\n# CORE TASKS\n1. Retention: preserve all meaning.\n2. Correction: fix ASR mistakes (typos, homophones, missing punctuation).\n3. Segmentation: cut long text into natural sentences.\n4. Formatting: output valid SRT wrapped in and .\n\n
\n\n# SEGMENTATION RULES (CRITICAL)\n1. If an original block contains multiple sentences or is longer th
......
re to add four, we would be doing...\n\n20\n00:01:08,336 --> 00:01:10,256\n This, nice and simple.\n\n21\n00:01:10,352 --> 00:01:14,960\n So that's addition. What is subtraction? Well subtraction is moving left so maybe\n\n22\n00:01:14,960 --> 00:01:19,184\n Maybe I'll use a different colour for subtraction, perhaps a game dev green. If we were to take...\n\n23\n00:01:19,184 --> 00:01:21,536\n 4 and subtract 2 we go down to 2.\n\n24\n00:01:21,536 --> 00:01:25,968\n OK, and if we were to subtract, take one and subtract one, we get down to zero.\n\n25\n00:01:25,968 --> 00:01:27,312\n Very simple.\n\n26\n00:01:27,312 --> 00:01:31,792\n So the other option that we have here is to have a negative value.\n\n27\n00:01:31,792 --> 00:01:34,464\n on the number lines let's take a little look at how that would work\n\n28\n00:01:34,464 --> 00:01:36,672\n So we can have a very similar number line, but this...\n\n29\n00:01:36,672 --> 00:01:41,008\n time let's imagine that this number line the zero is shifted across and\n\n30\n00:01:41,008 --> 00:01:42,256\n that we have zero here.\n\n31\n00:01:42,256 --> 00:01:44,608\n Alright, right in the middle there, let's say.\n\n32\n00:01:44,608 --> 00:01:48,976\n Now we can play with things called negative numbers. So if I have maybe a...\n\n33\n00:01:48,976 --> 00:01:49,840\n one sheep.\n\n34\n00:01:49,840 --> 00:01:54,128\n or two sheep or three sheep and I want to take away a certain number\n\n35\n00:01:54,128 --> 00:01:55,552\n of sheep then\n\n36\n00:01:55,552 --> 00:01:58,544\n if you said hey you've got two sheep to give me\n\n37\n00:01:58,544 --> 00:01:59,408\n Three.\n\n38\n00:01:59,408 --> 00:02:03,728\n we would end up back here at a number that we're going to call minus 1, so negative.\n\n39\n00:02:03,728 --> 00:02:04,528\n numbers.\n\n40\n00:02:04,528 --> 00:02:09,151\n Alright, super simple. 
That's all I'm going to say about the number line for the moment for you.\n\n41\n00:02:09,151 --> 00:02:09,488\n Okay.\n\n42\n00:02:09,488 --> 00:02:10,352\n So...\n\n43\n00:02:10,352 --> 00:02:11,536\n What?\n\n44\n00:02:12,816 --> 00:02:17,584\n we need to think about when we're using a game engine with numbers\n\n45\n00:02:17,584 --> 00:02:21,968\n is a few things firstly we want to look at overflow limits so if i pop here this is a\n\n46\n00:02:21,968 --> 00:02:23,392\n version of Unity.\n\n47\n00:02:23,392 --> 00:02:27,664\n game engine, but it doesn't matter. It's the same in a lot of languages. And I start with a big number.\n\n48\n00:02:27,664 --> 00:02:31,888\n this number for a reason. This is, we'll find out later, a\n\n49\n00:02:31,888 --> 00:02:33,120\n 32-bit integer.\n\n50\n00:02:33,120 --> 00:02:37,648\n So this means the total size that this can be is just over two billion.\n`"}]
[INFO] Finished call to 'videotrans.translator._chatgpt.ChatGPT.llm_segment.._send' after 20.359(s), this was the 2nd time calling it.
[WARNING] 重新断句失败[except],已恢复原样 RetryError[]
[WARNING] 重新断句失败[except],已恢复原样 RetryError[]

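Judging from the log, the "dubbing acceleration doesn't work" problem is not a fault in the acceleration feature itself. LLM re-segmentation is enabled, and after repeated retries it ultimately failed (the log shows 重新断句失败[except],已恢复原样, i.e. "re-segmentation failed [except], original restored").

The re-segmentation failure caused the task to abort abnormally partway through, so the subsequent dubbing acceleration step never actually ran properly.

Solution:

  1. Temporarily turn off LLM re-segmentation: in the menu bar at the top of the main window, click "Tools/Options" (工具/选项) -> "Advanced Options" (高级选项), find "Subtitle lines per batch for LLM re-segmentation" (LLM重新断句每批字幕行数) or a similar "LLM re-segmentation" switch, and uncheck it or set it to 0 to disable the feature.
  2. Check "Clean generated" again: in the top-left corner of the main window, check "Clean generated" (清理已生成), then restart the translation task.

About the cause of the re-segmentation failure: the log shows RetryError, meaning that the call to the AI service (such as ChatGPT/DeepSeek) for subtitle re-segmentation timed out or returned an error and still failed after retrying. This is usually caused by network instability or API limits.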
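
The "retry, then restore the original" behavior that the log message names follows a common pattern, sketched below. The RetryError class here is a local stand-in, and call_with_retry and resegment_or_keep are hypothetical helpers; none of this is the project's actual code.

```python
import time


class RetryError(Exception):
    """Raised when every attempt failed (local stand-in, not a library class)."""


def call_with_retry(fn, attempts: int = 2, delay_s: float = 0.0):
    """Call fn() up to `attempts` times; raise RetryError if all of them fail."""
    last_exc = None
    for i in range(attempts):
        try:
            return fn()
        except Exception as exc:  # e.g. an API timeout or rate-limit error
            last_exc = exc
            if i + 1 < attempts:
                time.sleep(delay_s)
    raise RetryError(f"failed after {attempts} attempts") from last_exc


def resegment_or_keep(subtitles, resegment, attempts: int = 2):
    """Return (subtitles, ok); fall back to the input when retries run out."""
    try:
        return call_with_retry(lambda: resegment(subtitles), attempts), True
    except RetryError:
        # Mirrors the "original restored" wording in the log message.
        return subtitles, False
```

In this sketch a failed re-segmentation is not fatal: the caller simply continues with the unmodified subtitles.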


[ERROR] [Audio-RB] 音频处理失败 D:/win-pyvideotrans-v3.99/tmp/58440/8aea42c40b/dubb-82.wav: Failed to execute rubberband. Please verify that rubberband-cli is installed.
[DEBUG] [Audio] 开始对齐拼接...
[DEBUG] [Audio-Sync] Line=1 | 音频末尾补静音 115ms | [current_slot_audio_len=4493 slot_duration=4608] | Timeline: 176 -> 4784
[DEBUG] [Audio-Sync] Line=2 | 音频末尾补静音 358ms | [current_slot_audio_len=3866 slot_duration=4224] | Timeline: 4784 -> 9008
[DEBUG] [Audio-Sync] Line=3 | 音频溢出截断 5201->4224 | [current_slot_audio_len=5201 slot_duration=4224] | Timeline: 9008 -> 13232
[DEBUG] [Audio-Sync] Line=4 | 音频末尾补静音 91ms | [current_slot_audio_len=4133 slot_duration=4224] | Timeline: 13232 -> 17456
[DEBUG] [Audio-Sync] Line=5 | 音频末尾补静音 2631ms | [current_slot_audio_len=3065 slot_duration=5696] | Timeline: 17456 -> 23152
[DEBUG] [Audio-Sync] Line=6 | 音频溢出截断 3053->2928 | [current_slot_audio_len=3053 slot_duration=2928] | Timeline: 23152 -> 26080
[DEBUG] [Audio-Sync] Line=7 | 音频溢出截断 7105->4272 | [current_slot_audi
......
频末尾补静音 352ms | [current_slot_audio_len=592 slot_duration=944] | Timeline: 235056 -> 236000
[DEBUG] [Audio-Sync] Line=74 | 音频末尾补静音 1262ms | [current_slot_audio_len=3170 slot_duration=4432] | Timeline: 236000 -> 240432
[DEBUG] [Audio-Sync] Line=75 | 音频溢出截断 4272->4224 | [current_slot_audio_len=4272 slot_duration=4224] | Timeline: 240432 -> 244656
[DEBUG] [Audio-Sync] Line=76 | 音频末尾补静音 20ms | [current_slot_audio_len=1788 slot_duration=1808] | Timeline: 244656 -> 246464
[DEBUG] [Audio-Sync] Line=77 | 音频末尾补静音 519ms | [current_slot_audio_len=3785 slot_duration=4304] | Timeline: 246464 -> 250768
[DEBUG] [Audio-Sync] Line=78 | 音频末尾补静音 658ms | [current_slot_audio_len=2670 slot_duration=3328] | Timeline: 250768 -> 254096
[DEBUG] [Audio-Sync] Line=79 | 音频末尾补静音 556ms | [current_slot_audio_len=4052 slot_duration=4608] | Timeline: 254096 -> 258704
[DEBUG] [Audio-Sync] Line=80 | 音频末尾补静音 950ms | [current_slot_audio_len=3274 slot_duration=4224] | Timeline: 258704 -> 262928
[DEBUG] [Audio-Sync] Line=81 | 音频溢出截断 6838->4224 | [current_slot_audio_len=6838 slot_duration=4224] | Timeline: 262928 -> 267152
[DEBUG] [Audio-Sync] Line=82 | 音频溢出截断 2531->1152 | [current_slot_audio_len=2531 slot_duration=1152] | Timeline: 267152 -> 268304
[DEBUG] [Audio-Sync] Line=83 | 音频溢出截断 4470->4462 | [current_slot_audio_len=4470 slot_duration=4462] | Timeline: 268304 -> 272766
[DEBUG] concat_txt='D:/win-pyvideotrans-v3.99/tmp/58440/8aea42c40b/final_audio_concat.txt',filelist[0]='D:/win-pyvideotrans-v3.99/tmp/58440/8aea42c40b/silence_head_0.wav'
[DEBUG] [FFMPEG-CMD]:
ffmpeg -hide_banner -ignore_unknown -threads 0 -y -f concat -safe 0 -i D:/win-pyvideotrans-v3.99/tmp/58440/8aea42c40b/final_audio_concat.txt -c:a copy D:/win-pyvideotrans-v3.99/tmp/58440/8aea42c40b/final_audio_temp.wav

[DEBUG] [Audio-Concat] 最终音频已生成: D:/win-pyvideotrans-v3.99/tmp/58440/8aea42c40b/target.wav
[DEBUG] [FFMPEG-CMD]:
ffmpeg -hide_banner -ignore_unknown -threads 0 -y -i D:/_Output/测试专用/Math 02 - 去除前4s.mp4 -vn -b:a 128k -c:a aac D:/_Output/测试专用/_video_out/Math 02 - 去除前4s-mp4/en.m4a

[DEBUG] [FFMPEG-CMD]:
ffmpeg -hide_banner -ignore_unknown -threads 0 -y -i target.wav -ac 2 -b:a 128k -c:a aac origin_audio.m4a

[DEBUG]


==准备要嵌入的字幕:self.cfg.subtitle_type=4
=
[DEBUG] 最终确定字幕嵌入类型:4 ,目标字幕语言:chi, 字幕文件:D:/win-pyvideotrans-v3.99/tmp/58440/8aea42c40b/shuang.srt

[DEBUG] [FFMPEG-CMD]:
ffmpeg -hide_banner -ignore_unknown -threads 0 -y -i novoice.mp4 -vf tpad=stop_mode=clone:stop_duration=30.503 -c:v libx264 -crf 23 -preset veryfast -an final_video_with_freeze_lastend.mp4

[DEBUG] 视频定格应延长30503ms,实际向上取整秒延长30.503s,操作成功。
[DEBUG] 原始hw_type='hevc_nvenc'
[DEBUG] 整理后hw_type='nvenc'
[DEBUG] [尝试硬件编解码执行命令]
-y -progress compose1776944129.5909653.txt -hwaccel cuda -hwaccel_output_format cuda -i novoice.mp4 -i origin_audio.m4a -i shuang.srt -map 0:v -map 1:a -map 2:s -c:v hevc_nvenc -c:a copy -c:s mov_text -metadata:s:s:0 language=chi -cq 23 -preset p2 -movflags +faststart -t 303.266000 laste_target.mp4

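The direct cause: the software relies on the rubberband-cli tool for audio speed-up and time-stretching, but that tool is not installed on your system, so dubbing acceleration fails with the error "Failed to execute rubberband".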
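
The [Audio-Sync] lines in the log above also show what happened to the overlong lines: they are marked 音频溢出截断 ("audio overflow truncated"), for example 5201->4224, so they were cut to the slot length rather than sped up. A rough way to list such lines from a saved log is sketched below; the regular expression is written against the line format seen in this thread and may not match other versions.

```python
import re

# Matches e.g. "[Audio-Sync] Line=3 | ... | [current_slot_audio_len=5201 slot_duration=4224]"
SYNC_RE = re.compile(
    r"\[Audio-Sync\] Line=(\d+) \|.*?"
    r"current_slot_audio_len=(\d+) slot_duration=(\d+)"
)


def truncated_lines(log_text: str) -> list[tuple[int, int]]:
    """Return (line_number, overflow_ms) for each line whose audio exceeds its slot."""
    result = []
    for match in SYNC_RE.finditer(log_text):
        line_no, audio_ms, slot_ms = map(int, match.groups())
        if audio_ms > slot_ms:
            result.append((line_no, audio_ms - slot_ms))
    return result
```

A long list of large overflows together with the rubberband error suggests the acceleration step failed rather than being skipped by a setting.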

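Solution:

  1. Windows prepackaged build: download the full software package again (not the patch package); the full package already bundles rubberband. Extract it and overwrite the existing directory.
  2. macOS: run brew install rubberband in a terminal.
  3. Linux: run sudo apt install rubberband-cli (Debian/Ubuntu), or the equivalent install command for your distribution.

After installing, check "Clean generated" (清理已生成) again and restart the translation task.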
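
To confirm the installation before re-running a long task, you can check whether the executable is reachable. This is a minimal sketch, assuming the command is named rubberband and is looked up on PATH; the Windows prepackaged build may ship its own copy elsewhere, in which case this check does not apply.

```python
import shutil
from typing import Optional


def find_tool(name: str = "rubberband") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path:
        print(f"rubberband found at: {path}")
    else:
        print("rubberband not found on PATH; install it as described above")
```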

