#3164 TaskCfg(cache_folder='E:/win-pyvideotrans-v3.96/tmp/84836/55dc1fd853', target_dir='E:/Tutorial/PikumaRaycastingEnginePro

198.98* Posted at: 12 hours ago 👁14

Error during the speech recognition stage [faster-whisper (local)] Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 256, in faster_whisper
File "faster_whisper\transcribe.py", line 586, in _batched_segments_generator
File "faster_whisper\transcribe.py", line 120, in forward
File "faster_whisper\transcribe.py", line 222, in generate_segment_batched
TypeError: generate(): incompatible function arguments. The following argument types are supported:

1. (self: ctranslate2._ext.Whisper, features: ctranslate2._ext.StorageView, prompts: Union[List[List[str]], List[List[int]]], *, asynchronous: bool = False, beam_size: int = 5, patience: float = 1, num_hypotheses: int = 1, length_penalty: float = 1, repetition_penalty: float = 1, no_repeat_ngram_size: int = 0, max_length: int = 448, return_scores: bool = False, return_logits_vocab: bool = False, return_no_speech_prob: bool = False, max_initial_timestamp_index: int = 50, suppress_blank: bool = True, suppress_tokens: Optional[List[int]] = [-1], sampling_topk: int = 1, sampling_temperature: float = 1) -> Union[List[ctranslate2._ext.WhisperGenerationResult], List[ctranslate2._ext.WhisperGenerationResultAsync]]

Invoked with: , -0.509277 1.14844 0.342529 ... 0.0516052 0.0163574 -0.00265884
[cuda:0 float16 storage viewed as 4x1500x1280], [[50258, 50259, 50360, 50364], [50258, 50259, 50360, 50364], [50258, 50259, 50360, 50364], [50258, 50259, 50360, 50364]]; kwargs: beam_size=5, patience=1, length_penalty=1, max_length=448, suppress_blank=True, suppress_tokens=(1, 2, 7, 8, 9, 10, 14, 25, 26, 27, 28, 29, 31, 58, 59, 60, 61, 62, 63, 90, 91, 92, 93, 359, 503, 522, 542, 873, 893, 902, 918, 922, 931, 1350, 1853, 1982, 2460, 2627, 3246, 3253, 3268, 3536, 3846, 3961, 4183, 4667, 6585, 6647, 7273, 9061, 9383, 10428, 10929, 11938, 12033, 12331, 12562, 13793, 14157, 14635, 15265, 15618, 16553, 16604, 18362, 18956, 20075, 21675, 22520, 26130, 26161, 26435, 28279, 29464, 31650, 32302, 32470, 36865, 42863, 47425, 49870, 50254, 50258, 50359, 50360, 50361, 50362, 50363), return_scores=True, return_no_speech_prob=True, sampling_temperature='0.0', repetition_penalty=1.0, no_repeat_ngram_size=0
TaskCfg(cache_folder='E:/win-pyvideotrans-v3.96/tmp/84836/55dc1fd853', target_dir='E:/Tutorial/PikumaRaycastingEngineProgramming/_video_out/02. How to Take this Course-mp4', remove_noise=False, is_separate=False, detect_language='en', subtitle_language=None, source_language='英语', target_language='简体中文', source_language_code='en', target_language_code='zh-cn', source_sub='E:/Tutorial/PikumaRaycastingEngineProgramming/_video_out/02. How to Take this Course-mp4/en.srt', target_sub='E:/Tutorial/PikumaRaycastingEngineProgramming/_video_out/02. How to Take this Course-mp4/zh-cn.srt', source_wav='E:/win-pyvideotrans-v3.96/tmp/84836/55dc1fd853/en.wav', source_wav_output='E:/Tutorial/PikumaRaycastingEngineProgramming/_video_out/02. How to Take this Course-mp4/en.m4a', target_wav='E:/win-pyvideotrans-v3.96/tmp/84836/55dc1fd853/target.wav', target_wav_output='E:/Tutorial/PikumaRaycastingEngineProgramming/_video_out/02. How to Take this Course-mp4/zh-cn.m4a', subtitles='', novoice_mp4='E:/win-pyvideotrans-v3.96/tmp/84836/55dc1fd853/novoice.mp4', noextname='02. How to Take this Course', shibie_audio=None, targetdir_mp4='E:/Tutorial/PikumaRaycastingEngineProgramming/_video_out/02. How to Take this Course-mp4/02. How to Take this Course.mp4', instrument=None, vocal=None, back_audio='', background_music=None, app_mode='biaozhun', subtitle_type=3, volume='+0%', pitch='+0Hz', voice_rate='+0%', voice_role='男声.wav', copysrt_rawvideo=False, clear_cache=True, translate_type=5, tts_type=8, recogn_type=0, model_name='distil-large-v3.5', voice_autorate=True, video_autorate=True, cuda=True, name='E:/Tutorial/PikumaRaycastingEngineProgramming/Pikuma - Raycasting Engine Programming/02. How to Take this Course.mp4', basename='02. 
How to Take this Course.mp4', ext='mp4', dirname='E:/Tutorial/PikumaRaycastingEngineProgramming/Pikuma - Raycasting Engine Programming', shound_del_name=None, uuid='55dc1fd853', remove_silent_mid=False, align_sub_audio=True, enable_diariz=False, nums_diariz=0, rephrase=0, only_out_mp4=False, fix_punc=True, recogn2pass=False)


=
system:Windows-10-10.0.19045-SP0
version:v3.96
frozen:True
language:zh
root_dir:E:/win-pyvideotrans-v3.96

1 Replies

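This error comes from an incompatibility in the call to ctranslate2, the library underlying faster-whisper. The error message shows sampling_temperature='0.0' being passed as a string, whereas the function expects a float (sampling_temperature: float = 1).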

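  1. Update the software: this issue has been fixed in newer releases. Go to the official download page, then download the latest full package or patch package and replace your current installation with it.
  2. Temporary workaround: if you cannot update right now, try changing the faster-whisper parameters in the software settings. Go to Menu -> Tools -> Advanced Options -> Speech Recognition Parameters and change the temperature value from 0.0 to 0 (make sure it is entered as a number, not as a string).
  3. Clear the cache: after either of the steps above, tick the "Clean up old files before running" option on the main screen, or manually delete the task output directory (for example E:/Tutorial/PikumaRaycastingEngineProgramming/_video_out/02. How to Take this Course-mp4/), and then run the task again.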
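
For anyone running from source, the type mismatch described above can be illustrated with a minimal sketch. This is not the actual pyvideotrans or faster-whisper code: the helper names `coerce_temperature` and `normalize_generate_kwargs` are hypothetical, and the sketch only assumes that a setting read from a config file may arrive as a string such as '0.0' and must become a float before it reaches `generate()`.

```python
from typing import Any, Dict


def coerce_temperature(value: Any, default: float = 0.0) -> float:
    """Return value as a float; fall back to default if it cannot be parsed.

    Hypothetical helper: settings loaded from a config file are often strings.
    """
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def normalize_generate_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs with sampling_temperature converted to a float."""
    fixed = dict(kwargs)
    if "sampling_temperature" in fixed:
        fixed["sampling_temperature"] = coerce_temperature(
            fixed["sampling_temperature"]
        )
    return fixed


if __name__ == "__main__":
    # Same shape as the failing call in the log: the temperature is a string.
    raw = {"beam_size": 5, "sampling_temperature": "0.0"}
    print(normalize_generate_kwargs(raw))
```

Converting at the point where the settings are read, rather than just before the call, would keep every later consumer of the value consistent, but where that belongs in the real code base is an assumption here.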
