#5217 TaskCfgSTT(uuid='2d65212b45', name='E:/DownLoad/DCL/ASHLEY MASON – MOMMY PANTY ANAL.mp4', dirname='E:/DownLoad/DCL', noe

106.61* Posted at: 1 hour ago

语音识别阶段出错[openai-whisper(本地)] Expected parameter logits (Tensor of shape (1, 51866)) of distribution Categorical(logits: torch.Size([1, 51866])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0'):Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 102, in openai_whisper
File "whisper\transcribe.py", line 295, in transcribe
File "whisper\transcribe.py", line 201, in decode_with_fallback
File "torch\utils\_contextlib.py", line 116, in decorate_context

return func(*args, **kwargs)

File "whisper\decoding.py", line 824, in decode
File "torch\utils\_contextlib.py", line 116, in decorate_context

return func(*args, **kwargs)

File "whisper\decoding.py", line 737, in run
File "whisper\decoding.py", line 703, in _main_loop
File "whisper\decoding.py", line 283, in update
File "torch\distributions\categorical.py", line 73, in init

super().__init__(batch_shape, validate_args=validate_args)

File "torch\distributions\distribution.py", line 72, in init

raise ValueError(

ValueError: Expected parameter logits (Tensor of shape (1, 51866)) of distribution Categorical(logits: torch.Size([1, 51866])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0')

Traceback (most recent call last):

File "videotrans\task\job.py", line 35, in run

File "videotrans\task\job.py", line 100, in process_task

File "videotrans\task\speech2text.py", line 126, in recogn

File "videotrans\recognition\__init__.py", line 190, in run

File "videotrans\recognition\_base.py", line 94, in run

File "videotrans\recognition\_whisper.py", line 34, in _exec

File "videotrans\recognition\_whisper.py", line 77, in _openai

File "videotrans\configure\base.py", line 253, in _new_process

videotrans.configure.excepts.VideoTransError: Expected parameter logits (Tensor of shape (1, 51866)) of distribution Categorical(logits: torch.Size([1, 51866])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0'):Traceback (most recent call last):
File "videotrans\process\stt_fun.py", line 102, in openai_whisper
File "whisper\transcribe.py", line 295, in transcribe
File "whisper\transcribe.py", line 201, in decode_with_fallback
File "torch\utils\_contextlib.py", line 116, in decorate_context

return func(*args, **kwargs)

File "whisper\decoding.py", line 824, in decode
File "torch\utils\_contextlib.py", line 116, in decorate_context

return func(*args, **kwargs)

File "whisper\decoding.py", line 737, in run
File "whisper\decoding.py", line 703, in _main_loop
File "whisper\decoding.py", line 283, in update
File "torch\distributions\categorical.py", line 73, in init

super().__init__(batch_shape, validate_args=validate_args)

File "torch\distributions\distribution.py", line 72, in init

raise ValueError(

ValueError: Expected parameter logits (Tensor of shape (1, 51866)) of distribution Categorical(logits: torch.Size([1, 51866])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0')
TaskCfgSTT(uuid='2d65212b45', name='E:/DownLoad/DCL/ASHLEY MASON – MOMMY PANTY ANAL.mp4', dirname='E:/DownLoad/DCL', noextname='ASHLEY MASON – MOMMY PANTY ANAL', basename='ASHLEY MASON – MOMMY PANTY ANAL.mp4', ext='mp4', target_dir='D:/win-pyvideotrans-v4.01-0602/output/recogn', cache_folder='D:/win-pyvideotrans-v4.01-0602/tmp/13936/2d65212b45', is_cuda=True, source_language=None, source_language_code=None, source_sub=None, source_wav=None, source_wav_output=None, target_language=None, target_language_code=None, target_sub='D:/win-pyvideotrans-v4.01-0602/output/recogn/ASHLEY MASON – MOMMY PANTY ANAL.srt', target_wav=None, target_wav_output=None, detect_language='auto', recogn_type=1, model_name='large-v3-turbo', shibie_audio='D:/win-pyvideotrans-v4.01-0602/tmp/13936/2d65212b45/ASHLEY MASON – MOMMY PANTY ANAL-1781530568.9856992.wav', remove_noise=False, enable_diariz=False, nums_diariz=0, rephrase=0, fix_punc=False)


=
system:Windows-10-10.0.26100-SP0
version:v4.01
frozen:True
language:zh
root_dir:D:/win-pyvideotrans-v4.01-0602

3 Replies

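The root cause of this error is that the speech recognition model (openai-whisper) produced logits that were all NaN (invalid values) during inference, so validation of the probability distribution failed. This is usually caused by one of the following:

  1. Insufficient VRAM: the current model, large-v3-turbo, uses a lot of VRAM. When VRAM (especially on a single card) runs short, CUDA computation can output NaN, which triggers the error immediately.
  2. Misconfigured CUDA environment: a CUDA Toolkit version that is too old, or a missing cuDNN install, can make floating-point computation unstable.
  3. Corrupted or incompletely downloaded model file: missing or damaged model files can also produce NaN.
  4. Abnormal audio format: in rare cases, a non-conforming audio file can also make the model produce NaN.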
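
To make the failure mode concrete, here is a minimal standard-library sketch of the kind of check that rejects the logits in the traceback. It is only an illustration: `validate_logits` and `softmax` are hypothetical names, and this is not how torch implements its constraint check. The point is that a single NaN (or infinite) logit makes it impossible to build a valid categorical distribution, so the decoder has nothing to sample from.

```python
import math


def validate_logits(logits):
    """Raise ValueError if any logit is NaN or infinite (illustrative check only)."""
    bad = [x for x in logits if not math.isfinite(x)]
    if bad:
        raise ValueError(f"found invalid values in logits: {bad[:3]}")


def softmax(logits):
    """Turn finite logits into probabilities that sum to 1."""
    validate_logits(logits)
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(softmax([0.0, 0.0]))
    try:
        softmax([float("nan"), 1.0])
    except ValueError as err:
        print("rejected:", err)
```

Whatever the upstream cause turns out to be, the traceback only shows this last step: the values reaching the sampler were already NaN.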

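Solutions (try in priority order)

1. Use a smaller model to free VRAM

  • On the main screen, set the speech recognition channel to faster-whisper (local) instead of openai-whisper. It is more stable and efficient with the same amount of VRAM, and it also supports large-v3-turbo and large-v3.
  • If VRAM is still insufficient, switch the model to medium or small.

2. Adjust compute precision to reduce VRAM usage

  • Go to Menu → Tools → Advanced Options → Speech Recognition Parameters

    • Change the compute data type to float16 (suitable for NVIDIA cards; fast and VRAM-efficient)
    • Lower beam_size and best_of to 1
    • Turn off "Enable context awareness"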

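3. Make sure the CUDA environment is installed correctly

  • Open CMD and run nvcc -V; confirm the version is 12.8 or later.
  • Run nvidia-smi and confirm the driver supports CUDA 12.8+.
  • If cuDNN is not installed, download a matching version (≥ 9.11) and configure the environment variables.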
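
The two version checks above can be scripted. The sketch below is a rough helper, not part of pyVideoTrans: the function names are made up for illustration, and the regular expressions assume the usual output shapes (`release 12.8` in the `nvcc -V` banner, `CUDA Version: 12.8` in the `nvidia-smi` header). If either tool is missing from PATH, the helper reports `None` instead of failing.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc -V` output, or None if not found."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_smi_cuda_version(text):
    """Extract (major, minor) from the `nvidia-smi` header, or None if not found."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def run_tool(args):
    """Run a command and return its stdout, or None if it is unavailable."""
    if shutil.which(args[0]) is None:
        return None
    try:
        result = subprocess.run(args, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout


def meets_minimum(version, minimum=(12, 8)):
    """True if a parsed version tuple is at least the required minimum."""
    return version is not None and version >= minimum


if __name__ == "__main__":
    nvcc = parse_nvcc_release(run_tool(["nvcc", "-V"]) or "")
    smi = parse_smi_cuda_version(run_tool(["nvidia-smi"]) or "")
    print("nvcc toolkit:", nvcc, "ok" if meets_minimum(nvcc) else "missing or too old")
    print("driver CUDA :", smi, "ok" if meets_minimum(smi) else "missing or too old")
```

Note that `nvcc -V` reports the installed toolkit while `nvidia-smi` reports the highest CUDA version the driver supports, so the two numbers can legitimately differ.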

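4. Simplify the file path

  • The current video filename contains special characters (including spaces and uppercase letters). Copy the video to a shallow, English-only path such as D:\test.mp4 so that path problems cannot indirectly interfere with ffmpeg processing.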
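
One way to follow this step without renaming the original is to copy the file under a simplified name. The sketch below is a hypothetical helper (the names `simplified_name` and `copy_to_simple_path` are not from pyVideoTrans): it lowercases the name and collapses every run of characters other than `a-z` and `0-9` into a single underscore, which is stricter than strictly necessary but leaves nothing for a path parser to trip over.

```python
import re
import shutil
from pathlib import Path


def simplified_name(filename):
    """Return a lowercase ASCII name with no spaces or special characters."""
    path = Path(filename)
    stem = re.sub(r"[^a-z0-9]+", "_", path.stem.lower()).strip("_") or "video"
    return stem + path.suffix.lower()


def copy_to_simple_path(source, dest_dir):
    """Copy `source` into `dest_dir` under a simplified name; return the new path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / simplified_name(Path(source).name)
    shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    print(simplified_name("My Clip – Final.MP4"))
```

The copy can then be selected in the application in place of the original file.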

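5. Try turning off CUDA acceleration

  • If none of the steps above help, temporarily uncheck CUDA acceleration on the main screen (use the CPU instead) to check whether the GPU hardware or driver is at fault. If the CPU run succeeds, the problem is most likely the CUDA environment or insufficient VRAM.

If the problem persists after all of the steps above, please provide the latest log file from the logs/ folder under the software installation directory (copy roughly the last 30 lines of error output) for further analysis.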

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "logging\__init__.py", line 1103, in emit
AttributeError: 'NoneType' object has no attribute 'write'
Call stack:
File "videotrans\task\job.py", line 37, in run
File "videotrans\task\job.py", line 44, in handle_error
File "logging\__init__.py", line 1512, in exception
File "logging\__init__.py", line 1506, in error
File "logging\__init__.py", line 1624, in _log
File "logging\__init__.py", line 1634, in handle
File "logging\__init__.py", line 1696, in callHandlers
File "logging\__init__.py", line 968, in handle
File "logging\__init__.py", line 1108, in emit
Message: VideoTransError('Expected parameter logits (Tensor of shape (1, 51866)) of distribution Categorical(logits: torch.Size([1, 51866])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:\ntensor([[nan, nan, nan, ..., nan, nan, nan]], device=\'cuda:0\'):Traceback (most recent call last):\n File "videotrans\process\stt_fun.py", line 102, in openai_whisper\n File "whisper\transcribe.py", line 295, in transcribe\n File "whisper\transcribe.py", line 201, in decode_with_fallback\n File "torch\utils\_contextlib.py", line 116, in decorate_context\n return func(args, kwargs)\n File "whisper\decoding.py", line 824, in decode\n File "torch\utils\_contextlib.py", line 116, in decorate_context\n return func(args, kwargs)\n File "whisper\decoding.py", line 737, in run\n File "whisper\decoding.py", line 703, in _main_loop\n File "whisper\decoding.py", line 283, in update\n File "torch\distributions\categorical.py", line 73, in init\n super().__init__(batch_shape, validate_args=validate_args)\n File "torch\distributions\distribution.py", line 72, in init\n raise ValueError(\nValueError: Expected parameter logits (Tensor of shape (1, 51866)) of distribution Categorical(logits: torch.Size([1, 51866])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:\ntensor([[nan, nan, nan, ..., nan, nan, nan]], device=\'cuda:0\')\n')
Arguments: ()
[DEBUG] 开始语音识别:渠道1
[DEBUG] [新进程任务 开始:STT use large-v3-turbo]
[DEBUG] 任务参数:kwargs={'prompt': None, 'detect_language': 'auto', 'model_name': 'large-v3-turbo', 'logs_file': 'D:/win-pyvideotrans-v4.01-0602/tmp/13936/63ed126bbc/openai-auto-1781530736.2888234.log', 'is_cuda': True, 'no_speech_threshold': 0.6, 'condition_on_previous_text': False, 'speech_timestamps': None, 'audio_file': 'D:/win-pyvideotrans-v4.01-0602/tmp/13936/63ed126bbc/Mother Catches Shy Son Jerking Off -1781530568.9856992.wav', 'jianfan': False, 'audio_duration': 469345, 'temperature': '', 'compression_ratio_threshold': 2.4, 'max_speech_ms': 5000, 'device_index': 0}
[DEBUG] 新进程执行结束结束timeout=16

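AttributeError: 'NoneType' object has no attribute 'write' occurs because, when the first NaN exception was being logged, the log's output stream had already been closed or lost (usually due to a full disk, an abnormal program exit, or a permissions problem on the temp directory). This error does not change the root cause: the core failure is still openai-whisper outputting NaN under CUDA (insufficient VRAM or an abnormal CUDA environment).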
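
The secondary error can be reproduced in isolation. The sketch below forces a `logging.StreamHandler` to have no stream (setting `handler.stream = None` by hand is an assumption made purely for the demonstration; how the real handler lost its stream is not visible in the log) and captures what the logging module prints when the write fails. For what it is worth, Python documents that `sys.stderr` can be `None` in Windows GUI programs without a console, which is one more way a handler can end up in this state.

```python
import contextlib
import io
import logging


def reproduce_missing_stream():
    """Log through a handler whose stream is None and return logging's error report."""
    logger = logging.getLogger("demo.missing_stream")
    logger.propagate = False  # keep the demo away from the root logger
    handler = logging.StreamHandler(io.StringIO())
    handler.stream = None  # simulate a stream that was lost
    logger.addHandler(handler)
    report = io.StringIO()
    try:
        # logging reports handler failures on sys.stderr, so capture that
        with contextlib.redirect_stderr(report):
            logger.error("original error that we were trying to record")
    finally:
        logger.removeHandler(handler)
    return report.getvalue()


if __name__ == "__main__":
    print(reproduce_missing_stream())
```

The captured report has the same shape as the second traceback in the post: the logging failure is printed, followed by the call stack and the message that could not be written.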

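Next, please do these two things first:

  1. Fix the root cause of the NaN (in priority order):

    • Switch the speech recognition channel to faster-whisper (local) and choose large-v3-turbo or a smaller model (such as medium).
    • With CUDA acceleration checked on the main screen, go to Menu → Tools → Advanced Options → Speech Recognition Parameters, change the compute data type to float16, set both beam_size and best_of to 1, and turn off "Enable context awareness".
    • If NaN is still reported, open CMD and run nvcc -V to confirm the CUDA version is ≥ 12.8, and confirm that cuDNN ≥ 9.11 is installed (see the official documentation for installation steps).
    • Temporarily uncheck CUDA acceleration and run the same video on the CPU to confirm whether insufficient VRAM is the cause.
  2. Fix the log write failure

    • Check that the disk the software is installed on (drive D:) has enough free space (keeping ≥ 5 GB free is recommended).
    • Delete all temporary folders under tmp/ (D:/win-pyvideotrans-v4.01-0602/tmp/), then restart the software and run the task again.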
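
Both checks in item 2 can be done from a short script. This is a sketch under stated assumptions: the helper names are hypothetical, the 5 GB threshold is treated as 5 GiB, and deletion defaults to a dry run so nothing is removed until the list has been reviewed. Close the application before clearing its tmp directory.

```python
import shutil
from pathlib import Path


def free_gib(path):
    """Free space on the drive containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024**3


def has_enough_space(path, min_gib=5.0):
    """True if the drive has at least `min_gib` GiB free."""
    return free_gib(path) >= min_gib


def clear_temp_folders(tmp_dir, dry_run=True):
    """List the subfolders of `tmp_dir`; delete them unless `dry_run` is True."""
    root = Path(tmp_dir)
    if not root.is_dir():
        return []
    folders = sorted(p for p in root.iterdir() if p.is_dir())
    if not dry_run:
        for folder in folders:
            shutil.rmtree(folder, ignore_errors=True)
    return folders


if __name__ == "__main__":
    root = Path("D:/win-pyvideotrans-v4.01-0602")  # install path taken from the report
    if root.is_dir():
        print(f"free space: {free_gib(root):.1f} GiB, enough: {has_enough_space(root)}")
        for folder in clear_temp_folders(root / "tmp", dry_run=True):
            print("would delete:", folder)
```

Passing `dry_run=False` performs the actual cleanup.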

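If the same error still appears after these steps, please provide the output of nvidia-smi and nvcc -V, plus the last 30 lines of the latest log in the logs/ folder under the software directory.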
