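Beginner's Notes: A Continuously Updated Collection of TF Issues

All the warning messages I ran into are collected below, with the fix recorded for each one.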

2022-10-29 00:06:14.626254: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

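This is a notice about CPU compute optimizations. It can simply be ignored, or hidden manually (references: code for hiding the messages / GPU optimization).
The code used here filters out part of the log output. The number sets which levels are hidden: "0" shows everything, "1" hides INFO, "2" hides INFO and WARNING, and "3" hides INFO, WARNING and ERROR.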

import os
# Set this before importing tensorflow; '2' hides INFO and WARNING messages.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

2022-10-29 00:06:14.812691: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.

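This says that floating-point rounding can cause slightly different numerical results. To turn it off, set the environment variable named in the message (TF_ENABLE_ONEDNN_OPTS=0).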
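A minimal sketch of setting that variable from Python. The variable name and value come from the log message itself. The helper name disable_onednn_opts is made up for illustration, and setting the variable before importing TensorFlow is an assumption made to be safe.

```python
import os

def disable_onednn_opts():
    """Turn off oneDNN custom ops for this process (hypothetical helper name)."""
    # Name and value are taken from the log message above.
    os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
    return os.environ['TF_ENABLE_ONEDNN_OPTS']

disable_onednn_opts()
# import tensorflow as tf  # assumption: import only after the variable is set
```

The same thing can be done from the shell by exporting the variable before starting Python.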

2022-10-29 00:06:14.855444: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered

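This seems to be related to installing TensorFlow with conda versus pip.
Reportedly the error does not appear when TensorFlow is installed with conda; see the reference article for a detailed explanation.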

2022-10-29 00:06:15.457292: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64

2022-10-29 00:06:15.457380: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64

2022-10-29 00:06:15.457390: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

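Same as the previous item: these all appear to come from the difference between conda and pip installs.
I ran conda install tensorflow, which installed version 2.9.1 by default, and it works.
After testing, all of the problems above disappeared.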
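To confirm which version actually ended up in the environment after reinstalling, a small check like the following may help. It is only a sketch: installed_version is a hypothetical helper, and it assumes the package is registered under the distribution name tensorflow, which may not hold for every build.

```python
from importlib import metadata

def installed_version(package="tensorflow"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment.
        return None

print("tensorflow:", installed_version())
```

Running it inside the activated conda environment shows whether the interpreter sees the conda-installed package rather than an older pip copy.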

WARNING:root:The given value for groups will be overwritten.

I left this one alone.

2022-10-29 00:06:17.723176: F tensorflow/core/platform/statusor.cc:33] Attempting to fetch value instead of handling error INTERNAL: failed initializing StreamExecutor for CUDA device ordinal 1: INTERNAL: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY: out of memory; total memory reported: 50962300928
Aborted (core dumped)

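The GPU memory is taken up by other programs. You can use nvidia-smi to see which processes are using it. In principle you could terminate one with kill -9 xxx (where xxx is the PID), but this is a shared lab server, so it is better to just wait.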
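The nvidia-smi check can also be scripted. The sketch below calls nvidia-smi with its CSV query options and reports how much memory is used and free on each GPU. The function names are made up for illustration, and the parser assumes every field is a plain integer in MiB, so it would need adjusting if the driver reports something else.

```python
import subprocess

QUERY = ["nvidia-smi",
         "--query-gpu=index,memory.used,memory.total",
         "--format=csv,noheader,nounits"]

def parse_gpu_memory(text):
    """Parse 'index, used, total' CSV rows (MiB) into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        index, used, total = (int(field) for field in line.split(","))
        gpus.append({"index": index, "used_mib": used, "free_mib": total - used})
    return gpus

def query_gpus():
    """Run nvidia-smi and return per-GPU memory usage; [] if it is unavailable."""
    try:
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_gpu_memory(out)

for gpu in query_gpus():
    print(gpu)
```

If one of the GPUs turns out to be mostly free, a common workaround (not something tried in these notes) is to expose only that device through the CUDA_VISIBLE_DEVICES environment variable before importing TensorFlow.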

2022-10-29 00:41:20.803105: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
