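Continuing from the previous post: we found that CUDA can in fact be used inside WSL2. That setup ran through Docker, though, and it was somewhat sluggish when loading resources.
When training models on Windows, we often run into encoding errors when loading a ckpt or other model file that was generated on Linux.
A quick stopgap is to load those files inside WSL2 instead.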
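If the checkpoint sits on a Windows drive, it has to be addressed by its Linux-side path before it can be loaded from WSL2. The sketch below assumes the default WSL2 drive mounts under /mnt (for example C: at /mnt/c); the function name `windows_to_wsl_path` and the `mount_root` parameter are made up for illustration. Inside WSL, the built-in `wslpath -u` command does the same conversion.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl_path(win_path: str, mount_root: str = "/mnt") -> str:
    """Map a drive-letter Windows path to its default WSL2 mount location.

    Assumes the default automount layout (C: -> /mnt/c); a custom
    /etc/wsl.conf automount root would need a different mount_root.
    """
    p = PureWindowsPath(win_path)
    drive = p.drive  # e.g. "C:"
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError(f"expected a drive-letter path, got {win_path!r}")
    # parts[0] is the anchor ("C:\\"); the rest are directory and file names.
    return str(PurePosixPath(mount_root, drive[0].lower(), *p.parts[1:]))
```

The returned string can then be passed to whatever loader is used inside the WSL2 environment.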
The previous post covered most of the setup for the CUDA driver and related pieces in WSL2. The main reference:
https://docs.nvidia.com/cuda/wsl-user-guide/index.html
The key step is installing the toolkit package, following the section "Using the WSL-Ubuntu Package":
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda-repo-wsl-ubuntu-11-4-local_11.4.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-4-local_11.4.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-wsl-ubuntu-11-4-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
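After the install, a quick sanity check can confirm that the pieces are where they are usually found. This is a rough sketch under two assumptions: that the Windows driver exposes libcuda under /usr/lib/wsl/lib, and that the toolkit is reachable through the usual /usr/local/cuda link. Both paths and the helper names are illustrative and may differ on a given machine.

```python
import os


def is_wsl(version_text: str) -> bool:
    """WSL kernels normally identify themselves with 'microsoft' in /proc/version."""
    return "microsoft" in version_text.lower()


def cuda_paths_present(root: str = "/") -> dict:
    """Report whether the assumed driver library and nvcc locations exist under root."""
    candidates = {
        "driver_lib": "usr/lib/wsl/lib/libcuda.so.1",  # assumed WSL2 driver location
        "nvcc": "usr/local/cuda/bin/nvcc",             # assumed toolkit location
    }
    return {name: os.path.exists(os.path.join(root, rel))
            for name, rel in candidates.items()}


if __name__ == "__main__":
    if os.path.exists("/proc/version"):
        with open("/proc/version") as fh:
            print("running under WSL:", is_wsl(fh.read()))
    print(cuda_paths_present())
```

A False for either entry only means the file is not at the assumed path, not necessarily that the install failed.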
conda create -n nlp_gputf2 python=3.8 -y
conda activate nlp_gputf2
conda install ipykernel
# bert4keras does not support newer TensorFlow versions
conda install tensorflow-gpu==2.2.0
pip install pandas
pip install matplotlib
pip install scikit-learn
pip install bert4keras
nvidia-smi
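The same check can be scripted, which is handy inside a notebook. The sketch below calls nvidia-smi with its `--query-gpu` and `--format=csv,noheader` options and turns the output into dictionaries; the helper names and the choice of fields are only an illustration.

```python
import shutil
import subprocess

QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_query(csv_text: str) -> list:
    """Parse 'csv,noheader' output from nvidia-smi into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) == len(QUERY_FIELDS):  # skip malformed lines
            rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def query_gpus() -> list:
    """Return GPU info, or an empty list if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(QUERY_FIELDS),
         "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return parse_gpu_query(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    print(query_gpus())
```

An empty list means the driver is not visible from inside WSL2, in which case TensorFlow will not see the GPU either.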
import tensorflow as tf
version = tf.__version__
gpu_ok = tf.test.is_gpu_available()
WARNING:tensorflow:From /tmp/ipykernel_5239/425579737.py:3: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version. Instructions for updating: Use `tf.config.list_physical_devices('GPU')` instead. 2022-02-04 23:33:16.130812: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA 2022-02-04 23:33:16.385570: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2304000000 Hz 2022-02-04 23:33:16.476626: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55d95c1d0a30 initialized for platform Host (this does not guarantee that XLA will be used). Devices: 2022-02-04 23:33:16.477265: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version 2022-02-04 23:33:16.526553: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1 2022-02-04 23:33:17.468088: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support. 
2022-02-04 23:33:17.468231: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 Laptop GPU computeCapability: 8.6 coreClock: 1.702GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s 2022-02-04 23:33:17.477580: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1 2022-02-04 23:33:17.543341: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10 2022-02-04 23:33:17.591854: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10 2022-02-04 23:33:17.604233: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10 2022-02-04 23:33:17.699987: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10 2022-02-04 23:33:17.719334: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10 2022-02-04 23:33:17.898389: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7 2022-02-04 23:33:17.899437: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support. 2022-02-04 23:33:17.900126: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support. 
2022-02-04 23:33:17.900167: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0 2022-02-04 23:33:17.900984: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1 2022-02-04 23:33:18.858296: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix: 2022-02-04 23:33:18.858385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0 2022-02-04 23:33:18.858434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N 2022-02-04 23:33:18.860154: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support. 2022-02-04 23:33:18.860241: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1330] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support. 2022-02-04 23:33:18.860856: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support. 2022-02-04 23:33:18.861517: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support. 2022-02-04 23:33:18.861655: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 4846 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6) 2022-02-04 23:33:18.899505: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55d95ba19b20 initialized for platform CUDA (this does not guarantee that XLA will be used). 
Devices: 2022-02-04 23:33:18.899580: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 3060 Laptop GPU, Compute Capability 8.6
tf.config.list_physical_devices('GPU')
2022-02-04 23:33:23.729200: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support. 2022-02-04 23:33:23.729447: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 Laptop GPU computeCapability: 8.6 coreClock: 1.702GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s 2022-02-04 23:33:23.729685: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1 2022-02-04 23:33:23.729717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10 2022-02-04 23:33:23.729732: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10 2022-02-04 23:33:23.729744: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10 2022-02-04 23:33:23.729756: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10 2022-02-04 23:33:23.729766: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10 2022-02-04 23:33:23.729780: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7 2022-02-04 23:33:23.731680: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support. 2022-02-04 23:33:23.733292: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support. 
2022-02-04 23:33:23.733402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
print("tf version:",version,"\nuse GPU",gpu_ok)
tf version: 2.2.0
use GPU True
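The log is noisy, but it does record which CUDA libraries TensorFlow actually opened. A small sketch for pulling that out of captured log text, assuming the "Successfully opened dynamic library" wording seen in the output above (the function name is made up here):

```python
import re

# Matches e.g. "Successfully opened dynamic library libcudart.so.10.1"
_LIB_RE = re.compile(
    r"Successfully opened dynamic library (lib\w+)\.so\.(\d+(?:\.\d+)*)"
)


def loaded_cuda_libraries(log_text: str) -> dict:
    """Return {library name: version suffix} for each library reported as opened."""
    return {name: version for name, version in _LIB_RE.findall(log_text)}
```

Run against the log in this post, it reports libcudart at 10.1 and libcudnn at 7.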
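You might think that is the end of it, but it is not. GPU support in TensorFlow and Keras is tied to specific minor versions, and each of those corresponds to a specific CUDA version. The libcudart.so.10.1 lines in the log above are an example: TensorFlow 2.2.0 runs against CUDA 10.1, even though CUDA 11.4 was installed through apt.
Archived CUDA releases can be downloaded from the first link below; the other two show which TensorFlow and PyTorch builds go with which CUDA version: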
https://developer.nvidia.com/cuda-toolkit-archive
https://tensorflow.google.cn/install/source_windows
https://pytorch.org/get-started/previous-versions/
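To make the version coupling concrete, here is a minimal lookup sketch. The table is a partial transcription of the tested GPU configurations listed in the TensorFlow install documentation, written from memory, so treat the values as an assumption and verify them against the link above. The function names are illustrative.

```python
# TensorFlow minor version -> CUDA / cuDNN it was tested against.
# Partial and assumed; check the official tested-configurations table.
TESTED_CONFIGS = {
    "2.2": {"cuda": "10.1", "cudnn": "7.6"},
    "2.3": {"cuda": "10.1", "cudnn": "7.6"},
    "2.4": {"cuda": "11.0", "cudnn": "8.0"},
    "2.5": {"cuda": "11.2", "cudnn": "8.1"},
    "2.6": {"cuda": "11.2", "cudnn": "8.1"},
}


def required_cuda(tf_version: str):
    """Look up the tested CUDA/cuDNN pair for a version such as '2.2.0'."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return TESTED_CONFIGS.get(major_minor)  # None if not in the table


def is_compatible(tf_version: str, cudart_version: str) -> bool:
    """True if the CUDA runtime version matches the tested one exactly."""
    config = required_cuda(tf_version)
    return config is not None and config["cuda"] == cudart_version
```

For the environment in this post, `is_compatible("2.2.0", "10.1")` holds, while pairing 2.2.0 with the 11.4 toolkit does not match the table.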
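NVIDIA has, somewhat surprisingly, also added a separate set of developer documentation for Docker:
nvidia-docker 2.0 feels like just one more layer added on top.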