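Last updated: May 8, 2024 (afternoon)
NVIDIA Docker (nvidia-docker) is a wrapper around Docker that lets containers use NVIDIA GPUs. It requires both the NVIDIA GPU driver and Docker to be installed on the host; once nvidia-docker is configured, Docker containers can use the GPU. This post records how to install and use it.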
Environment
- Ubuntu 16.04, 64-bit
- GPU driver 450.80.02
- CUDA version 11.0
- Docker version 19.03.4
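
Before installing, it can help to confirm that the two prerequisites listed above are actually present. The following is a minimal sketch, not part of the original post: `have_cmd` and `report_env` are hypothetical helper names, and the script only reports what it finds rather than installing anything.

```shell
#!/bin/sh
# Sketch: report the host versions this walkthrough depends on.

# have_cmd (hypothetical helper): succeeds if the command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# report_env (hypothetical helper): print driver and Docker versions,
# or a hint when a prerequisite is missing.
report_env() {
    if have_cmd nvidia-smi; then
        # Driver version as reported by the host driver.
        nvidia-smi --query-gpu=driver_version --format=csv,noheader \
            || echo "nvidia-smi is installed but could not query the GPU"
    else
        echo "nvidia-smi not found: install the NVIDIA driver first"
    fi
    if have_cmd docker; then
        docker --version
    else
        echo "docker not found: install Docker first"
    fi
}

report_env
```

If either line reports a missing tool, fix that before continuing with the installation steps.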
Installation
- Add the nvidia-docker GPG key and apt repository

    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
      sudo apt-key add -
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
      sudo tee /etc/apt/sources.list.d/nvidia-docker.list

These commands may no longer work. If they fail, try the following variant, which sets the distribution variable first and chains the steps with `&&`:

    distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
       && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
       && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

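
The `distribution=$(...)` line in the commands above sources `/etc/os-release` inside a command substitution and concatenates two of its variables, which yields the directory name used in the repository URL. Here is a small sketch of that mechanism; `distribution_from` is a hypothetical helper name, and the sample file contents are made up for illustration.

```shell
#!/bin/sh
# Sketch: how the distribution string is derived from an os-release file.

# distribution_from (hypothetical helper): print ID followed by VERSION_ID.
distribution_from() {
    # The parentheses start a subshell, so ID and VERSION_ID
    # do not leak into the calling shell.
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# A stand-in for /etc/os-release with illustrative values.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
NAME="Ubuntu"
ID=ubuntu
VERSION_ID="16.04"
EOF

distribution_from "$sample"    # prints ubuntu16.04
rm -f "$sample"
```

On a real host, `distribution_from /etc/os-release` should print the same value that the install commands store in `$distribution`.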
- Install nvidia-docker2, then reload the Docker configuration

    sudo apt-get update
    sudo apt-get install -y nvidia-docker2

- If the installation fails, try adding the `--fix-missing` option

    sudo apt-get install -y --fix-missing nvidia-docker2

- Restart the Docker daemon so that the new configuration takes effect

    sudo systemctl restart docker

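
After the restart, one way to sanity-check the configuration is to look for the `nvidia` runtime entry in Docker's daemon configuration. This is a rough sketch under an assumption the original post does not state: that the nvidia-docker2 package registers the runtime in `/etc/docker/daemon.json`. `has_nvidia_runtime` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check whether a daemon.json file mentions an "nvidia" runtime.

# has_nvidia_runtime (hypothetical helper). The default path is an
# assumption about where the package writes its configuration.
has_nvidia_runtime() {
    config="${1:-/etc/docker/daemon.json}"
    [ -r "$config" ] && grep -q '"nvidia"' "$config"
}

if has_nvidia_runtime; then
    echo "nvidia runtime is registered"
else
    echo "nvidia runtime not found in daemon.json"
fi
```

The check is a plain text match rather than real JSON parsing, so treat a positive result as a hint, not proof.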
- Test the installation by running `nvidia-smi` inside a CUDA container

    docker run --runtime=nvidia --rm nvidia/cuda:11.0.3-base-${DIST} nvidia-smi

Here `DIST` is one of `ubuntu20.04`, `ubuntu18.04`, or `centos7`.
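
Since `DIST` must be one of a few fixed values, a small wrapper can reject anything else before Docker tries to pull a tag that does not exist. This is only a sketch: `cuda_image_tag` is a hypothetical helper, and the accepted values are just the three named in this post.

```shell
#!/bin/sh
# Sketch: build the test image tag from DIST, rejecting unknown values.

# cuda_image_tag (hypothetical helper): print the full image tag.
cuda_image_tag() {
    case "$1" in
        ubuntu20.04|ubuntu18.04|centos7)
            printf 'nvidia/cuda:11.0.3-base-%s\n' "$1"
            ;;
        *)
            echo "unsupported DIST: $1" >&2
            return 1
            ;;
    esac
}

cuda_image_tag ubuntu20.04    # prints nvidia/cuda:11.0.3-base-ubuntu20.04
```

The result can be passed straight to Docker, for example `docker run --runtime=nvidia --rm "$(cuda_image_tag ubuntu18.04)" nvidia-smi`.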
The `nvidia/cuda:11.0-base` image has been removed, so use a full tag such as:

    docker run --runtime=nvidia --rm nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

The GPU information is then printed, which shows that nvidia-docker created the container and ran the `nvidia-smi` command inside it correctly.

    $ sudo docker run --runtime=nvidia --rm nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
    Unable to find image 'nvidia/cuda:11.0.3-base-ubuntu20.04' locally
    11.0.3-base-ubuntu20.04: Pulling from nvidia/cuda
    96d54c3075c9: Pull complete
    59f6381879f6: Pull complete
    655ed0df26cf: Pull complete
    848b95ad96b5: Pull complete
    e43c2058e496: Pull complete
    Digest: sha256:c8269d6967e10940c368ea24fb8086cb21471cb8fefc66861d72f74f0c67e904
    Status: Downloaded newer image for nvidia/cuda:11.0.3-base-ubuntu20.04
    Tue Dec  5 08:33:54 2023
    +---------------------------------------------------------------------------------------+
    | NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
    |-----------------------------------------+----------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
    |                                         |                      |               MIG M. |
    |=========================================+======================+======================|
    |   0  NVIDIA GeForce RTX 3080        Off | 00000000:01:00.0  On |                  N/A |
    | 30%   27C    P8              20W / 320W |    503MiB / 10240MiB |      0%      Default |
    |                                         |                      |                  N/A |
    +-----------------------------------------+----------------------+----------------------+

    +---------------------------------------------------------------------------------------+
    | Processes:                                                                            |
    |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
    |        ID   ID                                                             Usage      |
    |=======================================================================================|
    +---------------------------------------------------------------------------------------+

- Check the installed nvidia-docker2 package

    $ sudo apt show nvidia-docker2
    Package: nvidia-docker2
    Version: 2.5.0-1
    Priority: optional
    Section: utils
    Maintainer: NVIDIA CORPORATION <cudatools@nvidia.com>
    Installed-Size: 27.6 kB
    Depends: nvidia-container-runtime (>= 3.4.0), docker-ce (>= 18.06.0~ce~3-0~ubuntu) | docker-ee (>= 18.06.0~ce~3-0~ubuntu) | docker.io (>= 18.06.0)
    Breaks: nvidia-docker (<< 2.0.0)
    Replaces: nvidia-docker (<< 2.0.0)
    Homepage: https://github.com/NVIDIA/nvidia-docker/wiki
    Download-Size: 5,840 B
    APT-Manual-Installed: yes
    APT-Sources: https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 Packages
    Description: nvidia-docker CLI wrapper
     Replaces nvidia-docker with a new implementation based on
     nvidia-container-runtime

    N: There are 50 additional records. Please use the '-a' switch to see them.

If similar information appears, the installation succeeded.
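
When this check is scripted, only the `Version` field usually matters. The sketch below pulls that field out of `apt show`-style text read from standard input; `package_version` is a hypothetical helper name, and the sample input is a shortened copy of the listing in this post.

```shell
#!/bin/sh
# Sketch: extract the Version field from `apt show`-style output.

# package_version (hypothetical helper): read "Key: value" lines on
# stdin and print the value of the first "Version" line.
package_version() {
    awk -F': ' '$1 == "Version" { print $2; exit }'
}

# Shortened sample input taken from the listing in this post.
printf 'Package: nvidia-docker2\nVersion: 2.5.0-1\nPriority: optional\n' \
    | package_version    # prints 2.5.0-1
```

On a host with the package installed, `apt show nvidia-docker2 2>/dev/null | package_version` would feed the real output through the same filter.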
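Using NVIDIA Docker images
- An image that was created under a different driver version may fail to find the `nvidia-smi` command when it runs under nvidia-docker with the new driver
- To use the GPU in the new Docker setup, I took a workaround and used the test image as the starting image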

    nvidia-docker run -it --name first_container nvidia/cuda:11.0.3-base-${DIST} /bin/bash

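Here `DIST` is one of `ubuntu20.04`, `ubuntu18.04`, or `centos7`.
This turns the test image into a container that can be entered and modified at any time. Then save an image from that container: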

    docker commit -m "nvidia docker image init" first_container my_image:1.0

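
The create-then-commit workflow can be collected into one script. The sketch below defaults to a dry run that only prints the commands, so it is safe to try on a machine without Docker. The `run` and `snapshot_workflow` helpers, the environment variable names, and the final verification step are illustrative assumptions, not part of the original post.

```shell
#!/bin/sh
# Sketch: the container-to-image workflow as one script.
# DRY_RUN defaults to 1, so the commands are printed, not executed.
DIST="${DIST:-ubuntu20.04}"
CONTAINER="${CONTAINER:-first_container}"
TARGET="${TARGET:-my_image:1.0}"
DRY_RUN="${DRY_RUN:-1}"

# run (hypothetical helper): print the command in dry-run mode,
# otherwise execute it. Printed output does not preserve quoting.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

snapshot_workflow() {
    # 1. Start an interactive container from the test image.
    run nvidia-docker run -it --name "$CONTAINER" "nvidia/cuda:11.0.3-base-$DIST" /bin/bash
    # 2. After leaving the container, save it as a new image.
    run docker commit -m "nvidia docker image init" "$CONTAINER" "$TARGET"
    # 3. Assumed extra step: confirm the new image still sees the GPU.
    run docker run --runtime=nvidia --rm "$TARGET" nvidia-smi
}

snapshot_workflow
```

With `DRY_RUN=0`, the first command opens an interactive shell, and the commit runs only after that shell exits.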
References
Article link:
https://www.zywvvd.com/notes/tools/docker/nvidia-docker-install/