Accessing NVIDIA GPUs from Docker 19.03
Published: May 25, 2020
Author: 张首富 | Date: 2019-09-06 | WeChat (wx): y18163201
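Preface
Docker 19.03 was officially released in July 2019. For me, this release has two highlights:
1. Docker no longer needs root privileges to start and run (rootless mode, which is still experimental in this release).
2. GPU support has been enhanced: to access an NVIDIA GPU from inside Docker, we no longer need to install nvidia-docker separately.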
Install the NVIDIA driver
Confirm that the NVIDIA card has been detected:
$ lspci -vv | grep -i nvidia
00:04.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] (rev a1)
        Subsystem: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB]
        Kernel modules: nvidiafb
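Installing the driver itself is not covered in detail here. If you are not sure how to do it, see my article "ubuntu离线安装TTS服务" (installing a TTS service offline on Ubuntu).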
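To make the detection step scriptable, here is a minimal sketch. The helper names `has_nvidia_device` and `nvidia_driver_bound` are my own, and the second one assumes that `lspci -vv` prints a `Kernel driver in use: nvidia` line once the proprietary driver is bound to the card, which may differ on your system.

```shell
# Hypothetical helper: succeeds if the lspci-style text on stdin
# mentions an NVIDIA device at all.
has_nvidia_device() {
    grep -qi 'nvidia'
}

# Hypothetical helper: succeeds if the text on stdin shows the nvidia
# kernel driver bound to a device (assumed wording of the lspci line).
nvidia_driver_bound() {
    grep -q 'Kernel driver in use: nvidia'
}

# Usage on a real host:
#   lspci -vv | has_nvidia_device   && echo "NVIDIA card detected"
#   lspci -vv | nvidia_driver_bound || echo "driver not loaded yet"
```

With the listing shown above, the first check would pass and the second would not, since only the `nvidiafb` module is listed there.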
Install the NVIDIA Container Runtime
$ cat nvidia-container-runtime-script.sh
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
Run the script:
$ sh nvidia-container-runtime-script.sh
OK
deb https://nvidia.github.io/libnvidia-container/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/ubuntu18.04/$(ARCH) /
Hit:1 http://archive.canonical.com/ubuntu bionic InRelease
Get:2 https://nvidia.github.io/libnvidia-container/ubuntu18.04/amd64 InRelease [1139 B]
Get:3 https://nvidia.github.io/nvidia-container-runtime/ubuntu18.04/amd64 InRelease [1136 B]
Hit:4 http://security.ubuntu.com/ubuntu bionic-security InRelease
Get:5 https://nvidia.github.io/libnvidia-container/ubuntu18.04/amd64 Packages [4076 B]
Get:6 https://nvidia.github.io/nvidia-container-runtime/ubuntu18.04/amd64 Packages [3084 B]
Hit:7 http://us-east4-c.gce.clouds.archive.ubuntu.com/ubuntu bionic InRelease
Hit:8 http://us-east4-c.gce.clouds.archive.ubuntu.com/ubuntu bionic-updates InRelease
Hit:9 http://us-east4-c.gce.clouds.archive.ubuntu.com/ubuntu bionic-backports InRelease
Fetched 9435 B in 1s (17.8 kB/s)
Reading package lists... Done
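The `ubuntu18.04` part of the repository URLs comes from the `distribution` variable in the script, which sources /etc/os-release and joins `$ID` and `$VERSION_ID`. A minimal sketch of that step in isolation follows. The function name `distribution_of` and its optional file argument are my own additions for illustration.

```shell
# Hypothetical helper: print "$ID$VERSION_ID" from an os-release style file.
# The body runs in a subshell, so the sourced variables do not leak
# into the calling shell.
distribution_of() (
    . "${1:-/etc/os-release}"
    echo "${ID}${VERSION_ID}"
)

# Usage on a real host:
#   distribution=$(distribution_of)
#   echo "repo list: https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list"
```

Printing the value before adding the repository is a cheap way to catch a distribution for which no repository list is published.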
$ apt-get install nvidia-container-runtime
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  grub-pc-bin libnuma1
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
Get:1 https://nvidia.github.io/libnvidia-container/ubuntu18.04/amd64 libnvidia-container1 1.0.2-1 [59.1 kB]
Get:2 https://nvidia.github.io/libnvidia-container/ubuntu18.04/amd64 libnvidia-container-tools 1.0.2-1 [15.4 kB]
Get:3 https://nvidia.github.io/nvidia-container-runtime/ubuntu18.04/amd64 nvidia-container-runtime-hook 1.4.0-1 [575 kB]
...
Unpacking nvidia-container-runtime (2.0.0+docker18.09.6-3) ...
Setting up libnvidia-container1:amd64 (1.0.2-1) ...
Setting up libnvidia-container-tools (1.0.2-1) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...
Setting up nvidia-container-runtime-hook (1.4.0-1) ...
Setting up nvidia-container-runtime (2.0.0+docker18.09.6-3) ...
$ which nvidia-container-runtime-hook
/usr/bin/nvidia-container-runtime-hook
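The `which` check above can be generalised to several binaries at once. This is only a sketch. `missing_commands` is a name I made up, and which commands you pass to it depends on your setup.

```shell
# Hypothetical helper: print every given command that is NOT found on PATH.
# Prints nothing when all of them are available.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Usage on a real host:
#   missing=$(missing_commands nvidia-container-runtime-hook docker)
#   [ -z "$missing" ] || echo "please install: $missing"
```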
Install docker-19.03
# Step 1: install the required system utilities
yum install -y yum-utils device-mapper-persistent-data lvm2
# Step 2: add the package repository (Aliyun mirror of the docker-ce CentOS repo)
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Step 3: refresh the package cache and install Docker CE
yum makecache fast
yum -y install docker-ce-19.03.2
# Step 4: start the Docker service and enable it at boot
systemctl start docker && systemctl enable docker
Verify that the expected Docker version is installed:
$ docker version
Client: Docker Engine - Community
 Version:           19.03.2
 API version:       1.40
 Go version:        go1.12.8
 Git commit:        6a30dfc
 Built:             Thu Aug 29 05:28:55 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.2
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.8
  Git commit:       6a30dfc
  Built:            Thu Aug 29 05:27:34 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
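Since the GPU support discussed here arrived with 19.03, a script may want to check the version instead of reading it by eye. Below is a minimal sketch. `version_at_least` is a hypothetical helper, and it assumes GNU `sort` with the `-V` (version sort) option is available.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal
# to version $2, using GNU sort's version ordering (assumption: sort -V).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a real host:
#   server=$(docker version --format '{{.Server.Version}}')
#   version_at_least "$server" 19.03 || echo "Docker $server is older than 19.03"
```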
Verify the `--gpus` option
$ docker run --help | grep -i gpus
      --gpus gpu-request               GPU devices to add to the container ('all' to pass all GPUs)
Run an Ubuntu container that uses the GPU
$ docker run -it --rm --gpus all ubuntu nvidia-smi
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
f476d66f5408: Pull complete
8882c27f669e: Pull complete
d9af21273955: Pull complete
f5029279ec12: Pull complete
Digest: sha256:d26d529daa4d8567167181d9d569f2a85da3c5ecaf539cace2c6223355d69981
Status: Downloaded newer image for ubuntu:latest
Tue May  7 15:52:15 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.116                Driver Version: 390.116                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   39C    P0    22W /  75W |      0MiB /  7611MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
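Troubleshooting
Do you see the following error message?
$ docker run -it --rm --gpus all debian
docker: Error response from daemon: linux runtime spec devices: could not select device driver "" with capabilities: [[gpu]].
The error above means that NVIDIA could not register itself with Docker. In practice, it means the driver is not correctly installed on the host. It can also mean that the NVIDIA container tools were installed without restarting the Docker daemon afterwards, in which case you need to restart the Docker daemon.
I suggest going back to verify that nvidia-container-runtime is installed, or restarting the Docker daemon.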
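The two checks suggested above can be strung together roughly as follows. Treat this as a sketch, not a definitive fix. `is_gpu_driver_error` is a hypothetical helper that only matches the error text quoted above, and the commented usage assumes a systemd host where `systemctl restart docker` applies.

```shell
# Hypothetical helper: succeeds if the text on stdin contains the
# "could not select device driver" error quoted above.
is_gpu_driver_error() {
    grep -q 'could not select device driver'
}

# Usage on a real host (assumes systemd manages the Docker daemon):
#   if docker run --rm --gpus all debian true 2>&1 | is_gpu_driver_error; then
#       command -v nvidia-container-runtime-hook >/dev/null \
#           || echo "nvidia-container-runtime does not appear to be installed"
#       sudo systemctl restart docker
#   fi
```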
List GPU devices
$ docker run -it --rm --gpus all ubuntu nvidia-smi -L
GPU 0: Tesla P4 (UUID: GPU-fa974b1d-3c17-ed92-28d0-805c6d089601)
$ docker run -it --rm --gpus all ubuntu nvidia-smi --query-gpu=index,name,uuid,serial --format=csv
index, name, uuid, serial
0, Tesla P4, GPU-fa974b1d-3c17-ed92-28d0-805c6d089601, 0325017070224
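The CSV form is convenient when a script needs to pick one GPU. Below is a minimal sketch that pulls the UUID for a given index out of that output. `gpu_uuid_for_index` is a hypothetical helper, and it assumes the column order `index, name, uuid, serial` used in the query above, with a header line present.

```shell
# Hypothetical helper: read the CSV shown above on stdin and print the
# uuid column of the row whose index equals $1. The header line is skipped.
gpu_uuid_for_index() {
    awk -F', ' -v idx="$1" 'NR > 1 && $1 == idx { print $3 }'
}

# Usage on a real host: expose only GPU 0 to a container by its UUID.
#   uuid=$(docker run --rm --gpus all ubuntu \
#            nvidia-smi --query-gpu=index,name,uuid,serial --format=csv \
#          | gpu_uuid_for_index 0)
#   docker run -it --rm --gpus "device=$uuid" ubuntu nvidia-smi -L
```

Selecting by UUID instead of by index keeps the choice stable if device numbering changes between reboots.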
Copyright notice: This is an original article by InfoQ author 【首富手记】.
Original link: http://xie.infoq.cn/article/cef8db29f075006002f9aaadf. Please contact the author before reposting.