Architecture Overview
The NVIDIA Container Toolkit is architected so that it can support any container runtime in the ecosystem. For Docker, the NVIDIA Container Toolkit comprises the following components (from top to bottom in the hierarchy):
- nvidia-docker2
- nvidia-container-runtime
- nvidia-container-toolkit
- libnvidia-container
The flow through these components proceeds from Docker at the top, through nvidia-container-runtime and the nvidia-container-toolkit prestart hook, down to libnvidia-container, which performs the actual GPU injection.
The packaging of the NVIDIA Container Toolkit also reflects these dependencies. If you start with the top-level nvidia-docker2 package for Docker, the dependency chain can be seen below.
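A rough sketch of that chain (package names as they appear in NVIDIA's repositories; libnvidia-container-tools and libnvidia-container1 are the packaged forms of libnvidia-container):

```
nvidia-docker2
└── nvidia-container-runtime
    └── nvidia-container-toolkit
        └── libnvidia-container-tools
            └── libnvidia-container1
```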
Let’s take a brief look at each of the components in the software hierarchy (and corresponding packages).
Components and Packages
libnvidia-container
This component provides a library and a simple CLI utility to automatically configure GNU/Linux containers leveraging NVIDIA GPUs. The implementation relies on kernel primitives and is designed to be agnostic of the container runtime.
libnvidia-container provides a well-defined API and a wrapper CLI (called nvidia-container-cli) that different runtimes can invoke to inject NVIDIA GPU support into their containers.
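For instance, the CLI can be invoked directly to inspect what it would inject (read-only subcommands shown here; this is just an illustration of the interface, not part of a normal installation workflow):

```sh
# Report the detected driver version and GPU devices
nvidia-container-cli info

# List the device nodes, libraries and binaries that would be
# injected into a container
nvidia-container-cli list
```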
nvidia-container-toolkit
This component includes a script that implements the interface required by a runC prestart hook. The script is invoked by runC after a container has been created, but before it has been started, and is given access to the config.json associated with the container. It takes information contained in the config.json and uses it to invoke the libnvidia-container CLI with an appropriate set of flags, the most important of which specify which GPU devices should be injected into the container.
Note that the previous name of this component was nvidia-container-runtime-hook . nvidia-container-runtime-hook is now simply a symlink to nvidia-container-toolkit on the system.
nvidia-container-runtime
This component used to be a complete fork of runC with NVIDIA-specific code injected into it. Since 2019, it has been a thin wrapper around the native runC installed on the host system: nvidia-container-runtime takes a runC spec as input, injects the nvidia-container-toolkit script as a prestart hook into it, and then calls out to the native runC, passing it the modified spec with that hook set. Note that this component is not necessarily specific to Docker (but it is specific to runC).
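The injected hook entry in the modified runC spec looks roughly like the following fragment (a sketch; the exact path and arguments depend on the installed version):

```json
{
  "hooks": {
    "prestart": [
      {
        "path": "/usr/bin/nvidia-container-toolkit",
        "args": ["nvidia-container-toolkit", "prestart"]
      }
    ]
  }
}
```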
When the package is installed, the Docker daemon.json is updated to point to the binary as can be seen below:
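For a default installation, the registered runtime entry in /etc/docker/daemon.json looks like this:

```json
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```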
nvidia-docker2
This package is the only Docker-specific package in the hierarchy. It registers the nvidia-container-runtime with Docker in the /etc/docker/daemon.json file, which allows you to run, for example, docker run --runtime=nvidia ... to automatically add GPU support to your containers. It also installs a wrapper script around the native docker CLI called nvidia-docker, which lets you invoke docker without needing to specify --runtime=nvidia every single time, and lets you set an environment variable on the host (NV_GPU) to specify which GPUs should be injected into a container.
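For example (the nvidia/cuda image and nvidia-smi are used purely for illustration):

```sh
# These two invocations are equivalent; the wrapper adds --runtime=nvidia
docker run --rm --runtime=nvidia nvidia/cuda nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi

# Restrict the container to the first GPU via NV_GPU
NV_GPU=0 nvidia-docker run --rm nvidia/cuda nvidia-smi
```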
Which package should I use then?
Given this hierarchy of components, it's easy to see that if you only install nvidia-container-toolkit, then nvidia-container-runtime will not be installed with it, and thus --runtime=nvidia will not be available to you. With Docker 19.03+ this is fine, because Docker directly invokes nvidia-container-toolkit when you pass it the --gpus option, instead of relying on nvidia-container-runtime as a proxy.
However, if you want to use Kubernetes with Docker 19.03+, you actually need to continue using nvidia-docker2, because Kubernetes doesn't yet support passing GPU information down to Docker through the --gpus flag. It still relies on nvidia-container-runtime to pass GPU information down the runtime stack via a set of environment variables.
The same container runtime stack is used regardless of whether nvidia-docker2 or nvidia-container-toolkit is installed. Using nvidia-docker2 adds a thin runtime that can proxy GPU information down to nvidia-container-toolkit via environment variables, instead of relying on the --gpus flag to have Docker do it directly.
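The two entry points into that stack can be compared side by side (illustrative commands; the untagged nvidia/cuda image is just an example):

```sh
# Docker 19.03+ native path: Docker invokes nvidia-container-toolkit directly
docker run --rm --gpus all nvidia/cuda nvidia-smi

# nvidia-docker2 path: the nvidia runtime proxies GPU information
# down the stack via environment variables
docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all nvidia/cuda nvidia-smi
```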
For purposes of simplicity (and backwards compatibility), it is recommended to continue using nvidia-docker2 as the top-level install package.
See the Installation Guide for more information on installing nvidia-docker2 on various Linux distributions.
Package Repository
The packages for the various components listed above are available in the gh-pages branch of the GitHub repos of these projects. This is particularly useful for air-gapped deployments that may want to get access to the actual packages ( .deb and .rpm ) to support offline installs.
The repositories for the individual components are:
- https://github.com/NVIDIA/libnvidia-container
- https://github.com/NVIDIA/nvidia-container-runtime
- https://github.com/NVIDIA/nvidia-docker
Releases of the software are also hosted on the experimental branch of the repository and are graduated to stable after testing and validation. To get access to the latest experimental features of the NVIDIA Container Toolkit, you may need to add the experimental branch to the apt or yum repository listing. The installation instructions include information on how to add these repository listings for the package manager.
NVIDIA Container Toolkit
What is the NVIDIA Container Toolkit and how can it be used to run Linux containers with full GPU acceleration?
- The NVIDIA Container Toolkit (formerly known as NVIDIA Docker) allows Linux containers to access full GPU acceleration.
- All graphics APIs are supported, including OpenGL, Vulkan, OpenCL, CUDA and NVENC/NVDEC.
- This only works with NVIDIA GPUs for Linux containers running on Linux host systems or inside WSL2.
Overview
The NVIDIA Container Toolkit (formerly known as NVIDIA Docker) is a library and accompanying set of tools for exposing NVIDIA graphics devices to Linux containers. It provides full GPU acceleration for containers running under Docker, containerd, LXC, Podman and Kubernetes. If you are interested in learning about the underlying architecture of the NVIDIA Container Toolkit then be sure to check out the Architecture Overview page of the official documentation.
Containers running with GPU acceleration have access to all supported graphics APIs on NVIDIA GPUs, including OpenGL, Vulkan, OpenCL, CUDA and NVENC/NVDEC. For details of what these APIs are used for, see the GPU acceleration in containers overview page.
The NVIDIA Container Toolkit is designed specifically for Linux containers running directly on Linux host systems or within Linux distributions under version 2 of the Windows Subsystem for Linux (WSL2). The underlying code does not support Windows containers, nor can it be used to run Linux containers on macOS or on Windows without WSL2, because in those environments containers run inside a Linux VM that does not have GPU access. However, Docker clients running under Windows and macOS can still connect to a Docker daemon running under Linux with the NVIDIA Container Toolkit.
For details of alternative options for other GPU vendors and operating systems, see the GPU acceleration in containers overview page.
Installation under Linux
As per the supported platforms list and prerequisites list from the NVIDIA Container Toolkit Installation Guide, you will need to ensure you have a supported Linux distribution and a supported NVIDIA GPU.
Install the NVIDIA binary GPU driver, ensuring you use a version that meets the minimum requirements for the CUDA version you intend to use or at least version 418.81.07 if you don’t intend to use CUDA.
Install the NVIDIA Container Toolkit by following the instructions for your specific Linux distribution.
If you would like to test out a specific graphics API, pull the relevant NVIDIA base container images from Docker Hub:
- nvidia/opengl for OpenGL support
- nvidia/cuda for CUDA support
- nvidia/cudagl for OpenGL + CUDA support
- nvidia/vulkan for OpenGL + Vulkan + CUDA support
- nvidia/opencl for OpenCL support
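As a quick smoke test, you can pull one of these images and run nvidia-smi inside it (the tag shown is only one of many published variants):

```sh
docker pull nvidia/cuda:11.0-base
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
```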
If you intend to use the Unreal Engine with a runtime container image then be sure to choose a base image that is pre-configured to support the NVIDIA Container Toolkit.
If you intend to use the Unreal Engine with a development container image then you will need to choose an image source that supports the NVIDIA Container Toolkit. Depending on the graphics APIs that you are interested in using, it may be necessary to build a development image from source that extends the relevant NVIDIA base image.
Installation under Windows with WSL2
The authors of this documentation are still in the process of familiarising themselves with the use of the NVIDIA Container Toolkit under WSL2. This section will be updated when the relevant information has been gathered.
Container image compatibility
Container images built on a system that has the NVIDIA Container Toolkit installed will be identical to container images built on a system without the NVIDIA Container Toolkit. This is because GPU acceleration is not enabled during the build process. The resulting container images can be run with GPU acceleration using the NVIDIA Container Toolkit or without GPU acceleration using any OCI-compatible container runtime.
Older versions of NVIDIA Docker allowed the Docker daemon to use NVIDIA Docker as the default container runtime, which enabled GPU acceleration during image builds and meant that any container images built by that Docker daemon could be rendered non-portable. For this reason, it was strongly recommended that you did not reconfigure the default container runtime on hosts that were used to build containers. This option is not present in newer versions of the NVIDIA Container Toolkit.
NVIDIA Container Toolkit
The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs.
Product documentation including an architecture overview, platform support, installation and usage guides can be found in the documentation repository.
Frequently asked questions are available on the wiki.
Make sure you have installed the NVIDIA driver and the Docker engine for your Linux distribution. Note that you do not need to install the CUDA Toolkit on the host system; only the NVIDIA driver needs to be installed.
For instructions on getting started with the NVIDIA Container Toolkit, refer to the installation guide.
The user guide provides information on the configuration and command line options available when running GPU containers with Docker.
Issues and Contributing
- Please let us know by filing a new issue
- You can contribute by opening a pull request
Installation Guide
Supported Platforms
The NVIDIA Container Toolkit is available on a variety of Linux distributions and supports different container engines.
Linux Distributions
Supported Linux distributions include:
- Amazon Linux 2017.09
- Amazon Linux 2018.03
- OpenSUSE/SLES 15.0
- OpenSUSE/SLES 15.x
- Debian Linux 10
Note that minor releases of RHEL 7 and RHEL 8 (7.4 through 7.9 and 8.0 through 8.3, respectively) are symlinked to centos7 and centos8.
Container Runtimes
Supported container runtimes are listed below:
- RHEL/CentOS 8: podman
- CentOS 8: Docker
- RHEL/CentOS 7: Docker
On Red Hat Enterprise Linux (RHEL) 8, Docker is no longer a supported container runtime. See Building, Running and Managing Containers for more information on the container tools available on the distribution.
Pre-Requisites
NVIDIA Drivers
Before you get started, make sure you have installed the NVIDIA driver for your Linux distribution. The recommended way to install drivers is to use the package manager for your distribution but other installer mechanisms are also available (e.g. by downloading .run installers from NVIDIA Driver Downloads).
For instructions on using your package manager to install drivers from the official CUDA network repository, follow the steps in this guide.
Platform Requirements
The list of prerequisites for running the NVIDIA Container Toolkit is described below:
- GNU/Linux x86_64 with kernel version > 3.10
- Docker >= 19.03 (recommended; some distributions may include older versions of Docker, the minimum supported version being 1.12)
- NVIDIA GPU with architecture >= Kepler (compute capability 3.0 or higher)
- NVIDIA Linux drivers >= 418.81.07 (older driver releases or branches are unsupported)
Your driver version might limit your CUDA capabilities: newer NVIDIA drivers are backwards-compatible with older CUDA Toolkit versions, but each new version of CUDA requires a minimum driver version. Running a CUDA container requires a machine with at least one CUDA-capable GPU and a driver compatible with the CUDA Toolkit version you are using. The machine running the CUDA container only requires the NVIDIA driver; the CUDA Toolkit does not have to be installed on the host. The CUDA release notes include a table of the minimum driver version required by each CUDA Toolkit version.
Docker
Getting Started
For installing Docker CE, follow the official instructions for your supported Linux distribution. For convenience, the documentation below includes instructions on installing Docker for various Linux distributions.
If you are migrating from nvidia-docker 1.0, then follow the instructions in the Migration from nvidia-docker 1.0 guide.
Installing on Ubuntu and Debian
The following steps can be used to set up the NVIDIA Container Toolkit on Ubuntu LTS (16.04, 18.04 and 20.04) and Debian (Stretch and Buster) distributions.
Setting up Docker
Docker-CE on Ubuntu can be set up using Docker's official convenience script:
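The script can be fetched and run as follows (as always with piped installers, review the script before executing it):

```sh
curl https://get.docker.com | sh \
  && sudo systemctl --now enable docker
```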
Follow the official instructions for more details and post-install actions.
Setting up NVIDIA Container Toolkit
Set up the stable repository and the GPG key:
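The installation guides of this vintage used commands along these lines (apt-key is deprecated on newer distributions, so treat this as a sketch of the stable-repository setup):

```sh
distribution=$(. /etc/os-release; echo $ID$VERSION_ID) \
  && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
  && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list \
     | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
```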
To get access to experimental features such as CUDA on WSL or the new MIG capability on A100, you may want to add the experimental branch to the repository listing:
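Assuming the $distribution variable from the previous step, the experimental listing can be added in the same way (again a sketch based on the repository layout of the time):

```sh
curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
```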
Install the nvidia-docker2 package (and dependencies) after updating the package listing:
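```sh
sudo apt-get update \
  && sudo apt-get install -y nvidia-docker2
```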
Restart the Docker daemon to complete the installation after setting the default runtime:
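```sh
sudo systemctl restart docker
```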
At this point, a working setup can be tested by running a base CUDA container:
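```sh
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
```

(The image tag here is illustrative; any CUDA base image compatible with your driver will do.)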
If everything is working, this produces the familiar nvidia-smi console output, listing the driver version and the GPU(s) visible inside the container.
Installing on CentOS 7/8
The following steps can be used to set up the NVIDIA Container Toolkit on CentOS 7/8.
Setting up Docker on CentOS 7/8
If you're on a cloud instance such as EC2, the official CentOS images may not include tools such as iptables, which are required for a successful Docker installation. Run a command like the one below to get a more functional VM before proceeding with the remaining steps outlined in this document.
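A minimal sketch, assuming the missing tool is iptables as the text suggests (the original guide's exact package list may have differed):

```sh
# Install iptables, which some minimal CentOS cloud images omit
sudo yum install -y iptables
```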