- Post-installation steps for Linux
- Manage Docker as a non-root user
- Configure Docker to start on boot
- Use a different storage engine
- Configure default logging driver
- Configure where the Docker daemon listens for connections
- Configuring remote access with systemd unit file
- Users in Docker
- Creating a user
- Running processes as a user
- Mounting volumes
- Assigning a UID and GID to a user
- Passing the user ID into the container at image build time
- Adding a user to the docker group
- Docker run reference
- General form
- Operator exclusive options
- Detached vs foreground
- Detached (-d)
- Foreground
- Container identification
- Name (--name)
- PID equivalent
- Image[:tag]
- Image[@digest]
- PID settings (--pid)
- Example: run htop inside a container
- Example
- UTS settings (--uts)
- IPC settings (--ipc)
- Network settings
- Network: none
- Network: bridge
- Network: host
- Network: container
- User-defined network
- Managing /etc/hosts
- Restart policies (--restart)
- Examples
- Exit Status
- Clean up (--rm)
- Security configuration
- Specify an init process
- Specify custom cgroups
- Runtime constraints on resources
- User memory constraints
- Kernel memory constraints
- Swappiness constraint
- CPU share constraint
- CPU period constraint
- Cpuset constraint
- CPU quota constraint
- Block IO bandwidth (Blkio) constraint
- Additional groups
- Runtime privilege and Linux capabilities
- Logging drivers (--log-driver)
- Overriding Dockerfile image defaults
- CMD (default command or options)
- ENTRYPOINT (default command to execute at runtime)
- EXPOSE (incoming ports)
- ENV (environment variables)
Post-installation steps for Linux
This section contains optional procedures for configuring Linux hosts to work better with Docker.
Manage Docker as a non-root user
The Docker daemon binds to a Unix socket instead of a TCP port. By default that Unix socket is owned by the user root and other users can only access it using sudo. The Docker daemon always runs as the root user.
If you don’t want to preface the docker command with sudo, create a Unix group called docker and add users to it. When the Docker daemon starts, it creates a Unix socket accessible by members of the docker group.
The docker group grants privileges equivalent to the root user. For details on how this impacts security in your system, see Docker Daemon Attack Surface.
To create the docker group and add your user:
1. Create the docker group.
2. Add your user to the docker group.
3. Log out and log back in so that your group membership is re-evaluated.
   If testing on a virtual machine, it may be necessary to restart the virtual machine for changes to take effect.
   On a desktop Linux environment such as X Windows, log out of your session completely and then log back in.
   On Linux, you can also run the following command to activate the changes to groups:
4. Verify that you can run docker commands without sudo.
This command downloads a test image and runs it in a container. When the container runs, it prints a message and exits.
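The group setup steps above can be sketched as a small shell function. The function name add_user_to_docker_group is a hypothetical helper introduced here for illustration; the sketch assumes a host where sudo, groupadd, and usermod are available.

```shell
# Hypothetical helper: puts a user into the docker group.
# Assumes sudo rights; defaults to the current user when no name is given.
add_user_to_docker_group() {
  target_user="${1:-$(id -un)}"
  # -f: succeed even if the group already exists
  sudo groupadd -f docker
  # -aG: append to supplementary groups without dropping existing ones
  sudo usermod -aG docker "$target_user"
}

# Usage, followed by a fresh login (or `newgrp docker`) and a check:
#   add_user_to_docker_group
#   docker run hello-world
```

The membership change only applies to new login sessions, which is why the verification step comes after logging back in.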
If you initially ran Docker CLI commands using sudo before adding your user to the docker group, you may see a permission-denied error when the CLI loads its config file, which indicates that your ~/.docker/ directory was created with incorrect permissions due to the sudo commands.
To fix this problem, either remove the ~/.docker/ directory (it is recreated automatically, but any custom settings are lost), or change its ownership and permissions using the following commands:
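One way to hand the directory back to your own user is sketched below. The function name fix_docker_config_perms is hypothetical, and the sketch assumes the directory lives at $HOME/.docker and that you have sudo rights.

```shell
# Hypothetical helper: gives ~/.docker back to the current user and group.
fix_docker_config_perms() {
  owner="$(id -un)"
  group="$(id -gn)"
  # Recursively reset ownership that earlier sudo runs left as root
  sudo chown -R "$owner:$group" "$HOME/.docker"
  # Let the owning group read, write, and traverse the directory
  sudo chmod -R g+rwx "$HOME/.docker"
}
```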
Configure Docker to start on boot
Most current Linux distributions (RHEL, CentOS, Fedora, Debian, Ubuntu 16.04 and higher) use systemd to manage which services start when the system boots. On Debian and Ubuntu, the Docker service is configured to start on boot by default. To automatically start Docker and Containerd on boot for other distros, use the commands below:
To disable this behavior, use disable instead.
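The enable and disable commands can be sketched as one function that takes the action as an argument. The function name docker_on_boot is a hypothetical helper; the unit names docker.service and containerd.service are the ones a standard package installation is assumed to provide.

```shell
# Hypothetical helper: "enable" starts both services on boot, "disable" undoes it.
docker_on_boot() {
  action="${1:-enable}"
  sudo systemctl "$action" docker.service
  sudo systemctl "$action" containerd.service
}

# Usage:
#   docker_on_boot enable
#   docker_on_boot disable
```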
If you need to add an HTTP Proxy, set a different directory or partition for the Docker runtime files, or make other customizations, see customize your systemd Docker daemon options.
Use a different storage engine
For information about the different storage engines, see Storage drivers. The default storage engine and the list of supported storage engines depend on your host’s Linux distribution and available kernel drivers.
Configure default logging driver
Docker provides the capability to collect and view log data from all containers running on a host via a series of logging drivers. The default logging driver, json-file , writes log data to JSON-formatted files on the host filesystem. Over time, these log files expand in size, leading to potential exhaustion of disk resources.
To alleviate such issues, either configure the json-file logging driver to enable log rotation, use an alternative logging driver such as the “local” logging driver that performs log rotation by default, or use a logging driver that sends logs to a remote logging aggregator.
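As a rough sketch of the first option, the snippet below writes a sample daemon.json that enables rotation for the json-file driver. The 10m size and the count of 3 files are illustrative assumptions, not recommended values, and the file is written to a scratch directory so that it can be reviewed before being installed.

```shell
# Write a sample daemon.json to a scratch directory for review.
# max-size and max-file values here are illustrative assumptions.
conf_dir="$(mktemp -d)"
cat > "$conf_dir/daemon.json" <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF
echo "Review $conf_dir/daemon.json, then install it as /etc/docker/daemon.json"
```

After the file is installed, the daemon has to be restarted, and the new defaults apply only to containers created afterwards.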
Configure where the Docker daemon listens for connections
By default, the Docker daemon listens for connections on a UNIX socket to accept requests from local clients. It is possible to allow Docker to accept requests from remote hosts by configuring it to listen on an IP address and port as well as the UNIX socket. For more detailed information on this configuration option take a look at “Bind Docker to another host/port or a unix socket” section of the Docker CLI Reference article.
Before configuring Docker to accept connections from remote hosts it is critically important that you understand the security implications of opening docker to the network. If steps are not taken to secure the connection, it is possible for remote non-root users to gain root access on the host. For more information on how to use TLS certificates to secure this connection, check this article on how to protect the Docker daemon socket.
Configuring Docker to accept remote connections can be done with the docker.service systemd unit file for Linux distributions using systemd, such as recent versions of Red Hat, CentOS, Ubuntu and SLES, or with the daemon.json file, which is recommended for Linux distributions that do not use systemd.
Configuring Docker to listen for connections using both the systemd unit file and the daemon.json file causes a conflict that prevents Docker from starting.
Configuring remote access with systemd unit file
Use the command sudo systemctl edit docker.service to open an override file for docker.service in a text editor.
Add or modify the following lines, substituting your own values.
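A minimal sketch of such an override is shown below, written to a scratch file so it can be inspected first. The address 127.0.0.1:2375 and the /usr/bin/dockerd path are assumptions for illustration; substitute your own values and secure any non-local address with TLS.

```shell
# Sample override content for docker.service (values are assumptions).
override="$(mktemp)"
cat > "$override" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -H tcp://127.0.0.1:2375
EOF
cat "$override"

# After saving the same content via `sudo systemctl edit docker.service`:
#   sudo systemctl daemon-reload
#   sudo systemctl restart docker.service
```

The empty ExecStart= line clears the command inherited from the packaged unit file before the new one is set.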
Users in Docker
Andrey Kopylov, our CTO, likes, actively uses, and advocates Docker. In this article he explains how to create users in Docker, how to work with them correctly, why processes should not be left running with root privileges, and how to solve the problem of mismatched IDs in a Dockerfile.
All processes in a container run as root unless a user is explicitly specified. This seems very convenient, because root has no restrictions. That is exactly why running as root is wrong from a security standpoint. Nobody in their right mind works with root privileges on a local machine, yet many people run processes as root in containers.
There are always bugs that can let malware escape the container and reach the host. Assuming the worst, we should make sure that processes inside the container run as a user that has no privileges on the host machine.
Creating a user
Creating a user in a container is no different from creating one in a Linux distribution. However, the commands may differ depending on the base image.
For Debian-based distributions, add the following to the Dockerfile:
Running processes as a user
To run all subsequent processes as the user with UID 2000, use:
To run all subsequent processes as the user node, use:
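A minimal sketch of a Dockerfile that creates the user and then switches to it is shown below; the shell snippet only writes the file to a scratch directory. The base image debian:stable-slim is an assumption, while the user name node and the ID 2000 follow the text above.

```shell
# Write a sample Dockerfile; the base image tag is an assumption.
user_build_dir="$(mktemp -d)"
cat > "$user_build_dir/Dockerfile" <<'EOF'
FROM debian:stable-slim
# Create a group and a user with fixed numeric IDs
RUN groupadd --gid 2000 node \
 && useradd --uid 2000 --gid node --shell /bin/bash --create-home node
# Everything after this line runs as node (USER 2000 would also work)
USER node
EOF
echo "Dockerfile written to $user_build_dir"
```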
Mounting volumes
When mounting volumes into a container, make sure the user can read and/or write the files. To do so, the UID (GID) of the user inside the container must match that of the user outside the container who has the appropriate access rights to the files. The user names themselves do not matter.
On a Linux machine a user's UID and GID are often both 1000. These IDs are assigned to the first user created on the machine.
Finding out your own IDs is easy:
You will get complete information about your user.
Replace 2000 in the examples with your own ID and everything will work.
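For instance, the numeric IDs can be read with the standard id utility. The variable names host_uid and host_gid are introduced here only for illustration.

```shell
# Read the numeric user and primary group IDs of the current user.
host_uid="$(id -u)"
host_gid="$(id -g)"
echo "UID=$host_uid GID=$host_gid"
```

Running id without arguments prints the same IDs together with the names and the full list of groups.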
Assigning a UID and GID to a user
If the user was created earlier but its IDs need to change, you can do it like this:
If you use the alpine base image, you need to install the shadow package:
Passing the user ID into the container at image build time
If your ID and the IDs of everyone working on the project are the same, it is enough to specify that ID in the Dockerfile. Often, however, user IDs do not match.
How to achieve this is not immediately obvious. For me it was the hardest part of learning Docker. Many Docker users do not consider that an image goes through distinct life stages. First the image is built, using the Dockerfile. When a container is started from the image, the Dockerfile is no longer used.
Users must be created when the image is built. The same applies to choosing the user that processes run as. This means we must somehow pass the UID (GID) into the container at build time.
The ENV and ARG directives make external variables available in a Dockerfile. The source article links to a detailed comparison of the two directives.
You can pass the arguments through docker-compose like this:
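The same idea can be sketched with plain docker build and --build-arg instead of docker-compose. The argument names USER_ID and GROUP_ID, the default of 2000, the base image, and the helper name build_with_host_ids are all assumptions made for this illustration.

```shell
# Sample Dockerfile that takes the IDs as build arguments (names are assumptions).
arg_build_dir="$(mktemp -d)"
cat > "$arg_build_dir/Dockerfile" <<'EOF'
FROM debian:stable-slim
ARG USER_ID=2000
ARG GROUP_ID=2000
RUN groupadd --gid "${GROUP_ID}" node \
 && useradd --uid "${USER_ID}" --gid node --shell /bin/bash --create-home node
USER node
EOF

# Hypothetical helper: builds the image with the caller's own UID and GID.
build_with_host_ids() {
  docker build \
    --build-arg USER_ID="$(id -u)" \
    --build-arg GROUP_ID="$(id -g)" \
    -t "${1:-my-app}" "$arg_build_dir"
}
```

In a compose file the same values go under the args key of the service's build section. Note that the sketch fails for a caller whose UID is 0, because useradd refuses a duplicate of root's ID.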
Adding a user to the docker group
After installing the docker package on Linux, calling any docker command without sudo produces an error:
Access error without sudo.
The terminal commands below use $USER; if you need to add a user other than the one you are logged in as, replace it with that user's name.
For local development on Linux, to avoid running docker commands with root privileges every time, you can add your user to the docker group.
This is very easy to do with the command:
Adding the current user to the docker group.
The command completes without printing a success message. For the change to take effect, either log out of the remote server and log back in or, on a local machine, simply run the command:
You will be asked for the user's password and will get a new session for that user.
Now, if you check which groups the current user belongs to, the group 998(docker) appears. To check, run the command:
Docker run reference
Docker runs processes in isolated containers. A container is a process which runs on a host. The host may be local or remote. When an operator executes docker run, the container process that runs is isolated in that it has its own file system, its own networking, and its own isolated process tree separate from the host.
This page details how to use the docker run command to define the container’s resources at runtime.
General form
The basic docker run command takes this form:
    docker run [OPTIONS] IMAGE[:TAG|@DIGEST] [COMMAND] [ARG...]
The docker run command must specify an IMAGE to derive the container from. An image developer can define image defaults related to:
- detached or foreground running
- container identification
- network settings
- runtime constraints on CPU and memory
With the docker run [OPTIONS] an operator can add to or override the image defaults set by a developer. And, additionally, operators can override nearly all the defaults set by the Docker runtime itself. The operator’s ability to override image and Docker runtime defaults is why run has more options than any other docker command.
To learn how to interpret the types of [OPTIONS] , see Option types.
Depending on your Docker system configuration, you may be required to preface the docker run command with sudo . To avoid having to use sudo with the docker command, your system administrator can create a Unix group called docker and add users to it. For more information about this configuration, refer to the Docker installation documentation for your operating system.
Operator exclusive options
Only the operator (the person executing docker run ) can set the following options.
Detached vs foreground
When starting a Docker container, you must first decide if you want to run the container in the background in a “detached” mode or in the default foreground mode:
Detached (-d)
To start a container in detached mode, use the -d=true or just the -d option. By design, containers started in detached mode exit when the root process used to run the container exits, unless you also specify the --rm option. If you use -d with --rm, the container is removed when it exits or when the daemon exits, whichever happens first.
Do not pass a service x start command to a detached container. For example, this command attempts to start the nginx service.
This succeeds in starting the nginx service inside the container. However, it fails the detached container paradigm in that the root process (service nginx start) returns and the detached container stops as designed. As a result, the nginx service is started but cannot be used. Instead, to start a process such as the nginx web server, do the following:
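The two variants can be sketched side by side. The image name my_image is a placeholder for any image with nginx installed, and the function names are hypothetical helpers used only to keep the commands apart.

```shell
# my_image is a placeholder for an image that has nginx installed.

# Anti-pattern: `service nginx start` returns, so the container stops.
run_nginx_wrong() {
  docker run -d -p 80:80 my_image service nginx start
}

# Preferred: keep nginx in the foreground as the container's root process.
run_nginx_right() {
  docker run -d -p 80:80 my_image nginx -g 'daemon off;'
}
```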
To do input/output with a detached container use network connections or shared volumes. These are required because the container is no longer listening to the command line where docker run was run.
To reattach to a detached container, use docker attach command.
Foreground
In foreground mode (the default when -d is not specified), docker run can start the process in the container and attach the console to the process’s standard input, output, and standard error. It can even pretend to be a TTY (this is what most command line executables expect) and pass along signals. All of that is configurable:
If you do not specify -a then Docker will attach to both stdout and stderr. You can specify to which of the three standard streams (STDIN, STDOUT, STDERR) you’d like to connect instead, as in:
For interactive processes (like a shell), you must use -i -t together in order to allocate a tty for the container process. -i -t is often written -it as you’ll see in later examples. Specifying -t is forbidden when the client is receiving its standard input from a pipe, as in:
A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. As a result, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.
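As an illustration of being "coded to do so", the sketch below shows a shell entrypoint that installs its own handlers for SIGTERM and SIGINT. The function name entrypoint and the sleep loop standing in for real work are assumptions made for this example.

```shell
# Minimal entrypoint sketch: PID 1 gets no default signal actions,
# so the script installs its own handlers for SIGTERM and SIGINT.
entrypoint() {
  trap 'echo "caught signal, shutting down"; exit 0' TERM INT
  echo "started"
  # Stand-in for real work. Waiting on a background sleep lets the
  # trap run as soon as a signal arrives instead of after the sleep.
  while :; do
    sleep 1 &
    wait $!
  done
}

# Usage as the main script of an image: call `entrypoint` on the last line.
```

Without the trap, a docker stop on such a container would have to wait for the grace period to expire before the process is killed.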
Container identification
Name (--name)
The operator can identify a container in three ways:
Identifier type | Example value |
---|---|
UUID long identifier | “f78375b1c487e03c9438c729345e54db9d20cfa2ac1fc3494b6eb60872e74778” |
UUID short identifier | “f78375b1c487” |
Name | “evil_ptolemy” |
The UUID identifiers come from the Docker daemon. If you do not assign a container name with the --name option, then the daemon generates a random string name for you. Defining a name can be a handy way to add meaning to a container. If you specify a name, you can use it when referencing the container within a Docker network. This works for both background and foreground Docker containers.
Containers on the default bridge network must be linked to communicate by name.
PID equivalent
Finally, to help with automation, you can have Docker write the container ID out to a file of your choosing. This is similar to how some programs might write out their process ID to a file (you’ve seen them as PID files):
Image[:tag]
While not strictly a means of identifying a container, you can specify a version of an image you’d like to run the container with by adding image[:tag] to the command. For example, docker run ubuntu:14.04 .
Image[@digest]
Images using the v2 or later image format have a content-addressable identifier called a digest. As long as the input used to generate the image is unchanged, the digest value is predictable and referenceable.
The following example runs a container from the alpine image with the sha256:9cacb71397b640eca97488cf08582ae4e4068513101088e9f96c9814bfda95e0 digest:
PID settings (--pid)
By default, all containers have the PID namespace enabled.
PID namespace provides separation of processes. The PID Namespace removes the view of the system processes, and allows process ids to be reused including pid 1.
In certain cases you want your container to share the host’s process namespace, basically allowing processes within the container to see all of the processes on the system. For example, you could build a container with debugging tools like strace or gdb , but want to use these tools when debugging processes within the container.
Example: run htop inside a container
Create this Dockerfile:
Build the Dockerfile and tag the image as myhtop :
Use the following command to run htop inside a container:
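The three steps can be sketched as follows. The Dockerfile content is an assumption about what a minimal htop image could look like (alpine base, htop from apk), and run_host_htop is a hypothetical helper name.

```shell
# Sample Dockerfile for an htop image (content is an assumption).
htop_dir="$(mktemp -d)"
cat > "$htop_dir/Dockerfile" <<'EOF'
FROM alpine:latest
RUN apk add --update htop && rm -rf /var/cache/apk/*
CMD ["htop"]
EOF

# Hypothetical helper: builds the image and runs it in the host PID namespace.
run_host_htop() {
  docker build -t myhtop "$htop_dir"
  # --pid=host lets htop see every process on the host
  docker run -it --rm --pid=host myhtop
}
```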
Joining another container’s pid namespace can be used for debugging that container.
Example
Start a container running a redis server:
Debug the redis container by running another container that has strace in it:
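A sketch of the two commands is given below. The container name my-redis and the image name my_strace_docker_image are placeholders; the second image is assumed to ship strace and bash.

```shell
# Start the container to be debugged (my-redis is a placeholder name).
start_redis() {
  docker run --name my-redis -d redis
}

# Join its PID namespace from a second container that has strace.
debug_redis() {
  docker run -it --pid=container:my-redis my_strace_docker_image bash
}

# Inside the debug container the redis server is visible as PID 1:
#   strace -p 1
```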
UTS settings (--uts)
The UTS namespace is for setting the hostname and the domain that is visible to running processes in that namespace. By default, all containers, including those with --network=host, have their own UTS namespace. The host setting will result in the container using the same UTS namespace as the host. Note that --hostname and --domainname are invalid in host UTS mode.
You may wish to share the UTS namespace with the host if you would like the hostname of the container to change as the hostname of the host changes. A more advanced use case would be changing the host’s hostname from a container.
IPC settings (--ipc)
The following values are accepted:
Value | Description |
---|---|
"" | Use daemon’s default. |
"none" | Own private IPC namespace, with /dev/shm not mounted. |
"private" | Own private IPC namespace. |
"shareable" | Own private IPC namespace, with a possibility to share it with other containers. |
"container:<name-or-ID>" | Join another ("shareable") container’s IPC namespace. |
"host" | Use the host system’s IPC namespace. |
If not specified, daemon default is used, which can either be "private" or "shareable", depending on the daemon version and configuration.
IPC (POSIX/SysV IPC) namespace provides separation of named shared memory segments, semaphores and message queues.
Shared memory segments are used to accelerate inter-process communication at memory speed, rather than through pipes or through the network stack. Shared memory is commonly used by databases and custom-built (typically C/OpenMPI, C++/using boost libraries) high performance applications for scientific computing and financial services industries. If these types of applications are broken into multiple containers, you might need to share the IPC mechanisms of the containers, using "shareable" mode for the main (i.e. "donor") container, and "container:<donor-name-or-ID>" for other containers.
Network settings
By default, all containers have networking enabled and they can make any outgoing connections. The operator can completely disable networking with docker run --network none, which disables all incoming and outgoing networking. In cases like this, you would perform I/O through files or STDIN and STDOUT only.
Publishing ports and linking to other containers only works with the default (bridge). The linking feature is a legacy feature. You should always prefer using Docker network drivers over linking.
Your container will use the same DNS servers as the host by default, but you can override this with --dns.
By default, the MAC address is generated using the IP address allocated to the container. You can set the container’s MAC address explicitly by providing a MAC address via the --mac-address parameter (format: 12:34:56:78:9a:bc). Be aware that Docker does not check if manually specified MAC addresses are unique.
Network | Description |
---|---|
none | No networking in the container. |
bridge (default) | Connect the container to the bridge via veth interfaces. |
host | Use the host’s network stack inside the container. |
container:<name|id> | Use the network stack of another container, specified via its name or id. |
NETWORK | Connects the container to a user created network (using docker network create command) |
Network: none
With the network set to none, a container will not have access to any external routes. The container will still have a loopback interface enabled, but it does not have any routes for external traffic.
Network: bridge
With the network set to bridge, a container will use Docker’s default networking setup. A bridge is set up on the host, commonly named docker0, and a pair of veth interfaces will be created for the container. One side of the veth pair will remain on the host attached to the bridge while the other side of the pair will be placed inside the container’s namespaces in addition to the loopback interface. An IP address will be allocated for containers on the bridge’s network and traffic will be routed through this bridge to the container.
Containers can communicate via their IP addresses by default. To communicate by name, they must be linked.
Network: host
With the network set to host, a container will share the host’s network stack and all interfaces from the host will be available to the container. The container’s hostname will match the hostname on the host system. Note that --mac-address is invalid in host netmode. Even in host network mode a container has its own UTS namespace by default. As such, --hostname and --domainname are allowed in host network mode and will only change the hostname and domain name inside the container. Similar to --hostname, the --add-host, --dns, --dns-search, and --dns-option options can be used in host network mode. These options update /etc/hosts or /etc/resolv.conf inside the container. No changes are made to /etc/hosts and /etc/resolv.conf on the host.
Compared to the default bridge mode, the host mode gives significantly better networking performance since it uses the host’s native networking stack whereas the bridge has to go through one level of virtualization through the docker daemon. It is recommended to run containers in this mode when their networking performance is critical, for example, a production Load Balancer or a High Performance Web Server.
--network="host" gives the container full access to local system services such as D-bus and is therefore considered insecure.
Network: container
With the network set to container, a container will share the network stack of another container. The other container’s name must be provided in the format of --network container:<name|id>. Note that --add-host, --hostname, --dns, --dns-search, --dns-option, and --mac-address are invalid in container netmode, and --publish, --publish-all, and --expose are also invalid in container netmode.
Example running a Redis container with Redis binding to localhost then running the redis-cli command and connecting to the Redis server over the localhost interface.
User-defined network
You can create a network using a Docker network driver or an external network driver plugin. You can connect multiple containers to the same network. Once connected to a user-defined network, the containers can communicate easily using only another container’s IP address or name.
For overlay networks or custom plugins that support multi-host connectivity, containers connected to the same multi-host network but launched from different Engines can also communicate in this way.
The following example creates a network using the built-in bridge network driver and runs a container in the created network:
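A sketch of those two steps follows. The network name my-net, the container name container3, and the helper name create_and_join_network are placeholders chosen for illustration.

```shell
# Hypothetical helper: creates a bridge network and starts a container on it.
create_and_join_network() {
  # -d bridge selects the built-in bridge driver
  docker network create -d bridge my-net
  # Containers on my-net can reach container3 by name
  docker run --network=my-net -itd --name=container3 busybox
}
```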
Managing /etc/hosts
Your container will have lines in /etc/hosts which define the hostname of the container itself as well as localhost and a few other common things. The --add-host flag can be used to add additional lines to /etc/hosts.
If a container is connected to the default bridge network and linked with other containers, then the container’s /etc/hosts file is updated with the linked container’s name.
Since Docker may live update the container’s /etc/hosts file, there may be situations when processes inside the container can end up reading an empty or incomplete /etc/hosts file. In most cases, retrying the read should fix the problem.
Restart policies (--restart)
Using the --restart flag on docker run you can specify a restart policy for how a container should or should not be restarted on exit.
When a restart policy is active on a container, it will be shown as either Up or Restarting in docker ps . It can also be useful to use docker events to see the restart policy in effect.
Docker supports the following restart policies:
Policy | Result |
---|---|
no | Do not automatically restart the container when it exits. This is the default. |
on-failure[:max-retries] | Restart only if the container exits with a non-zero exit status. Optionally, limit the number of restart retries the Docker daemon attempts. |
always | Always restart the container regardless of the exit status. When you specify always, the Docker daemon will try to restart the container indefinitely. The container will also always start on daemon startup, regardless of the current state of the container. |
unless-stopped | Always restart the container regardless of the exit status, including on daemon startup, except if the container was put into a stopped state before the Docker daemon was stopped. |
An increasing delay (double the previous delay, starting at 100 milliseconds) is added before each restart to prevent flooding the server. This means the daemon will wait for 100 ms, then 200 ms, 400, 800, 1600, and so on until the on-failure limit or the maximum delay of 1 minute is hit, or until you docker stop or docker rm -f the container.
If a container is successfully restarted (the container is started and runs for at least 10 seconds), the delay is reset to its default value of 100 ms.
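The delay schedule described above can be modeled with a few lines of shell arithmetic. This is only a model of the prose description (start at 100 ms, double each time, cap at one minute), not the daemon's actual code; the function name restart_delays is hypothetical.

```shell
# Print the delay before each of the first N restarts, in milliseconds.
# Doubles from 100 ms and is capped at 60000 ms (1 minute).
restart_delays() {
  count="$1"
  delay=100
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "$delay"
    delay=$((delay * 2))
    if [ "$delay" -gt 60000 ]; then
      delay=60000
    fi
    i=$((i + 1))
  done
}

# Usage: restart_delays 5 prints 100, 200, 400, 800, 1600 on separate lines.
```

Under this model the cap is reached at the eleventh restart, since the tenth delay is 51200 ms.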
You can specify the maximum number of times Docker will try to restart the container when using the on-failure policy. The default is that Docker will try forever to restart the container. The number of (attempted) restarts for a container can be obtained via docker inspect. For example, to get the number of restarts for container “my-container”:
Or, to get the last time the container was (re)started:
Combining --restart (restart policy) with the --rm (clean up) flag results in an error. On container restart, attached clients are disconnected. See the examples on using the --rm (clean up) flag later in this page.
Examples
This will run the redis container with a restart policy of always so that if the container exits, Docker will restart it.
This will run the redis container with a restart policy of on-failure and a maximum restart count of 10. If the redis container exits with a non-zero exit status more than 10 times in a row Docker will abort trying to restart the container. Providing a maximum restart limit is only valid for the on-failure policy.
Exit Status
The exit code from docker run gives information about why the container failed to run or why it exited. When docker run exits with a non-zero code, the exit codes follow the chroot standard, see below:
125 if the error is with Docker daemon itself
126 if the contained command cannot be invoked
127 if the contained command cannot be found
Exit code of contained command otherwise
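A script that wraps docker run can use these codes to tell Docker-level failures apart from failures of the contained command. The helper name explain_docker_exit is hypothetical, and the messages simply restate the list above.

```shell
# Hypothetical helper: maps a docker run exit code to its meaning.
explain_docker_exit() {
  case "$1" in
    0)   echo "success" ;;
    125) echo "error with Docker daemon itself" ;;
    126) echo "contained command cannot be invoked" ;;
    127) echo "contained command cannot be found" ;;
    *)   echo "exit code of contained command: $1" ;;
  esac
}

# Usage:
#   docker run busybox /bin/sh -c 'exit 3'
#   explain_docker_exit "$?"
```

A contained command that itself exits with 125, 126, or 127 cannot be distinguished from the Docker-level cases by the code alone.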
Clean up (--rm)
By default a container’s file system persists even after the container exits. This makes debugging a lot easier (since you can inspect the final state) and you retain all your data by default. But if you are running short-term foreground processes, these container file systems can really pile up. If instead you’d like Docker to automatically clean up the container and remove the file system when the container exits, you can add the --rm flag:
If you set the --rm flag, Docker also removes the anonymous volumes associated with the container when the container is removed. This is similar to running docker rm -v my-container. Only volumes that are specified without a name are removed. For example, when running:
the volume for /foo will be removed, but the volume for /bar will not. Volumes inherited via --volumes-from will be removed with the same logic: if the original volume was specified with a name it will not be removed.
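A command that matches this description could look like the sketch below. The volume name awesome, the busybox image, and the helper name run_with_cleanup are placeholders for illustration.

```shell
# Hypothetical helper: one anonymous volume and one named volume.
run_with_cleanup() {
  # -v /foo           anonymous volume, removed together with the container
  # -v awesome:/bar   named volume, kept after the container is removed
  docker run --rm -v /foo -v awesome:/bar busybox top
}
```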
Security configuration
Option | Description |
---|---|
--security-opt="label=user:USER" | Set the label user for the container |
--security-opt="label=role:ROLE" | Set the label role for the container |
--security-opt="label=type:TYPE" | Set the label type for the container |
--security-opt="label=level:LEVEL" | Set the label level for the container |
--security-opt="label=disable" | Turn off label confinement for the container |
--security-opt="apparmor=PROFILE" | Set the apparmor profile to be applied to the container |
--security-opt="no-new-privileges:true" | Disable container processes from gaining new privileges |
--security-opt="seccomp=unconfined" | Turn off seccomp confinement for the container |
--security-opt="seccomp=profile.json" | White-listed syscalls seccomp JSON file to be used as a seccomp filter |
You can override the default labeling scheme for each container by specifying the --security-opt flag. Specifying the level in the following command allows you to share the same content between containers.
Automatic translation of MLS labels is not currently supported.
To disable the security labeling for this container versus running with the --privileged flag, use the following command:
If you want a tighter security policy on the processes within a container, you can specify an alternate type for the container. You could run a container that is only allowed to listen on Apache ports by executing the following command:
You would have to write policy defining a svirt_apache_t type.
If you want to prevent your container processes from gaining additional privileges, you can execute the following command:
This means that commands that raise privileges such as su or sudo will no longer work. It also causes any seccomp filters to be applied later, after privileges have been dropped which may mean you can have a more restrictive set of filters. For more details, see the kernel documentation.
Specify an init process
You can use the --init flag to indicate that an init process should be used as the PID 1 in the container. Specifying an init process ensures the usual responsibilities of an init system, such as reaping zombie processes, are performed inside the created container.
The default init process used is the first docker-init executable found in the system path of the Docker daemon process. This docker-init binary, included in the default installation, is backed by tini.
Specify custom cgroups
Using the --cgroup-parent flag, you can pass a specific cgroup to run a container in. This allows you to create and manage cgroups on their own. You can define custom resources for those cgroups and put containers under a common parent group.
Runtime constraints on resources
The operator can also adjust the performance parameters of the container:
Option | Description |
---|---|
-m, --memory="" | Memory limit (format: <number>[<unit>]). Number is a positive integer. Unit can be one of b, k, m, or g. Minimum is 4M. |
--memory-swap="" | Total memory limit (memory + swap, format: <number>[<unit>]). Number is a positive integer. Unit can be one of b, k, m, or g. |
--memory-reservation="" | Memory soft limit (format: <number>[<unit>]). Number is a positive integer. Unit can be one of b, k, m, or g. |
--kernel-memory="" | Kernel memory limit (format: <number>[<unit>]). Number is a positive integer. Unit can be one of b, k, m, or g. Minimum is 4M. |
-c, --cpu-shares=0 | CPU shares (relative weight) |
--cpus=0.000 | Number of CPUs. Number is a fractional number. 0.000 means no limit. |
--cpu-period=0 | Limit the CPU CFS (Completely Fair Scheduler) period |
--cpuset-cpus="" | CPUs in which to allow execution (0-3, 0,1) |
--cpuset-mems="" | Memory nodes (MEMs) in which to allow execution (0-3, 0,1). Only effective on NUMA systems. |
--cpu-quota=0 | Limit the CPU CFS (Completely Fair Scheduler) quota |
--cpu-rt-period=0 | Limit the CPU real-time period. In microseconds. Requires parent cgroups be set and cannot be higher than parent. Also check rtprio ulimits. |
--cpu-rt-runtime=0 | Limit the CPU real-time runtime. In microseconds. Requires parent cgroups be set and cannot be higher than parent. Also check rtprio ulimits. |
--blkio-weight=0 | Block IO weight (relative weight). Accepts a weight value between 10 and 1000. |
--blkio-weight-device="" | Block IO weight (relative device weight, format: DEVICE_NAME:WEIGHT) |
--device-read-bps="" | Limit read rate from a device (format: <device-path>:<number>[<unit>]). Number is a positive integer. Unit can be one of kb, mb, or gb. |
--device-write-bps="" | Limit write rate to a device (format: <device-path>:<number>[<unit>]). Number is a positive integer. Unit can be one of kb, mb, or gb. |
--device-read-iops="" | Limit read rate (IO per second) from a device (format: <device-path>:<number>). Number is a positive integer. |
--device-write-iops="" | Limit write rate (IO per second) to a device (format: <device-path>:<number>). Number is a positive integer. |
--oom-kill-disable=false | Whether to disable OOM Killer for the container or not. |
--oom-score-adj=0 | Tune container’s OOM preferences (-1000 to 1000) |
--memory-swappiness="" | Tune a container’s memory swappiness behavior. Accepts an integer between 0 and 100. |
--shm-size="" | Size of /dev/shm. The format is <number><unit>. number must be greater than 0. Unit is optional and can be b (bytes), k (kilobytes), m (megabytes), or g (gigabytes). If you omit the unit, the system uses bytes. If you omit the size entirely, the system uses 64m. |
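The size format shared by several options in the table (a positive integer followed by an optional unit b, k, m, or g) can be illustrated with a small helper. This is only a sketch: size_to_bytes is a hypothetical function, not part of Docker, and it assumes binary multiples (1k = 1024 bytes).

```shell
# size_to_bytes: hypothetical helper, not part of Docker.
# Converts "<number>[<unit>]" (unit: b, k, m or g; bytes if omitted) to bytes,
# assuming binary multiples (1k = 1024 bytes).
size_to_bytes() {
  num=${1%[bkmgBKMG]}      # strip one trailing unit letter, if present
  unit=${1#"$num"}         # whatever was stripped is the unit
  case $num in
    ''|*[!0-9]*) echo "invalid size: $1" >&2; return 1 ;;
  esac
  case $unit in
    ''|b|B) echo "$num" ;;
    k|K)    echo $((num * 1024)) ;;
    m|M)    echo $((num * 1024 * 1024)) ;;
    g|G)    echo $((num * 1024 * 1024 * 1024)) ;;
  esac
}

size_to_bytes 64m   # the default /dev/shm size: prints 67108864
size_to_bytes 4m    # the documented minimum for -m: prints 4194304
```

A value without a unit, such as 512, is returned unchanged, matching the rule that the system uses bytes when the unit is omitted.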
User memory constraints
We have four ways to set user memory usage:
Option | Result |
---|---|
memory=inf, memory-swap=inf (default) | There is no memory limit for the container. The container can use as much memory as needed. |
memory=L < inf, memory-swap=inf | (specify memory and set memory-swap as -1) The container is not allowed to use more than L bytes of memory, but can use as much swap as is needed (if the host supports swap memory). |
memory=L < inf, memory-swap=2*L | (specify memory without memory-swap) The container is not allowed to use more than L bytes of memory; swap plus memory usage is double that. |
memory=L < inf, memory-swap=S < inf, L <= S | (specify both memory and memory-swap) The container is not allowed to use more than L bytes of memory; swap plus memory usage is limited by S. |
Memory reservation is a soft limit that allows for greater sharing of memory. Under normal circumstances, containers can use as much memory as needed and are constrained only by the hard limit set with the -m / --memory option. When memory reservation is set, Docker detects memory contention or low memory and forces containers to restrict their consumption to a reservation limit.
Always set the memory reservation value below the hard limit, otherwise the hard limit takes precedence. A reservation of 0 is the same as setting no reservation. By default (without reservation set), memory reservation is the same as the hard memory limit.
Memory reservation is a soft-limit feature and does not guarantee the limit won’t be exceeded. Instead, the feature attempts to ensure that, when memory is heavily contended for, memory is allocated based on the reservation hints/setup.
For example, with the memory limit (-m) set to 500M and the memory reservation (--memory-reservation) set to 200M, when the container consumes more than 200M and less than 500M of memory, the next system memory reclaim attempts to shrink container memory below 200M.
With a memory reservation of 1G and no hard memory limit, the container can use as much memory as it needs. The memory reservation setting ensures the container doesn’t consume too much memory for a long time, because every memory reclaim shrinks the container’s consumption to the reservation.
By default, the kernel kills processes in a container if an out-of-memory (OOM) error occurs. To change this behaviour, use the --oom-kill-disable option. Only disable the OOM killer on containers where you have also set the -m / --memory option. If the -m flag is not set, this can result in the host running out of memory and require killing the host’s system processes to free memory.
For example, combining -m 100M with --oom-kill-disable limits the memory to 100M and disables the OOM killer for the container. Using --oom-kill-disable without -m is dangerous: the container has unlimited memory, which can cause the host to run out of memory and require killing system processes to free memory.
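The rule above, that the OOM killer should only be disabled together with a hard memory limit, can be sketched as a small guard. The helper name oom_safe_flags is hypothetical and not part of Docker; it only prints flags, and the ubuntu:14.04 image in the echoed command is a placeholder.

```shell
# oom_safe_flags: hypothetical helper, not part of Docker.
# Prints the memory flags for `docker run`, refusing to disable the
# OOM killer unless a hard memory limit is passed as the first argument.
oom_safe_flags() {
  limit=$1
  if [ -z "$limit" ]; then
    echo "refusing --oom-kill-disable without -m/--memory" >&2
    return 1
  fi
  echo "-m $limit --oom-kill-disable"
}

# Safe: the container is capped at 100M, so disabling the OOM killer
# cannot exhaust host memory. The command is printed, not executed.
flags=$(oom_safe_flags 100M) && echo "docker run -it $flags ubuntu:14.04 /bin/bash"
```

Calling the helper with an empty limit returns a non-zero status, which mirrors the dangerous case described above.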
The --oom-score-adj parameter can be changed to select the priority of which containers will be killed when the system is out of memory, with negative scores making them less likely to be killed, and positive scores more likely.
Kernel memory constraints
Kernel memory is fundamentally different than user memory as kernel memory can’t be swapped out. The inability to swap makes it possible for the container to block system services by consuming too much kernel memory. Kernel memory includes stack pages, slab pages, sockets memory pressure and TCP memory pressure.
You can set up a kernel memory limit to constrain these kinds of memory. For example, every process consumes some stack pages. By limiting kernel memory, you can prevent new processes from being created when the kernel memory usage is too high.
Kernel memory is never completely independent of user memory. Instead, you limit kernel memory in the context of the user memory limit. Assume “U” is the user memory limit and “K” the kernel limit. There are three possible ways to set limits: leave K unset (the default), set K below U, or set K above U.
With -m 500M and --kernel-memory 50M, the processes in the container can use 500M of memory in total, of which at most 50M can be kernel memory. With --kernel-memory 50M and no -m, the processes in the container can use as much memory as they want, but only 50M of kernel memory.
Swappiness constraint
By default, a container’s kernel can swap out a percentage of anonymous pages. To set this percentage for a container, specify a --memory-swappiness value between 0 and 100. A value of 0 turns off anonymous page swapping. A value of 100 sets all anonymous pages as swappable. By default, if you are not using --memory-swappiness, the memory swappiness value is inherited from the parent.
Setting the --memory-swappiness option is helpful when you want to retain the container’s working set and to avoid swapping performance penalties.
CPU share constraint
By default, all containers get the same proportion of CPU cycles. This proportion can be modified by changing the container’s CPU share weighting relative to the weighting of all other running containers.
To modify the proportion from the default of 1024, use the -c or --cpu-shares flag to set the weighting to 2 or higher. If 0 is set, the system will ignore the value and use the default of 1024.
The proportion will only apply when CPU-intensive processes are running. When tasks in one container are idle, other containers can use the left-over CPU time. The actual amount of CPU time will vary depending on the number of containers running on the system.
For example, consider three containers: one has a cpu-share of 1024 and the two others have a cpu-share setting of 512. When processes in all three containers attempt to use 100% of CPU, the first container would receive 50% of the total CPU time. If you add a fourth container with a cpu-share of 1024, the first container only gets 33% of the CPU. The remaining containers receive about 16.7%, 16.7% and 33% of the CPU.
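The share arithmetic in this example can be checked with a short script. cpu_share_percent is a hypothetical helper, not a Docker command; it assumes every listed container is CPU-bound and simply divides one weight by the sum of all weights.

```shell
# cpu_share_percent: hypothetical helper, not part of Docker.
# Usage: cpu_share_percent <own shares> <shares of every container, own included>...
# Prints the percentage of CPU time the first value receives when all
# containers are CPU-bound, rounded to one decimal place.
cpu_share_percent() {
  own=$1; shift
  total=0
  for s in "$@"; do total=$((total + s)); done
  awk -v own="$own" -v total="$total" 'BEGIN { printf "%.1f\n", 100 * own / total }'
}

cpu_share_percent 1024 1024 512 512        # three containers: prints 50.0
cpu_share_percent 1024 1024 512 512 1024   # after adding a fourth: prints 33.3
cpu_share_percent 512  1024 512 512 1024   # one of the 512-share containers: prints 16.7
```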
On a multi-core system, the shares of CPU time are distributed over all CPU cores. Even if a container is limited to less than 100% of CPU time, it can use 100% of each individual CPU core. For example, consider a system with more than three cores. If you start one container running one process and another container running two processes, each of the three processes can run on its own core and use 100% of it.
CPU period constraint
The default CPU CFS (Completely Fair Scheduler) period is 100ms. Use --cpu-period to set the period and so limit the container’s CPU usage; --cpu-period is usually combined with --cpu-quota. For example, with --cpu-period=50000 and --cpu-quota=25000 on a host with 1 CPU, the container can get 50% of the CPU’s run-time every 50ms.
In addition to using --cpu-period and --cpu-quota to set CPU period constraints, you can specify --cpus with a float number to achieve the same purpose. For example, if there is 1 CPU, then --cpus=0.5 achieves the same result as setting --cpu-period=50000 and --cpu-quota=25000 (50% CPU).
The default value for --cpus is 0.000, which means there is no limit.
Cpuset constraint
We can set the cpus on which to allow execution for containers. With --cpuset-cpus="1,3", processes in the container can be executed on cpu 1 and cpu 3. With --cpuset-cpus="0-2", processes in the container can be executed on cpu 0, cpu 1 and cpu 2.
We can set the mems in which to allow execution for containers. This is only effective on NUMA systems. With --cpuset-mems="1,3", the processes in the container only use memory from memory nodes 1 and 3. With --cpuset-mems="0-2", they only use memory from memory nodes 0, 1 and 2.
CPU quota constraint
The --cpu-quota flag limits the container’s CPU usage. The default 0 value allows the container to take 100% of a CPU resource (1 CPU). The CFS (Completely Fair Scheduler) handles resource allocation for executing processes and is the default Linux scheduler used by the kernel. Set this value to 50000 to limit the container to 50% of a CPU resource. For multiple CPUs, adjust the --cpu-quota as necessary. For more information, see the CFS documentation on bandwidth limiting.
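The relationship between --cpus, --cpu-period and --cpu-quota described in this section amounts to quota = cpus * period. The sketch below, using a hypothetical helper named cpus_to_quota, only does that arithmetic and does not call Docker.

```shell
# cpus_to_quota: hypothetical helper, not part of Docker.
# Given a --cpus value and a CFS period in microseconds, prints the
# --cpu-quota (also in microseconds) that yields the same limit.
cpus_to_quota() {
  awk -v cpus="$1" -v period="$2" 'BEGIN { printf "%d\n", cpus * period + 0.5 }'
}

cpus_to_quota 0.5 50000    # prints 25000, matching --cpu-period=50000 --cpu-quota=25000
cpus_to_quota 0.5 100000   # prints 50000: 50% of one CPU at the default 100ms period
cpus_to_quota 2   100000   # prints 200000: two full CPUs
```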
Block IO bandwidth (Blkio) constraint
By default, all containers get the same proportion of block IO bandwidth (blkio). This proportion is 500. To modify this proportion, change the container’s blkio weight relative to the weighting of all other running containers using the --blkio-weight flag.
Note: The blkio weight setting is only available for direct IO. Buffered IO is not currently supported.
The --blkio-weight flag can set the weighting to a value between 10 and 1000. If you create two containers with different blkio weights and do block IO in both at the same time, you’ll find that the proportion of time is the same as the proportion of blkio weights of the two containers.
The --blkio-weight-device="DEVICE_NAME:WEIGHT" flag sets a specific device weight. The DEVICE_NAME:WEIGHT is a string containing a colon-separated device name and weight. For example, --blkio-weight-device "/dev/sda:200" sets the /dev/sda device weight to 200.
If you specify both --blkio-weight and --blkio-weight-device, Docker uses the --blkio-weight as the default weight and uses --blkio-weight-device to override this default with a new value on a specific device. For example, --blkio-weight 300 together with --blkio-weight-device "/dev/sda:200" uses a default weight of 300 and overrides this default on /dev/sda, setting that weight to 200.
The --device-read-bps flag limits the read rate (bytes per second) from a device. For example, --device-read-bps /dev/sda:1mb limits the read rate to 1mb per second from /dev/sda.
The --device-write-bps flag limits the write rate (bytes per second) to a device. For example, --device-write-bps /dev/sda:1mb limits the write rate to 1mb per second for /dev/sda.
Both flags take limits in the <device-path>:<limit>[unit] format.
The --device-read-iops flag limits read rate (IO per second) from a device. For example, --device-read-iops /dev/sda:1000 limits the read rate to 1000 IO per second from /dev/sda.
The --device-write-iops flag limits write rate (IO per second) to a device. For example, --device-write-iops /dev/sda:1000 limits the write rate to 1000 IO per second to /dev/sda.
Both flags take limits in the <device-path>:<limit> format.
Additional groups
By default, the docker container process runs with the supplementary groups looked up for the specified user. To add more groups to that list, use the --group-add flag.
Runtime privilege and Linux capabilities
By default, Docker containers are “unprivileged” and cannot, for example, run a Docker daemon inside a Docker container. This is because by default a container is not allowed to access any devices, but a “privileged” container is given access to all devices (see the documentation on cgroups devices).
When the operator executes docker run --privileged, Docker will enable access to all devices on the host as well as set some configuration in AppArmor or SELinux to allow the container nearly all the same access to the host as processes running outside containers on the host. Additional information about running with --privileged is available on the Docker Blog.
If you want to limit access to a specific device or devices you can use the --device flag. It allows you to specify one or more devices that will be accessible within the container.
By default, the container will be able to read, write, and mknod these devices. This can be overridden using a third :rwm set of options to each --device flag.
In addition to --privileged, the operator can have fine grain control over the capabilities using --cap-add and --cap-drop. By default, Docker has a default list of capabilities that are kept. The following table lists the Linux capability options which are allowed by default and can be dropped.
The next table shows the capabilities which are not granted by default and may be added.
Further reference information is available on the capabilities(7) Linux man page, and in the Linux kernel source code.
Both flags support the value ALL, so to allow a container to use all capabilities except for MKNOD, combine --cap-add=ALL with --cap-drop=MKNOD.
The --cap-add and --cap-drop flags accept capabilities specified with a CAP_ prefix, so --cap-add=NET_ADMIN and --cap-add=CAP_NET_ADMIN are equivalent.
For interacting with the network stack, instead of using --privileged, operators should use --cap-add=NET_ADMIN to modify the network interfaces.
To mount a FUSE based filesystem, you need to combine both --cap-add and --device (for example, --cap-add SYS_ADMIN --device /dev/fuse).
The default seccomp profile will adjust to the selected capabilities, in order to allow use of facilities allowed by the capabilities, so you should not have to adjust this.
Logging drivers (--log-driver)
The container can have a different logging driver than the Docker daemon. Use the --log-driver=VALUE with the docker run command to configure the container’s logging driver. The following options are supported:
The docker logs command is available only for the json-file and journald logging drivers. For detailed information on working with logging drivers, see Configure logging drivers.
Overriding Dockerfile image defaults
When a developer builds an image from a Dockerfile or commits it, the developer can set a number of default parameters that take effect when the image starts up as a container.
Four of the Dockerfile commands cannot be overridden at runtime: FROM, MAINTAINER, RUN, and ADD. Everything else has a corresponding override in docker run. We’ll go through what the developer might have set in each Dockerfile instruction and how the operator can override that setting.
CMD (default command or options)
Recall the optional COMMAND in the general form of the docker run command line. This command is optional because the person who created the IMAGE may have already provided a default COMMAND using the Dockerfile CMD instruction. As the operator (the person running a container from the image), you can override that CMD instruction just by specifying a new COMMAND.
If the image also specifies an ENTRYPOINT then the CMD or COMMAND get appended as arguments to the ENTRYPOINT.
ENTRYPOINT (default command to execute at runtime)
The ENTRYPOINT of an image is similar to a COMMAND because it specifies what executable to run when the container starts, but it is (purposely) more difficult to override. The ENTRYPOINT gives a container its default nature or behavior, so that when you set an ENTRYPOINT you can run the container as if it were that binary, complete with default options, and you can pass in more options via the COMMAND. But, sometimes an operator may want to run something else inside the container, so you can override the default ENTRYPOINT at runtime by using a string to specify the new ENTRYPOINT.
For example, passing --entrypoint /bin/bash runs a shell in a container that has been set up to automatically run something else (like /usr/bin/redis-server). Any parameters given after the image name are passed on to that ENTRYPOINT. You can reset a container’s entrypoint by passing an empty string (--entrypoint="").
Note: Passing --entrypoint will clear out any default command set on the image (i.e. any CMD instruction in the Dockerfile used to build it).
EXPOSE (incoming ports)
The following run command options work with container networking: --expose, -P, -p and --link.
With the exception of the EXPOSE directive, an image developer hasn’t got much control over networking. The EXPOSE instruction defines the initial incoming ports that provide services. These ports are available to processes inside the container. An operator can use the --expose option to add to the exposed ports.
To expose a container’s internal port, an operator can start the container with the -P or -p flag. The exposed port is accessible on the host and the ports are available to any client that can reach the host.
The -P option publishes all the ports to the host interfaces. Docker binds each exposed port to a random port on the host. The ports fall within an ephemeral port range defined by /proc/sys/net/ipv4/ip_local_port_range. Use the -p flag to explicitly map a single port or range of ports.
The port number inside the container (where the service listens) does not need to match the port number exposed on the outside of the container (where clients connect). For example, inside the container an HTTP service is listening on port 80 (and so the image developer specifies EXPOSE 80 in the Dockerfile). At runtime, the port might be bound to 42800 on the host. To find the mapping between the host ports and the exposed ports, use docker port.
If the operator uses --link when starting a new client container in the default bridge network, then the client container can access the exposed port via a private networking interface. If --link is used when starting a container in a user-defined network as described in Networking overview, it will provide a named alias for the container being linked to.
ENV (environment variables)
Docker automatically sets some environment variables when creating a Linux container. Docker does not set any environment variables when creating a Windows container.
The following environment variables are set for Linux containers: HOME, HOSTNAME, PATH and TERM.
Additionally, the operator can set any environment variable in the container by using one or more -e flags, even overriding those mentioned above, or already defined by the developer with a Dockerfile ENV. If the operator names an environment variable without specifying a value, then the current value of the named variable is propagated into the container’s environment.
Similarly the operator can set the HOSTNAME (Linux) or COMPUTERNAME (Windows) with -h.
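The propagation rule for -e can be sketched as follows, assuming the alpine image is available; the variable names deep and today are arbitrary:

```shell
# today is exported on the host and passed by name only, so its current value
# (Wednesday) is propagated; deep is given an explicit value.
export today=Wednesday
docker run -e "deep=purple" -e today --rm alpine env
```

The env output inside the container should include deep=purple and today=Wednesday alongside the variables Docker sets itself.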