K8s 1.20 docker deprecation: simplify use of containerd for k8s #284
t-lo commented Dec 4, 2020 •
Current situation
Kubernetes 1.20 deprecated the Docker runtime: Docker is still supported but issues a deprecation warning, and support will be dropped in a future release. While we ship a CRI-compatible containerd which can be used by Kubernetes with some effort, it is not straightforward and does not work out of the box.
Impact
- Users are required to entirely replace the containerd systemd unit shipped with Flatcar Container Linux in order to make Kubernetes use containerd directly.
- Alternatively, users are forced to continue using Docker, which is no longer the setup recommended by upstream Kubernetes.
Ideal future situation
- Immediate mitigation: documentation / example configuration is available on how to provision / set up Flatcar with containerd supporting k8s directly.
- Mid-term resolution: update the containerd service file (and possibly containerd components / dependencies) so that k8s on containerd works out of the box (also see #283).
Implementation options
- Provide docs and example config for setting up containerd correctly to support running without docker (here: https://github.com/kinvolk/flatcar-docs/tree/main/docs/container-runtimes)
- Make k8s on containerd work out of the box
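As an illustration of what such documentation needs to cover, here is a minimal sketch of pointing the kubelet at the shipped containerd and enabling the systemd cgroup driver. The paths, flag names and TOML keys below assume containerd 1.4+ with its CRI plugin enabled and a 1.20-era kubelet; they are not the exact Flatcar defaults.

# /etc/containerd/config.toml (sketch)
version = 2
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true

# kubelet flags (sketch)
--container-runtime=remote
--container-runtime-endpoint=unix:///run/containerd/containerd.sock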
Flatcar Container Linux by Kinvolk
locksmith is a reboot manager for the Flatcar update engine which is able to use etcd to ensure that only a subset of a cluster of machines are rebooting at any given time. locksmithd runs as a daemon on Flatcar hosts and is responsible for controlling the reboot behaviour after updates.
There are three different strategies that locksmithd can use after the update engine has successfully applied an update:
- etcd-lock — reboot after first taking a lock in etcd.
- reboot — reboot without taking a lock.
- off — causes locksmithd to exit and do nothing.
These strategies will either be followed immediately after an update, or during the next available reboot window if one has been configured.
These strategies can be configured via /etc/flatcar/update.conf with a line that looks like:
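REBOOT_STRATEGY=etcd-lock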
The reboot strategy can also be configured through a Container Linux Config.
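For example, a minimal Container Linux Config snippet (sketch) setting the same strategy:

locksmith:
  reboot_strategy: "etcd-lock"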
The default strategy is to follow the etcd-lock strategy if etcd is running, and to otherwise follow the reboot strategy.
locksmithctl is a simple client that can be used to introspect and control the lock used by locksmith. It is installed by default on Flatcar Container Linux.
Run locksmithctl -help for a list of command-line options.
All command-line options can also be specified using environment variables with a LOCKSMITHCTL_ prefix. For example, the -endpoint argument can be set using LOCKSMITHCTL_ENDPOINT .
Connecting to multiple endpoints
Multiple endpoints can be specified by passing the -endpoint= option for each endpoint, or by passing a comma-separated list of endpoints, e.g.:
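locksmithctl -endpoint=http://etcd-1.example.com:2379 -endpoint=http://etcd-2.example.com:2379 status
locksmithctl -endpoint=http://etcd-1.example.com:2379,http://etcd-2.example.com:2379 status
(The endpoint URLs above are placeholders for your etcd members.)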
Specifying multiple endpoints using an environment variable is supported by passing a comma-delimited list, e.g.:
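LOCKSMITHCTL_ENDPOINT=http://etcd-1.example.com:2379,http://etcd-2.example.com:2379 locksmithctl status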
Listing the Holders
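The status command shows the current semaphore state and the machine IDs of any holders:

locksmithctl status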
In some cases a machine may go away permanently or semi-permanently while holding a reboot lock. A system administrator can clear the lock of a specific machine using the unlock command:
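locksmithctl unlock <machine-id>
(Replace <machine-id> with the ID reported by locksmithctl status.)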
By default the reboot lock only allows a single holder. However, a user may want more than a single machine to be upgrading at a time. This can be done by increasing the semaphore count.
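For example, to allow two machines to hold the reboot lock at the same time:

locksmithctl set-max 2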
locksmithd coordinates the reboot lock in groups of machines. The default group is "", the empty string. locksmithd will only coordinate the reboot lock with other machines in the same group.
The purpose of groups is to allow faster updating of certain sets of machines while maintaining availability of certain services. For example, in a cluster of 5 Flatcar hosts with all machines in the default group, if you have 2 load balancers and run locksmithctl set-max 2, then it is possible that both load balancers would be rebooted at the same time, interrupting the service they provide. However, if the load balancers are put into their own group named "lb", and both the default group and the "lb" group have a max holder of 1, two reboots can occur at once, but both load balancers will never reboot at the same time.
To place machines in a group other than the default, locksmithd must be started with the -group=groupname flag or set the LOCKSMITHD_GROUP=groupname environment variable.
To control the semaphore of a group other than the default, you must invoke locksmithctl with the -group=groupname flag or set the LOCKSMITHCTL_GROUP=groupname environment variable.
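For example, to cap a group named "lb" at a single concurrent reboot and inspect it:

locksmithctl -group=lb set-max 1
locksmithctl -group=lb status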
locksmithd can be configured to only reboot during certain timeframes. These reboot windows work with any reboot strategy.
The reboot window is configured through two environment variables, LOCKSMITHD_REBOOT_WINDOW_START and LOCKSMITHD_REBOOT_WINDOW_LENGTH . Note that REBOOT_WINDOW_START and REBOOT_WINDOW_LENGTH are also acceptable. Here is an example configuration:
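REBOOT_WINDOW_START=14:00
REBOOT_WINDOW_LENGTH=1h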
This would configure locksmithd to only reboot between 2pm and 3pm. Optionally, a day of week may be specified for the start of the window:
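REBOOT_WINDOW_START="Thu 23:00"
REBOOT_WINDOW_LENGTH=1h30m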
This would configure locksmithd to only reboot the system on Thursday after 11pm, or on Friday before 12:30am.
Currently, the only supported values for the day of week are the short day names Sun, Mon, Tue, Wed, Thu, Fri, and Sat; the day of week can be upper or lower case. The time of day must be specified in 24-hour format. The window length is expressed as input to Go's time.ParseDuration function.
The following section describes how locksmith works under the hood.
locksmith uses a semaphore in etcd, located at the key coreos.com/updateengine/rebootlock/semaphore , to coordinate the reboot lock. If a non-default group name is used, the etcd key will be coreos.com/updateengine/rebootlock/groups/$groupname/semaphore .
The semaphore is a JSON document, describing a simple semaphore, that clients swap to take the lock.
When it is first created it will be initialized like so:
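A sketch of the initial document (additional fields such as max and holders may also be present, depending on the locksmith version):

{
  "semaphore": 1
}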
For a client to take the lock, the document is swapped with this:
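{
  "semaphore": 0,
  "max": 1,
  "holders": [
    "<machine-id>"
  ]
}
(The machine ID is a placeholder for the holder's actual ID.)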
Please use the Flatcar issue tracker to report all bugs, issues, and feature requests.
2905.2.0 docker containers don’t start. #457
dabeck commented Jul 29, 2021
Description
Cluster containers with securityContext don’t start anymore:
Impact
All containers with a securityContext setting fail to start.
Environment and steps to reproduce
- Set-up: RKE Cluster with Flatcar 2905.2.0 nodes
- Task: Update from previous release 2765.2.6
- Action(s):
a. Update downloads and installs automatically
b. Node reboots
- Error: Containers won't start anymore.
Expected behavior
Containers should start.
Additional information
IMHO this is related to this change:
Docker: disabled SELinux support in the Docker daemon
tormath1 commented Jul 29, 2021
@dabeck hi and thanks for raising this issue!
Could you share the securityContext applied to your containers, so we can reproduce / investigate locally? 🙂
dabeck commented Jul 29, 2021
This, for example, is from the linkerd sidecar containers:
And this is from CoreDNS:
Both failed after the upgrade.
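A representative securityContext of the kind discussed here, modeled on the upstream CoreDNS defaults and shown purely as an illustration (the exact values used in this cluster may differ):

securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    add:
    - NET_BIND_SERVICE
    drop:
    - all
  readOnlyRootFilesystem: true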
tormath1 commented Jul 29, 2021
Could we assert that Docker is not running with SELinux mode enabled?
I can reproduce the issue if Docker is started with SELinux support:
dabeck commented Jul 29, 2021
Thank you. You are right. Our nodes report this:
Unfortunately I'm not sure why this is enabled even on a freshly set up Flatcar node!
Do you have any advice? Our cloud-config (yes we’re actually still using cloud-config) has the following config for docker:
tormath1 commented Jul 29, 2021
@dabeck then we have a lead!
I would say that --live-restore could be the root cause, but it should not preserve existing configuration for the daemon, only for containers. Could we assert that there is no configuration in /etc/docker/?
dabeck commented Jul 29, 2021 •
@tormath1 affirmative. /etc/docker has just a few certificates inside.
EDIT: As soon as I manually add --selinux-enabled=false to the cloud-config it works. 🤔
tormath1 commented Jul 29, 2021
As soon as I manually add --selinux-enabled=false to the cloud-config it works
Perfect, because that should be the actual behavior of the stable release you pulled (see this commit: kinvolk/coreos-overlay@956f975).
affirmative. /etc/docker has just a few certificates inside.
OK! Then we could check that there is no other remaining drop-in that could override the default behavior. What's the output of systemctl cat docker.service?
dabeck commented Jul 29, 2021
OK! Then we could check that there is no other remaining drop-in that could override the default behavior. What's the output of systemctl cat docker.service?
dabeck commented Jul 29, 2021 •
Maybe I found something.
/usr/lib/coreos/dockerd has selinux-enabled set as a default option.
Since we use docker-machine to provision our nodes, and docker-machine recognizes Flatcar as CoreOS, it uses this file, as shown in the drop-in 10-machine.conf.
dabeck commented Jul 29, 2021
Possible solution: Flatcar should update /usr/lib/coreos/dockerd, which is deprecated but should keep working since it is used by docker-machine.
Our workaround is to set the --selinux-enabled=false option by hand.
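A minimal sketch of that workaround as a cloud-config drop-in, assuming the stock docker.service honors $DOCKER_OPTS and that no other drop-in overrides ExecStart:

#cloud-config
coreos:
  units:
    - name: docker.service
      drop-ins:
        - name: 20-selinux.conf
          content: |
            [Service]
            Environment="DOCKER_OPTS=--selinux-enabled=false"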
Microsoft buys Flatcar Container Linux creator Kinvolk
Microsoft has acquired Kinvolk and plans to bring the team to Azure to work on Azure Kubernetes Service, Azure Arc and other hybrid container platform capabilities.
Microsoft
Microsoft has acquired Kinvolk GmbH, the creator and distributor of Flatcar Container Linux, for an undisclosed amount, officials announced on April 29. Flatcar Container Linux is a Linux distribution designed for container workloads, which Kinvolk says has high security and low maintenance overhead.
Microsoft’s Brendan Burns, Corporate Vice President of Azure Compute, said the Kinvolk team will join Azure and contribute to the Azure Kubernetes Service (AKS), Azure Arc management platform and «future projects that will expand Azure’s hybrid container platform capabilities.» He added that the Kinvolk team will continue with their existing open-source projects, including the evolution of Flatcar Container Linux.
Burns said that Flatcar Container Linux already has a sizeable community of users on Azure, along with other clouds, and on-premises.
Kinvolk also did some early work with CoreOS (the company, not Microsoft's Windows Core OS platform) and created the Lokomotive and Inspektor Gadget projects. Lokomotive is a self-hosted Kubernetes distribution for bare-metal and cloud platforms that can run as a full-stack Kubernetes cluster on top of Flatcar Container Linux or another Kubernetes offering. Inspektor Gadget is a set of debugging and inspection tools.
Kinvolk was founded in Berlin in 2015. It launched Flatcar Container Linux in 2018.
Microsoft is no stranger to open source these days. More than half of the workloads on Azure are Linux-based. And Microsoft even has its own Linux distribution called CBL-Mariner, which is a lightweight Linux distro that Microsoft uses for its own first-party Azure services and edge appliances.
Flatcar Container Linux by Kinvolk
Mantle: Gluing Container Linux together
This repository is a collection of utilities for developing Container Linux. Most of the tools are for uploading, running, and interacting with Container Linux instances running locally or in a cloud.
Mantle is composed of many utilities:
- cork for handling the Container Linux SDK
- gangue for downloading from Google Storage
- kola for launching instances and running tests
- kolet an agent for kola that runs on instances
- ore for interfacing with cloud providers
- plume for releasing Container Linux
All of the utilities support the help command to get a full listing of their subcommands and options.
Cork is a tool for working with Container Linux images and the SDK.
Download and unpack the Container Linux SDK.
Enter the SDK chroot, and optionally run a command. The command and its arguments can be given after --.
cork enter -- repo sync
Download a Container Linux image into $PWD/.cache/images .
cork download-image --platform=qemu
Building Container Linux with cork
See Modifying Container Linux for an example of using cork to build a Container Linux image.
Gangue is a tool for downloading and verifying files from Google Storage with authenticated requests. It is primarily used by the SDK.
Get a file from Google Storage and verify it using GPG.
Kola is a framework for testing software integration in Container Linux instances across multiple platforms. It is primarily designed to operate within the Container Linux SDK for testing software that has landed in the OS image. Ideally, all software needed for a test should be included by building it into the image from the SDK.
Kola supports running tests on multiple platforms, currently QEMU, GCE, AWS, VMware vSphere, Packet, and OpenStack. In the future systemd-nspawn and other platforms may be added. Machines on cloud platforms do not have direct access to the machine running kola, so tests may depend on Internet services such as discovery.etcd.io or quay.io instead.
Kola outputs assorted logs and test data to _kola_temp for later inspection.
Kola is still under heavy development and it is expected that its interface will continue to change.
By default, kola uses the qemu platform with the most recently built image (assuming it is run from within the SDK).
Getting started with QEMU
The easiest way to get started with kola is to run a qemu test.
requirements:
- IPv4 forwarding (to provide internet access to the instance): sudo sysctl -w net.ipv4.ip_forward=1
- dnsmasq, go, and iptables installed and present in the $PATH
- qemu-system-x86_64 and / or qemu-system-aarch64 to test amd64 and / or arm64 respectively
From the pulled sources, kola and kolet must be compiled:
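For example, from the repository root (assuming the standard build script is present):

./build kola kolet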
Finally, a Flatcar image must be available on the system:
- from a locally built image
- from an official release
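For example, the latest alpha AMD64 production image can be fetched and decompressed like this (URL and file name follow the Flatcar release server layout and may change):

wget https://alpha.release.flatcar-linux.net/amd64-usr/current/flatcar_production_image.bin.bz2
bunzip2 flatcar_production_image.bin.bz2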
Run tests for AMD64
Example with the latest alpha release:
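A sketch of such a run, assuming kola and kolet were built as above and the decompressed image sits in the current directory (flag spellings may differ slightly between mantle versions):

sudo ./bin/kola run --board=amd64-usr --platform=qemu --qemu-image=./flatcar_production_image.bin --parallel=2

A test name or glob pattern can be appended to run only a subset of the tests.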
Run tests for ARM64
Example with the latest alpha release:
Note for both architectures:
- sudo is required because we need to create some iptables rules to provide QEMU Internet access
- using --remove=false -d, it's possible to keep the instances running (even after the test) and identify the PID of the QEMU instances to SSH into (running processes must be killed once the action is done)
- using --key, it's possible to SSH into the created instances; identifying the PID of the QEMU instance is required:
The list command lists all of the available tests.
The spawn command launches Container Linux instances.
The mkimage command creates a copy of the input image with its primary console set to the serial port (/dev/ttyS0). This causes more output to be logged on the console, which is also logged in _kola_temp . This can only be used with QEMU images and must be used with the coreos_*_image.bin image, not the coreos_*_qemu_image.img .
The bootchart command launches an instance then generates an svg of the boot process using systemd-analyze .
The updatepayload command launches a Container Linux instance then updates it by sending an update to its update_engine. The update is the coreos_*_update.gz in the latest build directory.
kola subtest parallelization
Subtests can be parallelized by adding c.H.Parallel() at the top of the inline function given to c.Run. It is not recommended to use the FailFast flag in tests that use this functionality, as it can have unintended results.
kola test namespacing
The top-level namespace of tests should fit into one of the following categories:
- Groups of tests targeting specific packages/binaries may use that namespace (ex: docker.* )
- Tests that target multiple supported distributions may use the coreos namespace.
- Tests that target singular distributions may use the distribution’s namespace.
kola test registration
Registering kola tests currently requires that the tests are registered under the kola package and that the test function itself lives within the mantle codebase.
Groups of similar tests are registered in an init() function inside the kola package. Register(*Test) is called per test. A kola Test struct requires a unique name, and a single function that is the entry point into the test. Additionally, userdata (such as a Container Linux Config) can be supplied. See the Test struct in kola/register/register.go for a complete list of options.
kola test writing
A kola test is a go function that is passed a platform.TestCluster to run code against. Its signature is func(platform.TestCluster) and must be registered and built into the kola binary.
A TestCluster implements the platform.Cluster interface and will give you access to a running cluster of Container Linux machines. A test writer can interact with these machines through this interface.
To see test examples look under kola/tests in the mantle codebase.
For a quickstart see kola/README.md.
kola native code
For some tests, the Cluster interface is limited and it is desirable to run native Go code directly on one of the Container Linux machines. This is currently possible by using the NativeFuncs field of a kola Test struct. This is like a limited RPC interface.
NativeFuncs is used similar to the Run field of a registered kola test. It registers and names functions in nearby packages. These functions, unlike the Run entry point, must be manually invoked inside a kola test using a TestCluster ‘s RunNative method. The function itself is then run natively on the specified running Container Linux instances.
For more examples, look at the coretest suite of tests under kola. These tests were ported into kola and make heavy use of the native code interface.
The platform.Manhole() function creates an interactive SSH session which can be used to inspect a machine during a test.
kolet is run on kola instances to run native functions in tests. Generally kolet is not invoked manually.
Ore provides a low-level interface for each cloud provider. It has commands related to launching instances on a variety of platforms (gcloud, aws, azure, esx, and packet) within the latest SDK image. Ore mimics the underlying API of each cloud provider closely, so the interface for each cloud provider is different. See each provider's help command for the available actions.
Note that when uploading to some cloud providers (e.g. gce) the image may need to be packaged with a different --format (e.g. --format=gce) when running image_to_vm.sh.
Plume is the Container Linux release utility. Releases are done in two stages, each with their own command: pre-release and release. Both of these commands are idempotent.
The pre-release command does as much of the release process as possible without making anything public. This includes uploading images to cloud providers (except those like gce which don’t allow us to upload images without making them public).
Publish a new Container Linux release. This makes the images uploaded by pre-release public and uploads images that pre-release could not. It copies the release artifacts to public storage buckets and updates the directory index.
Generate and upload index.html objects to turn a Google Cloud Storage bucket into a publicly browsable file tree. Useful if you want something like Apache’s directory index for your software download repository. Plume release handles this as well, so it does not need to be run as part of the release process.
Each platform reads the credentials it uses from different files. The aws, azure, do, esx, and packet platforms support selecting from multiple configured credentials, called "profiles". The examples below are for the "default" profile, but other profiles can be specified in the credentials files and selected via the corresponding command-line flag.
The aws platform reads the ~/.aws/credentials file used by Amazon's aws command-line tool. It can be created using the aws command:
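aws configure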
To configure a different profile, use the --profile flag:
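aws configure --profile other_profile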
The ~/.aws/credentials file can also be populated manually:
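[default]
aws_access_key_id = <access key id>
aws_secret_access_key = <secret access key>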
To install the aws command in the SDK, run:
The azure platform uses the ~/.azure/azureProfile.json file. This can be created using the az command:
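az login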
It also requires that the environment variable AZURE_AUTH_LOCATION points to a JSON file (this can also be set via the --azure-auth parameter). The JSON file will require a service provider Active Directory account to be created.
Service provider accounts can be created via the az command (the output will contain an appId field which is used as the clientId variable in the AZURE_AUTH_LOCATION JSON):
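az ad sp create-for-rbac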
The client secret can be created inside of the Azure portal when looking at the service provider account under the Azure Active Directory service on the App registrations tab.
You can find your subscriptionId & tenantId in the ~/.azure/azureProfile.json file created earlier.
The JSON file exported to the variable AZURE_AUTH_LOCATION should be generated by hand and have the following contents:
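A minimal sketch with the fields referenced above; the exact set of keys mantle expects may include additional endpoint URLs:

{
  "clientId": "<service provider appId>",
  "clientSecret": "<service provider secret>",
  "subscriptionId": "<subscription id>",
  "tenantId": "<tenant id>"
}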