---
title: "LXD: Containers for Human Beings"
subtitle: "Docker's great and all, but I prefer the workflow of interacting with VMs"
date: 2023-08-11T16:30:00-04:00
categories:
  - Technology
tags:
  - Sysadmin
  - Containers
  - VMs
  - Docker
  - LXD
draft: true
toc: true
rss_only: false
cover: ./cover.png
---

This is a blog post version of a talk I presented at both Ubuntu Summit 2022 and
SouthEast LinuxFest 2023. The first was not recorded, but the second was and is
on [SELF's PeerTube instance.][selfpeertube] I apologise for the terrible audio,
but there's unfortunately nothing I can do about that. If you're already
intimately familiar with the core concepts of VMs or containers, I would suggest
skipping those respective sections. If you're vaguely familiar with either, I
would recommend reading them because I do go a little bit in-depth.

[selfpeertube]: https://peertube.linuxrocks.online/w/hjiTPHVwGz4hy9n3cUL1mq?start=1m

{{< adm type="warn" >}}

**Note:** Canonical has decided to [pull LXD out][lxd] from under the Linux
Containers entity and instead continue development under the Canonical brand.
The majority of the LXD creators and developers have congregated around a fork
called [Incus.][inc] I'll be keeping a close eye on the project and intend to
migrate as soon as there's an installable release.

[lxd]: https://linuxcontainers.org/lxd/
[inc]: https://linuxcontainers.org/incus/

{{< /adm >}}

## The benefits of VMs and containers

- **Isolation:** you don't want to allow an attacker to infiltrate your email
  server through your web application; the two should be completely separate
  from each other and VMs/containers provide strong isolation guarantees.
- **Flexibility:** <abbr title="Virtual Machines">VMs</abbr> and containers only
  use the resources they've been given. If you tell the VM it has 200 MB of
  RAM, it's going to make do with 200 MB of RAM and the kernel's <abbr
  title="Out Of Memory">OOM</abbr> killer is going to have a fun time 🤠
- **Portability:** once set up and configured, VMs and containers can mostly be
  treated as closed boxes; as long as the surrounding environment of the new
  host is similar to the previous in terms of communication (proxies, web
  servers, etc.), they can just be picked up and dropped between various hosts
  as necessary.
- **Density:** applications are usually much lighter than the systems they're
  running on, so it makes sense to run many applications on one system. VMs and
  containers facilitate that without sacrificing security.
- **Cleanliness:** VMs and containers are applications in black boxes. When
  you're done with the box, you can just throw it away and most everything
  related to the application is gone.

## Virtual machines

As the name suggests, Virtual Machines are all virtual; a hypervisor creates
virtual disks for storage, virtual <abbr title="Central Processing
Units">CPUs</abbr>, virtual <abbr title="Network Interface Cards">NICs</abbr>,
virtual <abbr title="Random Access Memory">RAM</abbr>, etc. On top of the
virtualised hardware, you have your kernel. This is what facilitates
communication between the operating system and the (virtual) hardware. Above
that is the operating system and all your applications.

At this point, the stack is quite large; VMs aren't exactly lightweight, and
this impacts how densely you can pack the host.

I mentioned a "hypervisor" a minute ago. I've explained what hypervisors in
general do, but there are actually two different kinds of hypervisor. They're
creatively named **Type 1** and **Type 2**.

### Type 1 hypervisors

These run directly in the host kernel without an intermediary OS. A good example
would be [KVM,][kvm] a **VM** hypervisor that runs in the **K**ernel. Type 1
hypervisors can communicate directly with the host's hardware to allocate RAM,
issue instructions to the CPU, etc.

[debian]: https://debian.org
[kvm]: https://www.linux-kvm.org
[vb]: https://www.virtualbox.org/

```kroki {type=d2,d2theme=flagship-terrastruct,d2sketch=true}
hk: Host kernel
hk.h: Type 1 hypervisor
hk.h.k1: Guest kernel
hk.h.k2: Guest kernel
hk.h.k3: Guest kernel
hk.h.k1.os1: Guest OS
hk.h.k2.os2: Guest OS
hk.h.k3.os3: Guest OS
hk.h.k1.os1.app1: Many apps
hk.h.k2.os2.app2: Many apps
hk.h.k3.os3.app3: Many apps
```

### Type 2 hypervisors

These run in userspace as an application, like [VirtualBox.][vb] Type 2
hypervisors have to first go through the operating system, adding an additional
layer to the stack.

```kroki {type=d2,d2theme=flagship-terrastruct,d2sketch=true}
hk: Host kernel
hk.os: Host OS
hk.os.h: Type 2 hypervisor
hk.os.h.k1: Guest kernel
hk.os.h.k2: Guest kernel
hk.os.h.k3: Guest kernel
hk.os.h.k1.os1: Guest OS
hk.os.h.k2.os2: Guest OS
hk.os.h.k3.os3: Guest OS
hk.os.h.k1.os1.app1: Many apps
hk.os.h.k2.os2.app2: Many apps
hk.os.h.k3.os3.app3: Many apps
```

## Containers

VMs use virtualisation to achieve isolation. Containers use **namespaces** and
**cgroups**, technologies pioneered in the Linux kernel. By now, though, there
are [equivalents for Windows] and possibly other platforms.

[equivalents for Windows]: https://learn.microsoft.com/en-us/virtualization/community/team-blog/2017/20170127-introducing-the-host-compute-service-hcs

**[Linux namespaces]** partition kernel resources like process IDs, hostnames,
user IDs, directory hierarchies, network access, etc. This prevents one
collection of processes from seeing or gaining access to data regarding another
collection of processes.

**[Cgroups]** limit, track, and isolate the hardware resource use of a
collection of processes. If you tell a cgroup that it's only allowed to spawn
500 child processes and someone executes a fork bomb, the fork bomb will expand
until it hits that limit. The kernel will prevent it from spawning further
children and you'll have to resolve the issue the same way you would with VMs:
delete and re-create it, restore from a good backup, etc. You can also limit CPU
use, the number of CPU cores it can access, RAM, disk use, and so on.

[Linux namespaces]: https://en.wikipedia.org/wiki/Linux_namespaces
[Cgroups]: https://en.wikipedia.org/wiki/Cgroups

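If you want to see these two mechanisms for yourself, the kernel exposes both
through `/proc`. The snippet below is a minimal sketch, assuming a Linux system
with `/proc` mounted; `show_isolation` is a helper name I made up for
illustration, not part of LXD or any other tool.

```shell
# show_isolation is a hypothetical helper, not part of any real tool.
# It prints the namespaces and cgroup membership of a process
# (default: the current process).
show_isolation() {
    pid="${1:-self}"
    echo "namespaces of ${pid}:"
    # Each entry is a symlink such as "pid:[4026531836]"; two processes that
    # show the same number for a given type share that namespace.
    ls -l "/proc/${pid}/ns"
    echo "cgroup membership of ${pid}:"
    # On a cgroup v2 system this is a single line starting with "0::".
    cat "/proc/${pid}/cgroup"
}

show_isolation
```

Running it once on the host and once inside a container should show different
namespace numbers and a different cgroup path, which is the isolation described
above.
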
### Application containers

The most well-known example of application container tech is probably
[Docker.][docker] The goal here is to run a single application as minimally as
possible inside each container. In the case of a single, statically-linked Go
binary, a minimal Docker container might contain nothing more than the binary.
If it's a Python application, you're more likely to use an [Alpine Linux image]
and add your Python dependencies on top of that. If a database is required, that
goes in a separate container. If you've got a web server to handle TLS
termination and proxy your application, that's a third container. One cohesive
system might require many Docker containers to function as intended.

[docker]: https://docker.com/
[Alpine Linux image]: https://hub.docker.com/_/alpine

```kroki {type=d2,d2theme=flagship-terrastruct,d2sketch=true}
Host kernel.Container runtime.c1: Container
Host kernel.Container runtime.c2: Container
Host kernel.Container runtime.c3: Container

Host kernel.Container runtime.c1.One app
Host kernel.Container runtime.c2.Few apps
Host kernel.Container runtime.c3.Full OS.Many apps
```

### System containers

One of the most well-known examples of system container tech is the subject of
this post: LXD! Rather than containing a single application or a very small set
of them, system containers are designed to house entire operating systems, like
[Debian] or [Rocky Linux,][rocky] along with everything required for your
application. Using our examples from above, a single statically-linked Go binary
might run in a full Debian container, just like the Python application might.
The database and webserver might go in _that same_ container.

[Debian]: https://www.debian.org/
[rocky]: https://rockylinux.org/

You treat each container more like you would a VM, but you get the performance
benefit of _not_ virtualising everything. Containers are _much_ lighter than any
virtual machine.

```kroki {type=d2,d2theme=flagship-terrastruct,d2sketch=true}
hk: Host kernel
hk.c1: Container
hk.c2: Container
hk.c3: Container
hk.c1.os1: Full OS
hk.c2.os2: Full OS
hk.c3.os3: Full OS
hk.c1.os1.app1: Many apps
hk.c2.os2.app2: Many apps
hk.c3.os3.app3: Many apps
```

## When to use which

These are personal opinions. Please evaluate each technology and determine for
yourself whether it's a suitable fit for your environment.

### VMs

As far as I'm aware, VMs are your only option when you want to work with
esoteric hardware or hardware you don't physically have on-hand. You can tell
your VM that it's running with RAM that's 20 years old, a still-in-development
RISC-V CPU, and a 420p monitor. That's not possible with containers. VMs are
also your only option when you want to work with foreign operating systems:
running Linux on Windows, Windows on Linux, or OpenBSD on a Mac all require
virtualisation. Another reason to stick with VMs is for compliance purposes.
Containers are still very new and some regulatory bodies require virtualisation
because it's a decades-old and battle-tested isolation technique.

{{< adm type="note" >}}
See Drew DeVault's blog post [_In praise of qemu_][qemu] for a great use of VMs.

[qemu]: https://drewdevault.com/2022/09/02/2022-09-02-In-praise-of-qemu.html

{{< /adm >}}

### Application containers

Application containers are particularly popular for [microservices] and
[reproducible builds,][repb] though I personally think [NixOS] is a better fit
for the latter. App containers are also your only option if you want to use
cloud platforms with extreme scaling capabilities like Google Cloud's App Engine
standard environment or AWS's Fargate.

[microservices]: https://en.wikipedia.org/wiki/Microservices
[repb]: https://en.wikipedia.org/wiki/Reproducible_builds
[NixOS]: https://nixos.org/

Application containers also tend to be necessary when the application you want
to self-host is _only_ distributed as a Docker image and the maintainers
adamantly refuse to support any other deployment method. This is a _massive_ pet
peeve of mine; yes, Docker can make running self-hosted applications easier for
inexperienced individuals,[^1] but an application orchestration system _does
not_ fit in every single environment. By refusing to provide proper "manual"
deployment instructions, maintainers of these projects alienate an entire class
of potential users and it pisses me off.

Just document your shit.

### System containers

Personally, I use system containers for everything else. I prefer the simplicity
of being able to shell into a system and work with it almost exactly

## Crash course to LXD

### Installation

{{< adm type="note" >}}

**Note:** the instructions below say to install LXD using [Snap.][snap] I
personally dislike Snap, but LXD is a Canonical product and they're doing their
best to promote it as much as possible. One of the first things the Incus
project did was [rip out Snap support,][rsnap] so it will eventually be
installable as a proper native package.

[snap]: https://en.wikipedia.org/wiki/Snap_(software)
[rsnap]: https://github.com/lxc/incus/compare/9579f65cd0f215ecd847e8c1cea2ebe96c56be4a...3f64077a80e028bb92b491d42037124e9734d4c7

{{< /adm >}}

1. Install snap following [Canonical's tutorial](https://earl.run/ZvUK)
   - LXD is natively packaged for Arch and Alpine, but configuration can be a
     massive headache.
2. `sudo snap install lxd`
3. `lxd init`
   - Defaults are fine for the most part; you may want to increase the size of
     the storage pool.
4. `lxc launch images:debian/12 container-name`
5. `lxc shell container-name`

### Usage

As an example of how to use LXD in a real situation, we'll set up [my URL
shortener.][earl] You'll need a VPS with LXD installed and a (sub)domain pointed
to the VPS.

Run `lxc launch images:debian/12 earl` followed by `lxc shell earl` and `apt
install curl`. Also `apt install` a text editor, like `vim` or `nano` depending
on what you're comfortable with. Head to the **Installation** section of [earl's
SourceHut page][earl] and expand the **List of latest binaries**. Copy the link
to the binary appropriate for your platform, head back to your terminal, type
`curl -LO`, and paste the link you copied. This will download the binary to your
system. Run `mv <filename> earl` to rename it, `chmod +x earl` to make it
executable, then `./earl` to execute it. It will create a file called
`config.yaml` that you need to edit before proceeding. Change the `accessToken`
to something else and replace the `listen` value, `127.0.0.1`, with `0.0.0.0`.
This exposes the application to the host system so we can reverse proxy it.

[earl]: https://earl.run/source

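If you'd rather not open an editor just to change the listen address, the
substitution can be scripted. This is a sketch that assumes the generated
`config.yaml` contains the literal address `127.0.0.1` as described above and
that you're running it from the directory holding the file; `expose_listener`
is a made-up helper name, and you still need to change the `accessToken`
yourself.

```shell
# expose_listener is a hypothetical helper: it rewrites every literal
# 127.0.0.1 in the given file to 0.0.0.0, editing the file in place.
# Assumes GNU sed, which is what Debian ships.
expose_listener() {
    sed -i 's/127\.0\.0\.1/0.0.0.0/g' "$1"
}

# Only touch config.yaml if it exists in the current directory.
if [ -f config.yaml ]; then
    expose_listener config.yaml
fi
```
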
The next step is daemonising it so it runs as soon as the system boots. Edit the
file located at `/etc/systemd/system/earl.service` and paste the following code
snippet into it.

```ini
[Unit]
Description=personal link shortener
After=network.target

[Service]
User=root
Group=root
WorkingDirectory=/root/
ExecStart=/root/earl -c config.yaml

[Install]
WantedBy=multi-user.target
```

Save, then run `systemctl daemon-reload` followed by `systemctl enable --now
earl`. You should be able to `curl localhost:8275` and see some HTML.

Now we need a reverse proxy on the host. Exit the container with `exit` or
`Ctrl+D`, and if you have a preferred webserver, install it. If you don't have a
preferred webserver yet, I recommend [installing Caddy.][caddy] All that's left
is running `lxc list`, making note of the `earl` container's `IPv4` address, and
reverse proxying it. If you're using Caddy, edit `/etc/caddy/Caddyfile` and
replace everything that's there with the following.

[caddy]: https://caddyserver.com/docs/install

```text
<(sub)domain> {
    encode zstd gzip
    reverse_proxy <container IP address>:8275
}
```

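Copying the address out of the `lxc list` table by hand works, but it can also
be scripted. The sketch below assumes `lxc list` prints the IPv4 column as
`<address> (<interface>)` when asked for CSV output and that the container has
a single address; `first_ipv4` is a made-up helper name.

```shell
# first_ipv4 is a hypothetical helper: it reads lines such as
# "10.0.0.5 (eth0)" on stdin and prints the address from the first one.
first_ipv4() {
    head -n 1 | cut -d ' ' -f 1
}

# Ask LXD for the earl container's IPv4 column only (-c 4) as CSV.
# Skipped when the lxc client isn't installed.
if command -v lxc >/dev/null 2>&1; then
    lxc list earl --format csv -c 4 | first_ipv4
fi
```
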
Run `systemctl restart caddy` and head to whatever domain or subdomain you
entered. You should see the home page with just the text `earl` on it. If you go
to `/login`, you'll be able to enter whatever access token you set earlier and
log in.

### Executing a fork bomb

I've seen some people say that executing a fork bomb from inside a container is
equivalent to executing it on the host. The fork bomb will blow up the whole
system and render every application and container you're running inoperable.

That's partially true because LXD _by default_ doesn't put a limit on how many
processes a particular container can spawn. You can limit that number yourself
by running

```text
lxc profile set default limits.processes <num-processes>
```

Any container you create under the `default` profile will have a total process
limit of `<num-processes>`. I can't tell you what a good process limit is
though; you'll need to do some testing and experimentation on your own.

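The limit doesn't have to live in the `default` profile; as far as I know,
`limits.processes` can also be set on a single container with `lxc config set`.
A small sketch, where `limit_processes` is a made-up wrapper and `earl` and
`500` are only example values:

```shell
# limit_processes is a hypothetical wrapper around `lxc config set`.
# Usage: limit_processes <container> <max-processes>
limit_processes() {
    container="$1"
    limit="$2"
    # Reject anything that isn't a whole number before touching LXD.
    case "$limit" in
        ''|*[!0-9]*)
            echo "limit must be a whole number, got '${limit}'" >&2
            return 1
            ;;
    esac
    lxc config set "$container" limits.processes "$limit"
}

# Example: cap the earl container at 500 processes, then read the value back.
# limit_processes earl 500
# lxc config get earl limits.processes
```
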
Note that this doesn't _save_ you from fork bombs; all it does is prevent an
affected container from affecting _other_ containers. If someone executes a fork
bomb in a container, it'll be the same as if they executed it in a virtual
machine; assuming it's a one-off, you'll need to fix it by rebooting the
container. If it was set to run at startup, you'll need to recreate the
container, restore from a backup, revert to a snapshot, etc.

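For the snapshot route, it helps to take the snapshot while the container is
still healthy. The sketch below assumes a container named `earl` and a snapshot
named `clean`, both just examples; `run`, `snapshot_container`, and
`recover_container` are made-up helpers, and setting `DRY_RUN=1` only prints
the `lxc` commands instead of running them.

```shell
# Hypothetical helpers for illustration; none of these ship with LXD.

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Take a named snapshot while the container is known to be healthy.
# Usage: snapshot_container <container> <snapshot>
snapshot_container() {
    run lxc snapshot "$1" "$2"
}

# Force-stop a runaway container, roll it back, and start it again.
# Usage: recover_container <container> <snapshot>
recover_container() {
    run lxc stop "$1" --force &&
    run lxc restore "$1" "$2" &&
    run lxc start "$1"
}

# Example:
# snapshot_container earl clean
# recover_container earl clean
```
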
[^1]:
    Until they need to do _anything_ more complex than pull a newer image. Then
    it's twice as painful as the "manual" method might have been.