Background

There are a bunch of services lying around on my homelab. Over time, the accumulated complexity, ops load, reliability risks, and security risks started to harm my mental health. I figured it was probably time to properly tidy up my homelab and have it actually managed. My ideal setup would be secure and reasonably reliable, at a low ops cost. More specifically, I would like to:

  • Have some services running
  • Have some declarative way to manage them
  • Have some isolation on what each service can access
  • Have automated updates

Very simple and innocent requirements, right? Surely the solution ought to be simple too, shouldn’t it?

Disclaimer: you will see way more πŸ‘Ž than πŸ‘ below. This is not to suggest those solutions are “bad” but to emphasize the gaps between them and my ideal state. You can assume the aspects I didn’t mention are full of πŸ‘.

Options

NixOS

NixOS feels like a very natural choice when it comes to declarative management of the whole system. nixpkgs also offers lots of packaged services to choose from. Setting things up is generally convenient and straightforward, but issues do exist.

πŸ‘ Auto upgrade

NixOS has built-in support for auto upgrades through system.autoUpgrade.
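
For illustration, a minimal sketch (the flake reference and schedule are placeholders):

    # Pull and switch to a new system build on a schedule.
    system.autoUpgrade = {
      enable = true;
      flake = "github:example/my-homelab";  # hypothetical flake reference
      dates = "daily";                      # systemd calendar expression
      allowReboot = false;                  # switch in place, don't reboot automatically
    };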

πŸ€” Package availability

Apparently, not everything is currently offered in nixpkgs or through flakes. We will either need to maintain the package ourselves, which is a lot of work, or find some alternative options.

πŸ€” Indirection

Typically, NixOS offers its own way to configure programs, wrapping around the stock configuration interface of each package. This provides consistency in the configuration language, allows some config generation through the Nix language, and usually makes simple cases simple. However, such a wrapper comes at the cost of reduced flexibility and an extra layer of indirection, whose overhead may become significant as usage becomes complex. The official documentation and community solutions don’t directly apply to the nixpkgs-wrapped interface. One may constantly have to refer to the Nix module implementation to translate the desired configuration into Nix, which can be an ongoing cost. Even worse, not all possible configurations can be expressed through the nixpkgs-wrapped interface. While most services do allow raw configuration to be supplied, switching to it may imply rewriting everything already invested in the nixpkgs-wrapped interface, and it can be a heavily degraded experience if the underlying software consumes complex configuration spread across multiple files (e.g. freeradius).
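
To make the trade-off concrete, here is a sketch with nginx: the wrapped options are pleasant for the simple case, while anything the wrapper doesn’t model falls back to a raw escape hatch, i.e. plain nginx syntax again (the host name and rule are made up):

    # Simple cases are pleasant through the nixpkgs wrapper...
    services.nginx = {
      enable = true;
      virtualHosts."example.internal" = {   # hypothetical host name
        locations."/".proxyPass = "http://127.0.0.1:8080";
      };
      # ...while anything the wrapper doesn't model drops back to raw nginx syntax.
      appendHttpConfig = ''
        limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;
      '';
    };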

πŸ‘Ž Isolation

While many nixpkgs services do come with a reasonable level of isolation via systemd, nothing is systematically enforced. Usually, a service binary will have access to a large part of the file system, as well as the whole networking stack, which can be too much for some services.
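
Nothing stops us from hardening a unit by hand, but it’s opt-in and per-service. A rough sketch of the kind of systemd sandboxing options involved (the service name is hypothetical):

    # Opt-in hardening for a single unit; nothing enforces this across all services.
    systemd.services.some-service.serviceConfig = {
      ProtectSystem = "strict";                    # mount most of the filesystem read-only
      ProtectHome = true;                          # hide /home
      PrivateTmp = true;                           # private /tmp
      NoNewPrivileges = true;                      # no setuid escalation
      RestrictAddressFamilies = "AF_INET AF_INET6";
    };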

The service binary may also technically alter the system however it wants, leaving the system in a dirty state even after uninstallation. Impermanence may help maintain a clean environment, but if some services rely on that state to behave correctly, they can break at reboot, making it difficult to roll back or trace down the root cause, especially if one doesn’t reboot often and a single reboot resets a whole year’s worth of state.
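
For context, the nix-community impermanence module works by whitelisting what survives a reboot, roughly as sketched below (paths are made up); anything left off the list is exactly the kind of undeclared state that disappears:

    # A sketch assuming the nix-community impermanence module is imported:
    # only the whitelisted paths survive a reboot.
    environment.persistence."/persist" = {
      directories = [ "/var/lib/some-service" ];   # hypothetical state directory
      files = [ "/etc/machine-id" ];
    };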

πŸ‘Ž Supply chain security

By using a Nix package, we are implicitly betting on it being free of tampering and up-to-date with all the security updates, which requires a lot of work from package maintainers. We would have to trust the package maintainers, in addition to the original author of the software. This may not be a big problem for popular packages (e.g. nginx): lots of people are staring at them, and we can reasonably expect security updates to be merged in a timely manner and malicious tampering to be rejected. But the trust can be really hard to establish for packages that aren’t so popular. They might be looked after by just one or two people, with a minimal degree of commitment. We don’t know these people in person to find out how trustworthy they are. Each package may have different maintainers, and each maintainer may only do very few things, which means there isn’t even a “reputation” we can track.

πŸ‘Ž Productivity

Whenever a small change is needed, we have to run nixos-rebuild, which is slow.

NixOS Container

NixOS Container is a feature of NixOS, powered by systemd-nspawn, that allows containers running NixOS to be configured declaratively, in the same way the host is configured.

πŸ€” Isolation

While systemd-nspawn offers some isolation, not all of it comes by default. All containers are privileged by default, as user namespaces aren’t enabled unless --private-users is specified in extraFlags. There seems to be some complication setting up bind mounts when --private-users is used; I only had success with the bind directory manually chown’ed. Network namespaces aren’t enabled unless privateNetwork is specified. To use that, one has to manually specify IP addresses for each container, and each container will receive its own veth pair. More fine-grained isolation has to be configured separately through a firewall.
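
To illustrate, a hedged sketch of a container definition with both knobs turned on (addresses and names are placeholders):

    containers.some-service = {                # hypothetical container name
      autoStart = true;
      # User namespaces are opt-in, via raw systemd-nspawn flags.
      extraFlags = [ "--private-users=pick" ];
      # So is the network namespace, with addresses assigned by hand.
      privateNetwork = true;
      hostAddress = "10.200.0.1";
      localAddress = "10.200.0.2";
      config = { ... }: {
        services.nginx.enable = true;
        system.stateVersion = "24.05";
      };
    };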

πŸ‘Ž NixOS issues

Many of the aforementioned issues with NixOS still remain.

Podman

Or Docker, through NixOS, or Docker Compose, or Arion, whatever.

πŸ‘ Productivity

Docker Compose or Arion allows us to declaratively manage the containers without having to nixos-rebuild every time. If we carefully structure our Nix files, Arion can even work with both the CLI and nixos-rebuild.
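
A rough sketch of what an arion-compose.nix might look like, assuming Arion’s project/services module options (the image and ports are placeholders); the same definition can be driven by the arion CLI or imported into the NixOS configuration:

    { ... }: {
      project.name = "homelab";                  # hypothetical project name
      services.web.service = {
        image = "docker.io/library/nginx:1.27";  # hypothetical pinned image
        ports = [ "8080:80" ];
        restart = "unless-stopped";
      };
    }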

πŸ‘ Isolation

Podman containers are properly isolated from the host and each other by default, which is good.

πŸ€” Auto update

Podman also has a built-in auto-update feature, along with a systemd timer unit podman-auto-update.timer, which can be enabled as needed.

However, it doesn’t work with Docker Compose, or with Arion, which uses Docker Compose.
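
For containers run as plain systemd units (e.g. via NixOS’s oci-containers module) rather than Compose, the rough recipe is an opt-in label per container plus the timer; a sketch, with the container name and image as placeholders:

    virtualisation.oci-containers = {
      backend = "podman";
      containers.some-service = {
        image = "docker.io/library/nginx:1.27";   # hypothetical image
        # Opt this container into `podman auto-update`.
        extraOptions = [ "--label=io.containers.autoupdate=registry" ];
      };
    };
    # Run the update check on a schedule, assuming the podman package
    # ships the podman-auto-update.timer unit.
    systemd.timers.podman-auto-update.wantedBy = [ "timers.target" ];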

πŸ€” Prune

Docker Compose supports --remove-orphans, which removes services that have been deleted from the compose file. However, there isn’t a good story for declaratively pruning networks, which can be annoying if you have one network per service.

πŸ‘Ž Network policy

Podman doesn’t have a native solution for configurable network policy. It might be possible through CNI plugins, but those were deprecated in Podman 4.0 in favor of Netavark, which doesn’t offer network policy at present.

Hand-crafting firewall rules is an option, but that’s a separate piece of config to manage. If we choose to manage it via NixOS, we are again subject to the aforementioned productivity issue.
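
For the record, such a policy isn’t hard to write, it’s just one more thing to keep in sync with the container definitions; a minimal sketch using NixOS’s nftables options (subnets and port are made up):

    networking.nftables = {
      enable = true;
      ruleset = ''
        table inet container-policy {
          chain forward {
            type filter hook forward priority 0; policy accept;
            # hypothetical per-service subnets: app may reach db on 5432 only
            ip saddr 10.89.1.0/24 ip daddr 10.89.2.0/24 tcp dport 5432 accept
            ip saddr 10.89.1.0/24 ip daddr 10.89.2.0/24 drop
          }
        }
      '';
    };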

Note that even with a firewall set up, we may still be subject to ARP spoofing / IP spoofing from containers with CAP_NET_RAW or CAP_NET_ADMIN inside the same Podman network. This can be fixed with a separate Podman network for each container, but that comes with some performance loss and an increased management burden.

Kubernetes

Or k3s, k0s, microk8s, whatever. Kubernetes appears to be the standard for container orchestration today, so it must be good, and enterpriseℒ️ class.

πŸ‘ Prune

Kubernetes has a good story for declaratively pruning resources through --prune.

πŸ‘ Network isolation

Kubernetes offers network policy, which allows easy and flexible configuration of connectivity between containers, so we no longer need hand-crafted firewall rules. While it may still be subject to ARP spoofing / IP spoofing if an L2 CNI plugin is used, an L3 CNI plugin like Cilium can mitigate that risk through eBPF.

πŸ€” Complication

Kubernetes is a bit more complex to get started with. Instead of just some services and networks in Docker Compose, on Kubernetes we’ll need pods, services, ingresses, persistent volumes, secrets, and possibly more.

Luckily, there are tools like kubenix which we can use to generate those resources from a higher-level configuration.
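
I’ll only sketch the idea here, assuming kubenix’s kubernetes.resources option interface (the resource name and image are placeholders): resources are written as Nix attribute sets and rendered into manifests.

    # A rough sketch, assuming kubenix's kubernetes.resources interface.
    kubernetes.resources.deployments.some-service.spec = {
      replicas = 1;
      selector.matchLabels.app = "some-service";
      template = {
        metadata.labels.app = "some-service";
        spec.containers = [{
          name = "some-service";
          image = "docker.io/library/nginx:1.27";   # hypothetical image
        }];
      };
    };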

πŸ‘Ž Devices and local networks

Kubernetes doesn’t have a native alternative to --device in Docker. The proper way to attach a device to a container is through a device plugin like Akri. Alternatively, you can mount the device file as a volume, but that requires the container to be privileged.

Connecting pods to local networks (e.g. macvlan) is also more complex than with Podman. Most CNI plugins don’t natively support it, and something like multus-cni will be needed.

πŸ‘Ž Security Complication

In addition to the complexity of getting started, it can be MUCH MORE complex to use Kubernetes securely if third-party resources are used. Typically, vendors offer canned configurations that include maybe dozens of resources, too complex to manage manually.

So you use Helm, the package manager for Kubernetes. By default, a Helm chart can affect any resource on the whole cluster. One might think passing --namespace would restrict the chart to its own namespace, but the chart may still create privileged pods, mount host paths, and effectively gain root on the host.

One might think configuring Pod Security Admission would block the chart from using privileged features, but technically the chart could remove the PSA configuration. It’s possible to set up RBAC to restrict Helm from tampering with PSA, but if you are using the k3s-io/helm-controller built into k3s, Helm will be granted cluster-admin. We will need some other solution that supports RBAC, like fluxcd/helm-controller.

Now that we’ve reasonably locked down the Helm chart’s access, are we good? Let’s try to install ingress-nginx with hostPort. It’s rejected by PSA, because hostPort is considered somewhat privileged. Can we exempt this specific case? There was Pod Security Policy, but it’s now deprecated. We’ll need a third-party admission plugin such as Kyverno.

The Verdict

I almost put together a single-node k3s setup that meets my needs, which involves:

  • NixOS: to manage the host.
    • Also to run some system services (e.g. openssh, k3s)
  • k3s: to manage containers.
  • Kubenix: to generate Kubernetes resources.
  • Helmfile: to fetch resources from Helm.
    • The resources are then manually inspected, versioned, and installed.
  • Cilium: to manage container networking.
  • cert-manager: to manage certificates.
  • ingress-nginx: to manage ingress.
  • Pod Security Admission + Kyverno: to secure resources from Helm.
  • Akri: to connect to IoT devices.
  • multus-cni: to connect to IoT networks.

Wow, that’s a lot!

I find that the extra complication of k8s isn’t really worthwhile for my usage, especially considering how much trust it requires. For now, I’ll stick to:

  • NixOS: to manage the host.
    • Also to run some system software (e.g. openssh, podman)
  • NixOS container: to run nginx.
  • Podman: to manage containers.
    • One podman network per container to mitigate ARP / IP spoofing.
  • quadlet-nix: to declaratively configure Podman (sketched below).
    • Manages both containers and networks.
    • Supports auto-update.
  • nftables: to manage network policy.
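
As a closing sketch, assuming quadlet-nix exposes quadlet’s [Container] / [Network] keys under containerConfig / networkConfig (names, image, and the exact network reference syntax are placeholders), a service in this setup ends up looking roughly like:

    virtualisation.quadlet = {
      # One Podman network per container to contain ARP / IP spoofing.
      networks.some-service = { };
      containers.some-service.containerConfig = {
        image = "docker.io/library/nginx:1.27";   # hypothetical image
        networks = [ "some-service" ];            # exact reference syntax may differ
        autoUpdate = "registry";                  # opt into podman-auto-update
      };
    };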