Chapter III · Self-hosted infrastructure

Service Isolation: One Service Per Container

Tested on: Proxmox VE 9.2 on a single Intel Ultra 9 node, 32GB RAM, unprivileged LXC + a few QEMU VMs. ~8 service containers running, ~5 build containers stopped-on-demand.
Last updated: 2026-06-04

I run my homelab as a pile of small, boring, unprivileged LXC containers. One service each. Not one fat VM with a docker-compose monolith, not a single Debian box hand-fed twelve daemons. This guide is the discipline behind that choice: why I do it, where the blast radius gets drawn, and the handful of cases where a full VM is actually the right call.

This is the why, not the map. For the inventory of what runs where, see homelab-topology.md.

The Rule: One Service, One Container

Every long-lived service gets its own unprivileged LXC. DNS in one, the SIEM in another, the photo server in a third, the social-automation stack in a fourth. They share a kernel (that’s what makes LXC cheap) but nothing else: separate rootfs, separate network interface, separate resource caps, separate snapshot timeline.

The instinct early on is to consolidate. “Why spin up five containers when one Debian box could run all five daemons?” Because consolidation trades a one-time setup cost for a permanent operational tax, and the tax compounds every time something breaks.

What one-service-per-container actually buys you

That last point is the real reason. The threat model isn’t just “software crashes.” It’s “I am tired, it is late, and I am about to paste a command I half-understand into a root shell.” Isolation means that mistake costs me one service, not the whole lab. 🦞

Blast-Radius Thinking

Draw the boundary at the unit you’d want to restore independently. For me that’s “one service.” Ask three questions per container:

  1. If this is compromised, what else can it reach? A container with a NAS mount and SSH keys is a bigger prize than a stateless DNS resolver. Keep the credential-heavy services small and few, and don’t co-locate a public-facing service with your secrets. This is the container-level version of the agent-side rules in ../security/agent-security-hardening.md.
  2. If this dies, what dies with it? The answer should be “only this service.” If killing container A also kills container B, you’ve built a hidden monolith with extra steps. The classic trap here is shared bind-mounts (see the next section).
  3. If I have to roll this back, what state do I lose? Snapshot scope equals blast radius. A rollback reverts everything in that container’s rootfs to the snapshot point. Smaller containers mean rollbacks lose less.
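
Questions 1 and 2 are easier to answer with an inventory of what each container can see on the host. A minimal sketch, assuming bind-mounts show up in pct config output as mpN lines whose volume is an absolute host path (storage-backed mount points start with a storage name such as local-lvm and are skipped). The helper name list_bind_mounts is hypothetical, not a Proxmox command.

```shell
#!/usr/bin/env bash
# list_bind_mounts (hypothetical helper): read `pct config <CTID>` text on
# stdin and print "host-path -> container-path" for every bind-mount line.
# Assumes lines of the form:  mp0: /host/path,mp=/container/path[,options]
list_bind_mounts() {
  awk -F': ' '/^mp[0-9]+: \// {
    n = split($2, parts, ",")
    target = ""
    for (i = 2; i <= n; i++)
      if (parts[i] ~ /^mp=/) target = substr(parts[i], 4)
    print parts[1] " -> " target
  }'
}

# Usage on the Proxmox host (not executed here):
#   for id in $(pct list | awk 'NR > 1 {print $1}'); do
#     pct config "$id" | list_bind_mounts | sed "s/^/CT $id: /"
#   done
```

Every line this prints is a place where a container depends on host-side state, which is exactly the kind of coupling the three questions are meant to surface.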

The bind-mount trap

The fastest way to accidentally couple two “isolated” containers is a shared dependency that lives outside both of them. I learned this the expensive, silent way.

My photo server (its own LXC) bind-mounts a directory from a NAS share into the container so uploads land on bulk storage. One morning the NAS CIFS mount on the Proxmox host failed to come up after a reboot race. The host mount point existed but was empty, so the container’s bind-mount pointed at an empty directory. The photo server’s app container couldn’t create its upload path and exited. Postgres, the ML worker, and Redis in the same stack all stayed healthy and reported green. The phone app just silently stopped backing up. Nothing crashed loudly. The dependency that took the service down lived between the host and the container, in a place neither one’s health check was watching.

Two lessons baked in:

The fix was two lines (remount the NAS share, restart the one app container). The diagnosis took an hour because everything looked fine. Isolation didn’t fail here. It worked: only one service went down. But a hidden shared dependency is the seam where isolation leaks, so go find your bind-mounts and treat each one as a coupling you signed up for on purpose.
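
One way to turn that silent failure into a loud one is to refuse to start the container unless the host-side source is a live mount. This is a sketch under assumptions, not a record of what runs on my host: the function names, the CTID 108, and the path /mnt/nas-photos are all hypothetical. It relies on mountpoint from util-linux, which reports whether a directory is an active mount.

```shell
#!/usr/bin/env bash
# Hypothetical guard for the empty-mount failure mode described above.

# Succeeds only if $1 is an active mount point AND contains at least one entry.
mount_is_live() {
  local path="$1"
  mountpoint -q "$path" || return 1
  # A mounted-but-empty directory is treated as suspicious too.
  [ -n "$(ls -A "$path" 2>/dev/null)" ]
}

# Starts CT $1 only when the host path $2 passes the check above.
start_if_mounted() {
  local ctid="$1" path="$2"
  if mount_is_live "$path"; then
    pct start "$ctid"
  else
    echo "refusing to start CT $ctid: $path is not a live mount" >&2
    return 1
  fi
}

# Usage on the Proxmox host (hypothetical CTID and path, not executed here):
#   start_if_mounted 108 /mnt/nas-photos
```

The design choice is to fail closed: a container that stays down is visible, while one running against an empty directory is not.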

Per-Container Resource Caps

Isolation without caps is a polite fiction. If every container can burst to all 32GB of host RAM, one of them will, and you’re back to shared-fate. Set caps at create time and treat them as the contract.

# A small, stateless service: DNS resolver. Tiny on purpose.
pct create 100 local:vztmpl/debian-12-standard_amd64.tar.zst \
  --hostname dns \
  --cores 1 --memory 512 --swap 512 \
  --rootfs local-lvm:4 \
  --net0 name=eth0,bridge=vmbr0,ip=dhcp \
  --unprivileged 1 --onboot 1 --start 1

# A heavy service: SIEM. Gets real resources, still capped.
pct create 105 local:vztmpl/debian-12-standard_amd64.tar.zst \
  --hostname siem \
  --cores 4 --memory 8192 --swap 512 \
  --rootfs local-lvm:50 \
  --features nesting=1,keyctl=1 \
  --net0 name=eth0,bridge=vmbr0,ip=dhcp \
  --unprivileged 1 --onboot 1 --start 1

Adjust caps live when a service genuinely needs more, but do it deliberately:

pct set 105 --memory 12288    # bump RAM from 8192 to 12288 MiB
pct set 105 --cores 6         # bump CPU from 4 to 6 cores
pct resize 105 rootfs +10G    # grow disk (can grow, cannot shrink)
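
A cap only works as a contract if you notice when a service hits it. A minimal sketch for checking that from the host, assuming cgroup v2 and assuming the container’s cgroup lives at /sys/fs/cgroup/lxc/<CTID>/ (verify that path on your own host). The function names are hypothetical; memory.events and its oom_kill counter are standard cgroup v2 kernel interfaces.

```shell
#!/usr/bin/env bash
# Print the oom_kill counter from a cgroup v2 memory.events file.
# Prints 0 when the file has no oom_kill line.
oom_kills() {
  local events_file="$1"
  awk '$1 == "oom_kill" {print $2; found = 1} END {if (!found) print 0}' "$events_file"
}

# Convenience wrapper. ASSUMPTION: the per-container cgroup path below
# matches your Proxmox host; adjust it if your layout differs.
ct_oom_kills() {
  oom_kills "/sys/fs/cgroup/lxc/$1/memory.events"
}

# Usage on the Proxmox host (not executed here):
#   ct_oom_kills 100    # non-zero means the 512 MiB cap has been hit
```

A counter that keeps climbing between checks is the signal to revisit the cap deliberately rather than wait for the service to misbehave.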

Sizing principles I actually follow:

The Ephemeral Build-Container Pattern

Here’s the pattern I’m proudest of. PR builds, smoke tests, and CI-style jobs do not get a permanent home and they do not run on a service container. They get a dedicated container that is stopped by default and started only for the duration of the job.

The host keeps several of these around, stopped, as ready-to-go templates. One per project that needs clean-room builds:

112  openclaw-prbuild    stopped   clean OpenClaw build/test sandbox
113  tokenjuice-prbuild  stopped   tokenjuice PR/build sandbox
115  mcp-smoke           stopped   MCP smoke-test sandbox
119  orca-prbuild        stopped   project build/test sandbox
120  gh-runner           stopped   CI runner sandbox

The workflow is: clone or reset, start, build, capture result, stop.

# Spin up a clean build CT for a PR, run the build, tear it back down.
ssh hypervisor "pct start 112"
ssh hypervisor "pct exec 112 -- bash -lc '
  cd /root/openclaw && git fetch origin && git checkout pr-branch &&
  pnpm install && pnpm build && pnpm test
'"
ssh hypervisor "pct stop 112"
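
The same start, build, stop sequence can be wrapped so the container is always stopped and the build’s exit status survives for whatever called it. A sketch only, meant to run on the hypervisor itself rather than over ssh; run_in_build_ct is a hypothetical name.

```shell
#!/usr/bin/env bash
# run_in_build_ct (hypothetical helper): start a build CT, run one command in
# it, always stop the CT afterwards, and return the command's exit status.
run_in_build_ct() {
  local ctid="$1"
  shift
  local rc=0

  pct start "$ctid" || return 1

  # Record a failing build instead of aborting, so the stop below still runs.
  pct exec "$ctid" -- "$@" || rc=$?

  pct stop "$ctid" || echo "warning: CT $ctid did not stop cleanly" >&2
  return "$rc"
}

# Usage on the Proxmox host (not executed here):
#   run_in_build_ct 112 bash -lc 'cd /root/openclaw && pnpm install && pnpm build && pnpm test'
```

Returning the build’s status matters for the "capture result" step: a failed test run should read as a failure to the caller, not as a successful stop.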

Why a dedicated stopped container per project instead of a shared build box or just running it on the dev machine:

Reset between runs with snapshots

Pair build containers with a “pristine” snapshot so each run starts identical:

# One time: get the CT to a known-good state, then snapshot it.
pct snapshot 112 pristine

# Before each build job: roll back to pristine, then start.
pct rollback 112 pristine
pct start 112
# ... run build ...
pct stop 112
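
A rollback is only a clean reset if the snapshot is actually there. A hedged sketch that checks first and refuses to build otherwise. It assumes pct listsnapshot prints one snapshot per line with the name as the second whitespace-separated field (a tree marker comes first); confirm that against your own output. has_snapshot and reset_to_pristine are hypothetical names.

```shell
#!/usr/bin/env bash
# Succeeds if CT $1 has a snapshot named $2.
# ASSUMPTION: `pct listsnapshot` puts the snapshot name in field 2.
has_snapshot() {
  local ctid="$1" name="$2"
  pct listsnapshot "$ctid" 2>/dev/null |
    awk -v n="$name" '$2 == n {found = 1} END {exit !found}'
}

# Roll CT $1 back to snapshot $2 (default: pristine) and start it,
# or fail loudly when the snapshot is missing.
reset_to_pristine() {
  local ctid="$1" name="${2:-pristine}"
  if ! has_snapshot "$ctid" "$name"; then
    echo "CT $ctid has no snapshot named $name; refusing to build" >&2
    return 1
  fi
  pct rollback "$ctid" "$name" && pct start "$ctid"
}

# Usage on the Proxmox host (not executed here):
#   reset_to_pristine 112
```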

The ephemeral-state gotcha

Ephemeral containers have a sharp edge: state you want to keep can live in the part of the container that gets thrown away. I hit this with a Docker-based automation service whose SSH known_hosts file lived in the container’s writable layer rather than in a mounted data volume. Every container recreate wiped known_hosts, and SSH-using workflows started failing silently with host-key verification errors because the new layer didn’t trust any hosts yet.

The rule that falls out: if it’s ephemeral, treat anything outside an explicit persistent volume as gone on the next reset. For build containers that’s a feature. For services that happen to recreate their app containers (Docker-in-LXC), it’s a trap. Audit where each service keeps its must-survive state, and make sure it’s in a named volume or a bind-mount, not the disposable layer.
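
That audit can be partly mechanical: list a container’s mount destinations and check whether a must-survive path sits under one of them. A minimal sketch; path_is_persistent is a hypothetical helper, and the container name automation in the usage comment is made up for illustration.

```shell
#!/usr/bin/env bash
# path_is_persistent (hypothetical helper): read mount destinations from
# stdin, one per line, and succeed if $1 is one of them or lives beneath one.
path_is_persistent() {
  local target="$1" dest
  while IFS= read -r dest; do
    [ -n "$dest" ] || continue
    case "$target" in
      "$dest" | "${dest%/}"/*) return 0 ;;
    esac
  done
  return 1
}

# Usage with Docker (hypothetical container name, not executed here):
#   docker inspect --format '{{range .Mounts}}{{println .Destination}}{{end}}' automation |
#     path_is_persistent /root/.ssh/known_hosts ||
#     echo "known_hosts lives in the disposable layer"
```

The match is on path prefixes with a separator, so a volume at /data covers /data/ssh/known_hosts but not /database.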

Snapshot Before Change

The cheap habit that makes everything above safe: snapshot the container before you change it.

pct snapshot 105 pre-maint-$(date +%Y%m%d-%H%M)
# ... do the risky thing: apt full-upgrade, config edit, version bump ...
# if it went sideways:
pct rollback 105 pre-maint-20260604-1430
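
The rollback needs the exact name the snapshot was given, so it helps to capture that name instead of retyping the timestamp. A small sketch with a hypothetical helper, snapshot_before:

```shell
#!/usr/bin/env bash
# snapshot_before (hypothetical helper): snapshot CT $1 under a timestamped
# pre-maint name and print that name on stdout for a later rollback.
snapshot_before() {
  local ctid="$1" name
  name="pre-maint-$(date +%Y%m%d-%H%M)"
  # Send pct's own output to stderr so stdout carries only the name.
  pct snapshot "$ctid" "$name" >&2 && echo "$name"
}

# Usage on the Proxmox host (not executed here):
#   snap=$(snapshot_before 105)
#   # ... do the risky thing ...
#   pct rollback 105 "$snap"    # only if it went sideways
```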

Before my last host-wide maintenance window I snapshotted every service container that supported it (pre-maint-<timestamp>), upgraded, and verified. The snapshot timeline is per-container, so a botched upgrade on the monitoring box rolls back in seconds without touching anything else. That’s isolation paying off again: a per-service rollback is only possible because services aren’t sharing a rootfs.

Two caveats from the field:

When a Full VM Is Actually Justified

I default to LXC, but the discipline includes knowing when LXC is the wrong tool. Reach for a QEMU VM when:

What does not justify a VM: “it’s a big service,” “it has a database,” “it runs Docker.” LXC handles all of those fine. Heavy and isolated are different axes. My SIEM is 8GB and 4 cores in an LXC and it’s still isolated from everything else.

Gotchas

  1. CTID and VMID share one numbering pool. On Proxmox, pct list shows only LXC, qm list shows only VMs, and they draw from the same ID space. ID 110 can be a VM while 111 and 112 are containers. Always check both before assuming an ID is free or “doesn’t exist.”

  2. A bind-mount is a coupling, not a convenience. Every shared mount you bind into a container extends that container’s blast radius to include host-side state. Inventory them. The silent failures (empty mount after a boot race) are the ones that cost you an afternoon.

  3. Stopped build containers still need upkeep. They’re templates, but a pristine snapshot from six months ago builds against six-month-old toolchains. Periodically start, update, re-snapshot, stop. Otherwise your “clean room” is a museum.

  4. unprivileged 1 can’t be toggled casually after create. Flipping a container between privileged and unprivileged isn’t a simple pct set: UID mapping changes how files on the rootfs are owned, so converting usually means a backup and restore. Decide at create time. For services, the answer is always unprivileged.

  5. Resource caps are a contract you have to enforce. Setting --memory 512 caps the container, but a service that genuinely needs more will OOM-loop quietly inside its cap. Cap deliberately, then actually watch for the service hitting the ceiling instead of assuming the number you picked at 2am was right.

  6. Snapshot before change is a habit, not a feature you enable. Proxmox won’t snapshot for you. The discipline is yours: pct snapshot before every upgrade, config edit, or risky command. The five seconds it costs is the cheapest insurance in the lab.
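
For gotcha 1, checking both lists is easy to script. A minimal sketch, assuming the ID is the first column of both pct list and qm list output; id_in_use is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# id_in_use (hypothetical helper): succeed if $1 is taken by either an LXC
# container (pct list) or a QEMU VM (qm list). Header rows never match
# because their first column is the literal text VMID.
id_in_use() {
  local id="$1"
  { pct list; qm list; } 2>/dev/null |
    awk -v id="$id" '$1 == id {found = 1} END {exit !found}'
}

# Usage on the Proxmox host (not executed here):
#   if id_in_use 110; then echo "110 is taken"; fi
```

If you just need any free ID rather than a specific one, pvesh get /cluster/nextid asks Proxmox for the next unused ID across both pools.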