Chapter VI · Hardware & host

Bare-Metal Setup

One Linux box, picked once, lives for years. Spec it for the workload that already gives you grief, not the workload you imagine you’ll have someday.

What this is

This is the physical layer behind everything else in the cookbook: a single x86-64 box running Ubuntu Desktop as the host OS for an always-on multi-agent AI stack, browser automation, local LLMs, and a home network of services. No virtualization at the host layer, no container OS, no remote bare metal. One machine, one disk pool, one user, one set of services.

If you are deciding whether to host this stack on a VPS, a NAS, or a Raspberry Pi cluster, read philosophy/why-one-host.md first. This guide assumes the decision is made.

Why this way

Three things drive the spec:

  1. An agent stack is a latency game, not a throughput game. Most of the wall-clock time the agent burns is waiting on a model. The model is usually remote. Your local machine’s job is to be fast at the boring stuff: filesystem walks, JSON parsing, sqlite reads, sub-process spawn. A modern desktop CPU is more than enough; the bottleneck is single-thread responsiveness, not core count.

  2. Browser automation and local LLMs both want RAM. A persistent Chromium with a few logged-in profiles, plus an Ollama model held in cache, plus the agent’s working memory, plus the rest of the desktop will sit at 25-30 GB resident comfortably. 64 GB is the floor; 32 GB will work but you’ll feel it the first time the agent tries to keep three browser sessions warm.

  3. Two SSDs in an LVM pool is the cheapest insurance you can buy. One drive owns root, both drives feed the data pool. Lose one and you reinstall the OS, the data survives. See disk-layout-lvm.md for the full layout.

The temptation is to put this on a server in a closet. Resist it. A desktop with a monitor and a keyboard is recoverable when SSH is broken, and that matters once a year.

Prerequisites

Before / After

Before: the stack runs on a laptop, a VPS, or a NUC bought for something else. The CPU pegs during browser automation, the disk fills up because root and data are the same partition, and reboots are scheduled around when nobody needs the machine.

After: the stack runs on one purpose-built box that wakes from suspend in two seconds, holds the agent and the browser and a local model all warm, and has separate disks for OS-recovery and data-survival.

Implementation

Spec target

ComponentTargetWhy
CPURecent desktop class, 16+ threads, P/E hybrid is fineSingle-thread responsiveness over core count
RAM64 GB DDR5 minimumBrowser + Ollama + agent + desktop
Primary storage1 TB NVMe, PCIe 4.0 or 5.0Hot disk for root + most data
Secondary storage1 TB NVMe, same generationLVM extension, recovery margin
NetworkOnboard 2.5 GbESkip the USB dongle drama
GPUWhatever the CPU ships withThe model is remote, you do not need a discrete GPU for the agent
CaseMid-tower with two M.2 slots and at least one 3.5” bayFuture external pool over USB or SATA
PSU80+ Gold, sized 2x peak loadThe fan is silent when the load is half of rated

Notes on real choices:

Host OS

Ubuntu 24.04 LTS Desktop. Reasons:

Install steps (single user, full encryption optional):

# After base install, before installing anything else:
sudo apt update && sudo apt upgrade -y
sudo apt install -y build-essential git curl jq ripgrep htop tmux \
                    nvme-cli smartmontools ufw fail2ban unattended-upgrades
sudo apt purge -y snapd-desktop-integration firefox || true

Then install the OEM kernel for the hardware in question:

sudo apt install -y linux-oem-24.04d  # Or whichever OEM kernel ships current for your CPU generation

The OEM kernel is the one that picks up new CPU power management, NIC drivers, and NVMe firmware features. The HWE kernel is the alternative if no OEM kernel ships for your platform; pick one, do not stack both.

User layout

One user, owned by the agent. Add the user that runs the agent service early, even if it is your own login:

sudo adduser agentuser
sudo usermod -aG sudo,docker,plugdev agentuser
sudo loginctl enable-linger agentuser   # User services start at boot, not at login

enable-linger is the part most guides skip and the reason the agent gateway doesn’t start until you log in graphically. Without it, every reboot leaves the stack down until someone opens the desktop.

Baseline tuning

Two cheap wins before anything stack-specific:

# Higher inotify limit - editors, file watchers, agents all share this pool.
sudo tee /etc/sysctl.d/90-inotify.conf <<'EOF'
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 512
EOF

# Disable Ubuntu Pro nag and motd-news lookups.
sudo pro config set apt_news=false || true
sudo systemctl mask motd-news.service motd-news.timer || true

sudo sysctl --system

The rest of kernel tuning (swappiness, dirty-page ratios, scheduler choices) lives in kernel-tuning.md.

Power and sleep

The agent host should never suspend on its own.

sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target

Then set screen blanking but not suspend in GNOME settings. The box stays on, the monitor turns off, the agent keeps running.

If the box is on a UPS, install nut-client or apcupsd and wire shutdown to fire on prolonged battery. Verify with a real pull of the wall plug before you trust it. Without that test, your “UPS-protected” host is one outage away from a hard crash.

Verification

After the first install, the following should all be true:

# OEM kernel is running, not the generic.
uname -r | grep -E 'oem|hwe' && echo OK

# Lingering is enabled for the agent user.
loginctl show-user agentuser --property=Linger | grep -q yes && echo OK

# Suspend is masked.
systemctl status sleep.target | grep -q "masked" && echo OK

# Inotify watch limit is high.
sysctl fs.inotify.max_user_watches | awk '$3>=524288 {print "OK"}'

# Both NVMe drives are visible and healthy.
sudo nvme list
sudo smartctl -a /dev/nvme0n1 | grep -i "Critical Warning"
sudo smartctl -a /dev/nvme1n1 | grep -i "Critical Warning"

A healthy box shows zero critical warnings, both drives present, generic Ubuntu kernel replaced by an OEM kernel, and the user that owns the agent stack able to keep user services running across reboot.

Gotchas

The Snap Firefox locks Playwright profiles in confinement and you’ll waste an afternoon. Browser automation guides assume a regular firefox binary in $PATH. Snap puts it in /snap/firefox/current/usr/lib/firefox/firefox and adds AppArmor rules that prevent the agent from writing into a persistent profile under ~/.openclaw/browser/. Replace Firefox with the Mozilla PPA build at install time.

Auto-installer LVM picks 100 GB for root and stops there. The Ubuntu installer’s default LVM layout reserves the whole disk to root, then resizes it down to 100 GB and leaves the rest unallocated for “future use.” There is no future use. Partition by hand or follow disk-layout-lvm.md before you click install, or you will be pvextend-ing within a month.

Hibernate corrupts the swap file the second time you upgrade the kernel. The auto-installed /swap.img works for swap and fails for resume-from-hibernate the moment its physical block list changes. Disable hibernate (sudo systemctl mask hibernate.target) and use suspend-to-RAM only on machines that need it. The agent host should not be doing either.

linger does not survive userdel and re-creation. If you delete and re-add the agent user (for example, swapping UIDs to match a NAS export), you need to re-run loginctl enable-linger. The systemd state file under /var/lib/systemd/linger/ does not get migrated.

unattended-upgrades will reboot at 6 am unless you tell it not to. The default Ubuntu config is fine for desktops, hostile for an always-on agent. Either set Unattended-Upgrade::Automatic-Reboot "false"; in /etc/apt/apt.conf.d/50unattended-upgrades and run reboots from a maintenance window, or leave it on and accept the daily restart.

OEM kernels are not picked by do-release-upgrade. Major Ubuntu LTS upgrades fall back to the generic kernel and leave the OEM package in a half-installed state. Pin the OEM kernel via apt-mark hold before any release upgrade, and re-install the matching version for the new release manually.

Templates