Building a Private AI Stack: From Mini PC to Autonomous Agents

For the past several years I have been thinking carefully about what it means to run AI infrastructure that I actually own, control, and understand from the ground up. The rapid proliferation of frontier model APIs, agentic coding tools, and open-weight model releases in 2025-2026 finally made this tractable at a price and complexity point that a single person could manage. This post documents the architecture I settled on: a self-hosted, Docker-based stack running on a mini PC, unified by a single OpenAI-compatible model gateway, and surfaced through a collection of local inference servers, agentic CLI tools, autonomous agent frameworks, open-source Cowork alternatives, and a task-bounded command harness built around structured queues. My goals were privacy, sovereignty, reproducibility, and the ability to swap components without rebuilding everything from scratch.

Motivation: Why Self-Host?

The short answer is control. When I work on grant-funded AI research, handle student data for my courses, or draft institutional planning documents, I want to know exactly where inference is happening and what data is leaving my environment. The longer answer is pedagogical: I cannot credibly teach AI literacy, AI ethics, and responsible deployment if I have not made serious, hands-on architectural decisions myself. Running your own stack is humbling in the right ways.

There is also a strategic argument. As the open-source AI agent framework ecosystem matured through early 2026, it became clear that the layered architecture of these systems, separating the model layer, the orchestration layer, the tool-access layer, and the user-interface layer, was stabilizing into recognizable patterns. A well-designed self-hosted stack can plug into any of these layers without being locked to a single vendor. That flexibility is worth the setup cost.

Hardware: The Mini PC

The physical foundation of the stack is a mini PC running Linux Mint, which gives me a clean Debian-lineage environment with full access to the upstream Docker Engine repository and no virtualization layer between the container workloads and the host kernel. Everything runs directly on the host, which simplifies networking, volume permissions, and service lifecycle management considerably compared to hypervisor-based setups.

Installing Docker Engine

Linux Mint ships an older Docker build from the distribution mirror, so I install from the upstream Docker repository instead. The setup sequence is:

# Remove any distribution-packaged versions
sudo apt-get remove docker docker-engine docker.io containerd runc
 
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg lsb-release
 
# Add Docker's official GPG key
sudo mkdir -m 0755 -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
  sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
 
# Point the apt source at the Ubuntu codename that Mint is based on
echo \
  "deb [arch=$(dpkg --print-architecture) \
  signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo $UBUNTU_CODENAME) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
 
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io \
  docker-buildx-plugin docker-compose-plugin
 
sudo docker run hello-world
sudo usermod -aG docker $USER
newgrp docker

The UBUNTU_CODENAME variable in the source line is the key detail for Mint: it resolves to the underlying Ubuntu release name rather than the Mint release name, which is what the Docker repository actually indexes.
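
Before writing the source line, you can confirm what it will resolve to; on current Mint releases this prints the underlying Ubuntu LTS codename (for example noble) rather than the Mint codename:

. /etc/os-release && echo "$UBUNTU_CODENAME"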

Installing Ollama

Ollama runs as a native Linux service, not inside a container, which keeps model-weight I/O off the Docker networking path and avoids bind-mount overhead for the weight files:

curl -fsSL https://ollama.com/install.sh | sh
systemctl status ollama

Ollama listens on http://localhost:11434 by default and is reachable from within containers via --add-host=host.docker.internal:host-gateway, which every container in the stack declares. This flag is required on Linux Docker Engine; unlike Docker Desktop, the Linux engine does not inject host.docker.internal automatically.
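
A quick sanity check of both paths verifies the wiring; the /api/tags route lists locally installed models, and curlimages/curl is just a convenient throwaway image for the in-container test:

# From the host
curl -s http://localhost:11434/api/tags

# From inside a container, via the host-gateway alias
docker run --rm --add-host=host.docker.internal:host-gateway \
  curlimages/curl -s http://host.docker.internal:11434/api/tags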

Workspace Layout

All agent state lives under a single root in the home directory:

$HOME/agents/
├── workspace/              # Shared project files (all TUI tools mount here)
├── skills/core/            # Read-only skills package (mounted :ro)
├── litellm/                # LiteLLM config and Compose files
├── kilocode/home/          # KiloCode identity
├── opencode/home/          # opencode.ai identity
├── pi/home/                # pi.dev identity, models.json, settings.json
├── hermes/home/            # Hermes agent identity
├── gnhf/                   # Good Night Have Fun harness
├── open-design/            # Open Design app data and pi identity
├── commercial/             # Claude Code, Codex, Gemini CLI, Copilot, CCR
├── mastra/data/            # Mastra SQLite database
├── a0/data/                # Agent Zero persistent user data
├── archon/data/            # Archon workflow state
├── ollama/data/            # Downloaded model weights (~/.ollama)
├── localai/                # LocalAI models/, backends/, config/
├── googleworkspacecli/     # Google Workspace CLI (gcloud/)
├── googleagentscli/        # Google ADK CLI
└── openwebui/data/         # Open WebUI persistent data

Creating the full tree is a single command drawn from the deploy script:

BASE="$HOME/agents"
mkdir -p \
  "$BASE/workspace" "$BASE/skills/core" "$BASE/litellm" \
  "$BASE/kilocode/home" "$BASE/opencode/home" "$BASE/pi/home" \
  "$BASE/hermes/home" "$BASE/ollama/data" \
  "$BASE/localai/models" "$BASE/localai/backends" "$BASE/localai/config" \
  "$BASE/mastra/data" "$BASE/openwebui/data" \
  "$BASE/commercial/claude/workspace" "$BASE/commercial/claude/home" \
  "$BASE/commercial/claude/npm" "$BASE/commercial/claude/config" \
  "$BASE/commercial/claude/cache" \
  "$BASE/googleworkspacecli/gcloud" \
  "$BASE/googleagentscli/data" "$BASE/googleagentscli/config" \
  "$BASE/googleagentscli/cache/uv" "$BASE/googleagentscli/cache/npm" \
  "$BASE/googleagentscli/evals" "$BASE/googleagentscli/logs" \
  "$BASE/archon/data" "$BASE/a0/data" \
  "$BASE/gnhf/home" "$BASE/gnhf/npm" "$BASE/gnhf/config" "$BASE/gnhf/cache" \
  "$BASE/open-design/data" "$BASE/open-design/pi"
echo "Directory tree created under $BASE"

The Core Design Principle: A Unified Model Gateway

The single most important architectural decision I made was to route all LLM inference through a single OpenAI-compatible endpoint rather than having each tool reach out to Ollama, Anthropic, or OpenRouter independently. I use LiteLLM for this. Every service in the stack sends its requests to http://localhost:4000/v1 with the bearer token sk-litellm-local. LiteLLM translates those requests to whatever backend is appropriate, whether a local Ollama model, a LocalAI GGUF endpoint, or an OpenRouter free-tier model, according to a YAML routing configuration.

model_list:
  - model_name: llama3
    litellm_params:
      model: ollama/llama3
      api_base: http://host.docker.internal:11434
 
  - model_name: qwen2.5-3b
    litellm_params:
      model: ollama/qwen2.5:3b
      api_base: http://host.docker.internal:11434
 
  - model_name: hermes3
    litellm_params:
      model: ollama/hermes3:8b
      api_base: http://host.docker.internal:11434
 
  - model_name: openrouter/auto
    litellm_params:
      model: openrouter/auto
      api_key: ${OPENROUTER_API_KEY}
      api_base: https://openrouter.ai/api/v1

The practical consequence is that changing a model, adding a provider, or adjusting routing requires editing one YAML file and restarting one service. No other container needs to know that anything changed. This is the same composability principle I teach in software engineering: minimize coupling, maximize cohesion.

Connecting a commercial CLI tool like Claude Code to the local stack requires only two environment variable exports:

export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=sk-litellm-local
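
The same base URL and key support a raw smoke test of the gateway; this exercises the llama3 alias from the routing configuration above:

curl -s http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-litellm-local" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3", "messages": [{"role": "user", "content": "Say hello in five words."}]}'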

LiteLLM Deployment

LiteLLM runs as a Docker Compose service. The docker-compose.yml pulls the pre-built image; everything else is bind-mounted configuration.

# litellm/docker-compose.yml
version: "3.9"
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    container_name: litellm-${USER}
    restart: unless-stopped
    ports:
      - "4000:4000"
    volumes:
      - ./litellm_config.yaml:/app/config.yaml:ro
    env_file:
      - .env
    command: ["--config", "/app/config.yaml", "--port", "4000", "--num_workers", "2"]
    extra_hosts:
      - "host.docker.internal:host-gateway"

The .env file holds the master key and any cloud provider keys:

# litellm/.env
LITELLM_MASTER_KEY=sk-litellm-local
OPENROUTER_API_KEY=YOUR_OPENROUTER_API_KEY

Build and run scripts are minimal wrappers around Compose:

# litellm/build.sh — pull image, start, verify endpoint
cd "$HOME/agents/litellm"
docker compose pull
docker compose up -d
sleep 20
curl -s http://localhost:4000/models \
  -H "Authorization: Bearer sk-litellm-local" | python3 -m json.tool | head -20
 
# litellm/run.sh — start after reboot
cd "$HOME/agents/litellm" && docker compose up -d
 
# litellm/attach.sh — tail live logs
cd "$HOME/agents/litellm" && docker compose logs --tail=30 -f litellm-${USER}

Restart after a config change: cd $HOME/agents/litellm && docker compose down && docker compose up -d.

The Full Stack: Services and Ports

Service       Port    Purpose
LiteLLM       4000    Unified model gateway (OpenAI-compatible)
Ollama        11434   Local LLM inference (native systemd service)
LocalAI       8080    GGUF model inference (OpenAI-compatible)
Open WebUI    3000    Browser-based LLM frontend with MCP tool calling
Mastra        4111    TypeScript AI agent server (API + Studio UI)
Agent Zero    8081    Autonomous hierarchical agent with web UI
Archon        3090    Workflow-driven agent runner
Open Design   5173    Collaborative design canvas with embedded pi agent
Portainer     9000    Docker management UI

Local Inference: Ollama and LocalAI

The stack runs two local inference backends with distinct tradeoffs, and LiteLLM routes between them transparently based on model alias.

Ollama is the primary inference backend. Its systemd service model keeps it available before Docker is fully up, its model management CLI is clean, and its HTTP API is stable. The models I maintain are selected for RAM footprint first: phi4-mini, smollm2, gemma4:e2b, qwen2.5:1.5b, qwen2.5:3b, llama3, and hermes3:8b.

# ollama/run.sh — containerized alternative to the native systemd install
# (use one or the other; both bind port 11434)
docker run -d \
  --name ollama-${USER} \
  --restart no \
  --add-host=host.docker.internal:host-gateway \
  -p 11434:11434 \
  -v "$HOME/agents/ollama/data:/root/.ollama" \
  ollama/ollama:latest
 
# Pull models after starting
for model in phi4-mini smollm2 "gemma4:e2b" "qwen2.5:1.5b" "qwen2.5:3b" llama3 "hermes3:8b"; do
  docker exec ollama-${USER} ollama pull "$model"
done

LocalAI (github.com/mudler/LocalAI) is the secondary inference backend, running on port 8080 behind an OpenAI-compatible API surface. It supports llama.cpp for text generation, whisper.cpp for speech transcription, and Stable Diffusion for image generation, all behind the same endpoint. LocalAI is organized around three bind-mounted directories: models/ holds GGUF weight files, backends/ holds compiled backend binaries, and config/ holds per-model YAML configuration. Note that the host directory is named config/ while the container mount path is /configuration; this asymmetry is intentional and must be preserved.

# localai/run.sh
docker run -d \
  --name localai-${USER} \
  --restart no \
  --add-host=host.docker.internal:host-gateway \
  -p 8080:8080 \
  -v "$HOME/agents/localai/models:/models" \
  -v "$HOME/agents/localai/backends:/backends" \
  -v "$HOME/agents/localai/config:/configuration" \
  localai/localai:latest

A per-model YAML config controls backend selection and context length:

# localai/config/phi4-mini.yaml
name: phi4-mini
backend: llama
parameters:
  model: phi-4-mini-instruct.Q4_K_M.gguf
  context_size: 8192
  threads: 8
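
Because LocalAI exposes the same OpenAI dialect, the model defined above can be exercised directly once its GGUF file is present in models/:

curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "phi4-mini", "messages": [{"role": "user", "content": "Introduce yourself in one sentence."}]}'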

Browser Frontend: Open WebUI

Open WebUI is the browser-based interface for direct LLM interaction, running on port 3000. It connects to Ollama directly and enumerates available models automatically. Its native MCP tool-calling support (version 0.4 and later) intercepts tool-call responses, dispatches them to registered MCP servers, and injects results back into the conversation.

# openwebui/run.sh
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v "$HOME/agents/openwebui/data:/app/backend/data" \
  --name open-webui-${USER} \
  ghcr.io/open-webui/open-webui:main

After launch, connect to LiteLLM via Admin Settings → Connections → OpenAI: set the URL to http://host.docker.internal:4000/v1 and the API key to sk-litellm-local. Models confirmed to work reliably with Open WebUI tool calling include hermes3:8b, llama3.1:8b, qwen2.5:7b, qwen2.5:14b, and mistral-nemo:12b.

Agentic CLI Tools: A Comparative Survey

By April 2026, a mature set of agentic coding CLI tools has emerged with distinct architectural philosophies, and I run all of them through the unified LiteLLM gateway. Each tool runs inside a dedicated Docker container with an identity bind mount, a shared workspace mount, and an optional skills mount. The sections below show the Dockerfile, build script, and run script for each.

Claude Code (Anthropic, Node.js) is the most fully featured in terms of built-in subagent support, MCP integration, and permission gate granularity. It uses CLAUDE.md files for project context and .claude/agents/ Markdown files for custom subagent definitions.

OpenAI Codex CLI (Rust) supports native multi-provider configuration through a TOML config file. Custom providers are defined as named sections:

model = "llama3.3:70b"
model_provider = "openwebui"
 
[model_providers.openwebui]
name = "Open WebUI"
base_url = "http://localhost:3000/openai"
env_key = "OPENWEBUI_API_KEY"

Gemini CLI (Google, Node.js) uses GEMINI.md files for project context and a three-tier discovery hierarchy for skills. Routing it through an OpenAI-compatible endpoint requires the open-gemini-cli fork, which injects an adapter layer that translates Gemini’s internal message format.

OpenCode (opencode.ai, Go) is the most flexible in terms of provider support, relying on the @ai-sdk/openai-compatible adapter to connect to any OpenAI-compatible backend:

{
  "provider": {
    "litellm": {
      "npm": "@ai-sdk/openai-compatible",
      "options": { "baseURL": "http://localhost:4000/v1" },
      "models": { "llama3": {}, "qwen2.5-3b": {}, "hermes3": {} }
    }
  }
}

Commercial Tools Deployment (Claude Code, Codex, Gemini CLI, Copilot, CCR)

Claude Code, Codex, Gemini CLI, GitHub Copilot, and the Claude Code Router all share a single Docker image. The tool to launch is selected at runtime as an argument to run.sh.

# commercial/Dockerfile
FROM node:22-bookworm
 
RUN apt-get update && apt-get install -y --no-install-recommends \
    git curl ca-certificates ripgrep less nano vim \
    && rm -rf /var/lib/apt/lists/*
 
RUN npm install -g \
    @anthropic-ai/claude-code \
    @openai/codex \
    @google/gemini-cli \
    @github/copilot \
    @musistudio/claude-code-router
 
RUN mkdir -p /root/.claude-code-router
WORKDIR /workspace
CMD ["/bin/bash"]
# commercial/build.sh (excerpt)
docker build -t commercial-ai:latest "$HOME/agents/commercial"
 
# Create per-tool identity directories
for tool in claude codex gemini ccr copilot; do
  mkdir -p "$HOME/agents/commercial/$tool/"{workspace,home,npm,config,cache}
done
# commercial/run.sh (excerpt — tool is the first argument)
TOOL="$1"   # claude | codex | gemini | copilot | ccr
# The full script resolves API_VAR (the provider key variable name,
# e.g. ANTHROPIC_API_KEY for claude) and CLI_CMD (the launch command) from $TOOL
 
docker run --rm -it \
  --name "commercial-${TOOL}-${USER}" \
  --add-host=host.docker.internal:host-gateway \
  -e "${API_VAR}=${!API_VAR}" \
  -e HOME="/home/agent" \
  -v "$HOME/agents/commercial/${TOOL}/workspace:/workspace" \
  -v "$HOME/agents/commercial/${TOOL}/home:/home/agent" \
  -v "$HOME/agents/commercial/${TOOL}/npm:/home/agent/.npm" \
  -v "$HOME/agents/commercial/${TOOL}/config:/home/agent/.config" \
  -v "$HOME/agents/commercial/${TOOL}/cache:/home/agent/.cache" \
  -w /workspace \
  commercial-ai:latest ${CLI_CMD}

The ccr (Claude Code Router) variant additionally mounts its routing config read-only:

// commercial/ccr/config.json  routing table
{
  "Router": {
    "default":          "ollama,llama3:latest",
    "background":       "ollama,qwen2.5:1.5b",
    "think":            "ollama,gemma4:e2b",
    "longContext":      "openrouter,google/gemini-2.5-pro-exp-03-25:free",
    "longContextThreshold": 60000,
    "webSearch":        "openrouter,google/gemini-2.5-pro-exp-03-25:online"
  }
}

Switching the active model within a running Claude Code session is a single slash command: /model ollama,llama3:latest.

OpenCode Deployment

# opencode/Dockerfile
FROM node:20-bookworm-slim
 
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        curl ca-certificates git bash findutils \
    && rm -rf /var/lib/apt/lists/*
 
RUN curl -fsSL https://opencode.ai/install | bash \
    && mkdir -p /opt/opencode/bin \
    && cp "$(find /root -type f -name opencode | head -n 1)" /opt/opencode/bin/opencode \
    && chmod 755 /opt/opencode/bin/opencode
 
ENV PATH="/opt/opencode/bin:${PATH}"
ENV HOME=/home/opencode
VOLUME ["/workspace"]
WORKDIR /workspace
ENTRYPOINT ["/opt/opencode/bin/opencode"]
# opencode/build.sh
docker build -t opencode:local "$HOME/agents/opencode"
mkdir -p "$HOME/agents/opencode/home"
 
# opencode/run.sh
docker run --restart no -it \
  --name opencode-${USER} \
  --add-host=host.docker.internal:host-gateway \
  -v "$HOME/agents/opencode/home:/home/opencode" \
  -v "$HOME/agents/workspace:/workspace" \
  -v "$HOME/agents/skills/core:/app/skills/core:ro" \
  opencode:local
 
# opencode/attach.sh
docker start -ai opencode-${USER}

KiloCode (VS Code extension, Node.js) is the VS Code-native member of the survey. It brings the agentic loop into the editor rather than the terminal, with direct access to the VS Code language server for diagnostics, symbol navigation, and refactoring. Connecting it to LiteLLM is a one-field change in its settings JSON.

KiloCode Deployment

# kilocode/Dockerfile
FROM debian:bookworm-slim
 
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        ca-certificates git bash wget curl && \
    rm -rf /var/lib/apt/lists/*
 
ARG TARGETARCH=amd64
 
RUN ARCH=${TARGETARCH} && \
    if [ "$ARCH" = "arm64" ]; then ARCH="arm64"; else ARCH="x64"; fi && \
    wget -qO /tmp/kilo.tar.gz \
        "https://github.com/Kilo-Org/kilocode/releases/latest/download/kilo-linux-${ARCH}.tar.gz" && \
    tar -xzf /tmp/kilo.tar.gz -C /usr/local/bin && \
    chmod +x /usr/local/bin/kilo && \
    rm /tmp/kilo.tar.gz
 
VOLUME ["/workspace"]
WORKDIR /workspace
ENTRYPOINT ["kilo"]
# kilocode/build.sh
docker build -t kilocode:local "$HOME/agents/kilocode"
mkdir -p "$HOME/agents/kilocode/home"
chown -R $(id -u):$(id -g) "$HOME/agents/kilocode/home"
chmod -R u+rwX "$HOME/agents/kilocode/home"
 
# kilocode/run.sh
docker run --restart no -it \
  --user $(id -u):$(id -g) \
  --add-host=host.docker.internal:host-gateway \
  -e HOME=/home/kilo \
  -e XDG_CONFIG_HOME=/home/kilo/.config \
  -e XDG_DATA_HOME=/home/kilo/.local/share \
  -e XDG_CACHE_HOME=/home/kilo/.cache \
  --name kilocode-${USER} \
  -e TERM=xterm-256color \
  -v "$HOME/agents/kilocode/home:/home/kilo" \
  -v "$HOME/agents/workspace:/workspace" \
  -v "$HOME/agents/skills/core:/app/skills/core:ro" \
  kilocode:local
 
# kilocode/attach.sh
docker start -ai kilocode-${USER}

pi (pi.dev, Node.js) takes a deliberately minimal stance, omitting built-in MCP, plan mode, and permission gates in favor of a package-based extensibility model. It supports OpenRouter directly through a provider block in models.json, and NVIDIA NIM endpoints are equally accessible through the same mechanism. I reach for pi primarily for rapid exploratory work precisely because it does not impose an opinionated workflow.

Pi Deployment

# pi/Dockerfile
FROM node:22-bookworm-slim
 
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    bash ca-certificates curl git openssh-client \
 && rm -rf /var/lib/apt/lists/*
 
RUN npm install -g @mariozechner/pi-coding-agent
 
ENV HOME=/home/pi-agent
VOLUME ["/workspace"]
WORKDIR /workspace
ENTRYPOINT ["pi"]
# pi/build.sh
docker build -t pi:local "$HOME/agents/pi"
mkdir -p "$HOME/agents/pi/home/.pi/agent"
 
# Write default models.json pointing at Ollama
cat > "$HOME/agents/pi/home/.pi/agent/models.json" << 'EOF'
{
  "providers": {
    "ollama": {
      "baseUrl": "http://host.docker.internal:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "models": [
        {
          "id": "qwen2.5:7b",
          "name": "Qwen 2.5 7B (Local)",
          "contextWindow": 32768,
          "maxTokens": 8192,
          "cost": { "input": 0, "output": 0 }
        }
      ]
    },
    "openrouter": {
      "baseUrl": "https://openrouter.ai/api/v1",
      "apiKey": "${OPENROUTER_API_KEY}",
      "models": [
        { "id": "google/gemini-2.5-pro-exp-03-25:free", "contextWindow": 1000000 },
        { "id": "meta-llama/llama-4-maverick:free",      "contextWindow": 128000  }
      ]
    }
  }
}
EOF
 
# pi/run.sh
docker run --restart no -it \
  --name pi-${USER} \
  --add-host=host.docker.internal:host-gateway \
  -e TERM=xterm-256color \
  -e OLLAMA_HOST="http://host.docker.internal:11434" \
  -e OPENROUTER_API_KEY="${OPENROUTER_API_KEY:-YOUR_OPENROUTER_API_KEY}" \
  -v "$HOME/agents/pi/home:/home/pi-agent" \
  -v "$HOME/agents/workspace:/workspace" \
  -v "$HOME/agents/skills/core:/app/skills/core:ro" \
  pi:local
 
# pi/attach.sh
docker start -ai pi-${USER}

Hermes is a named agent identity maintained around Nous Research’s Hermes 3 model family. The hermes/home/ directory holds a configuration and prompt library tuned for Hermes 3’s specific instruction format and function-calling conventions. Because Hermes 3 handles structured output and tool-use with uncommon consistency, I route all tool-calling-heavy workloads to this identity.

Hermes Agent Deployment

# hermes/build.sh — probe first, then run
# Confirm the image's runtime user and home directory before committing a mount path:
docker pull nousresearch/hermes-agent:latest
docker run --rm --entrypoint sh nousresearch/hermes-agent:latest \
  -c 'id && echo HOME=$HOME'
 
mkdir -p "$HOME/agents/hermes/home"
 
# hermes/run.sh
# Always launch with -it — running detached causes immediate exit
docker run --restart no -it \
  --name hermes-${USER} \
  --add-host=host.docker.internal:host-gateway \
  -e TERM=xterm-256color \
  -v "$HOME/agents/hermes/home:/home/hermes/.hermes" \
  -v "$HOME/agents/workspace:/workspace" \
  nousresearch/hermes-agent:latest
 
# hermes/attach.sh
docker start -ai hermes-${USER}

GNHF (Good Night Have Fun) is the task-bounded harness I use when I want a single-purpose agent that executes a defined workflow, reports results, and stops. The name is a ham radio sign-off, which fits its character: polite, brief, and does exactly what it was asked to do. Unlike Claude Code or pi, which are interactive and session-oriented, GNHF takes a task description and a workspace path as inputs, executes against the LiteLLM gateway, writes its outputs to the workspace, and exits.

GNHF Deployment

GNHF requires a Dockerfile placed at $HOME/agents/gnhf/Dockerfile before build.sh can execute, because the gnhf binary distribution mechanism is external to this stack. A representative starting point:

# gnhf/Dockerfile — adapt to your gnhf binary distribution
FROM node:22-bookworm
 
RUN apt-get update && apt-get install -y --no-install-recommends \
    git curl ca-certificates ripgrep less \
    && rm -rf /var/lib/apt/lists/*
 
# Install the agent CLIs that gnhf wraps
RUN npm install -g \
    @anthropic-ai/claude-code \
    @openai/codex \
    @github/copilot
 
# Install gnhf — adapt to your distribution method:
# RUN npm install -g gnhf
# or: COPY gnhf /usr/local/bin/gnhf && chmod +x /usr/local/bin/gnhf
 
WORKDIR /workspace
CMD ["/bin/bash"]
# gnhf/build.sh
if [[ ! -f "$HOME/agents/gnhf/Dockerfile" ]]; then
  echo "ERROR: place Dockerfile in $HOME/agents/gnhf/ first"; exit 1
fi
docker build -t gnhf:latest "$HOME/agents/gnhf"
 
# gnhf/run.sh — key arguments shown; full script handles key resolution per agent
# Usage: ./run.sh --agent <codex|claude|copilot> --repo <path> \
#                 [--max-iterations N] [--max-tokens N] "task description"
docker run --rm -it \
  --name "gnhf-${AGENT}-${USER}" \
  -e ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY:-YOUR_ANTHROPIC_API_KEY}" \
  -e HOME="/home/agent" \
  -v "$HOME/agents/workspace:/workspace" \
  -v "$HOME/agents/gnhf/home:/home/agent" \
  -v "$HOME/agents/gnhf/npm:/home/agent/.npm" \
  -v "$HOME/agents/gnhf/config:/home/agent/.config" \
  -v "$HOME/agents/gnhf/cache:/home/agent/.cache" \
  gnhf:latest \
  bash -lc '
    cd "$1" &&
    git config --global --add safe.directory "$1" &&
    shift && exec "$@"
  ' _ "${REPO_PATH}" gnhf --agent "${AGENT}" "${PROMPT}"

All six surveyed tools use project context files (CLAUDE.md, AGENTS.md, GEMINI.md) to provide persistent project instructions without the user restating them on every turn, and all six are converging on MCP as the standard for tool integration.
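
As a concrete illustration, a context file is ordinary Markdown dropped at the project root; the contents below are illustrative, and each tool looks for its own filename:

cat > "$HOME/agents/workspace/CLAUDE.md" << 'EOF'
# Project context
- Source lives in src/, tests in tests/; run tests with `make test`
- Never commit directly to main; create a feature branch instead
- Prefer small, reviewable diffs; flag any uncertainty with TODO comments
EOF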

The Google Ecosystem: Agents CLI and Workspace CLI

Two Google-specific tools occupy their own tier in the stack, with independent container identities and a shared philosophy of treating Google’s API surface as a set of agent-accessible tools.

Google Agents CLI is the command-line interface for Google’s Agent Development Kit (ADK), a Python framework for building multi-agent systems that run on Google’s infrastructure and interact with Gemini models. The uv cache indicates a Python-heavy dependency footprint, the evals/ directory holds evaluation datasets and result logs, and the container runs as a non-root user to match the bind-mounted volume permissions.

Google Agents CLI Deployment

# googleagentscli/Dockerfile
FROM python:3.12-slim
 
RUN apt-get update && apt-get install -y --no-install-recommends \
        ca-certificates curl git gnupg lsb-release unzip wget \
        jq vim less procps build-essential \
    && rm -rf /var/lib/apt/lists/*
 
RUN curl -fsSL https://deb.nodesource.com/setup_lts.x | bash - \
    && apt-get install -y --no-install-recommends nodejs \
    && rm -rf /var/lib/apt/lists/*
 
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /usr/local/bin/
 
# Google Cloud SDK
RUN echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] \
        https://packages.cloud.google.com/apt cloud-sdk main" \
        | tee /etc/apt/sources.list.d/google-cloud-sdk.list \
    && curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg \
        | gpg --dearmor -o /usr/share/keyrings/cloud.google.gpg \
    && apt-get update && apt-get install -y --no-install-recommends google-cloud-cli \
    && rm -rf /var/lib/apt/lists/*
 
RUN groupadd -g 1000 agent && useradd -m -u 1000 -g agent -s /bin/bash agent
 
ENV UV_TOOL_DIR=/usr/local/uv-tools
ENV PATH="${UV_TOOL_DIR}/bin:${PATH}"
 
RUN uv tool install google-agents-cli && chown -R agent:agent "${UV_TOOL_DIR}"
 
RUN mkdir -p /workspace /home/agent/.config/agents-cli /home/agent/.config/gcloud \
        /home/agent/.cache/uv /home/agent/.cache/npm /home/agent/evals /home/agent/logs \
    && chown -R agent:agent /workspace /home/agent/.config /home/agent/.cache \
        /home/agent/evals /home/agent/logs
 
USER agent
WORKDIR /workspace
CMD ["sleep", "infinity"]
# googleagentscli/build.sh
docker build --progress=plain -t google-agents-cli:local \
  "$HOME/agents/googleagentscli"
 
# googleagentscli/run.sh — starts detached; exec in with attach.sh
GCLOUD_MOUNT=()
[[ -d "$HOME/.config/gcloud" ]] && \
  GCLOUD_MOUNT=(-v "$HOME/.config/gcloud:/home/agent/.config/gcloud:ro")
 
docker run \
  --detach \
  --name "googleagentscli-${USER}" \
  --restart unless-stopped \
  --add-host=host.docker.internal:host-gateway \
  -v "$HOME/agents/googleagentscli/data:/workspace" \
  -v "$HOME/agents/googleagentscli/config:/home/agent/.config/agents-cli" \
  -v "$HOME/agents/googleagentscli/cache/uv:/home/agent/.cache/uv" \
  -v "$HOME/agents/googleagentscli/cache/npm:/home/agent/.cache/npm" \
  -v "$HOME/agents/googleagentscli/evals:/home/agent/evals" \
  -v "$HOME/agents/googleagentscli/logs:/home/agent/logs" \
  "${GCLOUD_MOUNT[@]}" \
  -e GOOGLE_API_KEY="${GOOGLE_API_KEY:-YOUR_GOOGLE_API_KEY}" \
  --workdir /workspace \
  google-agents-cli:local
 
# googleagentscli/attach.sh
docker exec -it --user agent --workdir /workspace \
  "googleagentscli-${USER}" bash --login

Post-launch authentication: Option A (AI Studio key, no Cloud billing) is docker exec -it googleagentscli-${USER} agents-cli login. Option B (Google Cloud ADC for production workloads) requires running gcloud auth application-default login on the host machine; the run.sh mounts ~/.config/gcloud read-only into the container automatically.

Google Workspace CLI is a containerized gcloud environment configured with the scopes necessary to drive the Google Workspace APIs programmatically: Gmail, Drive, Calendar, Sheets, and Docs. Authenticate once inside the container; the bind-mounted credentials directory persists across container recreations.

Google Workspace CLI Deployment

# googleworkspacecli/run.sh
docker run --restart no -it \
  --name gworkspace-${USER} \
  --add-host=host.docker.internal:host-gateway \
  -v "$HOME/agents/googleworkspacecli/gcloud:/root/.config/gcloud" \
  -v "$HOME/agents/workspace:/workspace" \
  google/cloud-sdk:slim \
  bash
 
# Inside container (first time only):
# gcloud auth login
# gcloud config set project YOUR_PROJECT_ID
 
# googleworkspacecli/attach.sh
docker start -ai gworkspace-${USER}

The authentication separation between the Workspace CLI container and the rest of the stack is intentional: the container that holds the Google credentials has no access to any other agent's identity directory, and no other container mounts the credentials directory. Anything this container exchanges with the rest of the stack flows through the shared workspace volume only.

Agent Frameworks: Agent Zero, Archon, and Mastra

The stack runs three agent frameworks that occupy distinct positions on the spectrum from fully autonomous to fully programmable.

Agent Zero

Agent Zero is the most autonomous framework in the stack, designed around the premise that the agent should be able to self-improve its own instructions and tools over the course of a session. It runs as a web UI on port 8081 and exposes a chat interface backed by a hierarchical agent system where the primary agent can spawn specialized subagents. The persistent state in a0/data/ includes the agent’s memory bank, its accumulated tool library, and its evolving system prompt, all of which carry forward across container restarts.

# a0/build.sh
docker pull agent0ai/agent-zero
mkdir -p "$HOME/agents/a0/data"
 
# a0/run.sh
docker run -d \
  --name "a0-${USER}" \
  --restart no \
  --add-host=host.docker.internal:host-gateway \
  -p 8081:80 \
  -v "$HOME/agents/a0/data:/a0/usr" \
  agent0ai/agent-zero
 
# a0/attach.sh — tail live logs (Ctrl+C safe; container keeps running)
docker logs --follow --timestamps "a0-${USER}"

Open http://localhost:8081 in a browser after the container starts.

Archon

Archon occupies a meta-level in the stack: it is an agent framework whose purpose is to help build other agent frameworks. Its Streamlit-based UI presents a development environment where I describe the agent I want to build in natural language, and Archon generates the scaffolding, tool definitions, system prompt, and evaluation harness for that agent. Archon-generated agents are configured at generation time to use the LiteLLM endpoint, so they enter the stack already wired to the unified gateway without any post-generation modification.

# archon/build.sh
docker pull ghcr.io/coleam00/archon:latest
mkdir -p "$HOME/agents/archon/data/workflows"
 
# archon/data/config.yaml — default model routing
cat > "$HOME/agents/archon/data/config.yaml" << 'EOF'
assistant: pi
assistants:
  pi:
    provider: openrouter
    model: openrouter/openrouter/free
EOF
 
# archon/run.sh — ephemeral (--rm); exits after task
[[ -z "${OPENROUTER_API_KEY:-}" ]] && \
  { echo "ERROR: OPENROUTER_API_KEY not set"; exit 1; }
 
docker run --rm \
  --name "archon-${USER}" \
  --user "$(id -u):$(id -g)" \
  -v "$HOME/agents/workspace:/home/bun/.archon/workspaces" \
  -v "$HOME/agents/archon/data:/home/bun/.archon" \
  -p 3090:3090 \
  -e OPENROUTER_API_KEY="${OPENROUTER_API_KEY}" \
  -e DEFAULT_AI_ASSISTANT=pi \
  ghcr.io/coleam00/archon:latest workflow list

Mastra

Mastra is a TypeScript-based AI agent framework that runs as a Docker Compose service exposing a REST API and a Studio UI on port 4111. It uses LibSQL for persistent conversation history, meaning that agent memory survives container restarts. Mastra occupies the programmable end of the spectrum: rather than autonomous self-direction, it provides a typed API for defining agents, tools, workflows, and memory retrievers in TypeScript.

The Mastra image is a multi-stage build. Critically, it handles the instrumentation.mjs file conditionally, since its presence varies across Mastra version upgrades.

# mastra/Dockerfile (multi-stage)
FROM node:22-alpine AS builder
WORKDIR /app
RUN apk add --no-cache gcompat
COPY package*.json ./
RUN npm install
COPY tsconfig*.json ./
COPY src ./src
RUN npx mastra build
 
FROM node:22-alpine AS runner
WORKDIR /app
RUN apk add --no-cache gcompat wget
RUN addgroup -g 1001 -S nodejs && adduser -S mastra -u 1001
COPY --from=builder --chown=mastra:nodejs /app/.mastra/output ./.mastra/output
COPY --from=builder --chown=mastra:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=mastra:nodejs /app/package.json ./package.json
RUN mkdir -p /app/data && chown mastra:nodejs /app/data
USER mastra
ENV PORT=4111
ENV NODE_ENV=production
ENV DATABASE_URL="file:/app/data/mastra.db"
EXPOSE 4111
HEALTHCHECK --interval=30s --timeout=10s --start-period=20s --retries=3 \
    CMD wget -qO- http://localhost:4111/api > /dev/null || exit 1
# Conditional instrumentation: works across Mastra versions
CMD ["sh", "-c", "if [ -f .mastra/output/instrumentation.mjs ]; then \
  node --import=./.mastra/output/instrumentation.mjs .mastra/output/index.mjs; \
  else node .mastra/output/index.mjs; fi"]

The agent definition is deliberately minimal:

// src/mastra/agents/assistant.ts
import { Agent } from "@mastra/core/agent";
 
export const assistant = new Agent({
  name: "assistant",
  instructions: "You are a helpful, concise, and accurate assistant.",
  model: "openrouter/meta-llama/llama-3.1-8b-instruct:free",
  // Other free-tier options:
  // openrouter/mistralai/mistral-7b-instruct:free
  // openrouter/google/gemma-3-12b-it:free
});
# mastra/docker-compose.yml
services:
  mastra:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: mastra-${USER}
    ports:
      - "4111:4111"
    environment:
      OPENROUTER_API_KEY: ${OPENROUTER_API_KEY}
      NODE_ENV: production
      DATABASE_URL: "file:/app/data/mastra.db"
    volumes:
      - /home/${USER}/agents/mastra/data:/app/data
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:4111/api"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
# mastra/build.sh — self-contained; creates src files if absent, prompts for key
cd "$HOME/agents/mastra"
docker compose up --build -d
 
# mastra/run.sh — start after reboot without rebuild
cd "$HOME/agents/mastra" && docker compose up -d
 
# mastra/attach.sh — tail live logs
docker logs --follow --timestamps "mastra-${USER}"

The agent server exposes a standard REST endpoint:

curl -X POST http://localhost:4111/api/agents/assistant/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Summarize this document."}]}'

Open Design: Collaborative Canvas with Embedded Agent

Open Design is a web-based collaborative design canvas with an embedded pi coding agent, running on port 5173. It is the visual layer of the stack, useful for design work where the agent can read and modify the canvas state directly. The pi identity directory is separate from the main pi tool identity, so Open Design’s agent configuration does not interfere with standalone pi sessions.

Open Design Deployment

# open-design/Dockerfile
FROM node:24-bookworm
 
ARG OPEN_DESIGN_REPO=https://github.com/nexu-io/open-design.git
ARG OPEN_DESIGN_REF=main
 
ENV APP_DIR=/opt/open-design
ENV PNPM_HOME=/root/.local/share/pnpm
ENV PATH=/root/.local/share/pnpm:/usr/local/bin:/usr/local/sbin:/usr/sbin:/usr/bin:/sbin:/bin
ENV PORT=5173
ENV HOST=0.0.0.0
ENV OD_HOST=0.0.0.0
ENV OD_ALLOWED_DEV_ORIGINS=127.0.0.1,localhost
 
RUN apt-get update && apt-get install -y --no-install-recommends \
    git ca-certificates curl bash python3 \
    && rm -rf /var/lib/apt/lists/*
 
RUN corepack enable
RUN npm install -g @mariozechner/pi-coding-agent
 
RUN git clone --branch "${OPEN_DESIGN_REF}" --depth 1 \
    "${OPEN_DESIGN_REPO}" "${APP_DIR}"
 
WORKDIR ${APP_DIR}
 
# Patch next.config.ts to accept OD_ALLOWED_DEV_ORIGINS from environment
RUN python3 - << 'PY'
from pathlib import Path
p = Path("apps/web/next.config.ts")
s = p.read_text()
old = "allowedDevOrigins: ['127.0.0.1'],"
new = """allowedDevOrigins: (
    process.env.OD_ALLOWED_DEV_ORIGINS
      ? process.env.OD_ALLOWED_DEV_ORIGINS.split(',').map((s) => s.trim()).filter(Boolean)
      : ['127.0.0.1']
  ),"""
if old not in s:
    raise SystemExit("Could not find allowedDevOrigins line")
p.write_text(s.replace(old, new))
PY
 
RUN corepack pnpm --version && pnpm install
 
EXPOSE 5173
CMD ["pnpm", "tools-dev", "run", "web", "--web-port", "5173"]
# open-design/build.sh
docker build -t open-design-pi "$HOME/agents/open-design"
mkdir -p "$HOME/agents/open-design/data"
mkdir -p "$HOME/agents/open-design/pi"
 
# First-time pi setup (run once to authenticate)
docker run --rm -it \
  -v "$HOME/agents/open-design/pi:/root/.pi" \
  open-design-pi \
  pi
# Inside pi session: /login
 
# open-design/run.sh
# OD_ALLOWED_DEV_ORIGINS must match the IP the browser uses to reach the container
docker run --rm -it \
  --name open-design-${USER} \
  -e "OD_ALLOWED_DEV_ORIGINS=YOUR_HOST_IP" \
  -p 5173:5173 \
  -v "$HOME/agents/open-design/data:/opt/open-design/.od" \
  -v "$HOME/agents/open-design/pi:/root/.pi" \
  open-design-pi

Open the canvas at http://YOUR_HOST_IP:5173. OD_ALLOWED_DEV_ORIGINS must match the IP address the browser uses to reach the container. To detect it automatically, substitute $(hostname -I | awk '{print $1}') for the hardcoded value.

Docker Volume Architecture: Identity, Workspace, Skills

One of the more carefully considered design decisions in this stack is the separation of Docker bind mounts into four independent tiers, which I call identity, workspace, skills, and tool data. This separation means that swapping a user identity, adding a skills package, or destroying an experimental container affects only its own tier; the other three are untouched.

Tier        Host Path Pattern             Container Mount       Shared?   Destroyable?
Identity    $HOME/agents/{tool}/home      varies per tool       No        Backup first
Workspace   $HOME/agents/workspace        /workspace            Yes       No
Skills      $HOME/agents/skills/{name}    /app/skills/{name}    Yes       Yes
Tool Data   $HOME/agents/{tool}/data      varies                No        Snapshot first

The identity hot-swap pattern is simple enough to describe in three commands: stop the container, remove it (volumes are untouched), and rerun with a different home/ path. The workspace and skills mount points are identical in both invocations. This makes it straightforward to work on the same project files under different API key contexts or with different tool configurations.
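
Concretely, using the pi container as the example (home-workacct/ is a hypothetical second identity directory):

docker stop pi-${USER} && docker rm pi-${USER}   # bind-mounted directories survive

# Rerun with a different identity mount; workspace and skills mounts are unchanged
docker run --restart no -it \
  --name pi-${USER} \
  --add-host=host.docker.internal:host-gateway \
  -v "$HOME/agents/pi/home-workacct:/home/pi-agent" \
  -v "$HOME/agents/workspace:/workspace" \
  -v "$HOME/agents/skills/core:/app/skills/core:ro" \
  pi:local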

Permission repair across containers with differing internal UID/GID values is handled by a disposable Alpine container:

docker run --rm \
  -v "$HOME/agents/some-tool/home:/mnt/target" \
  alpine \
  chown -R 1000:1000 /mnt/target

Container-Isolated Tool Invocation

One of the more practically useful habits I have developed with this stack is running agentic CLI tools inside dedicated Docker containers rather than installing them to my host user environment. The motivation is threefold: environment isolation, workspace scope control, and plugin sandboxing.

Workspace scope control is where the bind-mount architecture pays its most direct dividend. Rather than giving a tool access to the entire home filesystem, I mount only the specific project directories I want it to operate on:

docker run --rm -it \
  --add-host=host.docker.internal:host-gateway \
  -e ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY:-YOUR_ANTHROPIC_API_KEY}" \
  -v "$HOME/agents/commercial/claude/home:/home/agent" \
  -v "$HOME/projects/project-alpha:/workspace/project-alpha" \
  -v "$HOME/projects/project-beta:/workspace/project-beta" \
  -w /workspace/project-alpha \
  commercial-ai:latest claude

From inside the container, the agent sees exactly two project directories and nothing else. For work involving student data or grant-sensitive materials, this mount-scoping discipline is not optional; it is the architectural enforcement of the data minimization principle.

Plugin sandboxing makes it practical to evaluate new tools without risk. I can install an untrusted npm package, register a new MCP server, or try an experimental integration inside an ephemeral container with a scratch identity directory, observe its behavior against a scoped workspace mount, and discard the container entirely if I decide against it. The two-stage pattern, scratch evaluation followed by deliberate promotion to the production identity directory, is something the tiered bind-mount architecture makes nearly effortless.
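
A sketch of that two-stage pattern, again using pi as the example tool (the scratch path and promoted file set are illustrative):

# Stage 1: evaluate against a throwaway identity directory
SCRATCH=$(mktemp -d)
docker run --rm -it \
  --add-host=host.docker.internal:host-gateway \
  -v "$SCRATCH:/home/pi-agent" \
  -v "$HOME/agents/workspace:/workspace" \
  pi:local

# Stage 2: promote the configuration into the production identity, or discard it
cp -r "$SCRATCH/.pi" "$HOME/agents/pi/home/"
rm -rf "$SCRATCH"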

Open-Source Cowork Alternatives

Anthropic’s Claude Cowork launch in January 2026 triggered a vigorous open-source response. I track several of the resulting projects.

OpenWork is the most actively developed, functioning as a control surface for agentic workflows with hot-reloadable skills, session management, and SSE event stream subscriptions. It is ejectable to OpenCode, which provides a meaningful portability guarantee.

OpenWork Deployment

# openwork/Dockerfile
FROM node:22-bookworm-slim
 
ARG OPENWORK_ORCHESTRATOR_VERSION=latest
 
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    ca-certificates curl git tar unzip \
 && rm -rf /var/lib/apt/lists/*
 
RUN npm install -g "openwork-orchestrator@${OPENWORK_ORCHESTRATOR_VERSION}"
 
ENV OPENWORK_DATA_DIR=/data/openwork-orchestrator
ENV OPENWORK_SIDECAR_DIR=/data/sidecars
ENV OPENWORK_WORKSPACE=/workspace
 
EXPOSE 8787
VOLUME ["/workspace", "/data"]
 
CMD ["openwork", "serve", "--workspace", "/workspace", "--remote-access", \
     "--openwork-port", "8787", "--opencode-host", "127.0.0.1", \
     "--opencode-port", "4096", "--connect-host", "127.0.0.1", \
     "--cors", "*", "--approval", "manual", "--no-opencode-router"]
# openwork/build.sh
docker build -t openwork:local "$HOME/agents/openwork"
mkdir -p "$HOME/agents/openwork/workspace" "$HOME/agents/openwork/data"
 
# openwork/run.sh
docker run -it \
  --restart no \
  --add-host=host.docker.internal:host-gateway \
  -p 8787:8787 \
  -v "$HOME/agents/openwork/workspace:/workspace" \
  -v "$HOME/agents/openwork/data:/data" \
  -e OPENWORK_TOKEN=dev-token \
  -e OPENWORK_HOST_TOKEN=dev-host-token \
  --name openwork-${USER} \
  openwork:local
 
# openwork/attach.sh
docker start -ai openwork-${USER}

Accomplish takes a “BYO-AI” stance: it provides the orchestration layer (the hands and eyes) while leaving model choice to the user, supporting OpenAI, Anthropic, Google, and Ollama backends without lock-in. Kuse Cowork implements the agent runtime in Rust with Docker-based sandboxing. OpenCoworkAI explicitly commits to VM/bwrap isolation and checkpoint-rollback capability.

Multica occupies a different conceptual position as a managed agents platform rather than a desktop agent. It assigns issues to AI agents as you would assign them to human teammates, and it implements skill compounding: when an agent completes a task successfully, the solution is saved as a reusable skill that future tasks can leverage. This maps interestingly onto organizational learning theory, though a thoughtful critique in the project’s issue tracker notes that the human-management metaphor may be insufficient for genuinely autonomous AI orchestration at scale. I find this a productive tension worth thinking about seriously.

MCP Servers: Extending the Stack with Custom Tools

The Model Context Protocol (MCP), introduced by Anthropic in late 2024, defines how AI assistants communicate with external tools through four primitive types: tools (callable functions), resources (readable data streams), prompts (reusable templates), and sampling (delegated inference requests). Running MCP servers in Docker and connecting them to the local stack over a shared external network is straightforward.

docker network create mcp-shared

With this network in place, any container that declares mcp-shared as an external network can reach an MCP server at its container DNS name, such as http://mcp-bibliography:8000/mcp, without host port exposure. For backends that use OpenAI function-calling format rather than MCP’s JSON-RPC protocol, a thin Flask adapter service translates between the two on startup:

import json
import urllib.request

# Container DNS name on the mcp-shared network; adjust to your MCP server
MCP_URL = "http://mcp-bibliography:8000/mcp"

def mcp_post(method: str, params: dict, req_id: int = 1) -> dict:
    """Send one JSON-RPC 2.0 request to the MCP server and return the parsed reply."""
    payload = json.dumps({
        "jsonrpc": "2.0", "id": req_id,
        "method": method, "params": params
    }).encode()
    req = urllib.request.Request(
        MCP_URL, data=payload,
        headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=10) as r:
        return json.loads(r.read())
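
Attaching containers to the shared network at run time is all the DNS resolution requires; the server image name below is hypothetical:

# Start an MCP server on the shared network (image name is illustrative)
docker run -d --name mcp-bibliography \
  --network mcp-shared \
  your-mcp-bibliography-image:latest

# Any container on the same network reaches it by container name
docker run --rm --network mcp-shared curlimages/curl \
  -s http://mcp-bibliography:8000/mcp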

OpenRouter: A Cloud Model Interface

OpenRouter serves as the cloud model interface for tasks I route away from local inference. It exposes multiple model providers through an OpenAI-compatible API surface, so the same client code that talks to the local LiteLLM gateway can talk to OpenRouter without modification. Model identifiers follow a provider-qualified format:

anthropic/claude-3-opus
openai/gpt-4o
mistralai/mixtral-8x7b
meta-llama/llama-3-70b-instruct

The free-tier model list changes, but as of early 2026 includes Gemini 2.5 Pro, DeepSeek Chat v3.5, and LLaMA 4 Maverick. I use OpenRouter as a fallback in the Mastra agent server, as the primary cloud provider for pi, and as the escalation path from GNHF when a batch task exceeds local model capability. All API key management is handled through environment variables; no key is ever written into a container image.
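
Because the surface is OpenAI-compatible, the gateway smoke test from earlier works against OpenRouter with only the base URL, key, and model identifier changed:

curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer ${OPENROUTER_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/llama-3-70b-instruct", "messages": [{"role": "user", "content": "Hello"}]}'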

Reproducibility: The One-Shot Deploy Script

The entire stack, including all Dockerfiles, helper scripts, LiteLLM configuration, Mastra project files, and the directory tree, is generated by a single bash script called deploy-agents.sh. Running this script on a new machine, after providing the necessary API keys, produces a fully functional environment covering all the services described above with no manual steps (except placing a Dockerfile in gnhf/ for the task-bounded harness).

The script follows a fixed sequence: create the directory tree; write Dockerfiles for each custom image; write LiteLLM, LocalAI, and CCR configs; write Mastra project files; write GNHF task modules; write Agent Zero and Archon startup scripts; build custom images; clone and build Open Design; pull pre-built images; and finally start all daemon containers. This design means the script itself is the documentation, and any configuration drift between machines is detected by diffing the generated files against a known-good reference.

Publishing a custom Docker image to GitHub Container Registry follows the same minimal GitHub Actions pattern:

- uses: docker/login-action@v3
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}
 
- uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: ghcr.io/${{ github.repository }}:main

After the workflow runs, making the resulting package public in the GitHub package settings allows anyone to docker pull ghcr.io/yourusername/yourrepo:main without further configuration.

Model Selection Notes

A few observations on local model selection from operational experience. For the think-heavy routing slot I use gemma4:e2b; for lightweight background tasks such as file summarization and classification I use qwen2.5:1.5b; for tool-calling workflows in Open WebUI and via the Hermes agent identity I use hermes3:8b; for general interactive sessions I use llama3. The selection criteria are principally RAM footprint and whether the model’s function-calling format is well-supported by the tool in question.