<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://www.billmongan.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.billmongan.com/" rel="alternate" type="text/html" /><updated>2026-05-25T01:13:12+00:00</updated><id>https://www.billmongan.com/feed.xml</id><title type="html">William M. Mongan, Ph.D.</title><subtitle>Homepage of Dr. Bill Mongan</subtitle><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><entry><title type="html">mcpproxy: A Config-Driven MCP Host with a Built-In Web UI</title><link href="https://www.billmongan.com/posts/2026/05/mcpproxy/" rel="alternate" type="text/html" title="mcpproxy: A Config-Driven MCP Host with a Built-In Web UI" /><published>2026-05-25T00:00:00+00:00</published><updated>2026-05-25T00:00:00+00:00</updated><id>https://www.billmongan.com/posts/2026/05/mcpproxy</id><content type="html" xml:base="https://www.billmongan.com/posts/2026/05/mcpproxy/"><![CDATA[<p>The <a href="https://modelcontextprotocol.io/">Model Context Protocol</a> defines a standard
way for AI clients to discover and call tools, but standing up a personal MCP
server still usually means writing Python glue code, wiring up a framework, and
restarting a process every time you add a tool.  <a href="https://github.com/BillJr99/mcpproxy">mcpproxy</a>
takes a different approach: every tool provider is a single YAML file, the server
loads those files at startup with no code changes to the host, and a browser-based
web UI handles the full provider lifecycle — editing, secret management, and
live command streaming — without leaving the browser.</p>

<blockquote>
  <p><strong>Experimental software — use with caution and in isolation.</strong>
<code class="language-plaintext highlighter-rouge">mcpproxy</code> is a research prototype provided as-is, with no guarantees of
security, stability, or fitness for any particular purpose.  It has not
undergone a security audit.  The web UI has no authentication.  Do not
expose it to untrusted networks or use it to process sensitive data in
production.  Run it on a trusted network, preferably in a container or VM
you are prepared to reset.  See the LICENSE for full MIT terms.</p>
</blockquote>

<h2 id="how-it-works">How It Works</h2>

<p><code class="language-plaintext highlighter-rouge">mcpproxy</code> exposes two ports: the MCP endpoint on <strong>8888</strong> (<code class="language-plaintext highlighter-rouge">http://localhost:8888/mcp</code>)
and a web UI on <strong>8889</strong> (<code class="language-plaintext highlighter-rouge">http://localhost:8889</code>).  The core server (<code class="language-plaintext highlighter-rouge">server.py</code>)
scans a <code class="language-plaintext highlighter-rouge">tools/</code> directory at startup.  Each YAML file there is a <em>provider</em>.
A provider either embeds Python <code class="language-plaintext highlighter-rouge">async def</code> handler functions directly in a
<code class="language-plaintext highlighter-rouge">code:</code> block, or delegates to an existing MCP npm package via an <code class="language-plaintext highlighter-rouge">npx:</code> block.
<code class="language-plaintext highlighter-rouge">server.py</code> executes each code block (or spawns the npx process), registers
every declared tool automatically, and serves them all through the single MCP
endpoint — no changes to <code class="language-plaintext highlighter-rouge">server.py</code> ever needed.</p>

<p>A minimal Python provider looks like this:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">code</span><span class="pi">:</span> <span class="pi">|</span>
  <span class="s">import datetime</span>
  <span class="s">from typing import Any</span>

  <span class="s">async def ping(context: dict[str, Any], message: str = "hello") -&gt; dict[str, Any]:</span>
      <span class="s">return {</span>
          <span class="s">"ok": True,</span>
          <span class="s">"echo": message,</span>
          <span class="s">"timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),</span>
      <span class="s">}</span>

<span class="na">tools</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">ping</span>
    <span class="na">function</span><span class="pi">:</span> <span class="s">ping</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Echo a message back with a server-side UTC timestamp.</span>
    <span class="na">input_schema</span><span class="pi">:</span>
      <span class="na">type</span><span class="pi">:</span> <span class="s">object</span>
      <span class="na">properties</span><span class="pi">:</span>
        <span class="na">message</span><span class="pi">:</span>
          <span class="na">type</span><span class="pi">:</span> <span class="s">string</span>
          <span class="na">default</span><span class="pi">:</span> <span class="s2">"</span><span class="s">hello"</span>
          <span class="na">description</span><span class="pi">:</span> <span class="s">The text to echo back.</span>
      <span class="na">required</span><span class="pi">:</span> <span class="pi">[]</span>
</code></pre></div></div>
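
<p>As a rough sketch of how a host could turn a provider like this into callable tools (an illustration under stated assumptions, not the actual <code class="language-plaintext highlighter-rouge">server.py</code> logic): the parsed YAML is assumed to arrive as a plain dict, and <code class="language-plaintext highlighter-rouge">load_provider</code>, <code class="language-plaintext highlighter-rouge">PROVIDER</code>, and <code class="language-plaintext highlighter-rouge">registry</code> are hypothetical names.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import asyncio
import inspect

# Stand-in for the parsed YAML of a provider (assumed shape, trimmed for brevity).
PROVIDER = {
    "code": (
        "async def ping(context, message='hello'):\n"
        "    return {'ok': True, 'echo': message}\n"
    ),
    "tools": [{"name": "ping", "function": "ping"}],
}


def load_provider(provider):
    """Execute the provider's code block and map tool names to handlers."""
    namespace = {}
    # exec runs arbitrary code, so only load providers you trust.
    exec(provider["code"], namespace)
    registry = {}
    for tool in provider["tools"]:
        handler = namespace[tool["function"]]
        if not inspect.iscoroutinefunction(handler):
            raise TypeError(tool["function"] + " must be declared with async def")
        registry[tool["name"]] = handler
    return registry


registry = load_provider(PROVIDER)
result = asyncio.run(registry["ping"]({}, message="hi"))
print(result)
</code></pre></div></div>

<p>Because the code block is executed as ordinary Python, a provider file deserves the same trust as any other script run on the host.</p>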

<p>An npx-based provider is even shorter — just point at the package:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">npx</span><span class="pi">:</span>
  <span class="na">command</span><span class="pi">:</span> <span class="s">npx @playwright/mcp@latest --headless --isolated</span>

<span class="na">tools</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">browser_navigate</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Navigate to a URL in a browser.</span>
    <span class="na">input_schema</span><span class="pi">:</span>
      <span class="na">type</span><span class="pi">:</span> <span class="s">object</span>
      <span class="na">properties</span><span class="pi">:</span>
        <span class="na">url</span><span class="pi">:</span>
          <span class="na">type</span><span class="pi">:</span> <span class="s">string</span>
      <span class="na">required</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">url</span><span class="pi">]</span>
</code></pre></div></div>

<h2 id="secret-injection">Secret Injection</h2>

<p>Secrets are declared in the provider YAML and injected from the environment at
call time.  The key design property is that secret values are <strong>never part of
the MCP tool schema</strong> — the LLM never sees them:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">tools</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">get_weather</span>
    <span class="na">function</span><span class="pi">:</span> <span class="s">get_weather</span>
    <span class="s">...</span>
    <span class="na">secrets</span><span class="pi">:</span>
      <span class="na">env</span><span class="pi">:</span>
        <span class="na">api_key</span><span class="pi">:</span> <span class="s">WEATHER_API_KEY</span>   <span class="c1"># handler arg → env var name</span>
</code></pre></div></div>

<p>The server reads <code class="language-plaintext highlighter-rouge">WEATHER_API_KEY</code> from <code class="language-plaintext highlighter-rouge">.env</code> (loaded via Docker Compose
<code class="language-plaintext highlighter-rouge">env_file</code>) and passes it to the handler as the <code class="language-plaintext highlighter-rouge">api_key</code> argument.  The LLM’s
tool schema only shows the public parameters.</p>
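
<p>The separation between the advertised schema and the injected arguments can be sketched roughly as follows. This is an illustration, not the project's code: the <code class="language-plaintext highlighter-rouge">city</code> parameter and the helper names <code class="language-plaintext highlighter-rouge">public_schema</code> and <code class="language-plaintext highlighter-rouge">build_call_args</code> are assumptions made for the example.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os

# Hypothetical tool definition mirroring the YAML above.
TOOL = {
    "name": "get_weather",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    "secrets": {"env": {"api_key": "WEATHER_API_KEY"}},
}


def public_schema(tool):
    """Schema advertised to the client: the secrets mapping is not part of it."""
    return tool["input_schema"]


def build_call_args(tool, client_args, environ=os.environ):
    """Merge client arguments with secrets resolved from the environment."""
    args = dict(client_args)
    for arg_name, env_name in tool.get("secrets", {}).get("env", {}).items():
        if env_name not in environ:
            raise KeyError("missing secret env var: " + env_name)
        # Secrets overwrite any client-supplied value of the same name.
        args[arg_name] = environ[env_name]
    return args


call_args = build_call_args(
    TOOL, {"city": "Philadelphia"}, {"WEATHER_API_KEY": "demo-key"}
)
print(sorted(call_args))
</code></pre></div></div>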

<h2 id="the-web-ui">The Web UI</h2>

<p>A FastAPI frontend on port 8889 handles the full provider lifecycle.</p>

<p><strong>Tools tab</strong> — lists all loaded providers in a left panel.  Click any provider
to open a form editor with its documentation, code, and per-tool fields (name,
description, parameters).  Add or remove tools with the <strong>+ Add Tool</strong> / <strong>✕</strong>
buttons, save to disk, and restart the MCP server in place — no shell access
needed.</p>

<p><strong>+ New Provider wizard</strong> — choose between a Python code provider (write
<code class="language-plaintext highlighter-rouge">async def</code> functions) and an npx package provider (enter an npx command;
the UI auto-introspects the MCP server and populates tool definitions).  The
wizard’s final step lists all required secrets and writes them to <code class="language-plaintext highlighter-rouge">.env</code>
directly.</p>

<p><strong>🔑 Secrets panel</strong> — reads all <code class="language-plaintext highlighter-rouge">secrets.env</code> entries from the selected
provider, shows which variables are already set in <code class="language-plaintext highlighter-rouge">.env</code>, and lets you fill
in or update missing values interactively.</p>

<p><strong>🛠 Run Command</strong> — runs any shell command inside the server environment and
streams output live.  Particularly useful for npx-based providers: after adding
a Playwright provider, install the Chrome binary with
<code class="language-plaintext highlighter-rouge">npx playwright install chrome</code> right from the browser panel.</p>

<h2 id="connecting-a-client">Connecting a Client</h2>

<p>The MCP endpoint is <code class="language-plaintext highlighter-rouge">http://localhost:8888/mcp</code>.  Most major clients support
the HTTP transport natively:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Claude Code</span>
claude mcp add <span class="nt">--transport</span> http mcpproxy http://localhost:8888/mcp
</code></pre></div></div>

<p>For Claude Desktop, Cursor, Cline, Continue, OpenCode, and Windsurf, add a
JSON server entry pointing at the same URL.  For Ollama (which does not speak
MCP natively), the included <code class="language-plaintext highlighter-rouge">tests/ollama_agent.py</code> bridges MCP → Ollama
tool-calling automatically:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python3 tests/ollama_agent.py <span class="s2">"List the tools you have available"</span>
</code></pre></div></div>
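
<p>For orientation, MCP messages are JSON-RPC 2.0 requests. The sketch below only builds the request bodies a client would send to list tools and to call one; it leaves out the initialization handshake and all transport details, and the helper name <code class="language-plaintext highlighter-rouge">jsonrpc_request</code> is hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import itertools
import json

_ids = itertools.count(1)  # JSON-RPC requests carry a unique id


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request body as a dict."""
    body = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        body["params"] = params
    return body


list_request = jsonrpc_request("tools/list")
call_request = jsonrpc_request(
    "tools/call", {"name": "ping", "arguments": {"message": "hi"}}
)
print(json.dumps(call_request, indent=2))
</code></pre></div></div>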

<h2 id="docker">Docker</h2>

<p>A pre-built image is published to the GitHub Container Registry on every push
to <code class="language-plaintext highlighter-rouge">main</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker pull ghcr.io/billjr99/mcpproxy:latest
docker run <span class="nt">-d</span> <span class="nt">--rm</span> <span class="se">\</span>
  <span class="nt">-p</span> 8888:8888 <span class="nt">-p</span> 8889:8889 <span class="se">\</span>
  <span class="nt">--env-file</span> .env <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">pwd</span><span class="si">)</span><span class="s2">/tools"</span>:/app/tools <span class="se">\</span>
  <span class="nt">--name</span> mcpproxy <span class="se">\</span>
  ghcr.io/billjr99/mcpproxy:latest
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">tools/</code> directory is gitignored and never baked into the image — it is
always mounted at runtime so your provider files stay outside the container.
For a persistent home directory setup that also lets the web UI’s Secrets
panel read and write <code class="language-plaintext highlighter-rouge">.env</code>, bind-mount <code class="language-plaintext highlighter-rouge">~/.mcpproxy/.env</code> into the container
and pass <code class="language-plaintext highlighter-rouge">-e MCP_ENV_FILE=/app/.env</code>.</p>

<p>A <code class="language-plaintext highlighter-rouge">docker-compose.override.yml</code> is provided for local development with
bind mounts; the base <code class="language-plaintext highlighter-rouge">docker-compose.yml</code> uses named volumes for
production/CI.</p>

<h2 id="getting-started">Getting Started</h2>

<p>The fastest path is <code class="language-plaintext highlighter-rouge">run_local.sh</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/BillJr99/mcpproxy
<span class="nb">cd </span>mcpproxy
./run_local.sh
</code></pre></div></div>

<p>The script generates <code class="language-plaintext highlighter-rouge">.env.example</code> from any existing tool YAMLs, prompts for
missing secret values, creates a virtualenv, installs dependencies, and starts
both the MCP server and the web UI.  Then open <code class="language-plaintext highlighter-rouge">http://localhost:8889</code>, click
<strong>+ New Provider</strong>, and add your first tool.</p>

<p>The unit test suite covers <code class="language-plaintext highlighter-rouge">server.py</code> helpers and all <code class="language-plaintext highlighter-rouge">frontend/app.py</code>
endpoints and runs on every push via GitHub Actions:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pip <span class="nb">install</span> <span class="nt">-r</span> requirements.txt <span class="nt">-r</span> requirements-dev.txt
pytest tests/ <span class="nt">-v</span>
</code></pre></div></div>

<p>Source code and further documentation are at
<a href="https://github.com/BillJr99/mcpproxy">https://github.com/BillJr99/mcpproxy</a>.</p>]]></content><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><category term="ai" /><category term="mcp" /><category term="agentic" /><category term="python" /><category term="docker" /><summary type="html"><![CDATA[The Model Context Protocol defines a standard way for AI clients to discover and call tools, but standing up a personal MCP server still usually means writing Python glue code, wiring up a framework, and restarting a process every time you add a tool. mcpproxy takes a different approach: every tool provider is a single YAML file, the server reloads tools at startup without any code changes to the host, and a browser-based web UI handles the full provider lifecycle — editing, secret management, and live command streaming — without leaving the browser.]]></summary></entry><entry><title type="html">AutoGUI: A Vendor-Neutral Desktop Automation Agent for LLMs</title><link href="https://www.billmongan.com/posts/2026/05/autogui/" rel="alternate" type="text/html" title="AutoGUI: A Vendor-Neutral Desktop Automation Agent for LLMs" /><published>2026-05-15T00:00:00+00:00</published><updated>2026-05-15T00:00:00+00:00</updated><id>https://www.billmongan.com/posts/2026/05/autogui</id><content type="html" xml:base="https://www.billmongan.com/posts/2026/05/autogui/"><![CDATA[<p>Most LLM agents can read files, call APIs, and run shell commands, but they have
no reliable way to operate a graphical desktop. They cannot click a button in a
running application, verify that a dialog appeared, fill a form field, or observe
what is currently on screen. <a href="https://github.com/BillJr99/AutoGUI">AutoGUI</a> is a
research prototype that fills that gap. It connects any OpenAI-compatible LLM —
including models served locally through <a href="https://openwebui.com/">OpenWebUI</a> or
directly through <a href="https://ollama.com/">Ollama</a> — to a full suite of OS-level desktop
controls via a ReAct-style agentic loop.</p>

<blockquote>
  <p><strong>Experimental software — use at your own risk.</strong>
AutoGUI is a research prototype. It is not intended for, and has not been
evaluated or deemed suitable for, any particular purpose, production
use, or critical workload. No warranty is provided, express or implied.
The agent operates at OS level and can run shell commands, click anything,
type anywhere, read and write files, and take screenshots.
<strong>Run it only in a sandbox, VM, or container that you are willing to reset.</strong>
Restrict the REST API to loopback (<code class="language-plaintext highlighter-rouge">AUTOGUI_API_HOST=127.0.0.1</code>) and consider
disabling shell access (<code class="language-plaintext highlighter-rouge">"allowed_shell": false</code>) if you do not fully trust
the task or the model driving it.</p>
</blockquote>

<h2 id="two-delivery-modes">Two Delivery Modes</h2>

<p>AutoGUI ships in two forms. The standalone Python CLI/TUI agent connects any
OpenWebUI instance (or any OpenAI-compatible endpoint) to your desktop. The
native TypeScript Pi extension brings the same tools into the
<a href="https://pi.dev">Pi</a> terminal harness behind a single <code class="language-plaintext highlighter-rouge">/autogui</code> command, with
no dependency on the Python agent or OpenWebUI.</p>

<p>Both share the same tool surface — shell execution, filesystem access, pixel
and accessibility-tree clicking, Playwright browser automation — but they differ
in how the LLM loop is owned. The standalone agent runs its own ReAct loop;
the Pi extension delegates loop ownership to Pi.</p>

<h2 id="architecture">Architecture</h2>

<p>The standalone Python agent is organized around five main components.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main.py             Entry point — validation, component wiring, TUI/CLI dispatch
│
├── agent.py        ReAct loop + typed-plan controller
│   └─ controller   Preflight, predicate checks, replan-on-block, budget ceilings
│
├── tools.py        Tool registry
│   ├─ shell_run / fs_read / fs_write / fs_list
│   ├─ desktop_screenshot / click / type / hotkey / scroll / launch
│   ├─ desktop_click_element   (a11y-first: UIAutomation / AT-SPI)
│   ├─ desktop_click_mark      (Set-of-Mark grounding)
│   ├─ browser_navigate / click / fill / eval   (Playwright)
│   ├─ skill_save / skill_list / skill_run
│   └─ memory_get / memory_note
│
├── backends/       Platform-specific automation backends
│   ├─ windows.py   UIAutomation + SendInput (user32)
│   ├─ macos.py     screencapture + osascript
│   ├─ linux_x11.py xdotool + wmctrl
│   └─ linux_wayland.py  grim + ydotool + swaymsg
│
├── api.py          FastAPI REST server (auto-started in background)
│   ├─ POST /api/task          Submit task
│   ├─ GET  /api/task/{id}/stream  SSE live event stream
│   └─ GET  /api/healthz
│
└── tui.py          Textual TUI (status bar, model picker, tool visibility toggle)
</code></pre></div></div>

<p>The agentic loop in <code class="language-plaintext highlighter-rouge">agent.py</code> follows a standard ReAct pattern — append the
user message to history, POST the history and tool schemas to the LLM, receive
either a tool call or a stop response, execute the tool, append the result, and
repeat. On top of that loop sits the controller, which adds:</p>

<ul>
  <li><strong>Planner</strong> — one extra LLM call up front that produces a numbered, high-level
plan injected as a <code class="language-plaintext highlighter-rouge">[PLAN]</code> block into the executor’s context</li>
  <li><strong>Plan critique</strong> — a second LLM call that reviews the plan and returns a
revised version when issues are found</li>
  <li><strong>Preflight</strong> — before the first state-changing action, verifies that apps are
on PATH, files exist, URLs are TCP-reachable, and named tools are registered</li>
  <li><strong>Predicate checks</strong> — when a plan step declares a typed post-condition
(<code class="language-plaintext highlighter-rouge">window_title_contains</code>, <code class="language-plaintext highlighter-rouge">file_exists</code>, <code class="language-plaintext highlighter-rouge">text_visible</code>, etc.), the controller
verifies it deterministically after each step completes</li>
  <li><strong>Replan-on-block</strong> — when a step is classified as BLOCKED, the controller
re-invokes the planner with the failure reason as context</li>
  <li><strong>Visual diff</strong> — perceptual hash of pre/post screenshots flags silent no-ops
where a state-changing action left the screen unchanged</li>
  <li><strong>Watchdog</strong> — detects when the loop is stuck by hashing the per-iteration
signature and routing repeated matches through the BLOCKED path</li>
</ul>
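
<p>The basic loop and a simplified watchdog might look like the sketch below. It is a toy under stated assumptions: <code class="language-plaintext highlighter-rouge">fake_llm</code> stands in for the model call, the message format is invented for the example, and the watchdog here simply remembers identical tool calls instead of hashing a full per-iteration signature.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json


def fake_llm(history, tool_names):
    """Stand-in for the model call: request one tool, then stop."""
    if not any(msg["role"] == "tool" for msg in history):
        return {"tool_call": {"name": "fs_list", "arguments": {"path": "."}}}
    return {"stop": "Listed the directory."}


def fs_list(path):
    """Toy tool; a real fs_list would read the filesystem."""
    return ["notes.txt", "report.pdf"]


TOOLS = {"fs_list": fs_list}


def react_loop(user_message, llm, tools, max_steps=8):
    history = [{"role": "user", "content": user_message}]
    seen = set()  # watchdog: signatures of tool calls already issued
    for _ in range(max_steps):
        reply = llm(history, list(tools))
        if "stop" in reply:
            history.append({"role": "assistant", "content": reply["stop"]})
            return history
        call = reply["tool_call"]
        signature = json.dumps(call, sort_keys=True)
        if signature in seen:
            # A repeated identical call suggests the loop is stuck.
            history.append({"role": "system", "content": "BLOCKED: repeated call"})
            return history
        seen.add(signature)
        result = tools[call["name"]](**call["arguments"])
        history.append(
            {"role": "tool", "name": call["name"], "content": json.dumps(result)}
        )
    history.append({"role": "system", "content": "BLOCKED: step budget exhausted"})
    return history


history = react_loop("What files are here?", fake_llm, TOOLS)
print([msg["role"] for msg in history])
</code></pre></div></div>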

<h2 id="platform-support">Platform Support</h2>

<p>The correct backend is detected automatically at startup via <code class="language-plaintext highlighter-rouge">platform_detect.detect()</code>.
No configuration is required.</p>

<table>
  <thead>
    <tr>
      <th>Platform</th>
      <th>Screenshot</th>
      <th>Click/Type</th>
      <th>Accessibility Tree</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Windows</strong></td>
      <td>pyautogui</td>
      <td>SendInput (user32)</td>
      <td>UIAutomation</td>
    </tr>
    <tr>
      <td><strong>WSL</strong></td>
      <td>pyautogui</td>
      <td>pyautogui</td>
      <td>PowerShell UIAutomation</td>
    </tr>
    <tr>
      <td><strong>macOS</strong></td>
      <td>screencapture</td>
      <td>pyautogui</td>
      <td>osascript</td>
    </tr>
    <tr>
      <td><strong>Linux X11</strong></td>
      <td>pyautogui</td>
      <td>xdotool</td>
      <td>AT-SPI</td>
    </tr>
    <tr>
      <td><strong>Linux Wayland</strong></td>
      <td>grim</td>
      <td>ydotool</td>
      <td>AT-SPI</td>
    </tr>
  </tbody>
</table>
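
<p>Automatic backend selection of this kind can be approximated with the standard library. The sketch below is not the project's <code class="language-plaintext highlighter-rouge">platform_detect.detect()</code>; the function name <code class="language-plaintext highlighter-rouge">detect_backend</code>, the returned labels, and the WSL and Wayland heuristics are assumptions for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os
import platform


def detect_backend(system=None, environ=None, release=None):
    """Return a backend label; the parameters exist so the logic can be tested."""
    system = system or platform.system()
    environ = os.environ if environ is None else environ
    release = platform.release() if release is None else release
    if system == "Windows":
        return "windows"
    if system == "Darwin":
        return "macos"
    if system == "Linux":
        # WSL kernels conventionally include "microsoft" in the release string.
        if "microsoft" in release.lower():
            return "wsl"
        wayland = environ.get("WAYLAND_DISPLAY")
        if wayland or environ.get("XDG_SESSION_TYPE") == "wayland":
            return "linux_wayland"
        return "linux_x11"
    raise RuntimeError("unsupported platform: " + system)


print(detect_backend())
</code></pre></div></div>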

<p>On Windows, click and type operations go through <code class="language-plaintext highlighter-rouge">user32.SendInput</code> directly
via ctypes, producing real INPUT events that most applications handle like physical
keyboard and mouse input, with correct per-monitor DPI handling and full Unicode
support. On Linux, the <code class="language-plaintext highlighter-rouge">desktop_click_element</code> tool talks to the accessibility
tree via AT-SPI, letting the agent click real UI controls by name and role
rather than guessing pixel positions.</p>

<h2 id="a11y-first-clicking-and-set-of-mark-grounding">A11y-First Clicking and Set-of-Mark Grounding</h2>

<p>AutoGUI implements two layers above raw pixel clicking.</p>

<p><code class="language-plaintext highlighter-rouge">desktop_click_element(name, control_type)</code> resolves the target UI control
through the OS accessibility API and clicks it by logical identity rather than
screen position. This survives window moves, DPI scaling, and async UI redraws
that would break a coordinate-based click.</p>

<p>For cases where the accessibility tree is sparse or unavailable, Set-of-Mark
grounding overlays numbered boxes on detected UI elements in a screenshot. The
model selects an element by ID via <code class="language-plaintext highlighter-rouge">desktop_click_mark(mark_id)</code> rather than
guessing coordinates. The fallback ladder is: <code class="language-plaintext highlighter-rouge">desktop_click_element</code> →
<code class="language-plaintext highlighter-rouge">desktop_click_text</code> (OCR anchor) → <code class="language-plaintext highlighter-rouge">desktop_click_mark</code> → <code class="language-plaintext highlighter-rouge">desktop_click(x, y)</code>.</p>
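
<p>A fallback ladder like this one can be expressed as an ordered list of strategies tried in turn. The sketch below uses stub functions in place of the real tools and omits the final coordinate-based rung; <code class="language-plaintext highlighter-rouge">click_with_fallback</code> and the stubs are hypothetical names, not AutoGUI's API.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def click_with_fallback(target, strategies):
    """Try each (name, func) strategy in order; return the first that succeeds."""
    errors = []
    for name, strategy in strategies:
        try:
            if strategy(target):
                return name, errors
            errors.append(name + ": target not found")
        except Exception as exc:  # a failing rung should not abort the ladder
            errors.append(name + ": " + str(exc))
    raise LookupError(
        "no strategy could click " + repr(target) + "; tried: " + "; ".join(errors)
    )


# Stubs standing in for the real tools; each returns True on success.
def by_element(target):
    raise RuntimeError("accessibility tree unavailable")


def by_text(target):
    return False  # pretend OCR did not find the label


def by_mark(target):
    return True


LADDER = [
    ("desktop_click_element", by_element),
    ("desktop_click_text", by_text),
    ("desktop_click_mark", by_mark),
]

used, errors = click_with_fallback("Save", LADDER)
print(used, errors)
</code></pre></div></div>

<p>Collecting the reasons each rung failed gives the caller something concrete to report when the whole ladder is exhausted.</p>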

<h2 id="skill-library-and-app-memory">Skill Library and App Memory</h2>

<p>Two optional persistence layers accumulate knowledge across tasks.</p>

<p>The skill library records successful tool sequences with <code class="language-plaintext highlighter-rouge">skill_save</code> and
returns them by keyword with <code class="language-plaintext highlighter-rouge">skill_list</code> and <code class="language-plaintext highlighter-rouge">skill_run</code>. At the start of each
task the planner receives the top-3 matching skills as few-shot exemplars, so
recurring workflows get faster and more reliable over time. Skill creation is
gated by <code class="language-plaintext highlighter-rouge">agent.skills_enabled</code> (default false); reads always work.</p>
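
<p>How the top three skills are chosen is not specified here. One simple possibility, shown purely as an assumption, is to rank stored skills by keyword overlap with the task text; <code class="language-plaintext highlighter-rouge">top_skills</code> and the sample skill records are hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_skills(task, skills, k=3):
    """Rank skills by keyword overlap with the task and return the best k names."""
    task_words = tokenize(task)
    scored = []
    for skill in skills:
        overlap = len(task_words.intersection(tokenize(skill["description"])))
        if overlap:
            scored.append((overlap, skill["name"]))
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:k]]


SKILLS = [
    {"name": "export_pdf", "description": "export the open document to pdf"},
    {"name": "send_email", "description": "compose and send an email"},
    {"name": "rename_file", "description": "rename a file in the file manager"},
    {"name": "print_document", "description": "print the open document"},
]

matches = top_skills("Export the open document as a PDF file", SKILLS)
print(matches)
</code></pre></div></div>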

<p>The app-memory store records per-app quirks, failure counts, and free-form notes
via <code class="language-plaintext highlighter-rouge">memory_note</code>. At the start of each task the planner receives app memory
hints for any visible applications, biasing plans toward strategies that worked
before. Memory creation is gated by <code class="language-plaintext highlighter-rouge">agent.memory.enabled</code> (default false);
reads always work.</p>

<h2 id="rest-api-and-tui">REST API and TUI</h2>

<p>The FastAPI REST server starts automatically in the background whenever you run
<code class="language-plaintext highlighter-rouge">main.py</code>. It exposes task submission, SSE live event streaming, and a liveness
probe, making AutoGUI accessible to web UIs, scripts, and CI pipelines without
the terminal UI.</p>
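
<p>A script consuming the live stream has to parse Server-Sent Events. The sketch below is a simplified parser over lines of text; the event names and payloads in <code class="language-plaintext highlighter-rouge">SAMPLE</code> are invented for the example and are not AutoGUI's actual event schema.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json


def field_value(line, prefix):
    """Return the text after the field prefix, minus one optional leading space."""
    value = line[len(prefix):]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of Server-Sent Events lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = field_value(line, "event:")
        elif line.startswith("data:"):
            data.append(field_value(line, "data:"))


# Hypothetical stream contents for demonstration.
SAMPLE = [
    "event: tool_call\n",
    'data: {"name": "desktop_screenshot"}\n',
    "\n",
    ": keep-alive\n",
    "event: done\n",
    'data: {"ok": true}\n',
    "\n",
]

events = [(name, json.loads(payload)) for name, payload in parse_sse(SAMPLE)]
print(events)
</code></pre></div></div>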

<p>The Textual TUI provides a scrollable conversation pane with a status bar
showing the current model name, conversation length, and tool visibility state.
The live model picker (Ctrl+P → “Change Model”) fetches the current model list
from the server; selecting a model takes effect immediately and can optionally
be persisted to <code class="language-plaintext highlighter-rouge">config.json</code>.</p>

<h2 id="experimental-nature-and-safety-considerations">Experimental Nature and Safety Considerations</h2>

<p>AutoGUI is a research prototype, not a production tool. A few properties of the
current design are worth understanding before running it.</p>

<p>The destructive command guard in <code class="language-plaintext highlighter-rouge">shell_run</code> blocks patterns like <code class="language-plaintext highlighter-rouge">rm -rf</code>,
<code class="language-plaintext highlighter-rouge">DROP TABLE</code>, and <code class="language-plaintext highlighter-rouge">dd if=</code>, but it is a regex filter, not a sandbox. For
untrusted tasks, set <code class="language-plaintext highlighter-rouge">"allowed_shell": false</code> or run the agent inside a
container with a disposable filesystem.</p>
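
<p>To see why a pattern filter falls short of a sandbox, consider the minimal sketch below. The patterns are illustrative only and are not taken from AutoGUI's source; <code class="language-plaintext highlighter-rouge">is_blocked</code> is a hypothetical helper.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re

# Illustrative patterns only; the actual list in AutoGUI may differ.
DESTRUCTIVE_PATTERNS = [
    re.compile(r"\brm\s+-[a-z]*r[a-z]*f", re.IGNORECASE),
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
    re.compile(r"\bdd\s+if=", re.IGNORECASE),
]


def is_blocked(command):
    """Return True when the command matches a known destructive pattern."""
    return any(pattern.search(command) for pattern in DESTRUCTIVE_PATTERNS)


for command in ["rm -rf /tmp/x", "DROP TABLE users;", "ls -la", "find . -delete"]:
    print(command, is_blocked(command))
</code></pre></div></div>

<p>In this sketch <code class="language-plaintext highlighter-rouge">find . -delete</code> passes the filter even though it removes files, which illustrates why isolation is still needed for untrusted tasks.</p>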

<p>The REST API has no authentication and binds to <code class="language-plaintext highlighter-rouge">0.0.0.0</code> by default. Set
<code class="language-plaintext highlighter-rouge">AUTOGUI_API_HOST=127.0.0.1</code> for loopback-only access or
<code class="language-plaintext highlighter-rouge">AUTOGUI_DISABLE_API=1</code> to disable the background API entirely.</p>

<p>The agent operates at OS level. It can click anything, type anywhere, read and
write files, and take screenshots. The safety countdown (<code class="language-plaintext highlighter-rouge">safety.command_confirm_delay_seconds</code>,
default 5 seconds) gives a visible window before each tool dispatch, with
Escape-to-cancel, but it is not a substitute for running in an environment you
are prepared to reset.</p>

<p>For the same reason, AutoGUI should not be pointed at tasks that involve
sensitive data or credentials unless the environment is appropriately isolated.
Screen content captured during observation passes through the model’s context
window; if you are using a cloud-hosted model, treat everything visible on
screen as potentially logged.</p>

<h2 id="getting-started">Getting Started</h2>

<p>Clone the repository from <a href="https://github.com/BillJr99/AutoGUI">https://github.com/BillJr99/AutoGUI</a>,
install the Python dependencies, copy <code class="language-plaintext highlighter-rouge">config.json.example</code> to <code class="language-plaintext highlighter-rouge">config.json</code>,
and set your OpenWebUI URL, API key, and model. Run <code class="language-plaintext highlighter-rouge">python main.py --check</code>
to verify connectivity before launching the TUI or issuing single-command
tasks.</p>

<p>The optional install script at <code class="language-plaintext highlighter-rouge">scripts/install-dependencies.sh</code> (or <code class="language-plaintext highlighter-rouge">.cmd</code>/<code class="language-plaintext highlighter-rouge">.ps1</code>
on Windows) adds Tesseract for OCR-anchored clicking, Playwright for browser
automation, AT-SPI on Linux for accessibility-tree element clicking, and
ImageMagick for Set-of-Mark overlays and failure GIF recording.</p>

<p>A pytest suite under <code class="language-plaintext highlighter-rouge">tests/</code> exercises the controller, predicates, budget
ceilings, preflight, and visual diff modules with no live model and no desktop
required — useful for validating changes to the orchestration logic without
burning real API calls.</p>]]></content><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><category term="ai" /><category term="agents" /><category term="desktop" /><category term="automation" /><category term="python" /><category term="typescript" /><summary type="html"><![CDATA[Most LLM agents can read files, call APIs, and run shell commands, but they have no reliable way to operate a graphical desktop. They cannot click a button in a running application, verify that a dialog appeared, fill a form field, or observe what is currently on screen. AutoGUI is a research prototype that fills that gap. It connects any OpenAI-compatible LLM — including models served locally through OpenWebUI or directly through Ollama — to a full suite of OS-level desktop controls via a ReAct-style agentic loop.]]></summary></entry><entry><title type="html">BetterWebUI: A Faculty-Friendly Agentic Front End for OpenWebUI</title><link href="https://www.billmongan.com/posts/2026/05/betterwebui/" rel="alternate" type="text/html" title="BetterWebUI: A Faculty-Friendly Agentic Front End for OpenWebUI" /><published>2026-05-15T00:00:00+00:00</published><updated>2026-05-15T00:00:00+00:00</updated><id>https://www.billmongan.com/posts/2026/05/betterwebui</id><content type="html" xml:base="https://www.billmongan.com/posts/2026/05/betterwebui/"><![CDATA[<p>Most large language model interfaces are designed for developers or for a general consumer audience.
Faculty who want to use an AI assistant to help with grading, research, or course preparation either accept the
limitations of a consumer chat interface or invest significant time learning to run and configure a developer-grade
setup. <a href="https://github.com/BillJr99/BetterWebUI">BetterWebUI</a> is an attempt to close that gap.
It is a local Python/FastAPI server with a pure-HTML front end that connects to an existing
<a href="https://github.com/open-webui/open-webui">OpenWebUI</a> instance and layers on the features that make an agentic
assistant genuinely useful in a higher-education context: workspaces, skills, MCP server management, CLI shortcuts,
math rendering, and a suite of integrations with sibling agentic services.</p>

<blockquote>
  <p><strong>Experimental software — use at your own risk.</strong>
BetterWebUI is a research prototype. It is not intended for, and has not been
evaluated or deemed suitable for, any particular purpose, production
use, or critical workload. No warranty is provided, express or implied.
Shell commands approved in the chat interface execute directly on the host machine;
integrated services (CLK, AutoGUI, OSScreenObserver) may take real actions on your desktop.
Run this software only in an isolated, sandboxed environment and review
every command before approving it. By using this software you accept all
associated risks.</p>

  <p>Contributions, bug reports, and ideas are very welcome — feel free to
open an issue or pull request!</p>
</blockquote>

<h2 id="the-problem">The Problem</h2>

<p>OpenWebUI is an excellent self-hosted model interface with a wide feature set, but that breadth is also the source of its
complexity. A faculty member who wants to run a grading assistant, attach a course rubric to every conversation in
that context, switch to a research assistant context, and then have the assistant read a file from the local
filesystem is navigating four separate configuration surfaces. BetterWebUI reduces that to a single workspace
dropdown and a file picker, while keeping all data local.</p>

<p>A second problem is tooling reach. An AI assistant that can only generate text is useful; one that can run a
pandoc conversion, call a GitHub API, fetch a web page, read the current screen state, or orchestrate a multi-step
research workflow is qualitatively more useful for the kinds of tasks faculty actually do. BetterWebUI provides
a unified approval-gated interface for all of these capabilities.</p>

<h2 id="architecture">Architecture</h2>

<p>BetterWebUI is organized around three layers: a FastAPI backend that manages state and proxies model requests;
a zero-build-step HTML/CSS/JS frontend that communicates with the backend over localhost REST calls; and an
integration layer that wraps three sibling agentic services behind a common <code class="language-plaintext highlighter-rouge">/api/services/*</code> routing namespace.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>betterwebui/
├── app.py              # FastAPI backend — model proxy, MCP manager, approval flow
├── static/             # HTML/CSS/JS frontend
├── skills/             # skill markdown files (loaded on demand)
├── services/           # integration clients (CLK, AutoGUI, OSSO)
└── data/               # config.json, conversations, workspaces, uploads
</code></pre></div></div>

<p>There is no external state. All persistent data — settings, conversations, workspaces, skill files, and
uploaded attachments — lives in the <code class="language-plaintext highlighter-rouge">data/</code> directory on the local machine. The only network traffic is
outbound to the configured OpenWebUI instance (which may itself be local) and, optionally, to the three
sibling services.</p>

<h3 id="frontend">Frontend</h3>

<p>The frontend is intentionally dependency-free from a build perspective. There is no npm build step, no webpack,
and no framework. Every feature is plain HTML, CSS, and vanilla JavaScript, loaded directly from the <code class="language-plaintext highlighter-rouge">static/</code>
directory. KaTeX is loaded from a CDN for math rendering. This makes the frontend trivially auditable and
deployable: <code class="language-plaintext highlighter-rouge">docker compose up</code> or <code class="language-plaintext highlighter-rouge">./start.sh</code> is sufficient, and the UI is available immediately at
<code class="language-plaintext highlighter-rouge">http://localhost:8765</code>.</p>

<h3 id="backend">Backend</h3>

<p>The FastAPI backend handles five primary concerns:</p>

<ul>
  <li><strong>Model proxying</strong>: BetterWebUI auto-detects which API path OpenWebUI exposes (<code class="language-plaintext highlighter-rouge">/api</code>, <code class="language-plaintext highlighter-rouge">/v1</code>, <code class="language-plaintext highlighter-rouge">/openai/v1</code>,
etc.) and forwards requests transparently, enriching them with the active workspace’s system prompt, skill
content, and any persistent files attached to the workspace.</li>
  <li><strong>MCP server management</strong>: MCP servers are spawned as subprocesses (via <code class="language-plaintext highlighter-rouge">npx</code> or <code class="language-plaintext highlighter-rouge">uvx</code>) and managed over
stdio. The backend tracks process health and surfaces startup errors in the UI.</li>
  <li><strong>Skill loading</strong>: When the assistant calls <code class="language-plaintext highlighter-rouge">load_skill</code>, the backend reads the requested skill markdown file
and injects its content into the conversation context.</li>
  <li><strong>CLI tool dispatch</strong>: The <code class="language-plaintext highlighter-rouge">cli_call</code> tool constructs a command from a registered template and the
assistant’s arguments, presents it for user approval, and executes it on approval.</li>
  <li><strong>Service integration</strong>: The <code class="language-plaintext highlighter-rouge">/api/services/*</code> namespace routes requests to CLK, AutoGUI, or OSSO,
enforces the per-service enable/disable state, and handles graceful degradation when a service is
unreachable.</li>
</ul>
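
<p>As a rough illustration of the CLI dispatch step, the sketch below fills a registered template with shell-quoted
arguments before the command would be shown for approval. The registry format, the <code class="language-plaintext highlighter-rouge">pandoc_convert</code> shortcut, and
the <code class="language-plaintext highlighter-rouge">build_command</code> helper are assumptions made for this example, not BetterWebUI’s actual code.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import shlex

# Hypothetical template registry; the real registry format is not shown in this post.
CLI_TEMPLATES = {
    "pandoc_convert": "pandoc {input} -o {output}",
}

def build_command(shortcut, args):
    """Fill a registered template with shell-quoted arguments."""
    template = CLI_TEMPLATES[shortcut]
    # Quote every value so spaces or shell metacharacters stay inside one argument.
    quoted = {key: shlex.quote(str(value)) for key, value in args.items()}
    return template.format(**quoted)

print(build_command("pandoc_convert", {"input": "my notes.md", "output": "notes.pdf"}))
</code></pre></div></div>

<p>Quoting at construction time means the string shown in the approval dialog is the string that would run, which
keeps the review step meaningful.</p>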

<h3 id="approval-flow">Approval Flow</h3>

<p>Every action that touches the host machine is gated. Shell commands (from <code class="language-plaintext highlighter-rouge">cli_call</code> or a raw <code class="language-plaintext highlighter-rouge">run</code> block)
display a dialog showing the exact command and the assistant’s stated reason before execution. File saves
show a filename preview. File reads open a native file picker so the assistant only ever sees what the user
explicitly selects. Side-effect calls to CLK (<code class="language-plaintext highlighter-rouge">clk_research</code>), AutoGUI (<code class="language-plaintext highlighter-rouge">autogui_task</code>), and OSSO
(<code class="language-plaintext highlighter-rouge">screen_action</code>) are similarly gated. Read-only service calls (<code class="language-plaintext highlighter-rouge">screen_windows</code>, <code class="language-plaintext highlighter-rouge">screen_description</code>,
<code class="language-plaintext highlighter-rouge">screen_screenshot</code>) run without an approval prompt.</p>

<p>Shell execution can be disabled entirely from Settings, in which case <code class="language-plaintext highlighter-rouge">cli_call</code> and raw shell blocks
return a descriptive error rather than presenting an approval dialog.</p>
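
<p>The gating rules above can be sketched as a small decision function. The tool names come from this post; the
function, its return values, and the choice to ask by default for any tool not listed as read-only are assumptions
for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Tool names are taken from the post; everything else here is hypothetical.
READ_ONLY_TOOLS = {"screen_windows", "screen_description", "screen_screenshot"}
SHELL_TOOLS = {"cli_call"}

def decide(tool_name, shell_enabled=True):
    """Return 'error', 'run', or 'ask' for a requested tool call."""
    if tool_name in SHELL_TOOLS and not shell_enabled:
        return "error"  # shell disabled in Settings: no dialog is shown
    if tool_name in READ_ONLY_TOOLS:
        return "run"    # read-only service calls skip the prompt
    # Assumption: anything else, including unknown tools, asks first.
    return "ask"

for name in ("screen_windows", "autogui_task", "cli_call"):
    print(name, decide(name))
</code></pre></div></div>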

<h2 id="workspaces">Workspaces</h2>

<p>A workspace is a saved bundle of context: a system prompt, a chosen subset of skills, a chosen subset of MCP
servers, a chosen subset of CLI shortcuts, persistent files that are attached to every new chat in that
workspace, and an optional default model. The workspace dropdown at the top of the chat interface switches
between them with a single click, reloading the entire context.</p>

<p>The workspace model reflects how faculty actually work. A “Grading” workspace has the grading-rubric skill
loaded, a PDF of the current rubric attached, and a system prompt that frames the assistant as a grading
helper. A “Research” workspace has the research-citations skill, the Fetch and Brave Search MCP servers
enabled, and a system prompt oriented toward literature review. Switching contexts does not require
re-configuring anything; the workspace carries all of that state.</p>
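
<p>One way to picture a workspace is as a plain record with one field per item in the bundle. The class and field
names below are assumptions for illustration; the on-disk format BetterWebUI actually uses is not shown in this
post.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from dataclasses import asdict, dataclass, field
from typing import Optional

@dataclass
class Workspace:
    """Hypothetical in-memory shape of a saved workspace."""
    name: str
    system_prompt: str = ""
    skills: list = field(default_factory=list)
    mcp_servers: list = field(default_factory=list)
    cli_shortcuts: list = field(default_factory=list)
    persistent_files: list = field(default_factory=list)
    default_model: Optional[str] = None  # None means "use the global default"

research = Workspace(
    name="Research",
    system_prompt="You are a literature review assistant.",
    skills=["research-citations"],
    mcp_servers=["Fetch", "Brave Search"],
)
print(asdict(research))
</code></pre></div></div>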

<h2 id="skills">Skills</h2>

<p>Skills are markdown files in the <code class="language-plaintext highlighter-rouge">skills/</code> directory, each with a YAML frontmatter header:</p>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">Grading Rubric Helper</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">When the user wants to evaluate student work against a rubric</span>
<span class="nn">---</span>

When this skill is loaded, apply the attached rubric to the student submission...
</code></pre></div></div>

<p>The assistant sees a list of available skills and their descriptions. When a user request matches a skill’s
description, the assistant calls <code class="language-plaintext highlighter-rouge">load_skill</code> to read the full instructions and follow them. Skills are
loaded on demand rather than injected into every conversation, which keeps the context window efficient.</p>

<p>Three example skills ship with the repository: a rubric helper, a citation helper, and a computer helper.
New skills can be added from the Skills sidebar or by dropping a <code class="language-plaintext highlighter-rouge">.md</code> file into the <code class="language-plaintext highlighter-rouge">skills/</code> folder.</p>
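
<p>A minimal sketch of reading such a file, assuming the frontmatter contains only simple
<code class="language-plaintext highlighter-rouge">key: value</code> lines. The <code class="language-plaintext highlighter-rouge">parse_skill</code> helper is hypothetical; a real implementation would
more likely hand the header to a YAML parser.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def parse_skill(text):
    """Split a skill file into (metadata dict, body text)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter at all
    meta = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the skill body.
            return meta, "\n".join(lines[index + 1:]).strip()
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat the whole file as body

SKILL = """---
name: Grading Rubric Helper
description: When the user wants to evaluate student work against a rubric
---

When this skill is loaded, apply the attached rubric to the student submission...
"""
meta, body = parse_skill(SKILL)
print(meta["name"])
</code></pre></div></div>

<p>Only the metadata needs to be listed to the assistant up front; the body is read when
<code class="language-plaintext highlighter-rouge">load_skill</code> is called.</p>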

<h2 id="mcp-servers">MCP Servers</h2>

<p>BetterWebUI manages MCP server subprocesses directly and presents them through a curated registry in the
Tools sidebar:</p>

<table>
  <thead>
    <tr>
      <th>Server</th>
      <th>Transport</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Filesystem</td>
      <td><code class="language-plaintext highlighter-rouge">npx</code></td>
      <td>Read/write files in a chosen directory</td>
    </tr>
    <tr>
      <td>GitHub</td>
      <td><code class="language-plaintext highlighter-rouge">npx</code></td>
      <td>Repos, issues, PRs (requires a PAT)</td>
    </tr>
    <tr>
      <td>Fetch</td>
      <td><code class="language-plaintext highlighter-rouge">uvx</code></td>
      <td>Retrieve and parse web pages</td>
    </tr>
    <tr>
      <td>Brave Search</td>
      <td><code class="language-plaintext highlighter-rouge">npx</code></td>
      <td>Web search (requires a Brave API key)</td>
    </tr>
    <tr>
      <td>Memory</td>
      <td><code class="language-plaintext highlighter-rouge">npx</code></td>
      <td>Persistent knowledge graph</td>
    </tr>
    <tr>
      <td>Git</td>
      <td><code class="language-plaintext highlighter-rouge">uvx</code></td>
      <td>Read a local git repo’s history</td>
    </tr>
    <tr>
      <td>Sequential Thinking</td>
      <td><code class="language-plaintext highlighter-rouge">npx</code></td>
      <td>Stepped reasoning</td>
    </tr>
    <tr>
      <td>Time</td>
      <td><code class="language-plaintext highlighter-rouge">uvx</code></td>
      <td>Accurate time and timezone conversion</td>
    </tr>
  </tbody>
</table>

<p>Custom entries can be registered with an arbitrary command and argument list. If a server fails to start,
the error message from the subprocess is surfaced in the server’s UI row so the prerequisite (typically
a missing <code class="language-plaintext highlighter-rouge">npx</code> or <code class="language-plaintext highlighter-rouge">uvx</code>) is immediately visible.</p>
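
<p>The startup-error behavior can be approximated with the standard library alone. The sketch below is an
assumption about how such a launcher might look, not BetterWebUI’s implementation; <code class="language-plaintext highlighter-rouge">start_server</code> and its
return shape are hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import subprocess

def start_server(command, args):
    """Spawn an MCP server over stdio; return (process, error_message)."""
    try:
        process = subprocess.Popen(
            [command, *args],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except FileNotFoundError:
        # Typical cause: npx or uvx is not installed or not on PATH.
        return None, f"'{command}' was not found on PATH; is it installed?"
    except OSError as exc:
        return None, f"could not start '{command}': {exc}"
    return process, None

process, error = start_server("no-such-launcher-for-this-example", [])
print(error)
</code></pre></div></div>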

<h2 id="service-integrations">Service Integrations</h2>

<p>BetterWebUI exposes three sibling agentic services through a unified <code class="language-plaintext highlighter-rouge">/api/services/*</code> routing namespace.
Each service can be enabled or disabled independently from Settings → Services. Disabled services return
HTTP 503 immediately. Enabled but unreachable services return a descriptive HTTP 503 rather than crashing.</p>

<h3 id="cognitiveloopkernel-clk">CognitiveLoopKernel (CLK)</h3>

<p><a href="https://github.com/BillJr99/CognitiveLoopKernel">CLK</a> is a local-first multi-agent development harness that
takes a natural-language idea and iterates it toward a working implementation through dynamic agent casting,
YAML workflow orchestration, and automatic git commits. Through BetterWebUI, the <code class="language-plaintext highlighter-rouge">/research &lt;topic&gt;</code> slash
command starts a CLK workflow, and the <code class="language-plaintext highlighter-rouge">clk_research</code> tool allows the assistant to initiate, monitor, and
retrieve artifacts from research loops. Every invocation requires a one-click approval showing the workflow
and command before anything executes.</p>

<h3 id="autogui">AutoGUI</h3>

<p>AutoGUI provides desktop GUI automation via a ReAct-style loop. The <code class="language-plaintext highlighter-rouge">/automate &lt;task&gt;</code> slash command sends a
task description to AutoGUI. The <code class="language-plaintext highlighter-rouge">autogui_task</code> tool surfaces the task for approval before execution. AutoGUI
runs in dry-run mode by default, which reports what it would do without taking real actions.</p>

<h3 id="osscreenobserver-osso">OSScreenObserver (OSSO)</h3>

<p><a href="https://github.com/BillJr99/OSScreenObserver">OSScreenObserver</a> exposes the operating system’s UI
accessibility tree, OCR text, and ASCII spatial sketches through an MCP interface. Through BetterWebUI, the
<code class="language-plaintext highlighter-rouge">/observe</code> slash command returns a description of the current screen. Read-only tools (<code class="language-plaintext highlighter-rouge">screen_windows</code>,
<code class="language-plaintext highlighter-rouge">screen_description</code>, <code class="language-plaintext highlighter-rouge">screen_screenshot</code>) run without an approval prompt; the <code class="language-plaintext highlighter-rouge">screen_action</code> tool, which
can click, type, and press keys on real OS controls, requires explicit approval.</p>

<h2 id="math-and-markdown">Math and Markdown</h2>

<p>The frontend renders the assistant’s responses as full markdown — headings, lists, tables, code blocks, and
links — and passes mathematical expressions through KaTeX for display-quality typesetting. Both inline
(<code class="language-plaintext highlighter-rouge">$...$</code>, <code class="language-plaintext highlighter-rouge">\(...\)</code>) and display (<code class="language-plaintext highlighter-rouge">$$...$$</code>, <code class="language-plaintext highlighter-rouge">\[...\]</code>) delimiters are supported. The assistant is
explicitly told in its system context that these delimiters are available, so it uses them naturally when
the conversation calls for mathematical notation.</p>

<h2 id="safety-considerations">Safety Considerations</h2>

<p>BetterWebUI’s approval flow is a usability safeguard, not a security boundary. Shell commands execute on
the host machine with the permissions of the user running the server. The approval dialog makes every
proposed command visible and requires an explicit click before execution, but it does not sandbox or
restrict what an approved command can do. Similarly, AutoGUI and OSScreenObserver can take real actions on
the desktop once approved.</p>

<p>This means BetterWebUI should be run in a controlled, sandboxed environment — a dedicated virtual machine
or container with limited access to sensitive files and services is the appropriate deployment context for
any non-trivial use. Multi-user deployment is not supported in the current prototype; the <code class="language-plaintext highlighter-rouge">data/</code>
directory is a flat, single-user store with no access controls.</p>

<p>Shell execution can be disabled entirely from Settings. When disabled, <code class="language-plaintext highlighter-rouge">cli_call</code> and any raw shell blocks
return a descriptive error rather than presenting an approval dialog, which reduces the tool surface to
model inference, skill loading, MCP tool calls, and read-only service queries.</p>

<h2 id="getting-started">Getting Started</h2>

<p>The fastest path is Docker:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker compose up
<span class="c"># open http://localhost:8765</span>
</code></pre></div></div>

<p>Or directly with Python:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># macOS/Linux</span>
./start.sh

<span class="c"># Windows</span>
start.bat
</code></pre></div></div>

<p>On first run, open Settings, paste your OpenWebUI URL and API key, click <strong>Save &amp; test</strong>, and pick a
default model. If you have CLK, AutoGUI, or OSScreenObserver running on their default ports, enable them
from Settings → Services. Everything else — workspaces, skills, MCP servers — can be configured from the
sidebar without a restart.</p>

<p>The source and full documentation are available on <a href="https://github.com/BillJr99/BetterWebUI">GitHub</a>
under the MIT License.</p>]]></content><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><category term="ai" /><category term="agentic" /><category term="technical" /><category term="software" /><summary type="html"><![CDATA[Most large language model interfaces are designed for developers or for a general consumer audience. Faculty who want to use an AI assistant to help with grading, research, or course preparation either accept the limitations of a consumer chat interface or invest significant time learning to run and configure a developer-grade setup. BetterWebUI is an attempt to close that gap. It is a local Python/FastAPI server with a pure-HTML front end that connects to an existing OpenWebUI instance and layers on the features that make an agentic assistant genuinely useful in a higher-education context: workspaces, skills, MCP server management, CLI shortcuts, math rendering, and a suite of integrations with sibling agentic services.]]></summary></entry><entry><title type="html">OSScreenObserver: Giving AI Agents Eyes and Hands on Your Desktop</title><link href="https://www.billmongan.com/posts/2026/05/os-screen-observer/" rel="alternate" type="text/html" title="OSScreenObserver: Giving AI Agents Eyes and Hands on Your Desktop" /><published>2026-05-11T00:00:00+00:00</published><updated>2026-05-11T00:00:00+00:00</updated><id>https://www.billmongan.com/posts/2026/05/osscreenobserver</id><content type="html" xml:base="https://www.billmongan.com/posts/2026/05/os-screen-observer/"><![CDATA[<p>Most AI agents, whether a
large language model assistant running locally or a cloud-hosted agentic framework, have no reliable way to see or
interact with the desktop applications running on the machine they are supposed to be helping with. They can read
files, call APIs, and run shell commands, but they cannot observe that a dialog box appeared, that a form field is
waiting for input, or that an application is in a specific state.
<a href="https://github.com/BillJr99/OSScreenObserver">OSScreenObserver</a> is a prototype that changes that. It exposes the
operating system’s UI accessibility tree, textual descriptions from multiple sources, and ASCII spatial sketches of
the current screen layout through two simultaneous interfaces: a browser-based web inspector for humans and an MCP
server for agents, so that what a human sees and what an agent sees are always consistent.</p>

<blockquote>
  <p><strong>Experimental software — use at your own risk.</strong>
This is a research prototype. It is not intended for, and has not been
evaluated or deemed suitable for, any particular purpose, production
use, or critical workload. No warranty is provided, express or implied.
By using this software you accept all associated risks. In particular,
I sandbox this software and use it only with local AI models,
both to isolate the environment and to reduce the risk of leaking
sensitive information from the desktop environment.</p>

  <p>Contributions, bug reports, and ideas are very welcome — feel free to
open an issue or pull request!</p>
</blockquote>

<h2 id="the-problem-agents-without-peripheral-vision">The Problem: Agents Without Peripheral Vision</h2>

<p>The practical limitation that motivated this project is easy to state. You are running an AI agent that is supposed to
 help you complete a task in a desktop application. The agent can reason about what to do. It can call tools. But it
cannot see the application. It cannot verify that a dialog appeared after it clicked a button. It cannot read the text
 that is currently visible on screen. It cannot detect that the application has changed state between one observation
and the next. Without that feedback loop, agentic workflows that involve desktop applications either require the human
 to narrate the screen state continuously or break silently when an unexpected dialog or error appears.</p>

<p>The two standard approaches to this problem are complementary rather than competing: taking screenshots and
reading them with a vision model, and using the operating system’s accessibility API to read the UI element tree. Screenshots
 capture everything visible but require a model call to interpret. Accessibility trees give structured, queryable
information about UI elements but miss applications that do not instrument the accessibility API. OCR bridges the gap
for applications that are neither fully instrumented nor processed by a vision model. OSScreenObserver supports all
three modalities and lets the calling agent choose which one to use, or use all three in combination.</p>

<h2 id="architecture-two-interfaces-one-observer">Architecture: Two Interfaces, One Observer</h2>

<p>The architecture follows a principle that turns out to matter a great deal in practice: any new capability must land
simultaneously on both the REST API and the MCP server, backed by shared logic. There is no divergence between what
you can do from a browser and what an agent can do over MCP.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌─────────────────────────────────────────────────────────────────────────┐
│  main.py                                                                │
│  ┌──────────────────────┐      ┌───────────────────────────────────┐    │
│  │  Flask web inspector │      │  MCP stdio server                 │    │
│  │  (background thread) │      │  (main thread, stdin/stdout)      │    │
│  └──────────┬───────────┘      └──────────────────┬────────────────┘    │
│             │                                     │                     │
│             └──────────────┬──────────────────────┘                     │
│                            ▼                                            │
│                    ScreenObserver                                       │
│                   /      │       \                                      │
│          Accessibility  ASCII    Description                            │
│             Tree      Renderer   Generator                              │
│           (observer)             (description)                          │
│                                  ┌──── accessibility (tree prose)       │
│                                  ├──── ocr (Tesseract)                  │
│                                  └──── vlm (Claude Vision)              │
└─────────────────────────────────────────────────────────────────────────┘
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">ScreenObserver</code> facade sits below both interfaces and provides three types of observation: the accessibility
element tree (structured JSON that describes every visible UI element, its role, name, value, bounds, and parent-child
 relationships), a textual description generated from the tree, from OCR, or from a vision language model, and an
ASCII spatial sketch that renders the layout of the screen as a character grid using Unicode box-drawing characters.
The sketch is more useful than it sounds: it gives an agent a token-efficient representation of the spatial
arrangement of UI elements without requiring image processing.</p>
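
<p>To make the idea of an ASCII sketch concrete, here is a small illustration that scales element bounds onto a
character grid and draws each one as a labeled box. It is a sketch under stated assumptions, not the project’s
renderer: the <code class="language-plaintext highlighter-rouge">sketch</code> function, the <code class="language-plaintext highlighter-rouge">(x, y, width, height)</code> bounds format, and the grid size are
all hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def sketch(elements, screen_w, screen_h, cols=40, rows=10):
    """Draw each element's pixel bounds as a box on a cols x rows grid."""
    grid = [[" "] * cols for _ in range(rows)]
    for name, (x, y, w, h) in elements:
        # Scale pixel bounds to grid cells, clamped to the grid edges.
        left = min(cols - 1, x * cols // screen_w)
        top = min(rows - 1, y * rows // screen_h)
        right = min(cols - 1, max(left + 1, (x + w) * cols // screen_w - 1))
        bottom = min(rows - 1, max(top + 1, (y + h) * rows // screen_h - 1))
        for c in range(left, right + 1):
            grid[top][c] = grid[bottom][c] = "─"
        for r in range(top, bottom + 1):
            grid[r][left] = grid[r][right] = "│"
        grid[top][left], grid[top][right] = "┌", "┐"
        grid[bottom][left], grid[bottom][right] = "└", "┘"
        # Write the element name into the top border, truncated to fit.
        label = name[: max(0, right - left - 1)]
        for offset, ch in enumerate(label):
            grid[top][left + 1 + offset] = ch
    return "\n".join("".join(row).rstrip() for row in grid)

print(sketch([("Editor", (0, 0, 600, 400)), ("OK", (650, 300, 150, 100))], 800, 400))
</code></pre></div></div>

<p>A grid like this costs a few hundred characters regardless of screen resolution, which is the property that
makes it attractive as model input.</p>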

<h2 id="observation-modalities">Observation Modalities</h2>

<h3 id="accessibility-tree">Accessibility Tree</h3>

<p>On Windows, the accessibility tree is populated using the UI Automation API, which gives full element-level
information for any application that instruments UIA. This includes most standard Windows applications: browsers,
Office applications, system dialogs, and many third-party tools. The tree traversal respects a configurable maximum
depth to avoid excessive latency on complex windows such as a browser with many DOM-mapped UIA nodes.</p>

<p>On macOS and Linux, window enumeration and screenshot capture are fully functional. Full accessibility tree support
requires additional platform libraries (pyobjc on macOS for the AX API, pyatspi on Linux for AT-SPI), and the adapter
stubs in <code class="language-plaintext highlighter-rouge">observer.py</code> provide the correct extension points for anyone who wants to contribute those implementations.</p>

<h3 id="ocr">OCR</h3>

<p>Tesseract-backed OCR runs on a screenshot of the target window and extracts text with per-word confidence scores.
Words below a configurable confidence threshold are discarded. OCR is the most broadly applicable modality: it works
on any application regardless of accessibility instrumentation, though it obviously captures only visible text and not
 the structural relationships between elements.</p>

<h3 id="vision-language-model-descriptions">Vision Language Model Descriptions</h3>

<p>When <code class="language-plaintext highlighter-rouge">vlm.enabled</code> is set to <code class="language-plaintext highlighter-rouge">true</code> in <code class="language-plaintext highlighter-rouge">config.json</code> and an Anthropic API key is present, the server can send a
screenshot to Claude and request a structured description of what is visible. The VLM description is the richest of
the three modalities in terms of semantic content, and the most expensive in terms of latency and token cost. It is
best reserved for situations where the accessibility tree is sparse (applications that do not instrument UIA), OCR is
insufficient (images, icons, non-text UI elements), or the agent needs a holistic interpretation of what is happening
on screen rather than a list of element properties.</p>

<h2 id="mcp-integration">MCP Integration</h2>

<p>The MCP server speaks JSON-RPC 2.0 over stdin/stdout and is compatible with Claude Desktop and Claude Code. Adding it
to the Claude Desktop configuration exposes a set of tools that an agent can call to observe and interact with the
desktop:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"mcpServers"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"os-screen-observer"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"command"</span><span class="p">:</span><span class="w"> </span><span class="s2">"python"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"args"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
        </span><span class="s2">"/absolute/path/to/screen_observer/main.py"</span><span class="p">,</span><span class="w">
        </span><span class="s2">"--mode"</span><span class="p">,</span><span class="w"> </span><span class="s2">"both"</span><span class="w">
      </span><span class="p">]</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>The tool surface covers observation and interaction:</p>

<table>
  <thead>
    <tr>
      <th>Tool</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">list_windows</code></td>
      <td>Enumerate all visible top-level windows</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">get_window_structure</code></td>
      <td>Full accessibility element tree as JSON</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">get_screen_description</code></td>
      <td>Prose description (accessibility / ocr / vlm / combined)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">get_screen_sketch</code></td>
      <td>ASCII spatial layout diagram</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">get_screenshot</code></td>
      <td>Screenshot as base64 PNG</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">get_full_screenshot</code></td>
      <td>Screenshot + ASCII sketch in one call</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">get_visible_areas</code></td>
      <td>Visible non-occluded bounding boxes for a window</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">bring_to_foreground</code></td>
      <td>Raise a window above others</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">click_at</code></td>
      <td>Click at pixel coordinates</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">type_text</code></td>
      <td>Type text into the focused element</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">press_key</code></td>
      <td>Press a key combination</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">scroll</code></td>
      <td>Scroll the mouse wheel at an optional screen position</td>
    </tr>
  </tbody>
</table>

<p>The <code class="language-plaintext highlighter-rouge">get_full_screenshot</code> tool is particularly useful for agentic workflows because it combines a screenshot with an
ASCII sketch in a single call, giving the agent both a pixel-level image and a token-efficient structural
representation without two round-trips.</p>

<h2 id="agentic-features">Agentic Features</h2>

<p>The current codebase is a working prototype. A parallel design document, <code class="language-plaintext highlighter-rouge">agentic_features_design.md</code>, specifies a
production-grade feature set organized into six implementation phases. The decisions in that document reflect what it
actually takes to build a reliable agentic observation loop rather than a demo.</p>

<h3 id="stable-window-identity">Stable window identity</h3>

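<p>The prototype uses positional window indices, which break silently when windows open, close,
or reorder between tool calls. The agentic design introduces <code class="language-plaintext highlighter-rouge">window_uid</code>, a stable opaque identifier that persists
until the window closes. On Windows it is <code class="language-plaintext highlighter-rouge">win:{pid}:{hwnd}</code>; on macOS, <code class="language-plaintext highlighter-rouge">mac:{cg_window_number}</code>; on Linux,
<code class="language-plaintext highlighter-rouge">x11:{wmctrl_id}</code>. Every tool that accepts a window index also accepts a window UID, and a stale UID returns a typed
<code class="language-plaintext highlighter-rouge">WindowGone</code> error rather than silently operating on the wrong window.</p>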
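
<p>A minimal sketch of how such identifiers might be built and resolved. The UID formats and the
<code class="language-plaintext highlighter-rouge">WindowGone</code> name come from the design described above; the helper functions and the dictionary of open
windows are assumptions for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class WindowGone(Exception):
    """Raised when a UID refers to a window that no longer exists."""

def make_uid(platform, *parts):
    """Join a platform prefix and native identifiers, e.g. win:{pid}:{hwnd}."""
    return ":".join([platform, *[str(part) for part in parts]])

def resolve(uid, open_windows):
    """Look up a UID among currently open windows (hypothetical registry)."""
    if uid not in open_windows:
        raise WindowGone(uid)  # fail loudly instead of picking another window
    return open_windows[uid]

uid = make_uid("win", 4242, 66032)
windows = {uid: {"title": "Notepad"}}
print(uid, resolve(uid, windows)["title"])
</code></pre></div></div>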

<h3 id="element-selectors">Element selectors</h3>

<p>Clicking at pixel coordinates is brittle. The agentic design specifies a selector grammar with two syntaxes, one
XPath-like and one CSS-like, that compile to the same AST and let an agent refer to elements by their role, name,
value, or ancestry path through the tree. <code class="language-plaintext highlighter-rouge">Window[name="Notepad"]/Pane/Button[name="OK"]</code> is stable across window moves
and size changes in a way that a pixel coordinate is not.</p>
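
<p>As an illustration only, the path-style form can be parsed with a small regular expression and matched against a
nested dictionary tree. This covers just the subset used in the example above (a role plus at most one quoted
attribute per step); the function names and the tree shape are assumptions, and the real grammar is defined in the
design document.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re

# One step: Role or Role[attr="value"]. Values containing "/" are not handled here.
STEP = re.compile(r'^(\w+)(?:\[(\w+)="([^"]*)"\])?$')

def parse_selector(selector):
    """Turn a path-style selector into a list of step dictionaries."""
    steps = []
    for part in selector.split("/"):
        match = STEP.match(part)
        if match is None:
            raise ValueError(f"bad selector step: {part!r}")
        role, attr, value = match.groups()
        steps.append({"role": role, "attrs": {attr: value} if attr else {}})
    return steps

def matches(node, step):
    return node["role"] == step["role"] and all(
        node.get(key) == value for key, value in step["attrs"].items()
    )

def find(node, steps):
    """Return the first node reached by following steps from node, or None."""
    if not steps or not matches(node, steps[0]):
        return None
    if len(steps) == 1:
        return node
    for child in node.get("children", []):
        found = find(child, steps[1:])
        if found is not None:
            return found
    return None

TREE = {"role": "Window", "name": "Notepad", "children": [
    {"role": "Pane", "children": [
        {"role": "Button", "name": "Cancel", "bounds": [10, 10, 80, 24]},
        {"role": "Button", "name": "OK", "bounds": [100, 10, 80, 24]},
    ]},
]}
button = find(TREE, parse_selector('Window[name="Notepad"]/Pane/Button[name="OK"]'))
print(button["bounds"])
</code></pre></div></div>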

<h3 id="actionreceipt">ActionReceipt</h3>

<p>Every input action (click, type, key press, or scroll) returns a structured receipt that includes the
before and after tree hashes, whether the tree changed as a result of the action, and whether any new dialogs
appeared. This gives an agent the feedback it needs to decide what to do next without a separate observation call.</p>
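
<p>A receipt of this kind could be assembled from two tree snapshots, as in the following sketch. The field names,
the SHA-256 hash over canonical JSON, and the <code class="language-plaintext highlighter-rouge">Dialog</code> role check are assumptions chosen for the example, not
the design document’s definitions.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import hashlib
import json
from dataclasses import dataclass

def tree_hash(tree):
    """Hash a tree via canonical JSON so equal trees give equal hashes."""
    canonical = json.dumps(tree, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def dialog_names(tree):
    """Collect the names of all nodes whose role is Dialog."""
    names, stack = set(), [tree]
    while stack:
        node = stack.pop()
        if node.get("role") == "Dialog":
            names.add(node.get("name", ""))
        stack.extend(node.get("children", []))
    return names

@dataclass
class ActionReceipt:
    action: str
    before_hash: str
    after_hash: str
    changed: bool
    new_dialogs: list

def make_receipt(action, before, after):
    before_hash, after_hash = tree_hash(before), tree_hash(after)
    new_dialogs = sorted(dialog_names(after) - dialog_names(before))
    return ActionReceipt(action, before_hash, after_hash,
                         before_hash != after_hash, new_dialogs)

before = {"role": "Window", "name": "Notepad", "children": []}
after = {"role": "Window", "name": "Notepad", "children": [
    {"role": "Dialog", "name": "Save changes?"},
]}
receipt = make_receipt("click", before, after)
print(receipt.changed, receipt.new_dialogs)
</code></pre></div></div>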

<h3 id="observe-with-diff">Observe with diff</h3>

<p>Full tree snapshots are expensive in tokens. The agentic design specifies that every tree-producing
tool returns a <code class="language-plaintext highlighter-rouge">tree_token</code>, and that passing <code class="language-plaintext highlighter-rouge">since=&lt;tree_token&gt;</code> returns only a diff against the previous observation.
The diff format is a custom structure by default, with an option to request RFC 6902 JSON Patch instead.</p>

<h3 id="wait-and-synchronize">Wait and synchronize</h3>

<p>Agents cannot reliably click a button and immediately check the result, because the result may
not be visible yet. The <code class="language-plaintext highlighter-rouge">wait_for</code> tool takes a list of conditions (element appears, element disappears, text becomes
visible, window appears, or tree changes) and polls until one of them is satisfied or a timeout is reached. The response
includes which condition matched and how many polls it took.</p>
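
<p>The polling behavior can be sketched in a few lines. This version takes named zero-argument predicates rather
than the structured conditions the design describes; that simplification, the parameter names, and the response
keys are all assumptions.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import time

def wait_for(conditions, timeout=5.0, interval=0.1):
    """Poll named predicates until one is true or the timeout elapses."""
    deadline = time.monotonic() + timeout
    polls = 0
    while True:
        polls += 1
        for name, predicate in conditions.items():
            if predicate():
                return {"matched": name, "polls": polls, "timed_out": False}
        time_left = max(0.0, deadline - time.monotonic())
        if time_left == 0.0:
            return {"matched": None, "polls": polls, "timed_out": True}
        # Never sleep past the deadline.
        time.sleep(min(interval, time_left))

# Stand-in for a real observation: the "dialog" shows up on the third poll.
state = {"checks": 0}

def dialog_appeared():
    state["checks"] += 1
    return state["checks"] == 3

result = wait_for({"element_appears": dialog_appeared}, timeout=2.0, interval=0.01)
print(result)
</code></pre></div></div>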

<h3 id="typed-error-taxonomy">Typed error taxonomy</h3>

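<p>Rather than returning a generic error string, every failure returns a structured object with a
<code class="language-plaintext highlighter-rouge">code</code>, a <code class="language-plaintext highlighter-rouge">recoverable</code> flag, and a <code class="language-plaintext highlighter-rouge">suggested_next_tool</code>. <code class="language-plaintext highlighter-rouge">ElementNotFound</code> suggests <code class="language-plaintext highlighter-rouge">find_element</code>. <code class="language-plaintext highlighter-rouge">ElementOccluded</code> suggests
<code class="language-plaintext highlighter-rouge">bring_to_foreground</code>. <code class="language-plaintext highlighter-rouge">ConfirmationRequired</code> suggests <code class="language-plaintext highlighter-rouge">propose_action</code>. An agent that branches on <code class="language-plaintext highlighter-rouge">error.code</code> can recover
from most transient failures without human intervention.</p>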
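
<p>A sketch of what such an error object and an agent-side branch might look like. The three codes and their
suggested tools are the ones named above; the <code class="language-plaintext highlighter-rouge">message</code> field, the rule that an error is recoverable when a
suggestion exists, and the <code class="language-plaintext highlighter-rouge">InternalError</code> code are assumptions for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from dataclasses import asdict, dataclass
from typing import Optional

# Code-to-tool pairs named in the post.
SUGGESTIONS = {
    "ElementNotFound": "find_element",
    "ElementOccluded": "bring_to_foreground",
    "ConfirmationRequired": "propose_action",
}

@dataclass
class ToolError:
    code: str
    message: str
    recoverable: bool
    suggested_next_tool: Optional[str]

def make_error(code, message):
    suggestion = SUGGESTIONS.get(code)
    # Assumption: an error is recoverable exactly when a next tool is known.
    return ToolError(code, message, suggestion is not None, suggestion)

def next_step(error):
    """Agent-side branch: retry via the suggested tool or hand off to a human."""
    if error.recoverable:
        return error.suggested_next_tool
    return "ask_user"

error = make_error("ElementOccluded", "Button is covered by another window")
print(asdict(error), next_step(error))
</code></pre></div></div>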

<h3 id="record-replay-and-evaluation">Record, replay, and evaluation</h3>

<p>Building an evaluation substrate for desktop agents requires the ability to record
what an agent did and replay it against a consistent environment. The design specifies a tracing format (JSONL with
per-step screenshots), a replay engine with per-tool comparison rules that know which fields are deterministic and
which are not, and a YAML scenario DSL for scripting mock environments with state-machine reactions. An agent can be
evaluated against a scripted scenario, its trace replayed in verify mode, and the divergences surfaced as structured
data.</p>

<h3 id="confirmation-tokens">Confirmation tokens</h3>

<p>Some actions are destructive. The confirmation token flow lets an agent propose an action and
receive a token bound to the target element’s position and identity. The actual action only proceeds if the agent
presents a valid, unexpired token and the element has not moved by more than a configurable pixel tolerance. This is a
 lightweight safeguard against the category of error where an agent clicks the wrong thing because the UI shifted
between the proposal and the execution.</p>
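
<p>The propose-then-confirm flow can be sketched as follows. The function names, the in-memory token store, the default tolerance and lifetime, and the single-use behavior are all assumptions made for this example. The post specifies only that the token is bound to position and identity, expires, and is checked against a pixel tolerance.</p>

```python
import math
import secrets
import time

_PENDING = {}  # token -> (element_id, x, y, expires_at); hypothetical store


def propose_action(element_id, x, y, ttl=30.0, now=None):
    """Issue a token bound to the element's identity and position."""
    now = time.monotonic() if now is None else now
    token = secrets.token_urlsafe(16)
    _PENDING[token] = (element_id, x, y, now + ttl)
    return token


def confirm_action(token, element_id, x, y, tolerance_px=8.0, now=None):
    """Return True only for a valid, unexpired token whose element has not drifted."""
    now = time.monotonic() if now is None else now
    entry = _PENDING.pop(token, None)  # assumption: a token is consumed on first use
    if entry is None:
        return False
    bound_id, bound_x, bound_y, expires_at = entry
    if now > expires_at or element_id != bound_id:
        return False
    # Euclidean drift between proposal time and execution time.
    return tolerance_px >= math.hypot(x - bound_x, y - bound_y)
```

<p>With a tolerance of 8 pixels, a button that moved by (3, 4) pixels has drifted 5 pixels and still passes, while one that moved 50 pixels is rejected and the agent has to propose again.</p>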

<h3 id="redaction">Redaction</h3>

<p>Screen content may contain sensitive information: passwords, PINs, SSNs, anything that should not appear in
 a trace or in the context window of a cloud-hosted model. The redaction system matches element names, values, OCR
text, and window titles against configurable patterns and replaces matches with a replacement string before they
appear in any tool response or trace entry. Screenshot blur is available as an opt-in for visual redaction.</p>
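
<p>A minimal sketch of pattern-based redaction over a JSON-like element tree is shown below. The two regular expressions and the <code class="language-plaintext highlighter-rouge">[REDACTED]</code> replacement string are examples only. The actual configuration format and default patterns are not described in this post.</p>

```python
import re

REPLACEMENT = "[REDACTED]"  # example replacement string

# Example patterns only; a deployment would configure its own.
PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                   # SSN-like digit groups
    re.compile(r"\bpin\s*[:=]?\s*\d{4,}", re.IGNORECASE),   # text such as "PIN: 1234"
]


def redact_text(text: str) -> str:
    """Replace every pattern match in a single string."""
    for pattern in PATTERNS:
        text = pattern.sub(REPLACEMENT, text)
    return text


def redact(node):
    """Recursively redact every string value in a JSON-like structure."""
    if isinstance(node, str):
        return redact_text(node)
    if isinstance(node, list):
        return [redact(item) for item in node]
    if isinstance(node, dict):
        return {key: redact(value) for key, value in node.items()}
    return node  # numbers, booleans, and None pass through unchanged
```

<p>Applying the same function to tool responses and trace entries before they leave the process is what keeps the two outputs consistent with each other.</p>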

<h2 id="the-web-inspector">The Web Inspector</h2>

<p>The web inspector at <code class="language-plaintext highlighter-rouge">localhost:5001</code> provides five tabs for human-facing observation.</p>

<p>The <code class="language-plaintext highlighter-rouge">STRUCTURE</code> tab renders the accessibility element hierarchy as an interactive collapsible JSON tree. This is the
fastest way to understand what elements a particular application exposes and how they are organized.</p>

<p>The <code class="language-plaintext highlighter-rouge">DESCRIPTION</code> tab shows a prose description of the selected window, with a mode selector that switches between
accessibility tree prose, OCR output, VLM output, and a combined view.</p>

<p>The <code class="language-plaintext highlighter-rouge">SKETCH</code> tab renders the ASCII spatial diagram of the window layout, the same output that the get_screen_sketch MCP
tool returns.</p>

<p>The <code class="language-plaintext highlighter-rouge">SCREENSHOT</code> tab shows the pixel screenshot alongside visible-area bounding boxes and the ASCII sketch in a single
panel.</p>

<p>The <code class="language-plaintext highlighter-rouge">ACTIONS</code> tab provides a UI for clicking at coordinates, typing text, and pressing key combinations, which is useful
 for testing input flows interactively before encoding them in an agent script.</p>

<p>The sidebar lists all visible windows. Clicking one selects it, and all tabs update to reflect the selected window.
Auto-refresh polls every three seconds.</p>

<h2 id="mock-mode-and-testing">Mock Mode and Testing</h2>

<p>A persistent problem with tools that depend on a running desktop is that they are difficult to test in CI.
OSScreenObserver includes a mock adapter that returns scripted window and tree data without requiring any OS access.
Tests run under <code class="language-plaintext highlighter-rouge">--mock</code> or via direct module imports and never need a desktop session. The CI workflow runs ruff for
linting and pytest for the test suite, with no Docker or virtual display required.</p>

<p>The agentic feature design extends mock mode with the scenario DSL, which lets you script a realistic application
environment with state-machine reactions: set a username field and the element’s value changes; click the login button
 with the right credentials and the application transitions to the welcome screen. This is the substrate on which
agent evaluation becomes reproducible rather than dependent on whatever application happens to be running.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Running OSScreenObserver against real applications quickly reveals which applications are well-instrumented for
accessibility and which are not. Standard Windows applications, system dialogs, and browsers expose rich trees.
Electron applications with custom renderers, games, and some creative tools produce sparse trees where OCR and VLM
descriptions carry most of the weight. The three-modality design is not over-engineering; it reflects the actual
distribution of desktop applications that an agent might encounter.</p>

<p>The prompt injection risk is worth naming explicitly: screen content is included verbatim in tool results, which means
 that malicious content visible on screen, such as a web page with injected text or a document with embedded instructions, could
 attempt to influence the agent’s behavior. The same trust boundaries that apply to any tool that reads external
content apply here. The redaction system is a partial mitigation, but the appropriate response to this risk depends on
 the deployment context.</p>

<p>The setup cost is low for Windows users who want full functionality: Python, Tesseract for OCR, an Anthropic API key
for VLM descriptions if desired, and the package from the repository (<a href="https://github.com/BillJr99/OSScreenObserver">https://github.com/BillJr99/OSScreenObserver</a>).
The mock mode means you can explore the API and the web inspector without any platform-specific setup at all.</p>

<p>What OSScreenObserver ultimately provides is a bridge between the textual nature of large language models and the inherently visual, event-driven world of desktop applications. Rather than forcing agents to reason about pixel coordinates or raw images, the accessibility tree, OCR output, and ASCII spatial sketches give them structured, token-efficient representations of the desktop that align naturally with how LLMs process information. An agent that can read the UI as text, act on it through typed tool calls, and receive structured feedback about what changed is operating in a modality much closer to its native one — and that alignment is what makes reliable agentic desktop interaction tractable rather than brittle.</p>]]></content><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><category term="ai" /><category term="agents" /><category term="mcp" /><category term="accessibility" /><category term="python" /><summary type="html"><![CDATA[Most AI agents, whether a large language model assistant running locally or a cloud-hosted agentic framework, have no reliable way to see or interact with the desktop applications running on the machine they are supposed to be helping with. They can read files, call APIs, and run shell commands, but they cannot observe that a dialog box appeared, that a form field is waiting for input, or that an application is in a specific state. OSScreenObserver is a prototype that changes that. 
It exposes the operating system’s UI accessibility tree, textual descriptions from multiple sources, and ASCII spatial sketches of the current screen layout through two simultaneous interfaces: a browser-based web inspector for humans and an MCP sees are always consistent.]]></summary></entry><entry><title type="html">A Private AI Knowledge Base: Obsidian, GitHub Sync, and Cross-Platform AI Context</title><link href="https://www.billmongan.com/posts/2026/05/obsidian-ai-vault/" rel="alternate" type="text/html" title="A Private AI Knowledge Base: Obsidian, GitHub Sync, and Cross-Platform AI Context" /><published>2026-05-02T00:00:00+00:00</published><updated>2026-05-02T00:00:00+00:00</updated><id>https://www.billmongan.com/posts/2026/05/obsidianai</id><content type="html" xml:base="https://www.billmongan.com/posts/2026/05/obsidian-ai-vault/"><![CDATA[<p>For the past year I have been building a knowledge management system with a specific design constraint in mind: every AI system I work with, whether a cloud-hosted assistant, a local agentic coding tool, or an automated GitHub Action, should be able to read the same authoritative description of who I am, what I am working on, and how I want to interact. More importantly, those systems should be able to write back into the knowledge base and have their work appear seamlessly in Obsidian on my local machine the next time I open the app. The proliferation of capable AI tools in 2025-2026 made both sides of this problem, reading and writing, tractable in a way they had not been before. This post documents the architecture I settled on: an Obsidian vault hosted on GitHub, synchronized via the Gitless Sync plugin, structured around three canonical files that any AI system can read and act on, and organized into a curated wiki that agents can query, extend, and maintain across platforms.</p>

<h2 id="motivation-context-as-a-first-class-artifact">Motivation: Context as a First-Class Artifact</h2>

<p>The practical problem is straightforward. Suppose you work with five or six AI tools on a regular basis: a web-based assistant with a custom instructions field, a local agentic coding CLI with a project context file, a GitHub-hosted agent that runs on repository events, and a few others. Each of these tools has its own mechanism for persistent context, and no two of them work the same way. You end up with five slightly different versions of your professional profile, your project list, and your working preferences, stored in five incompatible formats in five different places. When something changes, such as a new role, a new project, or a preference update, you update one and forget the others.</p>

<p>The deeper problem is that these tools are increasingly doing things that matter: drafting documents, running code, making commits, composing correspondence. Inconsistent context means inconsistent decisions about what to include, what to assume, and how to frame outputs. Getting this right is worth the architectural investment.</p>

<p>The solution is to store context in a Git repository, maintain it with the same discipline you would apply to any other codebase, and give every AI system a well-specified path to read from and write to it. Obsidian provides the human-readable and human-editable interface. GitHub provides the hosting, versioning, and webhook surface that agents need. The Gitless Sync plugin provides the bridge between the two, and the key insight of the design is that this bridge works in both directions: agents can push changes to GitHub and those changes will appear in Obsidian on the next sync, as long as they follow the metadata protocol described in <code class="language-plaintext highlighter-rouge">AGENTS.md</code>.</p>

<h2 id="hardware-and-local-setup">Hardware and Local Setup</h2>

<p>The vault lives on my primary workstation. Obsidian is installed and runs against a directory in the home filesystem. This placement is intentional: on machines that also run local agentic tools, the vault directory is on the same filesystem as the agent workspace, so local agents can read from or write to it without any network call by bind-mounting the directory into their container.</p>

<p>The GitHub repository that backs the vault is a standard private repository. The local Obsidian directory and the GitHub repository are not connected through standard Git tooling on the workstation. There is no <code class="language-plaintext highlighter-rouge">.git</code> directory in the vault folder, no <code class="language-plaintext highlighter-rouge">git pull</code> or <code class="language-plaintext highlighter-rouge">git push</code> in any shell script, and no SSH key configured for this purpose. Synchronization is handled entirely by the Obsidian plugin described in the next section. This is a deliberate choice: it removes any risk of a merge conflict or detached HEAD state caused by operations outside of Obsidian, and it means the sync mechanism is consistent regardless of whether I am running Obsidian on a desktop, a laptop, or the mobile app.</p>

<h2 id="github-gitless-sync-plugin-setup">GitHub Gitless Sync: Plugin Setup</h2>

<p><a href="https://github.com/silvanocerza/obsidian-github-sync">Obsidian Gitless Sync</a> is a community plugin that synchronizes an Obsidian vault directly with a GitHub repository using the GitHub REST API, without requiring Git to be installed locally and without creating a <code class="language-plaintext highlighter-rouge">.git</code> directory. Each file operation (create, modify, rename, or delete) is translated into the corresponding GitHub API call, and the plugin tracks the synchronization state of every file in a local metadata file at <code class="language-plaintext highlighter-rouge">.obsidian/github-sync-metadata.json</code>.</p>

<p>Installation follows the standard community plugin path in Obsidian Settings: disable safe mode, browse community plugins, search for Gitless Sync, install, and enable. Configuration requires four values: a GitHub Personal Access Token with <code class="language-plaintext highlighter-rouge">repo</code> scope, the repository owner, the repository name, and the branch to synchronize against.</p>

<p>After configuration, the plugin performs an initial sync that either pushes the local vault to the empty repository or pulls a pre-existing repository into the local vault. Subsequent syncs are triggered manually through the command palette or automatically on a configurable interval. The sync direction is bidirectional: local changes are pushed to GitHub, and remote changes, including those made by agents or GitHub Actions, are pulled to the local vault.</p>

<p>The operational requirement worth emphasizing is that the GitHub PAT must have <code class="language-plaintext highlighter-rouge">repo</code> scope and must not expire during active use. A token expiration silently breaks sync in a way that is not immediately obvious: Obsidian continues to operate against the local files, and the failure only becomes apparent when an agent’s changes from two days ago have not appeared.</p>

<h2 id="the-key-insight-agents-can-write-not-just-read">The Key Insight: Agents Can Write, Not Just Read</h2>

<p>A common misunderstanding about this setup is that it is read-only from the agent’s perspective, with agents merely consuming context and humans maintaining the vault. The actual design is the opposite. Agents are the primary authors of <code class="language-plaintext highlighter-rouge">/wiki/</code>. Their job is to read source material from <code class="language-plaintext highlighter-rouge">/raw/</code>, synthesize it into structured Markdown, and write the results into <code class="language-plaintext highlighter-rouge">/wiki/</code> through the GitHub REST API. Those writes propagate to Obsidian on the next sync, and the result appears in the local vault as fully navigable, cross-linked notes.</p>

<p>The mechanism that makes this work is the metadata file. Any process that creates or modifies vault files through GitHub must also update <code class="language-plaintext highlighter-rouge">.obsidian/github-sync-metadata.json</code> in the same atomic commit. Without that update, Obsidian’s sync plugin has no record of the change and will not pull it on the next sync. The entire agent write protocol is designed around maintaining this invariant. The full specification of how agents must behave is documented in <code class="language-plaintext highlighter-rouge">AGENTS.md</code>, linked and described in detail below.</p>

<h2 id="the-sha-computation-protocol">The SHA Computation Protocol</h2>

<p>The <code class="language-plaintext highlighter-rouge">sha</code> field in the sync metadata is not a plain SHA-1 of the file’s bytes. It is a Git blob SHA, which Git computes by prepending a header to the raw file content before hashing. Any agent writing to the repository and updating the metadata must implement this correctly, or the sync plugin will treat the file as modified on the next pull.</p>

<p>The algorithm is:</p>

<ol>
  <li>Read the raw bytes of the file. Use the byte length, not the character count; this distinction matters for any file containing multi-byte Unicode characters.</li>
  <li>Construct the header string <code class="language-plaintext highlighter-rouge">blob {N}\0</code>, where <code class="language-plaintext highlighter-rouge">{N}</code> is the ASCII decimal representation of the byte length and <code class="language-plaintext highlighter-rouge">\0</code> is a literal null byte (0x00).</li>
  <li>Concatenate the header bytes and the raw file bytes.</li>
  <li>Apply SHA-1 to the concatenated byte stream.</li>
  <li>Encode the result as a 40-character lowercase hexadecimal string.</li>
</ol>

<p>This is exactly what <code class="language-plaintext highlighter-rouge">git hash-object</code> produces and what the GitHub REST API returns in the <code class="language-plaintext highlighter-rouge">sha</code> field of blob responses. A common mistake is to hash the string representation of the content rather than the raw bytes, or to apply SHA-1 without the header. Both produce incorrect values that will not match what GitHub stores.</p>
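
<p>The five steps above can be sketched in Python using only the standard library. The function name <code class="language-plaintext highlighter-rouge">git_blob_sha</code> is chosen for this example. The algorithm itself is the Git blob hash described in the list.</p>

```python
import hashlib


def git_blob_sha(data: bytes) -> str:
    """Compute the Git blob SHA-1 of raw file bytes.

    `data` must be the bytes as stored on disk, not a decoded string, so
    that len(data) is the byte length rather than the character count.
    """
    # Header: the word "blob", a space, the decimal byte length, one null byte.
    header = b"blob " + str(len(data)).encode("ascii") + b"\x00"
    # SHA-1 over header plus content, as 40 lowercase hex characters.
    return hashlib.sha1(header + data).hexdigest()
```

<p>For empty input this returns <code class="language-plaintext highlighter-rouge">e69de29bb2d1d6434b8b29ae775ad8c2e48c5391</code>, the well-known hash of the empty Git blob, which makes a convenient sanity check. When hashing a file, open it in binary mode so that no newline translation or text decoding alters the bytes.</p>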

<p>In practice, there is a simpler path for most agent use cases: set <code class="language-plaintext highlighter-rouge">sha</code> to <code class="language-plaintext highlighter-rouge">null</code> and <code class="language-plaintext highlighter-rouge">dirty</code> to <code class="language-plaintext highlighter-rouge">true</code> in the metadata entry after creating or modifying a file. The sync plugin will upload the file on the next sync and replace the null with the API-confirmed SHA. The full SHA computation is only necessary when you want to pre-cache the expected value and avoid a redundant upload. Each metadata entry follows this schema:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"path"</span><span class="p">:</span><span class="w"> </span><span class="s2">"relative/path/from/vault/root/to/file.md"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"sha"</span><span class="p">:</span><span class="w"> </span><span class="s2">"&lt;40-char hex string or null&gt;"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"dirty"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w">
  </span><span class="nl">"justDownloaded"</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span><span class="w">
  </span><span class="nl">"lastModified"</span><span class="p">:</span><span class="w"> </span><span class="mi">1234567890000</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>After creating or modifying a file, set <code class="language-plaintext highlighter-rouge">dirty: true</code> and <code class="language-plaintext highlighter-rouge">lastModified</code> to <code class="language-plaintext highlighter-rouge">Date.now()</code> in milliseconds. The file and the metadata update must be committed together, atomically. Splitting them across separate commits will cause the sync plugin to mis-handle the change.</p>

<p>For deletions, the entry must include <code class="language-plaintext highlighter-rouge">deleted: true</code> and <code class="language-plaintext highlighter-rouge">deletedAt</code> alongside the path. The deleted file and the metadata update must again be committed in a single atomic operation.</p>
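
<p>As a sketch of the two kinds of entries an agent would write, the helpers below build a dirty entry and a deletion entry as plain dictionaries. The helper names are hypothetical. The example also assumes that <code class="language-plaintext highlighter-rouge">deletedAt</code> is a millisecond timestamp like <code class="language-plaintext highlighter-rouge">lastModified</code>, and it does not address how entries are arranged inside the metadata file, since neither detail is spelled out above.</p>

```python
import time


def now_ms() -> int:
    # Python counterpart of JavaScript's Date.now(): milliseconds since the epoch.
    return int(time.time() * 1000)


def mark_dirty(path: str, timestamp_ms=None) -> dict:
    """Entry for a file an agent has just created or modified."""
    return {
        "path": path,
        "sha": None,            # serialized as JSON null; the plugin fills it in
        "dirty": True,
        "justDownloaded": False,
        "lastModified": now_ms() if timestamp_ms is None else timestamp_ms,
    }


def mark_deleted(path: str, timestamp_ms=None) -> dict:
    """Entry for a file an agent has just deleted (deletedAt unit is assumed)."""
    return {
        "path": path,
        "deleted": True,
        "deletedAt": now_ms() if timestamp_ms is None else timestamp_ms,
    }
```

<p>Whichever helper is used, the resulting metadata change belongs in the same commit as the file change it describes.</p>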

<h2 id="vault-structure-the-three-zone-design">Vault Structure: The Three-Zone Design</h2>

<p>The three-zone structure closely follows the layered architecture Andrej Karpathy described in his April 2026 LLM Wiki gist (<a href="https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f">gist.github.com/karpathy/442a6bf555914893e9891c11519de94f</a>): immutable raw sources that the LLM reads but never modifies, a wiki layer of structured Markdown that the LLM writes and maintains, and a schema file (his term for what this vault calls <code class="language-plaintext highlighter-rouge">AGENTS.md</code>) that specifies conventions and workflows.</p>

<p>The vault is organized into three functional zones with strictly enforced read/write boundaries.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>vault/
├── AGENTS.md           # Agent instructions (authoritative)
├── LLMMEMORIES.md      # Persistent user context for AI systems
├── SYSTEMPROMPT.md     # Standing system prompt and interaction preferences
├── raw/                # READ-ONLY source material inbox
│   └── *.md, *.pdf     # Unprocessed documents; never modified by agents
├── wiki/               # Curated, cross-linked knowledge base
│   ├── index.md        # Hub: overview and active work
│   └── */              # Subdirectories organized by domain
└── .obsidian/
    └── github-sync-metadata.json   # Sync state (managed by plugin + agents)
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">raw/</code> directory is a one-way inbox. Documents are dropped there and never touched again by anyone, human or agent. Agents read from <code class="language-plaintext highlighter-rouge">raw/</code> and write exclusively to <code class="language-plaintext highlighter-rouge">wiki/</code>. This separation means that the curated knowledge base in <code class="language-plaintext highlighter-rouge">wiki/</code> is never contaminated by unprocessed source material, and source documents are never accidentally overwritten by agent activity.</p>

<p>The root-level canonical files (<code class="language-plaintext highlighter-rouge">AGENTS.md</code>, <code class="language-plaintext highlighter-rouge">LLMMEMORIES.md</code>, <code class="language-plaintext highlighter-rouge">SYSTEMPROMPT.md</code>) are the only files that belong at the repository root. All authored content, every wiki page, every topic note, every hub index, belongs inside <code class="language-plaintext highlighter-rouge">wiki/</code> under an appropriate subdirectory. The root is intentionally clean. If a change appears to require a new root-level file or directory, the right answer is almost certainly that it belongs inside <code class="language-plaintext highlighter-rouge">wiki/</code> instead.</p>

<h2 id="agentsmd-the-complete-agent-specification">AGENTS.md: The Complete Agent Specification</h2>

<p>The <code class="language-plaintext highlighter-rouge">AGENTS.md</code> file is the single most important document in the repository. It is the file that GitHub-hosted agents, agentic CLI tools, GitHub Actions workflows, and any other automated process reads first when it encounters this repository.</p>

<p>The file opens with a non-negotiable preamble that instructs any agent encountering the repository to stop and read the file completely before taking any action. This is the architectural guarantee that makes the system self-documenting: the repository itself carries the complete specification for how to operate on it.</p>

<h3 id="repository-role-and-core-mandate">Repository Role and Core Mandate</h3>

<p>The opening section establishes the conceptual model precisely. <code class="language-plaintext highlighter-rouge">/raw/</code> is the unprocessed source-material inbox. <code class="language-plaintext highlighter-rouge">/wiki/</code> is the curated knowledge base. The agent’s responsibility is to transform material from <code class="language-plaintext highlighter-rouge">/raw/</code> into a structured, cross-linked, maintainable Markdown knowledge base in <code class="language-plaintext highlighter-rouge">/wiki/</code>. The core mandate is stated as a numbered list: open and inspect the repository; read relevant materials in <code class="language-plaintext highlighter-rouge">/raw/</code>; build, maintain, and improve a coherent knowledge base in <code class="language-plaintext highlighter-rouge">/wiki/</code>; create, update, reorganize, and cross-link Markdown notes in <code class="language-plaintext highlighter-rouge">/wiki/</code>; commit and push changes back to the repository when changes are made; and when asked questions, answer using <code class="language-plaintext highlighter-rouge">/wiki/</code> first.</p>

<h3 id="boundary-rules-what-agents-must-never-touch">Boundary Rules: What Agents Must Never Touch</h3>

<p><code class="language-plaintext highlighter-rouge">AGENTS.md</code> specifies three protected zones that agents must not modify, with one narrow exception described below, and the specificity of these prohibitions is what makes them enforceable.</p>

<p><code class="language-plaintext highlighter-rouge">/raw/</code> is strictly read-only. Agents must never modify, delete, rename, move, or create files in <code class="language-plaintext highlighter-rouge">/raw/</code>. All synthesis, editing, organization, and authored content must go into <code class="language-plaintext highlighter-rouge">/wiki/</code>. The source inbox is preserved exactly as it was received.</p>

<p><code class="language-plaintext highlighter-rouge">.obsidian/</code> is managed exclusively by Obsidian and must not be modified by agents, with one precisely stated exception: <code class="language-plaintext highlighter-rouge">.obsidian/github-sync-metadata.json</code> must be updated whenever an agent creates, modifies, renames, or deletes any file outside of Obsidian. No other file inside <code class="language-plaintext highlighter-rouge">.obsidian/</code> may be touched under any circumstances.</p>

<p><code class="language-plaintext highlighter-rouge">/.trash/</code> is managed exclusively by Obsidian offline and must be left entirely alone. Agents must not read, modify, delete, or create files in <code class="language-plaintext highlighter-rouge">/.trash/</code>, and must ignore its contents entirely during all operations. This directory accumulates files that Obsidian has soft-deleted, and any agent interaction with it could corrupt Obsidian’s delete state.</p>

<h3 id="quick-start-workflow">Quick Start Workflow</h3>

<p>For any agent starting a session against this repository, <code class="language-plaintext highlighter-rouge">AGENTS.md</code> provides a numbered quick-start workflow that specifies exactly what to do and in what order. Open the repository; read <code class="language-plaintext highlighter-rouge">AGENTS.md</code> completely; inspect the current structure and contents of <code class="language-plaintext highlighter-rouge">/wiki/</code>; inspect relevant material in <code class="language-plaintext highlighter-rouge">/raw/</code>; decide what should be created, updated, merged, linked, or reorganized in <code class="language-plaintext highlighter-rouge">/wiki/</code>; apply changes in <code class="language-plaintext highlighter-rouge">/wiki/</code> only; commit and push the changes; and if a question was asked, answer it using the curated knowledge in <code class="language-plaintext highlighter-rouge">/wiki/</code>.</p>

<p>The sequencing is intentional. Reading the existing wiki structure before making any changes prevents agents from creating duplicate pages, overwriting useful content, or restructuring areas that are already well-organized.</p>

<h3 id="operational-rules-synthesis-not-mirroring">Operational Rules: Synthesis, Not Mirroring</h3>

<p>A critical section of <code class="language-plaintext highlighter-rouge">AGENTS.md</code> specifies how agents should treat source material. The wiki is a curated layer, not a verbatim mirror of <code class="language-plaintext highlighter-rouge">/raw/</code>. Agents are expected to synthesize, summarize, normalize, deduplicate, and organize source material rather than transcribing it. They should prefer updating existing canonical notes over creating duplicates, and should create new notes when a topic clearly deserves its own page. Existing wiki content should be preserved unless it is redundant, obsolete, inaccurate, or clearly inferior to better source material.</p>

<p>The rules also require maintaining factual nuance and uncertainty. Agents must not overstate claims found in source materials. If source materials conflict, the disagreement must be explicitly noted rather than silently resolved. If a source is incomplete, fragmented, or ambiguous, the uncertainty must be marked in the resulting wiki page rather than smoothed over.</p>

<h3 id="organization-requirements">Organization Requirements</h3>

<p><code class="language-plaintext highlighter-rouge">AGENTS.md</code> specifies that <code class="language-plaintext highlighter-rouge">/wiki/</code> must be organized into clear, intuitive, scalable topical categories expressed as high-level directories with meaningful subdirectories. The wiki must never be a flat dump of loose notes at its top level. Related notes must be grouped under shared parent directories, with finer topics nested beneath broader domains. New top-level categories should be introduced when they have earned their place, and subdirectories should be split further as a topic grows.</p>

<p>The file explicitly names the page types that should be maintained where appropriate: overview pages, topic pages, concept and reference pages, project pages, people pages, hub and index pages, and chronology or notes pages. Category and hub pages should be created when they improve navigation. Filenames should be stable and human-readable. The guiding principle is a small number of well-maintained canonical pages over many fragmented or overlapping notes.</p>

<p>Reorganization is permitted and encouraged when it improves hierarchy, clarity, or navigability, but specifically prohibited when it is cosmetic or churns structure for its own sake. The instruction is direct: do not degrade a well-organized area with unnecessary restructuring.</p>

<h3 id="linking-and-navigation">Linking and Navigation</h3>

<p><code class="language-plaintext highlighter-rouge">AGENTS.md</code> requires that agents use Obsidian wikilinks extensively and meaningfully:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[[Page Name]]
</code></pre></div></div>

<p>Every wiki note should connect related ideas across the vault. “Related” or “See also” sections should be added where useful. Large blocks of content should be linked rather than duplicated. Each page should fit meaningfully into the larger structure, and hub pages should be created when they improve discoverability.</p>

<h3 id="formatting-expectations">Formatting Expectations</h3>

<p>All authored content must be Markdown only, clearly titled, readable and well-structured, concise but sufficiently specific, and polished rather than raw. Headings should be used when they improve structure, bullet points when they improve clarity, and tables only when they are genuinely useful. Pages should begin with a short summary when appropriate. Generic filler summaries that erase important details are explicitly prohibited.</p>

<h3 id="document-ingestion-protocol">Document Ingestion Protocol</h3>

<p>When a new document appears in <code class="language-plaintext highlighter-rouge">/raw/</code> or an agent is asked to ingest a document, <code class="language-plaintext highlighter-rouge">AGENTS.md</code> specifies a complete ingestion protocol. Read the document fully before making any changes. Identify content that maps to existing wiki pages and merge or enrich those pages rather than duplicating them. Create new wiki pages when a topic is substantial enough to deserve its own page. Reorganize files and directory structures when the new content warrants it, for example by introducing a new subdirectory if a category of content has grown, or renaming pages, or moving pages between directories. Update all hub and index pages to reflect any new pages created or any structural changes made. Cross-link newly created or updated pages into related pages using Obsidian wikilinks. Update <code class="language-plaintext highlighter-rouge">.obsidian/github-sync-metadata.json</code> for every file created or modified, following the SHA computation protocol. Commit and push all changes atomically when the ingestion is complete.</p>

<h3 id="wiki-linter-a-built-in-maintenance-agent-task">Wiki Linter: A Built-In Maintenance Agent Task</h3>

<p>One of the more distinctive features of <code class="language-plaintext highlighter-rouge">AGENTS.md</code> is a detailed specification for a wiki linter, a maintenance agent task that audits and repairs the vault’s structural integrity, link validity, and metadata completeness. The linter is described as a seven-step workflow that any agent can execute.</p>

<p>Step 1 identifies the vault metadata JSON file by checking the standard locations for community plugins before proceeding. Step 2 recursively enumerates all vault files, including those in <code class="language-plaintext highlighter-rouge">.trash/</code>, building complete lists of markdown files and non-markdown assets. Step 3 scans every markdown file for internal Obsidian wikilinks and standard relative markdown links, resolves each link against the full file list using Obsidian’s resolution rules (case-insensitive match on filename without extension, shortest-path-wins for ambiguous names), and classifies each as valid, broken due to file not found, broken due to ambiguity, or broken due to a missing heading. Broken links are repaired according to a precisely specified strategy: links with no plausible near-match have their syntax removed with an inline comment; links with a near-match (edit distance of 2 or less on the base filename) have the target corrected with an inline comment; broken heading links have the fragment removed while the file link is preserved.</p>
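
<p>As a rough illustration of the Step 3 resolution and near-match rules, the sketch below shows one way an agent might classify a link target in Python. It is not the linter from <code class="language-plaintext highlighter-rouge">AGENTS.md</code>: the function names, the status labels, and the reading of “shortest-path-wins” as “fewest directory levels” are assumptions made for the example, and the ambiguity and heading cases are left out for brevity.</p>

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def base_name(path: str) -> str:
    """Lowercased filename without directories or a trailing .md extension."""
    name = path.rsplit("/", 1)[-1]
    if name.lower().endswith(".md"):
        name = name[:-3]
    return name.lower()


def resolve_link(target: str, vault_files: list[str], max_distance: int = 2):
    """Return (status, path); status is 'valid', 'near-match', or 'broken'."""
    wanted = base_name(target)
    exact = [p for p in vault_files if base_name(p) == wanted]
    if exact:
        # Assumption: "shortest path wins" means the fewest directory levels.
        return ("valid", min(exact, key=lambda p: (p.count("/"), p)))
    near = [(edit_distance(wanted, base_name(p)), p) for p in vault_files]
    near = [(d, p) for d, p in near if max_distance >= d]
    if near:
        # Closest name first, then the shallowest path as a tie-breaker.
        best = min(near, key=lambda item: (item[0], item[1].count("/"), item[1]))
        return ("near-match", best[1])
    return ("broken", None)
```

<p>A caller would then correct the target for a <code class="language-plaintext highlighter-rouge">near-match</code> result and strip the link syntax for a <code class="language-plaintext highlighter-rouge">broken</code> one, adding the inline comment in either case.</p>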

<p>Step 4 audits the metadata JSON for completeness, verifying that every markdown file has a corresponding entry with all required fields. Missing entries are synthesized using the existing schema as a template, populated from the file’s YAML frontmatter and filesystem timestamps where available. Step 5 validates the metadata JSON for well-formedness: valid JSON with no trailing commas or comments, all path keys using forward slashes, all date fields in ISO 8601 format, no raw newlines within string values, and UTF-8 encoding without BOM.</p>
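
<p>The Step 5 checks map fairly directly onto a strict JSON parse plus a few explicit tests. The following is a minimal sketch, assuming the metadata is a single JSON object keyed by file path; the function name and the <code class="language-plaintext highlighter-rouge">created</code> and <code class="language-plaintext highlighter-rouge">modified</code> field names are hypothetical placeholders, not the plugin’s actual schema.</p>

```python
import json
from datetime import datetime


def validate_metadata(raw: bytes, date_fields=("created", "modified")) -> list[str]:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 BOM")
        raw = raw[3:]
    try:
        # Strict parsing already rejects trailing commas, comments,
        # and raw newlines inside string values.
        data = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        return problems + [f"not well-formed UTF-8 JSON: {exc}"]
    if not isinstance(data, dict):
        return problems + ["top-level value is not a JSON object"]
    for path, entry in data.items():
        if "\\" in path:
            problems.append(f"{path}: path key uses backslashes")
        if not isinstance(entry, dict):
            continue
        for field in date_fields:  # hypothetical field names
            value = entry.get(field)
            if value is None:
                continue
            try:
                # fromisoformat accepts a subset of ISO 8601; map a trailing
                # "Z" to an explicit offset for older Python versions.
                datetime.fromisoformat(str(value).replace("Z", "+00:00"))
            except ValueError:
                problems.append(f"{path}: {field} is not ISO 8601")
    return problems
```

<p>Returning a list of findings rather than raising on the first failure fits the linter’s reporting step, which needs every problem collected in one pass.</p>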

<p>Step 6 updates <code class="language-plaintext highlighter-rouge">AGENTS.md</code> itself and its metadata entry to reflect the current UTC datetime, so that Obsidian Sync recognizes the file as modified and pulls the updated version on the next sync. Step 7 writes a linting report file to the repository root with a full summary of findings, broken links, metadata gaps, items requiring manual review, and files modified during the run.</p>

<p>The linter’s execution rules are notable for their conservatism: every file write is preceded by a diff against the current content, and the file is not written if the diff is empty. Ambiguous or destructive repairs, such as removing more than a link fragment, are flagged for manual review rather than applied automatically. The linter never modifies any file in <code class="language-plaintext highlighter-rouge">.obsidian/</code> except the metadata JSON, and if running in an environment without write access it produces a dry-run diff report instead of writing files.</p>
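
<p>The diff-before-write rule is simple to express in code. This sketch, assuming UTF-8 text files and a hypothetical helper name, uses Python’s standard <code class="language-plaintext highlighter-rouge">difflib</code> module to skip empty writes and to support a dry-run mode:</p>

```python
import difflib
from pathlib import Path


def write_if_changed(path: Path, new_text: str, dry_run: bool = False) -> str:
    """Write new_text only when it differs; return the unified diff ('' if none)."""
    old_text = path.read_text(encoding="utf-8") if path.exists() else ""
    diff = "".join(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{path.name}",
        tofile=f"b/{path.name}",
    ))
    # An empty diff means nothing changed, so the file is left alone.
    # In dry-run mode the diff is returned for the report but never applied.
    if diff and not dry_run:
        path.write_text(new_text, encoding="utf-8")
    return diff
```

<p>Because the function returns the diff in both modes, the same call can feed the linting report whether or not the environment allows writes.</p>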

<h3 id="question-answering-mode">Question-Answering Mode</h3>

<p><code class="language-plaintext highlighter-rouge">AGENTS.md</code> specifies how agents should behave when asked a question and told to use the vault. They must open the repository and read <code class="language-plaintext highlighter-rouge">/wiki/</code> first as the primary and authoritative curated knowledge source. <code class="language-plaintext highlighter-rouge">/raw/</code> is consulted only to fill gaps, verify details, or incorporate newly added material not yet reflected in <code class="language-plaintext highlighter-rouge">/wiki/</code>. If <code class="language-plaintext highlighter-rouge">/wiki/</code> is incomplete or outdated relative to <code class="language-plaintext highlighter-rouge">/raw/</code>, the agent should update <code class="language-plaintext highlighter-rouge">/wiki/</code> first when appropriate, before answering the question.</p>

<p>This sequencing, wiki first and raw as a fallback rather than a primary source, is what makes the curated knowledge base progressively more valuable over time. Each question that exposes a gap is an opportunity to improve the wiki before answering.</p>

<h2 id="llmmemoriesmd-cross-platform-persistent-memory">LLMMEMORIES.md: Cross-Platform Persistent Memory</h2>

<p><code class="language-plaintext highlighter-rouge">LLMMEMORIES.md</code> is the canonical record of persistent user context. It is structured as a Markdown document with clearly delineated sections covering identity and roles, academic credentials, teaching responsibilities, research background, active funded projects, collaborators, and working preferences.</p>

<p>The key design principle, stated explicitly in <code class="language-plaintext highlighter-rouge">AGENTS.md</code>, is that this file is bidirectionally synchronized. When an AI system forms a more accurate or more detailed picture of some aspect of your context through extended interaction, that updated understanding should be written back into <code class="language-plaintext highlighter-rouge">LLMMEMORIES.md</code>. When <code class="language-plaintext highlighter-rouge">LLMMEMORIES.md</code> is updated in the repository, any agent starting a new session should read the new version and treat it as authoritative. The file is, in effect, an externalized, version-controlled memory store that persists across tools, sessions, and platforms.</p>

<p>When starting any session, <code class="language-plaintext highlighter-rouge">AGENTS.md</code> instructs agents to read both <code class="language-plaintext highlighter-rouge">LLMMEMORIES.md</code> and <code class="language-plaintext highlighter-rouge">SYSTEMPROMPT.md</code> and treat them as authoritative context alongside <code class="language-plaintext highlighter-rouge">AGENTS.md</code> itself.</p>

<h2 id="systempromptmd-twelve-sections-of-standing-instructions">SYSTEMPROMPT.md: Twelve Sections of Standing Instructions</h2>

<p><code class="language-plaintext highlighter-rouge">SYSTEMPROMPT.md</code> captures the behavioral and stylistic instructions that would otherwise need to be pasted into the system prompt or custom instructions field of each tool individually. It is organized into twelve numbered sections.</p>

<p><strong>Section 1: Identity and Role Context</strong> establishes professional identity and secondary identities relevant to task framing, including domain-specific calibration instructions. The purpose is not to describe the person to themselves but to give AI tools enough context to calibrate depth, vocabulary, and framing automatically without requiring re-explanation each session.</p>

<p><strong>Section 2: Communication and Output Style</strong> covers prose and academic writing, email and correspondence, and document formatting. Academic writing uses complex and compound-complex sentences with commas, a challenges-first structure, first-person plural (“we”) for academic contexts, hedging by scope condition rather than weakened claims, precise technical vocabulary, and lists framed by prose. The prohibition on em dashes is explicit: they must be replaced with commas, subordinate clauses, or restructured sentences. Email style is short, direct, and warm, opening with “Hi [Name]!” and closing with “Bill”, committing rather than hedging, and appropriate for mild humor in familiar professional contexts.</p>

<p><strong>Section 3: Technical and Coding Preferences</strong> specifies exception handling conventions (print with a location-specific prefix string plus <code class="language-plaintext highlighter-rouge">traceback.print_exc()</code>, never silently swallow exceptions), a requirement to provide complete revised function definitions rather than partial snippets or ellipsis-truncated fragments, a preference for externalizing all configuration into JSON files with configurable logging levels, a preferred technical stack for ML and pipeline work, and instructional code conventions that interleave mathematical derivations and conceptual explanations with implementation.</p>
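
<p>To make the exception-handling and configuration conventions concrete, here is a small sketch of how they might look together. The function name, the <code class="language-plaintext highlighter-rouge">log_level</code> key, and the choice to re-raise after printing are assumptions for illustration; the section itself only requires the prefixed message, the traceback, and not swallowing the error.</p>

```python
import json
import logging
import traceback


def load_config(path: str) -> dict:
    """Load settings from a JSON file and apply the configured logging level."""
    try:
        with open(path, encoding="utf-8") as handle:
            config = json.load(handle)
    except Exception:
        # Location-specific prefix first, then the full traceback.
        print(f"[load_config] failed to read {path!r}")
        traceback.print_exc()
        raise  # assumption: surface the error to the caller after reporting it
    # Hypothetical key: "log_level" holds a standard level name such as "DEBUG".
    level_name = str(config.get("log_level", "INFO")).upper()
    # setLevel accepts level names and raises ValueError for unknown ones,
    # so a typo in the config fails loudly rather than silently.
    logging.getLogger().setLevel(level_name)
    return config
```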

<p><strong>Section 4: Task Execution Behavior</strong> covers three sub-protocols. Before starting any task, an agent must read any <code class="language-plaintext highlighter-rouge">context/</code> or <code class="language-plaintext highlighter-rouge">ABOUT-ME/</code> directory, any <code class="language-plaintext highlighter-rouge">SYSTEM-RULES.md</code> or <code class="language-plaintext highlighter-rouge">HOW-I-WORK.md</code> file, and any project-specific subfolder or template that applies to the task. The clarification protocol prohibits beginning execution if the goal, intended audience, output format, or scope is ambiguous in ways that would materially affect the output. Output discipline requires stating assumptions explicitly and flagging uncertainty inline rather than omitting it.</p>

<p><strong>Section 5: Domain-Specific Standing Instructions</strong> covers machine learning and AI (maintaining precision with probabilistic claims and applying existing pipeline architecture by default), computer science education and SoTL (grounding pedagogical recommendations in literature and noting evidentiary basis), grant and sponsored research writing (active voice, outcome-oriented framing, explicit distinctions between completed and proposed work), aviation and drone operations (applying FAA regulatory context from 14 CFR Parts 61, 91, and 107 as appropriate), and amateur radio (applying ITU and FCC Part 97 context, distinguishing between amateur and emergency communication contexts).</p>

<p><strong>Section 6: Memory and Context Hygiene</strong> addresses the stateless nature of many AI sessions. Workspace folder context files are the authoritative persistent memory, not inferred prior session context. If a task produces information that should persist across sessions, the instruction is to suggest saving it to the appropriate context file and offer to do so.</p>

<p><strong>Section 7: General Constraints</strong> includes an absolute prohibition on hallucinating citations, a requirement to prefer depth and accuracy over speed, and a constraint against producing outputs that could be mistaken for official institutional communication without explicit authorization.</p>

<p><strong>Section 8: Confirmation Gates</strong> is the section with the most direct operational consequence for agentic use. It defines six categories of action that require explicit affirmative confirmation before execution, and specifies precisely what information must be displayed when requesting that confirmation. A general instruction to “go ahead and handle everything” explicitly does not constitute confirmation for any of these categories.</p>

<p>Gate 8.1 covers destructive or significant file system alterations: deleting any file or directory, overwriting any existing file (with a requirement to propose a versioned backup first), renaming or moving more than three files in bulk, modifying any file outside the active workspace folder, or clearing a database. Gate 8.2 covers external communications: sending any email or message via any connected service. Drafting is permitted without confirmation; sending is not. Gate 8.3 covers version control actions: committing to any repository, pushing to any branch including feature branches, creating or deleting or merging branches, tagging a release, or force-pushing (the last of which is additionally flagged as high-risk). Gate 8.4 covers web and cloud publishing: deploying web content, updating live pages or API endpoints, uploading to publicly accessible cloud storage, submitting forms or proposals, or modifying DNS or SSL configuration. Gate 8.5 covers financial, administrative, and credentialed actions: financial transactions, institutional system access, grant portal submissions, or anything generating a binding commitment on the user’s behalf. Gate 8.6 establishes a threshold for batch operations: any automated operation affecting more than five files, records, or API calls in a single execution requires confirmation before the batch runs, with a display of the first two or three operations so the pattern can be verified.</p>
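
<p>One way to picture the gates is as a lookup that an agent harness consults before acting. The sketch below is illustrative only: the action labels, the <code class="language-plaintext highlighter-rouge">Action</code> type, and the mapping are hypothetical, and only the numeric thresholds (more than three files for bulk moves, more than five items for batches) come from the gate descriptions above.</p>

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    FILE_SYSTEM = "8.1"
    EXTERNAL_COMMS = "8.2"
    VERSION_CONTROL = "8.3"
    PUBLISHING = "8.4"
    CREDENTIALED = "8.5"
    BATCH = "8.6"


BULK_MOVE_LIMIT = 3  # Gate 8.1: renaming or moving more than three files
BATCH_LIMIT = 5      # Gate 8.6: more than five files, records, or API calls


@dataclass
class Action:
    kind: str            # hypothetical label, e.g. "delete", "move", "send_email"
    item_count: int = 1  # how many files, records, or calls the action touches


# Hypothetical mapping from action labels to the gate they always trigger.
ALWAYS_GATED = {
    "delete": Gate.FILE_SYSTEM,
    "overwrite": Gate.FILE_SYSTEM,
    "send_email": Gate.EXTERNAL_COMMS,
    "commit": Gate.VERSION_CONTROL,
    "push": Gate.VERSION_CONTROL,
    "deploy": Gate.PUBLISHING,
    "payment": Gate.CREDENTIALED,
}


def gates_triggered(action: Action) -> list[Gate]:
    """Return every gate that must be confirmed before the action runs."""
    gates = []
    if action.kind in ALWAYS_GATED:
        gates.append(ALWAYS_GATED[action.kind])
    if action.kind == "move" and action.item_count > BULK_MOVE_LIMIT:
        gates.append(Gate.FILE_SYSTEM)
    if action.item_count > BATCH_LIMIT:
        gates.append(Gate.BATCH)
    return gates
```

<p>An empty result means the action may proceed, which is why drafting an email passes while sending one does not.</p>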

<p><strong>Section 9: Plan-First Protocol</strong> requires that for any task involving more than three discrete steps, or touching more than one confirmation gate category, a written execution plan must be presented before any action is taken. The plan must include a numbered list of all steps in sequence, the tool or method and expected output for each step, an explicit list of any confirmation gates that will be triggered and at which step, and an explicit request for approval before Step 1 executes.</p>
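
<p>The trigger condition and the required plan contents can be sketched in a few lines. The <code class="language-plaintext highlighter-rouge">Step</code> type, the function names, and the output wording here are hypothetical; only the thresholds and the list of required plan elements are taken from the section.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    description: str
    method: str                 # tool or method used for the step
    expected_output: str
    gate: Optional[str] = None  # e.g. "8.3" when the step triggers a gate


def requires_plan(steps: list[Step]) -> bool:
    """True for more than three steps or more than one gate category."""
    gate_categories = {step.gate for step in steps if step.gate}
    return len(steps) > 3 or len(gate_categories) > 1


def render_plan(steps: list[Step]) -> str:
    """Numbered steps, the gates they trigger, and an explicit approval request."""
    lines = [
        f"{n}. {s.description} (via {s.method}; expect {s.expected_output})"
        for n, s in enumerate(steps, start=1)
    ]
    gated = [f"step {n}: gate {s.gate}" for n, s in enumerate(steps, start=1) if s.gate]
    lines.append("Confirmation gates: " + (", ".join(gated) if gated else "none"))
    lines.append("Approve this plan before Step 1 executes? (yes/no)")
    return "\n".join(lines)
```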

<p><strong>Section 10: Sensitive Data Handling</strong> establishes FERPA-based handling requirements for student data (no transmission, caching, or summarization of student records outside the local workspace without explicit authorization), IRB-based handling requirements for human subjects research data (flag any task appearing to involve such data and request confirmation of the applicable IRB protocol before proceeding), and a general prohibition on logging or outputting API keys, passwords, OAuth tokens, or unpublished grant narratives.</p>

<p><strong>Section 11: Task Logging and Audit Trail</strong> requires appending a session log to <code class="language-plaintext highlighter-rouge">logs/session_log.md</code> at the conclusion of any session in which files were created, modified, sent, or published. The log entry must include the date and time, a one-paragraph plain-English summary of what was accomplished, a list of files created or modified with their paths, a list of any external actions taken with confirmation that each was explicitly authorized, and any open items or follow-up tasks identified during the session.</p>
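
<p>A log entry with those five elements could be appended by a helper along these lines. This is a sketch under assumptions: the Markdown heading layout and the function names are invented for the example, and only the log path and the required fields come from the section.</p>

```python
from datetime import datetime, timezone
from pathlib import Path


def _bullets(items):
    """Render a Markdown bullet list, or a single 'none' bullet when empty."""
    return "\n".join(f"- {item}" for item in items) or "- none"


def append_session_log(log_path: Path, summary: str, files, external_actions,
                       open_items, now=None) -> str:
    """Append one session entry to the log and return the text that was written."""
    now = now or datetime.now(timezone.utc)
    entry = "\n".join([
        f"## Session {now.isoformat(timespec='seconds')}",
        "",
        summary,
        "",
        "### Files created or modified",
        _bullets(files),
        "",
        "### External actions (each explicitly authorized)",
        _bullets(external_actions),
        "",
        "### Open items",
        _bullets(open_items),
        "",
        "",
    ])
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier sessions intact, preserving the audit trail.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```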

<p><strong>Section 12: Escalation and Uncertainty Protocol</strong> is the most operationally important safety instruction for autonomous use. If at any point during execution an agent encounters a situation not covered by the instructions, a conflict between two instructions, an unexpected error or permission failure or ambiguous system state, or a result that does not match the expected output, the instruction is to stop execution immediately, report the situation clearly, and wait for guidance. Autonomous recovery from unexpected states by taking additional actions is explicitly prohibited. The protocol frames this as a cost-benefit argument: the cost of pausing is always lower than the cost of an unintended irreversible action.</p>

<h2 id="connecting-ai-systems-to-the-vault">Connecting AI Systems to the Vault</h2>

<p>The practical integration pattern differs by tool type, but the underlying idea is the same in every case: give the tool a path to <code class="language-plaintext highlighter-rouge">AGENTS.md</code>, <code class="language-plaintext highlighter-rouge">LLMMEMORIES.md</code>, and <code class="language-plaintext highlighter-rouge">SYSTEMPROMPT.md</code>, and let those files do the context-setting work.</p>

<p>For <strong>agentic CLI tools</strong> like Claude Code and OpenCode, the integration is a <code class="language-plaintext highlighter-rouge">CLAUDE.md</code> or <code class="language-plaintext highlighter-rouge">AGENTS.md</code> file at the root of each project that begins with a fetch-and-read instruction pointing at the vault repository:</p>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gu">## Context</span>
 
Before beginning any task, fetch and read the following files from the
knowledge repository:
<span class="p"> 
-</span> https://raw.githubusercontent.com/YOUR_GITHUB_USERNAME/Obsidian-Vault/main/AGENTS.md
<span class="p">-</span> https://raw.githubusercontent.com/YOUR_GITHUB_USERNAME/Obsidian-Vault/main/LLMMEMORIES.md
<span class="p">-</span> https://raw.githubusercontent.com/YOUR_GITHUB_USERNAME/Obsidian-Vault/main/SYSTEMPROMPT.md
 
Treat the contents of those files as authoritative context for all decisions
in this session.
</code></pre></div></div>

<p>For <strong>GitHub Actions workflows</strong>, the vault repository is checked out as a secondary input in the workflow definition:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Checkout knowledge vault</span>
  <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v4</span>
  <span class="na">with</span><span class="pi">:</span>
    <span class="na">repository</span><span class="pi">:</span> <span class="s">YOUR_GITHUB_USERNAME/Obsidian-Vault</span>
    <span class="na">token</span><span class="pi">:</span> <span class="s">${{ secrets.VAULT_TOKEN }}</span>  <span class="c1"># hypothetical secret name; substitute your own</span>
    <span class="na">path</span><span class="pi">:</span> <span class="s">vault</span>
 
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Build context</span>
  <span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
    <span class="s">cat vault/AGENTS.md vault/LLMMEMORIES.md vault/SYSTEMPROMPT.md &gt; /tmp/context.md</span>
</code></pre></div></div>

<p>For <strong>web-based assistants</strong> with a custom instructions field, I maintain a condensed version of <code class="language-plaintext highlighter-rouge">SYSTEMPROMPT.md</code> in that field, with a reference to the full repository for agents that can fetch URLs. The condensed version covers the highest-impact sections: identity, writing style, confirmation gates, and the location of the canonical files.</p>

<p>For <strong>local containerized agents</strong>, the vault directory is bind-mounted directly into the container’s filesystem:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run <span class="nt">--rm</span> <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/obsidian-vault:/vault:ro"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/projects/current:/workspace"</span> <span class="se">\</span>
  my-agent:latest
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">:ro</code> flag applies when the local agent is consuming context rather than writing wiki content. For agents whose primary job is to write to <code class="language-plaintext highlighter-rouge">/wiki/</code>, the mount is read-write, and the agent commits its changes through the GitHub REST API so that the metadata protocol is satisfied correctly. Because the standing instructions spell out how to update the GitHub Gitless Sync plugin’s metadata JSON file, committing through the REST API is optional: agents can also modify the repository directly and have their changes synchronize back to Obsidian. In this way, I really only use Obsidian as a convenient viewer.</p>

<h2 id="a-concrete-example-your-cv-as-a-raw-source">A Concrete Example: Your CV as a Raw Source</h2>

<p>The most direct way to see how the write path works is to walk through the simplest possible workflow: drop a document into <code class="language-plaintext highlighter-rouge">raw/</code>, point an agent at the repository, and watch it process the document into <code class="language-plaintext highlighter-rouge">wiki/</code>.</p>

<p>Suppose you drop a PDF CV into <code class="language-plaintext highlighter-rouge">raw/</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>raw/
└── my-cv.pdf
</code></pre></div></div>

<p>You then give an agentic tool the following instruction, which is all it needs because the rest is specified in <code class="language-plaintext highlighter-rouge">AGENTS.md</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Clone https://github.com/YOUR_GITHUB_USERNAME/Obsidian-Vault, read AGENTS.md,
and follow the instructions to process any unprocessed documents in raw/.
</code></pre></div></div>

<p>The agent reads <code class="language-plaintext highlighter-rouge">AGENTS.md</code>, which tells it that <code class="language-plaintext highlighter-rouge">raw/</code> is a read-only inbox, that processed content belongs in <code class="language-plaintext highlighter-rouge">wiki/</code>, and that it must update <code class="language-plaintext highlighter-rouge">.obsidian/github-sync-metadata.json</code> for every file it creates. It then reads the CV, extracts the relevant structured information, and builds out the wiki accordingly, creating pages for education, positions held, publications, funded projects, skills, and service roles, cross-linking them to each other and to an <code class="language-plaintext highlighter-rouge">index.md</code> hub. It commits the new wiki files and the updated metadata back to the repository in a single atomic commit.</p>

<p>The next time you open Obsidian and trigger a sync, those wiki pages appear in your vault as fully navigable, cross-linked notes. The CV has been processed exactly once, its information now lives in a structured and queryable form, and the original PDF is preserved untouched in <code class="language-plaintext highlighter-rouge">raw/</code>.</p>

<p>The same workflow applies to any source material: a conference paper, a project brief, a set of meeting notes exported from another application, a transcript, a published technical document. Whatever arrives in <code class="language-plaintext highlighter-rouge">raw/</code> becomes an input to the next agent run, which extends the wiki, merging new material into existing pages rather than duplicating them, and leaves the originals in <code class="language-plaintext highlighter-rouge">raw/</code> untouched. Over time, the wiki grows into a comprehensive, cross-linked knowledge base built from actual documents, maintained by agents that follow the same explicit instructions every time.</p>

<h2 id="the-wiki-as-a-queryable-knowledge-base">The Wiki as a Queryable Knowledge Base</h2>

<p>The <code class="language-plaintext highlighter-rouge">wiki/</code> directory is the output of the system, the structured knowledge base that gets built and maintained over time. Its organization should reflect the categories of information it needs to represent. <code class="language-plaintext highlighter-rouge">AGENTS.md</code> specifies that the wiki must be organized into clear, intuitive, scalable topical categories with high-level directories and meaningful subdirectories, never left as a flat dump of loose notes at its top level.</p>

<p>The <code class="language-plaintext highlighter-rouge">wiki/index.md</code> file serves as the hub for the entire knowledge base. When an agent is asked a question and instructed to use the vault, <code class="language-plaintext highlighter-rouge">AGENTS.md</code> specifies that it starts at <code class="language-plaintext highlighter-rouge">wiki/index.md</code>, identifies the relevant area, navigates to the appropriate subdirectory, reads the relevant files, and synthesizes an answer, updating the wiki first if it is incomplete or outdated relative to available source material in <code class="language-plaintext highlighter-rouge">raw/</code>.</p>

<p>This sequence, hub to subdirectory to relevant pages to synthesis, is the same workflow a human would follow when using Obsidian, which makes the agent behavior predictable and auditable. The wiki is structured for both human navigation and machine traversal, because those two requirements are compatible when the underlying format is plain Markdown with explicit cross-links.</p>

<h2 id="the-raw-inbox-and-reference-cataloging">The raw/ Inbox and Reference Cataloging</h2>

<p>The <code class="language-plaintext highlighter-rouge">raw/</code> directory handles a problem that anyone who works across multiple document formats encounters regularly: you receive a PDF, or an exported summary, or a transcript, and you need to incorporate the information into your knowledge base without losing the original source. The <code class="language-plaintext highlighter-rouge">raw/</code> inbox holds originals in whatever format they arrive, and they are never modified.</p>

<p>The non-destructive contract of <code class="language-plaintext highlighter-rouge">raw/</code> means that the inbox can grow without risk. A new document dropped into <code class="language-plaintext highlighter-rouge">raw/</code> will be picked up on the next agent run and incorporated into the wiki, while the original is preserved for future reference or re-processing. When an agent creates a wiki note from a source document, it links back to the original in <code class="language-plaintext highlighter-rouge">raw/</code> using a relative wikilink:</p>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="ge">*Source: [[../raw/my-cv.pdf]]*</span>
</code></pre></div></div>

<p>Obsidian renders this as a clickable link that opens the original document; an agent resolves it as a vault-relative path. This citation chain means that every wiki page can be traced back to its source, and the source can be re-processed with a better agent or a different prompt without losing either the original document or the previously generated wiki content.</p>

<h2 id="synchronization-in-practice">Synchronization in Practice</h2>

<p>The full synchronization cycle, from a local Obsidian edit to an agent seeing the change, involves three steps: edit in Obsidian, trigger a sync via the Gitless Sync plugin, and have the agent pull from the GitHub API or clone the repository fresh. The reverse cycle, from an agent commit to Obsidian seeing the change, is: agent writes to GitHub via the REST API, updating metadata correctly in the same atomic commit; Obsidian pulls on the next sync; and the new or modified file appears in the vault.</p>

<p>The metadata protocol is what makes the reverse cycle work reliably. An agent that creates files without updating the metadata will find that those files appear to Obsidian as if they were created outside the sync system and may not be pulled on the next sync. An agent that updates metadata with incorrect SHA values will trigger a spurious re-download of files that have not changed. The atomic commit requirement, where the file change and the metadata update land in the same commit, prevents a state where one exists without the other.</p>

<h2 id="practical-takeaways">Practical Takeaways</h2>

<p>Running this system for the past year has produced a clear picture of what works and what requires ongoing attention. The three-file root structure (<code class="language-plaintext highlighter-rouge">AGENTS.md</code>, <code class="language-plaintext highlighter-rouge">LLMMEMORIES.md</code>, <code class="language-plaintext highlighter-rouge">SYSTEMPROMPT.md</code>) has proven to be the right level of granularity: enough separation to allow selective reading, not so much fragmentation that agents need to synthesize across many files to get a complete picture. The <code class="language-plaintext highlighter-rouge">raw/</code> read-only constraint has been invaluable; on two occasions, agents attempted to modify source documents, and the explicit prohibition in <code class="language-plaintext highlighter-rouge">AGENTS.md</code> was what stopped them. The bidirectional memory update convention for <code class="language-plaintext highlighter-rouge">LLMMEMORIES.md</code> is theoretically correct but requires discipline in practice: it is easy to accept a session’s refined understanding and forget to write it back to the file.</p>

<p>The confirmation gate protocol in <code class="language-plaintext highlighter-rouge">SYSTEMPROMPT.md</code> has proven its value most sharply in agentic contexts. Having the protocol specified in a canonical file rather than in per-tool configuration means that a GitHub Actions workflow, a local container running Claude Code, and a web-based assistant all inherit the same safety constraints automatically. The cost of reading the file is negligible; the cost of not having the protocol is an agent that commits, pushes, or deploys without checking first.</p>

<p>The setup cost is low. The vault structure requires perhaps two hours of initial authoring: writing <code class="language-plaintext highlighter-rouge">AGENTS.md</code>, writing an initial <code class="language-plaintext highlighter-rouge">LLMMEMORIES.md</code> from whatever context you already maintain elsewhere, and writing an initial <code class="language-plaintext highlighter-rouge">SYSTEMPROMPT.md</code> from the instructions you already paste into various tools. The return, in terms of consistent and well-calibrated AI assistance across tools and sessions, scales with the quality of those three files. The vault has become the single source of truth for the context that every AI tool I use draws from, and the discipline of maintaining it in one place is considerably less costly than the alternative of maintaining five inconsistent versions of the same information in five incompatible formats.</p>

<hr />]]></content><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><category term="ai" /><category term="obsidian" /><category term="knowledge-management" /><category term="agents" /><category term="markdown" /><summary type="html"><![CDATA[For the past year I have been building a knowledge management system with a specific design constraint in mind: every AI system I work with, whether a cloud-hosted assistant, a local agentic coding tool, or an automated GitHub Action, should be able to read the same authoritative description of who I am, what I am working on, and how I want to interact. More importantly, those systems should be able to write back into the knowledge base and have their work appear seamlessly in Obsidian on my local machine the next time I open the app. The proliferation of capable AI tools in 2025-2026 made both sides of this problem, reading and writing, tractable in a way they had not been before. This post documents the architecture I settled on: an Obsidian vault hosted on GitHub, synchronized via the Gitless Sync plugin, structured around three canonical files that any AI system can read and act on, and organized into a curated wiki that agents can query, extend, and maintain across platforms.]]></summary></entry><entry><title type="html">Cognitive Loop Kernel: A Local-First Multi-Agent Development Harness</title><link href="https://www.billmongan.com/posts/2026/05/cognitivelloopkernel/" rel="alternate" type="text/html" title="Cognitive Loop Kernel: A Local-First Multi-Agent Development Harness" /><published>2026-05-01T00:00:00+00:00</published><updated>2026-05-01T00:00:00+00:00</updated><id>https://www.billmongan.com/posts/2026/05/cognitiveloopkernel</id><content type="html" xml:base="https://www.billmongan.com/posts/2026/05/cognitivelloopkernel/"><![CDATA[<p>The emergence of capable code-generation models has prompted a wave of experiments in autonomous software development, where LLM agents plan, 
implement, test, and revise code with minimal human intervention. Most of these systems, however, rely on cloud orchestration services, opaque runtimes, or monolithic agent designs that make it difficult to inspect, customize, or extend the underlying behavior. The <a href="https://github.com/BillJr99/CognitiveLoopKernel">Cognitive Loop Kernel (CLK)</a> is a local-first, multi-agent development harness that attempts to address these constraints directly. You give it an idea, and a dynamically assembled team of agents iterates that idea into a working system through repeated agentic development cycles, all under your local filesystem.</p>

<p>This article provides a detailed walkthrough of CLK’s architecture, its principal subsystems, and the design decisions that distinguish it from similar tools.</p>

<blockquote>
  <p><strong>Experimental software — use at your own risk.</strong>
CLK is a research prototype. It is not intended for, and has not been
evaluated or deemed suitable for, any particular purpose, production
use, or critical workload. No warranty is provided, express or implied.
By using this software you accept all associated risks.</p>

  <p>Contributions, bug reports, and ideas are very welcome — feel free to
open an issue or pull request!</p>
</blockquote>

<h2 id="why-a-local-first-harness">Why a Local-First Harness?</h2>

<p>The core motivation for CLK is reproducibility and ownership. Everything the harness produces, including agent prompts, provider configurations, per-run logs, intermediate file states, and git history, lives under a single <code class="language-plaintext highlighter-rouge">.clk/</code> directory inside a project folder on your local disk. There is no external orchestration service, no account required, and no state stored elsewhere. If you want to inspect what a particular agent said in iteration four of last Tuesday’s engineering loop, you look at <code class="language-plaintext highlighter-rouge">.clk/runs/</code>. If you want to understand why the chief cast a <code class="language-plaintext highlighter-rouge">security_auditor</code> role, you read <code class="language-plaintext highlighter-rouge">.clk/state/casting.log</code>. The harness treats transparency as a first-class property of the system.</p>

<p>A second motivation is provider agnosticism. CLK supports Claude Code, OpenAI Codex, Google Gemini, any OpenAI-compatible HTTP server (via OpenWebUI), local Ollama models, Pi, and a built-in shell dummy provider for testing and CI. Switching providers requires a single configuration change, and different agents within the same workflow can be bound to different providers, so you can route heavy reasoning tasks to a cloud model while keeping low-stakes validation steps on a local Ollama instance.</p>

<h2 id="high-level-architecture">High-Level Architecture</h2>

<p>CLK is organized around three principal concerns: state management, agent orchestration, and provider abstraction. The package layout reflects this separation.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>clk_harness/
  cli.py                 # argparse entrypoint and command dispatch
  config.py              # paths, default configs, JSON load/save helpers
  git_ops.py             # init, commit, revert, and status wrappers
  providers/             # claude, codex, gemini, ollama, openwebui, pi, shell
  orchestration/         # agent runner, workflow runner, ralph/autoresearch loops
  templates/             # bundled prompts and default workflow YAML
  utils/                 # structured logging and activity tracking
</code></pre></div></div>

<p>The harness state, written by <code class="language-plaintext highlighter-rouge">clk init</code> and extended by every subsequent command, is isolated under <code class="language-plaintext highlighter-rouge">.clk/</code> in the project root:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>.clk/
  config/
    clk.config.json      # project-wide configuration
    providers.json       # provider registry and active selection
    agents.json          # agent-to-prompt-and-provider mapping
    workflows/*.yaml     # Archon-style workflow definitions
  prompts/               # per-agent system prompt templates
  state/
    idea.json            # the captured project idea
    system_brief.md      # initial brief
    prd.json             # product manager output
    progress.md          # human-readable development timeline
    decisions.md         # log of architectural decisions
    experiments.jsonl    # per-iteration outcome records
    agent_memory.jsonl   # all agent invocations with token usage
    casting.log          # JSONL of every roster change
    done.md              # written when completion criteria are met
  blackboard/            # cross-agent shared scratchpad (POST blocks)
  logs/                  # session and per-command logs
  runs/                  # per-invocation prompt and response capture
  backups/               # safety copies of files before mutation
</code></pre></div></div>

<p>This layout means that deleting <code class="language-plaintext highlighter-rouge">.clk/</code> resets the harness entirely without touching anything the agents actually built. The project source, tests, documentation, and configuration live in the project root proper and are managed by an ordinary git repository.</p>

<h2 id="dynamic-agent-casting">Dynamic Agent Casting</h2>

<p>The most distinctive architectural decision in CLK is dynamic team assembly. Rather than shipping a fixed roster of roles, the harness ships three baseline agents that cannot be removed, and then lets the <code class="language-plaintext highlighter-rouge">chief</code> agent invent and register project-specific specialists on the fly.</p>

<p>The three baseline agents are <code class="language-plaintext highlighter-rouge">chief</code>, which decomposes objectives, casts the team, and authors workflow YAML; <code class="language-plaintext highlighter-rouge">ralph</code>, which drives both the iterative refinement loop and Karpathy-style autoresearch cycles; and <code class="language-plaintext highlighter-rouge">qa</code>, which validates stage outputs. Every other agent, including <code class="language-plaintext highlighter-rouge">engineer</code>, is a dynamic role that the chief creates by emitting a structured <code class="language-plaintext highlighter-rouge">PROPOSE_ROLE</code> block in its response text:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PROPOSE_ROLE: data_steward
ROLE: ensure data integrity and schema versioning
PROVIDER: claude
PROMPT:
You are the **Data Steward** agent.
Objective: $objective
State: $state_summary
...
END_ROLE
</code></pre></div></div>

<p>The harness parses this block, writes the generated prompt to <code class="language-plaintext highlighter-rouge">.clk/prompts/data_steward.md</code>, registers the role in <code class="language-plaintext highlighter-rouge">.clk/config/agents.json</code>, and makes it available to subsequent stages in the same workflow cycle, all without pausing execution. Every roster decision is appended as a JSONL entry to <code class="language-plaintext highlighter-rouge">.clk/state/casting.log</code>, giving you a complete provenance record of how the team evolved over the project’s lifetime.</p>
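<p>To make the parsing step concrete, here is a minimal sketch of how a <code class="language-plaintext highlighter-rouge">PROPOSE_ROLE</code> block could be pulled out of response text. The function name <code class="language-plaintext highlighter-rouge">parse_propose_role</code> and the dictionary keys are hypothetical, and the field order is assumed from the example above; CLK's actual parser may differ.</p>

```python
def parse_propose_role(text):
    """Return a list of role dicts found in an agent response (sketch)."""
    roles, current, in_prompt = [], None, False
    for line in text.splitlines():
        if line.startswith("PROPOSE_ROLE:"):
            current = {"name": line.split(":", 1)[1].strip(), "prompt": []}
            in_prompt = False
        elif current is None:
            continue  # text outside any block is ignored
        elif line.strip() == "END_ROLE":
            current["prompt"] = "\n".join(current["prompt"]).strip()
            roles.append(current)
            current, in_prompt = None, False
        elif in_prompt:
            current["prompt"].append(line)  # prompt body is kept verbatim
        elif line.startswith("PROMPT:"):
            in_prompt = True
        elif line.startswith("ROLE:"):
            current["role"] = line.split(":", 1)[1].strip()
        elif line.startswith("PROVIDER:"):
            current["provider"] = line.split(":", 1)[1].strip()
    return roles  # an unterminated block is dropped, not guessed at
```

<p>In this sketch a block only takes effect once <code class="language-plaintext highlighter-rouge">END_ROLE</code> is seen, so a truncated response cannot register a half-specified role.</p>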

<p>The chief can also propose custom workflows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PROPOSE_WORKFLOW: engineering
YAML:
name: engineering
stages:
  - id: decompose
    agent: chief
    objective: Decompose the current top-level objective.
  - id: implement
    agent: data_steward
    depends_on: [decompose]
    validation: "pytest -q"
    commit: true
END_WORKFLOW
</code></pre></div></div>

<p>The harness validates the YAML (via PyYAML when it is available, or via a built-in mini-parser in environments where <code class="language-plaintext highlighter-rouge">ensurepip</code> is unavailable), writes the file to <code class="language-plaintext highlighter-rouge">.clk/config/workflows/</code>, and routes the next <code class="language-plaintext highlighter-rouge">clk run</code> invocation through the new stage graph. As a result, the project workflow is not hardcoded into the harness; it is a first-class artifact that the chief authors and evolves as it learns more about the project’s requirements.</p>
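<p>The text does not say which checks the validation performs, so the following is only a sketch of the kind of structural checks a harness might run on an already-parsed workflow. The function <code class="language-plaintext highlighter-rouge">validate_workflow</code> and its specific rules (named workflow, unique stage IDs, known agents, resolvable <code class="language-plaintext highlighter-rouge">depends_on</code> entries) are assumptions, not CLK's documented behavior.</p>

```python
def validate_workflow(workflow, known_agents):
    """Return a list of problems found in a parsed workflow dict (sketch)."""
    problems = []
    stages = workflow.get("stages") or []
    if not workflow.get("name"):
        problems.append("workflow has no name")
    ids = [stage.get("id") for stage in stages]
    for stage_id in sorted({i for i in ids if ids.count(i) > 1}, key=str):
        problems.append(f"duplicate stage id: {stage_id}")
    for stage in stages:
        if stage.get("agent") not in known_agents:
            problems.append(f"{stage.get('id')}: unknown agent {stage.get('agent')}")
        for dep in stage.get("depends_on", []):
            if dep not in ids:
                problems.append(f"{stage.get('id')}: unknown dependency {dep}")
    return problems  # an empty list means the graph looks usable
```

<p>Returning a list of problems instead of raising on the first one would let the harness feed every issue back to the chief in a single pass.</p>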

<p>The name <code class="language-plaintext highlighter-rouge">engineer</code> is reserved and protected. The harness rejects attempts to create aliases such as <code class="language-plaintext highlighter-rouge">engineering</code>, <code class="language-plaintext highlighter-rouge">coder</code>, or <code class="language-plaintext highlighter-rouge">developer</code>, and feeds the rejection back into the chief’s context as casting feedback so it learns to use the canonical name directly. A cap from <code class="language-plaintext highlighter-rouge">clk.config.json::casting.max_dynamic_roles</code> (default 12) prevents unbounded role proliferation.</p>
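<p>A guard of this kind can be sketched in a few lines. The alias set below contains only the three examples named above, and the function name, return shape, and feedback wording are hypothetical; the real harness may recognize more aliases or phrase its feedback differently.</p>

```python
BASELINE = {"chief", "ralph", "qa"}
ENGINEER_ALIASES = {"engineering", "coder", "developer"}  # examples from the text only

def check_new_role(name, roster, max_dynamic_roles=12):
    """Return (accepted, feedback) for a proposed dynamic role (sketch)."""
    name = name.strip().lower()
    if name in ENGINEER_ALIASES:
        return False, f"'{name}' is an alias; use the canonical name 'engineer'"
    dynamic = [role for role in roster if role not in BASELINE]
    # Re-proposing an existing role does not count against the cap.
    if name not in dynamic and len(dynamic) >= max_dynamic_roles:
        return False, f"role cap of {max_dynamic_roles} reached"
    return True, ""
```

<p>The feedback string is what would be fed back into the chief's context as casting feedback.</p>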

<h2 id="the-action-protocol">The Action Protocol</h2>

<p>Agents drive real changes by emitting structured <code class="language-plaintext highlighter-rouge">ACTION:</code> blocks. A response that merely describes what should happen but emits no ACTION blocks has no side effects, which is a deliberate design choice: the harness makes it impossible for an agent to accidentally mutate the project simply by describing a plan.</p>

<p>The supported action types are:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">ACTION: write</code> creates or overwrites a file at the specified path.</li>
  <li><code class="language-plaintext highlighter-rouge">ACTION: edit</code> applies a targeted textual replacement within an existing file, analogous to a very simple diff application.</li>
  <li><code class="language-plaintext highlighter-rouge">ACTION: append</code> adds content to the end of an existing file.</li>
  <li><code class="language-plaintext highlighter-rouge">ACTION: delete</code> removes a file.</li>
  <li><code class="language-plaintext highlighter-rouge">ACTION: run</code> executes a shell command from the project root with output captured to the log.</li>
  <li><code class="language-plaintext highlighter-rouge">ACTION: done</code> writes <code class="language-plaintext highlighter-rouge">.clk/state/done.md</code> and signals all loops to terminate.</li>
</ul>

<p>All paths are validated against the project root before execution. Any path that resolves into <code class="language-plaintext highlighter-rouge">.clk/</code> is rejected outright (with the single exception of <code class="language-plaintext highlighter-rouge">.clk/blackboard/</code>, which agents may write to via POST blocks). Attempted path traversal outside the project root is similarly rejected. Before any file is mutated, the original is backed up to <code class="language-plaintext highlighter-rouge">.clk/backups/&lt;run_id&gt;/</code>, so every overwrite is recoverable. A per-response cap of 25 file actions prevents runaway agents from restructuring the entire project in a single cycle.</p>
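<p>The path rules for ACTION blocks can be illustrated with a short sketch built on <code class="language-plaintext highlighter-rouge">pathlib</code>. The function name <code class="language-plaintext highlighter-rouge">is_allowed_target</code> is hypothetical, and this version covers only ACTION paths, so it rejects everything under <code class="language-plaintext highlighter-rouge">.clk/</code>, including the blackboard, which is reached through POST blocks instead.</p>

```python
from pathlib import Path

def is_allowed_target(project_root, candidate):
    """True if an ACTION path stays inside the project and outside .clk/ (sketch)."""
    root = Path(project_root).resolve()
    # Resolving first collapses ".." segments and follows existing symlinks.
    target = (root / candidate).resolve()
    if root not in target.parents:
        return False  # outside the project root, or the root itself
    return target.relative_to(root).parts[0] != ".clk"
```

<p>Resolving before comparing matters here: a path such as <code class="language-plaintext highlighter-rouge">src/../.clk/config/agents.json</code> looks harmless as a string but lands inside the protected directory.</p>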

<p>After each batch of actions is applied successfully, the harness creates a structured git commit whose body records the agent name, objective, files changed, commands run, and token totals. The git log thus becomes a faithful narrative of the project’s development, authored by the agents themselves.</p>

<h2 id="blackboard-cross-agent-communication">Blackboard: Cross-Agent Communication</h2>

<p>Agents in a multi-stage workflow often need to share findings without routing everything through the chief as a relay. CLK provides a structured shared scratchpad called the blackboard, which lives at <code class="language-plaintext highlighter-rouge">.clk/blackboard/</code> as a collection of JSON files.</p>

<p>Rather than writing to the blackboard directly via ACTION blocks (which the harness would reject as <code class="language-plaintext highlighter-rouge">.clk/</code> access), agents emit POST blocks:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST: finding
TITLE: Database schema analysis complete
PRODUCES: schema_contract
BODY:
Three tables identified: users, sessions, events.
The sessions table lacks a foreign key constraint.
Recommend adding: ALTER TABLE sessions ADD FOREIGN KEY ...
END_POST
</code></pre></div></div>

<p>The harness parses this block, adds metadata (timestamp, author, stage ID, workflow name), and writes the post as a JSON file under <code class="language-plaintext highlighter-rouge">.clk/blackboard/</code>. When the next stage runs, the workflow runner injects a filtered digest of relevant posts into each agent’s prompt via the <code class="language-plaintext highlighter-rouge">$blackboard_digest</code> placeholder. Stage definitions can declare <code class="language-plaintext highlighter-rouge">inputs</code> to specify which posts they want to consume and <code class="language-plaintext highlighter-rouge">outputs</code> to specify which contract keys they are expected to produce, enabling the workflow runner to verify that inter-agent contracts are satisfied before committing a stage.</p>
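<p>As a rough illustration of that step, the sketch below turns one POST block into a JSON-ready dictionary. The function name <code class="language-plaintext highlighter-rouge">parse_post</code> and the key names (<code class="language-plaintext highlighter-rouge">kind</code>, <code class="language-plaintext highlighter-rouge">author</code>, <code class="language-plaintext highlighter-rouge">stage_id</code>, and so on) are guesses; the on-disk schema CLK actually uses is not described here.</p>

```python
from datetime import datetime, timezone

def parse_post(block, author, stage_id, workflow):
    """Turn one POST ... END_POST block into a JSON-ready dict (sketch)."""
    post, body, in_body = {}, [], False
    for line in block.strip().splitlines():
        if line.strip() == "END_POST":
            break
        if in_body:
            body.append(line)              # body lines are kept verbatim
        elif line.startswith("BODY:"):
            in_body = True
        elif ":" in line:
            key, _, value = line.partition(":")
            post[key.strip().lower()] = value.strip()
    post["kind"] = post.pop("post", "")   # "POST: finding" becomes kind="finding"
    post["body"] = "\n".join(body)
    # Metadata the harness is described as adding; key names are assumptions.
    post.update(author=author, stage_id=stage_id, workflow=workflow,
                timestamp=datetime.now(timezone.utc).isoformat())
    return post
```

<p>Every value in the result is a plain string, so the dictionary can be written out with <code class="language-plaintext highlighter-rouge">json.dump</code> without further conversion.</p>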

<p>Posts are immutable. To revise a finding, an agent writes a new post with the original post ID listed in the <code class="language-plaintext highlighter-rouge">CONSUMES</code> field, creating an auditable revision chain rather than silent overwriting.</p>

<h2 id="workflow-execution-and-dependency-resolution">Workflow Execution and Dependency Resolution</h2>

<p>Workflows are YAML files describing a directed acyclic graph of stages, each bound to an agent and an objective:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">name</span><span class="pi">:</span> <span class="s">engineering</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">Single development cycle.</span>
<span class="na">stages</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">id</span><span class="pi">:</span> <span class="s">decompose</span>
    <span class="na">agent</span><span class="pi">:</span> <span class="s">chief</span>
    <span class="na">objective</span><span class="pi">:</span> <span class="s">Decompose the current top-level objective.</span>
  <span class="pi">-</span> <span class="na">id</span><span class="pi">:</span> <span class="s">implement</span>
    <span class="na">agent</span><span class="pi">:</span> <span class="s">engineer</span>
    <span class="na">objective</span><span class="pi">:</span> <span class="s">Implement the smallest vertical slice.</span>
    <span class="na">depends_on</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">decompose</span><span class="pi">]</span>
    <span class="na">validation</span><span class="pi">:</span> <span class="s2">"</span><span class="s">pytest</span><span class="nv"> </span><span class="s">-q"</span>
    <span class="na">commit</span><span class="pi">:</span> <span class="no">true</span>
  <span class="pi">-</span> <span class="na">id</span><span class="pi">:</span> <span class="s">validate</span>
    <span class="na">agent</span><span class="pi">:</span> <span class="s">qa</span>
    <span class="na">objective</span><span class="pi">:</span> <span class="s">Audit implementation and confirm test passage.</span>
    <span class="na">depends_on</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">implement</span><span class="pi">]</span>
</code></pre></div></div>

<p>The workflow runner (<code class="language-plaintext highlighter-rouge">orchestration/workflow.py</code>) topologically sorts stages by their <code class="language-plaintext highlighter-rouge">depends_on</code> declarations and uses a <code class="language-plaintext highlighter-rouge">ThreadPoolExecutor</code> to run in parallel any stages that do not depend on one another. Each stage may declare a <code class="language-plaintext highlighter-rouge">validation</code> shell command; the command must exit with status 0 before the harness will commit the stage’s changes. Failed validations leave the working tree untouched. Because of this parallel execution, a large workflow with independent research and implementation tracks can run those tracks concurrently rather than serially.</p>
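<p>The scheduling idea can be sketched with the standard library's <code class="language-plaintext highlighter-rouge">graphlib.TopologicalSorter</code> (Python 3.9+). This is not CLK's implementation: the function <code class="language-plaintext highlighter-rouge">run_workflow</code> and the <code class="language-plaintext highlighter-rouge">run_stage</code> callback are hypothetical, and the sketch runs stages in waves, waiting for a whole wave to finish before starting the next, which is simpler than starting each stage the moment its own dependencies complete.</p>

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+

def run_workflow(stages, run_stage, max_workers=4):
    """Run stages in dependency order; ready stages run concurrently (sketch)."""
    graph = {s["id"]: set(s.get("depends_on", [])) for s in stages}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())   # stages whose dependencies are done
            list(pool.map(run_stage, ready))     # run the wave, re-raise any error
            sorter.done(*ready)
            waves.append(ready)
    return waves
```

<p>With a hypothetical workflow in which <code class="language-plaintext highlighter-rouge">research</code> and <code class="language-plaintext highlighter-rouge">implement</code> both depend only on <code class="language-plaintext highlighter-rouge">decompose</code>, the two land in the same wave and run on separate threads.</p>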

<p>The default <code class="language-plaintext highlighter-rouge">engineering.yaml</code> workflow ends with a <code class="language-plaintext highlighter-rouge">supervise</code> stage, where the chief evaluates whether the user’s original prompt has been fully addressed. The chief either emits an <code class="language-plaintext highlighter-rouge">ACTION: done</code> block (writing <code class="language-plaintext highlighter-rouge">done.md</code> and terminating the loop) or emits a <code class="language-plaintext highlighter-rouge">PROPOSE_WORKFLOW</code> block describing the next iteration’s stages, at which point the runner picks them up and executes another cycle. This supervisor loop is capped at <code class="language-plaintext highlighter-rouge">clk.config.json::supervise.max_cycles</code> (default 5) to prevent runaway iteration.</p>

<h2 id="self-healing-on-unmet-dependencies">Self-Healing on Unmet Dependencies</h2>

<p>When a workflow stage’s declared dependencies fail, the harness does not silently skip the stage or crash. Instead, it dispatches the <code class="language-plaintext highlighter-rouge">chief</code> agent in recovery mode, providing the exact failure reasons (agent error text, validation command output, QA report) and asking the chief to either re-cast the workflow, emit ACTION blocks that fix the upstream failure directly, or propose a specialist agent that can resolve the issue. This recovery dispatch is capped at three passes per stage (configurable via <code class="language-plaintext highlighter-rouge">clk.config.json::recovery.max_per_stage</code>) to prevent infinite recovery loops. The design treats agent and validation failures as information that the system can reason about, rather than hard stops requiring human intervention.</p>

<h2 id="iterative-improvement-loops">Iterative Improvement Loops</h2>

<p>CLK ships two distinct iterative loop modes, both driven through the <code class="language-plaintext highlighter-rouge">ralph</code> agent.</p>

<h3 id="the-ralph-refinement-loop">The Ralph Refinement Loop</h3>

<p>The Ralph loop (<code class="language-plaintext highlighter-rouge">/loop ralph N</code> in the TUI, or <code class="language-plaintext highlighter-rouge">clk loop --max-iterations N</code> on the CLI) implements a simple but effective iterative improvement cycle. In each iteration, Ralph reads the current state files and git log, identifies one measurable improvement, and produces a plan. The engineer implements the plan, QA validates it, and the harness runs any configured validation commands. If all checks pass and the working tree has changed, the harness commits the iteration with full metadata. If validation fails, <code class="language-plaintext highlighter-rouge">git reset --hard</code> reverts the working tree to the pre-iteration HEAD, and the outcome is recorded in <code class="language-plaintext highlighter-rouge">experiments.jsonl</code> so subsequent iterations can learn from the failure without contaminating the committed project state. The loop terminates when <code class="language-plaintext highlighter-rouge">done.md</code> is created or <code class="language-plaintext highlighter-rouge">max_iterations</code> is reached.</p>
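<p>The control flow of that cycle is compact enough to sketch. Everything below is illustrative: <code class="language-plaintext highlighter-rouge">ralph_loop</code> and its injected callables are hypothetical stand-ins for the agent runs, validation commands, and git operations, so the sketch shows the commit-or-revert logic without touching a real repository.</p>

```python
def ralph_loop(max_iterations, run_iteration, validate, commit, revert, is_done, record):
    """Commit-or-revert refinement loop (sketch); all callables are injected."""
    completed = 0
    for i in range(1, max_iterations + 1):
        if is_done():                 # e.g. .clk/state/done.md exists
            break
        changed = run_iteration(i)    # plan, implement, QA; True if files changed
        passed = validate()           # configured validation commands
        if passed and changed:
            commit(i)
        elif not passed:
            revert()                  # e.g. git reset --hard to the pre-iteration HEAD
        # Record the outcome either way so later iterations can learn from it.
        record({"iteration": i, "passed": passed, "changed": changed})
        completed += 1
    return completed
```

<p>Recording the outcome after the revert keeps the experiment log outside the state that gets rolled back, which is consistent with the log living under <code class="language-plaintext highlighter-rouge">.clk/</code>.</p>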

<h3 id="the-autoresearch-loop">The Autoresearch Loop</h3>

<p>The autoresearch loop (<code class="language-plaintext highlighter-rouge">/loop autoresearch N</code>) implements a Karpathy-style research cycle oriented toward hypothesis generation and experimental learning rather than feature delivery. In each iteration, Ralph surveys the current state to identify the highest-value open question (recorded in the project’s <code class="language-plaintext highlighter-rouge">decisions.md</code> or blackboard), designs a small targeted experiment to address it, runs the experiment via ACTION blocks, and records the outcome in <code class="language-plaintext highlighter-rouge">experiments.jsonl</code> regardless of whether the experiment succeeded or failed. The philosophy is that failed experiments are useful data, and the loop accumulates a structured research log that informs subsequent iterations. Both loop modes respect the <code class="language-plaintext highlighter-rouge">done.md</code> termination condition and can be stopped gracefully via the TUI’s <code class="language-plaintext highlighter-rouge">/stop</code> command, which signals the loop to halt after the current iteration completes rather than aborting mid-iteration.</p>

<h2 id="the-provider-system">The Provider System</h2>

<p>All agent invocations flow through a provider abstraction defined in <code class="language-plaintext highlighter-rouge">providers/base.py</code>. Each provider implements <code class="language-plaintext highlighter-rouge">invoke(request: AgentRequest) -&gt; AgentResponse</code>, where the request carries the rendered prompt and metadata, and the response carries the text output, token counts, and a success flag. The harness ships seven provider implementations.</p>

<table>
  <thead>
    <tr>
      <th>Provider</th>
      <th>Availability</th>
      <th>Notes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">shell</code></td>
      <td>Always available</td>
      <td>Dummy provider; echoes prompts and writes stub files. Useful for tests and dry runs.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">claude</code></td>
      <td><code class="language-plaintext highlighter-rouge">claude</code> on PATH</td>
      <td>Runs <code class="language-plaintext highlighter-rouge">claude --print</code> non-interactively; supports both CLI auth and direct API key auth via the Anthropic Messages endpoint.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">codex</code></td>
      <td><code class="language-plaintext highlighter-rouge">codex</code> on PATH</td>
      <td>Runs <code class="language-plaintext highlighter-rouge">codex exec</code>.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">gemini</code></td>
      <td><code class="language-plaintext highlighter-rouge">gemini</code> on PATH</td>
      <td>Runs the Google Gemini CLI; prompt fed on stdin.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">pi</code></td>
      <td><code class="language-plaintext highlighter-rouge">pi</code> on PATH or <code class="language-plaintext highlighter-rouge">.clk/tools/pi/</code></td>
      <td>Extensible terminal harness.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">ollama</code></td>
      <td>HTTP endpoint</td>
      <td>Local LLM via HTTP; no API key required.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">openwebui</code></td>
      <td>HTTP endpoint</td>
      <td>Any OpenAI-compatible server; configure <code class="language-plaintext highlighter-rouge">endpoint</code>, <code class="language-plaintext highlighter-rouge">api_key</code>, and <code class="language-plaintext highlighter-rouge">model</code> in <code class="language-plaintext highlighter-rouge">providers.json</code>.</td>
    </tr>
  </tbody>
</table>
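<p>The shape of the provider interface can be sketched as follows. Only the type names <code class="language-plaintext highlighter-rouge">AgentRequest</code> and <code class="language-plaintext highlighter-rouge">AgentResponse</code> and the <code class="language-plaintext highlighter-rouge">invoke</code> signature come from the description above; the field names, the <code class="language-plaintext highlighter-rouge">EchoProvider</code> class, and its word-count stand-in for token counting are assumptions made for illustration.</p>

```python
from dataclasses import dataclass, field

@dataclass
class AgentRequest:
    prompt: str                                   # the rendered prompt
    metadata: dict = field(default_factory=dict)  # e.g. agent name, stage id

@dataclass
class AgentResponse:
    text: str
    input_tokens: int = 0
    output_tokens: int = 0
    success: bool = True

class EchoProvider:
    """Stand-in in the spirit of a dummy provider; not CLK's shell provider."""
    name = "echo"

    def invoke(self, request: AgentRequest) -> AgentResponse:
        words = len(request.prompt.split())  # crude proxy, not a real tokenizer
        return AgentResponse(text=f"[echo] {request.prompt}",
                             input_tokens=words, output_tokens=words + 1)
```

<p>Because every provider answers the same call with the same response type, the orchestration code never needs to know whether a reply came from a subprocess or an HTTP request.</p>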

<p>For the CLI-driven providers (<code class="language-plaintext highlighter-rouge">claude</code>, <code class="language-plaintext highlighter-rouge">codex</code>, <code class="language-plaintext highlighter-rouge">gemini</code>), CLK supports two authentication modes. The <code class="language-plaintext highlighter-rouge">cli</code> mode (default) spawns the provider’s local CLI as a subprocess and trusts whatever session authentication that CLI already has. The <code class="language-plaintext highlighter-rouge">apikey</code> mode bypasses the local CLI entirely and calls the upstream HTTP API directly, using standard environment variables (<code class="language-plaintext highlighter-rouge">ANTHROPIC_API_KEY</code>, <code class="language-plaintext highlighter-rouge">OPENAI_API_KEY</code>, <code class="language-plaintext highlighter-rouge">GEMINI_API_KEY</code>). This separation makes CLK usable in both developer workstation contexts (where a persistent CLI session is already authenticated) and CI/CD environments (where API keys are injected as secrets). Per-agent provider binding in <code class="language-plaintext highlighter-rouge">agents.json</code> means you can route different agents to different providers within the same workflow cycle, which is useful when you want cloud reasoning for the chief and local generation for lower-stakes implementation stages.</p>
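<p>A sketch of how the two modes might be resolved is shown below. The pairing of each provider with its environment variable follows the order in which they are listed above; the function name <code class="language-plaintext highlighter-rouge">resolve_auth</code> and its return shape are hypothetical.</p>

```python
import os

# Assumed pairing, following the order in which the text lists them.
API_KEY_VARS = {
    "claude": "ANTHROPIC_API_KEY",
    "codex": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}

def resolve_auth(provider, auth_mode="cli", env=None):
    """Decide how a CLI-driven provider should authenticate (sketch)."""
    env = os.environ if env is None else env
    if auth_mode == "cli":
        return {"mode": "cli", "api_key": None}   # trust the CLI's own session
    if auth_mode != "apikey":
        raise ValueError(f"unknown auth mode: {auth_mode}")
    var = API_KEY_VARS.get(provider)
    if var is None:
        raise ValueError(f"no API key variable known for provider: {provider}")
    key = env.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set")   # fail early, before any request
    return {"mode": "apikey", "api_key": key}
```

<p>Failing before the first request is the useful property in CI: a missing secret shows up as one clear error instead of a failed agent run partway through a workflow.</p>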

<h2 id="the-tui-dashboard">The TUI Dashboard</h2>

<p>For interactive use, CLK ships a terminal user interface (TUI, implemented in <code class="language-plaintext highlighter-rouge">clk_harness/tui.py</code> using curses) that provides live visibility into the multi-agent execution. The TUI displays a card for each active agent showing its current state (idle, working, done, or failed), a scrollable status log that mirrors the session log file, a live heartbeat indicating whether a running agent subprocess is actively streaming or silent, and a command input field styled after Claude Code’s <code class="language-plaintext highlighter-rouge">&gt;</code> prompt. A bottom band reports running totals: agent count, total tokens consumed (input and output separately), peak concurrent runs, and files written.</p>

<p>The TUI also supports interactive steering via typed commands:</p>

<table>
  <thead>
    <tr>
      <th>Command</th>
      <th>Effect</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">free text</code></td>
      <td>First message becomes the project idea and triggers casting + <code class="language-plaintext highlighter-rouge">engineering</code>; subsequent messages append context and re-cast + re-run.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/idea &lt;text&gt;</code></td>
      <td>Replace the captured idea.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/cast</code></td>
      <td>Force a fresh chief casting pass against current state.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/roles list</code></td>
      <td>Print the current roster.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/roles add NAME "description"</code></td>
      <td>Add a dynamic role.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/roles drop NAME</code></td>
      <td>Remove a dynamic role (baseline agents cannot be removed).</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/run [workflow]</code></td>
      <td>Execute a single workflow cycle (default: <code class="language-plaintext highlighter-rouge">engineering</code>).</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/loop ralph N</code></td>
      <td>Start N iterations of the Ralph refinement loop.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/loop autoresearch N</code></td>
      <td>Start N iterations of the autoresearch loop.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/stop</code></td>
      <td>Request the active loop to halt after the current iteration.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/abort</code></td>
      <td>SIGTERM any running CLI subprocess.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/provider &lt;name&gt;</code></td>
      <td>Switch the active provider.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/quit</code></td>
      <td>Exit the TUI.</td>
    </tr>
  </tbody>
</table>

<p>The heartbeat mechanism deserves specific mention. CLI providers stream their subprocess’s stdout and stderr live to the status pane, so every line the CLI prints (authentication status messages, connection attempts, retry notices) appears within milliseconds. The heartbeat fires approximately every 15 seconds while an agent is working and distinguishes between an agent that is processing slowly and one that is genuinely hung. If a subprocess has been silent for more than two minutes, the TUI suggests typing <code class="language-plaintext highlighter-rouge">/abort</code>. This real-time observability is a significant quality-of-life improvement over systems that present a spinner with no indication of underlying activity.</p>
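<p>The silent-versus-slow distinction comes down to tracking when the subprocess last printed a line. The sketch below is one way to express that check; <code class="language-plaintext highlighter-rouge">heartbeat_status</code> is a hypothetical helper, the message wording is invented, and only the two-minute threshold comes from the description above.</p>

```python
def heartbeat_status(now, last_output_at, silent_limit=120.0):
    """Classify a working agent from the time of its last output line (sketch)."""
    silent_for = now - last_output_at   # both values in seconds, same clock
    if silent_for > silent_limit:
        return f"silent for {int(silent_for)}s; consider typing /abort"
    return f"working; last output {int(silent_for)}s ago"
```

<p>A caller would typically pass <code class="language-plaintext highlighter-rouge">time.monotonic()</code> readings so that wall-clock adjustments cannot make an agent look hung.</p>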

<h2 id="getting-started">Getting Started</h2>

<p>The fastest path is the kickoff script, which copies the harness into a fresh timestamped directory, initializes a git repository, and launches the TUI:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Optional: set defaults non-interactively by copying .env.example to .env</span>
./kickoff.sh <span class="s2">"A local-first journaling app that summarizes my week"</span>

<span class="c"># Or omit the prompt and type your idea into the TUI:</span>
./kickoff.sh
</code></pre></div></div>

<p>The kickoff directory is intentionally isolated: <code class="language-plaintext highlighter-rouge">.clk/</code> holds all harness state, the project tree receives the agents’ output, and the original CLK source directory is never modified.</p>

<p>If you prefer the command-line interface without the TUI:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./scripts/install_local.sh           <span class="c"># creates .clk/venv and installs PyYAML</span>
./scripts/clk init
./scripts/clk idea <span class="s2">"A local-first journaling app that summarizes my week"</span>
./scripts/clk plan
./scripts/clk run
./scripts/clk loop <span class="nt">--max-iterations</span> 10
</code></pre></div></div>

<p>Setting <code class="language-plaintext highlighter-rouge">CLK_NO_TUI=true</code> in your environment makes <code class="language-plaintext highlighter-rouge">kickoff.sh</code> fall back to the non-interactive pipeline automatically, which is the appropriate mode for CI/CD use.</p>

<h2 id="docker-deployment">Docker Deployment</h2>

<p>CLK ships a <code class="language-plaintext highlighter-rouge">Dockerfile</code> for containerized use. The default mode is the interactive TUI, so you launch with <code class="language-plaintext highlighter-rouge">-it</code>. Kickoff directories are created under <code class="language-plaintext highlighter-rouge">workspace/</code> inside the container; mounting a volume there preserves them across runs.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker build <span class="nt">-t</span> clk <span class="nb">.</span>
docker volume create clk-workspace

<span class="c"># Interactive TUI, with your idea passed as an argument:</span>
docker run <span class="nt">--rm</span> <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">-v</span> clk-workspace:/app/workspace <span class="se">\</span>
  clk <span class="s2">"A local-first journaling app that summarizes my week"</span>

<span class="c"># Non-interactive CI mode with Claude API key:</span>
docker run <span class="nt">--rm</span> <span class="se">\</span>
  <span class="nt">-v</span> clk-workspace:/app/workspace <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">CLK_NO_TUI</span><span class="o">=</span><span class="nb">true</span> <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">CLK_PROVIDER</span><span class="o">=</span>claude <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">CLK_AUTH_MODE</span><span class="o">=</span>apikey <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">ANTHROPIC_API_KEY</span><span class="o">=</span>sk-ant-... <span class="se">\</span>
  clk <span class="s2">"A local-first journaling app that summarizes my week"</span>
</code></pre></div></div>

<p>A setup wizard (<code class="language-plaintext highlighter-rouge">--setup</code>) walks through every configuration option and writes the results to a bind-mounted <code class="language-plaintext highlighter-rouge">.env</code> file, which is appropriate for first-run configuration in both local and container contexts.</p>

<h2 id="safety-and-reliability-mechanisms">Safety and Reliability Mechanisms</h2>

<p>CLK incorporates several safeguards designed to make autonomous operation safer without sacrificing the ability to take real action.</p>

<p>Failed work is never silently discarded. The Ralph loop uses <code class="language-plaintext highlighter-rouge">git reset --hard &lt;pre-iteration-sha&gt;</code> to revert failed iterations, and the pre-revert state is preserved in <code class="language-plaintext highlighter-rouge">.clk/runs/</code> for inspection. The backup system in <code class="language-plaintext highlighter-rouge">.clk/backups/</code> provides per-overwrite copies of every file the harness mutates, with backup filenames keyed to the run ID so you can trace a backup to the specific agent invocation that produced it.</p>
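
<p>The revert step can be sketched as follows. This is a minimal illustration, not CLK's actual code: the names <code class="language-plaintext highlighter-rouge">git</code>, <code class="language-plaintext highlighter-rouge">run_iteration</code>, and <code class="language-plaintext highlighter-rouge">step</code> are hypothetical, and the sketch covers only the hard reset, not the copy of the pre-revert state into <code class="language-plaintext highlighter-rouge">.clk/runs/</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import subprocess
from typing import Callable, Sequence


def git(args: Sequence[str]) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def run_iteration(
    step: Callable[[], bool],
    run_git: Callable[[Sequence[str]], str] = git,
) -> bool:
    """Run one iteration; hard-reset to the starting commit if it fails.

    `step` is a hypothetical callable that returns True on success.
    """
    start_sha = run_git(["rev-parse", "HEAD"])  # the pre-iteration SHA
    ok = False
    try:
        ok = step()
    finally:
        if not ok:
            # A failed or crashed iteration is rolled back to where it began.
            run_git(["reset", "--hard", start_sha])
    return ok
</code></pre></div></div>

<p>Because the reset sits in a <code class="language-plaintext highlighter-rouge">finally</code> block, an iteration that raises is reverted before the exception propagates to the caller.</p>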

<p>Operations touching more than five files are logged with a warning before execution, and operations exceeding twenty-five files in a single response batch are refused entirely (the cap is configurable). This provides a meaningful check against an agent that attempts to restructure the entire project in one pass. The <code class="language-plaintext highlighter-rouge">run</code> action sanitizes shell commands before execution, rejecting <code class="language-plaintext highlighter-rouge">sudo</code> and recognizable destructive patterns.</p>
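
<p>The two thresholds amount to a three-way classification of each batch. The sketch below assumes the limits described above (warn above five files, refuse above twenty-five); the function and constant names are hypothetical rather than taken from the harness.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>WARN_THRESHOLD = 5      # more than this many files: log a warning
REFUSE_THRESHOLD = 25   # more than this many files: refuse the batch


def check_batch(paths: list[str], refuse_threshold: int = REFUSE_THRESHOLD) -> str:
    """Classify a batch of file operations as 'ok', 'warn', or 'refuse'."""
    count = len(set(paths))  # count each distinct file once
    if count > refuse_threshold:
        return "refuse"
    if count > WARN_THRESHOLD:
        return "warn"
    return "ok"


print(check_batch(["README.md", "src/app.py"]))
</code></pre></div></div>

<p>Passing <code class="language-plaintext highlighter-rouge">refuse_threshold</code> as a parameter mirrors the fact that the cap is configurable.</p>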

<p>All exceptions are caught and logged with a location-specific prefix string and a full traceback, consistent with the project’s exception-handling convention: <code class="language-plaintext highlighter-rouge">print(f"[module:function] {e}")</code> followed by <code class="language-plaintext highlighter-rouge">traceback.print_exc()</code>. This means that a provider failure, a YAML parse error, or a filesystem permission issue is always surfaced with enough context to diagnose the problem without reading the full run log.</p>
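
<p>Applied to a single function, the convention looks roughly like this. The helper <code class="language-plaintext highlighter-rouge">read_text_or_none</code> and the <code class="language-plaintext highlighter-rouge">example</code> module prefix are made up for illustration; only the two logging calls come from the convention itself.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import traceback
from pathlib import Path


def read_text_or_none(path: str):
    """Return the file's text, or None after logging the failure."""
    try:
        return Path(path).read_text(encoding="utf-8")
    except Exception as e:
        # Location-specific prefix first, then the full traceback.
        print(f"[example:read_text_or_none] {e}")
        traceback.print_exc()
        return None
</code></pre></div></div>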

<h2 id="completion-criteria">Completion Criteria</h2>

<p>CLK considers the project “done” when <code class="language-plaintext highlighter-rouge">.clk/state/done.md</code> exists. By convention, the harness creates this file only when the chief determines that the MVP runs locally, the test suite passes, the README explains setup, a deployment plan and checklist exist, and at least one user-facing interaction path has been implemented. These criteria are not enforced programmatically; they are encoded in the chief’s system prompt as the completion standard it reasons about when deciding whether to emit <code class="language-plaintext highlighter-rouge">ACTION: done</code> or propose another workflow cycle.</p>
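
<p>Since completion reduces to the presence of one file, an external script can check for it directly. A minimal sketch, assuming the marker path given above; <code class="language-plaintext highlighter-rouge">is_done</code> is a hypothetical helper, not part of the CLK command line.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from pathlib import Path


def is_done(project_root: str = ".") -> bool:
    """True once the completion marker .clk/state/done.md exists."""
    return (Path(project_root) / ".clk" / "state" / "done.md").is_file()


if __name__ == "__main__":
    # Exit code 0 when the project is marked done, 1 otherwise.
    raise SystemExit(0 if is_done() else 1)
</code></pre></div></div>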

<h2 id="customization">Customization</h2>

<p>CLK is designed to be customized at every layer. Editing <code class="language-plaintext highlighter-rouge">.clk/prompts/</code> changes individual agent behavior without touching harness code. Editing <code class="language-plaintext highlighter-rouge">.clk/config/agents.json</code> rebinds specific agents to specific providers, for example routing the <code class="language-plaintext highlighter-rouge">engineer</code> agent to <code class="language-plaintext highlighter-rouge">claude</code> while keeping <code class="language-plaintext highlighter-rouge">researcher</code> on a local <code class="language-plaintext highlighter-rouge">ollama</code> model. Adding new YAML files to <code class="language-plaintext highlighter-rouge">.clk/config/workflows/</code> introduces new execution modes, accessible via <code class="language-plaintext highlighter-rouge">clk run --workflow NAME</code> or <code class="language-plaintext highlighter-rouge">/run workflow_name</code> in the TUI. Project-wide parameters including the supervise cycle cap, the recovery pass limit, the maximum dynamic role count, and the file-action batch limit are all configurable via <code class="language-plaintext highlighter-rouge">clk configure --set key=value</code>, which updates <code class="language-plaintext highlighter-rouge">clk.config.json</code> in place.</p>

<h2 id="pidev-extension">pi.dev Extension</h2>

<p>When pi is launched with <code class="language-plaintext highlighter-rouge">pi -e pi-extension/src/index.ts</code>, CLK exposes the <code class="language-plaintext highlighter-rouge">/clk &lt;idea&gt;</code> and <code class="language-plaintext highlighter-rouge">/clk-abort</code> commands, which start a dispatch loop similar to the one described here for the standalone tool.</p>

<h2 id="summary">Summary</h2>

<p>The Cognitive Loop Kernel addresses a practical gap in the current landscape of agentic development tools: the need for a local-first, inspectable, provider-agnostic harness that can assemble a project-specific multi-agent team, execute real file operations under safety constraints, iterate toward completion through structured loops, and maintain a complete, human-readable audit trail throughout. By treating the workflow definition, agent roster, blackboard contents, and git history as first-class project artifacts, CLK makes the behavior of the multi-agent system legible and modifiable without requiring changes to the harness itself. The source is available on <a href="https://github.com/BillJr99/CognitiveLoopKernel">GitHub</a> under the MIT License.</p>]]></content><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><category term="technical" /><category term="ai" /><category term="agentic" /><category term="software" /><summary type="html"><![CDATA[The emergence of capable code-generation models has prompted a wave of experiments in autonomous software development, where LLM agents plan, implement, test, and revise code with minimal human intervention. Most of these systems, however, rely on cloud orchestration services, opaque runtimes, or monolithic agent designs that make it difficult to inspect, customize, or extend the underlying behavior. The Cognitive Loop Kernel (CLK) is a local-first, multi-agent development harness that attempts to address these constraints directly. 
You give it an idea, and a dynamically assembled team of agents iterates that idea into a working system through repeated agentic development cycles, all under your local filesystem.]]></summary></entry><entry><title type="html">Building a Private AI Stack: From Mini PC to Autonomous Agents</title><link href="https://www.billmongan.com/posts/2026/05/privateaistack/" rel="alternate" type="text/html" title="Building a Private AI Stack: From Mini PC to Autonomous Agents" /><published>2026-05-01T00:00:00+00:00</published><updated>2026-05-01T00:00:00+00:00</updated><id>https://www.billmongan.com/posts/2026/05/localai</id><content type="html" xml:base="https://www.billmongan.com/posts/2026/05/privateaistack/"><![CDATA[<p>For the past several years I have been thinking carefully about what it means to run AI infrastructure that I actually own, control, and understand from the ground up. The rapid proliferation of frontier model APIs, agentic coding tools, and open-weight model releases in 2025-2026 finally made this tractable at a price and complexity point that a single person could manage. This post documents the architecture I settled on: a self-hosted, Docker-based stack running on a mini PC, unified by a single OpenAI-compatible model gateway, and surfaced through a collection of local inference servers, agentic CLI tools, autonomous agent frameworks, open-source Cowork alternatives, and a task-bounded command harness built around structured queues. My goals were privacy, sovereignty, reproducibility, and the ability to swap components without rebuilding everything from scratch.</p>

<h2 id="motivation-why-self-host">Motivation: Why Self-Host?</h2>

<p>The short answer is control. When I work on grant-funded AI research, on student data in the context of my courses, or on institutional planning, I want to know exactly where inference is happening and what data is leaving my environment. The longer answer is pedagogical: I cannot credibly teach AI literacy, AI ethics, and responsible deployment if I have not made serious, hands-on architectural decisions myself. Running your own stack is humbling in the right ways.</p>

<p>There is also a strategic argument. As the open-source AI agent framework ecosystem matured through early 2026, it became clear that the layered architecture of these systems, separating the model layer, the orchestration layer, the tool-access layer, and the user-interface layer, was stabilizing into recognizable patterns. A well-designed self-hosted stack can plug into any of these layers without being locked to a single vendor. That flexibility is worth the setup cost.</p>

<h2 id="hardware-the-mini-pc">Hardware: The Mini PC</h2>

<p>The physical foundation of the stack is a mini PC running Linux Mint, which gives me a clean Debian-lineage environment with full access to the upstream Docker Engine repository and no virtualization layer between the container workloads and the host kernel. Everything runs directly on the host, which simplifies networking, volume permissions, and service lifecycle management considerably compared to hypervisor-based setups.</p>

<h3 id="installing-docker-engine">Installing Docker Engine</h3>

<p>Linux Mint ships an older Docker build from the distribution mirror, so I install from the upstream Docker repository instead. The setup sequence is:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Remove any distribution-packaged versions</span>
<span class="nb">sudo </span>apt-get remove docker docker-engine docker.io containerd runc
 
<span class="nb">sudo </span>apt-get update
<span class="nb">sudo </span>apt-get <span class="nb">install</span> <span class="nt">-y</span> ca-certificates curl gnupg lsb-release
 
<span class="c"># Add Docker's official GPG key</span>
<span class="nb">sudo mkdir</span> <span class="nt">-m</span> 0755 <span class="nt">-p</span> /etc/apt/keyrings
curl <span class="nt">-fsSL</span> https://download.docker.com/linux/ubuntu/gpg | <span class="se">\</span>
  <span class="nb">sudo </span>gpg <span class="nt">--dearmor</span> <span class="nt">-o</span> /etc/apt/keyrings/docker.gpg
 
<span class="c"># Point the apt source at the Ubuntu codename that Mint is based on</span>
<span class="nb">echo</span> <span class="se">\</span>
  <span class="s2">"deb [arch=</span><span class="si">$(</span>dpkg <span class="nt">--print-architecture</span><span class="si">)</span><span class="s2"> </span><span class="se">\</span><span class="s2">
  signed-by=/etc/apt/keyrings/docker.gpg] </span><span class="se">\</span><span class="s2">
  https://download.docker.com/linux/ubuntu </span><span class="se">\</span><span class="s2">
  </span><span class="si">$(</span><span class="nb">.</span> /etc/os-release <span class="o">&amp;&amp;</span> <span class="nb">echo</span> <span class="nv">$UBUNTU_CODENAME</span><span class="si">)</span><span class="s2"> stable"</span> | <span class="se">\</span>
  <span class="nb">sudo tee</span> /etc/apt/sources.list.d/docker.list <span class="o">&gt;</span> /dev/null
 
<span class="nb">sudo </span>apt-get update
<span class="nb">sudo </span>apt-get <span class="nb">install</span> <span class="nt">-y</span> docker-ce docker-ce-cli containerd.io <span class="se">\</span>
  docker-buildx-plugin docker-compose-plugin
 
<span class="nb">sudo </span>docker run hello-world
<span class="nb">sudo </span>usermod <span class="nt">-aG</span> docker <span class="nv">$USER</span>
newgrp docker
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">UBUNTU_CODENAME</code> variable in the source line is the key detail for Mint: it resolves to the underlying Ubuntu release name rather than the Mint release name, which is what the Docker repository actually indexes.</p>
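
<p>To see the difference between the two codenames, it helps to parse an <code class="language-plaintext highlighter-rouge">os-release</code> file by hand. The sample text below is illustrative and assumed to resemble a Mint 22 system; on a real machine the values come from <code class="language-plaintext highlighter-rouge">/etc/os-release</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines from an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


# Illustrative sample (an assumption, not copied from a real host).
SAMPLE = '''NAME="Linux Mint"
VERSION_CODENAME=wilma
UBUNTU_CODENAME=noble
'''

fields = parse_os_release(SAMPLE)
print(fields["VERSION_CODENAME"])  # Mint's own release name
print(fields["UBUNTU_CODENAME"])   # the name the Docker repository indexes
</code></pre></div></div>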

<h3 id="installing-ollama">Installing Ollama</h3>

<p>Ollama runs as a native Linux service, not inside a container, which keeps model-weight I/O off the Docker networking path and avoids bind-mount overhead for the weight files:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-fsSL</span> https://ollama.com/install.sh | sh
systemctl status ollama
</code></pre></div></div>

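<p>Ollama listens on <code class="language-plaintext highlighter-rouge">http://localhost:11434</code> by default. Containers reach it at <code class="language-plaintext highlighter-rouge">host.docker.internal:11434</code>, a name that resolves only when the container is started with <code class="language-plaintext highlighter-rouge">--add-host=host.docker.internal:host-gateway</code>, which every container in the stack declares. This flag is required on Linux Docker Engine; unlike Docker Desktop, the Linux engine does not inject <code class="language-plaintext highlighter-rouge">host.docker.internal</code> automatically. Because the default bind address is loopback-only, the Ollama service may also need <code class="language-plaintext highlighter-rouge">OLLAMA_HOST=0.0.0.0</code> in its environment before it accepts requests arriving over the Docker bridge.</p>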

<h3 id="workspace-layout">Workspace Layout</h3>

<p>All agent state lives under a single root in the home directory:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$HOME/agents/
├── workspace/              # Shared project files (all TUI tools mount here)
├── skills/core/            # Read-only skills package (mounted :ro)
├── litellm/                # LiteLLM config and Compose files
├── kilocode/home/          # KiloCode identity
├── opencode/home/          # opencode.ai identity
├── pi/home/                # pi.dev identity, models.json, settings.json
├── hermes/home/            # Hermes agent identity
├── gnhf/                   # Good Night Have Fun harness
├── open-design/            # Open Design app data and pi identity
├── commercial/             # Claude Code, Codex, Gemini CLI, Copilot, CCR, bash
├── mastra/data/            # Mastra SQLite database
├── a0/data/                # Agent Zero persistent user data
├── archon/data/            # Archon workflow state
├── ollama/data/            # Downloaded model weights (~/.ollama)
├── localai/                # LocalAI models/, backends/, config/
├── googleworkspacecli/     # Google Workspace CLI (gcloud/)
├── googleagentscli/        # Google ADK CLI
└── openwebui/data/         # Open WebUI persistent data
</code></pre></div></div>

<p>Creating the tree takes a single <code class="language-plaintext highlighter-rouge">mkdir -p</code> invocation, drawn from the deploy script:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">BASE</span><span class="o">=</span><span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents"</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/workspace"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/skills/core"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/litellm"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/kilocode/home"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/opencode/home"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/pi/home"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/hermes/home"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/ollama/data"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/localai/models"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/localai/backends"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/localai/config"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/mastra/data"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/openwebui/data"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/commercial/claude/workspace"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/commercial/claude/home"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/commercial/claude/npm"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/commercial/claude/config"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/commercial/claude/cache"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/googleworkspacecli/gcloud"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/googleagentscli/data"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/googleagentscli/config"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/googleagentscli/cache/uv"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/googleagentscli/cache/npm"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/googleagentscli/evals"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/googleagentscli/logs"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/archon/data"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/a0/data"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/gnhf/home"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/gnhf/npm"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/gnhf/config"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/gnhf/cache"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/open-design/data"</span> <span class="s2">"</span><span class="nv">$BASE</span><span class="s2">/open-design/pi"</span>
<span class="nb">echo</span> <span class="s2">"Directory tree created under </span><span class="nv">$BASE</span><span class="s2">"</span>
</code></pre></div></div>

<h2 id="the-core-design-principle-a-unified-model-gateway">The Core Design Principle: A Unified Model Gateway</h2>

<p>The single most important architectural decision I made was to route all LLM inference through a single OpenAI-compatible endpoint rather than having each tool reach out to Ollama, Anthropic, or OpenRouter independently. I use <a href="https://github.com/BerriAI/litellm">LiteLLM</a> for this. Every service in the stack sends its requests to <code class="language-plaintext highlighter-rouge">http://localhost:4000/v1</code> with the bearer token <code class="language-plaintext highlighter-rouge">sk-litellm-local</code>. LiteLLM translates those requests to whatever backend is appropriate, whether a local Ollama model, a LocalAI GGUF endpoint, or an OpenRouter free-tier model, according to a YAML routing configuration.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">model_list</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">model_name</span><span class="pi">:</span> <span class="s">llama3</span>
    <span class="na">litellm_params</span><span class="pi">:</span>
      <span class="na">model</span><span class="pi">:</span> <span class="s">ollama/llama3</span>
      <span class="na">api_base</span><span class="pi">:</span> <span class="s">http://host.docker.internal:11434</span>
 
  <span class="pi">-</span> <span class="na">model_name</span><span class="pi">:</span> <span class="s">qwen2.5-3b</span>
    <span class="na">litellm_params</span><span class="pi">:</span>
      <span class="na">model</span><span class="pi">:</span> <span class="s">ollama/qwen2.5:3b</span>
      <span class="na">api_base</span><span class="pi">:</span> <span class="s">http://host.docker.internal:11434</span>
 
  <span class="pi">-</span> <span class="na">model_name</span><span class="pi">:</span> <span class="s">hermes3</span>
    <span class="na">litellm_params</span><span class="pi">:</span>
      <span class="na">model</span><span class="pi">:</span> <span class="s">ollama/hermes3:8b</span>
      <span class="na">api_base</span><span class="pi">:</span> <span class="s">http://host.docker.internal:11434</span>
 
  <span class="pi">-</span> <span class="na">model_name</span><span class="pi">:</span> <span class="s">openrouter/auto</span>
    <span class="na">litellm_params</span><span class="pi">:</span>
      <span class="na">model</span><span class="pi">:</span> <span class="s">openrouter/auto</span>
      <span class="na">api_key</span><span class="pi">:</span> <span class="s">${OPENROUTER_API_KEY}</span>
      <span class="na">api_base</span><span class="pi">:</span> <span class="s">https://openrouter.ai/api/v1</span>
</code></pre></div></div>

<p>The practical consequence is that changing a model, adding a provider, or adjusting routing requires editing one YAML file and restarting one service. No other container needs to know that anything changed. This is the same composability principle I teach in software engineering: minimize coupling, maximize cohesion.</p>
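
<p>From a client's point of view, the gateway is just an OpenAI-style HTTP endpoint. The sketch below uses only the Python standard library and assumes the endpoint and bearer token given above; <code class="language-plaintext highlighter-rouge">build_chat_request</code> and <code class="language-plaintext highlighter-rouge">ask</code> are hypothetical helper names, and <code class="language-plaintext highlighter-rouge">ask</code> needs the gateway to be running.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json
import urllib.request

GATEWAY = "http://localhost:4000/v1"   # endpoint from the post
API_KEY = "sk-litellm-local"           # bearer token from the post


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{GATEWAY}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(model: str, prompt: str) -> str:
    """Send the request and return the text of the first choice."""
    request = build_chat_request(model, prompt)
    with urllib.request.urlopen(request, timeout=120) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
</code></pre></div></div>

<p>Switching from <code class="language-plaintext highlighter-rouge">llama3</code> to <code class="language-plaintext highlighter-rouge">qwen2.5-3b</code> changes only the <code class="language-plaintext highlighter-rouge">model</code> argument; the URL and token stay the same, which is the decoupling the routing file is meant to provide.</p>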

<p>Connecting a commercial CLI tool like Claude Code to the local stack requires only two environment variable exports:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">ANTHROPIC_BASE_URL</span><span class="o">=</span>http://localhost:4000
<span class="nb">export </span><span class="nv">ANTHROPIC_API_KEY</span><span class="o">=</span>sk-litellm-local
</code></pre></div></div>

<h3 id="litellm-deployment">LiteLLM Deployment</h3>

<p>LiteLLM runs as a Docker Compose service. The <code class="language-plaintext highlighter-rouge">docker-compose.yml</code> pulls the pre-built image; everything else is bind-mounted configuration.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># litellm/docker-compose.yml</span>
<span class="na">version</span><span class="pi">:</span> <span class="s2">"</span><span class="s">3.9"</span>
<span class="na">services</span><span class="pi">:</span>
  <span class="na">litellm</span><span class="pi">:</span>
    <span class="na">image</span><span class="pi">:</span> <span class="s">ghcr.io/berriai/litellm:main-latest</span>
    <span class="na">container_name</span><span class="pi">:</span> <span class="s">litellm-${USER}</span>
    <span class="na">restart</span><span class="pi">:</span> <span class="s">unless-stopped</span>
    <span class="na">ports</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s2">"</span><span class="s">4000:4000"</span>
    <span class="na">volumes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">./litellm_config.yaml:/app/config.yaml:ro</span>
    <span class="na">env_file</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">.env</span>
    <span class="na">command</span><span class="pi">:</span> <span class="pi">[</span><span class="s2">"</span><span class="s">--config"</span><span class="pi">,</span> <span class="s2">"</span><span class="s">/app/config.yaml"</span><span class="pi">,</span> <span class="s2">"</span><span class="s">--port"</span><span class="pi">,</span> <span class="s2">"</span><span class="s">4000"</span><span class="pi">,</span> <span class="s2">"</span><span class="s">--num_workers"</span><span class="pi">,</span> <span class="s2">"</span><span class="s">2"</span><span class="pi">]</span>
    <span class="na">extra_hosts</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s2">"</span><span class="s">host.docker.internal:host-gateway"</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">.env</code> file holds the master key and any cloud provider keys:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># litellm/.env</span>
<span class="nv">LITELLM_MASTER_KEY</span><span class="o">=</span>sk-litellm-local
<span class="nv">OPENROUTER_API_KEY</span><span class="o">=</span>YOUR_OPENROUTER_API_KEY
</code></pre></div></div>

<p>Build and run scripts are minimal wrappers around Compose:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># litellm/build.sh — pull image, start, verify endpoint</span>
<span class="nb">cd</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/litellm"</span>
docker compose pull
docker compose up <span class="nt">-d</span>
<span class="nb">sleep </span>20
curl <span class="nt">-s</span> http://localhost:4000/models <span class="se">\</span>
  <span class="nt">-H</span> <span class="s2">"Authorization: Bearer sk-litellm-local"</span> | python3 <span class="nt">-m</span> json.tool | <span class="nb">head</span> <span class="nt">-20</span>
 
<span class="c"># litellm/run.sh — start after reboot</span>
<span class="nb">cd</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/litellm"</span> <span class="o">&amp;&amp;</span> docker compose up <span class="nt">-d</span>
 
<span class="c"># litellm/attach.sh — tail live logs</span>
<span class="nb">cd</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/litellm"</span> <span class="o">&amp;&amp;</span> docker compose logs <span class="nt">--tail</span><span class="o">=</span>30 <span class="nt">-f</span> litellm
</code></pre></div></div>

<p>To restart after a config change, run <code class="language-plaintext highlighter-rouge">cd $HOME/agents/litellm &amp;&amp; docker compose down &amp;&amp; docker compose up -d</code>.</p>
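
<p>The fixed <code class="language-plaintext highlighter-rouge">sleep 20</code> in the build script could be replaced by polling the endpoint until it answers. This is a sketch of that alternative rather than what the scripts do today; <code class="language-plaintext highlighter-rouge">gateway_is_up</code> and <code class="language-plaintext highlighter-rouge">wait_until</code> are hypothetical names, and the URL and token are the ones used in the <code class="language-plaintext highlighter-rouge">curl</code> check above.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import time
import urllib.error
import urllib.request
from typing import Callable


def gateway_is_up(url: str = "http://localhost:4000/models",
                  token: str = "sk-litellm-local") -> bool:
    """Return True if the gateway answers the models listing with HTTP 200."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False  # not listening yet, or the request was rejected


def wait_until(probe: Callable[[], bool],
               timeout: float = 60.0,
               interval: float = 2.0) -> bool:
    """Poll probe() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    print("ready" if wait_until(gateway_is_up) else "timed out")
</code></pre></div></div>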

<h2 id="the-full-stack-services-and-ports">The Full Stack: Services and Ports</h2>

<table>
  <thead>
    <tr>
      <th>Service</th>
      <th>Port</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>LiteLLM</td>
      <td>4000</td>
      <td>Unified model gateway (OpenAI-compatible)</td>
    </tr>
    <tr>
      <td>Ollama</td>
      <td>11434</td>
      <td>Local LLM inference (native systemd service)</td>
    </tr>
    <tr>
      <td>LocalAI</td>
      <td>8080</td>
      <td>GGUF model inference (OpenAI-compatible)</td>
    </tr>
    <tr>
      <td>Open WebUI</td>
      <td>3000</td>
      <td>Browser-based LLM frontend with MCP tool calling</td>
    </tr>
    <tr>
      <td>Mastra</td>
      <td>4111</td>
      <td>TypeScript AI agent server (API + Studio UI)</td>
    </tr>
    <tr>
      <td>Agent Zero</td>
      <td>8081</td>
      <td>Autonomous hierarchical agent with web UI</td>
    </tr>
    <tr>
      <td>Archon</td>
      <td>3090</td>
      <td>Workflow-driven agent runner</td>
    </tr>
    <tr>
      <td>Open Design</td>
      <td>5173</td>
      <td>Collaborative design canvas with embedded pi agent</td>
    </tr>
    <tr>
      <td>Portainer</td>
      <td>9000</td>
      <td>Docker management UI</td>
    </tr>
  </tbody>
</table>

<h2 id="local-inference-ollama-and-localai">Local Inference: Ollama and LocalAI</h2>

<p>The stack runs two local inference backends with distinct tradeoffs, and LiteLLM routes between them transparently based on model alias.</p>

<p><strong>Ollama</strong> is the primary inference backend. Its systemd service model keeps it available before Docker is fully up, its model management CLI is clean, and its HTTP API is stable. The models I maintain are selected for RAM footprint first: <code class="language-plaintext highlighter-rouge">phi4-mini</code>, <code class="language-plaintext highlighter-rouge">smollm2</code>, <code class="language-plaintext highlighter-rouge">gemma4:e2b</code>, <code class="language-plaintext highlighter-rouge">qwen2.5:1.5b</code>, <code class="language-plaintext highlighter-rouge">qwen2.5:3b</code>, <code class="language-plaintext highlighter-rouge">llama3</code>, and <code class="language-plaintext highlighter-rouge">hermes3:8b</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ollama/run.sh</span>
docker run <span class="nt">-d</span> <span class="se">\</span>
  <span class="nt">--name</span> ollama-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> <span class="se">\</span>
  <span class="nt">--restart</span> no <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-p</span> 11434:11434 <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/ollama/data:/root/.ollama"</span> <span class="se">\</span>
  ollama/ollama:latest
 
<span class="c"># Pull models after starting</span>
<span class="k">for </span>model <span class="k">in </span>phi4-mini smollm2 <span class="s2">"gemma4:e2b"</span> <span class="s2">"qwen2.5:1.5b"</span> <span class="s2">"qwen2.5:3b"</span> llama3 <span class="s2">"hermes3:8b"</span><span class="p">;</span> <span class="k">do
  </span>docker <span class="nb">exec </span>ollama-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> ollama pull <span class="s2">"</span><span class="nv">$model</span><span class="s2">"</span>
<span class="k">done</span>
</code></pre></div></div>

<p><strong>LocalAI</strong> (<a href="https://github.com/mudler/LocalAI">github.com/mudler/LocalAI</a>) is the secondary inference backend, running on port 8080 behind an OpenAI-compatible API surface. It supports llama.cpp for text generation, whisper.cpp for speech transcription, and Stable Diffusion for image generation, all behind the same endpoint. LocalAI is organized around three bind-mounted directories: <code class="language-plaintext highlighter-rouge">models/</code> holds GGUF weight files, <code class="language-plaintext highlighter-rouge">backends/</code> holds compiled backend binaries, and <code class="language-plaintext highlighter-rouge">config/</code> holds per-model YAML configuration. Note that the host directory is named <code class="language-plaintext highlighter-rouge">config/</code> while the container mount path is <code class="language-plaintext highlighter-rouge">/configuration</code>; this asymmetry is intentional and must be preserved.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># localai/run.sh</span>
docker run <span class="nt">-d</span> <span class="se">\</span>
  <span class="nt">--name</span> localai-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> <span class="se">\</span>
  <span class="nt">--restart</span> no <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-p</span> 8080:8080 <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/localai/models:/models"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/localai/backends:/backends"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/localai/config:/configuration"</span> <span class="se">\</span>
  localai/localai:latest
</code></pre></div></div>
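
<p>When a bind-mount source given with <code class="language-plaintext highlighter-rouge">-v</code> does not exist, Docker creates it as an empty directory rather than failing, so a typo in one of the three paths can go unnoticed. The following is a minimal preflight sketch, not part of the original scripts: the function name <code class="language-plaintext highlighter-rouge">check_localai_layout</code> is hypothetical, and the default base path simply mirrors the <code class="language-plaintext highlighter-rouge">$HOME/agents/localai</code> prefix used in <code class="language-plaintext highlighter-rouge">run.sh</code>.</p>

```bash
# Hypothetical preflight helper (not part of the original run.sh).
# Prints one line per missing directory and returns non-zero if any is absent.
check_localai_layout() {
  base="${1:-$HOME/agents/localai}"
  rc=0
  for dir in models backends config; do
    if [ ! -d "$base/$dir" ]; then
      echo "missing: $base/$dir"
      rc=1
    fi
  done
  return "$rc"
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">check_localai_layout</code> at the top of a launch script and stopping on a non-zero return would surface a misnamed directory before the container starts.</p>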

<p>A per-model YAML config controls backend selection and context length:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># localai/config/phi4-mini.yaml</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">phi4-mini</span>
<span class="na">backend</span><span class="pi">:</span> <span class="s">llama</span>
<span class="na">context_size</span><span class="pi">:</span> <span class="m">8192</span>
<span class="na">threads</span><span class="pi">:</span> <span class="m">8</span>
<span class="na">parameters</span><span class="pi">:</span>
  <span class="na">model</span><span class="pi">:</span> <span class="s">phi-4-mini-instruct.Q4_K_M.gguf</span>
</code></pre></div></div>

<h2 id="browser-frontend-open-webui">Browser Frontend: Open WebUI</h2>

<p><a href="https://github.com/open-webui/open-webui">Open WebUI</a> is the browser-based interface for direct LLM interaction, running on port 3000. It connects to Ollama directly and enumerates available models automatically. Its native MCP tool-calling support (version 0.4 and later) intercepts tool-call responses, dispatches them to registered MCP servers, and injects results back into the conversation.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># openwebui/run.sh</span>
docker run <span class="nt">-d</span> <span class="nt">-p</span> 3000:8080 <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/openwebui/data:/app/backend/data"</span> <span class="se">\</span>
  <span class="nt">--name</span> open-webui-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> <span class="se">\</span>
  ghcr.io/open-webui/open-webui:main
</code></pre></div></div>

<p>After launch, connect to LiteLLM via Admin Settings → Connections → OpenAI: set the URL to <code class="language-plaintext highlighter-rouge">http://host.docker.internal:4000/v1</code> and the API key to <code class="language-plaintext highlighter-rouge">sk-litellm-local</code>. Models confirmed to work reliably with Open WebUI tool calling include <code class="language-plaintext highlighter-rouge">hermes3:8b</code>, <code class="language-plaintext highlighter-rouge">llama3.1:8b</code>, <code class="language-plaintext highlighter-rouge">qwen2.5:7b</code>, <code class="language-plaintext highlighter-rouge">qwen2.5:14b</code>, and <code class="language-plaintext highlighter-rouge">mistral-nemo:12b</code>.</p>

<h2 id="agentic-cli-tools-a-comparative-survey">Agentic CLI Tools: A Comparative Survey</h2>

<p>By April 2026, a mature set of agentic coding CLI tools had emerged, each with a distinct architectural philosophy, and I run all of them through the unified LiteLLM gateway. Each tool runs inside a dedicated Docker container with an identity bind mount, a shared workspace mount, and an optional skills mount. The sections below show the Dockerfile, build script, and run script for each.</p>

<p><strong>Claude Code</strong> (Anthropic, Node.js) is the most fully featured in terms of built-in subagent support, MCP integration, and permission gate granularity. It uses <code class="language-plaintext highlighter-rouge">CLAUDE.md</code> files for project context and <code class="language-plaintext highlighter-rouge">.claude/agents/</code> Markdown files for custom subagent definitions.</p>

<p><strong>OpenAI Codex CLI</strong> (Rust) supports native multi-provider configuration through a TOML config file. Custom providers are defined as named sections:</p>

<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="py">model</span> <span class="p">=</span> <span class="s">"llama3.3:70b"</span>
<span class="py">model_provider</span> <span class="p">=</span> <span class="s">"openwebui"</span>
 
<span class="nn">[model_providers.openwebui]</span>
<span class="py">name</span> <span class="p">=</span> <span class="s">"Open WebUI"</span>
<span class="py">base_url</span> <span class="p">=</span> <span class="s">"http://localhost:3000/openai"</span>
<span class="py">env_key</span> <span class="p">=</span> <span class="s">"OPENWEBUI_API_KEY"</span>
</code></pre></div></div>

<p><strong>Gemini CLI</strong> (Google, Node.js) uses <code class="language-plaintext highlighter-rouge">GEMINI.md</code> files for project context and a three-tier discovery hierarchy for skills. Routing it through an OpenAI-compatible endpoint requires the <code class="language-plaintext highlighter-rouge">open-gemini-cli</code> fork, which injects an adapter layer that translates Gemini’s internal message format.</p>

<p><strong>OpenCode</strong> (opencode.ai, Go) is the most flexible in terms of provider support, relying on the <code class="language-plaintext highlighter-rouge">@ai-sdk/openai-compatible</code> adapter to connect to any OpenAI-compatible backend:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"provider"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"litellm"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"npm"</span><span class="p">:</span><span class="w"> </span><span class="s2">"@ai-sdk/openai-compatible"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"options"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"baseURL"</span><span class="p">:</span><span class="w"> </span><span class="s2">"http://localhost:4000/v1"</span><span class="w"> </span><span class="p">},</span><span class="w">
      </span><span class="nl">"models"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"llama3"</span><span class="p">:</span><span class="w"> </span><span class="p">{},</span><span class="w"> </span><span class="nl">"qwen2.5-3b"</span><span class="p">:</span><span class="w"> </span><span class="p">{},</span><span class="w"> </span><span class="nl">"hermes3"</span><span class="p">:</span><span class="w"> </span><span class="p">{}</span><span class="w"> </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<h3 id="commercial-tools-deployment-claude-code-codex-gemini-cli-copilot-ccr-bash">Commercial Tools Deployment (Claude Code, Codex, Gemini CLI, Copilot, CCR, bash)</h3>

<p>Claude Code, Codex, Gemini CLI, GitHub Copilot, the Claude Code Router, and a plain <code class="language-plaintext highlighter-rouge">bash</code> shell all share a single Docker image. The tool to launch is selected at runtime as an argument to <code class="language-plaintext highlighter-rouge">run.sh</code>. The <code class="language-plaintext highlighter-rouge">bash</code> option is a deliberate escape hatch: it drops into an interactive shell inside the <code class="language-plaintext highlighter-rouge">commercial-ai</code> container with all identity and workspace volumes mounted, but without launching any AI tool. This is useful for inspecting the container environment, running scripts manually, debugging mount layouts, or staging files before invoking an AI tool in a separate session.</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># commercial/Dockerfile</span>
<span class="k">FROM</span><span class="s"> node:22-bookworm</span>
 
<span class="k">RUN </span>apt-get update <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> <span class="se">\
</span>    git curl ca-certificates ripgrep less nano vim <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span>
 
<span class="k">RUN </span>npm <span class="nb">install</span> <span class="nt">-g</span> <span class="se">\
</span>    @anthropic-ai/claude-code <span class="se">\
</span>    @openai/codex <span class="se">\
</span>    @google/gemini-cli <span class="se">\
</span>    @github/copilot <span class="se">\
</span>    @musistudio/claude-code-router
 
<span class="k">RUN </span><span class="nb">mkdir</span> <span class="nt">-p</span> /root/.claude-code-router
<span class="k">WORKDIR</span><span class="s"> /workspace</span>
<span class="k">CMD</span><span class="s"> ["/bin/bash"]</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">build.sh</code> script creates per-tool identity directories for all six options, including <code class="language-plaintext highlighter-rouge">bash</code>, so the directory layout is consistent regardless of which tool is invoked:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/env bash</span>
<span class="nb">set</span> <span class="nt">-euo</span> pipefail
 
docker build <span class="nt">-t</span> commercial-ai:latest <span class="nb">.</span>
 
<span class="nv">BASE_DIR</span><span class="o">=</span><span class="s2">"/home/bill/agents/commercial"</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="k">${</span><span class="nv">BASE_DIR</span><span class="k">}</span><span class="s2">/workspace"</span>
 
<span class="k">for </span>tool <span class="k">in </span>claude codex gemini copilot ccr bash<span class="p">;</span> <span class="k">do
  </span><span class="nb">mkdir</span> <span class="nt">-p</span> <span class="se">\</span>
    <span class="s2">"</span><span class="k">${</span><span class="nv">BASE_DIR</span><span class="k">}</span><span class="s2">/</span><span class="k">${</span><span class="nv">tool</span><span class="k">}</span><span class="s2">/home"</span> <span class="se">\</span>
    <span class="s2">"</span><span class="k">${</span><span class="nv">BASE_DIR</span><span class="k">}</span><span class="s2">/</span><span class="k">${</span><span class="nv">tool</span><span class="k">}</span><span class="s2">/npm"</span> <span class="se">\</span>
    <span class="s2">"</span><span class="k">${</span><span class="nv">BASE_DIR</span><span class="k">}</span><span class="s2">/</span><span class="k">${</span><span class="nv">tool</span><span class="k">}</span><span class="s2">/config"</span> <span class="se">\</span>
    <span class="s2">"</span><span class="k">${</span><span class="nv">BASE_DIR</span><span class="k">}</span><span class="s2">/</span><span class="k">${</span><span class="nv">tool</span><span class="k">}</span><span class="s2">/cache"</span>
<span class="k">done</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">run.sh</code> dispatches on the tool name to set the API variable and CLI command. The <code class="language-plaintext highlighter-rouge">bash</code> case sets no API variable and uses <code class="language-plaintext highlighter-rouge">/bin/bash</code> as the command, so no credential prompt is issued and the container simply provides an interactive shell:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/env bash</span>
<span class="nb">set</span> <span class="nt">-euo</span> pipefail
 
<span class="nv">IMAGE</span><span class="o">=</span><span class="s2">"commercial-ai:latest"</span>
<span class="nv">BASE_DIR</span><span class="o">=</span><span class="s2">"/home/bill/agents/commercial"</span>
<span class="nv">CCR_CONFIG</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">BASE_DIR</span><span class="k">}</span><span class="s2">/ccr/config.json"</span>
 
<span class="k">if</span> <span class="o">[[</span> <span class="nv">$# </span><span class="nt">-ne</span> 1 <span class="o">]]</span><span class="p">;</span> <span class="k">then
  </span><span class="nb">echo</span> <span class="s2">"[run.sh] Usage: </span><span class="nv">$0</span><span class="s2"> {bash|claude|codex|gemini|copilot|ccr}"</span>
  <span class="nb">exit </span>1
<span class="k">fi
 
</span><span class="nv">TOOL</span><span class="o">=</span><span class="s2">"</span><span class="nv">$1</span><span class="s2">"</span>
<span class="nv">EXTRA_ARGS</span><span class="o">=()</span>
<span class="nv">ENV_ARGS</span><span class="o">=()</span>
 
<span class="k">case</span> <span class="s2">"</span><span class="nv">$TOOL</span><span class="s2">"</span> <span class="k">in
  </span>bash<span class="p">)</span>
    <span class="nv">API_VAR</span><span class="o">=</span><span class="s2">""</span>
    <span class="nv">CLI_CMD</span><span class="o">=</span><span class="s2">"/bin/bash"</span>
    <span class="p">;;</span>
  claude<span class="p">)</span>
    <span class="nv">API_VAR</span><span class="o">=</span><span class="s2">"ANTHROPIC_API_KEY"</span>
    <span class="nv">CLI_CMD</span><span class="o">=</span><span class="s2">"claude"</span>
    <span class="p">;;</span>
  ccr<span class="p">)</span>
    <span class="nv">API_VAR</span><span class="o">=</span><span class="s2">"ANTHROPIC_API_KEY"</span>
    <span class="nv">CLI_CMD</span><span class="o">=</span><span class="s2">"ccr code"</span>
    <span class="k">if</span> <span class="o">[[</span> <span class="o">!</span> <span class="nt">-f</span> <span class="s2">"</span><span class="k">${</span><span class="nv">CCR_CONFIG</span><span class="k">}</span><span class="s2">"</span> <span class="o">]]</span><span class="p">;</span> <span class="k">then
      </span><span class="nb">echo</span> <span class="s2">"[run.sh] CCR config not found at </span><span class="k">${</span><span class="nv">CCR_CONFIG</span><span class="k">}</span><span class="s2">"</span>
      <span class="nb">echo</span> <span class="s2">"[run.sh] Create it before running with the ccr option."</span>
      <span class="nb">exit </span>1
    <span class="k">fi
    </span>EXTRA_ARGS+<span class="o">=(</span><span class="nt">-v</span> <span class="s2">"</span><span class="k">${</span><span class="nv">CCR_CONFIG</span><span class="k">}</span><span class="s2">:/home/agent/.claude-code-router/config.json:ro"</span><span class="o">)</span>
    <span class="p">;;</span>
  codex<span class="p">)</span>
    <span class="nv">API_VAR</span><span class="o">=</span><span class="s2">"OPENAI_API_KEY"</span>
    <span class="nv">CLI_CMD</span><span class="o">=</span><span class="s2">"codex"</span>
    <span class="p">;;</span>
  gemini<span class="p">)</span>
    <span class="nv">API_VAR</span><span class="o">=</span><span class="s2">"GEMINI_API_KEY"</span>
    <span class="nv">CLI_CMD</span><span class="o">=</span><span class="s2">"gemini"</span>
    <span class="p">;;</span>
  copilot<span class="p">)</span>
    <span class="nv">API_VAR</span><span class="o">=</span><span class="s2">"GITHUB_TOKEN"</span>
    <span class="nv">CLI_CMD</span><span class="o">=</span><span class="s2">"copilot"</span>
    <span class="p">;;</span>
  <span class="k">*</span><span class="p">)</span>
    <span class="nb">echo</span> <span class="s2">"[run.sh] Invalid tool: </span><span class="nv">$TOOL</span><span class="s2">"</span>
    <span class="nb">echo</span> <span class="s2">"[run.sh] Valid options: bash | claude | codex | gemini | copilot | ccr"</span>
    <span class="nb">exit </span>1
    <span class="p">;;</span>
<span class="k">esac</span>
 
<span class="nv">TOOL_DIR</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">BASE_DIR</span><span class="k">}</span><span class="s2">/</span><span class="k">${</span><span class="nv">TOOL</span><span class="k">}</span><span class="s2">"</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="se">\</span>
  <span class="s2">"</span><span class="k">${</span><span class="nv">BASE_DIR</span><span class="k">}</span><span class="s2">/workspace"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="k">${</span><span class="nv">TOOL_DIR</span><span class="k">}</span><span class="s2">/home"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="k">${</span><span class="nv">TOOL_DIR</span><span class="k">}</span><span class="s2">/npm"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="k">${</span><span class="nv">TOOL_DIR</span><span class="k">}</span><span class="s2">/config"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="k">${</span><span class="nv">TOOL_DIR</span><span class="k">}</span><span class="s2">/cache"</span>
 
<span class="k">if</span> <span class="o">[[</span> <span class="nt">-n</span> <span class="s2">"</span><span class="k">${</span><span class="nv">API_VAR</span><span class="k">}</span><span class="s2">"</span> <span class="o">]]</span><span class="p">;</span> <span class="k">then
  if</span> <span class="o">[[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="k">${</span><span class="p">!API_VAR</span><span class="k">:-}</span><span class="s2">"</span> <span class="o">]]</span><span class="p">;</span> <span class="k">then
    </span><span class="nb">echo</span> <span class="s2">"[run.sh] </span><span class="k">${</span><span class="nv">API_VAR</span><span class="k">}</span><span class="s2"> is not set. Please enter it:"</span>
    <span class="nb">read</span> <span class="nt">-rsp</span> <span class="s2">"&gt;&gt;&gt; "</span> USER_KEY
    <span class="nb">echo</span> <span class="s2">""</span>
    <span class="nb">export</span> <span class="s2">"</span><span class="k">${</span><span class="nv">API_VAR</span><span class="k">}</span><span class="s2">=</span><span class="k">${</span><span class="nv">USER_KEY</span><span class="k">}</span><span class="s2">"</span>
  <span class="k">fi
  </span>ENV_ARGS+<span class="o">=(</span><span class="s2">"-e"</span> <span class="s2">"</span><span class="k">${</span><span class="nv">API_VAR</span><span class="k">}</span><span class="s2">=</span><span class="k">${</span><span class="p">!API_VAR</span><span class="k">}</span><span class="s2">"</span><span class="o">)</span>
<span class="k">fi
 
</span>docker run <span class="nt">--rm</span> <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">--name</span> <span class="s2">"commercial-</span><span class="k">${</span><span class="nv">TOOL</span><span class="k">}</span><span class="s2">-bill"</span> <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">HOME</span><span class="o">=</span><span class="s2">"/home/agent"</span> <span class="se">\</span>
  <span class="k">${</span><span class="nv">ENV_ARGS</span><span class="p">[@]+</span><span class="s2">"</span><span class="k">${</span><span class="nv">ENV_ARGS</span><span class="p">[@]</span><span class="k">}</span><span class="s2">"</span><span class="k">}</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="k">${</span><span class="nv">BASE_DIR</span><span class="k">}</span><span class="s2">/workspace:/workspace"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="k">${</span><span class="nv">TOOL_DIR</span><span class="k">}</span><span class="s2">/home:/home/agent"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="k">${</span><span class="nv">TOOL_DIR</span><span class="k">}</span><span class="s2">/npm:/home/agent/.npm"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="k">${</span><span class="nv">TOOL_DIR</span><span class="k">}</span><span class="s2">/config:/home/agent/.config"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="k">${</span><span class="nv">TOOL_DIR</span><span class="k">}</span><span class="s2">/cache:/home/agent/.cache"</span> <span class="se">\</span>
  <span class="k">${</span><span class="nv">EXTRA_ARGS</span><span class="p">[@]+</span><span class="s2">"</span><span class="k">${</span><span class="nv">EXTRA_ARGS</span><span class="p">[@]</span><span class="k">}</span><span class="s2">"</span><span class="k">}</span> <span class="se">\</span>
  <span class="nt">-w</span> /workspace <span class="se">\</span>
  <span class="s2">"</span><span class="k">${</span><span class="nv">IMAGE</span><span class="k">}</span><span class="s2">"</span> <span class="k">${</span><span class="nv">CLI_CMD</span><span class="k">}</span>
</code></pre></div></div>
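
<p>The credential handling in <code class="language-plaintext highlighter-rouge">run.sh</code> hinges on bash indirect expansion: <code class="language-plaintext highlighter-rouge">${!API_VAR:-}</code> reads the variable whose <em>name</em> is stored in <code class="language-plaintext highlighter-rouge">API_VAR</code>, falling back to an empty string if it is unset. The sketch below isolates that pattern so it can be tried outside the script. The function names <code class="language-plaintext highlighter-rouge">resolve_key</code> and <code class="language-plaintext highlighter-rouge">build_env_args</code> are hypothetical and do not appear in the original.</p>

```bash
#!/usr/bin/env bash
# Sketch of the indirect-expansion pattern; requires bash, not POSIX sh.

# Print the value of the variable whose NAME is passed as $1 (empty if unset).
resolve_key() {
  local var_name="${1:-}"
  if [[ -z "$var_name" ]]; then
    return 0          # mirrors the "bash" tool case: no credential variable
  fi
  printf '%s' "${!var_name:-}"
}

# Emit the two docker arguments (-e NAME=value), one per line, if a key exists.
build_env_args() {
  local var_name="${1:-}" value
  value="$(resolve_key "$var_name")"
  if [[ -n "$value" ]]; then
    printf '%s\n' "-e" "${var_name}=${value}"
  fi
}
```

<p>With a placeholder such as <code class="language-plaintext highlighter-rouge">DEMO_API_KEY=sk-test</code> set in the shell, <code class="language-plaintext highlighter-rouge">build_env_args DEMO_API_KEY</code> emits the flag and the assignment on separate lines, and an empty name emits nothing, which is the behavior the <code class="language-plaintext highlighter-rouge">bash</code> case relies on.</p>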

<p>Invoking <code class="language-plaintext highlighter-rouge">./run.sh bash</code> thus drops into the container with the full commercial identity environment present but no tool running, which makes it straightforward to inspect installed package versions, verify that volume mounts resolved correctly, or stage configuration files before running a tool. Because the <code class="language-plaintext highlighter-rouge">bash</code> identity directory (<code class="language-plaintext highlighter-rouge">commercial/bash/home</code>) is separate from all other tool identity directories, changes made under the home directory in a shell session cannot bleed into a Claude Code or Codex session. The <code class="language-plaintext highlighter-rouge">/workspace</code> mount is the exception: every tool shares it, so files staged there are visible to all of them.</p>
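
<p>To see exactly which host directories a given tool touches, the bind-mount sources from <code class="language-plaintext highlighter-rouge">run.sh</code> can be enumerated per tool. This is a minimal sketch: the function name <code class="language-plaintext highlighter-rouge">tool_mount_sources</code> is hypothetical, and the base path is passed in rather than hard-coded.</p>

```bash
# Hypothetical helper: print the host-side source of each bind mount for a tool,
# following the layout run.sh uses (BASE_DIR/workspace plus BASE_DIR/TOOL/...).
tool_mount_sources() {
  base="$1"
  tool="$2"
  echo "$base/workspace"
  for sub in home npm config cache; do
    echo "$base/$tool/$sub"
  done
}
```

<p>Comparing the output for two tools, for example <code class="language-plaintext highlighter-rouge">bash</code> and <code class="language-plaintext highlighter-rouge">claude</code>, shows at a glance which paths they have in common and which are private to each.</p>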

<p>The <code class="language-plaintext highlighter-rouge">ccr</code> (Claude Code Router) variant additionally mounts its routing config read-only:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">//</span><span class="w"> </span><span class="err">commercial/ccr/config.json</span><span class="w"> </span><span class="err">—</span><span class="w"> </span><span class="err">routing</span><span class="w"> </span><span class="err">table</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"Router"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"default"</span><span class="p">:</span><span class="w">          </span><span class="s2">"ollama,llama3:latest"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"background"</span><span class="p">:</span><span class="w">       </span><span class="s2">"ollama,qwen2.5:1.5b"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"think"</span><span class="p">:</span><span class="w">            </span><span class="s2">"ollama,gemma4:e2b"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"longContext"</span><span class="p">:</span><span class="w">      </span><span class="s2">"openrouter,google/gemini-2.5-pro-exp-03-25:free"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"longContextThreshold"</span><span class="p">:</span><span class="w"> </span><span class="mi">60000</span><span class="p">,</span><span class="w">
    </span><span class="nl">"webSearch"</span><span class="p">:</span><span class="w">        </span><span class="s2">"openrouter,google/gemini-2.5-pro-exp-03-25:online"</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
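
<p>To illustrate how a threshold such as <code class="language-plaintext highlighter-rouge">longContextThreshold</code> partitions requests, here is a deliberately simplified sketch. It is not Claude Code Router's implementation: the function <code class="language-plaintext highlighter-rouge">pick_route</code> is hypothetical, it looks only at the token count and ignores the <code class="language-plaintext highlighter-rouge">background</code>, <code class="language-plaintext highlighter-rouge">think</code>, and <code class="language-plaintext highlighter-rouge">webSearch</code> routes, and the strict greater-than comparison is an assumption.</p>

```bash
# Simplified illustration only; not Claude Code Router's actual logic.
# Assumes a request goes to the longContext route when its token count
# exceeds the threshold, and to the default route otherwise.
pick_route() {
  tokens="$1"
  threshold="${2:-60000}"
  if [ "$tokens" -gt "$threshold" ]; then
    echo "longContext"
  else
    echo "default"
  fi
}
```

<p>Under these assumptions, a 1,000-token request stays on the local <code class="language-plaintext highlighter-rouge">default</code> model, while a 60,001-token request is sent to the <code class="language-plaintext highlighter-rouge">longContext</code> entry.</p>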

<p>Switching the active model within a running Claude Code session is a single slash command: <code class="language-plaintext highlighter-rouge">/model ollama,llama3:latest</code>.</p>

<h3 id="opencode-deployment">OpenCode Deployment</h3>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># opencode/Dockerfile</span>
<span class="k">FROM</span><span class="s"> node:20-bookworm-slim</span>
 
<span class="k">RUN </span>apt-get update <span class="se">\
</span>    <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> <span class="se">\
</span>        curl ca-certificates git bash findutils <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span>
 
<span class="k">RUN </span>curl <span class="nt">-fsSL</span> https://opencode.ai/install | bash <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">mkdir</span> <span class="nt">-p</span> /opt/opencode/bin <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">cp</span> <span class="s2">"</span><span class="si">$(</span>find /root <span class="nt">-type</span> f <span class="nt">-name</span> opencode | <span class="nb">head</span> <span class="nt">-n</span> 1<span class="si">)</span><span class="s2">"</span> /opt/opencode/bin/opencode <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">chmod </span>755 /opt/opencode/bin/opencode
 
<span class="k">ENV</span><span class="s"> PATH="/opt/opencode/bin:${PATH}"</span>
<span class="k">ENV</span><span class="s"> HOME=/home/opencode</span>
<span class="k">VOLUME</span><span class="s"> ["/workspace"]</span>
<span class="k">WORKDIR</span><span class="s"> /workspace</span>
<span class="k">ENTRYPOINT</span><span class="s"> ["/opt/opencode/bin/opencode"]</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># opencode/build.sh</span>
docker build <span class="nt">-t</span> opencode:local <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/opencode"</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/opencode/home"</span>
 
<span class="c"># opencode/run.sh</span>
docker run <span class="nt">--restart</span> no <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">--name</span> opencode-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/opencode/home:/home/opencode"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/workspace:/workspace"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/skills/core:/app/skills/core:ro"</span> <span class="se">\</span>
  opencode:local
 
<span class="c"># opencode/attach.sh</span>
docker start <span class="nt">-ai</span> opencode-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span>
</code></pre></div></div>

<p><strong>KiloCode</strong> (VS Code extension, Node.js) is the VS Code-native member of the survey. It brings the agentic loop into the editor rather than the terminal, with direct access to the VS Code language server for diagnostics, symbol navigation, and refactoring. Connecting it to LiteLLM is a one-field change in its settings JSON.</p>

<h3 id="kilocode-deployment">KiloCode Deployment</h3>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># kilocode/Dockerfile</span>
<span class="k">FROM</span><span class="s"> debian:bookworm-slim</span>
 
<span class="k">RUN </span>apt-get update <span class="o">&amp;&amp;</span> <span class="se">\
</span>    apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> <span class="se">\
</span>        ca-certificates git bash wget curl <span class="o">&amp;&amp;</span> <span class="se">\
</span>    <span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span>
 
<span class="k">ARG</span><span class="s"> TARGETARCH=amd64</span>
 
<span class="k">RUN </span><span class="nv">ARCH</span><span class="o">=</span><span class="k">${</span><span class="nv">TARGETARCH</span><span class="k">}</span> <span class="o">&amp;&amp;</span> <span class="se">\
</span>    <span class="k">if</span> <span class="o">[</span> <span class="s2">"</span><span class="nv">$ARCH</span><span class="s2">"</span> <span class="o">=</span> <span class="s2">"arm64"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then </span><span class="nv">ARCH</span><span class="o">=</span><span class="s2">"arm64"</span><span class="p">;</span> <span class="k">else </span><span class="nv">ARCH</span><span class="o">=</span><span class="s2">"x64"</span><span class="p">;</span> <span class="k">fi</span> <span class="o">&amp;&amp;</span> <span class="se">\
</span>    wget <span class="nt">-qO</span> /tmp/kilo.tar.gz <span class="se">\
</span>        <span class="s2">"https://github.com/Kilo-Org/kilocode/releases/latest/download/kilo-linux-</span><span class="k">${</span><span class="nv">ARCH</span><span class="k">}</span><span class="s2">.tar.gz"</span> <span class="o">&amp;&amp;</span> <span class="se">\
</span>    <span class="nb">tar</span> <span class="nt">-xzf</span> /tmp/kilo.tar.gz <span class="nt">-C</span> /usr/local/bin <span class="o">&amp;&amp;</span> <span class="se">\
</span>    <span class="nb">chmod</span> +x /usr/local/bin/kilo <span class="o">&amp;&amp;</span> <span class="se">\
</span>    <span class="nb">rm</span> /tmp/kilo.tar.gz
 
<span class="k">VOLUME</span><span class="s"> ["/workspace"]</span>
<span class="k">WORKDIR</span><span class="s"> /workspace</span>
<span class="k">ENTRYPOINT</span><span class="s"> ["kilo"]</span>
</code></pre></div></div>
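
<p>The architecture handling in the <code class="language-plaintext highlighter-rouge">RUN</code> step reduces to a two-way mapping: Docker reports <code class="language-plaintext highlighter-rouge">amd64</code> or <code class="language-plaintext highlighter-rouge">arm64</code> in <code class="language-plaintext highlighter-rouge">TARGETARCH</code>, while the release assets referenced in the Dockerfile are suffixed <code class="language-plaintext highlighter-rouge">x64</code> and <code class="language-plaintext highlighter-rouge">arm64</code>. The same mapping is sketched below as a standalone shell function. The name <code class="language-plaintext highlighter-rouge">kilo_asset_url</code> is hypothetical, and the URL pattern is copied from the Dockerfile rather than checked against the release page.</p>

```bash
# Hypothetical helper mirroring the Dockerfile's architecture mapping:
# arm64 stays arm64; anything else (including amd64) falls through to x64.
kilo_asset_url() {
  case "$1" in
    arm64) arch="arm64" ;;
    *)     arch="x64" ;;
  esac
  echo "https://github.com/Kilo-Org/kilocode/releases/latest/download/kilo-linux-${arch}.tar.gz"
}
```

<p>Note that the fall-through means an unexpected architecture value silently selects the <code class="language-plaintext highlighter-rouge">x64</code> asset, exactly as in the Dockerfile.</p>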

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># kilocode/build.sh</span>
docker build <span class="nt">-t</span> kilocode:local <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/kilocode"</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/kilocode/home"</span>
<span class="nb">chown</span> <span class="nt">-R</span> <span class="si">$(</span><span class="nb">id</span> <span class="nt">-u</span><span class="si">)</span>:<span class="si">$(</span><span class="nb">id</span> <span class="nt">-g</span><span class="si">)</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/kilocode/home"</span>
<span class="nb">chmod</span> <span class="nt">-R</span> u+rwX <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/kilocode/home"</span>
 
<span class="c"># kilocode/run.sh</span>
docker run <span class="nt">--restart</span> no <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">--user</span> <span class="si">$(</span><span class="nb">id</span> <span class="nt">-u</span><span class="si">)</span>:<span class="si">$(</span><span class="nb">id</span> <span class="nt">-g</span><span class="si">)</span> <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">HOME</span><span class="o">=</span>/home/kilo <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">XDG_CONFIG_HOME</span><span class="o">=</span>/home/kilo/.config <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">XDG_DATA_HOME</span><span class="o">=</span>/home/kilo/.local/share <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">XDG_CACHE_HOME</span><span class="o">=</span>/home/kilo/.cache <span class="se">\</span>
  <span class="nt">--name</span> kilocode-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">TERM</span><span class="o">=</span>xterm-256color <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/kilocode/home:/home/kilo"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/workspace:/workspace"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/skills/core:/app/skills/core:ro"</span> <span class="se">\</span>
  kilocode:local
 
<span class="c"># kilocode/attach.sh</span>
docker start <span class="nt">-ai</span> kilocode-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span>
</code></pre></div></div>

<p><strong>pi</strong> (pi.dev, Node.js) takes a deliberately minimal stance, omitting built-in MCP, plan mode, and permission gates in favor of a package-based extensibility model. It supports OpenRouter directly through a provider block in <code class="language-plaintext highlighter-rouge">models.json</code>, and NVIDIA NIM endpoints are equally accessible through the same mechanism. I reach for pi primarily for rapid exploratory work precisely because it does not impose an opinionated workflow.</p>

<h3 id="pi-deployment">Pi Deployment</h3>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># pi/Dockerfile</span>
<span class="k">FROM</span><span class="s"> node:22-bookworm-slim</span>
 
<span class="k">RUN </span>apt-get update <span class="se">\
</span> <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> <span class="se">\
</span>    bash ca-certificates curl git openssh-client <span class="se">\
</span> <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span>
 
<span class="k">RUN </span>npm <span class="nb">install</span> <span class="nt">-g</span> @mariozechner/pi-coding-agent
 
<span class="k">ENV</span><span class="s"> HOME=/home/pi-agent</span>
<span class="k">VOLUME</span><span class="s"> ["/workspace"]</span>
<span class="k">WORKDIR</span><span class="s"> /workspace</span>
<span class="k">ENTRYPOINT</span><span class="s"> ["pi"]</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># pi/build.sh</span>
docker build <span class="nt">-t</span> pi:local <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/pi"</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/pi/home/.pi/agent"</span>
 
<span class="c"># Write a default models.json with Ollama (local) and OpenRouter providers</span>
<span class="nb">cat</span> <span class="o">&gt;</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/pi/home/.pi/agent/models.json"</span> <span class="o">&lt;&lt;</span> <span class="sh">'</span><span class="no">EOF</span><span class="sh">'
{
  "providers": {
    "ollama": {
      "baseUrl": "http://host.docker.internal:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "models": [
        {
          "id": "qwen2.5:7b",
          "name": "Qwen 2.5 7B (Local)",
          "contextWindow": 32768,
          "maxTokens": 8192,
          "cost": { "input": 0, "output": 0 }
        }
      ]
    },
    "openrouter": {
      "baseUrl": "https://openrouter.ai/api/v1",
      "apiKey": "</span><span class="k">${</span><span class="nv">OPENROUTER_API_KEY</span><span class="k">}</span><span class="sh">",
      "models": [
        { "id": "google/gemini-2.5-pro-exp-03-25:free", "contextWindow": 1000000 },
        { "id": "meta-llama/llama-4-maverick:free",      "contextWindow": 128000  }
      ]
    }
  }
}
</span><span class="no">EOF
 
</span><span class="c"># pi/run.sh</span>
docker run <span class="nt">--restart</span> no <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">--name</span> pi-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">TERM</span><span class="o">=</span>xterm-256color <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">OLLAMA_HOST</span><span class="o">=</span><span class="s2">"http://host.docker.internal:11434"</span> <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">OPENROUTER_API_KEY</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">OPENROUTER_API_KEY</span><span class="k">:-</span><span class="nv">YOUR_OPENROUTER_API_KEY</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/pi/home:/home/pi-agent"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/workspace:/workspace"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/skills/core:/app/skills/core:ro"</span> <span class="se">\</span>
  pi:local
 
<span class="c"># pi/attach.sh</span>
docker start <span class="nt">-ai</span> pi-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span>
</code></pre></div></div>
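<p>The same provider-block mechanism extends to NVIDIA NIM. The sketch below prints a provider entry that could be merged by hand into the <code class="language-plaintext highlighter-rouge">providers</code> object of <code class="language-plaintext highlighter-rouge">models.json</code>. The field names mirror the Ollama entry above and the base URL is NVIDIA's hosted OpenAI-compatible endpoint. The provider key, the model ID, the context window, the <code class="language-plaintext highlighter-rouge">NVIDIA_API_KEY</code> variable name, and the helper name <code class="language-plaintext highlighter-rouge">nim_provider_block</code> are assumptions for illustration, so check them against the pi and NIM documentation before relying on them.</p>

```shell
#!/usr/bin/env bash
# Sketch: print a models.json provider entry for an NVIDIA NIM endpoint.
# Assumptions (not from the original scripts): the "nvidia-nim" key, the
# model ID, the context window, and the NVIDIA_API_KEY variable name.
nim_provider_block() {
  # Use the first argument if given, else the environment, else a placeholder.
  local api_key="${1:-${NVIDIA_API_KEY:-YOUR_NVIDIA_API_KEY}}"
  cat <<EOF
"nvidia-nim": {
  "baseUrl": "https://integrate.api.nvidia.com/v1",
  "api": "openai-completions",
  "apiKey": "${api_key}",
  "models": [
    { "id": "meta/llama-3.1-8b-instruct", "contextWindow": 128000 }
  ]
}
EOF
}

# Example: nim_provider_block "$NVIDIA_API_KEY"
```

<p>Unlike the quoted heredoc in <code class="language-plaintext highlighter-rouge">build.sh</code>, this heredoc is unquoted, so the shell substitutes the key into the output. Treat the result as a secret.</p>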

<p><strong>Hermes</strong> is a named agent identity maintained around Nous Research’s Hermes 3 model family. The <code class="language-plaintext highlighter-rouge">hermes/home/</code> directory holds a configuration and prompt library tuned for Hermes 3’s specific instruction format and function-calling conventions. Because Hermes 3 handles structured output and tool-use with uncommon consistency, I route all tool-calling-heavy workloads to this identity.</p>

<h3 id="hermes-agent-deployment">Hermes Agent Deployment</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># hermes/build.sh — probe first, then run</span>
<span class="c"># Confirm the image's runtime user and home directory before committing a mount path:</span>
docker pull nousresearch/hermes-agent:latest
docker run <span class="nt">--rm</span> <span class="nt">--entrypoint</span> sh nousresearch/hermes-agent:latest <span class="se">\</span>
  <span class="nt">-c</span> <span class="s1">'id &amp;&amp; echo HOME=$HOME'</span>
 
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/hermes/home"</span>
 
<span class="c"># hermes/run.sh</span>
<span class="c"># Always launch with -it — running detached causes immediate exit</span>
docker run <span class="nt">--restart</span> no <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">--name</span> hermes-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">TERM</span><span class="o">=</span>xterm-256color <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/hermes/home:/home/hermes/.hermes"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/workspace:/workspace"</span> <span class="se">\</span>
  nousresearch/hermes-agent:latest
 
<span class="c"># hermes/attach.sh</span>
docker start <span class="nt">-ai</span> hermes-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span>
</code></pre></div></div>
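<p>The probe step can feed the mount path directly instead of being read by eye. This is a minimal sketch: it reads the probe output on standard input and prints the container-side target for the bind mount. The <code class="language-plaintext highlighter-rouge">.hermes</code> suffix follows the mount in <code class="language-plaintext highlighter-rouge">run.sh</code> above; the function name <code class="language-plaintext highlighter-rouge">hermes_mount_target</code> is hypothetical.</p>

```shell
#!/usr/bin/env bash
# Sketch: derive the container-side mount target from the probe output,
# which contains an `id` line followed by a "HOME=/path" line.
# The function name is hypothetical; ".hermes" follows run.sh above.
hermes_mount_target() {
  local line home=""
  while IFS= read -r line; do
    case "$line" in
      HOME=*) home="${line#HOME=}" ;;
    esac
  done
  if [ -z "$home" ]; then
    echo "no HOME= line in probe output" >&2
    return 1
  fi
  # Strip a trailing slash so HOME=/ yields /.hermes rather than //.hermes.
  printf '%s/.hermes\n' "${home%/}"
}

# Example:
# docker run --rm --entrypoint sh nousresearch/hermes-agent:latest \
#   -c 'id && echo HOME=$HOME' | hermes_mount_target
```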

<p><strong>GNHF</strong> (Good Night Have Fun) is the task-bounded harness I use when I want a single-purpose agent that executes a defined workflow, reports results, and stops. The name is a ham radio sign-off, which fits its character: it is polite and brief, and it does exactly what it was asked to do. Unlike Claude Code or pi, which are interactive and session-oriented, GNHF takes a task description and a workspace path as inputs, executes against the LiteLLM gateway, writes its outputs to the workspace, and exits.</p>

<h3 id="gnhf-deployment">GNHF Deployment</h3>

<p>GNHF requires a Dockerfile placed at <code class="language-plaintext highlighter-rouge">$HOME/agents/gnhf/Dockerfile</code> before <code class="language-plaintext highlighter-rouge">build.sh</code> can execute, because the gnhf binary distribution mechanism is external to this stack. A representative starting point:</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># gnhf/Dockerfile — adapt to your gnhf binary distribution</span>
<span class="k">FROM</span><span class="s"> node:22-bookworm</span>
 
<span class="k">RUN </span>apt-get update <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> <span class="se">\
</span>    git curl ca-certificates ripgrep less <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span>
 
<span class="c"># Install the agent CLIs that gnhf wraps</span>
<span class="k">RUN </span>npm <span class="nb">install</span> <span class="nt">-g</span> <span class="se">\
</span>    @anthropic-ai/claude-code <span class="se">\
</span>    @openai/codex <span class="se">\
</span>    @github/copilot
 
<span class="c"># Install gnhf — adapt to your distribution method:</span>
<span class="c"># RUN npm install -g gnhf</span>
<span class="c"># or: COPY gnhf /usr/local/bin/gnhf &amp;&amp; chmod +x /usr/local/bin/gnhf</span>
 
<span class="k">WORKDIR</span><span class="s"> /workspace</span>
<span class="k">CMD</span><span class="s"> ["/bin/bash"]</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># gnhf/build.sh</span>
<span class="k">if</span> <span class="o">[[</span> <span class="o">!</span> <span class="nt">-f</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/gnhf/Dockerfile"</span> <span class="o">]]</span><span class="p">;</span> <span class="k">then
  </span><span class="nb">echo</span> <span class="s2">"ERROR: place Dockerfile in </span><span class="nv">$HOME</span><span class="s2">/agents/gnhf/ first"</span><span class="p">;</span> <span class="nb">exit </span>1
<span class="k">fi
</span>docker build <span class="nt">-t</span> gnhf:latest <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/gnhf"</span>
 
<span class="c"># gnhf/run.sh — key arguments shown; full script handles key resolution per agent</span>
<span class="c"># Usage: ./run.sh --agent &lt;codex|claude|copilot&gt; --repo &lt;path&gt; \</span>
<span class="c">#                 [--max-iterations N] [--max-tokens N] "task description"</span>
docker run <span class="nt">--rm</span> <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">--name</span> <span class="s2">"gnhf-</span><span class="k">${</span><span class="nv">AGENT</span><span class="k">}</span><span class="s2">-</span><span class="k">${</span><span class="nv">USER</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">ANTHROPIC_API_KEY</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">ANTHROPIC_API_KEY</span><span class="k">:-</span><span class="nv">YOUR_ANTHROPIC_API_KEY</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">HOME</span><span class="o">=</span><span class="s2">"/home/agent"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/workspace:/workspace"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/gnhf/home:/home/agent"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/gnhf/npm:/home/agent/.npm"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/gnhf/config:/home/agent/.config"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/gnhf/cache:/home/agent/.cache"</span> <span class="se">\</span>
  gnhf:latest <span class="se">\</span>
  bash <span class="nt">-lc</span> <span class="s1">'
    cd "$1" &amp;&amp;
    git config --global --add safe.directory "$1" &amp;&amp;
    shift &amp;&amp; exec "$@"
  '</span> _ <span class="s2">"</span><span class="k">${</span><span class="nv">REPO_PATH</span><span class="k">}</span><span class="s2">"</span> gnhf <span class="nt">--agent</span> <span class="s2">"</span><span class="k">${</span><span class="nv">AGENT</span><span class="k">}</span><span class="s2">"</span> <span class="s2">"</span><span class="k">${</span><span class="nv">PROMPT</span><span class="k">}</span><span class="s2">"</span>
</code></pre></div></div>
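<p>The <code class="language-plaintext highlighter-rouge">docker run</code> call above expects <code class="language-plaintext highlighter-rouge">AGENT</code>, <code class="language-plaintext highlighter-rouge">REPO_PATH</code>, and <code class="language-plaintext highlighter-rouge">PROMPT</code> to be set already. One way to populate them from the usage line is sketched below. It is an assumption about how the full script might parse its arguments, not a copy of it, and the function name <code class="language-plaintext highlighter-rouge">parse_gnhf_args</code> and the <code class="language-plaintext highlighter-rouge">MAX_ITERATIONS</code> and <code class="language-plaintext highlighter-rouge">MAX_TOKENS</code> variable names are hypothetical. Per-agent key resolution is left out.</p>

```shell
#!/usr/bin/env bash
# Sketch of the argument handling implied by the usage line above.
# Sets AGENT, REPO_PATH, MAX_ITERATIONS, MAX_TOKENS, and PROMPT.
# Hypothetical helper; the real run.sh may differ.
parse_gnhf_args() {
  AGENT="" REPO_PATH="" MAX_ITERATIONS="" MAX_TOKENS="" PROMPT=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --agent|--repo|--max-iterations|--max-tokens)
        # Every option takes exactly one value.
        if [ "$#" -lt 2 ]; then
          echo "missing value for $1" >&2
          return 2
        fi
        case "$1" in
          --agent)          AGENT="$2" ;;
          --repo)           REPO_PATH="$2" ;;
          --max-iterations) MAX_ITERATIONS="$2" ;;
          --max-tokens)     MAX_TOKENS="$2" ;;
        esac
        shift 2
        ;;
      --*)
        echo "unknown option: $1" >&2
        return 2
        ;;
      *)
        # The remaining positional argument is the task description.
        PROMPT="$1"
        shift
        ;;
    esac
  done
  case "$AGENT" in
    codex|claude|copilot) ;;
    *) echo "--agent must be codex, claude, or copilot" >&2; return 2 ;;
  esac
  [ -n "$REPO_PATH" ] || { echo "--repo is required" >&2; return 2; }
  [ -n "$PROMPT" ]    || { echo "task description is required" >&2; return 2; }
}

# Example: parse_gnhf_args "$@" || exit $?
```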

<p>All six tools read project context files (<code class="language-plaintext highlighter-rouge">CLAUDE.md</code>, <code class="language-plaintext highlighter-rouge">AGENTS.md</code>, <code class="language-plaintext highlighter-rouge">GEMINI.md</code>) to pick up persistent project instructions, so I do not have to restate them in every prompt, and all six are converging on MCP as the standard for tool integration.</p>

<h2 id="the-google-ecosystem-agents-cli-and-workspace-cli">The Google Ecosystem: Agents CLI and Workspace CLI</h2>

<p>Two Google-specific tools occupy their own tier in the stack, with independent container identities and a shared philosophy of treating Google’s API surface as a set of agent-accessible tools.</p>

<p><strong>Google Agents CLI</strong> is the command-line interface for Google’s Agent Development Kit (ADK), a Python framework for building multi-agent systems that run on Google’s infrastructure and interact with Gemini models. The <code class="language-plaintext highlighter-rouge">uv</code> cache indicates a Python-heavy dependency footprint, the <code class="language-plaintext highlighter-rouge">evals/</code> directory holds evaluation datasets and result logs, and the container runs as a non-root user (UID 1000) so that file ownership on the bind-mounted volumes matches a host account with the same UID.</p>

<h3 id="google-agents-cli-deployment">Google Agents CLI Deployment</h3>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># googleagentscli/Dockerfile</span>
<span class="k">FROM</span><span class="s"> python:3.12-slim</span>
 
<span class="k">RUN </span>apt-get update <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> <span class="se">\
</span>        ca-certificates curl git gnupg lsb-release unzip wget <span class="se">\
</span>        jq vim less procps build-essential <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span>
 
<span class="k">RUN </span>curl <span class="nt">-fsSL</span> https://deb.nodesource.com/setup_lts.x | bash - <span class="se">\
</span>    <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> nodejs <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span>
 
<span class="k">COPY</span><span class="s"> --from=ghcr.io/astral-sh/uv:latest /uv /uvx /usr/local/bin/</span>
 
<span class="c"># Google Cloud SDK</span>
<span class="k">RUN </span><span class="nb">echo</span> <span class="s2">"deb [signed-by=/usr/share/keyrings/cloud.google.gpg] </span><span class="se">\
</span><span class="s2">        https://packages.cloud.google.com/apt cloud-sdk main"</span> <span class="se">\
</span>        | <span class="nb">tee</span> /etc/apt/sources.list.d/google-cloud-sdk.list <span class="se">\
</span>    <span class="o">&amp;&amp;</span> curl <span class="nt">-fsSL</span> https://packages.cloud.google.com/apt/doc/apt-key.gpg <span class="se">\
</span>        | gpg <span class="nt">--dearmor</span> <span class="nt">-o</span> /usr/share/keyrings/cloud.google.gpg <span class="se">\
</span>    <span class="o">&amp;&amp;</span> apt-get update <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> google-cloud-cli <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span>
 
<span class="k">RUN </span>groupadd <span class="nt">-g</span> 1000 agent <span class="o">&amp;&amp;</span> useradd <span class="nt">-m</span> <span class="nt">-u</span> 1000 <span class="nt">-g</span> agent <span class="nt">-s</span> /bin/bash agent
 
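<span class="k">ENV</span><span class="s"> UV_TOOL_DIR=/usr/local/uv-tools</span>
<span class="c"># uv links tool executables into UV_TOOL_BIN_DIR (by default ~/.local/bin), not ${UV_TOOL_DIR}/bin</span>
<span class="k">ENV</span><span class="s"> UV_TOOL_BIN_DIR=/usr/local/bin</span>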
 
<span class="k">RUN </span>uv tool <span class="nb">install </span>google-agents-cli <span class="o">&amp;&amp;</span> <span class="nb">chown</span> <span class="nt">-R</span> agent:agent <span class="s2">"</span><span class="k">${</span><span class="nv">UV_TOOL_DIR</span><span class="k">}</span><span class="s2">"</span>
 
<span class="k">RUN </span><span class="nb">mkdir</span> <span class="nt">-p</span> /workspace /home/agent/.config/agents-cli /home/agent/.config/gcloud <span class="se">\
</span>        /home/agent/.cache/uv /home/agent/.cache/npm /home/agent/evals /home/agent/logs <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">chown</span> <span class="nt">-R</span> agent:agent /workspace /home/agent/.config /home/agent/.cache <span class="se">\
</span>        /home/agent/evals /home/agent/logs
 
<span class="k">USER</span><span class="s"> agent</span>
<span class="k">WORKDIR</span><span class="s"> /workspace</span>
<span class="k">CMD</span><span class="s"> ["sleep", "infinity"]</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># googleagentscli/build.sh</span>
docker build <span class="nt">--progress</span><span class="o">=</span>plain <span class="nt">-t</span> google-agents-cli:local <span class="se">\</span>
  <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/googleagentscli"</span>
 
<span class="c"># googleagentscli/run.sh — starts detached; exec in with attach.sh</span>
<span class="nv">GCLOUD_MOUNT</span><span class="o">=()</span>
<span class="o">[[</span> <span class="nt">-d</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/.config/gcloud"</span> <span class="o">]]</span> <span class="o">&amp;&amp;</span> <span class="se">\</span>
  <span class="nv">GCLOUD_MOUNT</span><span class="o">=(</span><span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/.config/gcloud:/home/agent/.config/gcloud:ro"</span><span class="o">)</span>
 
docker run <span class="se">\</span>
  <span class="nt">--detach</span> <span class="se">\</span>
  <span class="nt">--name</span> <span class="s2">"googleagentscli-</span><span class="k">${</span><span class="nv">USER</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">--restart</span> unless-stopped <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/googleagentscli/data:/workspace"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/googleagentscli/config:/home/agent/.config/agents-cli"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/googleagentscli/cache/uv:/home/agent/.cache/uv"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/googleagentscli/cache/npm:/home/agent/.cache/npm"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/googleagentscli/evals:/home/agent/evals"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/googleagentscli/logs:/home/agent/logs"</span> <span class="se">\</span>
  <span class="s2">"</span><span class="k">${</span><span class="nv">GCLOUD_MOUNT</span><span class="p">[@]</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">GOOGLE_API_KEY</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">GOOGLE_API_KEY</span><span class="k">:-</span><span class="nv">YOUR_GOOGLE_API_KEY</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">--workdir</span> /workspace <span class="se">\</span>
  google-agents-cli:local
 
<span class="c"># googleagentscli/attach.sh</span>
docker <span class="nb">exec</span> <span class="nt">-it</span> <span class="nt">--user</span> agent <span class="nt">--workdir</span> /workspace <span class="se">\</span>
  <span class="s2">"googleagentscli-</span><span class="k">${</span><span class="nv">USER</span><span class="k">}</span><span class="s2">"</span> bash <span class="nt">--login</span>
</code></pre></div></div>

<p>Post-launch authentication: Option A (AI Studio key, no Cloud billing) is <code class="language-plaintext highlighter-rouge">docker exec -it googleagentscli-${USER} agents-cli login</code>. Option B (Google Cloud ADC for production workloads) requires running <code class="language-plaintext highlighter-rouge">gcloud auth application-default login</code> on the host machine. <code class="language-plaintext highlighter-rouge">run.sh</code> mounts <code class="language-plaintext highlighter-rouge">~/.config/gcloud</code> read-only into the container automatically, but only if that directory exists when the script runs, so authenticate on the host before the first launch or recreate the container afterwards.</p>
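<p>Because <code class="language-plaintext highlighter-rouge">run.sh</code> decides about the gcloud mount at the moment it is executed, a slightly stricter check can look for the ADC file itself rather than just the directory. The sketch below prints the mount arguments only when <code class="language-plaintext highlighter-rouge">application_default_credentials.json</code> is present. The function name <code class="language-plaintext highlighter-rouge">gcloud_mount_args</code> is hypothetical, and the container path is the one used in <code class="language-plaintext highlighter-rouge">run.sh</code>.</p>

```shell
#!/usr/bin/env bash
# Sketch: emit the read-only gcloud mount only when ADC credentials already
# exist on the host. Prints one docker argument per line, or nothing.
# Hypothetical helper; the container path mirrors run.sh above.
gcloud_mount_args() {
  local cfg_dir="${1:-$HOME/.config/gcloud}"
  if [ -f "$cfg_dir/application_default_credentials.json" ]; then
    printf '%s\n' "-v" "$cfg_dir:/home/agent/.config/gcloud:ro"
  else
    echo "no ADC file in $cfg_dir; run 'gcloud auth application-default login' on the host first" >&2
  fi
}

# Example (bash 4+):
# mapfile -t GCLOUD_MOUNT < <(gcloud_mount_args)
# docker run ... "${GCLOUD_MOUNT[@]}" ... google-agents-cli:local
```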

<p><strong>Google Workspace CLI</strong> is a containerized <code class="language-plaintext highlighter-rouge">gcloud</code> environment configured with the scopes necessary to drive the Google Workspace APIs programmatically: Gmail, Drive, Calendar, Sheets, and Docs. Authenticate once inside the container; the bind-mounted credentials directory persists across container recreations.</p>

<h3 id="google-workspace-cli-deployment">Google Workspace CLI Deployment</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># googleworkspacecli/run.sh</span>
docker run <span class="nt">--restart</span> no <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">--name</span> gworkspace-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/googleworkspacecli/gcloud:/root/.config/gcloud"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/workspace:/workspace"</span> <span class="se">\</span>
  google/cloud-sdk:slim <span class="se">\</span>
  bash
 
<span class="c"># Inside container (first time only):</span>
<span class="c"># gcloud auth login</span>
<span class="c"># gcloud config set project YOUR_PROJECT_ID</span>
 
<span class="c"># googleworkspacecli/attach.sh</span>
docker start <span class="nt">-ai</span> gworkspace-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span>
</code></pre></div></div>

<p>The authentication separation between the Workspace CLI container and the rest of the stack is intentional: the container that holds the Google Workspace credentials never has access to any other agent’s identity directory. Data flows through the shared workspace volume only.</p>

<h2 id="agent-frameworks-agent-zero-archon-and-mastra">Agent Frameworks: Agent Zero, Archon, and Mastra</h2>

<p>The stack runs three agent frameworks that occupy distinct positions on the spectrum from fully autonomous to fully programmable.</p>

<h3 id="agent-zero">Agent Zero</h3>

<p><a href="https://github.com/frdel/agent-zero">Agent Zero</a> is the most autonomous framework in the stack, designed around the premise that the agent should be able to self-improve its own instructions and tools over the course of a session. Its web UI is published on host port 8081 (container port 80) and exposes a chat interface backed by a hierarchical agent system where the primary agent can spawn specialized subagents. The persistent state in <code class="language-plaintext highlighter-rouge">a0/data/</code> includes the agent’s memory bank, its accumulated tool library, and its evolving system prompt, all of which carry forward across container restarts.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># a0/build.sh</span>
docker pull agent0ai/agent-zero
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/a0/data"</span>
 
<span class="c"># a0/run.sh</span>
docker run <span class="nt">-d</span> <span class="se">\</span>
  <span class="nt">--name</span> <span class="s2">"a0-</span><span class="k">${</span><span class="nv">USER</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">--restart</span> no <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-p</span> 8081:80 <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/a0/data:/a0/usr"</span> <span class="se">\</span>
  agent0ai/agent-zero
 
<span class="c"># a0/attach.sh — tail live logs (Ctrl+C safe; container keeps running)</span>
docker logs <span class="nt">--follow</span> <span class="nt">--timestamps</span> <span class="s2">"a0-</span><span class="k">${</span><span class="nv">USER</span><span class="k">}</span><span class="s2">"</span>
</code></pre></div></div>

<p>Open <code class="language-plaintext highlighter-rouge">http://localhost:8081</code> in a browser after the container starts.</p>
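<p>Since the container starts detached, the UI may not answer immediately. A small polling helper can bridge the gap. This is a generic sketch, not part of the Agent Zero scripts: <code class="language-plaintext highlighter-rouge">wait_until</code> is a hypothetical name, and the attempt count and delay in the example are arbitrary.</p>

```shell
#!/usr/bin/env bash
# Sketch: retry a command until it succeeds or the attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper, not part of the Agent Zero scripts.
wait_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example: poll the published port, then report.
# wait_until 30 2 curl -fsS -o /dev/null http://localhost:8081 \
#   && echo "Agent Zero is up"
```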

<h3 id="archon">Archon</h3>

<p><a href="https://github.com/coleam00/Archon">Archon</a> occupies a meta-level in the stack: it is an agent framework whose purpose is to help build other agent frameworks. Its Streamlit-based UI presents a development environment where I describe the agent I want to build in natural language, and Archon generates the scaffolding, tool definitions, system prompt, and evaluation harness for that agent. Archon-generated agents are configured at generation time to use the LiteLLM endpoint, so they enter the stack already wired to the unified gateway without any post-generation modification.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># archon/build.sh</span>
docker pull ghcr.io/coleam00/archon:latest
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/archon/data/workflows"</span>
 
<span class="c"># archon/data/config.yaml — default model routing</span>
<span class="nb">cat</span> <span class="o">&gt;</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/archon/data/config.yaml"</span> <span class="o">&lt;&lt;</span> <span class="sh">'</span><span class="no">EOF</span><span class="sh">'
assistant: pi
assistants:
  pi:
    provider: openrouter
    model: openrouter/openrouter/free
</span><span class="no">EOF
 
</span><span class="c"># archon/run.sh — ephemeral (--rm); exits after task</span>
<span class="o">[[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="k">${</span><span class="nv">OPENROUTER_API_KEY</span><span class="k">:-}</span><span class="s2">"</span> <span class="o">]]</span> <span class="o">&amp;&amp;</span> <span class="se">\</span>
  <span class="o">{</span> <span class="nb">echo</span> <span class="s2">"ERROR: OPENROUTER_API_KEY not set"</span><span class="p">;</span> <span class="nb">exit </span>1<span class="p">;</span> <span class="o">}</span>
 
docker run <span class="nt">--rm</span> <span class="se">\</span>
  <span class="nt">--name</span> <span class="s2">"archon-</span><span class="k">${</span><span class="nv">USER</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">--user</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">id</span> <span class="nt">-u</span><span class="si">)</span><span class="s2">:</span><span class="si">$(</span><span class="nb">id</span> <span class="nt">-g</span><span class="si">)</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/workspace:/home/bun/.archon/workspaces"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/archon/data:/home/bun/.archon"</span> <span class="se">\</span>
  <span class="nt">-p</span> 3090:3090 <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">OPENROUTER_API_KEY</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">OPENROUTER_API_KEY</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">DEFAULT_AI_ASSISTANT</span><span class="o">=</span>pi <span class="se">\</span>
  ghcr.io/coleam00/archon:latest workflow list
</code></pre></div></div>

<h3 id="mastra">Mastra</h3>

<p><a href="https://mastra.ai">Mastra</a> is a TypeScript-based AI agent framework that runs as a Docker Compose service exposing a REST API and a Studio UI on port 4111. It stores conversation history in a LibSQL database file on a bind-mounted data directory, so agent memory survives container restarts. Mastra occupies the programmable end of the spectrum: rather than autonomous self-direction, it provides a typed API for defining agents, tools, workflows, and memory retrievers in TypeScript.</p>

<p>The Mastra image is a multi-stage build. Its start command loads <code class="language-plaintext highlighter-rouge">instrumentation.mjs</code> only when the build has produced that file, since its presence varies across Mastra versions.</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># mastra/Dockerfile (multi-stage)</span>
<span class="k">FROM</span><span class="w"> </span><span class="s">node:22-alpine</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="s">builder</span>
<span class="k">WORKDIR</span><span class="s"> /app</span>
<span class="k">RUN </span>apk add <span class="nt">--no-cache</span> gcompat
<span class="k">COPY</span><span class="s"> package*.json ./</span>
<span class="k">RUN </span>npm <span class="nb">install</span>
<span class="k">COPY</span><span class="s"> tsconfig*.json ./</span>
<span class="k">COPY</span><span class="s"> src ./src</span>
<span class="k">RUN </span>npx mastra build
 
<span class="k">FROM</span><span class="w"> </span><span class="s">node:22-alpine</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="s">runner</span>
<span class="k">WORKDIR</span><span class="s"> /app</span>
<span class="k">RUN </span>apk add <span class="nt">--no-cache</span> gcompat wget
<span class="k">RUN </span>addgroup <span class="nt">-g</span> 1001 <span class="nt">-S</span> nodejs <span class="o">&amp;&amp;</span> adduser <span class="nt">-S</span> mastra <span class="nt">-u</span> 1001
<span class="k">COPY</span><span class="s"> --from=builder --chown=mastra:nodejs /app/.mastra/output ./.mastra/output</span>
<span class="k">COPY</span><span class="s"> --from=builder --chown=mastra:nodejs /app/node_modules ./node_modules</span>
<span class="k">COPY</span><span class="s"> --from=builder --chown=mastra:nodejs /app/package.json ./package.json</span>
<span class="k">RUN </span><span class="nb">mkdir</span> <span class="nt">-p</span> /app/data <span class="o">&amp;&amp;</span> <span class="nb">chown </span>mastra:nodejs /app/data
<span class="k">USER</span><span class="s"> mastra</span>
<span class="k">ENV</span><span class="s"> PORT=4111</span>
<span class="k">ENV</span><span class="s"> NODE_ENV=production</span>
<span class="k">ENV</span><span class="s"> DATABASE_URL="file:/app/data/mastra.db"</span>
<span class="k">EXPOSE</span><span class="s"> 4111</span>
<span class="k">HEALTHCHECK</span><span class="s"> --interval=30s --timeout=10s --start-period=20s --retries=3 \</span>
    CMD wget -qO- http://localhost:4111/api &gt; /dev/null || exit 1
<span class="c"># Conditional instrumentation: works across Mastra versions</span>
<span class="k">CMD</span><span class="s"> ["sh", "-c", "if [ -f .mastra/output/instrumentation.mjs ]; then \</span>
  node --import=./.mastra/output/instrumentation.mjs .mastra/output/index.mjs; \
  else node .mastra/output/index.mjs; fi"]
</code></pre></div></div>

<p>The agent definition is deliberately minimal:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// src/mastra/agents/assistant.ts</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">Agent</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">@mastra/core/agent</span><span class="dl">"</span><span class="p">;</span>
 
<span class="k">export</span> <span class="kd">const</span> <span class="nx">assistant</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Agent</span><span class="p">({</span>
  <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">assistant</span><span class="dl">"</span><span class="p">,</span>
  <span class="na">instructions</span><span class="p">:</span> <span class="dl">"</span><span class="s2">You are a helpful, concise, and accurate assistant.</span><span class="dl">"</span><span class="p">,</span>
  <span class="na">model</span><span class="p">:</span> <span class="dl">"</span><span class="s2">openrouter/meta-llama/llama-3.1-8b-instruct:free</span><span class="dl">"</span><span class="p">,</span>
  <span class="c1">// Other free-tier options:</span>
  <span class="c1">// openrouter/mistralai/mistral-7b-instruct:free</span>
  <span class="c1">// openrouter/google/gemma-3-12b-it:free</span>
<span class="p">});</span>
</code></pre></div></div>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># mastra/docker-compose.yml</span>
<span class="na">services</span><span class="pi">:</span>
  <span class="na">mastra</span><span class="pi">:</span>
    <span class="na">build</span><span class="pi">:</span>
      <span class="na">context</span><span class="pi">:</span> <span class="s">.</span>
      <span class="na">dockerfile</span><span class="pi">:</span> <span class="s">Dockerfile</span>
    <span class="na">container_name</span><span class="pi">:</span> <span class="s">mastra-${USER}</span>
    <span class="na">ports</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s2">"</span><span class="s">4111:4111"</span>
    <span class="na">environment</span><span class="pi">:</span>
      <span class="na">OPENROUTER_API_KEY</span><span class="pi">:</span> <span class="s">${OPENROUTER_API_KEY}</span>
      <span class="na">NODE_ENV</span><span class="pi">:</span> <span class="s">production</span>
      <span class="na">DATABASE_URL</span><span class="pi">:</span> <span class="s2">"</span><span class="s">file:/app/data/mastra.db"</span>
    <span class="na">volumes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">${HOME}/agents/mastra/data:/app/data</span>
    <span class="na">restart</span><span class="pi">:</span> <span class="s">unless-stopped</span>
    <span class="na">healthcheck</span><span class="pi">:</span>
      <span class="na">test</span><span class="pi">:</span> <span class="pi">[</span><span class="s2">"</span><span class="s">CMD"</span><span class="pi">,</span> <span class="s2">"</span><span class="s">wget"</span><span class="pi">,</span> <span class="s2">"</span><span class="s">-qO-"</span><span class="pi">,</span> <span class="s2">"</span><span class="s">http://localhost:4111/api"</span><span class="pi">]</span>
      <span class="na">interval</span><span class="pi">:</span> <span class="s">30s</span>
      <span class="na">timeout</span><span class="pi">:</span> <span class="s">10s</span>
      <span class="na">retries</span><span class="pi">:</span> <span class="m">3</span>
      <span class="na">start_period</span><span class="pi">:</span> <span class="s">20s</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># mastra/build.sh — self-contained; creates src files if absent, prompts for key</span>
<span class="nb">cd</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/mastra"</span>
docker compose up <span class="nt">--build</span> <span class="nt">-d</span>
 
<span class="c"># mastra/run.sh — start after reboot without rebuild</span>
<span class="nb">cd</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/mastra"</span> <span class="o">&amp;&amp;</span> docker compose up <span class="nt">-d</span>
 
<span class="c"># mastra/attach.sh — tail live logs</span>
docker logs <span class="nt">--follow</span> <span class="nt">--timestamps</span> <span class="s2">"mastra-</span><span class="k">${</span><span class="nv">USER</span><span class="k">}</span><span class="s2">"</span>
</code></pre></div></div>

<p>The agent server exposes a standard REST endpoint:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-X</span> POST http://localhost:4111/api/agents/assistant/generate <span class="se">\</span>
  <span class="nt">-H</span> <span class="s2">"Content-Type: application/json"</span> <span class="se">\</span>
  <span class="nt">-d</span> <span class="s1">'{"messages": [{"role": "user", "content": "Summarize this document."}]}'</span>
</code></pre></div></div>

<h2 id="open-design-collaborative-canvas-with-embedded-agent">Open Design: Collaborative Canvas with Embedded Agent</h2>

<p><a href="https://github.com/nexu-io/open-design">Open Design</a> is a web-based collaborative design canvas with an embedded pi coding agent, running on port 5173. It is the visual layer of the stack, useful for design work where the agent can read and modify the canvas state directly. The pi identity directory is separate from the main pi tool identity, so Open Design’s agent configuration does not interfere with standalone pi sessions.</p>

<h3 id="open-design-deployment">Open Design Deployment</h3>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># open-design/Dockerfile</span>
<span class="k">FROM</span><span class="s"> node:24-bookworm</span>
 
<span class="k">ARG</span><span class="s"> OPEN_DESIGN_REPO=https://github.com/nexu-io/open-design.git</span>
<span class="k">ARG</span><span class="s"> OPEN_DESIGN_REF=main</span>
 
<span class="k">ENV</span><span class="s"> APP_DIR=/opt/open-design</span>
<span class="k">ENV</span><span class="s"> PNPM_HOME=/root/.local/share/pnpm</span>
<span class="k">ENV</span><span class="s"> PATH=/root/.local/share/pnpm:/usr/local/bin:/usr/local/sbin:/usr/sbin:/usr/bin:/sbin:/bin</span>
<span class="k">ENV</span><span class="s"> PORT=5173</span>
<span class="k">ENV</span><span class="s"> HOST=0.0.0.0</span>
<span class="k">ENV</span><span class="s"> OD_HOST=0.0.0.0</span>
<span class="k">ENV</span><span class="s"> OD_ALLOWED_DEV_ORIGINS=127.0.0.1,localhost</span>
 
<span class="k">RUN </span>apt-get update <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> <span class="se">\
</span>    git ca-certificates curl bash python3 <span class="se">\
</span>    <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span>
 
<span class="k">RUN </span>corepack <span class="nb">enable</span>
<span class="k">RUN </span>npm <span class="nb">install</span> <span class="nt">-g</span> @mariozechner/pi-coding-agent
 
<span class="k">RUN </span>git clone <span class="nt">--branch</span> <span class="s2">"</span><span class="k">${</span><span class="nv">OPEN_DESIGN_REF</span><span class="k">}</span><span class="s2">"</span> <span class="nt">--depth</span> 1 <span class="se">\
</span>    <span class="s2">"</span><span class="k">${</span><span class="nv">OPEN_DESIGN_REPO</span><span class="k">}</span><span class="s2">"</span> <span class="s2">"</span><span class="k">${</span><span class="nv">APP_DIR</span><span class="k">}</span><span class="s2">"</span>
 
<span class="k">WORKDIR</span><span class="s"> ${APP_DIR}</span>
 
<span class="c"># Patch next.config.ts to accept OD_ALLOWED_DEV_ORIGINS from environment</span>
<span class="k">RUN </span>python3 - <span class="o">&lt;&lt;</span> <span class="sh">'</span><span class="no">PY</span><span class="sh">'</span>
<span class="k">from</span><span class="s"> pathlib import Path</span>
p = Path("apps/web/next.config.ts")
s = p.read_text()
old = "allowedDevOrigins: ['127.0.0.1'],"
new = """allowedDevOrigins: (
    process.env.OD_ALLOWED_DEV_ORIGINS
      ? process.env.OD_ALLOWED_DEV_ORIGINS.split(',').map((s) =&gt; s.trim()).filter(Boolean)
      : ['127.0.0.1']
  ),"""
if old not in s:
    raise SystemExit("Could not find allowedDevOrigins line")
p.write_text(s.replace(old, new))
PY
 
<span class="k">RUN </span>corepack pnpm <span class="nt">--version</span> <span class="o">&amp;&amp;</span> pnpm <span class="nb">install</span>
 
<span class="k">EXPOSE</span><span class="s"> 5173</span>
<span class="k">CMD</span><span class="s"> ["pnpm", "tools-dev", "run", "web", "--web-port", "5173"]</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># open-design/build.sh</span>
docker build <span class="nt">-t</span> open-design-pi <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/open-design"</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/open-design/data"</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/open-design/pi"</span>
 
<span class="c"># First-time pi setup (run once to authenticate)</span>
docker run <span class="nt">--rm</span> <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/open-design/pi:/root/.pi"</span> <span class="se">\</span>
  open-design-pi <span class="se">\</span>
  pi
<span class="c"># Inside pi session: /login</span>
 
<span class="c"># open-design/run.sh</span>
<span class="c"># OD_ALLOWED_DEV_ORIGINS must match the IP the browser uses to reach the container</span>
docker run <span class="nt">--rm</span> <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">--name</span> open-design-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> <span class="se">\</span>
  <span class="nt">-e</span> <span class="s2">"OD_ALLOWED_DEV_ORIGINS=YOUR_HOST_IP"</span> <span class="se">\</span>
  <span class="nt">-p</span> 5173:5173 <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/open-design/data:/opt/open-design/.od"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/open-design/pi:/root/.pi"</span> <span class="se">\</span>
  open-design-pi
</code></pre></div></div>

<p>Open the canvas at <code class="language-plaintext highlighter-rouge">http://YOUR_HOST_IP:5173</code>. <code class="language-plaintext highlighter-rouge">OD_ALLOWED_DEV_ORIGINS</code> must match the IP address the browser uses to reach the container. To detect it automatically on a Linux host, substitute <code class="language-plaintext highlighter-rouge">$(hostname -I | awk '{print $1}')</code> for the <code class="language-plaintext highlighter-rouge">YOUR_HOST_IP</code> placeholder.</p>
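
<p>A minimal sketch of that substitution with a fallback, assuming a Linux host where <code class="language-plaintext highlighter-rouge">hostname -I</code> is available. The helper name <code class="language-plaintext highlighter-rouge">first_host_ip</code> and the fallback to <code class="language-plaintext highlighter-rouge">127.0.0.1</code> are my own illustrative assumptions, not part of Open Design:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: print the first address from a space-separated list,
# falling back to 127.0.0.1 when the list is empty.
first_host_ip() {
  set -- $1   # intentionally unquoted: split the argument on whitespace
  printf '%s\n' "${1:-127.0.0.1}"
}

# Usage on a Linux host (hostname -I prints all configured addresses):
#   OD_IP="$(first_host_ip "$(hostname -I)")"
#   docker run --rm -it -e "OD_ALLOWED_DEV_ORIGINS=${OD_IP}" -p 5173:5173 open-design-pi
</code></pre></div></div>

<p>The fallback only helps when the browser runs on the same machine as the container; on a multi-homed host the first address may not be the one the browser actually uses, so it is worth checking the detected value before relying on it.</p>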

<h2 id="docker-volume-architecture-identity-workspace-skills">Docker Volume Architecture: Identity, Workspace, Skills</h2>

<p>One of the more carefully considered design decisions in this stack is the separation of Docker bind mounts into four independent tiers, which I call identity, workspace, skills, and tool data. This separation means that swapping a user identity, adding a skills package, or destroying an experimental container affects only its own tier; the other three are untouched.</p>

<table>
  <thead>
    <tr>
      <th>Tier</th>
      <th>Host Path Pattern</th>
      <th>Container Mount</th>
      <th>Shared?</th>
      <th>Destroyable?</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Identity</td>
      <td><code class="language-plaintext highlighter-rouge">$HOME/agents/{tool}/home</code></td>
      <td>varies per tool</td>
      <td>No</td>
      <td>Backup first</td>
    </tr>
    <tr>
      <td>Workspace</td>
      <td><code class="language-plaintext highlighter-rouge">$HOME/agents/workspace</code></td>
      <td><code class="language-plaintext highlighter-rouge">/workspace</code></td>
      <td>Yes</td>
      <td>No</td>
    </tr>
    <tr>
      <td>Skills</td>
      <td><code class="language-plaintext highlighter-rouge">$HOME/agents/skills/{name}</code></td>
      <td><code class="language-plaintext highlighter-rouge">/app/skills/{name}</code></td>
      <td>Yes</td>
      <td>Yes</td>
    </tr>
    <tr>
      <td>Tool Data</td>
      <td><code class="language-plaintext highlighter-rouge">$HOME/agents/{tool}/data</code></td>
      <td>varies</td>
      <td>No</td>
      <td>Snapshot first</td>
    </tr>
  </tbody>
</table>

<p>The identity hot-swap pattern is simple enough to describe in three commands: stop the container, remove it (the bind-mounted host directories are untouched), and rerun with a different <code class="language-plaintext highlighter-rouge">home/</code> path. The workspace and skills mount points are identical in both invocations. This makes it straightforward to work on the same project files under different API key contexts or with different tool configurations.</p>
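
<p>The three commands can be sketched as a dry-run helper that prints them for review rather than executing them. The function name <code class="language-plaintext highlighter-rouge">hot_swap_identity</code>, the <code class="language-plaintext highlighter-rouge">/home/agent</code> identity mount point, and the example names are assumptions for illustration; the identity mount varies per tool, and the skills mount is omitted for brevity:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical dry-run helper: prints the three hot-swap commands for a
# container name, an image, and the identity directory to switch to.
# Assumes the identity mounts at /home/agent (this varies per tool).
hot_swap_identity() {
  name="$1"; image="$2"; identity_dir="$3"
  printf 'docker stop %s\n' "$name"
  printf 'docker rm %s\n' "$name"
  printf 'docker run -d --name %s -v "%s:/home/agent" -v "%s:/workspace" %s\n' \
    "$name" "$identity_dir" "$HOME/agents/workspace" "$image"
}

# Print the plan; after reviewing it, pipe the output to sh to run it.
hot_swap_identity some-tool some-tool:latest "$HOME/agents/some-tool/home-alt"
</code></pre></div></div>

<p>Only the identity path changes between invocations; the workspace mount is the same string both times, which is what keeps the swap from touching project files.</p>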

<p>Permission repair across containers with differing internal UID/GID values is handled by a disposable Alpine container:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run <span class="nt">--rm</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/some-tool/home:/mnt/target"</span> <span class="se">\</span>
  alpine <span class="se">\</span>
  <span class="nb">chown</span> <span class="nt">-R</span> 1000:1000 /mnt/target
</code></pre></div></div>

<h2 id="container-isolated-tool-invocation">Container-Isolated Tool Invocation</h2>

<p>One of the more practically useful habits I have developed with this stack is running agentic CLI tools inside dedicated Docker containers rather than installing them to my host user environment. The motivation is threefold: environment isolation, workspace scope control, and plugin sandboxing.</p>

<p><strong>Workspace scope control</strong> is where the bind-mount architecture pays its most direct dividend. Rather than giving a tool access to the entire home filesystem, I mount only the specific project directories I want it to operate on:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run <span class="nt">--rm</span> <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">ANTHROPIC_API_KEY</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">ANTHROPIC_API_KEY</span><span class="k">:-</span><span class="nv">YOUR_ANTHROPIC_API_KEY</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/commercial/claude/home:/home/agent"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/projects/project-alpha:/workspace/project-alpha"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/projects/project-beta:/workspace/project-beta"</span> <span class="se">\</span>
  <span class="nt">-w</span> /workspace/project-alpha <span class="se">\</span>
  commercial-ai:latest claude
</code></pre></div></div>

<p>From inside the container, the agent sees its own identity directory and exactly two project directories, and nothing else from the host filesystem. For work involving student data or grant-sensitive materials, this mount-scoping discipline is not optional; it is the architectural enforcement of the data minimization principle.</p>
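
<p>When the list of projects changes often, the mount flags can be generated from the project names instead of typed by hand. This is a minimal sketch, assuming every project lives under <code class="language-plaintext highlighter-rouge">$HOME/projects</code> and that paths contain no spaces; <code class="language-plaintext highlighter-rouge">scoped_mounts</code> is a hypothetical helper name:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: turn a list of project names into bind-mount flags,
# one per project, so nothing outside the list is visible in the container.
# Assumes projects live under $HOME/projects and paths contain no spaces.
scoped_mounts() {
  for project in "$@"; do
    printf ' -v %s/projects/%s:/workspace/%s' "$HOME" "$project" "$project"
  done
  printf '\n'
}

# Print the flags for two projects.
scoped_mounts project-alpha project-beta

# Usage (the unquoted substitution is deliberate, so the flags are word-split):
#   docker run --rm -it $(scoped_mounts project-alpha project-beta) \
#     -w /workspace/project-alpha commercial-ai:latest claude
</code></pre></div></div>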

<p><strong>Plugin sandboxing</strong> makes it practical to evaluate new tools without risk. I can install an untrusted npm package, register a new MCP server, or try an experimental integration inside an ephemeral container with a scratch identity directory, observe its behavior against a scoped workspace mount, and discard the container entirely if I decide against it. The two-stage pattern, scratch evaluation followed by deliberate promotion to the production identity directory, is something the tiered bind-mount architecture makes nearly effortless.</p>

<h2 id="the-extended-ecosystem-containerized-evaluation-of-ai-tools">The Extended Ecosystem: Containerized Evaluation of AI Tools</h2>

<p>The stack described above is not a closed list. One of the structural advantages of the containerized, mount-scoped architecture is that any new AI tool can be evaluated, and subsequently adopted or discarded, without touching host state, leaking credentials, or requiring destructive cleanup. The evaluation pattern is consistent: build a throw-away image that installs the candidate tool and any dependencies it requires; run a container with a scratch identity directory and a scoped workspace mount containing only the project data relevant to the evaluation task; observe behavior against a free or low-cost model routed through LiteLLM; and either promote the configuration to a named identity directory or <code class="language-plaintext highlighter-rouge">docker rmi</code> the image and move on. What follows is a survey of tools I use or actively monitor, each of which fits neatly into this workflow.</p>
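
<p>The evaluation lifecycle above can be sketched as a dry-run plan. Everything named here is an assumption for illustration: the <code class="language-plaintext highlighter-rouge">plan_evaluation</code> helper, the <code class="language-plaintext highlighter-rouge">eval-</code> image prefix, the scratch path under <code class="language-plaintext highlighter-rouge">/tmp</code>, and the <code class="language-plaintext highlighter-rouge">/home/agent</code> mount point. The LiteLLM routing is left out because it depends on each tool's own configuration:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical dry-run helper: prints the evaluation lifecycle for one
# candidate tool. The eval- image prefix, the scratch path, and the
# /home/agent mount point are illustrative assumptions.
plan_evaluation() {
  tool="$1"; project_dir="$2"
  scratch="/tmp/eval-$tool-identity"
  printf 'docker build -t eval-%s "%s/agents/%s"\n' "$tool" "$HOME" "$tool"
  printf 'mkdir -p "%s"\n' "$scratch"
  printf 'docker run --rm -it -v "%s:/home/agent" -v "%s:/workspace" eval-%s\n' \
    "$scratch" "$project_dir" "$tool"
  # The last two lines are printed as comments: pick one after the evaluation.
  printf '# promote: cp -a "%s/." "%s/agents/%s/home/"\n' "$scratch" "$HOME" "$tool"
  printf '# discard: docker rmi eval-%s; rm -rf "%s"\n' "$tool" "$scratch"
}

# Print the plan for one candidate against a single scoped project.
plan_evaluation candidate-tool "$HOME/projects/project-alpha"
</code></pre></div></div>

<p>Printing the plan instead of executing it keeps the promote-or-discard decision explicit: nothing is copied into a named identity directory until I have looked at what the tool wrote to the scratch one.</p>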

<h3 id="fabric">Fabric</h3>

<p><a href="https://github.com/danielmiessler/fabric">Fabric</a> (Go) is an AI augmentation framework built around the concept of patterns, which are markdown-formatted system prompt templates stored in <code class="language-plaintext highlighter-rouge">~/.config/fabric/patterns/</code>. Rather than a conversational agent, Fabric is a UNIX-pipeline-oriented tool: it reads from stdin or a file, applies a named pattern as the system prompt, and writes the model’s response to stdout. This makes it composable with every other shell tool by design. Because Fabric speaks to any OpenAI-compatible endpoint, it integrates with the LiteLLM gateway through a single environment variable, <code class="language-plaintext highlighter-rouge">OPENAI_BASE_URL=http://localhost:4000/v1</code>, and is well-suited to free-model operation for batch summarization, extraction, and classification tasks. The container footprint is minimal: the Go binary, the patterns directory bind-mounted from <code class="language-plaintext highlighter-rouge">~/.config/fabric/patterns</code>, and a workspace volume.</p>

<h3 id="anythingllm">AnythingLLM</h3>

<p><a href="https://github.com/Mintplex-Labs/anything-llm">AnythingLLM</a> is a full-stack application that provides document ingestion, vector storage, and retrieval-augmented generation through a browser-based interface, all self-hostable and all pointing at whatever OpenAI-compatible endpoint you configure. Unlike Open WebUI, which is primarily a chat interface to models you have already pulled, AnythingLLM is organized around workspaces that each maintain their own document corpus and retrieval context. This makes it the most natural comparison point for evaluating RAG pipeline quality, including chunking strategies and retrieval configurations, against the same local models without committing to a production indexing infrastructure. It runs as a single Docker container and connects to LiteLLM without modification.</p>

<h3 id="autogpt">AutoGPT</h3>

<p><a href="https://github.com/Significant-Gravitas/AutoGPT">AutoGPT</a> is one of the original autonomous agent frameworks, now significantly matured into a platform with a visual workflow builder and a marketplace of pre-built agents. Its architectural evolution mirrors the broader field: the early single-agent loop has been replaced by a multi-agent orchestration model where specialized agents collaborate on subtasks. AutoGPT’s containerized deployment is well-documented, and its use of a PostgreSQL backend for persistent agent state means that evaluation sessions survive container restarts. I evaluate AutoGPT primarily for its workflow builder, which provides a visual alternative to writing agent orchestration code by hand, and because its agent marketplace offers a useful inventory of community-developed task patterns.</p>

<h3 id="cli-anything">CLI-Anything</h3>

<p>CLI-Anything is a natural-language shell interface that translates plain-English task descriptions into shell command sequences, explains the commands it generates before executing them, and allows interactive refinement. The evaluation pattern is particularly straightforward: mount a scratch workspace, describe a file manipulation or build task in natural language, and assess the fidelity of the generated commands against the intent. Because CLI-Anything operates at the shell command level rather than the source code level, it is usable with substantially smaller models than coding-focused tools, which makes it a good candidate for free-tier or small local model evaluation.</p>

<h3 id="google-antigravity">Google Antigravity</h3>

<p>Google Antigravity is an experimental framework for rapid agent prototyping that provides a higher-level abstraction layer over the Google Agent Development Kit described above. It is oriented toward fast iteration on multi-agent system designs, with an emphasis on making architectural experiments cheap to run and discard, which maps naturally onto the containerized evaluation philosophy of this stack. I use it alongside the Google Agents CLI container, sharing the same workspace volume but maintaining a separate identity directory so that Antigravity’s experimental state does not contaminate the production ADK configuration.</p>

<h3 id="t3-code">T3 Code</h3>

<p><a href="https://t3.chat">T3 Code</a> is a code generation tool that applies Theo Browne’s T3 stack architectural preferences (TypeScript, tRPC, Tailwind, Prisma) to AI-assisted scaffolding. Its opinionated output is both a strength and a constraint: the generated code is immediately coherent within the T3 ecosystem but requires deliberate adaptation outside it. I evaluate it in a container against a workspace containing a greenfield TypeScript project, routed through LiteLLM, and find it most useful as a rapid scaffolding baseline rather than a continuous coding companion.</p>

<h3 id="paperclip">Paperclip</h3>

<p><a href="https://github.com/paperclip-ai/paperclip">Paperclip</a> is a document-aware coding assistant that maintains a live index of a project’s files and surfaces relevant context into the model’s prompt automatically as the conversation evolves. The document-indexing architecture distinguishes it from tools that rely on the user to provide context explicitly: Paperclip’s retrieval layer operates continuously in the background, which makes it well-suited to exploration of unfamiliar codebases. Containerized evaluation against a read-only mount of a target codebase is a clean pattern: the index lives in the scratch identity directory, the codebase is mounted read-only, and the entire evaluation state is discardable.</p>

<h3 id="open-source-cowork-alternatives-in-the-ecosystem">Open-Source Cowork Alternatives in the Ecosystem</h3>

<p>The tools described in dedicated sections below, including OpenWork, OpenCoworkAI, and Multica, represent the most directly Cowork-comparable options in the ecosystem, but several adjacent tools occupy related positions.</p>

<p><a href="https://accomplish.ai/">Accomplish</a> takes a “BYO-AI” stance: it functions as the orchestration layer (the hands and eyes) and leaves model selection to the user, supporting OpenAI, Anthropic, Google, and Ollama backends without lock-in.</p>

<p><a href="https://github.com/kuse-ai/kuse_cowork">Kuse Cowork</a> implements the agent runtime in Rust with Docker-based sandboxing built into the design, which makes its security boundary properties particularly legible from the perspective of this stack’s threat model.</p>

<h3 id="the-containerization-argument">The Containerization Argument</h3>

<p>The tooling landscape I have described, spanning Fabric, AnythingLLM, AutoGPT, CLI-Anything, Antigravity, T3 Code, Paperclip, and the Cowork-adjacent frameworks, shares a property that makes the containerized evaluation pattern especially productive: nearly all of these tools can connect to a widely used API backend (OpenAI, Anthropic, or a generic OpenAI-compatible endpoint), and nearly all of them can be pointed at free-tier models on OpenRouter or at local Ollama models with no change beyond a base URL. This means that a feasibility evaluation of any tool in the list, assessing whether its interaction model, output quality, and integration characteristics are worth the effort of a production deployment, can be conducted at near-zero cost: no paid API usage, no host-level installation, no persistent state that requires cleanup. The directory structure is created by <code class="language-plaintext highlighter-rouge">build.sh</code>, the container runs against a scoped workspace mount, the evaluation task executes against a free model, and the result determines whether to invest in a full identity directory configuration or to <code class="language-plaintext highlighter-rouge">docker rmi</code> and move on. This evaluation-first discipline is, I would argue, the appropriate epistemological stance toward a tool ecosystem that is evolving faster than any individual practitioner can track.</p>

<h2 id="open-source-cowork-alternatives">Open-Source Cowork Alternatives</h2>

<p>Anthropic’s Claude Cowork launch in January 2026 triggered a vigorous open-source response. I track several of the resulting projects.</p>

<p><a href="https://github.com/different-ai/openwork">OpenWork</a> is the most actively developed, functioning as a control surface for agentic workflows with hot-reloadable skills, session management, and SSE event stream subscriptions. It is ejectable to OpenCode, which provides a meaningful portability guarantee.</p>

<h3 id="openwork-deployment">OpenWork Deployment</h3>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># openwork/Dockerfile</span>
<span class="k">FROM</span><span class="s"> node:22-bookworm-slim</span>
 
<span class="k">ARG</span><span class="s"> OPENWORK_ORCHESTRATOR_VERSION=latest</span>
 
<span class="k">RUN </span>apt-get update <span class="se">\
</span> <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> <span class="se">\
</span>    ca-certificates curl git <span class="nb">tar </span>unzip <span class="se">\
</span> <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> /var/lib/apt/lists/<span class="k">*</span>
 
<span class="k">RUN </span>npm <span class="nb">install</span> <span class="nt">-g</span> <span class="s2">"openwork-orchestrator@</span><span class="k">${</span><span class="nv">OPENWORK_ORCHESTRATOR_VERSION</span><span class="k">}</span><span class="s2">"</span>
 
<span class="k">ENV</span><span class="s"> OPENWORK_DATA_DIR=/data/openwork-orchestrator</span>
<span class="k">ENV</span><span class="s"> OPENWORK_SIDECAR_DIR=/data/sidecars</span>
<span class="k">ENV</span><span class="s"> OPENWORK_WORKSPACE=/workspace</span>
 
<span class="k">EXPOSE</span><span class="s"> 8787</span>
<span class="k">VOLUME</span><span class="s"> ["/workspace", "/data"]</span>
 
<span class="k">CMD</span><span class="s"> ["openwork", "serve", "--workspace", "/workspace", "--remote-access", \</span>
     "--openwork-port", "8787", "--opencode-host", "127.0.0.1", \
     "--opencode-port", "4096", "--connect-host", "127.0.0.1", \
     "--cors", "*", "--approval", "manual", "--no-opencode-router"]
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># openwork/build.sh</span>
docker build <span class="nt">-t</span> openwork:local <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/openwork"</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/openwork/workspace"</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/openwork/data"</span>
 
<span class="c"># openwork/run.sh</span>
docker run <span class="nt">-it</span> <span class="se">\</span>
  <span class="nt">--restart</span> no <span class="se">\</span>
  <span class="nt">--add-host</span><span class="o">=</span>host.docker.internal:host-gateway <span class="se">\</span>
  <span class="nt">-p</span> 8787:8787 <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/openwork/workspace:/workspace"</span> <span class="se">\</span>
  <span class="nt">-v</span> <span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/agents/openwork/data:/data"</span> <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">OPENWORK_TOKEN</span><span class="o">=</span>dev-token <span class="se">\</span>
  <span class="nt">-e</span> <span class="nv">OPENWORK_HOST_TOKEN</span><span class="o">=</span>dev-host-token <span class="se">\</span>
  <span class="nt">--name</span> openwork-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span> <span class="se">\</span>
  openwork:local
 
<span class="c"># openwork/attach.sh</span>
docker start <span class="nt">-ai</span> openwork-<span class="k">${</span><span class="nv">USER</span><span class="k">}</span>
</code></pre></div></div>

<p><a href="https://accomplish.ai/">Accomplish</a> takes a “BYO-AI” stance, functioning as the orchestration layer (hands and eyes) while allowing model selection, supporting OpenAI, Anthropic, Google, and Ollama backends without lock-in. <a href="https://github.com/kuse-ai/kuse_cowork">Kuse Cowork</a> implements the agent runtime in Rust with Docker-based sandboxing. <a href="https://github.com/opencowork-ai/opencowork">OpenCoworkAI</a> explicitly commits to VM/bwrap isolation and checkpoint-rollback capability.</p>

<p><a href="https://github.com/multica-ai/multica">Multica</a> occupies a different conceptual position as a managed agents platform rather than a desktop agent. It assigns issues to AI agents as you would assign them to human teammates, and it implements skill compounding: when an agent completes a task successfully, the solution is saved as a reusable skill that future tasks can leverage. This maps interestingly onto organizational learning theory, though a thoughtful critique in the project’s issue tracker notes that the human-management metaphor may be insufficient for genuinely autonomous AI orchestration at scale. I find this a productive tension worth thinking about seriously.</p>

<h2 id="mcp-servers-extending-the-stack-with-custom-tools">MCP Servers: Extending the Stack with Custom Tools</h2>

<p>The Model Context Protocol (MCP), introduced by Anthropic in late 2024, defines how AI assistants communicate with external tools through four primitive types: tools (callable functions), resources (readable data streams), prompts (reusable templates), and sampling (delegated inference requests). Running MCP servers in Docker and connecting them to the local stack over a shared external network is straightforward.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker network create mcp-shared
</code></pre></div></div>

<p>With this network in place, any container that declares <code class="language-plaintext highlighter-rouge">mcp-shared</code> as an external network can reach an MCP server at its container DNS name, such as <code class="language-plaintext highlighter-rouge">http://mcp-bibliography:8000/mcp</code>, without host port exposure. For backends that use OpenAI function-calling format rather than MCP’s JSON-RPC protocol, a thin Flask adapter service translates between the two on startup:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">json</span>
<span class="kn">import</span> <span class="nn">urllib.request</span>

<span class="c1"># MCP_URL is assumed to be defined elsewhere in the adapter, e.g. "http://mcp-bibliography:8000/mcp"</span>

<span class="k">def</span> <span class="nf">mcp_post</span><span class="p">(</span><span class="n">method</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">params</span><span class="p">:</span> <span class="nb">dict</span><span class="p">,</span> <span class="n">req_id</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">1</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">dict</span><span class="p">:</span>
    <span class="n">payload</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">dumps</span><span class="p">({</span>
        <span class="s">"jsonrpc"</span><span class="p">:</span> <span class="s">"2.0"</span><span class="p">,</span> <span class="s">"id"</span><span class="p">:</span> <span class="n">req_id</span><span class="p">,</span>
        <span class="s">"method"</span><span class="p">:</span> <span class="n">method</span><span class="p">,</span> <span class="s">"params"</span><span class="p">:</span> <span class="n">params</span>
    <span class="p">}).</span><span class="n">encode</span><span class="p">()</span>
    <span class="n">req</span> <span class="o">=</span> <span class="n">urllib</span><span class="p">.</span><span class="n">request</span><span class="p">.</span><span class="n">Request</span><span class="p">(</span>
        <span class="n">MCP_URL</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">payload</span><span class="p">,</span>
        <span class="n">headers</span><span class="o">=</span><span class="p">{</span><span class="s">"Content-Type"</span><span class="p">:</span> <span class="s">"application/json"</span><span class="p">},</span> <span class="n">method</span><span class="o">=</span><span class="s">"POST"</span>
    <span class="p">)</span>
    <span class="k">with</span> <span class="n">urllib</span><span class="p">.</span><span class="n">request</span><span class="p">.</span><span class="n">urlopen</span><span class="p">(</span><span class="n">req</span><span class="p">,</span> <span class="n">timeout</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span> <span class="k">as</span> <span class="n">r</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">r</span><span class="p">.</span><span class="n">read</span><span class="p">())</span>
</code></pre></div></div>
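
<p>The other half of such an adapter is reshaping what the MCP server advertises into what an OpenAI-style backend expects. The following is a minimal sketch, not the adapter's actual code: it assumes the <code class="language-plaintext highlighter-rouge">result</code> object of a <code class="language-plaintext highlighter-rouge">tools/list</code> response has already been fetched, and the function names and the sample tool are hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def mcp_tool_to_openai(tool):
    """Convert one MCP tool descriptor into an OpenAI function-calling tool entry."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # MCP publishes a JSON Schema under "inputSchema"; OpenAI expects it under "parameters".
            "parameters": tool.get("inputSchema", {"type": "object", "properties": {}}),
        },
    }


def mcp_tools_to_openai(tools_list_result):
    """Translate the "result" object of an MCP tools/list response."""
    return [mcp_tool_to_openai(t) for t in tools_list_result.get("tools", [])]


# Hypothetical tools/list result, for illustration only.
sample = {
    "tools": [
        {
            "name": "search_bibliography",
            "description": "Search the bibliography by keyword.",
            "inputSchema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }
    ]
}
openai_tools = mcp_tools_to_openai(sample)
print(openai_tools[0]["function"]["name"])
</code></pre></div></div>

<p>Because both formats describe arguments with JSON Schema, the translation is mostly a matter of renaming keys; the reverse direction (turning a model's function call into a <code class="language-plaintext highlighter-rouge">tools/call</code> request) is left out of this sketch.</p>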

<h2 id="openrouter-a-cloud-model-interface">OpenRouter: A Cloud Model Interface</h2>

<p><a href="https://openrouter.ai">OpenRouter</a> serves as the cloud model interface for tasks I route away from local inference. It exposes multiple model providers through an OpenAI-compatible API surface, so the same client code that talks to the local LiteLLM gateway can talk to OpenRouter without modification. Model identifiers follow a provider-qualified format:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>anthropic/claude-3-opus
openai/gpt-4o
mistralai/mixtral-8x7b
meta-llama/llama-3-70b-instruct
</code></pre></div></div>
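
<p>To illustrate what "without modification" means in practice, here is a minimal sketch of a client that changes only its base URL and model identifier between backends. It builds the request without sending it; the helper names and the placeholder key are hypothetical, and the OpenRouter base URL is assumed to be <code class="language-plaintext highlighter-rouge">https://openrouter.ai/api/v1</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json
import urllib.request

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def split_model_id(model_id):
    """Split a provider-qualified identifier into (provider, model)."""
    provider, _, model = model_id.partition("/")
    if not provider or not model:
        raise ValueError("expected provider/model, got " + repr(model_id))
    return provider, model


def build_chat_request(base_url, api_key, model_id, prompt):
    """Build (but do not send) an OpenAI-style chat completion request."""
    split_model_id(model_id)  # validate the identifier shape early
    body = json.dumps({
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


# Placeholder key for illustration; a real key would come from the environment.
req = build_chat_request(OPENROUTER_BASE_URL, "sk-placeholder", "openai/gpt-4o", "Hello")
print(req.full_url)
</code></pre></div></div>

<p>Pointing the same function at a local gateway would only require a different <code class="language-plaintext highlighter-rouge">base_url</code> and a model name that the gateway recognizes.</p>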

<p>The free-tier model list changes, but as of early 2026 includes Gemini 2.5 Pro, DeepSeek Chat v3.5, and LLaMA 4 Maverick. I use OpenRouter as a fallback in the Mastra agent server, as the primary cloud provider for pi, and as the escalation path from GNHF when a batch task exceeds local model capability. All API key management is handled through environment variables; no key is ever written into a container image.</p>

<h2 id="reproducibility-the-one-shot-deploy-script">Reproducibility: The One-Shot Deploy Script</h2>

<p>The entire stack, including all Dockerfiles, helper scripts, LiteLLM configuration, Mastra project files, and the directory tree, is generated by a single bash script called <code class="language-plaintext highlighter-rouge">deploy-agents.sh</code>. Running this script on a new machine, after providing the necessary API keys, produces a fully functional environment covering all the services described above with no manual steps (except placing a Dockerfile in <code class="language-plaintext highlighter-rouge">gnhf/</code> for the task-bounded harness).</p>

<p>The script follows a fixed sequence: create the directory tree; write Dockerfiles for each custom image; write LiteLLM, LocalAI, and CCR configs; write Mastra project files; write GNHF task modules; write Agent Zero and Archon startup scripts; build custom images; clone and build OpenClaude; pull pre-built images; and finally start all daemon containers. This design means the script itself is the documentation, and any configuration drift between machines is detected by diffing the generated files against a known-good reference.</p>
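
<p>The drift check can be sketched in a few lines. This is an illustration under stated assumptions rather than part of <code class="language-plaintext highlighter-rouge">deploy-agents.sh</code>: it assumes a known-good copy of the generated tree is kept in a separate directory, and the function names and demonstration files are hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import hashlib
import tempfile
from pathlib import Path


def file_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_drift(reference_dir, generated_dir):
    """Return files that are missing, unexpected, or changed relative to the reference."""
    ref = file_digests(reference_dir)
    gen = file_digests(generated_dir)
    return {
        "missing": sorted(set(ref) - set(gen)),
        "unexpected": sorted(set(gen) - set(ref)),
        "changed": sorted(k for k in set(ref).intersection(gen) if ref[k] != gen[k]),
    }


# Demonstration with two throwaway trees; the file names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    ref_dir, gen_dir = Path(tmp, "reference"), Path(tmp, "generated")
    for d in (ref_dir, gen_dir):
        d.mkdir()
        (d / "run.sh").write_text("docker run openwork:local\n")
    (gen_dir / "run.sh").write_text("docker run openwork:edited\n")
    (ref_dir / "Dockerfile").write_text("FROM node:22-bookworm-slim\n")
    drift = find_drift(ref_dir, gen_dir)
print(drift)
</code></pre></div></div>

<p>Comparing digests rather than running a textual diff only answers whether a file drifted; a follow-up <code class="language-plaintext highlighter-rouge">diff</code> on the flagged paths shows what changed.</p>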

<p>Publishing a custom Docker image to GitHub Container Registry follows the same minimal GitHub Actions pattern:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">docker/login-action@v3</span>
  <span class="na">with</span><span class="pi">:</span>
    <span class="na">registry</span><span class="pi">:</span> <span class="s">ghcr.io</span>
    <span class="na">username</span><span class="pi">:</span> <span class="s">${{ github.actor }}</span>
    <span class="na">password</span><span class="pi">:</span> <span class="s">${{ secrets.GITHUB_TOKEN }}</span>
 
<span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">docker/build-push-action@v6</span>
  <span class="na">with</span><span class="pi">:</span>
    <span class="na">context</span><span class="pi">:</span> <span class="s">.</span>
    <span class="na">push</span><span class="pi">:</span> <span class="no">true</span>
    <span class="na">tags</span><span class="pi">:</span> <span class="s">$</span>
</code></pre></div></div>

<p>After the workflow runs, making the resulting package public in the GitHub package settings allows anyone to <code class="language-plaintext highlighter-rouge">docker pull ghcr.io/yourusername/yourrepo:main</code> without further configuration.</p>

<h2 id="model-selection-notes">Model Selection Notes</h2>

<p>A few observations on local model selection from operational experience. For the think-heavy routing slot I use <code class="language-plaintext highlighter-rouge">gemma4:e2b</code>; for lightweight background tasks such as file summarization and classification I use <code class="language-plaintext highlighter-rouge">qwen2.5:1.5b</code>; for tool-calling workflows in Open WebUI and via the Hermes agent identity I use <code class="language-plaintext highlighter-rouge">hermes3:8b</code>; for general interactive sessions I use <code class="language-plaintext highlighter-rouge">llama3</code>. The selection criteria are principally RAM footprint and whether the model’s function-calling format is well-supported by the tool in question.</p>

<hr />]]></content><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><category term="ai" /><category term="technical" /><category term="docker" /><category term="llm" /><category term="agents" /><summary type="html"><![CDATA[For the past several years I have been thinking carefully about what it means to run AI infrastructure that I actually own, control, and understand from the ground up. The rapid proliferation of frontier model APIs, agentic coding tools, and open-weight model releases in 2025-2026 finally made this tractable at a price and complexity point that a single person could manage. This post documents the architecture I settled on: a self-hosted, Docker-based stack running on a mini PC, unified by a single OpenAI-compatible model gateway, and surfaced through a collection of local inference servers, agentic CLI tools, autonomous agent frameworks, open-source Cowork alternatives, and a task-bounded command harness built around structured queues. My goals were privacy, sovereignty, reproducibility, and the ability to swap components without rebuilding everything from scratch.]]></summary></entry><entry><title type="html">Setting Up AREDN on a Mikrotik hAp to use 44net</title><link href="https://www.billmongan.com/posts/2025/06/aredn44/" rel="alternate" type="text/html" title="Setting Up AREDN on a Mikrotik hAp to use 44net" /><published>2025-06-30T00:00:00+00:00</published><updated>2025-06-30T00:00:00+00:00</updated><id>https://www.billmongan.com/posts/2025/06/aredn44</id><content type="html" xml:base="https://www.billmongan.com/posts/2025/06/aredn44/"><![CDATA[<p>This guide will walk you through setting up a Mikrotik hAp device (I used a hAp ac2) to use 44net addresses, bridging AREDN and 44net services between the two networks.  I set up the hAp to broadcast a WiFi hotspot SSID that, when connected to a client, enables access to both 44net and to AREDN resources simultaneously.  
I use <a href="https://connect.44net.cloud">44net Connect</a> (formerly 44net.cloud) to route a network allocation they assigned to me through a Wireguard tunnel that they also assigned.  The tunnel can be configured through their portal to route to the network.  It is likely also possible to do this by decapsulating the ipencap packets on the Raspberry Pi directly and using a traditional 44net subnet allocation, but this setup lets me take the hAp to mobile deployments without worrying about the NAT configuration or my ability to forward ipencap traffic at my destination.</p>

<p>I took the following steps for this setup:</p>

<ol>
  <li>Obtain and use a 44net IP allocation via 44net Connect</li>
  <li>Set up a Raspberry Pi Zero 2 W as a WireGuard gateway router, including NAT and <code class="language-plaintext highlighter-rouge">iptables</code> firewall with <code class="language-plaintext highlighter-rouge">fail2ban</code></li>
  <li>Connect AREDN node (e.g., hAp2) via the 44Net tunnel</li>
  <li>Write a script to configure the routing table on the AREDN node, in case it is not persisted on router reboot</li>
</ol>

<h2 id="set-up-a-44net-allocation-and-tunnel-from-44net-connect">Set up a 44net Allocation and Tunnel from 44net Connect</h2>

<p>This step could be replaced with your existing 44net allocation, but, as I mentioned, I chose to work through the 44net Connect tunnel service and network allocation generously offered by their service.  Although I have traditional 44net subnets allocated for fixed use, I felt that the tunneled approach would give me some versatility in the deployment behind a variety of WAN connections.  Because I have an interest in computing and networking education, the ability to pack up and redeploy this setup in a mobile environment was advantageous for me.</p>

<p>Go to <a href="https://connect.44net.cloud">https://connect.44net.cloud</a> and create an account</p>

<p>Request a routed subnet (e.g., <code class="language-plaintext highlighter-rouge">44.x.y.z/29</code>) and a WireGuard tunnel</p>

<p>You’ll receive a:</p>

<ul>
  <li>Tunnel endpoint (e.g., <code class="language-plaintext highlighter-rouge">a.b.c.d:44000</code>)</li>
  <li>Allocated IP (e.g., <code class="language-plaintext highlighter-rouge">44.i.j.k/32</code>)</li>
  <li>Route for your subnet via that tunnel by editing the tunnel allocation and specifying static routing to the subnet</li>
</ul>

<h2 id="configure-a-raspberry-pi-as-a-gateway-to-44net-connect-for-the-aredn-router">Configure a Raspberry Pi as a Gateway to 44net Connect for the AREDN Router</h2>

<p>Although the router should generally support it, the AREDN firmware does not allow making a non-AREDN Wireguard connection, such as the one I am using to connect to 44net Connect in order to route this network allocation (at least as far as I’m aware at the time of this writing!).  Instead, I’ve set up a Raspberry Pi Zero 2 W to serve this purpose.  I’ll connect the Pi to the AREDN router via its WiFi LAN hotspot connection, give the Pi an IP address on the 44net subnet allocation alongside the router, and set it up for IPv4 forwarding.</p>

<h3 id="configure-the-wireguard-tunnel">Configure the Wireguard Tunnel</h3>

<h4 id="install-wireguard">Install WireGuard</h4>

<p>Here, we create the Wireguard tunnel connection to 44net Connect from the Raspberry Pi, and configure the IP routes on the pi to send 44net traffic out via this Wireguard interface.  In addition, we route our local 44net subnet out via our LAN connection, so that it goes directly to the AREDN router.  In my case, I’m connected to the AREDN router via a wifi Part 15 connection, so I use <code class="language-plaintext highlighter-rouge">wlan0</code> for this interface.  If you plug the pi into the AREDN router via Ethernet, you might use <code class="language-plaintext highlighter-rouge">eth0</code> here instead.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt update
sudo apt install wireguard netfilter-persistent iptables-persistent resolvconf sshpass
</code></pre></div></div>

<p>Create <code class="language-plaintext highlighter-rouge">/etc/wireguard/wg0.conf</code>, using your 44net subnet allocation for <code class="language-plaintext highlighter-rouge">44.x.y.z/29</code>, your 44net Connect Wireguard tunnel keys, and substituting <code class="language-plaintext highlighter-rouge">wlan0</code> for the LAN interface you’re using from the pi to the AREDN node, as well as <code class="language-plaintext highlighter-rouge">44.a.b.c</code> for your wireguard tunnel client address and <code class="language-plaintext highlighter-rouge">a.b.c.d:44000</code> for your tunnel endpoint:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Interface]
PrivateKey = &lt;local private key&gt;
Address = 44.a.b.c
DNS = 1.1.1.1,1.0.0.1

PostUp = ip route add 44.x.y.z/29 dev wlan0
PostDown = ip route del 44.x.y.z/29 dev wlan0

[Peer]
PublicKey = &lt;peer public key&gt;
PresharedKey = &lt;preshared key&gt;
Endpoint = a.b.c.d:44000
AllowedIPs = 44.0.0.0/9, 44.128.0.0/10
PersistentKeepalive = 20
</code></pre></div></div>
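
<p>To make the <code class="language-plaintext highlighter-rouge">AllowedIPs</code> line concrete, the following sketch checks which destinations fall inside the two blocks and would therefore be sent through <code class="language-plaintext highlighter-rouge">wg0</code>. The sample addresses are arbitrary and the function name is hypothetical; the sketch also ignores the more specific <code class="language-plaintext highlighter-rouge">/29</code> route added by <code class="language-plaintext highlighter-rouge">PostUp</code>, which should take precedence for the local subnet.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import ipaddress

# The two AllowedIPs blocks from the [Peer] section above.
ALLOWED_IPS = [ipaddress.ip_network("44.0.0.0/9"), ipaddress.ip_network("44.128.0.0/10")]


def goes_through_tunnel(address):
    """Return True if the destination falls inside one of the AllowedIPs blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in ALLOWED_IPS)


for addr in ("44.1.2.3", "44.127.255.255", "44.160.0.1", "44.192.0.1", "8.8.8.8"):
    print(addr, goes_through_tunnel(addr))
</code></pre></div></div>

<p>Together the two blocks span <code class="language-plaintext highlighter-rouge">44.0.0.0</code> through <code class="language-plaintext highlighter-rouge">44.191.255.255</code>, so an address such as <code class="language-plaintext highlighter-rouge">44.192.0.1</code> is not routed through the tunnel.</p>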

<h4 id="enable-and-start-wireguard">Enable and Start WireGuard</h4>

<p>Create <code class="language-plaintext highlighter-rouge">/etc/systemd/system/wg-44net-client.service</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Unit]
Description=WireGuard Client to 44net (wg0)
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/bin/wg-quick up wg0
ExecStop=/usr/bin/wg-quick down wg0
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
</code></pre></div></div>

<h4 id="enable-the-service-for-automatic-startup">Enable the Service for Automatic Startup</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo systemctl enable wg-44net-client
sudo systemctl start wg-44net-client
</code></pre></div></div>

<h3 id="set-static-ip-enable-ip-forwarding-and-disable-rp-filtering">Set Static IP, Enable IP Forwarding, and Disable RP Filtering</h3>

<p>Set the AREDN router (hap2) to give a static IP to the pi.  For example, if your 44net subnet is <code class="language-plaintext highlighter-rouge">44.a.b.32/29</code>, then the AREDN router might take <code class="language-plaintext highlighter-rouge">44.a.b.33/29</code> and the pi could be assigned <code class="language-plaintext highlighter-rouge">44.a.b.34/29</code> as a DHCP reservation from the AREDN router.</p>
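
<p>As a quick sanity check on the addressing plan, this sketch lists the usable hosts in a <code class="language-plaintext highlighter-rouge">/29</code>. The subnet <code class="language-plaintext highlighter-rouge">44.0.0.32/29</code> is a made-up stand-in for <code class="language-plaintext highlighter-rouge">44.a.b.32/29</code>; substitute your own allocation.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import ipaddress

# Hypothetical allocation standing in for 44.a.b.32/29.
subnet = ipaddress.ip_network("44.0.0.32/29")
hosts = list(subnet.hosts())

print("network:  ", subnet.network_address)    # 44.0.0.32
print("router:   ", hosts[0])                   # 44.0.0.33
print("pi:       ", hosts[1])                   # 44.0.0.34
print("remaining:", [str(h) for h in hosts[2:]])
print("broadcast:", subnet.broadcast_address)  # 44.0.0.39
</code></pre></div></div>

<p>A <code class="language-plaintext highlighter-rouge">/29</code> holds eight addresses, six of which are usable, so after the router and the Pi there are four left for LAN clients.</p>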

<p>To enable IPv4 forwarding, edit <code class="language-plaintext highlighter-rouge">/etc/sysctl.conf</code> and add or uncomment the following line:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>net.ipv4.ip_forward=1
</code></pre></div></div>

<p>Apply the changes:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo sysctl -p
</code></pre></div></div>

<p>If you find that <code class="language-plaintext highlighter-rouge">ip_forward</code> is still <code class="language-plaintext highlighter-rouge">0</code> when you execute <code class="language-plaintext highlighter-rouge">cat /proc/sys/net/ipv4/ip_forward</code> on the AREDN node or on the Pi, update it by running: <code class="language-plaintext highlighter-rouge">echo 1 &gt; /proc/sys/net/ipv4/ip_forward</code>.</p>

<p>On both the AREDN router, and on the Pi, ensure that IPv4 forwarding is enabled, and that RP filtering is disabled, by running:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd /proc/sys/net/ipv4/conf
find . -iname 'rp_filter' -exec cat {} \;
</code></pre></div></div>

<p>If you find any that are set to <code class="language-plaintext highlighter-rouge">1</code> or <code class="language-plaintext highlighter-rouge">2</code>, update them via: <code class="language-plaintext highlighter-rouge">echo 0 &gt; /proc/sys/net/ipv4/conf/&lt;name&gt;/rp_filter</code>.</p>
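
<p>Because the <code class="language-plaintext highlighter-rouge">find</code> command above prints values without interface names, a short report can be easier to act on. The following is a sketch intended for the Pi (the AREDN node may not have Python available); the function names are hypothetical, and the paths are parameters so the logic can be exercised against a test directory.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from pathlib import Path


def nonzero_rp_filters(conf_dir="/proc/sys/net/ipv4/conf"):
    """Return {interface: value} for every interface whose rp_filter is not 0."""
    found = {}
    for entry in sorted(Path(conf_dir).glob("*/rp_filter")):
        value = entry.read_text().strip()
        if value != "0":
            found[entry.parent.name] = value
    return found


def ip_forward_enabled(path="/proc/sys/net/ipv4/ip_forward"):
    """Return True when the kernel reports IPv4 forwarding as enabled."""
    return Path(path).read_text().strip() == "1"
</code></pre></div></div>

<p>Calling <code class="language-plaintext highlighter-rouge">nonzero_rp_filters()</code> with no arguments reads the live kernel settings; each interface it reports is a candidate for the <code class="language-plaintext highlighter-rouge">echo 0</code> command above.</p>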

<h3 id="set-up-nat-and-firewall-on-the-pi">Set up NAT and Firewall on the Pi</h3>

<p>Note that this is a minimal configuration of the firewall to enable the functionality, and should be hardened to filter traffic from the Internet.  This firewall is configured to allow all traffic from AREDN (<code class="language-plaintext highlighter-rouge">10.0.0.0/8</code>) and 44net (<code class="language-plaintext highlighter-rouge">44.0.0.0/9</code> and <code class="language-plaintext highlighter-rouge">44.128.0.0/10</code>), as well as bidirectional AREDN Wireguard tunnels on UDP port <code class="language-plaintext highlighter-rouge">5525</code>.  We also allow ICMP ping packets, and any established connection packets.  Finally, we masquerade traffic on the 44net Connect tunnel Wireguard interface <code class="language-plaintext highlighter-rouge">wg0</code>.  I do not explicitly allow TCP port 22 SSH traffic from the Internet, but rather allow ssh connections via the AREDN and 44net subnets (this is also overly permissive, but is presented this way to demonstrate the functionality).</p>

<p>Edit <code class="language-plaintext highlighter-rouge">/etc/iptables/rules.v4</code>, substituting your 44net subnet allocation for <code class="language-plaintext highlighter-rouge">44.x.y.z/29</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]

-A POSTROUTING -s 44.x.y.z/29 -o wg0 -j MASQUERADE
COMMIT

*filter
:INPUT DROP [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]

# Allow loopback
-A INPUT -i lo -j ACCEPT

# Allow established and related
-A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Allow ALL traffic from 44.0.0.0/9
-A INPUT -s 44.0.0.0/9 -j ACCEPT

# Allow ALL traffic from 44.128.0.0/10
-A INPUT -s 44.128.0.0/10 -j ACCEPT

# Allow ALL traffic from 10.0.0.0/8
-A INPUT -s 10.0.0.0/8 -j ACCEPT

# Allow AREDN tunnels
-A INPUT -p udp --dport 5525 -j ACCEPT
-A INPUT -p udp --sport 5525 -j ACCEPT

# Optional: Allow ICMP for ping and diagnostics
-A INPUT -p icmp -j ACCEPT

# Forwarding rules: Send AREDN Wireguard traffic from 44net Connect to the AREDN router
-A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
COMMIT
</code></pre></div></div>

<p>Load and save these rules:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo iptables-restore &lt; /etc/iptables/rules.v4
sudo netfilter-persistent save
</code></pre></div></div>

<h3 id="implement-fail2ban-on-ssh">Implement fail2ban on SSH</h3>

<p>Install <code class="language-plaintext highlighter-rouge">fail2ban</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt install fail2ban
</code></pre></div></div>

<p>Create <code class="language-plaintext highlighter-rouge">/etc/fail2ban/jail.local</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[sshd]
enabled = true
backend = systemd
logpath = journal
port = ssh
maxretry = 5
findtime = 600
bantime = 86400
</code></pre></div></div>

<p>Start and Enable <code class="language-plaintext highlighter-rouge">fail2ban</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo systemctl enable fail2ban
sudo systemctl start fail2ban
</code></pre></div></div>

<h2 id="configure-the-aredn-node-router">Configure the AREDN Node Router</h2>

<p>Configure the node with a 44net allocation for its local LAN, and specify its IP address (for example, <code class="language-plaintext highlighter-rouge">44.x.y.33/29</code> for an allocation of <code class="language-plaintext highlighter-rouge">44.x.y.32/29</code>).  I used <code class="language-plaintext highlighter-rouge">wlan0</code> as a LAN hotspot so that my devices could connect to this router wirelessly over 2.4 GHz, and <code class="language-plaintext highlighter-rouge">wlan1</code> as a WAN client to my home wireless internet connection.</p>

<p>I also created a tunnel server on my home AREDN router node, to which this new 44net AREDN node connects.  Because my home AREDN node is exposed with a 44net IP address on a different subnet, the two are able to reach each other via the 44net Connect Wireguard tunnel via the Raspberry pi.  So, I used the 44net address of my home AREDN router as the public IP from which this new node connects.</p>

<p>Finally, I allow WAN access to my LAN nodes (and provide a default route) through the network settings on the main page of the GUI.</p>

<h2 id="configure-the-routing-table-on-the-aredn-router">Configure the Routing Table on the AREDN Router</h2>

<p>Now, we will configure the AREDN router to route 44net traffic out via the Raspberry Pi, since the pi has a Wireguard connection to the 44net Connect tunnel.</p>

<p>Custom routes on the AREDN router are overwritten on reboot (as far as I can tell).  In addition, the home directory is <code class="language-plaintext highlighter-rouge">/tmp</code>, so I was unable to use ssh key login to script the creation of these routes via ssh when the Raspberry pi boots up.</p>

<p>But, for good measure, I created the following startup script on the AREDN node after logging in via <code class="language-plaintext highlighter-rouge">ssh root@localnode.local.mesh -p 2222</code> and creating a file <code class="language-plaintext highlighter-rouge">/etc/init.d/customroutes</code>.  Substitute <code class="language-plaintext highlighter-rouge">44.x.y.z/29</code> for your 44net subnet allocation, and <code class="language-plaintext highlighter-rouge">44.x.y.A</code> for the reserved DHCP address of your Raspberry Pi that you assigned earlier from your AREDN node out of your 44net subnet allocation.  The script also adds a route for your tunnel endpoint <code class="language-plaintext highlighter-rouge">44.a.b.c</code>, in case it falls within one of the routing blocks like <code class="language-plaintext highlighter-rouge">44.0.0.0/9</code>, so that traffic to it does not go out over the same interface but instead over your WAN default route gateway (which we detect with the <code class="language-plaintext highlighter-rouge">get_wlan1_gw</code> function below) on <code class="language-plaintext highlighter-rouge">wlan1</code>.  Use your WAN interface in place of <code class="language-plaintext highlighter-rouge">wlan1</code> everywhere here:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh /etc/rc.common

START=95
STOP=10

get_wlan1_gw() {
    # Output gateway for default route on wlan1, or empty string if not found
    set -- $(ip -4 route show default dev wlan1 2&gt;/dev/null)
    [ "$1" = "default" ] &amp;&amp; [ "$2" = "via" ] &amp;&amp; echo "$3"
}

start() {
    logger -t customroutes "Adding custom IP routes"
    sleep 5
    ip rule add to 44.0.0.0/9 lookup main priority 5 
    ip rule add to 44.128.0.0/10 lookup main priority 6 
    ip route add 44.0.0.0/9 via 44.x.y.A
    ip route add 44.128.0.0/10 via 44.x.y.A
    ip route add 44.x.y.z/29 dev br-lan
    
    gw="$(get_wlan1_gw)"
    ip route add 44.a.b.c via "$gw" dev wlan1 
    
    logger -t customroutes "Custom IP routes added"
}

stop() {
    logger -t customroutes "Removing custom IP routes"
    ip rule del to 44.0.0.0/9 lookup main priority 5 
    ip rule del to 44.128.0.0/10 lookup main priority 6     
    ip route del 44.0.0.0/9 via 44.x.y.A
    ip route del 44.128.0.0/10 via 44.x.y.A
    ip route del 44.x.y.z/29 dev br-lan
    
    gw="$(get_wlan1_gw)"
    ip route del 44.a.b.c via "$gw" dev wlan1 
    
    logger -t customroutes "Custom IP routes removed"
}
</code></pre></div></div>

<p>Inside the <code class="language-plaintext highlighter-rouge">start()</code> function, add these lines as needed from the IP forwarding and RP filtering step above:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>echo 0 &gt; /proc/sys/net/ipv4/conf/&lt;name&gt;/rp_filter
echo 1 &gt; /proc/sys/net/ipv4/ip_forward 
</code></pre></div></div>

<p>Enable the service to run at startup on the AREDN node (although the node might not respect it):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chmod +x /etc/init.d/customroutes
/etc/init.d/customroutes enable
</code></pre></div></div>

<h3 id="scripting-this-from-the-rasperry-pi">Scripting This from the Raspberry Pi</h3>

<p>In case the AREDN router fails to execute this on startup, I also made a script on the Raspberry Pi to execute these <code class="language-plaintext highlighter-rouge">ip route</code> commands via ssh.</p>

<p>On the pi, create a file <code class="language-plaintext highlighter-rouge">/usr/local/bin/setup-mesh-routes.sh</code> with the same commands and addresses as above:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash

# Define remote host and port
REMOTE_HOST="root@localnode.local.mesh"
REMOTE_PORT=2222

# Run the routing commands over SSH
ssh -p $REMOTE_PORT $REMOTE_HOST &lt;&lt; 'EOF'
ip rule add to 44.0.0.0/9 lookup main priority 5 || true
ip rule add to 44.128.0.0/10 lookup main priority 6 || true
ip route add 44.0.0.0/9 via 44.x.y.A || true
ip route add 44.128.0.0/10 via 44.x.y.A || true
ip route add 44.x.y.z/29 dev br-lan || true

set -- $(ip -4 route show default dev wlan1)
gw="$3"
ip route add 44.a.b.c via "$gw" dev wlan1 || true
EOF
</code></pre></div></div>

<p>Similarly, add these lines as needed for the IP forwarding and RP filtering steps from earlier:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>echo 0 &gt; /proc/sys/net/ipv4/conf/&lt;name&gt;/rp_filter
echo 1 &gt; /proc/sys/net/ipv4/ip_forward 
</code></pre></div></div>

<p>Executing <code class="language-plaintext highlighter-rouge">/usr/local/bin/setup-mesh-routes.sh</code> from the Raspberry Pi will set up the routes.  The <code class="language-plaintext highlighter-rouge">|| true</code> at the end of each line lets the script continue and exit successfully even if a rule or route already exists on the router.</p>
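<p>To see why <code class="language-plaintext highlighter-rouge">|| true</code> keeps a failing step from producing a failing exit status, here is a small Python sketch (not part of the setup itself) that runs two shell one-liners locally and compares their exit codes.  It assumes a POSIX <code class="language-plaintext highlighter-rouge">sh</code> is available, and it uses <code class="language-plaintext highlighter-rouge">false</code> as a stand-in for an <code class="language-plaintext highlighter-rouge">ip route add</code> that fails because the route already exists.</p>

```python
import subprocess


def exit_code(command: str) -> int:
    """Run a command line through sh and return its exit status."""
    return subprocess.run(["sh", "-c", command]).returncode


# "false" always fails, standing in for a route that already exists.
failing = exit_code("false")
# Appending "|| true" turns that failure into a success.
tolerated = exit_code("false || true")

print(f"without '|| true': {failing}")
print(f"with '|| true': {tolerated}")
```

<p>The first command reports a non-zero status and the second reports zero, which is what allows the script to be re-run safely.</p>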

<h4 id="automating-the-script-to-run-at-startup">Automating the Script to Run at Startup</h4>

<p>If the AREDN node could retain ssh keys for logging in, I could fully automate this by enabling a startup service on the Raspberry Pi to execute this script whenever the wireless LAN becomes available.  However, since the login requires a password, I ssh into the Pi and run this script myself.  If you are willing to store your AREDN root password on your Pi, the script can pass it to <code class="language-plaintext highlighter-rouge">ssh</code> for you.  The <code class="language-plaintext highlighter-rouge">sshpass</code> program does exactly this, and you can replace the line <code class="language-plaintext highlighter-rouge">ssh -p $REMOTE_PORT $REMOTE_HOST &lt;&lt; 'EOF'</code> with:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no -p $REMOTE_PORT $REMOTE_HOST &lt;&lt; 'EOF' 
</code></pre></div></div>

<p>This assumes you’ve set the <code class="language-plaintext highlighter-rouge">PASSWORD</code> variable in the script, read it from a file, or exported it from the outside environment.</p>

<p>In that case, this can be established as a startup service on the pi:</p>

<p>Create the file <code class="language-plaintext highlighter-rouge">/etc/systemd/system/aredn-routing.service</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Unit]
Description=Configure IP routes for AREDN on boot
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/setup-mesh-routes.sh
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
</code></pre></div></div>

<p>Enable it to run at startup via:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo systemctl daemon-reexec
sudo systemctl daemon-reload
sudo systemctl enable aredn-routing.service
sudo systemctl start aredn-routing.service
</code></pre></div></div>]]></content><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><category term="hamradio" /><summary type="html"><![CDATA[This guide will walk you through setting up a Mikrotik hAp device (I used a hAp ac2) to use 44net addresses, bridging AREDN and 44net services between the two networks. I set up the hAp to broadcast a WiFi hotspot SSID that, when connected to a client, enables access to both 44net and to AREDN resources simultaneously. I use 44net Connect (formerly 44net.cloud) to route a network allocation they assigned to me through a Wireguard tunnel that they also assigned. The tunnel can be configured through their portal to route to the network. It is likely also possible to do this by decapsulating the ipencap packets from the raspberry pi directly, and using a traditional 44net subnet allocation, but this setup enables me to take the hAp setup to mobile deployments, without worrying about the NAT configuration or my ability to forward ipencap traffic at my destination.]]></summary></entry><entry><title type="html">Setting Up 44net on a Mikrotik Using 44net.cloud for ipencap Forwarding</title><link href="https://www.billmongan.com/posts/2025/06/44net/" rel="alternate" type="text/html" title="Setting Up 44net on a Mikrotik Using 44net.cloud for ipencap Forwarding" /><published>2025-06-28T00:00:00+00:00</published><updated>2025-06-28T00:00:00+00:00</updated><id>https://www.billmongan.com/posts/2025/06/44net</id><content type="html" xml:base="https://www.billmongan.com/posts/2025/06/44net/"><![CDATA[<p>This guide will walk you through setting up a Mikrotik router with a 44net network allocation using Wireguard to <a href="https://www.44net.cloud">44net.cloud</a> in order to receive ipip encapsulated packets from the UCSD 44net router.  This way, you do not need to be able to forward these packets through your home router or have native IP Protocol 4 support to access 44net.  
In this setup, I used a Mikrotik hAp ac2 lite.</p>

<h2 id="configure-44netcloud-and-44net-network">Configure 44net.cloud and 44net network</h2>

<p>On the <a href="https://portal.ampr.org">AMPR portal</a>, request a subnet, and on 44net.cloud, request a tunnel.</p>

<p>Once you receive the tunnel, go back to the AMPR portal, and under <code class="language-plaintext highlighter-rouge">Networks</code> - <code class="language-plaintext highlighter-rouge">My Gateways</code>, click <code class="language-plaintext highlighter-rouge">Create a Gateway</code>.  Fill in the 44net public IP address of your 44net.cloud tunnel as well as the DNS record for that IP address.  You can use <code class="language-plaintext highlighter-rouge">nslookup</code> to determine if a DNS entry exists, or create your own with a third party service and use that domain name here.</p>

<p>Link this gateway to your 44net subnet allocation.</p>

<h2 id="configure-ampr-portal-dns-record-for-your-mikrotik-gateway">Configure AMPR Portal DNS Record for your Mikrotik gateway</h2>

<p>On the AMPR portal, under <code class="language-plaintext highlighter-rouge">DNS</code> - <code class="language-plaintext highlighter-rouge">My Subdomains</code>, select your subdomain (or create one if needed).  Click <code class="language-plaintext highlighter-rouge">Add a resource record</code> and give a hostname like <code class="language-plaintext highlighter-rouge">gw</code> with an A record that points to your gateway IP address (the first usable address within your 44net subnet; so <code class="language-plaintext highlighter-rouge">44.x.y.1</code> if your network is <code class="language-plaintext highlighter-rouge">44.x.y.0/29</code>).</p>
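<p>If you want to double-check which address is the first usable one in your allocation, Python's standard <code class="language-plaintext highlighter-rouge">ipaddress</code> module can list the hosts.  This is only a sketch: <code class="language-plaintext highlighter-rouge">44.1.2.0/29</code> is a made-up placeholder, not a real allocation, so substitute your own subnet.</p>

```python
import ipaddress

# Hypothetical allocation for illustration only; use your own subnet here.
subnet = ipaddress.ip_network("44.1.2.0/29")

hosts = list(subnet.hosts())  # excludes the network and broadcast addresses
gateway = hosts[0]            # first usable address, used as the gateway

print(f"network:   {subnet.network_address}")
print(f"gateway:   {gateway}")
print(f"usable:    {hosts[0]} - {hosts[-1]}")
print(f"broadcast: {subnet.broadcast_address}")
```

<p>A /29 has eight addresses, of which six are usable; the first of those six is the one to publish in the A record.</p>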

<h2 id="configuring-the-mikrotik">Configuring the Mikrotik</h2>

<p>Log into your Mikrotik device.</p>

<h3 id="establishing-wireguard-connection-to-44netcloud">Establishing Wireguard Connection to 44net.cloud</h3>

<p>Execute the following commands in terminal mode to set up your Wireguard connection to 44net.cloud, assuming <code class="language-plaintext highlighter-rouge">44.X.Y.Z/32</code> is your 44net.cloud tunnel IP address:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 1. Add the WireGuard interface - PUT YOUR WIREGUARD PRIVATE KEY BELOW
/interface/wireguard/add name=wg-44net listen-port=13231 private-key="YOUR PRIVATE KEY HERE"

# 2. Assign the IP address to the interface - PUT YOUR 44NET.CLOUD 44net IP ADDRESS BELOW
/ip/address/add address=44.X.Y.Z/32 interface=wg-44net

# 3. Set the DNS server (optional, for local resolver usage)
/ip/dns/set servers=1.1.1.1

# 4. Add the WireGuard peer - PUT YOUR 44NET.CLOUD PEER PUBLIC KEY, PRESHARED KEY, ENDPOINT ADDRESS, and PORT BELOW
/interface/wireguard/peers/add interface=wg-44net \
    public-key="YOUR PUBLIC KEY HERE" \
    preshared-key="YOUR PRESHARED KEY HERE" \
    endpoint-address=YOUR 44NET.CLOUD ENDPOINT ADDRESS HERE \
    endpoint-port=YOUR 44NET.CLOUD PORT HERE \
    allowed-address=44.0.0.0/9,44.128.0.0/10,169.228.34.84 \
    persistent-keepalive=20

# 5. Configure traffic
/ip firewall filter add chain=input action=accept in-interface=wg-44net comment="Allow input from wg-44net" place-before=5

/ip route add dst-address=44.0.0.0/9 gateway=wg-44net comment="44Net outbound routing"
/ip route add dst-address=44.128.0.0/10 gateway=wg-44net comment="44Net outbound routing"
</code></pre></div></div>

<p>Here, we configure the wireguard connection to allow 44net traffic as well as traffic to the <code class="language-plaintext highlighter-rouge">amprgw</code> at UCSD.</p>

<h3 id="configure-wan-options">Configure WAN Options</h3>

<p>My hAp ac2 has two wireless radios, on 2.4 GHz and 5 GHz (<code class="language-plaintext highlighter-rouge">wlan1</code> and <code class="language-plaintext highlighter-rouge">wlan2</code>, respectively).  Optionally, you can configure one radio as a WiFi hotspot for your clients and the other as a WiFi client that connects to the Internet.  If you configure a WiFi client, be sure it (e.g., <code class="language-plaintext highlighter-rouge">wlan2</code>) is acting as a DHCP client in addition to the <code class="language-plaintext highlighter-rouge">ether1</code> port.</p>

<p>Here, we will configure <code class="language-plaintext highlighter-rouge">wlan1</code> as a hotspot (access point) and put it on the bridge with our ethernet LAN ports, and then add <code class="language-plaintext highlighter-rouge">ether1</code> as a DHCP client acting as the WAN port.  I will also set <code class="language-plaintext highlighter-rouge">wlan2</code> to act as a WiFi client so that I can connect to the Internet.</p>

<p>First, we will set up the bridge to include the ethernet LAN ports:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/interface bridge port
add bridge=bridge comment=defconf interface=ether2
add bridge=bridge comment=defconf interface=ether3
add bridge=bridge comment=defconf interface=ether4
add bridge=bridge comment=defconf interface=ether5
</code></pre></div></div>

<h4 id="configure-wifi-hotspot">Configure WiFi Hotspot</h4>

<p><code class="language-plaintext highlighter-rouge">wlan1</code> can be configured as a wireless access point hotspot so that clients can connect to it and receive 44net IP addresses as if they were plugged into a LAN ethernet port.  Set <code class="language-plaintext highlighter-rouge">wlan1</code> to <code class="language-plaintext highlighter-rouge">ap-bridge</code> mode and associate it with an SSID name and security profile with your desired pre-shared keys.  Then add it to the bridge:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/interface bridge port
add bridge=bridge comment=defconf interface=wlan1
</code></pre></div></div>

<h4 id="configure-wifi-client">Configure WiFi Client</h4>

<p><code class="language-plaintext highlighter-rouge">wlan2</code> can be used as a WiFi client (<code class="language-plaintext highlighter-rouge">station</code> mode) for outbound connections from your clients to the Internet via an existing base station.  I associated <code class="language-plaintext highlighter-rouge">wlan2</code> with a security profile that included the pre-shared key for the wireless network I was connecting to (similar to the WiFi hotspot I established on <code class="language-plaintext highlighter-rouge">wlan1</code>), set it to wireless <code class="language-plaintext highlighter-rouge">station</code> mode, and made it a DHCP client.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/interface wireless set [find name=wlan2] mode=station
/ip dhcp-client add interface=wlan2 add-default-route=yes default-route-distance=1 use-peer-dns=yes use-peer-ntp=yes disabled=no
/interface list member
add interface=wlan2 list=WAN comment="Wi-Fi uplink is WAN too"
</code></pre></div></div>

<h4 id="configure-dhcp">Configure DHCP</h4>

<p>Set up your router’s DHCP server to use the 44net subnet you’ve been allocated by AMPR, filling in your subnet information (e.g., <code class="language-plaintext highlighter-rouge">44.x.y.0/29</code>), gateway IP (e.g., <code class="language-plaintext highlighter-rouge">44.x.y.1</code>), and the addresses to hand out to clients (e.g., <code class="language-plaintext highlighter-rouge">44.x.y.2-44.x.y.6</code>, which leaves <code class="language-plaintext highlighter-rouge">44.x.y.1</code> for the gateway itself) below.  We will also set the <code class="language-plaintext highlighter-rouge">ether1</code> WAN port as a DHCP client.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/ip pool
add name=dhcp ranges=44.x.y.2-44.x.y.6
/ip dhcp-server
add address-pool=dhcp interface=bridge name=defconf
/ip dhcp-server network
add address=44.x.y.0/29 gateway=44.x.y.1 dns-server=44.x.y.1
/ip dhcp-client
add interface=ether1 comment=defconf
</code></pre></div></div>

<h3 id="assigning-ip-addresses-from-your-44net-subnet-and-from-44netcloud">Assigning IP addresses from your 44net Subnet and from 44net.cloud</h3>

<p>Assign the gateway IP address to the internal network bridge.  We assume your gateway IP is <code class="language-plaintext highlighter-rouge">44.x.y.1</code> from your AMPR 44net subnet allocation.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/ip address
add address=44.x.y.1/29 interface=bridge comment=defconf
</code></pre></div></div>

<h3 id="create-an-ipip-tunnel-to-the-ucsd-44net-gateway">Create an IPIP Tunnel to the UCSD 44net Gateway</h3>

<p>In order to receive IPIP packets to your gateway from UCSD, you will need an IPIP tunnel interface to UCSD (using the IP address associated with <code class="language-plaintext highlighter-rouge">amprgw.ucsd.edu</code>) and from your 44net.cloud IP tunnel address <code class="language-plaintext highlighter-rouge">44.X.Y.Z</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/interface ipip
add name=ipip-44net local-address=44.X.Y.Z remote-address=169.228.34.84 allow-fast-path=no
/ip address add address=44.X.Y.Z/32 interface=ipip-44net
/ip firewall mangle
add chain=forward tcp-flags=syn protocol=tcp out-interface=ipip-44net \
    action=change-mss new-mss=clamp-to-pmtu comment="Clamp MSS for IPIP-over-WG on egress"
add chain=forward action=change-mss protocol=tcp tcp-flags=syn in-interface=wg-44net \
    new-mss=clamp-to-pmtu comment="Clamp MSS when ingressing WG"    
</code></pre></div></div>
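<p>The MSS clamp rules are there because each layer of tunneling shrinks the largest packet that fits through the path.  As a rough sketch of the arithmetic, the Python below assumes a 1500-byte MTU on the WAN side, WireGuard carried over IPv4 (60 bytes of overhead), and a 20-byte outer header for IPIP.  These numbers are assumptions for illustration; your real path MTU may be smaller, which is why the rules use <code class="language-plaintext highlighter-rouge">clamp-to-pmtu</code> rather than a fixed value.</p>

```python
# All values are assumptions for illustration; real paths may have a smaller MTU.
WAN_MTU = 1500            # typical Ethernet MTU
WIREGUARD_OVERHEAD = 60   # 20 (outer IPv4) + 8 (UDP) + 32 (WireGuard framing)
IPIP_OVERHEAD = 20        # outer IPv4 header added by the IPIP tunnel
IPV4_HEADER = 20
TCP_HEADER = 20           # without TCP options

wg_mtu = WAN_MTU - WIREGUARD_OVERHEAD   # largest packet inside WireGuard
ipip_mtu = wg_mtu - IPIP_OVERHEAD       # largest packet inside IPIP-over-WG
mss = ipip_mtu - IPV4_HEADER - TCP_HEADER

print(f"WireGuard payload MTU: {wg_mtu}")
print(f"IPIP-over-WG MTU:      {ipip_mtu}")
print(f"TCP MSS to fit:        {mss}")
```

<p>Under these assumptions a TCP segment larger than 1380 bytes would not fit inside the nested tunnels without fragmentation, so clamping the MSS on SYN packets keeps both ends within that limit.</p>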

<h3 id="configuring-the-basic-routing-table">Configuring the Basic Routing Table</h3>

<p>To route traffic over the Wireguard interface and over the IPIP tunnel, add the following routing tables and routes.  Here, we assume <code class="language-plaintext highlighter-rouge">44.X.Y.Z</code> is your 44net.cloud tunnel IP address, and <code class="language-plaintext highlighter-rouge">44.x.y.0/29</code> is your AMPR 44net subnet allocation.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/routing table
add fib name=via-wg
add fib name=via-ipip

/ip route
add dst-address=44.0.0.0/9 gateway=wg-44net comment="Send all AMPRNet traffic via WireGuard"
add dst-address=44.128.0.0/10 gateway=wg-44net comment="Send all AMPRNet traffic via WireGuard"
add dst-address=0.0.0.0/0 gateway=wg-44net routing-table=via-wg comment="Send all traffic that came in over Wireguard back out via Wireguard"
add dst-address=0.0.0.0/0 gateway=ipip-44net routing-table=via-ipip comment="Send all traffic that came in over the IPIP tunnel back out via the IPIP tunnel"
add dst-address=169.228.34.84/32 gateway=wg-44net comment="UCSD IPIP transport via WG"

/routing rule
add src-address=44.X.Y.Z/32 interface=wg-44net action=lookup table=via-wg
add src-address=44.X.Y.Z/32 interface=ipip-44net action=lookup table=via-ipip
add src-address=44.x.y.0/29 interface=wg-44net action=lookup table=via-wg
add src-address=44.x.y.0/29 interface=ipip-44net action=lookup table=via-ipip
</code></pre></div></div>

<h3 id="configuring-basic-firewall-rules">Configuring Basic Firewall Rules</h3>

<p>You’ll want to add additional rules to harden this installation!  These simply make the tunnel connections functional, and these rules <strong>are not intended to secure the network from the outside</strong>.  Here, we assume <code class="language-plaintext highlighter-rouge">44.x.y.0/29</code> is your 44net subnet allocation from AMPR, <code class="language-plaintext highlighter-rouge">44.x.y.1</code> is your gateway IP address and that <code class="language-plaintext highlighter-rouge">44.x.y.2-44.x.y.6</code> are your remaining routable IP addresses on your subnet.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/ip firewall filter
add chain=input action=accept connection-state=established,related,untracked comment="Allow established"
add chain=input action=drop connection-state=invalid comment="Drop invalid"
add chain=input action=accept protocol=icmp comment="Allow ICMP"
add chain=input action=accept in-interface=wg-44net comment="Allow input from wg-44net"
add chain=input action=accept dst-address=44.x.y.1 protocol=icmp comment="Allow ICMP to 44Net address"
add chain=input in-interface=wg-44net protocol=ipencap action=accept comment="Allow inbound IPIP (IP protocol 4) via WG"
add chain=input action=drop in-interface-list=!LAN comment="Drop non-LAN traffic"

add chain=forward action=accept connection-state=established,related,untracked
add chain=forward action=drop connection-state=invalid
add chain=forward action=fasttrack-connection connection-state=established,related hw-offload=yes comment="FastTrack"
add chain=forward action=accept dst-address=44.x.y.2-44.x.y.6 comment="Inbound to 44Net hosts"
add chain=forward action=accept dst-address=44.0.0.0/9 src-address=44.x.y.0/29
add chain=forward action=accept dst-address=44.128.0.0/10 src-address=44.x.y.0/29
add chain=forward action=accept dst-address=44.x.y.0/29 in-interface=wg-44net
add chain=forward action=accept out-interface=wg-44net src-address=44.x.y.0/29
add chain=forward action=accept dst-address=44.x.y.0/29 in-interface=ipip-44net
add chain=forward action=accept out-interface=ipip-44net src-address=44.x.y.0/29
add chain=forward out-interface=ipip-44net action=accept comment="Allow outbound to IPIP tunnel"

/ip/firewall/connection/tracking/set enabled=yes
</code></pre></div></div>
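<p>Filter rules in a chain are evaluated from top to bottom, and the first rule that matches with an action such as <code class="language-plaintext highlighter-rouge">accept</code> or <code class="language-plaintext highlighter-rouge">drop</code> decides the packet's fate, so the order of the rules above matters.  The Python below is a deliberately simplified model of that first-match behavior, not a RouterOS emulator; the <code class="language-plaintext highlighter-rouge">Rule</code> class and <code class="language-plaintext highlighter-rouge">evaluate</code> function are hypothetical names used only for illustration.</p>

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    action: str                         # "accept" or "drop"
    in_interface: Optional[str] = None  # None matches any interface
    protocol: Optional[str] = None      # None matches any protocol


def evaluate(rules: List[Rule], in_interface: str, protocol: str) -> str:
    """Return the action of the first matching rule (accept if none match)."""
    for rule in rules:
        if rule.in_interface is not None and rule.in_interface != in_interface:
            continue
        if rule.protocol is not None and rule.protocol != protocol:
            continue
        return rule.action
    return "accept"


accept_wg = Rule("accept", in_interface="wg-44net")
drop_all = Rule("drop")

# Same two rules, opposite order, same packet.
accept_first = evaluate([accept_wg, drop_all], "wg-44net", "icmp")
drop_first = evaluate([drop_all, accept_wg], "wg-44net", "icmp")
print(accept_first, drop_first)
```

<p>With the accept rule first the packet is accepted; with the drop rule first the accept rule is never reached.  Keep this in mind when you add your own hardening rules.</p>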

<h3 id="configuring-dns">Configuring DNS</h3>

<p>Enable outside DNS lookups as follows (feel free to substitute your favorite DNS servers for <code class="language-plaintext highlighter-rouge">1.1.1.1,8.8.8.8</code>).  We assume <code class="language-plaintext highlighter-rouge">44.x.y.1</code> is your gateway IP.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/ip dns
set allow-remote-requests=yes servers=1.1.1.1,8.8.8.8
/ip dns static
add name=router.lan address=44.x.y.1 type=A comment=defconf
</code></pre></div></div>

<h3 id="adding-44net-network-gateways-to-your-routing-table">Adding 44net Network Gateways to your Routing Table</h3>

<p>You can route traffic over the IPIP tunnel directly to other 44net subnets.  To do this, you need to be aware of their routing table.  UCSD sends this routing table periodically to all nodes, but there is not an easy way to process it on Mikrotik routers once it is received.  Instead, we can create the routes manually, and flush and re-create them on demand to pick up any updates.</p>

<p>This Python program can be run on your local computer, and it generates a Mikrotik script that you can upload and run on your router to populate these routing entries.  They are all tagged with <code class="language-plaintext highlighter-rouge">ampr-imported-route</code> so that they can be flushed for re-creation without modifying the rest of your routing rules.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">ipaddress</span>

<span class="c1"># Upload output file to router and execute with /import file-name=ampr_routes.rsc
</span>
<span class="n">API_TOKEN</span> <span class="o">=</span> <span class="s">"YOUR AMPR PORTAL API TOKEN FROM YOUR AMPR PORTAL PROFILE PAGE"</span>
<span class="n">OUTPUT_FILE</span> <span class="o">=</span> <span class="s">"ampr_routes.rsc"</span>
<span class="n">GATEWAY</span> <span class="o">=</span> <span class="s">"ipip-44net"</span> <span class="c1"># or None to use each route's own gatewayIP, if routing over ipip already for these IP ranges
</span><span class="n">ROUTE_COMMENT</span> <span class="o">=</span> <span class="s">"ampr-imported-route"</span>

<span class="c1"># List of local subnets to skip (any CIDR you handle internally)
</span><span class="n">SKIP_PREFIXES</span> <span class="o">=</span> <span class="p">[</span>
    <span class="s">"44.x.y.0/29"</span><span class="p">,</span>     <span class="c1"># Your allocation
</span><span class="p">]</span>
<span class="n">SKIP_NETWORKS</span> <span class="o">=</span> <span class="p">[</span><span class="n">ipaddress</span><span class="p">.</span><span class="n">ip_network</span><span class="p">(</span><span class="n">p</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">SKIP_PREFIXES</span><span class="p">]</span>

<span class="c1"># Fetch route list from AMPRNet
</span><span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">get</span><span class="p">(</span>
    <span class="s">"https://portal.ampr.org/api/v2/encap/routes"</span><span class="p">,</span>
    <span class="n">headers</span><span class="o">=</span><span class="p">{</span>
        <span class="s">"Authorization"</span><span class="p">:</span> <span class="sa">f</span><span class="s">"Bearer </span><span class="si">{</span><span class="n">API_TOKEN</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
        <span class="s">"Accept"</span><span class="p">:</span> <span class="s">"application/json"</span>
    <span class="p">}</span>
<span class="p">)</span>

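<span class="n">response</span><span class="p">.</span><span class="n">raise_for_status</span><span class="p">()</span>  <span class="c1"># Stop here on an HTTP error (for example, a rejected API token)
</span><span class="n">data</span> <span class="o">=</span> <span class="n">response</span><span class="p">.</span><span class="n">json</span><span class="p">()</span>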
<span class="n">routes</span> <span class="o">=</span> <span class="n">data</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"encap"</span><span class="p">,</span> <span class="p">[])</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">OUTPUT_FILE</span><span class="p">,</span> <span class="s">"w"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="c1"># First, remove all old AMPR routes by comment
</span>    <span class="n">f</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="sa">f</span><span class="s">"/ip route remove [find comment=</span><span class="se">\"</span><span class="si">{</span><span class="n">ROUTE_COMMENT</span><span class="si">}</span><span class="se">\"</span><span class="s">]</span><span class="se">\n</span><span class="s">"</span><span class="p">)</span>

    <span class="c1"># Now add the updated routes
</span>    <span class="k">for</span> <span class="n">route</span> <span class="ow">in</span> <span class="n">routes</span><span class="p">:</span>
        <span class="n">prefix</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">route</span><span class="p">[</span><span class="s">'network'</span><span class="p">]</span><span class="si">}</span><span class="s">/</span><span class="si">{</span><span class="n">route</span><span class="p">[</span><span class="s">'cidr'</span><span class="p">]</span><span class="si">}</span><span class="s">"</span>
        <span class="n">leaseholder</span> <span class="o">=</span> <span class="n">route</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"leaseholder"</span><span class="p">,</span> <span class="s">"unknown"</span><span class="p">)</span>
        
        <span class="c1"># Skip if the prefix overlaps any local subnet
</span>        <span class="k">if</span> <span class="nb">any</span><span class="p">(</span><span class="n">ipaddress</span><span class="p">.</span><span class="n">ip_network</span><span class="p">(</span><span class="n">prefix</span><span class="p">).</span><span class="n">overlaps</span><span class="p">(</span><span class="n">local</span><span class="p">)</span> <span class="k">for</span> <span class="n">local</span> <span class="ow">in</span> <span class="n">SKIP_NETWORKS</span><span class="p">):</span>
            <span class="k">continue</span>        
            
        <span class="k">if</span> <span class="n">GATEWAY</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">gateway</span> <span class="o">=</span> <span class="n">route</span><span class="p">[</span><span class="s">'gatewayIP'</span><span class="p">]</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">gateway</span> <span class="o">=</span> <span class="n">GATEWAY</span>
        
        <span class="n">f</span><span class="p">.</span><span class="n">write</span><span class="p">(</span>
            <span class="sa">f</span><span class="s">"/ip route add dst-address=</span><span class="si">{</span><span class="n">prefix</span><span class="si">}</span><span class="s"> gateway=</span><span class="si">{</span><span class="n">gateway</span><span class="si">}</span><span class="s"> comment=</span><span class="se">\"</span><span class="si">{</span><span class="n">ROUTE_COMMENT</span><span class="si">}</span><span class="se">\"\n</span><span class="s">"</span>
        <span class="p">)</span>
</code></pre></div></div>
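<p>The skip check in the script can be tried on its own before you run it against the portal.  This is a minimal sketch using made-up entries shaped like the fields the script reads (<code class="language-plaintext highlighter-rouge">network</code> and <code class="language-plaintext highlighter-rouge">cidr</code>); the addresses are hypothetical and are not real AMPR routes.</p>

```python
import ipaddress

# Hypothetical local allocation and route entries, for illustration only.
skip_networks = [ipaddress.ip_network("44.1.2.0/29")]
sample_routes = [
    {"network": "44.1.2.0", "cidr": 29},   # same as the local subnet: skipped
    {"network": "44.1.0.0", "cidr": 16},   # contains the local subnet: skipped
    {"network": "44.5.6.0", "cidr": 24},   # unrelated: kept
]

kept = []
for route in sample_routes:
    prefix = ipaddress.ip_network(f"{route['network']}/{route['cidr']}")
    # overlaps() is true when either network contains any part of the other
    if any(prefix.overlaps(local) for local in skip_networks):
        continue
    kept.append(str(prefix))

print(kept)
```

<p>Note that <code class="language-plaintext highlighter-rouge">overlaps</code> also skips a larger block that merely contains your subnet, so a broad aggregate covering your allocation would not be imported.</p>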

<p>Upload the <code class="language-plaintext highlighter-rouge">ampr_routes.rsc</code> file to your Mikrotik router under the <code class="language-plaintext highlighter-rouge">Files</code> tab, and back in the terminal, execute the following command to run the script and add the routes:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/import file-name=ampr_routes.rsc
</code></pre></div></div>

<h2 id="wrapping-up">Wrapping Up</h2>

<p>Reboot the router and wait an hour for UCSD to start sending packets to your gateway.</p>

<p>Test by pinging your gateway (<code class="language-plaintext highlighter-rouge">gw.&lt;your call sign&gt;.ampr.org</code> or <code class="language-plaintext highlighter-rouge">44.x.y.1</code>).</p>

<p>Be sure to set firewall rules to restrict traffic, especially inbound, to your router!</p>]]></content><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><category term="hamradio" /><summary type="html"><![CDATA[This guide will walk you through setting up a Mikrotik router with a 44net network allocation using Wireguard to 44net.cloud in order to receive ipip encapsulated packets from the UCSD 44net router. This way, you do not need to be able to forward these packets through your home router or have native IP Protocol 4 support to access 44net. In this setup, I used a Mikrotik hAp ac2 lite.]]></summary></entry><entry><title type="html">Setting Up AllStarLink ASL3 and Associated Tools</title><link href="https://www.billmongan.com/posts/2025/01/asl3/" rel="alternate" type="text/html" title="Setting Up AllStarLink ASL3 and Associated Tools" /><published>2025-01-10T00:00:00+00:00</published><updated>2025-01-10T00:00:00+00:00</updated><id>https://www.billmongan.com/posts/2025/01/asl3</id><content type="html" xml:base="https://www.billmongan.com/posts/2025/01/asl3/"><![CDATA[<p>This guide walks you through installing and configuring AllStarLink (ASL3) along with several helpful management tools and utilities:</p>

<ul>
  <li>allscan</li>
  <li>supermon</li>
  <li>allmon3</li>
  <li>skywarn plus</li>
  <li>dvswitch</li>
  <li>digital_link</li>
  <li>alltune</li>
</ul>

<h2 id="step-1-install-asl3">Step 1: Install ASL3</h2>

<ol>
  <li><strong>Prepare the Raspberry Pi</strong>:
    <ul>
      <li>Download and install the Raspberry Pi Imager.</li>
      <li>Select the Raspberry Pi OS Lite (64-bit) image, set a username and password, and enable SSH.</li>
      <li>Flash the microSD card and boot the Raspberry Pi.</li>
    </ul>
  </li>
  <li><strong>Install ASL3</strong>:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo</span> <span class="nt">-s</span>
apt update <span class="o">&amp;&amp;</span> apt upgrade
<span class="nb">cd</span> /tmp
wget https://repo.allstarlink.org/public/asl-apt-repos.deb12_all.deb
dpkg <span class="nt">-i</span> asl-apt-repos.deb12_all.deb
apt update
apt <span class="nb">install </span>git asl3
</code></pre></div>    </div>
  </li>
  <li><strong>Configure ASL3</strong>:
    <ul>
      <li>Run the ASL configuration menu:
        <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>asl-menu
</code></pre></div>        </div>
      </li>
      <li>Adjust node settings as needed and save.</li>
      <li>For my setup, I set CTCSS From to <code class="language-plaintext highlighter-rouge">usb</code> instead of <code class="language-plaintext highlighter-rouge">usbinvert</code>.</li>
    </ul>
  </li>
</ol>

<p>Access ASL3 at <code class="language-plaintext highlighter-rouge">http://&lt;your_ip_address&gt;:9090</code>.</p>

<h2 id="step-2-configure-asl3-components">Step 2: Configure ASL3 Components</h2>

<h3 id="simpleusb-configuration">SimpleUSB Configuration</h3>
<p>Edit the <code class="language-plaintext highlighter-rouge">simpleusb.conf</code> file to set up your hardware interface:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>nano /etc/asterisk/simpleusb.conf
</code></pre></div></div>

<h3 id="configure-audio-levels">Configure Audio Levels</h3>
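<p>After restarting Asterisk, connect to its console and link your node to test node <code class="language-plaintext highlighter-rouge">55553</code>:</p>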
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>asterisk <span class="nt">-r</span>
rpt fun &lt;your_node_number&gt; <span class="k">*</span>355553
</code></pre></div></div>

<p>Speak into the microphone and adjust the audio settings so that the meter just touches the 5kHz level.  Disconnect when done with:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rpt fun &lt;your_node_number&gt; <span class="k">*</span>155553
</code></pre></div></div>

<p>To set the audio transmit level, edit <code class="language-plaintext highlighter-rouge">/etc/asterisk/simpleusb.conf</code> and set the <code class="language-plaintext highlighter-rouge">rxmixerset</code> value.  I went with a value around <code class="language-plaintext highlighter-rouge">650</code>, such that running <code class="language-plaintext highlighter-rouge">asterisk -rvvv</code> (after restarting asterisk) and <code class="language-plaintext highlighter-rouge">rpt debug level 7</code> decodes and displays DTMF codes being sent and audio levels “just about right” or at least just a bit low when testing on node <code class="language-plaintext highlighter-rouge">55553</code>.  I found that setting the audio level to the “just about right” setting caused the DTMF tones to be oversaturated and fail to decode, so I adjusted this to a slightly lower value.</p>

<h3 id="test-allstarlink-connection">Test AllStarLink Connection</h3>

<p>Connect to Allstar Echo node: <code class="language-plaintext highlighter-rouge">40894</code> via the radio by entering <code class="language-plaintext highlighter-rouge">*340894</code> to connect (and speak / echo), and <code class="language-plaintext highlighter-rouge">*140894</code> to disconnect.</p>
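<p>The codes used in this guide follow one pattern: <code class="language-plaintext highlighter-rouge">*3</code> plus the node number to connect, and <code class="language-plaintext highlighter-rouge">*1</code> plus the node number to disconnect.  The small Python helper below just builds those strings and the matching <code class="language-plaintext highlighter-rouge">rpt fun</code> console line.  The function names are hypothetical, and <code class="language-plaintext highlighter-rouge">1999</code> is only a placeholder for your own node number.</p>

```python
def connect_code(node: int) -> str:
    """DTMF sequence to connect to a node, e.g. *340894."""
    return f"*3{node}"


def disconnect_code(node: int) -> str:
    """DTMF sequence to disconnect from a node, e.g. *140894."""
    return f"*1{node}"


def console_command(local_node: int, dtmf: str) -> str:
    """Asterisk console line that sends a DTMF sequence from the local node."""
    return f"rpt fun {local_node} {dtmf}"


ECHO_NODE = 40894   # echo test node mentioned above
LOCAL_NODE = 1999   # placeholder; use your own node number

print(connect_code(ECHO_NODE))
print(disconnect_code(ECHO_NODE))
print(console_command(LOCAL_NODE, connect_code(55553)))
```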

<h3 id="allmon3-setup">Allmon3 Setup</h3>
<ol>
  <li>Install Allmon3:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>allmon3
</code></pre></div>    </div>
  </li>
  <li>
    <p>Configure Allmon3 settings in <code class="language-plaintext highlighter-rouge">/etc/allmon3/allmon3.ini</code> according to <a href="https://github.com/VALER24/allstar-shari-dvswitch-install-guide">these instructions</a>.  Be sure to remove the <code class="language-plaintext highlighter-rouge">allmon3</code> user account and add an admin password:</p>

    <p>Update passwords:</p>
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>allmon3-passwd <span class="nt">--delete</span> allmon3
allmon3-passwd admin
</code></pre></div>    </div>

    <p>Uncomment or set the account and <code class="language-plaintext highlighter-rouge">manager.conf</code> secret for node 1999.</p>
  </li>
  <li>Restart services:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>systemctl restart asterisk
systemctl start allmon3
</code></pre></div>    </div>
  </li>
</ol>

<p>Access Allmon3 at <code class="language-plaintext highlighter-rouge">http://&lt;your_ip_address&gt;/allmon3</code>.</p>

<h3 id="echolink-setup">EchoLink Setup</h3>
<p>Edit the following files to configure EchoLink: <code class="language-plaintext highlighter-rouge">/etc/asterisk/echolink.conf</code> and <code class="language-plaintext highlighter-rouge">/etc/asterisk/modules.conf</code>.</p>

<p>In <code class="language-plaintext highlighter-rouge">/etc/asterisk/modules.conf</code>, add or uncomment this line: <code class="language-plaintext highlighter-rouge">load =&gt; chan_echolink.so</code>.  If there is a similar line with <code class="language-plaintext highlighter-rouge">noload =&gt;</code>, comment that out.</p>

<p>In <code class="language-plaintext highlighter-rouge">/etc/asterisk/echolink.conf</code>, add your AllStarLink node number and set your personal EchoLink information.</p>

<p>Restart Asterisk to load the EchoLink module:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>systemctl restart asterisk
</code></pre></div></div>

<p>Test by connecting to <code class="language-plaintext highlighter-rouge">*33009999</code> (EchoLink echo test node <code class="language-plaintext highlighter-rouge">9999</code>), then enter <code class="language-plaintext highlighter-rouge">*13009999</code> to disconnect.  To connect to an EchoLink node, dial <code class="language-plaintext highlighter-rouge">*33</code> followed by the 6-digit EchoLink node number.  If the node number has fewer than 6 digits, pad it with leading zeros to make a 6-digit number.  For example, EchoLink node <code class="language-plaintext highlighter-rouge">9999</code> is entered as <code class="language-plaintext highlighter-rouge">009999</code>.</p>
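
<p>The zero-padding rule is easy to get wrong by hand, so here is a minimal sketch of it as a shell function.  The name <code class="language-plaintext highlighter-rouge">echolink_dial</code> is hypothetical; it only builds the DTMF string described above and does not talk to Asterisk.</p>

```shell
# echolink_dial NODE [PREFIX]  (hypothetical helper for illustration)
# Prints PREFIX (default *33, which connects) followed by the EchoLink
# node number padded with leading zeros to 6 digits.
echolink_dial() {
    node="$1"
    prefix="${2:-*33}"
    # Reject anything that is not 1 to 6 decimal digits.
    case "$node" in
        ''|*[!0-9]*|???????*) return 1 ;;
    esac
    # Prepend zeros until the number is 6 digits long.
    while [ "${#node}" -lt 6 ]; do
        node="0$node"
    done
    printf '%s%s\n' "$prefix" "$node"
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">echolink_dial 9999</code> prints <code class="language-plaintext highlighter-rouge">*33009999</code>, and <code class="language-plaintext highlighter-rouge">echolink_dial 9999 '*13'</code> prints the matching disconnect code <code class="language-plaintext highlighter-rouge">*13009999</code>.</p>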

<h2 id="step-3-install-and-configure-dvswitch">Step 3: Install and Configure DVSwitch</h2>
<ol>
  <li>
    <p>Install DVSwitch:</p>

    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wget http://dvswitch.org/bookworm
<span class="nb">chmod</span> +x bookworm
./bookworm
apt update
apt <span class="nb">install </span>dvswitch-server
apt <span class="nb">install </span>php-cgi libapache2-mod-php8.2
</code></pre></div>    </div>
  </li>
  <li>
    <p>Configure DVSwitch:</p>

    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cd</span> /usr/local/dvs
./dvs
</code></pre></div>    </div>

    <p>You can set modes from the <code class="language-plaintext highlighter-rouge">dvs</code> menu under <code class="language-plaintext highlighter-rouge">Advanced - Configure Other Stanzas (Edit &lt;mode&gt;)</code>, <code class="language-plaintext highlighter-rouge">Additional DMR Networks</code>, and <code class="language-plaintext highlighter-rouge">Configure Favorite TG</code>.  If you have a hardware vocoder (for example, a USB ThumbDV AMBE device), you may also be able to select DV3000 USB as the vocoder for D-Star support in the <code class="language-plaintext highlighter-rouge">Hardware Vocoder</code> section of the initial setup menu.</p>
  </li>
  <li>
    <p>Edit configuration files as needed:
In <code class="language-plaintext highlighter-rouge">/opt/Analog_Bridge/Analog_Bridge.ini</code>, under <code class="language-plaintext highlighter-rouge">[USRP]</code>, set:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>txPort = 32001                          ; Transmit USRP frames on this port
rxPort = 34001                          ; Listen for USRP frames on this port
usrpAudio = AUDIO_USE_GAIN
usrpGain = 3.00
tlvAudio = AUDIO_USE_GAIN
</code></pre></div>    </div>

    <p>In <code class="language-plaintext highlighter-rouge">/opt/MMDVM_Bridge/MMDVM_Bridge.ini</code>, set <code class="language-plaintext highlighter-rouge">Jitter=750</code> under <code class="language-plaintext highlighter-rouge">[DMR Network]</code>, and set both <code class="language-plaintext highlighter-rouge">RXFrequency</code> and <code class="language-plaintext highlighter-rouge">TXFrequency</code> to <code class="language-plaintext highlighter-rouge">433800000</code> under <code class="language-plaintext highlighter-rouge">[Info]</code>.  You can also set your contact information in this file.</p>
  </li>
  <li>
    <p>Set up private node <code class="language-plaintext highlighter-rouge">1999</code> in ASL3 to bridge to DVSwitch:</p>

    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nb">sudo </span>asl-menu
</code></pre></div>    </div>

    <p>Under <code class="language-plaintext highlighter-rouge">Node Settings - AllStar Node Settings - Add Node</code>, add a node with the following parameters: node number 1999, “None of the Above”, “Radio Interface USRP”, and “Duplex Type 0 / Half”.</p>

    <p>Test by connecting/disconnecting to node 1999 in allmon3.  You must connect your ASL node to 1999 before using DVSwitch.  You can dial <code class="language-plaintext highlighter-rouge">*31999</code> to do this (and <code class="language-plaintext highlighter-rouge">*11999</code> to disconnect).</p>
  </li>
  <li>Enable and start DVSwitch services:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>systemctl <span class="nb">enable </span>analog_bridge mmdvm_bridge md380-emu
systemctl start analog_bridge mmdvm_bridge md380-emu
</code></pre></div>    </div>
  </li>
  <li>
    <p>Test DVSwitch:</p>

    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/opt/MMDVM_Bridge/dvswitch.sh mode YSF
/opt/MMDVM_Bridge/dvswitch.sh tune parrot.ysfreflector.de:42020
</code></pre></div>    </div>

    <p>In general, you can connect to any mode and endpoint via:</p>

    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/opt/MMDVM_Bridge/dvswitch.sh mode <span class="o">{</span>DMR|NXDN|P25|YSF|DSTAR<span class="o">}</span> <span class="c"># Set Analog_Bridge digital mode</span>
/opt/MMDVM_Bridge/dvswitch.sh tune &lt;tg&gt; <span class="c"># Tune to specific TG number/Reflector</span>
</code></pre></div>    </div>

    <p>Don’t forget to connect to node 1999 either through DTMF tones, or through allmon3.  You can change digital modes using the web interface at <code class="language-plaintext highlighter-rouge">http://&lt;your_ip_address&gt;/dvswitch/index.php</code>.</p>

    <p>Note that DVSwitch cannot parse the <code class="language-plaintext highlighter-rouge">#</code> private call character when it is appended to a combined <code class="language-plaintext highlighter-rouge">tune</code> command of the form <code class="language-plaintext highlighter-rouge">/opt/MMDVM_Bridge/dvswitch.sh tune yourbrandmeisterpasswordhere@3102.repeater.net:62031!91</code> (i.e., one ending in <code class="language-plaintext highlighter-rouge">!9990#</code>).  Instead, run these as two commands:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/opt/MMDVM_Bridge/dvswitch.sh tune yourbrandmeisterpasswordhere@3102.repeater.net:62031
/opt/MMDVM_Bridge/dvswitch.sh tune '9990#'
</code></pre></div>    </div>

    <p>The <code class="language-plaintext highlighter-rouge">!</code> format seems to work when connecting to a non-private call talkgroup; however, the command should be split in two for any private call requiring a <code class="language-plaintext highlighter-rouge">#</code> at the end (including the <code class="language-plaintext highlighter-rouge">4000#</code> unlink command).</p>
  </li>
  <li>
    <p>Enable DVSwitch Mode Switcher Frontend:</p>

    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nb">sudo </span>apt update <span class="o">&amp;&amp;</span> <span class="nb">sudo </span>apt upgrade <span class="o">&amp;&amp;</span> <span class="nb">sudo </span>apt <span class="nb">install </span>git nodejs
 git clone https://github.com/firealarmss/dvswitch_mode_switcher.git
 <span class="nb">cd </span>dvswitch_mode_switcher
</code></pre></div>    </div>

    <p>Follow the instructions in dvswitch_mode_switcher-README.md.txt within the above repository.  I also set <code class="language-plaintext highlighter-rouge">enabled: true</code> under <code class="language-plaintext highlighter-rouge">usrp</code> in <code class="language-plaintext highlighter-rouge">/opt/dvswitch_mode_switcher/configs/config.yml</code>.</p>

    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nb">cd</span> /opt/dvswitch_mode_switcher
 <span class="nb">cp </span>debian/dvswitch_mode_switcher.service /etc/systemd/system/dvswitch_mode_switcher.service
 systemctl daemon-reload
 systemctl <span class="nb">enable </span>dvswitch_mode_switcher.service
 systemctl start dvswitch_mode_switcher.service
</code></pre></div>    </div>

    <p>Then, log onto the ASL3 Cockpit at <code class="language-plaintext highlighter-rouge">http://&lt;your_ip_address&gt;:9090</code> to update the firewall to allow port <code class="language-plaintext highlighter-rouge">3000</code> access.  After logging in, click <code class="language-plaintext highlighter-rouge">Networking</code> on the left, click <code class="language-plaintext highlighter-rouge">Edit Rules and Zones</code> in the firewall panel, click <code class="language-plaintext highlighter-rouge">Custom Ports</code>, and enable port <code class="language-plaintext highlighter-rouge">3000</code> over <code class="language-plaintext highlighter-rouge">TCP</code>.  Alternatively, I logged in over ssh and restricted access to a particular IP block (instead of adding it through the web interface, which opens the port to all IP addresses), by running:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Quote 'EOF' so that $CHAIN_NAME and $TABLE_NAME are written to the script literally
cat &lt;&lt;'EOF' &gt; /usr/local/bin/nft-allow-allstarlink.sh
#!/bin/bash
CHAIN_NAME="filter_IN_allstarlink_allow"
TABLE_NAME="inet firewalld"

# Wait up to 30 seconds for firewalld to create the required chain
for i in {1..30}; do
    if /usr/sbin/nft list chain $TABLE_NAME $CHAIN_NAME &gt;/dev/null 2&gt;&amp;1; then
        break
    fi
    /usr/bin/sleep 1
done

# If the chain still doesn't exist, fail
if ! /usr/sbin/nft list chain $TABLE_NAME $CHAIN_NAME &gt;/dev/null 2&gt;&amp;1; then
    echo "Error: firewalld chain $CHAIN_NAME not found after timeout"
    exit 1
fi

/usr/sbin/nft add rule inet firewalld filter_IN_allstarlink_allow ip saddr &lt;your net address&gt;/&lt;your subnet i.e. 24&gt; tcp dport 3000 accept
EOF

chmod +x /usr/local/bin/nft-allow-allstarlink.sh

mkdir -p /etc/systemd/system/firewalld.service.d

cat &lt;&lt;'EOF' &gt; /etc/systemd/system/firewalld.service.d/99-add-allstarlink-rules.conf
[Service]
ExecStartPost=/usr/local/bin/nft-allow-allstarlink.sh
EOF

systemctl daemon-reexec
</code></pre></div>    </div>

    <p>Access the portal via <code class="language-plaintext highlighter-rouge">http://&lt;your_ip_address&gt;:3000</code>.  Again, be sure to connect to node <code class="language-plaintext highlighter-rouge">1999</code> first!  You can set your favorite digital talkgroups in the <code class="language-plaintext highlighter-rouge">/opt/dvswitch_mode_switcher/configs/tg_alias.yml</code> file.  For example:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> - tgid: "yourbrandmeisterpasswordhere@3102.repeater.net:62031!91"
   alias: Brandmeister Worldwide
</code></pre></div>    </div>

    <p>Again, be sure not to use the <code class="language-plaintext highlighter-rouge">!TG</code> suffix for private calls ending in <code class="language-plaintext highlighter-rouge">#</code>.  Instead, configure and execute two commands to connect via a private call:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> - tgid: "yourbrandmeisterpasswordhere@3102.repeater.net:62031"
   alias: Brandmeister Master
 - tgid: "9990#"
   alias: Brandmeister Parrot Test (Connect to master first)
</code></pre></div>    </div>
  </li>
</ol>
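
<p>The two-step rule for private calls described in this step can be wrapped in a small helper.  The sketch below only prints the <code class="language-plaintext highlighter-rouge">dvswitch.sh tune</code> commands so that you can review them before running anything.  The function name <code class="language-plaintext highlighter-rouge">tune_commands</code> and the <code class="language-plaintext highlighter-rouge">DVSWITCH</code> variable are made up for illustration, and splitting the target at the <code class="language-plaintext highlighter-rouge">!</code> character is my assumption about how to separate the master from the talkgroup.</p>

```shell
# Path to the DVSwitch helper script used in this guide (override if needed).
DVSWITCH="${DVSWITCH:-/opt/MMDVM_Bridge/dvswitch.sh}"

# tune_commands TARGET  (hypothetical helper for illustration)
# Prints the tune command(s) for TARGET, one per line.  A combined target
# such as 'password@host:port!9990#' (private call, trailing '#') becomes
# two commands; any other target is passed through as a single command.
tune_commands() {
    target="$1"
    case "$target" in
        *'!'*'#')
            # First tune to the master, then place the private call.
            printf '%s tune %s\n' "$DVSWITCH" "${target%%'!'*}"
            printf "%s tune '%s'\n" "$DVSWITCH" "${target##*'!'}"
            ;;
        *)
            printf '%s tune %s\n' "$DVSWITCH" "$target"
            ;;
    esac
}
```

<p>For example, <code class="language-plaintext highlighter-rouge">tune_commands 'yourbrandmeisterpasswordhere@3102.repeater.net:62031!9990#'</code> prints the same two commands shown above for the Brandmeister parrot test, while a target ending in <code class="language-plaintext highlighter-rouge">!91</code> is left as a single command.</p>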

<h2 id="step-4-install-and-configure-supermon">Step 4: Install and Configure Supermon</h2>
<ol>
  <li>
    <p>Install Supermon:</p>

    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cd</span> /usr/local/sbin
wget <span class="s2">"http://2577.asnode.org:43856/supermonASL_fresh_install"</span> <span class="nt">-O</span> supermonASL_fresh_install
<span class="nb">chmod</span> +x supermonASL_fresh_install
./supermonASL_fresh_install
</code></pre></div>    </div>
  </li>
  <li>Configure Supermon:
    <ul>
      <li>Edit <code class="language-plaintext highlighter-rouge">allmon.ini</code> and <code class="language-plaintext highlighter-rouge">global.inc</code> in <code class="language-plaintext highlighter-rouge">/var/www/html/supermon/</code>.</li>
      <li>
        <p>Set up <code class="language-plaintext highlighter-rouge">.htpasswd</code> for authentication:</p>

        <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>htpasswd <span class="nt">-cB</span> /var/www/html/supermon/.htpasswd admin
</code></pre></div>        </div>
      </li>
    </ul>
  </li>
  <li>
    <p>Enable automatic database updates:</p>

    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>systemctl <span class="nb">enable </span>asl3-update-astdb.service asl3-update-astdb.timer
systemctl start asl3-update-astdb.timer
</code></pre></div>    </div>
  </li>
  <li>
    <p>Access Supermon at <code class="language-plaintext highlighter-rouge">http://&lt;your_ip_address&gt;/supermon</code>.</p>
  </li>
  <li>
    <p>Update Supermon with <code class="language-plaintext highlighter-rouge">/usr/local/sbin/supermonASL_latest_update</code>.</p>
  </li>
  <li>Edit <code class="language-plaintext highlighter-rouge">/var/www/html/supermon/allmon.ini</code> to modify the top menu.  For example, you can add links to all the URLs from the tools in this document for easy access.</li>
</ol>

<h2 id="step-5-configure-skywarn-plus">Step 5: Configure Skywarn Plus</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash <span class="nt">-c</span> <span class="s2">"</span><span class="si">$(</span>curl <span class="nt">-fsSL</span> https://raw.githubusercontent.com/Mason10198/SkywarnPlus/main/swp-install<span class="si">)</span><span class="s2">"</span>
</code></pre></div></div>

<h3 id="configuration">Configuration</h3>

<ol>
  <li>
    <p>Configure <code class="language-plaintext highlighter-rouge">/usr/local/bin/SkywarnPlus/config.yaml</code> file according to readme instructions.</p>
  </li>
  <li>
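    <p>Add these lines to your node's stanza in <code class="language-plaintext highlighter-rouge">/etc/asterisk/rpt.conf</code> to enable weather alert tail messages at a certain interval of activity:</p>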

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tailmessagetime=60000
tailsquashedtime=30000
tailmessagelist=/tmp/SkywarnPlus/wx-tail
</code></pre></div>    </div>
  </li>
  <li>
    <p>Add to root crontab:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>* * * * * /usr/bin/python3 /usr/local/bin/SkywarnPlus/ASL3_Supermon_Workaround.py
* * * * * /usr/local/bin/SkywarnPlus/SkywarnPlus.py
* * * * * chown -R asterisk:asterisk /tmp/SkywarnPlus 
</code></pre></div>    </div>
  </li>
  <li>
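    <p>Follow <a href="https://www.youtube.com/watch?v=lv95j-I3JDc">these instructions</a> to add time and weather hourly announcements.  Add or modify your root crontab as follows, replacing <code class="language-plaintext highlighter-rouge">19320</code> and <code class="language-plaintext highlighter-rouge">62933</code> with your own ZIP code and node number:</p>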

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00 00-23 * * * /usr/bin/perl /usr/local/sbin/saytime.pl 19320 62933 &gt; /dev/null
</code></pre></div>    </div>
  </li>
</ol>

<h2 id="allscan">AllScan</h2>
<p>Follow <a href="https://github.com/davidgsd/AllScan#readme">these instructions</a>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>apt update<span class="p">;</span> <span class="nb">sudo </span>apt <span class="nb">install </span>php unzip <span class="nt">-y</span>
<span class="nb">cd</span> ~
wget <span class="s1">'https://raw.githubusercontent.com/davidgsd/AllScan/main/AllScanInstallUpdate.php'</span>
<span class="nb">chmod </span>755 AllScanInstallUpdate.php
<span class="nb">sudo</span> ./AllScanInstallUpdate.php
</code></pre></div></div>

<p>Edit <code class="language-plaintext highlighter-rouge">/etc/php/8.2/cli/php.ini</code> and uncomment <code class="language-plaintext highlighter-rouge">extension=pdo_sqlite</code> and <code class="language-plaintext highlighter-rouge">extension=sqlite3</code>.</p>

<p>Test at <code class="language-plaintext highlighter-rouge">http://&lt;your-ip-address&gt;/allscan</code> (set up the initial user).  You can now use AllScan to connect/disconnect instead of Allmon3 or DTMF.</p>

<p>Edit favorite nodes at <code class="language-plaintext highlighter-rouge">/var/www/html/supermon/favorites.ini</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>label[] = "Label"
cmd[] = "rpt cmd %node% ilink 3 &lt;node number&gt;"
label[] = "Echolink Label"
cmd[] = "rpt cmd %node% ilink 3 3&lt;0 prepadded 6 digit node number&gt;"
</code></pre></div></div>
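
<p>If you keep many favorites, you could generate these entries rather than typing them.  The following is a rough sketch with a hypothetical <code class="language-plaintext highlighter-rouge">favorite_entry</code> function; it only prints the two lines in the format shown above, so you can review the output and paste it into <code class="language-plaintext highlighter-rouge">favorites.ini</code> yourself.</p>

```shell
# favorite_entry LABEL NODE [echolink]  (hypothetical helper for illustration)
# Prints a label[]/cmd[] pair in the favorites.ini format shown above.
# With "echolink" as the third argument, NODE is zero-padded to 6 digits
# and prefixed with 3, matching the EchoLink dialing convention.
favorite_entry() {
    label="$1"
    node="$2"
    if [ "${3:-}" = "echolink" ]; then
        while [ "${#node}" -lt 6 ]; do
            node="0$node"
        done
        node="3$node"
    fi
    printf 'label[] = "%s"\n' "$label"
    # %%node%% prints a literal %node%, which is left as-is in the file.
    printf 'cmd[] = "rpt cmd %%node%% ilink 3 %s"\n' "$node"
}
```

<p>For instance, <code class="language-plaintext highlighter-rouge">favorite_entry "EchoLink Echo Test" 9999 echolink</code> produces a <code class="language-plaintext highlighter-rouge">cmd[]</code> line ending in <code class="language-plaintext highlighter-rouge">ilink 3 3009999</code>.</p>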

<h2 id="alltune">Alltune</h2>

<p>Follow the instructions included with the download available <a href="https://www.qrz.com/db/N1ACC?aliasFrom=KQ4MZJ3">here</a>.  Extract the web files to <code class="language-plaintext highlighter-rouge">/var/www/html/alltune</code> and access the page at <code class="language-plaintext highlighter-rouge">http://&lt;your-ip-address&gt;/alltune</code>.</p>

<h2 id="iax-configuration">IAX Configuration</h2>

<p>You can access your AllStarLink node from an Android device or other IAX connection by adding the following stanza to <code class="language-plaintext highlighter-rouge">/etc/asterisk/iax.conf</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[iaxclient]                      ; Connect from iax client (Zoiper...)
type = friend                    ; Notice type here is friend &lt;--------------
context = iax-client             ; Context to jump to in extensions.conf
auth = md5
secret = your-secret-password-here
host = dynamic
disallow = all
allow = ulaw
allow = adpcm
allow = gsm
transfer = no
requirecalltoken=no ; to allow all connections
;calltokenoptional=0.0.0.0/0.0.0.0 ; to connect from a particular IP address
</code></pre></div></div>

<p>And add this in the <code class="language-plaintext highlighter-rouge">/etc/asterisk/extensions.conf</code> file:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[iax-client]                            ; for IAX VoIP clients.
exten =&gt; ${NODE},1,Ringing()
        same =&gt; n,Wait(10)
        same =&gt; n,Answer()
        same =&gt; n,Set(CALLSIGN=${CALLERID(name)})
        same =&gt; n,NoOp(Caller ID name is ${CALLSIGN})
        same =&gt; n,NoOp(Caller ID number is ${CALLERID(number)})
        same =&gt; n,GotoIf(${ISNULL(${CALLSIGN})}?hangit)
        same =&gt; n,Playback(rpt/connected-to&amp;rpt/node)
        same =&gt; n,SayDigits(${NODE})
        same =&gt; n,rpt(${NODE}|P|${CALLSIGN}-P)
        same =&gt; n(hangit),NoOp(No Caller ID Name)
        same =&gt; n,Playback(connection-failed)
        same =&gt; n,Wait(1)
        same =&gt; n,Hangup
</code></pre></div></div>

<p>You can then configure DroidStar by adding the following in the <code class="language-plaintext highlighter-rouge">Hosts</code> section: <code class="language-plaintext highlighter-rouge">IAX &lt;your node number&gt; &lt;your node IP address&gt; 4569 iaxclient your-secret-password-here</code>, and choose your node under the <code class="language-plaintext highlighter-rouge">IAX</code> hosts section on the main tab.  You might have to update your databases and hosts with the buttons under <code class="language-plaintext highlighter-rouge">Settings</code> prior to use.  Once connected, you can issue DTMF codes to connect/disconnect, and use the PTT button to transmit.</p>

<h2 id="digital-link-dtmf-tuning">Digital Link DTMF Tuning</h2>

<p>To send DTMF tones to switch digital modes, servers/reflectors, and talkgroups, visit <a href="https://github.com/BillJr99/digital_link">https://github.com/BillJr99/digital_link</a> and follow the installation instructions there.</p>

<h2 id="enabling-a-thumbdv-ambe-device">Enabling a ThumbDV AMBE Device</h2>

<p>I have a USB vocoder, which I enabled by editing <code class="language-plaintext highlighter-rouge">/opt/Analog_Bridge/Analog_Bridge.ini</code> and setting:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[General]
decoderFallBack = true
useEmulator = false

[DV3000]
; address = 127.0.0.1
; rxPort = 2460
address = /dev/ttyUSB0
baud=460800
serial = true
</code></pre></div></div>

<h2 id="references">References</h2>

<ul>
  <li><a href="https://dvswitch.org/DVSwitch_install.pdf">DVSwitch Installation Guide</a></li>
  <li><a href="http://2577.asnode.org:43856/supermonASL_fresh_install">Supermon for ASL3</a></li>
  <li><a href="https://github.com/firealarmss/dvswitch_mode_switcher">DVSwitch Mode Changer</a></li>
  <li><a href="https://github.com/VALER24/allstar-shari-dvswitch-install-guide">ASL3 Installation Guide by Ham Radio and Networking</a></li>
  <li><a href="https://www.youtube.com/watch?v=bNp-zZQKI-I">ASL3 Installation Video by Ham Radio and Networking on YouTube</a></li>
  <li><a href="https://www.youtube.com/@HamRadioCrusader">Ham Radio Crusader YouTube Channel for Information on ASL3, DVSwitch, Supermon, SkywarnPlus, and more</a></li>
  <li><a href="http://www.hamradiolife.org/documents/Supermon%20for%20ASL%203.pdf">Installing Supermon on ASL3</a></li>
  <li><a href="https://www.youtube.com/watch?v=GRMsifz9WTg">Allmon3 and EchoLink on ASL3 by Ham Radio and Networking on YouTube</a></li>
  <li><a href="https://www.youtube.com/watch?v=uAwSHjKTeU4">DVSwitch and ClearNode on AllStar by Ham Radio and Networking on YouTube</a></li>
  <li><a href="https://www.youtube.com/watch?v=9k_gAfXJgx8">DVSwitch on AllStarLink 3 by Ham Radio and Networking on YouTube</a></li>
  <li>DVSwitch Server <a href="https://www.youtube.com/watch?v=Q73vW2tZVco">Part 1</a> and <a href="https://www.youtube.com/watch?v=HlVs9rC5pgE">Part 2</a> by Ham Radio Crusader on YouTube</li>
  <li><a href="https://allstarlink.github.io/pi/cockpit-firewall/">ASL3 Cockpit Firewall</a></li>
  <li><a href="https://www.youtube.com/watch?v=35k1sND7FbQ">SkywarnPlus on ASL3 by Ham Radio Crusader</a></li>
  <li><a href="https://www.youtube.com/watch?v=3SyCHa03pN8">Supermon 7.4+ on ASL3 by Ham Radio Crusader</a></li>
  <li><a href="https://www.youtube.com/watch?v=lv95j-I3JDc">Time and Weather Announcements on SkywarnPlus by Ham Radio Crusader</a></li>
  <li><a href="https://github.com/davidgsd/AllScan#readme">AllScan Setup Instructions</a></li>
  <li><a href="https://www.qrz.com/db/N1ACC?aliasFrom=KQ4MZJ3">Alltune</a></li>
</ul>]]></content><author><name>Bill Mongan</name><email>billmongan+website@gmail.com</email></author><category term="hamradio" /><summary type="html"><![CDATA[This guide walks you through installing and configuring AllStarLink (ASL3) along with several helpful management tools and utilities:]]></summary></entry></feed>