split claude.md into docs/ — per-topic, human-readable
claude.md was eating 400 lines of subsystem detail that's useful
when you're working on that subsystem and noise the rest of the
time. split into:
- docs/conventions.md   naming, identity, async forms, commit style
- docs/gotchas.md       nspawn / nixos-container quirks
- docs/web-ui.md        dashboard + per-agent layouts and endpoints
- docs/turn-loop.md     claude invocation, wake prompt, mcp surface
- docs/approvals.md     approval flow, manager policy, helper events
- docs/persistence.md   sqlite dbs, retention, state dir layout
claude.md is now the entry point — file map, reading paths
("pick the doc that matches your task"), quick reminders that
fit on one screen, and a small scratchpad section for in-flight
context. references the docs; the docs don't reference claude.md.
no content was lost — the docs/ files cover everything the old
claude.md did, plus things i wrote up better while extracting.
parent c27111ac32
commit 8b10731aa4
7 changed files with 708 additions and 396 deletions
83
docs/gotchas.md
Normal file
@@ -0,0 +1,83 @@
# Gotchas

NixOS + nspawn quirks and lessons we hit the hard way. If something
here looks unmotivated in the code, there's usually a story underneath.

## `nixos-container` doesn't expose `--bind` on the CLI

The CLI doesn't accept `--bind`. The way in is `EXTRA_NSPAWN_FLAGS` in
`/etc/nixos-containers/<NAME>.conf`: the start script
(`/nix/store/.../container_-start`) expands it unquoted into the
`systemd-nspawn` invocation. `lifecycle::set_nspawn_flags()` rewrites
this line.
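
A minimal sketch of what rewriting that line could look like. This is not the real `lifecycle::set_nspawn_flags()`; the function name `with_nspawn_flags`, the double-quoted value, and the append-if-missing behaviour are all assumptions made for illustration.

```rust
/// Hypothetical helper: replace (or append) the `EXTRA_NSPAWN_FLAGS=` line
/// in the text of a container `.conf` file and return the new text.
fn with_nspawn_flags(conf: &str, flags: &[&str]) -> String {
    const KEY: &str = "EXTRA_NSPAWN_FLAGS=";
    // Assumption: the value is written double-quoted so the file stays a
    // valid shell fragment. The start script expands it unquoted, so the
    // shell splits it on whitespace into separate nspawn arguments.
    let new_line = format!("{}\"{}\"", KEY, flags.join(" "));

    let mut replaced = false;
    let mut out: Vec<String> = Vec::new();
    for line in conf.lines() {
        if line.trim_start().starts_with(KEY) {
            // Keep a single flags line, at the position of the first match.
            if !replaced {
                out.push(new_line.clone());
                replaced = true;
            }
        } else {
            out.push(line.to_string());
        }
    }
    if !replaced {
        out.push(new_line);
    }
    let mut text = out.join("\n");
    text.push('\n');
    text
}
```

Because the expansion is unquoted, a flag whose value contains a space cannot be passed this way; the sketch does not try to escape anything.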

## `/run/systemd/nspawn/*.nspawn` overrides are ignored

`nixos-container`'s start script builds the nspawn command line
directly. Dropping a `.nspawn` file under `/run/systemd/nspawn/`
looks like the obvious extension point and does nothing. Use
`EXTRA_NSPAWN_FLAGS` (above).

## `boot.isNspawnContainer = true`

Not `boot.isContainer = true`: the option was renamed in nixos-25.11+.

## `nixos-container create` auto-assigns `HOST_ADDRESS` / `LOCAL_ADDRESS`

…in the `.conf`. The start script's `if HOST_ADDRESS set →
--network-veth` branch then forces a private netns, which is silently
fatal for our web UIs (a port bound inside the container is invisible
from the host). We
force-clear `HOST_ADDRESS` / `LOCAL_ADDRESS` / `HOST_ADDRESS6` /
`LOCAL_ADDRESS6` / `HOST_BRIDGE` and set `PRIVATE_NETWORK=0`.
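
The force-clear step can be sketched as a pure text transform over the `.conf` contents. The function name `force_host_network` is hypothetical, and writing the cleared keys back as empty assignments (rather than deleting the lines) is an assumption; the key names are the ones listed above.

```rust
/// Keys that get force-cleared, as listed in the paragraph above.
const CLEARED_KEYS: [&str; 5] = [
    "HOST_ADDRESS",
    "LOCAL_ADDRESS",
    "HOST_ADDRESS6",
    "LOCAL_ADDRESS6",
    "HOST_BRIDGE",
];

/// Hypothetical helper: return the conf text with the network keys
/// emptied and `PRIVATE_NETWORK=0` set. All other lines pass through.
fn force_host_network(conf: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    for line in conf.lines() {
        // The key is everything before the first `=`.
        let key = line.split('=').next().unwrap_or("").trim();
        if CLEARED_KEYS.contains(&key) || key == "PRIVATE_NETWORK" {
            continue; // dropped here, rewritten below
        }
        out.push(line.to_string());
    }
    // Assumption: an empty assignment counts as "cleared" for the start
    // script's `HOST_ADDRESS set` check.
    for key in CLEARED_KEYS {
        out.push(format!("{key}="));
    }
    out.push("PRIVATE_NETWORK=0".to_string());
    let mut text = out.join("\n");
    text.push('\n');
    text
}
```

Running it twice gives the same output as running it once, which matters if the rewrite happens on every rebuild.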

## systemd service PATH ≠ host PATH

The hive-c0re service sets `path = [ pkgs.git "/run/current-system/sw" ]`.
In-container harness services do the same so anything an agent adds
to its own `agent.nix` (`environment.systemPackages`) is visible to
claude's Bash tool without editing the service definition.
On the host, `environment.HYPERHIVE_GIT` bakes in git's absolute path
(read by `lifecycle::git_command()`).

## `RuntimeDirectoryPreserve = "yes"`

…keeps `/run/hyperhive/` (and the per-agent sub-dirs) across
hive-c0re restarts. Without it, every restart wipes bind sources and
existing containers can't be started.

## `register_agent` is idempotent

Drops any prior socket task before rebinding. Required so a
hive-c0re restart followed by `rebuild alice` recreates the agent's
socket without needing a clean reinstall.
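
The drop-then-rebind pattern, sketched with the standard library only. The real code presumably manages async socket tasks; here `SocketTask` and `Registry` are hypothetical stand-ins, and a shared `AtomicBool` plays the role of task cancellation.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Stand-in for a running socket task: a flag the task would poll.
struct SocketTask {
    cancelled: Arc<AtomicBool>,
}

impl Drop for SocketTask {
    fn drop(&mut self) {
        // Dropping the handle tells the old task to stop.
        self.cancelled.store(true, Ordering::SeqCst);
    }
}

#[derive(Default)]
struct Registry {
    tasks: HashMap<String, SocketTask>,
}

impl Registry {
    /// Register `name`, replacing any earlier task for the same agent.
    /// Returns the cancellation flag of the new task.
    fn register_agent(&mut self, name: &str) -> Arc<AtomicBool> {
        // Remove first, so the old task is dropped (and cancelled) before
        // the new one would bind the socket path.
        drop(self.tasks.remove(name));
        let cancelled = Arc::new(AtomicBool::new(false));
        let task = SocketTask { cancelled: Arc::clone(&cancelled) };
        self.tasks.insert(name.to_string(), task);
        cancelled
    }
}
```

Calling `register_agent("alice")` twice leaves exactly one live entry for `alice`, with the first task's flag set.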

## `claude-code` is unfree

`harness-base.nix` allow-lists it specifically. The flake pins it to
**nixpkgs-unstable** via `overlays.claude-unstable` (stable lags too
far). The overlay imports unstable with its own
`allowUnfreePredicate` so the access inside the overlay doesn't
itself trip the unfree check.

## Claude credentials are per-agent

`/var/lib/hyperhive/agents/<name>/claude/` bind-mounts to
`/root/.claude` (RW). Sharing one dir across agents is NOT viable:
OAuth refresh tokens rotate, so a refresh by any one agent invalidates
the tokens held by all the others. Login flow runs from the per-agent
web UI; creds persist
across `destroy`/recreate (`--purge` wipes them).

## Persistent notes dir per agent

`/var/lib/hyperhive/agents/<name>/state/` bind-mounts to `/state`
(RW). System prompts tell agents to keep durable knowledge here
(`/state/notes.md`, anything else under `/state/`). The harness also
writes its events log here (`/state/hyperhive-events.sqlite`).
Survives `destroy`/recreate alongside the claude dir.
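
For orientation, the two per-agent mounts described above map onto `systemd-nspawn`'s `--bind=SRC:DEST` syntax (read-write by default). The helper name `agent_bind_flags` is hypothetical, and whether hyperhive builds its flags exactly this way is an assumption; the paths are the ones from the two sections above.

```rust
/// Hypothetical helper: the nspawn bind flags for one agent's
/// persistent directories (claude credentials and notes/state).
fn agent_bind_flags(name: &str) -> Vec<String> {
    let base = format!("/var/lib/hyperhive/agents/{name}");
    vec![
        // Per-agent credentials, never shared between agents.
        format!("--bind={base}/claude:/root/.claude"),
        // Durable notes plus the harness events log.
        format!("--bind={base}/state:/state"),
    ]
}
```

Joined with spaces, the result is the kind of value that would end up in `EXTRA_NSPAWN_FLAGS`.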

## Orphan approvals

If state dirs are wiped out from under a pending approval (test
scripts, manual `rm -rf`), the dashboard's next render marks them
`failed` with note `"agent state dir missing"` so they fall out of
`pending`. They stay in sqlite for audit.
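
A sketch of that sweep over in-memory rows. The `Approval` struct, the `Status` enum, and `fail_orphans` are hypothetical (the real rows live in sqlite), and checking `<agents_root>/<agent>/state` is an assumption based on the state dir layout described earlier.

```rust
use std::path::Path;

#[derive(Debug, Clone, PartialEq)]
enum Status {
    Pending,
    Failed,
}

#[derive(Debug, Clone)]
struct Approval {
    agent: String,
    status: Status,
    note: Option<String>,
}

/// Mark pending approvals whose agent state dir is gone as failed.
/// Rows are updated in place, never deleted, so the audit trail stays.
/// Returns how many rows were changed.
fn fail_orphans(approvals: &mut [Approval], agents_root: &Path) -> usize {
    let mut changed = 0;
    for a in approvals.iter_mut() {
        let state_dir = agents_root.join(&a.agent).join("state");
        if a.status == Status::Pending && !state_dir.is_dir() {
            a.status = Status::Failed;
            a.note = Some("agent state dir missing".to_string());
            changed += 1;
        }
    }
    changed
}
```

A second pass over the same rows changes nothing, since the orphans are no longer `Pending`.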