claude.md flips 'in flight' → 'just landed' for the meta overhaul + extends the file map with meta.rs and migrate.rs. docs/approvals.md replaces the in-flight callout with a proper 'Meta flake' section (two-phase deploy walkthrough, sync_agents semantics, single-phase variants), updates the two-repo box diagram to include the /var/lib/hyperhive/meta/ tree and tracks flake.nix in applied, rewrites the container --flake reference to meta#<name>, replaces the 'Manager view of applied' section with a unified '/agents + /applied + /meta' inventory listing every useful git incantation, and explains the in-place no-state-loss migration that now runs on hive-c0re startup. docs/persistence.md grows entries for the meta repo + the .meta-migration-done marker. readme box diagram picks up the /meta RO bind; approval-flow paragraph rewritten end to end to describe the meta lock dance. lifecycle::flake_base deleted — the meta render hardcodes the manager vs agent-base choice as nix expression.
# Persistence + retention

Where state lives, what survives what, and how it's bounded.

## Two sqlite databases

### `/var/lib/hyperhive/broker.sqlite` (host)

Three tables, all in one file:

- `messages` — every inter-agent / operator-bound message.
  `sender / recipient / body / sent_at / delivered_at`.
- `approvals` — the queue. `agent / kind (apply_commit | spawn) /
  commit_ref / requested_at / status / resolved_at / note`.
- `operator_questions` — `ask_operator` queue.
  `asker / question / options_json / multi / asked_at /
  deadline_at (ttl) / answered_at / answer`. New columns are added
  with `ALTER TABLE ADD COLUMN` when `pragma_table_info` shows they
  are missing.

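The column-level migration mentioned for `operator_questions` can be sketched as a pure function that decides which `ALTER TABLE` statements to issue. This is a minimal illustration, not the project's code: the function name `add_column_statements` and the SQL column types are assumptions, and `existing` stands in for the names a real implementation would read from `pragma_table_info`.

```rust
/// Build the `ALTER TABLE ... ADD COLUMN` statements for columns that are
/// wanted but not yet present in the table.
///
/// `existing` stands in for the `name` values returned by
/// `SELECT name FROM pragma_table_info('<table>')`.
/// `wanted` is a list of (column name, SQL type); the types are illustrative.
pub fn add_column_statements(
    table: &str,
    existing: &[&str],
    wanted: &[(&str, &str)],
) -> Vec<String> {
    wanted
        .iter()
        // Skip columns the table already has, so re-running is a no-op.
        .filter(|(name, _)| !existing.contains(name))
        .map(|(name, ty)| format!("ALTER TABLE {table} ADD COLUMN {name} {ty}"))
        .collect()
}
```

Because already-present columns are filtered out, running the same migration on every startup produces no statements once the schema is current.
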
Retention:

- `Broker::vacuum_delivered` runs hourly via a tokio task in
  `hive-c0re::main`. Drops delivered rows older than 30 days.
  Undelivered rows are always kept (still in flight).
- Approvals and questions are kept indefinitely — both are
  audit trails. Approvals failed by `actions::destroy` and answered
  questions stay visible to anything that queries by id.

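The broker's retention rule can be sketched over an in-memory stand-in for the `messages` table. This is an illustration only: the `Message` struct is hypothetical, timestamps are assumed to be Unix seconds, and age is assumed to be measured from `delivered_at` (the source does not say which timestamp is used).

```rust
/// Retention window for delivered messages: 30 days, in seconds.
pub const DELIVERED_TTL_SECS: u64 = 30 * 24 * 60 * 60;

/// Hypothetical in-memory stand-in for one row of the `messages` table.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub id: u64,
    pub sent_at: u64,
    /// `None` means the message is still in flight.
    pub delivered_at: Option<u64>,
}

/// Drop delivered rows older than the window and keep every undelivered row.
/// Returns the number of rows removed.
pub fn vacuum_delivered(rows: &mut Vec<Message>, now: u64) -> usize {
    let cutoff = now.saturating_sub(DELIVERED_TTL_SECS);
    let before = rows.len();
    rows.retain(|m| match m.delivered_at {
        None => true,            // undelivered: always kept
        Some(ts) => ts >= cutoff, // delivered: kept only inside the window
    });
    before - rows.len()
}
```

An undelivered row survives no matter how old `sent_at` is, which matches the "still in flight" rule above.
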
### `/state/hyperhive-events.sqlite` (per agent)

Lives inside each container's bind-mounted `/state/` dir (host
path: `/var/lib/hyperhive/agents/<name>/state/hyperhive-events.sqlite`).
One table:

- `events(id, ts, kind, payload_json)` — every `LiveEvent` the
  harness emits during turn loop execution.

The harness writes; the host vacuums. `hive-c0re::events_vacuum`
runs hourly and sweeps every existing agent state dir, applying the
same two-stage delete to each file: drop rows older than 7 days,
then trim to the 2000 most-recent. Centralising retention on the
host means a misbehaving harness can't disable its own vacuum and
agents don't need any cleanup wiring of their own.

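The two-stage delete described above can be sketched on a plain vector of `(id, ts)` pairs. This is an illustration, not the `events_vacuum` implementation: the function name and the use of Unix-second timestamps are assumptions, and a real version would issue the equivalent `DELETE` statements against each sqlite file.

```rust
/// Stage 1 threshold: rows older than 7 days are dropped.
pub const MAX_AGE_SECS: u64 = 7 * 24 * 60 * 60;
/// Stage 2 threshold: at most this many rows survive.
pub const MAX_ROWS: usize = 2000;

/// Apply the two-stage retention to `(id, ts)` pairs.
/// On return the rows are ordered newest first.
pub fn vacuum_events(rows: &mut Vec<(u64, u64)>, now: u64) {
    // Stage 1: age-based delete.
    let cutoff = now.saturating_sub(MAX_AGE_SECS);
    rows.retain(|&(_, ts)| ts >= cutoff);

    // Stage 2: keep only the most recent MAX_ROWS.
    // Sort newest first (ties broken by higher id), then cut the tail.
    rows.sort_by(|a, b| b.1.cmp(&a.1).then(b.0.cmp(&a.0)));
    rows.truncate(MAX_ROWS);
}
```

Running the age cut first means the row cap only ever applies to events from the last week, so a quiet agent keeps a week of history while a chatty one is bounded at 2000 rows.
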
Path overridable via `HYPERHIVE_EVENTS_DB` (for dev / no-`/state`
setups). On open failure the `Bus` falls back to no-store mode
rather than crashing the harness — events are still broadcast over
SSE, but nothing is persisted.

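The override and the no-store fallback can be sketched as two small functions. This is a hedged illustration: `events_db_path`, `open_store`, and the `Store` enum are hypothetical names, and a plain file open stands in for opening the sqlite database.

```rust
use std::fs::OpenOptions;
use std::path::{Path, PathBuf};

/// Default location inside the container.
pub const DEFAULT_EVENTS_DB: &str = "/state/hyperhive-events.sqlite";

/// Resolve the db path; `override_value` is the value of
/// `HYPERHIVE_EVENTS_DB`, if the variable is set.
pub fn events_db_path(override_value: Option<&str>) -> PathBuf {
    override_value
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from(DEFAULT_EVENTS_DB))
}

/// Hypothetical persistence mode chosen at startup.
#[derive(Debug, PartialEq)]
pub enum Store {
    /// Events are persisted to this file.
    File(PathBuf),
    /// Open failed: events are broadcast only.
    NoStore,
}

/// Try to open the store; degrade to `NoStore` instead of returning an error.
/// (A plain file open stands in for the real sqlite open here.)
pub fn open_store(path: &Path) -> Store {
    match OpenOptions::new().create(true).append(true).open(path) {
        Ok(_) => Store::File(path.to_path_buf()),
        Err(_) => Store::NoStore,
    }
}
```

A caller would typically pass `std::env::var("HYPERHIVE_EVENTS_DB").ok().as_deref()` to `events_db_path` and carry on with whichever `Store` comes back.
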
### `/state/hyperhive-model` (per agent)

Single-line text file holding the claude model name currently
selected for this agent (default `haiku` when absent). Written by
`Bus::set_model` whenever the operator flips it via
`/model <name>` in the web terminal. Read once at harness boot in
`Bus::new`. Path overridable via `HYPERHIVE_MODEL_FILE`.
Survives destroy/recreate, gone on `--purge`.

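Reading and writing the model file might look like the sketch below. The function names are hypothetical, and treating an empty or unreadable file the same as an absent one is an assumption; the source only specifies the `haiku` default for the absent case.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Model used when no selection has been stored.
pub const DEFAULT_MODEL: &str = "haiku";

/// Read the selected model from a single-line text file.
/// Falls back to the default if the file is absent, unreadable, or empty
/// (the last two cases are assumptions of this sketch).
pub fn read_model(path: &Path) -> String {
    match fs::read_to_string(path) {
        Ok(contents) => {
            let line = contents.lines().next().unwrap_or("").trim();
            if line.is_empty() {
                DEFAULT_MODEL.to_string()
            } else {
                line.to_string()
            }
        }
        Err(_) => DEFAULT_MODEL.to_string(),
    }
}

/// Persist the selected model as one line of text.
pub fn write_model(path: &Path, model: &str) -> io::Result<()> {
    fs::write(path, format!("{model}\n"))
}
```
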
## State dirs (per agent)

Under `/var/lib/hyperhive/agents/<name>/`:

- `config/` — the proposed nix repo (manager-editable).
- `claude/` — claude OAuth credentials, bind-mounted RW to
  `/root/.claude` inside the container.
- `state/` — durable notes + the `hyperhive-events.sqlite` db,
  bind-mounted to `/state` inside the container.

Under `/var/lib/hyperhive/applied/<name>/` — the hive-c0re-only
applied repo. Tracks `flake.nix` (module-only boilerplate; never
edited after first spawn) + `agent.nix` (the actual config; the
manager's edits land here via the approval flow) + any other
files the manager committed. `.git/` carries the proposal /
approved / building / deployed / failed / denied tag history.

Under `/var/lib/hyperhive/meta/` — the swarm-wide deploy flake.
Single repo for the whole host; `flake.nix` declares one input
per agent + one `nixosConfigurations.<name>` output per agent;
`flake.lock` is the canonical "what's deployed where." The git
log is the deploy audit trail (one commit per successful
deploy or hyperhive bump). Manager has this RO-mounted at
`/meta/`.

Marker file `/var/lib/hyperhive/.meta-migration-done` is
written by the startup migration after every container has
been repointed at `meta#<name>`. Removing it forces a re-run on
next hive-c0re start. The re-run is idempotent: only the actual
repoint step would re-fire.

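The marker-file pattern above can be sketched as a small run-once guard. This is an illustration under assumptions: `run_once` is a hypothetical helper, and the closure stands in for the real migration, which is not shown in this document.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Run `migrate` only if `marker` does not exist yet, then write the marker.
/// Returns `Ok(true)` if the migration ran and `Ok(false)` if it was skipped.
///
/// The marker is written only after `migrate` succeeds, so a failed run is
/// retried on the next start.
pub fn run_once<F>(marker: &Path, migrate: F) -> io::Result<bool>
where
    F: FnOnce() -> io::Result<()>,
{
    if marker.exists() {
        return Ok(false);
    }
    migrate()?;
    fs::write(marker, b"")?;
    Ok(true)
}
```

Deleting the marker makes the next call run the closure again, which is safe only because the migration itself is idempotent.
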
## Destroy vs purge

- `DESTR0Y` (default) — stops + removes the nspawn container,
  drops the systemd drop-in, fails any pending approvals. State
  dirs stay put; the agent appears in the dashboard's K3PT ST4T3
  section as a tombstone with `⊕ R3V1V3` and `PURG3` actions.
  `R3V1V3` queues a Spawn approval that reuses the kept state on
  approve (no re-login).
- `PURG3` (opt-in via the dashboard button or
  `hive-c0re destroy --purge <name>`) — DESTR0Y plus wipes
  `/var/lib/hyperhive/{agents,applied}/<name>/`. Config history,
  claude creds, `/state/` notes, and the events db are all gone.
  No undo.

The manager is non-destroyable from both paths (declarative
container; would fight with the host's NixOS config).

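To make the difference concrete, the directories a purge removes (and a plain destroy leaves in place) can be computed as below. `purge_targets` is a hypothetical helper for illustration; it only builds the paths and does not delete anything.

```rust
use std::path::{Path, PathBuf};

/// Directories wiped by a purge of agent `name` under the state root
/// (`/var/lib/hyperhive` on a real host). A plain destroy leaves both alone.
pub fn purge_targets(root: &Path, name: &str) -> Vec<PathBuf> {
    ["agents", "applied"]
        .iter()
        .map(|sub| root.join(sub).join(name))
        .collect()
}
```

For an agent called `alice`, this yields the `agents/alice` and `applied/alice` directories under the root, which together hold the config history, credentials, notes, and events db listed above.
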
## Run-time dirs

`/run/hyperhive/` is tmpfs-backed (systemd `RuntimeDirectory=`) but
preserved across hive-c0re restarts via `RuntimeDirectoryPreserve=yes`.
Without that, every restart wipes bind sources and existing
containers can't be started.

- `/run/hyperhive/host.sock` — admin socket (host-side CLI).
- `/run/hyperhive/manager/mcp.sock` — manager-privileged socket.
- `/run/hyperhive/agents/<name>/mcp.sock` — per-sub-agent socket
  (bind-mounted into the container as `/run/hive/mcp.sock`).

On startup, `Coordinator::register_agent` drops any prior socket
task before rebinding — idempotent so a hive-c0re restart followed
by `rebuild alice` recreates the agent's socket without a clean
reinstall.
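
An idempotent rebind of a Unix socket can be sketched with the standard library alone. This is an analogy rather than the `register_agent` implementation: the real code drops a prior socket task, whereas this sketch (with the hypothetical name `rebind_socket`) removes a stale socket file before binding, which gives the same "safe to call twice" property at the filesystem level.

```rust
use std::fs;
use std::io;
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Bind a Unix socket at `path`, replacing any socket file left over from a
/// previous run. Safe to call repeatedly for the same path.
pub fn rebind_socket(path: &Path) -> io::Result<UnixListener> {
    // Make sure the parent directory exists.
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    // Remove a stale socket file; a missing file is not an error.
    match fs::remove_file(path) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    UnixListener::bind(path)
}
```

Without the removal step, the second `bind` on the same path would fail with "address in use", which is the failure mode the idempotent registration avoids.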