hyperhive/docs/persistence.md
müde 75e7faff0c docs: full sync ahead of compaction + config-management overhaul
readme: manager mcp surface picks up update; operator-surface
recap mentions /model + last-turn + model chip + the three
collapsibles (inbox / journald / agent.nix).

web-ui.md: details-restore-key story under shape; port-conflict
banner mention on containers; agent.nix viewer alongside journald;
notifications use per-event tags + console.debug log on
block/show; deny endpoint takes note=<reason>; data-prompt /
data-prompt-field generalisation noted.

conventions.md: data-prompt and snapshot/restoreOpenDetails added
to the async-forms section.

persistence.md: operator_questions row picks up deadline_at (ttl)
column with a migration note.

todo.md: new 'Bugs' section captures the manager-question
not-rendering issue with three suspect paths to chase.

claude.md scratchpad rewritten as a clean handoff for the
compaction + the upcoming config-git overhaul. flags the
two-repo (proposed/ + applied/) split as the thing to
reconsider.
2026-05-15 22:12:40 +02:00

105 lines
4.1 KiB
Markdown

# Persistence + retention
Where state lives, what survives what, and how it's bounded.
## Two sqlite databases
### `/var/lib/hyperhive/broker.sqlite` (host)
Three tables, all in one file:
- `messages` — every inter-agent / operator-bound message.
`sender / recipient / body / sent_at / delivered_at`.
- `approvals` — the queue. `agent / kind (apply_commit | spawn) /
commit_ref / requested_at / status / resolved_at / note`.
- `operator_questions` — `ask_operator` queue.
`asker / question / options_json / multi / asked_at /
deadline_at (ttl) / answered_at / answer`. Migrated via
`ALTER TABLE ADD COLUMN` against `pragma_table_info`.
Retention:
- `Broker::vacuum_delivered` runs hourly via a tokio task in
`hive-c0re::main`. Drops delivered rows older than 30 days.
Undelivered rows are always kept (still in flight).
- Approvals and questions are kept indefinitely — both are
audit trails. `actions::destroy` and answered questions stay
visible to anything that queries by id.
### `/state/hyperhive-events.sqlite` (per agent)
Lives inside each container's bind-mounted `/state/` dir (host
path: `/var/lib/hyperhive/agents/<name>/state/hyperhive-events.sqlite`).
One table:
- `events(id, ts, kind, payload_json)` — every `LiveEvent` the
harness emits during turn loop execution.
The harness writes; the host vacuums. `hive-c0re::events_vacuum`
runs hourly and sweeps every existing agent state dir, applying the
same two-stage delete to each file: drop rows older than 7 days,
then trim to the 2000 most-recent. Centralising retention on the
host means a misbehaving harness can't disable its own vacuum and
agents don't need any cleanup wiring of their own.
Path overridable via `HYPERHIVE_EVENTS_DB` (for dev / no-`/state`
setups). On open failure the `Bus` falls back to no-store mode
rather than crashing the harness — events still broadcast over SSE,
just nothing persisted.
### `/state/hyperhive-model` (per agent)
Single-line text file holding the claude model name currently
selected for this agent (default `haiku` when absent). Written by
`Bus::set_model` whenever the operator flips it via `/model
<name>` in the web terminal. Read once at harness boot in
`Bus::new`. Path overridable via `HYPERHIVE_MODEL_FILE`.
Survives destroy/recreate, gone on `--purge`.
## State dirs (per agent)
Under `/var/lib/hyperhive/agents/<name>/`:
- `config/` — the proposed nix repo (manager-editable).
- `claude/` — claude OAuth credentials, bind-mounted RW to
`/root/.claude` inside the container.
- `state/` — durable notes + the events.sqlite db, bind-mounted
to `/state` inside the container.
Under `/var/lib/hyperhive/applied/<name>/` — the hive-c0re-only
applied repo (`flake.nix` + `agent.nix`) that the container
actually builds from.
## Destroy vs purge
- `DESTR0Y` (default) — stops + removes the nspawn container,
drops the systemd drop-in, fails any pending approvals. State
dirs stay put; the agent appears in the dashboard's K3PT ST4T3
section as a tombstone with `⊕ R3V1V3` and `PURG3` actions.
`R3V1V3` queues a Spawn approval that reuses the kept state on
approve (no re-login).
- `PURG3` (opt-in via the dashboard button or
`hive-c0re destroy --purge <name>`) — DESTR0Y plus wipes
`/var/lib/hyperhive/{agents,applied}/<name>/`. Config history,
claude creds, /state/ notes, and the events db are all gone.
No undo.
The manager is non-destroyable from both paths (declarative
container; would fight with the host's NixOS config).
## Run-time dirs
`/run/hyperhive/` is tmpfs-backed (systemd `RuntimeDirectory=`) but
preserved across hive-c0re restarts via `RuntimeDirectoryPreserve=yes`.
Without that, every restart wipes bind sources and existing
containers can't be started.
- `/run/hyperhive/host.sock` — admin socket (host-side CLI).
- `/run/hyperhive/manager/mcp.sock` — manager-privileged socket.
- `/run/hyperhive/agents/<name>/mcp.sock` — per-sub-agent socket
(bind-mounted into the container as `/run/hive/mcp.sock`).
On startup, `Coordinator::register_agent` drops any prior socket
task before rebinding — idempotent so a hive-c0re restart followed
by `rebuild alice` recreates the agent's socket without a clean
reinstall.