docs: tag-driven config-apply plan + migration story

scratchpad in claude.md marks this as in-flight; docs/approvals.md
gets the new tag state machine (proposal/approved/building/deployed/
failed/denied) and the manager applied.git read-only mount. todo
picks up the unprivileged-containers git-identity caveat and a web
ui for config repos as a downstream follow-up.
müde 2026-05-15 22:43:47 +02:00
parent 75e7faff0c
commit 497cd15137
3 changed files with 118 additions and 28 deletions


@@ -8,15 +8,29 @@ happens after a decision lands.
## End-to-end approval flow
1. Manager edits `/agents/<name>/config/agent.nix` (bind-mounted
from the host's per-agent `proposed` repo) and commits.
1. Manager edits files under `/agents/<name>/config/` (any tracked
path, but `agent.nix` is the contract entry point) and commits
with its own git identity.
2. Manager submits the commit sha via `request_apply_commit(agent,
commit_ref)`.
3. Operator sees the diff on the dashboard, clicks ◆ APPR0VE (or
3. **hive-c0re immediately fetches that commit from the proposed
repo into the applied repo and tags it `proposal/<id>`.** The
approval row stores both the manager-supplied sha and the
canonical hive-c0re-vouched sha. From here on the proposed
repo is irrelevant for this approval — the manager can amend,
force-push, or `rm -rf` the proposed repo and the queued
approval still points at an immutable git object inside
applied.
4. Operator sees the diff on the dashboard, clicks ◆ APPR0VE (or
`hive-c0re approve <id>` on the CLI).
4. hive-c0re reads the file at that sha from `proposed`, applies
into `applied`, commits there, runs `nixos-container update`.
5. `HelperEvent::ApprovalResolved` lands in the manager's inbox.
5. hive-c0re moves the working tree to `proposal/<id>` and runs
the build under a sequence of tags (see below). On success,
`applied/main` fast-forwards to the proposal commit. On
failure, main stays put and the working tree resets back to
the previous deployed commit.
6. `HelperEvent::ApprovalResolved` (and `Rebuilt` for the
ApplyCommit kind) land in the manager's inbox, carrying both
the canonical sha and the terminal tag.
`Spawn` approvals follow the same shape but skip the commit-diff
step — the operator just sees the name. On approve, hive-c0re
@@ -26,27 +40,80 @@ shows a spinner.
## Two repos per agent
```
/var/lib/hyperhive/agents/<name>/config/ proposed
└── agent.nix # the only file the
# manager can change
# (initial commit by
# hive-c0re on first
# spawn, never touched
# again).
/var/lib/hyperhive/agents/<name>/config/ proposed — manager RW
└── <anything> # any files the manager
# wants in the commit.
# agent.nix is the
# convention entry
# point; flake.nix is
# generated and not
# tracked here.
/var/lib/hyperhive/applied/<name>/ applied — hive-c0re-only
├── flake.nix # auto-generated
└── agent.nix # overwritten by approve
# from the proposed commit
/var/lib/hyperhive/applied/<name>/ applied — core-only
├── .git/ # tag-rich history
├── .gitignore # ignores flake.nix
├── flake.nix # hive-c0re-generated,
│ # untracked, rewritten
│ # on spawn/rebuild only
├── agent.nix # working tree of main
└── <other manager files> # also tracked
```
The container's `--flake` ref is `<applied_dir>#default`. The flake
extends `hyperhive.nixosConfigurations.{agent-base|manager}` with
Why two physical repos: the manager's `/agents/<n>/config/` is
RW — a buggy or hostile agent can `git clean -fdx` its own
proposed tree. The applied repo is never bind-mounted (except
the read-only `.git` exposure described below) so a destructive
move inside the container cannot reach it.
The container's `--flake` ref is `<applied_dir>#default`. The
generated `flake.nix` extends
`hyperhive.nixosConfigurations.{agent-base|manager}` with
`./agent.nix` plus an inline module setting
`programs.git.config.user` (committer identity = the agent's name)
and `systemd.services.<harness>.environment` (`HIVE_PORT`,
`HIVE_LABEL`, `HIVE_DASHBOARD_PORT`).
### Tag state machine
Every approval id walks through a fixed set of tags on the
underlying commit inside the applied repo:
| Tag | When | Annotated? |
|---|---|---|
| `proposal/<id>` | request_apply_commit, after fetch | no |
| `approved/<id>` | operator approve | no |
| `building/<id>` | rebuild started | no |
| `deployed/<id>` | rebuild succeeded — `main` ff's here | no |
| `failed/<id>` | rebuild failed | yes (body = error) |
| `denied/<id>` | operator deny | yes (body = operator note) |
`applied/main` is always the latest `deployed/*`. `denied/` and
`failed/` are terminal; the manager submits a new commit + new
approval id to retry. Because every tag keeps its commit
reachable, denied and failed trees stay browsable for as long as
the tag exists; `git log --tags` in the applied repo is the audit
trail.
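
The tag walk can be modelled as a small transition table. The Python sketch below is illustrative only: `NEXT_TAGS`, `AppliedRepo`, and `advance` are hypothetical names, not hive-c0re code. It assumes the order proposal, approved, building, then deployed or failed, with `denied` reachable only from `proposal`; that is one reading of the table, and the real implementation may differ.

```python
from dataclasses import dataclass, field

# Allowed next tag kinds per state. Assumed from the table above,
# not taken from the hive-c0re source.
NEXT_TAGS = {
    None: {"proposal"},
    "proposal": {"approved", "denied"},
    "approved": {"building"},
    "building": {"deployed", "failed"},
    "deployed": set(),   # terminal
    "failed": set(),     # terminal
    "denied": set(),     # terminal
}


@dataclass
class AppliedRepo:
    main: str                                  # sha that main points at
    tags: dict = field(default_factory=dict)   # "kind/id" -> sha
    state: dict = field(default_factory=dict)  # approval id -> latest tag kind

    def advance(self, approval_id: int, kind: str, sha: str) -> None:
        """Tag `sha` as `kind/<approval_id>` if the transition is allowed."""
        current = self.state.get(approval_id)
        if kind not in NEXT_TAGS[current]:
            raise ValueError(
                f"{current} -> {kind} is not allowed for approval {approval_id}"
            )
        self.tags[f"{kind}/{approval_id}"] = sha
        self.state[approval_id] = kind
        if kind == "deployed":
            # main only moves on a successful rebuild
            self.main = sha
```

In this model a failed build leaves `main` where it was, and retrying means calling `advance` with a new approval id, since `failed` and `denied` have no outgoing transitions.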
### Manager view of applied
`/agents/<n>/applied.git` is a **read-only bind-mount** of
`/var/lib/hyperhive/applied/<n>/.git` inside the manager
container. The manager fetches tags into its proposed clone
(`git fetch /agents/<n>/applied.git refs/tags/*:refs/tags/applied/*`)
and can then `git show` any deployed / failed / denied tree to see what
actually shipped, what error blocked the last build, or what
note the operator left on a denial. The RO mount means git
plumbing inside the manager cannot corrupt the applied repo.
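
The refspec `refs/tags/*:refs/tags/applied/*` copies each tag from the applied repo into an `applied/` namespace in the proposed clone. A minimal Python sketch of that renaming, for illustration only: `map_ref` is a hypothetical helper, not part of git or hive-c0re, and it handles only the single-`*` form used here.

```python
def map_ref(refspec: str, ref: str):
    """Return the local ref a single-glob refspec maps `ref` to.

    Returns None when `ref` does not match the source side.
    """
    src, dst = refspec.lstrip("+").split(":", 1)
    src_prefix, src_suffix = src.split("*", 1)
    if len(ref) < len(src_prefix) + len(src_suffix):
        return None
    if not (ref.startswith(src_prefix) and ref.endswith(src_suffix)):
        return None
    # The part of the ref name that the `*` matched.
    matched = ref[len(src_prefix):len(ref) - len(src_suffix)]
    return dst.replace("*", matched, 1)


spec = "refs/tags/*:refs/tags/applied/*"
print(map_ref(spec, "refs/tags/failed/7"))  # tag for a made-up approval id 7
print(map_ref(spec, "refs/heads/main"))     # branches are not matched
```

After such a fetch, the failed build for the made-up approval id 7 would be addressable in the proposed clone as the tag `applied/failed/7`.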
## Migration from the pre-tag scheme
There is no in-place migration. Each existing agent must be
purged and re-spawned: `hive-c0re destroy --purge <name>` (or
PURG3 on the dashboard), then `request_spawn` and the operator
approves the fresh agent. The new agent starts with `deployed/0`
seeded by hive-c0re; the manager's first config edit becomes
`proposal/1` and walks the tag scheme from there. Pre-overhaul
tombstones lose their config history.
## Manager (`hm1nd`) is hive-c0re-managed
The manager container runs through the **same lifecycle as