From 497cd15137346c9834bb633a3898ef850e85679a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?m=C3=BCde?=
Date: Fri, 15 May 2026 22:43:47 +0200
Subject: [PATCH] docs: tag-driven config-apply plan + migration story

scratchpad in claude.md marks this as in-flight; docs/approvals.md
gets the new tag state machine (proposal/approved/building/deployed/
failed/denied) and the manager applied.git read-only mount. todo
picks up the unprivileged-containers git-identity caveat and a web
ui for config repos as a downstream follow-up.
---
 CLAUDE.md         |  27 ++++++++----
 TODO.md           |  14 ++++++-
 docs/approvals.md | 105 +++++++++++++++++++++++++++++++++++++---------
 3 files changed, 118 insertions(+), 28 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index cbf818c..aac43ea 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -114,14 +114,25 @@ read them à la carte.
 In-flight or recent context that hasn't earned a section yet. Prune
 freely.
 
-- **Imminent:** overhaul the git management of agent configs.
-  Current shape: per-agent `proposed/` repo the manager edits
-  + `applied/` repo hive-c0re owns, with `request_apply_commit`
-  shuttling commits between them. Pre-compact note: keep an eye
-  on whether the two-repo split is still the right shape, or if
-  a single repo with `proposed/` and `applied/` branches (or a
-  shared bare repo per agent with refs/proposed and refs/applied)
-  would simplify the diff / approve / apply path.
+- **In flight:** tag-driven config-apply overhaul. Keep the
+  two-repo split (proposed = manager RW, applied = core-only)
+  for safety — agent can rm -rf its own repo but never reaches
+  applied. New flow: at `request_apply_commit` time hive-c0re
+  fetches the manager's commit into applied and tags it
+  `proposal/`; the manager's repo is then dead to core for
+  that approval. Approve/deny/build are encoded as more tags
+  (`approved/`, `building/`, `deployed/`, `failed/`, `denied/`)
+  on the same commit; `applied/main` only fast-forwards on
+  `deployed/`. Failure tags are annotated with the build error;
+  deny tags with the operator note. Manager gets `applied/.git`
+  bind-mounted RO at `/agents//applied.git` so it can `git
+  show` deployed/failed/denied trees and diff against its own
+  working tree. agent.nix stays the entry point but arbitrary
+  files in the manager's commit are now preserved; `flake.nix`
+  becomes hive-c0re-generated, gitignored, regenerated only on
+  spawn/rebuild. Migration: no in-place. Each existing agent
+  needs `destroy --purge` + re-spawn; tombstones lose their
+  history. See `docs/approvals.md` for the tag state machine.
 - **Recent (since last compaction):** inline +/- diffs on
   Write/Edit, send full body via collapsed details, operator
   cancel + ttl on questions, deny-with-reason, dashboard
diff --git a/TODO.md b/TODO.md
index c38ea06..6e06fc7 100644
--- a/TODO.md
+++ b/TODO.md
@@ -21,7 +21,12 @@ Pick anything from here when relevant. Cross-cutting design notes live in
   nixos-container equivalent) so uid 0 inside maps to an unprivileged uid
   on the host, and a container-root compromise lands the attacker on an
   ordinary user account, not the host's root. Requires per-agent state
-  dirs to be chown'd to that uid on the host side.
+  dirs to be chown'd to that uid on the host side. The per-agent git
+  identity (currently injected via `programs.git.config.user` against
+  the root user in `setup_applied`'s generated flake) also needs to be
+  provisioned for whatever non-root user claude runs as, or commits
+  the manager makes against `/agents//config` will fall back to a
+  generic `nixos@…` identity.
 - **Bash command allow-list.** Replace the blanket `Bash` allow with a
   pattern allow-list (`Bash(git *)`, `Bash(nix build .*)`, etc.) per
   claude-code's `--allowedTools` extended grammar. Likely lives in
@@ -64,6 +69,13 @@ Pick anything from here when relevant. Cross-cutting design notes live in
 
 ## UI / UX
 
+- **Web UI for config repos.** Browse history, diffs, tags
+  (proposed + approval/* + applied/*) per agent, all from the
+  dashboard. Something lighter than a full forge — read-only
+  log + diff + raw-file view is enough. Pairs naturally with
+  the upcoming config-repo overhaul (tags become the audit
+  trail; UI surfaces them).
+
 - **xterm.js terminal** embedded per-agent, attached to a PTY exposed by
   the harness. Pairs well with the unprivileged-container work — would let
   the operator drop into the container without `nixos-container root-login`.
diff --git a/docs/approvals.md b/docs/approvals.md
index f892838..0de48aa 100644
--- a/docs/approvals.md
+++ b/docs/approvals.md
@@ -8,15 +8,29 @@ happens after a decision lands.
 
 ## End-to-end approval flow
 
-1. Manager edits `/agents//config/agent.nix` (bind-mounted
-   from the host's per-agent `proposed` repo) and commits.
+1. Manager edits files under `/agents//config/` (any tracked
+   path, but `agent.nix` is the contract entry point) and commits
+   with its own git identity.
 2. Manager submits the commit sha via
    `request_apply_commit(agent, commit_ref)`.
-3. Operator sees the diff on the dashboard, clicks ◆ APPR0VE (or
+3. **hive-c0re immediately fetches that commit from the proposed
+   repo into the applied repo and tags it `proposal/`.** The
+   approval row stores both the manager-supplied sha and the
+   canonical hive-c0re-vouched sha. From here on the proposed
+   repo is irrelevant for this approval — the manager can amend,
+   force-push, or `rm -rf` the proposed repo and the queued
+   approval still points at an immutable git object inside
+   applied.
+4. Operator sees the diff on the dashboard, clicks ◆ APPR0VE (or
    `hive-c0re approve ` on the CLI).
-4. hive-c0re reads the file at that sha from `proposed`, applies
-   into `applied`, commits there, runs `nixos-container update`.
-5. `HelperEvent::ApprovalResolved` lands in the manager's inbox.
+5. hive-c0re moves the working tree to `proposal/` and runs
+   the build under a sequence of tags (see below). On success,
+   `applied/main` fast-forwards to the proposal commit. On
+   failure, main stays put and the working tree resets back to
+   the previous deployed commit.
+6. `HelperEvent::ApprovalResolved` (and `Rebuilt` for the
+   ApplyCommit kind) land in the manager's inbox, carrying both
+   the canonical sha and the terminal tag.
 
 `Spawn` approvals follow the same shape but skip the commit-diff
 step — the operator just sees the name. On approve, hive-c0re
@@ -26,27 +40,80 @@ shows a spinner.
 ## Two repos per agent
 
 ```
-/var/lib/hyperhive/agents//config/     proposed
-└── agent.nix                          # the only file the
-                                       # manager can change
-                                       # (initial commit by
-                                       # hive-c0re on first
-                                       # spawn, never touched
-                                       # again).
+/var/lib/hyperhive/agents//config/     proposed — manager RW
+└──                                    # any files the manager
+                                       # wants in the commit.
+                                       # agent.nix is the
+                                       # convention entry
+                                       # point; flake.nix is
+                                       # generated and not
+                                       # tracked here.
 
-/var/lib/hyperhive/applied//           applied — hive-c0re-only
-├── flake.nix                          # auto-generated
-└── agent.nix                          # overwritten by approve
-                                       # from the proposed commit
+/var/lib/hyperhive/applied//           applied — core-only
+├── .git/                              # tag-rich history
+├── .gitignore                         # ignores flake.nix
+├── flake.nix                          # hive-c0re-generated,
+│                                      # untracked, rewritten
+│                                      # on spawn/rebuild only
+├── agent.nix                          # working tree of main
+└──                                    # also tracked
 ```
 
-The container's `--flake` ref is `#default`. The flake
-extends `hyperhive.nixosConfigurations.{agent-base|manager}` with
+Why two physical repos: the manager's `/agents//config/` is
+RW — a buggy or hostile agent can `git clean -fdx` its own
+proposed tree. The applied repo is never bind-mounted (except
+the read-only `.git` exposure described below) so a destructive
+move inside the container cannot reach it.
+
+The container's `--flake` ref is `#default`. The
+generated `flake.nix` extends
+`hyperhive.nixosConfigurations.{agent-base|manager}` with
 `./agent.nix` plus an inline module setting `programs.git.config.user`
 (committer identity = the agent's name) and
 `systemd.services..environment` (`HIVE_PORT`, `HIVE_LABEL`,
 `HIVE_DASHBOARD_PORT`).
 
+### Tag state machine
+
+Every approval id walks through a fixed set of tags on the
+underlying commit inside the applied repo:
+
+| Tag | When | Annotated? |
+|---|---|---|
+| `proposal/` | request_apply_commit, after fetch | no |
+| `approved/` | operator approve | no |
+| `building/` | rebuild started | no |
+| `deployed/` | rebuild succeeded — `main` ff's here | no |
+| `failed/` | rebuild failed | yes (body = error) |
+| `denied/` | operator deny | yes (body = operator note) |
+
+`applied/main` is always the latest `deployed/*`. `denied/` and
+`failed/` are terminal; the manager submits a new commit + new
+approval id to retry. Because tags are first-class git objects,
+rejected and failed trees stay browsable forever — `git log
+--tags` in the applied repo is the audit trail.
+
+### Manager view of applied
+
+`/agents//applied.git` is a **read-only bind-mount** of
+`/var/lib/hyperhive/applied//.git` inside the manager
+container. The manager fetches tags into its proposed clone
+(`git fetch /agents//applied.git refs/tags/*:refs/tags/applied/*`)
+and `git show` any deployed / failed / denied tree to see what
+actually shipped, what error blocked the last build, or what
+note the operator left on a denial. The RO mount means git
+plumbing inside the manager cannot corrupt the applied repo.
+
+## Migration from the pre-tag scheme
+
+There is no in-place migration. Each existing agent must be
+purged and re-spawned: `hive-c0re destroy --purge ` (or
+PURG3 on the dashboard), then `request_spawn` and the operator
+approves the fresh agent. The new agent starts with `deployed/0`
+seeded by hive-c0re; the manager's first config edit becomes
+`proposal/1` and walks the tag scheme from there. Pre-overhaul
+tombstones lose their config history.
+
 ## Manager (`hm1nd`) is hive-c0re-managed
 
 The manager container runs through the **same lifecycle as