# Approvals + manager + helper events

The approval queue is hyperhive's pivot: nothing that changes the shape of an agent (its config, whether it exists) happens without an operator click. The manager (`hm1nd`) is the policy gate in front of that queue; helper events are how it stays informed about what happens after a decision lands.

## End-to-end approval flow

1. Manager edits files under `/agents//config/` (any tracked path, but `agent.nix` is the contract entry point) and commits with its own git identity.
2. Manager submits the commit sha via `request_apply_commit(agent, commit_ref)`.
3. **hive-c0re immediately fetches that commit from the proposed repo into the applied repo and tags it `proposal/`.** The approval row stores both the manager-supplied sha and the canonical hive-c0re-vouched sha. From here on the proposed repo is irrelevant for this approval — the manager can amend, force-push, or `rm -rf` the proposed repo and the queued approval still points at an immutable git object inside applied.
4. Operator sees the diff on the dashboard, clicks ◆ APPR0VE (or `hive-c0re approve ` on the CLI).
5. hive-c0re moves the working tree to `proposal/` and runs the build under a sequence of tags (see below). On success, `applied/main` fast-forwards to the proposal commit. On failure, main stays put and the working tree resets back to the previous deployed commit.
6. `HelperEvent::ApprovalResolved` (and `Rebuilt` for the ApplyCommit kind) land in the manager's inbox, carrying both the canonical sha and the terminal tag.

`Spawn` approvals follow the same shape but skip the commit-diff step — the operator just sees the name. On approve, hive-c0re creates the container in a background task while the dashboard shows a spinner.

## Meta flake

The hive-c0re-owned repo at `/var/lib/hyperhive/meta/` declares one flake input per agent (`agent-.url = "git+file:///var/lib/hyperhive/applied/"`) and one `nixosConfigurations.` output per agent.
Each output wraps `inputs.agent-.nixosModules.default` with the identity + `HIVE_PORT` / `HIVE_LABEL` / `HIVE_DASHBOARD_PORT` injection module that `setup_applied` used to generate inline. Containers run against `--flake /var/lib/hyperhive/meta#`.

Per-deploy lock flow (two-phase, owned by `actions::run_apply_commit` → `meta::{prepare,finalize,abort}_deploy`):

1. `meta::prepare_deploy(name)` runs `nix flake lock --update-input agent-` without committing. Working tree of meta now points the input at `applied//main` (which `run_apply_commit` already fast-forwarded to `proposal/`).
2. `lifecycle::rebuild_no_meta` runs `nixos-container update --flake meta#`. Nix evaluates against the staged lock.
3. On success — `meta::finalize_deploy(name, sha, "deployed/ ")` stages `flake.lock` and commits with `deploy deployed/ `. Meta's git log gains one entry per successful deploy.
4. On failure — `meta::abort_deploy()` runs `git restore flake.lock` so the meta history shows only successes; the failure stays as an annotated `failed/` tag in `applied/`.

Single-phase variants exist for paths without rollback semantics: `meta::lock_update_for_rebuild(name)` for the manual `↻ R3BU1LD` button (commits if the lock changed) and `meta::lock_update_hyperhive()` for the auto-update flake-rev bump (one shot before per-agent rebuilds, commits if the lock changed).

`meta::sync_agents(hyperhive_flake, dashboard_port, &agents)` is the idempotent reconciler called by `spawn`, `destroy`, `rebuild`, and the startup migration. Renders `flake.nix` from the agent list; if it differs from disk, runs `nix flake lock` + commits as `regenerate meta flake` (or `seed meta from N agent(s)` on the very first call).

The manager has `/meta` RO-bound inside its container: `git -C /meta log --oneline` is the swarm-wide deploy log, `cat /meta/flake.lock | jq '.nodes["agent-"].locked'` resolves which sha each agent is pinned at right now. Dashboard surfaces the same info as a `deployed:` chip per container row.
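
The two-phase lock flow above can be modelled as a small in-memory sketch. This is not the hive-c0re implementation: `MetaRepo`, its fields, `run_deploy`, and the commit message layout are hypothetical stand-ins, and the real `meta::*_deploy` functions shell out to `nix` and `git` rather than touching maps. The point is only the invariant: a failed build leaves the committed lock and the log untouched.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the meta repo.
#[derive(Default)]
struct MetaRepo {
    /// Committed flake.lock: agent name -> pinned sha.
    committed: HashMap<String, String>,
    /// Uncommitted working-tree lock, present only between the two phases.
    staged: Option<HashMap<String, String>>,
    /// One entry per successful deploy.
    log: Vec<String>,
}

impl MetaRepo {
    /// Phase 1: re-pin one agent in the working tree without committing.
    fn prepare_deploy(&mut self, name: &str, sha: &str) {
        let mut lock = self.committed.clone();
        lock.insert(name.to_string(), sha.to_string());
        self.staged = Some(lock);
    }

    /// Phase 2, success: commit the staged lock and record the deploy.
    fn finalize_deploy(&mut self, name: &str, tag: &str) {
        if let Some(lock) = self.staged.take() {
            self.committed = lock;
            // Message layout is an assumption made for this sketch.
            self.log.push(format!("deploy {name} {tag}"));
        }
    }

    /// Phase 2, failure: drop the staged lock, like `git restore flake.lock`.
    fn abort_deploy(&mut self) {
        self.staged = None;
    }
}

/// Runs `build` against the staged lock, then finalizes or aborts.
fn run_deploy(
    meta: &mut MetaRepo,
    name: &str,
    sha: &str,
    tag: &str,
    build: impl FnOnce(&HashMap<String, String>) -> Result<(), String>,
) -> Result<(), String> {
    meta.prepare_deploy(name, sha);
    let staged = meta.staged.as_ref().expect("prepare_deploy always stages a lock");
    let result = build(staged);
    if result.is_ok() {
        meta.finalize_deploy(name, tag);
    } else {
        meta.abort_deploy();
    }
    result
}
```

Passing the build step as a closure keeps the sketch testable without a Nix installation; a caller would substitute whatever actually performs the container update.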

## Two repos per agent

```
/var/lib/hyperhive/agents//config/    proposed — manager RW
└──                                   # any files the manager
                                      # wants in the commit.
                                      # agent.nix is the
                                      # convention entry
                                      # point; flake.nix is
                                      # tracked boilerplate
                                      # (manager doesn't edit
                                      # it).

/var/lib/hyperhive/applied//          applied — core-only
├── .git/                             # tag-rich history
├── flake.nix                         # tracked, fixed
│                                     # boilerplate exporting
│                                     # nixosModules.default
├── agent.nix                         # working tree of main
└──                                   # also tracked

/var/lib/hyperhive/meta/              swarm-wide flake — core
├── .git/                             # one commit per successful
│                                     # deploy
├── flake.nix                         # generated from agent set
└── flake.lock                        # pins each agent's sha
```

Why two physical repos: the manager's `/agents//config/` is RW — a buggy or hostile agent can `git clean -fdx` its own proposed tree. The applied repo is never bind-mounted (except the read-only `.git` exposure described below) so a destructive move inside the container cannot reach it.

The container's `--flake` ref is `/var/lib/hyperhive/meta#` (see "Meta flake" above). The agent's own `applied//flake.nix` is a fixed boilerplate that exports `nixosModules.default = import ./agent.nix`; the meta flake imports that module and wraps it with identity + `HIVE_PORT` / `HIVE_LABEL` / `HIVE_DASHBOARD_PORT`.

### Tag state machine

Every approval id walks through a fixed set of tags on the underlying commit inside the applied repo:

| Tag | When | Annotated? |
|---|---|---|
| `proposal/` | `request_apply_commit`, after fetch | no |
| `approved/` | operator approve | no |
| `building/` | rebuild started | no |
| `deployed/` | rebuild succeeded — `main` ff's here | no |
| `failed/` | rebuild failed | yes (body = error) |
| `denied/` | operator deny | yes (body = operator note) |

`applied/main` is always the latest `deployed/*`. `denied/` and `failed/` are terminal; the manager submits a new commit + new approval id to retry.
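
The tag table above can be read as a small state machine. The sketch below is one possible encoding, not code from hive-c0re: the `Tag` enum and its methods are hypothetical, and the allowed transitions are inferred from the "When" column rather than stated explicitly in this document. Treating `deployed/` as terminal alongside `failed/` and `denied/` is likewise an assumption.

```rust
/// Hypothetical encoding of the per-approval tag kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tag {
    Proposal,
    Approved,
    Building,
    Deployed,
    Failed,
    Denied,
}

impl Tag {
    /// Tag name prefix; the approval id would be appended to it.
    fn prefix(self) -> &'static str {
        match self {
            Tag::Proposal => "proposal/",
            Tag::Approved => "approved/",
            Tag::Building => "building/",
            Tag::Deployed => "deployed/",
            Tag::Failed => "failed/",
            Tag::Denied => "denied/",
        }
    }

    /// Only failed/ and denied/ carry a body (build error or operator note).
    fn is_annotated(self) -> bool {
        matches!(self, Tag::Failed | Tag::Denied)
    }

    /// Assumption: a tag is terminal when no further tag follows it.
    fn is_terminal(self) -> bool {
        matches!(self, Tag::Deployed | Tag::Failed | Tag::Denied)
    }

    /// Transitions inferred from the table; not an authoritative list.
    fn can_advance_to(self, next: Tag) -> bool {
        matches!(
            (self, next),
            (Tag::Proposal, Tag::Approved)
                | (Tag::Proposal, Tag::Denied)
                | (Tag::Approved, Tag::Building)
                | (Tag::Building, Tag::Deployed)
                | (Tag::Building, Tag::Failed)
        )
    }
}
```

Encoding the edges in one `matches!` keeps every legal transition visible in a single place, which makes a retry after `failed/` visibly impossible without a new approval id.
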
Because tags are first-class git objects, rejected and failed trees stay browsable forever — `git log --tags` in the applied repo is the audit trail.

### Manager view of applied + meta

The manager container gets three host-side bind mounts via `set_nspawn_flags`:

- `/var/lib/hyperhive/agents/` → `/agents/` (RW) — proposed repos. Manager edits + commits per-agent config here.
- `/var/lib/hyperhive/applied/` → `/applied/` (RO) — every agent's authoritative applied repo, including `.git`.
- `/var/lib/hyperhive/meta/` → `/meta/` (RO) — the swarm-wide deploy flake.

Each proposed repo (`/agents//config/`) is pre-configured with `applied` as a git remote pointing at `/applied//.git`. Useful incantations from inside the manager:

```sh
git -C /agents//config fetch applied
git -C /agents//config log applied/main --oneline
git -C /agents//config show applied/refs/tags/deployed/
git -C /agents//config show applied/refs/tags/failed/   # body = build error
git -C /agents//config show applied/refs/tags/denied/   # body = operator note
git -C /agents//config rebase applied/main              # base in-flight work on what's deployed
git -C /meta log --oneline                              # swarm-wide deploy history
cat /meta/flake.lock | jq '.nodes | with_entries(select(.key | startswith("agent-")))'
```

The RO binds block push at the kernel level, so the manager can only fetch / read — git plumbing inside the container cannot corrupt either authoritative repo.

## Migration from the pre-tag / pre-meta schemes

Both overhauls (tag-driven flow + meta flake) ship in-place migrations that run on every hive-c0re startup. Idempotent; each phase is a no-op once already applied. Behaviour:

- Tag-driven phase: assumes the operator ran the one-shot `git tag deployed/0 main` script (see commit history / earlier docs revisions) once per agent. Tagging is non-destructive: it doesn't touch live containers, state dirs, or claude creds.
- Meta-flake phase: rewrites each `applied//flake.nix` to the module-only boilerplate, wires the `applied` remote in each proposed repo, bootstraps the meta repo from the current agent list, and `nixos-container update`s every container at `meta#`. The expensive last step is guarded by `/var/lib/hyperhive/.meta-migration-done` so it only runs once across hive-c0re restarts. Set `HIVE_SKIP_META_MIGRATION=1` on the service to defer.

No state loss in either migration: claude creds, /state/ notes, the events DB, proposed history, and applied history all survive. The manager keeps its session; sub-agents stay logged in.

## Manager (`hm1nd`) is hive-c0re-managed

The manager container runs through the **same lifecycle as sub-agents**. On `hive-c0re serve` startup, if `hm1nd` is missing, hive-c0re creates it. The manager's flake lives at `/var/lib/hyperhive/applied/hm1nd/`; its proposed config at `/var/lib/hyperhive/agents/hm1nd/config/`. Manager can edit its own `agent.nix` (visible inside the container at `/agents/hm1nd/config/`) and submit `request_apply_commit("hm1nd", )` for operator approval.

Differences from sub-agents:

- `flake.nix` extends `hyperhive.nixosConfigurations.manager` (vs `agent-base`).
- Container name is `hm1nd` (no `h-` prefix).
- Fixed web UI port (`MANAGER_PORT = 8000`).
- `set_nspawn_flags` adds three extra binds: `/var/lib/hyperhive/agents` → `/agents` (RW) so the manager can edit per-agent proposed repos, `/var/lib/hyperhive/applied` → `/applied` (RO) so the manager can `git fetch` deployed/failed/denied tags from any agent's authoritative applied repo, and `/var/lib/hyperhive/meta` → `/meta` (RO) for the swarm-wide deploy flake (see "Manager view of applied + meta" above).
- First-deploy spawn bypasses the approval queue (manager is required infrastructure).
- Per-agent socket lives at `/run/hyperhive/manager/`, owned by `manager_server::start`.

**Migration note** (for older hosts): drop any `containers.hm1nd = { ... }` block from your host NixOS config. hyperhive creates and updates the manager itself.
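
A few of the manager-versus-sub-agent differences listed above reduce to simple predicates. The sketch below is illustrative only: the function names are hypothetical, and the exact sub-agent container naming (`h-` followed by the agent name) is an assumption drawn from the "no `h-` prefix" remark rather than from hive-c0re source.

```rust
const MANAGER_NAME: &str = "hm1nd";
const MANAGER_PORT: u16 = 8000;

fn is_manager(agent: &str) -> bool {
    agent == MANAGER_NAME
}

/// Assumption: sub-agent containers are named `h-` + agent name.
fn container_name(agent: &str) -> String {
    if is_manager(agent) {
        MANAGER_NAME.to_string()
    } else {
        format!("h-{agent}")
    }
}

/// The manager's first-deploy spawn skips the approval queue; sub-agents do not.
fn first_spawn_needs_approval(agent: &str) -> bool {
    !is_manager(agent)
}

/// Only the manager has a fixed web UI port; sub-agent port allocation is
/// not described in this document, so the sketch returns None for them.
fn fixed_web_port(agent: &str) -> Option<u16> {
    if is_manager(agent) {
        Some(MANAGER_PORT)
    } else {
        None
    }
}
```
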

## Manager policy

From `hive-ag3nt/prompts/manager.md`: the manager does NOT rubber-stamp sub-agent config requests. It verifies (role match, package legitimacy, cheaper alternative, blast radius) before committing and calling `request_apply_commit`.

For ambiguous cases or anything that needs human signal, the manager calls `ask_operator(question, options?, multi?, ttl_seconds?)` — queues the question on the dashboard and returns the id immediately. The operator's answer arrives later as `HelperEvent::OperatorAnswered` in the manager inbox. Storage is `hive-c0re::operator_questions` (sqlite); the answer flow is:

```
POST /answer-question/{id}
  → OperatorQuestions::answer
  → notify_manager(OperatorAnswered { id, question, answer })
```

Two more paths resolve a pending question with a sentinel answer:

- `POST /cancel-question/{id}` (✗ CANC3L button on the dashboard) resolves with `[cancelled]`. The manager sees a terminal state and can fall back.
- `ttl_seconds` deadline: a tokio watchdog spawned at submit time fires `answer(id, "[expired]")` once the ttl runs out. Already-resolved races no-op. The dashboard surfaces a `⏳ MM:SS` chip on each pending question with a deadline.

## Helper events to the manager

`Coordinator::notify_manager(&HelperEvent)` enqueues an inbox message from sender `system` with the event JSON in the body. The manager harness no longer short-circuits these — they drive a regular claude turn so the manager can react.

Variants (`hive_sh4re::HelperEvent`):

- `ApprovalResolved { id, agent, commit_ref, status, note }` — fired by `actions::approve` + `actions::deny` whenever an approval transitions to its terminal state.
- `Spawned { agent, ok, note }` — `actions::approve` (Spawn-kind) + admin `HostRequest::Spawn`.
- `Rebuilt { agent, ok, note }` — `auto_update::rebuild_agent` (covers startup scan + manual `/rebuild` from dashboard) + `actions::approve` (ApplyCommit).
- `Killed { agent }` — admin `HostRequest::Kill` + dashboard `/kill` + manager `Kill` MCP tool.
- `Destroyed { agent }` — `actions::destroy`.
- `ContainerCrash { agent, note }` — `crash_watch`: a previously-running container went away with no operator-initiated transient state (Stopping / Restarting / Destroying / Rebuilding). Manager can `start` it again or escalate.
- `NeedsLogin { agent }` — sub-agent has no claude session yet. Manager can't act directly (interactive OAuth); typically flags the operator.
- `LoggedIn { agent }` — sub-agent just completed login. Manager often greets the agent on this event.
- `NeedsUpdate { agent }` — sub-agent's recorded flake rev is stale. Manager calls `update(name)` to rebuild — idempotent, no approval required.
- `OperatorAnswered { id, question, answer }` — dashboard `/answer-question/{id}` after the operator submits the answer form.

To add a new event: new `HelperEvent` variant + call sites + update `prompts/manager.md` so the manager knows the new shape.

## Auto-update on startup

`hive-c0re serve` runs `auto_update::run` in a background task right after opening the coordinator. It enumerates managed containers and rebuilds any whose recorded hyperhive rev differs from the current one — sub-agents and manager go through the same `lifecycle::rebuild` path.

"Rev" = canonical filesystem path of `cfg.hyperhiveFlake`. Marker file: `/var/lib/hyperhive/applied/..hyperhive-rev`. If the flake input has no canonical path (e.g. a `github:` URL), auto-update is a no-op — rebuild manually.

The dashboard surfaces pending updates per agent: a clickable "needs update ↻" badge appears whenever the marker differs from current rev. The badge POSTs `/rebuild/`, calling the same `auto_update::rebuild_agent` path so manual triggers and the startup scan can't drift. When at least one container is stale, a top-level `↻ UPD4TE 4LL` button appears that loops over every stale container.
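
The staleness check described above amounts to comparing each agent's recorded marker with the current rev. The sketch below is a hedged illustration, not the `auto_update` module: `UpdateState`, `update_state`, and `stale_agents` are hypothetical names, and treating a missing marker as stale is an assumption this document does not confirm.

```rust
#[derive(Debug, PartialEq, Eq)]
enum UpdateState {
    /// The flake input has no canonical path (e.g. a `github:` URL).
    NoOp,
    UpToDate,
    Stale,
}

/// `current_rev` is the canonical path of the hyperhive flake, if any;
/// `marker` is the rev recorded for one agent, if a marker file exists.
fn update_state(current_rev: Option<&str>, marker: Option<&str>) -> UpdateState {
    match (current_rev, marker) {
        (None, _) => UpdateState::NoOp,
        (Some(cur), Some(rec)) if cur == rec => UpdateState::UpToDate,
        // Assumption: a missing or different marker both count as stale.
        _ => UpdateState::Stale,
    }
}

/// Names of the agents that a startup scan or an update-all loop would rebuild.
fn stale_agents<'a>(
    current_rev: Option<&str>,
    markers: &[(&'a str, Option<&'a str>)],
) -> Vec<&'a str> {
    let mut stale = Vec::new();
    for (name, marker) in markers {
        if update_state(current_rev, *marker) == UpdateState::Stale {
            stale.push(*name);
        }
    }
    stale
}
```

Keeping the comparison in one function mirrors the document's point that the startup scan and the dashboard badge share a single code path, so the two cannot disagree about which agents are stale.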