revert the earlier 'operator must set allowUnfree' move: per-agent containers evaluate their own nixpkgs and the operator's host-level allowUnfree doesn't propagate in. restoring the scoped allowUnfreePredicate inside both the claude-unstable overlay and harness-base.nix; documented in README + gotchas as 'nothing to set on the operator side'. docs: - claude.md file map adds crash_watch.rs, kick_agent on coordinator, /api/model + journald viewer + bind-with-retry references. - scratchpad rewritten to reflect the recent run. - web-ui.md: notification row + browser notifications section, state row (badge + model chip + last-turn chip + cancel button), per-agent inbox, /model slash, /cancel-question + journald endpoints, focus-preservation on refresh. - turn-loop.md: --model is read from Bus::model() per turn (runtime override via /model); recv(wait_seconds) up to 180s with the rationale; ask_operator gains ttl_seconds; new TurnState section; kick_agent inbox-on-startup hint. - approvals.md: ttl/cancel resolution paths for operator questions. - persistence.md: /state/hyperhive-model file. - gotchas.md: web UI port collision policy (rename, don't probe); bind retry + SO_REUSEADDR shape; auto-unfree restored. - todo.md: cleaned up empty sections and stale entries; /model shipped, dropped from the list.
6.8 KiB
Approvals + manager + helper events
The approval queue is hyperhive's pivot: nothing that changes the
shape of an agent (its config, whether it exists) happens without an
operator click. The manager (hm1nd) is the policy gate in front of
that queue; helper events are how it stays informed about what
happens after a decision lands.
End-to-end approval flow
- Manager edits
/agents/<name>/config/agent.nix(bind-mounted from the host's per-agentproposedrepo) and commits. - Manager submits the commit sha via
request_apply_commit(agent, commit_ref). - Operator sees the diff on the dashboard, clicks ◆ APPR0VE (or
hive-c0re approve <id>on the CLI). - hive-c0re reads the file at that sha from
proposed, applies intoapplied, commits there, runsnixos-container update. HelperEvent::ApprovalResolvedlands in the manager's inbox.
Spawn approvals follow the same shape but skip the commit-diff
step — the operator just sees the name. On approve, hive-c0re
creates the container in a background task while the dashboard
shows a spinner.
Two repos per agent
/var/lib/hyperhive/agents/<name>/config/ proposed
└── agent.nix # the only file the
# manager can change
# (initial commit by
# hive-c0re on first
# spawn, never touched
# again).
/var/lib/hyperhive/applied/<name>/ applied — hive-c0re-only
├── flake.nix # auto-generated
└── agent.nix # overwritten by approve
# from the proposed commit
The container's --flake ref is <applied_dir>#default. The flake
extends hyperhive.nixosConfigurations.{agent-base|manager} with
./agent.nix plus an inline module setting
programs.git.config.user (committer identity = the agent's name)
and systemd.services.<harness>.environment (HIVE_PORT,
HIVE_LABEL, HIVE_DASHBOARD_PORT).
Manager (hm1nd) is hive-c0re-managed
The manager container runs through the same lifecycle as
sub-agents. On hive-c0re serve startup, if hm1nd is missing,
hive-c0re creates it. The manager's flake lives at
/var/lib/hyperhive/applied/hm1nd/; its proposed config at
/var/lib/hyperhive/agents/hm1nd/config/. Manager can edit its own
agent.nix (visible inside the container at /agents/hm1nd/config/)
and submit request_apply_commit("hm1nd", <sha>) for operator
approval.
Differences from sub-agents:
flake.nixextendshyperhive.nixosConfigurations.manager(vsagent-base).- Container name is
hm1nd(noh-prefix). - Fixed web UI port (
MANAGER_PORT = 8000). set_nspawn_flagsadds an extra bind:/var/lib/hyperhive/agents→/agents(RW), so the manager can edit per-agent proposed repos.- First-deploy spawn bypasses the approval queue (manager is required infrastructure).
- Per-agent socket lives at
/run/hyperhive/manager/, owned bymanager_server::start.
Migration note (for older hosts): drop any containers.hm1nd = { ... } block from your host NixOS config. hyperhive creates and
updates the manager itself.
Manager policy
From hive-ag3nt/prompts/manager.md: the manager does NOT
rubber-stamp sub-agent config requests. It verifies (role match,
package legitimacy, cheaper alternative, blast radius) before
committing and calling request_apply_commit.
For ambiguous cases or anything that needs human signal, the
manager calls ask_operator(question, options?, multi?, ttl_seconds?) — queues the question on the dashboard and returns
the id immediately. The operator's answer arrives later as
HelperEvent::OperatorAnswered in the manager inbox. Storage is
hive-c0re::operator_questions (sqlite); the answer flow is:
POST /answer-question/{id}
→ OperatorQuestions::answer
→ notify_manager(OperatorAnswered { id, question, answer })
Two more paths resolve a pending question with a sentinel answer:
POST /cancel-question/{id}(✗ CANC3L button on the dashboard) resolves with[cancelled]. The manager sees a terminal state and can fall back.ttl_secondsdeadline: a tokio watchdog spawned at submit time firesanswer(id, "[expired]")once the ttl runs out. Already- resolved races no-op. The dashboard surfaces a⏳ MM:SSchip on each pending question with a deadline.
Helper events to the manager
Coordinator::notify_manager(&HelperEvent) enqueues an inbox
message from sender system with the event JSON in the body. The
manager harness no longer short-circuits these — they drive a
regular claude turn so the manager can react. Variants
(hive_sh4re::HelperEvent):
ApprovalResolved { id, agent, commit_ref, status, note }— fired byactions::approve+actions::denywhenever an approval transitions to its terminal state.Spawned { agent, ok, note }—actions::approve(Spawn-kind)- admin
HostRequest::Spawn.
- admin
Rebuilt { agent, ok, note }—auto_update::rebuild_agent(covers startup scan + manual/rebuildfrom dashboard) +actions::approve(ApplyCommit).Killed { agent }— adminHostRequest::Kill+ dashboard/kill+ managerKillMCP tool.Destroyed { agent }—actions::destroy.ContainerCrash { agent, note }—crash_watch: a previously- running container went away with no operator-initiated transient state (Stopping / Restarting / Destroying / Rebuilding). Manager canstartit again or escalate.OperatorAnswered { id, question, answer }— dashboard/answer-question/{id}after the operator submits the answer form.
To add a new event: new HelperEvent variant + call sites + update
prompts/manager.md so the manager knows the new shape.
Auto-update on startup
hive-c0re serve runs auto_update::run in a background task right
after opening the coordinator. It enumerates managed containers and
rebuilds any whose recorded hyperhive rev differs from the current
one — sub-agents and manager go through the same lifecycle::rebuild
path.
"Rev" = canonical filesystem path of cfg.hyperhiveFlake. Marker
file: /var/lib/hyperhive/applied/.<name>.hyperhive-rev. If the
flake input has no canonical path (e.g. a github: URL),
auto-update is a no-op — rebuild manually.
The dashboard surfaces pending updates per agent: a clickable
"needs update ↻" badge appears whenever the marker differs from
current rev. The badge POSTs /rebuild/<name>, calling the same
auto_update::rebuild_agent path so manual triggers and the
startup scan can't drift. When at least one container is stale, a
top-level ↻ UPD4TE 4LL button appears that loops over every
stale container.