hyperhive/TODO.md
müde 67e4242b9f per-agent send allow-list via hyperhive.allowedRecipients
new NixOS option in harness-base.nix:
  hyperhive.allowedRecipients = [ "alice" "manager" ];  # whitelist
  hyperhive.allowedRecipients = [ ];                    # default = unrestricted

module writes the list as JSON to /etc/hyperhive/send-allow.json
at activation. AgentServer::send reads the file before
issuing the broker request; if the list is non-empty and
`to` isn't on it, the tool returns a claude-readable refusal
string without touching the broker. the manager is always
implicitly permitted regardless of the list — otherwise a
misconfigured allow-list could strand a sub-agent without an
escalation path.
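
a minimal sketch of the check described above, assuming the list has already been parsed out of /etc/hyperhive/send-allow.json into a slice of names. `SendDecision`, `check_recipient`, the `MANAGER` constant and the refusal wording are hypothetical names for illustration, not the actual AgentServer::send code.

```rust
/// Outcome of the pre-broker check (hypothetical type).
#[derive(Debug, PartialEq)]
enum SendDecision {
    Allow,
    Refuse(String),
}

/// Assumption: the manager is addressed by the literal name "manager".
const MANAGER: &str = "manager";

/// Decide whether `to` may be messaged, before any broker request is made.
fn check_recipient(allow_list: &[String], to: &str) -> SendDecision {
    // An empty list is the default and means unrestricted.
    if allow_list.is_empty() {
        return SendDecision::Allow;
    }
    // The manager is implicitly permitted so a sub-agent always
    // keeps an escalation path.
    if to == MANAGER {
        return SendDecision::Allow;
    }
    if allow_list.iter().any(|name| name == to) {
        return SendDecision::Allow;
    }
    // Refusal text is illustrative; the real string may differ.
    SendDecision::Refuse(format!(
        "refused: '{to}' is not in hyperhive.allowedRecipients; route through the manager"
    ))
}
```

the manager check sits after the empty-list check only for readability; both orderings give the same result.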

enforcement is in the in-container MCP server (not on the
host's per-agent socket) because the agent's nix config is the
trust boundary anyway: the operator audits agent.nix at
deploy time, and the activation-time
/etc/hyperhive/send-allow.json is read-only under /nix/store,
so the agent can't tamper with it at runtime without going
through a new approval.

agent prompt mentions the option + tells claude to route
through the manager when refused. retires the matching TODO
under Permissions / policy.
2026-05-16 03:59:28 +02:00


TODO

Pick anything from here when relevant. Cross-cutting design notes live in CLAUDE.md; high-level project intro in README.md.

Security

  • Unprivileged containers (userns mapping). Today the nspawn container runs as a fully privileged root. Goal: PrivateUsersChown=yes (or the nixos-container equivalent) so uid 0 inside maps to an unprivileged uid on the host, and a container-root compromise lands the attacker on an ordinary user account, not the host's root. Requires per-agent state dirs to be chown'd to that uid on the host side. The per-agent git identity (currently injected via programs.git.config.user against the root user in setup_applied's generated flake) also needs to be provisioned for whatever non-root user claude runs as, or commits the manager makes against /agents/<n>/config will fall back to a generic nixos@… identity.
  • Bash command allow-list. Replace the blanket Bash allow with a pattern allow-list (Bash(git *), Bash(nix build .*), etc.) per claude-code's --allowedTools extended grammar. Likely lives in agent.nix so each agent can scope its own shell surface.

Operational hygiene (post-meta-flake)

  • Tag retention. Every approval mints up to 5 tags in applied/<n>/.git (proposal/, approved/, building/, deployed/, plus failed/ or denied/). Every successful deploy adds one commit to /var/lib/hyperhive/meta/.git. Both grow unbounded. A retention policy — keep all deployed/* indefinitely, age-out failed/ + denied/ after N days, drop proposal/ + approved/ + building/ once a terminal sibling lands — would keep the audit trails browsable without forever-growth.

  • Inert nix flake lock no-args call in meta::sync_agents. Still valid in current nix (resolves missing inputs without bumping existing ones) but it parallels the deprecated --update-input we just had to migrate away from. Worth keeping an eye on; if it gets renamed too, sync_agents stops being able to seed a fresh meta repo.
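
The tag retention policy above could be expressed as a single keep/drop predicate. This is a sketch under assumptions: `TagKind`, `keep_tag` and the `max_age_days` parameter are hypothetical, and "terminal sibling" is taken to mean a deployed/, failed/ or denied/ tag for the same approval.

```rust
/// Tag families minted per approval in applied/<n>/.git.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TagKind {
    Proposal,
    Approved,
    Building,
    Deployed,
    Failed,
    Denied,
}

/// Returns true if the tag should be kept.
/// `has_terminal_sibling`: a deployed/, failed/ or denied/ tag exists
/// for the same approval (assumed meaning of "terminal sibling").
/// `max_age_days`: the N from the policy, left to the operator.
fn keep_tag(kind: TagKind, age_days: u64, has_terminal_sibling: bool, max_age_days: u64) -> bool {
    match kind {
        // Keep every deployed/* tag indefinitely.
        TagKind::Deployed => true,
        // Age out failed/ and denied/ after N days.
        TagKind::Failed | TagKind::Denied => age_days <= max_age_days,
        // Intermediate tags are only useful while the approval is in flight.
        TagKind::Proposal | TagKind::Approved | TagKind::Building => !has_terminal_sibling,
    }
}
```

Keeping the policy as a pure function would let a pruning job be dry-run against the existing tags before anything is deleted.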

Bugs

  • Pending question doesn't always appear on the dashboard. Repro: manager calls ask_operator, tool result is question queued (id=N) (so the row is in sqlite), but the M1ND H4S QU3STI0NS section keeps showing "no pending questions". Last seen with id=5. Diagnostic step landed: api_state now warn-logs (target=api_state) when any of its source queries fail instead of silently unwrap_or_default-ing — next repro should print the underlying error in journald and tell us whether this is sqlite (likely OperatorQuestions::pending() row-decode panic on a migrated column) or dashboard-JS-side (renderQuestions exception). Re-investigate with the new log once the bug fires.
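
For reference, the log-then-default pattern mentioned in the diagnostic step might look roughly like the following. This is not the actual api_state code: `or_default_logged` is a hypothetical helper, and `eprintln!` stands in for whatever warn-level logging call (target=api_state) the project uses.

```rust
use std::fmt::Display;

/// Return the query result, or log the error and fall back to the default.
/// Replaces a silent `unwrap_or_default()` so the failure is visible.
fn or_default_logged<T: Default, E: Display>(source: &str, result: Result<T, E>) -> T {
    match result {
        Ok(value) => value,
        Err(err) => {
            // Stand-in for a warn-level log line with target=api_state.
            eprintln!("WARN api_state: {source} query failed: {err}");
            T::default()
        }
    }
}
```

Note that this only surfaces `Err` values; a row-decode panic would still need to be caught elsewhere or converted into an error first.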

UI / UX

  • Dashboard layout overhaul. A 3-column attempt (swarm / 0per4t0r 1n / m3ss4g3s) landed + was reverted in 74ba8a6 — looked worse in practice (sticky col-heads fighting the banner, sub-heads too small, columns too narrow for the container rows). Sections are now ordered semantically in a single column (swarm bits first, then decisions, then messages) which is a no-cost improvement. The bigger restructure is still worth doing; next attempt should:

    • keep current widths usable (don't crunch container rows < ~36em — they have a lot inline)
    • default the heavy-but-rare sections (kept-state, meta-inputs, msg-flow history) into a collapsed <details> so they don't dominate when empty
    • drop the per-section banner divider lines in favour of something quieter (a single border-top on the h2?)
    • try a masonry-ish layout (CSS grid-template-rows: masonry once browsers support it; or just two columns where messages floats on the right at wide viewports while the rest stacks left). avoid sticky headers — they fought the page banner last time.
  • Web UI for config repos + meta deploy log. Browse per-agent proposed / applied tags (proposal/* / approved/* / building/* / deployed/* / failed/* / denied/*) plus the swarm-wide meta repo's git log on the dashboard. Read-only log + diff + raw-file view is enough — something lighter than a full forge. The meta log already answers "what's deployed where + when"; this surfaces it without an ssh-to-host detour.

  • xterm.js terminal embedded per-agent, attached to a PTY exposed by the harness. Pairs well with the unprivileged-container work — would let the operator drop into the container without nixos-container root-login.

Telemetry

  • Harness stats per agent in sqlite, charted on the agent page. bitburner-agent samples 18 series; for hyperhive the generally-applicable ones are:
    • turns/min, tool calls/turn, turn duration p50/p95
    • claude exit code distribution (ok vs --compact-retry vs failure)
    • inbox depth (current + max-over-window)
    • messages sent/received per turn (split by recipient: peer / operator / manager / system)
    • approval queue length (across all agents — dashboard-level)
    • per-tool usage counts (Read/Edit/Bash/send/recv/…)
    • time-since-last-turn (helps spot stuck agents)
    • notes file size growth (cues compaction)

    Backend: a stats table with (agent, ts, key, value) written from the harness on TurnEnd; GET /api/stats?since=… returns the series; agent page renders with a small chart lib (uPlot is light).
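
A rough sketch of the (agent, ts, key, value) row and of how p50/p95 could be derived from one series. All names here (`StatSample`, `series_since`, `percentile`) are hypothetical, the timestamp unit is an assumption, and nearest-rank is just one reasonable percentile definition.

```rust
/// One row of the proposed stats table: (agent, ts, key, value).
#[derive(Debug, Clone, PartialEq)]
struct StatSample {
    agent: String,
    ts: u64, // assumed to be unix seconds
    key: String,
    value: f64,
}

/// Values of one agent/key series at or after `since`, i.e. the kind of
/// filter a `since` query parameter would apply.
fn series_since(samples: &[StatSample], agent: &str, key: &str, since: u64) -> Vec<f64> {
    samples
        .iter()
        .filter(|s| s.agent == agent && s.key == key && s.ts >= since)
        .map(|s| s.value)
        .collect()
}

/// Nearest-rank percentile; `p` is in 0..=100. None for an empty series.
fn percentile(values: &[f64], p: f64) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Rank is 1-based: ceil(p/100 * n), clamped into 1..=n.
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    let index = rank.clamp(1, sorted.len()) - 1;
    Some(sorted[index])
}
```

Storing raw samples and computing percentiles at read time keeps the write on TurnEnd to a single insert per key.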

Spawn flow

  • Two-step spawn. Today request_spawn(name) is one shot: manager asks → operator approves → container is created with a default agent.nix and empty /state/. Manager has no way to pre-stage per-agent prompt material, package additions, or initial notes before the agent first wakes. Split into:
    1. request_spawn_draft(name): host creates the per-agent proposed/ repo (initial commit) and state/ dir with no container; manager now has /agents/<name>/{config,state}/ to edit + commit just like an existing agent.
    2. request_spawn_commit(name, commit_ref): submits the queued approval; operator sees the diff in the dashboard like a normal apply_commit; on approve the container is created from that commit.

    Backwards-compat: keep the existing one-shot request_spawn for trivial agents (operator can still type a name in the dashboard). Surface "drafts" as a new section between K3PT ST4T3 and approvals.
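
The two-step flow above can be read as a small state machine. The sketch below is illustrative only: `SpawnState`, `SpawnEvent` and `step` are hypothetical names, and the `Denied` outcome is an assumption, since the note does not say what happens to a draft the operator rejects.

```rust
/// Lifecycle of a drafted agent (hypothetical modelling).
#[derive(Debug, Clone, PartialEq)]
enum SpawnState {
    /// After request_spawn_draft: repo + state dir exist, no container.
    Draft,
    /// After request_spawn_commit: waiting on the operator.
    PendingApproval { commit_ref: String },
    /// Operator approved; container created from that commit.
    Created { commit_ref: String },
    /// Operator denied (assumed outcome, not specified above).
    Denied,
}

#[derive(Debug, Clone, PartialEq)]
enum SpawnEvent {
    Commit { commit_ref: String },
    Approve,
    Deny,
}

/// Apply one event; anything outside the expected order is rejected.
fn step(state: SpawnState, event: SpawnEvent) -> Result<SpawnState, String> {
    match (state, event) {
        (SpawnState::Draft, SpawnEvent::Commit { commit_ref }) => {
            Ok(SpawnState::PendingApproval { commit_ref })
        }
        (SpawnState::PendingApproval { commit_ref }, SpawnEvent::Approve) => {
            Ok(SpawnState::Created { commit_ref })
        }
        (SpawnState::PendingApproval { .. }, SpawnEvent::Deny) => Ok(SpawnState::Denied),
        (state, event) => Err(format!("invalid transition: {event:?} in state {state:?}")),
    }
}
```

The one-shot request_spawn would bypass `Draft` entirely, which is why it is not modelled here.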

Loop substance

  • Notes compaction. /state/ is bind-mounted persistently and agents are told (in the system prompt) to keep /state/notes.md for durable knowledge — but we don't currently nudge them to compact when notes grow. Bitburner-agent's pattern: a short-lived secondary claude session that takes the existing notes + a "compact this" prompt and rewrites them in place. Add when the notes start bloating.