hyperhive/TODO.md
müde 40938d8b54 dashboard: surface silent unwrap_or_default in api_state
every snapshot source backing /api/state used .unwrap_or_default()
— sqlite errors, broker errors, nixos-container list failures,
operator_questions decode crashes all degraded to empty lists
without a log line. the 'pending question doesn't render'
bug we've been chasing was likely a row-decode panic in
OperatorQuestions::pending() being swallowed this way.

new log_default(what, result) replaces each call site: same
default value on Err but emits target=api_state warn with the
source name + dbg error first. five sources covered:
nixos-container list, approvals.pending,
approvals.recent_resolved, broker.recent_for(operator),
questions.pending. next time the question goes missing the
journal will say which source failed and how.

todo updated — pending-question entry now points at the new
log instead of three suspect paths.
2026-05-16 03:49:49 +02:00


# TODO
Pick anything from here when relevant. Cross-cutting design notes live in
[CLAUDE.md](CLAUDE.md); high-level project intro in [README.md](README.md).
## Permissions / policy
- **Per-agent send allow-list.** Today any agent can `send` to any
other recipient (peer, manager, operator). Add a per-agent
policy that constrains the `to` field — declared in `agent.nix`,
e.g. `hyperhive.allowedRecipients = [ "manager" "alice" ]`.
Broker rejects with an `Err { message }` when the policy denies.
Default: unrestricted (back-compat). The manager can still
always send anywhere. Useful for sandboxing untrusted sub-agents
so they can only talk to the manager, not other sub-agents.
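
The allow-list check described above could look roughly like the following. This is a minimal sketch, not the broker's actual code: `SendPolicy`, `check_send`, and the `MANAGER` constant are hypothetical names, and the returned string merely stands in for the `message` field of the broker's `Err { message }` reply.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory form of `hyperhive.allowedRecipients`.
#[derive(Debug, Clone, PartialEq)]
pub enum SendPolicy {
    /// No list declared in `agent.nix`: the back-compat default.
    Unrestricted,
    /// Only these recipients may appear in the `to` field.
    AllowList(HashSet<String>),
}

/// Assumed name of the manager's identity (illustrative only).
const MANAGER: &str = "manager";

/// Returns `Ok(())` if `sender` may send to `to`, otherwise the text the
/// broker would carry in its `Err { message }` rejection.
pub fn check_send(sender: &str, to: &str, policy: &SendPolicy) -> Result<(), String> {
    // The manager can always send anywhere, regardless of policy.
    if sender == MANAGER {
        return Ok(());
    }
    match policy {
        SendPolicy::Unrestricted => Ok(()),
        SendPolicy::AllowList(allowed) if allowed.contains(to) => Ok(()),
        SendPolicy::AllowList(_) => Err(format!(
            "send denied by policy: {} may not send to {}",
            sender, to
        )),
    }
}
```

Keeping `Unrestricted` as its own variant, rather than an empty list, keeps "no policy declared" distinct from "may send to nobody".
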
## Security
- **Unprivileged containers (userns mapping).** Today the nspawn container
runs as fully privileged root. Goal: `PrivateUsersChown=yes` (or the
nixos-container equivalent) so uid 0 inside maps to an unprivileged uid
on the host, and a container-root compromise lands the attacker on an
ordinary user account, not the host's root. Requires per-agent state
dirs to be chown'd to that uid on the host side. The per-agent git
identity (currently injected via `programs.git.config.user` against
the root user in `setup_applied`'s generated flake) also needs to be
provisioned for whatever non-root user claude runs as, or commits
the manager makes against `/agents/<n>/config` will fall back to a
generic `nixos@…` identity.
- **Bash command allow-list.** Replace the blanket `Bash` allow with a
pattern allow-list (`Bash(git *)`, `Bash(nix build .*)`, etc.) per
claude-code's `--allowedTools` extended grammar. Likely lives in
`agent.nix` so each agent can scope its own shell surface.
## Operational hygiene (post-meta-flake)
- **Tag retention.** Every approval mints up to 5 tags in
`applied/<n>/.git` (`proposal/`, `approved/`, `building/`,
`deployed/`, plus `failed/` or `denied/`). Every successful
deploy adds one commit to `/var/lib/hyperhive/meta/.git`.
Both grow without bound. A retention policy (keep all
`deployed/*` indefinitely, age out `failed/` + `denied/`
after N days, drop `proposal/` + `approved/` + `building/`
once a terminal sibling lands) would keep the audit
trails browsable without unbounded growth.
- **Inert `nix flake lock` no-args call in `meta::sync_agents`.**
Still valid in current nix (it resolves missing inputs without
bumping existing ones), but it sits right next to the deprecated
`--update-input` flag we just had to migrate away from. Worth
keeping an eye on: if it gets renamed too, `sync_agents` can no
longer seed a fresh meta repo.
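
The tag retention rule from the first item above can be written as a small pure function. This is a sketch under assumptions: `TagKind`, `keep_tag`, and the `max_age_days` parameter (the "N days") are illustrative names, and the caller is assumed to have already worked out each tag's age and whether its approval has a terminal sibling.

```rust
/// The six tag prefixes minted in `applied/<n>/.git`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TagKind {
    Proposal,
    Approved,
    Building,
    Deployed,
    Failed,
    Denied,
}

impl TagKind {
    /// Terminal tags end an approval's lifecycle.
    pub fn is_terminal(self) -> bool {
        matches!(self, TagKind::Deployed | TagKind::Failed | TagKind::Denied)
    }
}

/// Decide whether a tag survives a retention sweep.
///
/// * `age_days`: how old the tag is.
/// * `has_terminal_sibling`: the same approval already has a
///   `deployed/`, `failed/`, or `denied/` tag.
/// * `max_age_days`: the "N days" cutoff for failed/denied tags.
pub fn keep_tag(
    kind: TagKind,
    age_days: u32,
    has_terminal_sibling: bool,
    max_age_days: u32,
) -> bool {
    match kind {
        // Deployed tags are the audit trail: keep indefinitely.
        TagKind::Deployed => true,
        // Failed and denied tags age out after the cutoff.
        TagKind::Failed | TagKind::Denied => age_days <= max_age_days,
        // Intermediate tags are dropped once a terminal sibling lands.
        TagKind::Proposal | TagKind::Approved | TagKind::Building => !has_terminal_sibling,
    }
}
```

Because the decision is a pure function of the tag's kind and two facts about it, the sweep itself (listing tags, deleting refs) can stay a thin wrapper and the policy can be unit-tested without a git repo.
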
## Bugs
- **Pending question doesn't always appear on the dashboard.**
Repro: manager calls `ask_operator`, tool result is
`question queued (id=N)` (so the row is in sqlite), but the
M1ND H4S QU3STI0NS section keeps showing "no pending
questions". Last seen with id=5. Diagnostic step landed:
`api_state` now warn-logs (target=`api_state`) when any of
its source queries fails instead of silently
`unwrap_or_default`-ing. The next repro should tell us which
side is at fault: an error in journald points at sqlite
(likely an `OperatorQuestions::pending()` row-decode error on
a migrated column), while a clean log points at the dashboard
JS (a `renderQuestions` exception). Re-investigate with the new
log once the bug fires.
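
For reference, the diagnostic helper behind that log line (`log_default(what, result)` per the commit message) can be sketched as below. Only the name and the two arguments come from the commit message; the generic bounds are an assumption, and `eprintln!` stands in for the real warn-level event with target `api_state` so that the sketch needs no logging crate.

```rust
use std::fmt::Debug;

/// Returns the value on `Ok`. On `Err`, logs the source name and the
/// debug-formatted error, then falls back to `T::default()`, the same
/// value `unwrap_or_default()` would have produced silently.
pub fn log_default<T: Default, E: Debug>(what: &str, result: Result<T, E>) -> T {
    match result {
        Ok(value) => value,
        Err(err) => {
            // Stand-in for a warn-level log event (target `api_state`).
            eprintln!("WARN api_state: {} failed: {:?}", what, err);
            T::default()
        }
    }
}
```

A call site then changes from `questions.pending().unwrap_or_default()` to something like `log_default("questions.pending", questions.pending())`: the dashboard still gets an empty list on failure, but the journal records which source failed and why.
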
## UI / UX
- **Web UI for config repos + meta deploy log.** Browse
per-agent proposed / applied tags
(`proposal/* / approved/* / building/* / deployed/* /
failed/* / denied/*`) plus the swarm-wide meta repo's git
log on the dashboard. Read-only log + diff + raw-file view
is enough — something lighter than a full forge. The meta
log already answers "what's deployed where + when"; this
surfaces it without an ssh-to-host detour.
- **xterm.js terminal** embedded per-agent, attached to a PTY exposed by
the harness. Pairs well with the unprivileged-container work — would let
the operator drop into the container without `nixos-container root-login`.
## Telemetry
- **Harness stats per agent in sqlite, charted on the agent page.**
  bitburner-agent samples 18 series; for hyperhive the generally-applicable
  ones are:
  - turns/min, tool calls/turn, turn duration p50/p95
  - claude exit code distribution (ok vs `--compact`-retry vs failure)
  - inbox depth (current + max-over-window)
  - messages sent/received per turn (split by recipient: peer / operator /
    manager / system)
  - approval queue length (across all agents, dashboard-level)
  - per-tool usage counts (Read/Edit/Bash/send/recv/…)
  - time-since-last-turn (helps spot stuck agents)
  - notes file size growth (cues compaction)

  Backend: a `stats` table with `(agent, ts, key, value)` written from
  the harness on `TurnEnd`; `GET /api/stats?since=…` returns the
  series; agent page renders with a small chart lib (uPlot is light).
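
To make the `(agent, ts, key, value)` shape concrete, here is a dependency-free sketch of the row type, a `since` filter, and a percentile helper for series such as turn duration p50/p95. Everything here is an assumption for illustration: `StatRow`, `series_since`, and `percentile` are hypothetical names, `ts` is taken to be Unix seconds, rows live in a slice rather than sqlite, and the percentile uses the nearest-rank method.

```rust
/// One row of the proposed `stats` table: `(agent, ts, key, value)`.
#[derive(Debug, Clone, PartialEq)]
pub struct StatRow {
    pub agent: String,
    /// Unix timestamp in seconds (assumed unit).
    pub ts: u64,
    pub key: String,
    pub value: f64,
}

/// What a `since=…` query could return for one agent and key:
/// `(ts, value)` pairs with `ts >= since`, oldest first.
pub fn series_since(rows: &[StatRow], agent: &str, key: &str, since: u64) -> Vec<(u64, f64)> {
    let mut out: Vec<(u64, f64)> = rows
        .iter()
        .filter(|r| r.agent == agent && r.key == key && r.ts >= since)
        .map(|r| (r.ts, r.value))
        .collect();
    out.sort_by_key(|&(ts, _)| ts);
    out
}

/// Nearest-rank percentile, with `p` in 0..=100.
/// Returns `None` for an empty input; panics if a value is NaN.
pub fn percentile(values: &[f64], p: f64) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("NaN in stats"));
    // Smallest rank such that at least p% of samples are at or below it.
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    let rank = rank.max(1).min(sorted.len());
    Some(sorted[rank - 1])
}
```

A narrow key/value table keeps new series cheap to add (no schema migration per metric), at the cost of computing aggregates like p50/p95 at query time, as `percentile` does here.
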
## Spawn flow
- **Two-step spawn.** Today `request_spawn(name)` is one shot: manager
  asks → operator approves → container is created with a default
  `agent.nix` and empty `/state/`. Manager has no way to pre-stage
  per-agent prompt material, package additions, or initial notes before
  the agent first wakes. Split into:
  1. `request_spawn_draft(name)`: host creates the per-agent
     `proposed/` repo (initial commit) and `state/` dir without
     creating a container; manager now has
     `/agents/<name>/{config,state}/` to edit + commit just like an
     existing agent.
  2. `request_spawn_commit(name, commit_ref)`: queues the approval;
     operator sees the diff in the dashboard like a normal
     `apply_commit`; on approve the container is created from that
     commit.

  Backwards-compat: keep the existing one-shot `request_spawn` for
  trivial agents (operator can still type a name in the dashboard).
  Surface "drafts" as a new section between K3PT ST4T3 and approvals.
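
The two-step flow above amounts to a small state machine, sketched here. All type and function names (`SpawnState`, `SpawnEvent`, `step`) are illustrative, and treating a denial as terminal is an assumption; the entry does not say whether a denied draft can be resubmitted.

```rust
/// Lifecycle of a two-step spawn (illustrative names).
#[derive(Debug, Clone, PartialEq)]
pub enum SpawnState {
    /// After `request_spawn_draft`: repo + state dir exist, no container.
    Draft,
    /// After `request_spawn_commit`: waiting on the operator.
    PendingApproval { commit_ref: String },
    /// Operator approved; container created from `commit_ref`.
    Spawned { commit_ref: String },
    /// Operator denied the request (assumed terminal).
    Denied,
}

#[derive(Debug, Clone, PartialEq)]
pub enum SpawnEvent {
    Commit { commit_ref: String },
    Approve,
    Deny,
}

/// Apply one event; any transition outside the two-step flow is an error.
pub fn step(state: SpawnState, event: SpawnEvent) -> Result<SpawnState, String> {
    match (state, event) {
        (SpawnState::Draft, SpawnEvent::Commit { commit_ref }) => {
            Ok(SpawnState::PendingApproval { commit_ref })
        }
        (SpawnState::PendingApproval { commit_ref }, SpawnEvent::Approve) => {
            Ok(SpawnState::Spawned { commit_ref })
        }
        (SpawnState::PendingApproval { .. }, SpawnEvent::Deny) => Ok(SpawnState::Denied),
        (state, event) => Err(format!("invalid transition: {:?} on {:?}", state, event)),
    }
}
```

Carrying `commit_ref` through `PendingApproval` into `Spawned` makes the invariant explicit: the container is built from exactly the commit the operator reviewed.
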
## Loop substance
- **Notes compaction.** `/state/` is bind-mounted persistently and agents
are told (in the system prompt) to keep `/state/notes.md` for durable
knowledge — but we don't currently nudge them to compact when notes
grow. Bitburner-agent's pattern: a short-lived secondary claude session
that takes the existing notes + a "compact this" prompt and rewrites
them in place. Add when the notes start bloating.