müde 0cc25d33d8 drop debug-only cli subcommands from hive-ag3nt + hive-m1nd

drop the one-shot send/recv/kill/start/restart/request-spawn/request-
apply-commit subcommands from both in-container binaries. they were
debug-only — the host admin socket (`hive-c0re ...`) exposes the
same verbs and the manager mcp surface covers the rest from inside
claude. now each binary's --help shows just `serve` and `mcp`,
which are the only commands either is meant to be started with.

removes the `one_shot` helper and the `render` / `check` glue.

2026-05-15 19:34:58 +02:00

6.3 KiB

Raw Blame History

TODO

Pick anything from here when relevant. Cross-cutting design notes live in CLAUDE.md; high-level project intro in README.md.

Security

Unprivileged containers (userns mapping). Today the nspawn container runs as a fully privileged root. Goal: PrivateUsersChown=yes (or the nixos-container equivalent) so uid 0 inside maps to an unprivileged uid on the host, and a container-root compromise lands the attacker on an ordinary user account, not the host's root. Requires per-agent state dirs to be chown'd to that uid on the host side.
Bash command allow-list. Replace the blanket Bash allow with a pattern allow-list (Bash(git *), Bash(nix build .*), etc.) per claude-code's --allowedTools extended grammar. Likely lives in agent.nix so each agent can scope its own shell surface.

Per-agent settings

Model override. Hard-coded to haiku in the turn loop right now. Surface as a per-agent override: operator via dashboard, manager via request_apply_commit setting an attr on the agent's flake (most natural place since the flake already carries per-agent env/identity). Pair with a model status indicator on the agent page (active / queued / last switched) once the override is in place.

UI / UX

Per-agent UI substance. Show last N inbox messages, last turn timing, link back to dashboard.
Delivered events history persistence. The events::Bus ring buffer (500 events, in-memory) backfills the terminal on page load but dies on harness restart, and only ever holds the most recent turn or two. Persist to sqlite (events(agent, id, ts, kind, payload_json)) so the operator can scroll back through prior turns, and so /events/history survives restart. Cap rows per agent or auto-vacuum on age, same trade-off as the bounded broker entry below.
Granular agent state badge above the terminal: idle 💤 / thinking 🧠 / compacting 📦 with an age timer (thinking · 12s). Drives a state channel from the harness: idle when waiting on the inbox, thinking while claude's stream is open, compacting when /compact is in flight. Replaces the binary "harness alive — turn loop running" line.
Terminal: slash commands beyond /help and /clear. Operator-facing in-terminal commands still to add: /model, /compact, /cancel. Each needs harness-side support (model override, force compaction, cancel current claude turn).
Terminal: bigger. The 32em max-height is cramped on a 1080p+ screen. Grow it (e.g. min(70vh, 60em)) so the live tail is the main visual element of the page rather than a strip.
Terminal: sticky-bottom auto-scroll. Today every appended row scrolls to bottom, so the view shifts while the operator is reading scrolled-up. Track whether the user is already at the bottom (within a small threshold), and only auto-scroll when that's true. Show a small "↓ N new" indicator when not at bottom; click to jump.
Terminal: cancel-current-turn button. Explicit "kill claude process for this turn" control. Harness needs to track the in-flight claude child PID and offer a /cancel endpoint that sends SIGTERM; UI surfaces a button while the state badge is thinking. Slash-command equivalent: /cancel.
/compact trigger. Operator-initiated compaction of the current claude session — claude --print --continue with /compact over the same session id. Surfaces as a slash command in the terminal + a toolbar button while the state badge is idle. Sets state to compacting during the run.
xterm.js terminal embedded per-agent, attached to a PTY exposed by the harness. Pairs well with the unprivileged-container work — would let the operator drop into the container without nixos-container root-login.

Telemetry

Harness stats per agent in sqlite, charted on the agent page. bitburner-agent samples 18 series; for hyperhive the generally-applicable ones are:
- turns/min, tool calls/turn, turn duration p50/p95
- claude exit code distribution (ok vs --compact-retry vs failure)
- inbox depth (current + max-over-window)
- messages sent/received per turn (split by recipient: peer / operator / manager / system)
- approval queue length (across all agents — dashboard-level)
- per-tool usage counts (Read/Edit/Bash/send/recv/…)
- time-since-last-turn (helps spot stuck agents)
- notes file size growth (cues compaction) Backend: a stats table with (agent, ts, key, value) written from the harness on TurnEnd; GET /api/stats?since=… returns the series; agent page renders with a small chart lib (uPlot is light).

Manager → operator question channel

TTL / cancel on ask_operator. Questions today block forever; the manager turn stays alive until the operator answers. Add a per-question ttl_seconds (or a dashboard "cancel" button that resolves the question with a sentinel answer) so a long-idle question can time out and let the manager fall back. Wire the timeout into OperatorQuestions::wait_answered and surface remaining-time on the dashboard.

Loop substance

nap tool. Agent-side MCP tool mcp__hyperhive__nap(seconds) that parks the turn loop for a short while before next-message processing. Use cases: agent decides it has nothing useful to do, or wants to throttle itself between rapid wake events. Implementation: harness records a "wake-not-before" timestamp; recv_blocking skips the long poll until that ts; the state badge reads napping · MM:SS during. Operator can cancel via the same /cancel slash command or a dashboard button.
Notes compaction. /state/ is bind-mounted persistently and agents are told (in the system prompt) to keep /state/notes.md for durable knowledge — but we don't currently nudge them to compact when notes grow. Bitburner-agent's pattern: a short-lived secondary claude session that takes the existing notes + a "compact this" prompt and rewrites them in place. Add when the notes start bloating.

Lifecycle / reliability

Bounded broker. Cap rows per recipient or auto-vacuum delivered messages older than a threshold. sqlite is growing unbounded.
Container crash events. Watch container@*.service via D-Bus, push HelperEvent::ContainerCrash to the manager's inbox so the manager can react (restart, escalate, etc.).

6.3 KiB Raw Blame History

TODO

Security

Per-agent settings

UI / UX

Telemetry

Manager → operator question channel

Loop substance

Lifecycle / reliability

6.3 KiB

Raw Blame History