hyperhive/TODO.md
müde 6d52f67292 broker: hourly vacuum of delivered messages older than 30 days
undelivered rows are always kept regardless of age (still in flight).
sweep runs immediately on serve start then every hour. logs row count
when non-zero. keep_secs is hard-coded for now (30 days); can be
config-driven later if a host wants to retain more / less for audit.
2026-05-15 19:40:38 +02:00

6.1 KiB

TODO

Pick anything from here when relevant. Cross-cutting design notes live in CLAUDE.md; high-level project intro in README.md.

Security

  • Unprivileged containers (userns mapping). Today the nspawn container runs as a fully privileged root. Goal: PrivateUsersChown=yes (or the nixos-container equivalent) so uid 0 inside maps to an unprivileged uid on the host, and a container-root compromise lands the attacker on an ordinary user account, not the host's root. Requires per-agent state dirs to be chown'd to that uid on the host side.
  • Bash command allow-list. Replace the blanket Bash allow with a pattern allow-list (Bash(git *), Bash(nix build .*), etc.) per claude-code's --allowedTools extended grammar. Likely lives in agent.nix so each agent can scope its own shell surface.

Per-agent settings

  • Model override. Hard-coded to haiku in the turn loop right now. Surface as a per-agent override: operator via dashboard, manager via request_apply_commit setting an attr on the agent's flake (most natural place since the flake already carries per-agent env/identity). Pair with a model status indicator on the agent page (active / queued / last switched) once the override is in place.

UI / UX

  • Per-agent UI substance. Show last N inbox messages, last turn timing, link back to dashboard.
  • Delivered events history persistence. The events::Bus ring buffer (500 events, in-memory) backfills the terminal on page load but dies on harness restart, and only ever holds the most recent turn or two. Persist to sqlite (events(agent, id, ts, kind, payload_json)) so the operator can scroll back through prior turns, and so /events/history survives restart. Cap rows per agent or auto-vacuum on age, same trade-off as the bounded broker entry below.
  • State badge: compacting + napping states. Idle/thinking already ship (driven from SSE turn_start/turn_end). Add compacting 📦 and napping 😴 once the /compact trigger and nap tool exist — both need a harness signal (an explicit LiveEvent::StateChange variant or piggyback on Note).
  • Terminal: slash commands beyond /help and /clear. Operator-facing in-terminal commands still to add: /model, /compact, /cancel. Each needs harness-side support (model override, force compaction, cancel current claude turn).
  • Terminal: bigger. The 32em max-height is cramped on a 1080p+ screen. Grow it (e.g. min(70vh, 60em)) so the live tail is the main visual element of the page rather than a strip.
  • Terminal: sticky-bottom auto-scroll. Today every appended row scrolls to bottom, so the view shifts while the operator is reading scrolled-up. Track whether the user is already at the bottom (within a small threshold), and only auto-scroll when that's true. Show a small "↓ N new" indicator when not at bottom; click to jump.
  • Terminal: cancel-current-turn button. Explicit "kill claude process for this turn" control. Harness needs to track the in-flight claude child PID and offer a /cancel endpoint that sends SIGTERM; UI surfaces a button while the state badge is thinking. Slash-command equivalent: /cancel.
  • /compact trigger. Operator-initiated compaction of the current claude session — claude --print --continue with /compact over the same session id. Surfaces as a slash command in the terminal + a toolbar button while the state badge is idle. Sets state to compacting during the run.
  • xterm.js terminal embedded per-agent, attached to a PTY exposed by the harness. Pairs well with the unprivileged-container work — would let the operator drop into the container without nixos-container root-login.

Telemetry

  • Harness stats per agent in sqlite, charted on the agent page. bitburner-agent samples 18 series; for hyperhive the generally-applicable ones are:
    • turns/min, tool calls/turn, turn duration p50/p95
    • claude exit code distribution (ok vs --compact-retry vs failure)
    • inbox depth (current + max-over-window)
    • messages sent/received per turn (split by recipient: peer / operator / manager / system)
    • approval queue length (across all agents — dashboard-level)
    • per-tool usage counts (Read/Edit/Bash/send/recv/…)
    • time-since-last-turn (helps spot stuck agents)
    • notes file size growth (cues compaction) Backend: a stats table with (agent, ts, key, value) written from the harness on TurnEnd; GET /api/stats?since=… returns the series; agent page renders with a small chart lib (uPlot is light).

Manager → operator question channel

  • TTL / cancel on ask_operator. Questions today block forever; the manager turn stays alive until the operator answers. Add a per-question ttl_seconds (or a dashboard "cancel" button that resolves the question with a sentinel answer) so a long-idle question can time out and let the manager fall back. Wire the timeout into OperatorQuestions::wait_answered and surface remaining-time on the dashboard.

Loop substance

  • nap tool. Agent-side MCP tool mcp__hyperhive__nap(seconds) that parks the turn loop for a short while before next-message processing. Use cases: agent decides it has nothing useful to do, or wants to throttle itself between rapid wake events. Implementation: harness records a "wake-not-before" timestamp; recv_blocking skips the long poll until that ts; the state badge reads napping · MM:SS during. Operator can cancel via the same /cancel slash command or a dashboard button.
  • Notes compaction. /state/ is bind-mounted persistently and agents are told (in the system prompt) to keep /state/notes.md for durable knowledge — but we don't currently nudge them to compact when notes grow. Bitburner-agent's pattern: a short-lived secondary claude session that takes the existing notes + a "compact this" prompt and rewrites them in place. Add when the notes start bloating.

Lifecycle / reliability

  • Container crash events. Watch container@*.service via D-Bus, push HelperEvent::ContainerCrash to the manager's inbox so the manager can react (restart, escalate, etc.).