hyperhive/TODO.md

4.8 KiB

TODO

Pick anything from here when relevant. Cross-cutting design notes live in CLAUDE.md; high-level project intro in README.md.

Security

  • Unprivileged containers (userns mapping). Today the nspawn container runs as a fully privileged root. Goal: PrivateUsersChown=yes (or the nixos-container equivalent) so uid 0 inside maps to an unprivileged uid on the host, and a container-root compromise lands the attacker on an ordinary user account, not the host's root. Requires per-agent state dirs to be chown'd to that uid on the host side.
  • Bash command allow-list. Replace the blanket Bash allow with a pattern allow-list (Bash(git *), Bash(nix build .*), etc.) per claude-code's --allowedTools extended grammar. Likely lives in agent.nix so each agent can scope its own shell surface.

Per-agent settings

  • Model override. Hard-coded to haiku in the turn loop right now. Surface as a per-agent override: operator via dashboard, manager via request_apply_commit setting an attr on the agent's flake (most natural place since the flake already carries per-agent env/identity).

UI / UX

  • Operator inbox view. Drain replies addressed to operator and show them on the dashboard. Today they accumulate in sqlite unread; you can only see them by watching the live panel of the agent that sent them.
  • Per-agent UI substance. Show last N inbox messages, last turn timing, link back to dashboard.
  • xterm.js terminal embedded per-agent, attached to a PTY exposed by the harness. Pairs well with the unprivileged-container work — would let the operator drop into the container without nixos-container root-login.

Manager → operator question channel

  • mcp__hyperhive__ask_operator(question, options?) tool on the manager MCP surface. The manager turn pauses; the question gets surfaced as a prominent prompt on the dashboard (its own section, or interleaved with the operator inbox); the operator's typed answer comes back as the tool result. Modelled after Claude Code's AskUserQuestion tool.

    Design open questions:

    • Storage. New sqlite table operator_questions(id, asker, question, options_json, asked_at, answered_at, answer) — or piggyback on the existing message broker with a new envelope kind. Probably a new table because the lifecycle (pending → answered) is different from fire-and-forget messages.

    • Waiting semantics. The MCP tool call needs to block until answered. Two options:

      1. Long-poll from inside the tool handler (broker-style — broadcast on insert, await via tokio::sync::broadcast). Simple but the claude turn stays alive for the whole wait, eating context-window budget.
      2. Tool returns a question_id immediately; manager re-enters its inbox loop and a HelperEvent::OperatorAnswered { id, answer } wakes it. Cheaper context-wise but two-step.
    • Dashboard UX. New "◆ M1ND H4S QU3STI0NS ◆" section at the top when any question is pending. Inline <form> with a textarea (or select if options were provided), POST /api/answer-question. State refresh + the live SSE stream notify the manager harness.

    • Sub-agent path. Sub-agents don't get the tool — they message the manager and the manager decides whether to relay the question to the operator. The manager's system prompt already covers this.

    • Timeout / cancel. Questions that sit pending too long: do they expire? Manager probably wants to know if the operator hasn't answered after some interval so it can fall back. Maybe a per- question ttl_seconds.

Loop substance

  • Notes compaction. /state/ is bind-mounted persistently and agents are told (in the system prompt) to keep /state/notes.md for durable knowledge — but we don't currently nudge them to compact when notes grow. Bitburner-agent's pattern: a short-lived secondary claude session that takes the existing notes + a "compact this" prompt and rewrites them in place. Add when the notes start bloating.

Lifecycle / reliability

  • Bounded broker. Cap rows per recipient or auto-vacuum delivered messages older than a threshold. sqlite is growing unbounded.
  • Container crash events. Watch container@*.service via D-Bus, push HelperEvent::ContainerCrash to the manager's inbox so the manager can react (restart, escalate, etc.).
  • destroy --purge. Today destroy keeps state by design; add an opt-in flag (CLI + dashboard) to also wipe /var/lib/hyperhive/agents/<name>/ and /var/lib/hyperhive/applied/<name>/.

Cleanup / docs

  • Debug-only sub-commands. hive-ag3nt send/recv and the analogous hive-m1nd send/recv/... exist only for ops debugging. Move them into a hidden debug sub-command to declutter --help, or drop entirely.