hyperhive/TODO.md

# TODO

Pick anything from here when relevant. Cross-cutting design notes live in
[CLAUDE.md](CLAUDE.md); high-level project intro in [README.md](README.md).

## Security

- **Unprivileged containers (userns mapping).** Today the nspawn container
  runs as a fully privileged root. Goal: `PrivateUsersChown=yes` (or the
  nixos-container equivalent) so uid 0 inside maps to an unprivileged uid
  on the host, and a container-root compromise lands the attacker on an
  ordinary user account, not the host's root. Requires per-agent state
  dirs to be chown'd to that uid on the host side.
- **Bash command allow-list.** Replace the blanket `Bash` allow with a
  pattern allow-list (`Bash(git *)`, `Bash(nix build .*)`, etc.) per
  claude-code's `--allowedTools` extended grammar. Likely lives in
  `agent.nix` so each agent can scope its own shell surface.

## Per-agent settings

- **Model override.** Hard-coded to `haiku` in the turn loop right now.
  Surface as a per-agent override: operator via dashboard, manager via
  `request_apply_commit` setting an attr on the agent's flake (most natural
  place since the flake already carries per-agent env/identity).

## UI / UX

- **Per-agent UI substance.** Show last N inbox messages, last turn timing,
  link back to dashboard.
- **xterm.js terminal** embedded per-agent, attached to a PTY exposed by
  the harness. Pairs well with the unprivileged-container work — would let
  the operator drop into the container without `nixos-container root-login`.

## Manager → operator question channel

- **`mcp__hyperhive__ask_operator(question, options?)` tool** on the manager
  MCP surface. The manager turn pauses; the question gets surfaced as a
  prominent prompt on the dashboard (its own section, or interleaved with
  the operator inbox); the operator's typed answer comes back as the tool
  result. Modelled after Claude Code's `AskUserQuestion` tool.

  Design open questions:

  - **Storage.** New sqlite table `operator_questions(id, asker, question,
    options_json, asked_at, answered_at, answer)` — or piggyback on the
    existing message broker with a new envelope kind. Probably a new
    table because the lifecycle (pending → answered) is different from
    fire-and-forget messages.

  - **Waiting semantics.** The MCP tool call needs to block until
    answered. Two options:
    1. Long-poll from inside the tool handler (broker-style — broadcast
       on insert, await via `tokio::sync::broadcast`). Simple but the
       claude turn stays alive for the whole wait, eating context-window
       budget.
    2. Tool returns a `question_id` immediately; manager re-enters its
       inbox loop and a `HelperEvent::OperatorAnswered { id, answer }`
       wakes it. Cheaper context-wise but two-step.

  - **Dashboard UX.** New "◆ M1ND H4S QU3STI0NS ◆" section at the top
    when any question is pending. Inline `<form>` with a textarea (or
    select if `options` were provided), POST `/api/answer-question`.
    State refresh + the live SSE stream notify the manager harness.

  - **Sub-agent path.** Sub-agents don't get the tool — they message the
    manager and the manager decides whether to relay the question to the
    operator. The manager's system prompt already covers this.

  - **Timeout / cancel.** Questions that sit pending too long: do they
    expire? Manager probably wants to know if the operator hasn't
    answered after some interval so it can fall back. Maybe a per-
    question `ttl_seconds`.

## Loop substance

- **Notes compaction.** `/state/` is bind-mounted persistently and agents
  are told (in the system prompt) to keep `/state/notes.md` for durable
  knowledge — but we don't currently nudge them to compact when notes
  grow. Bitburner-agent's pattern: a short-lived secondary claude session
  that takes the existing notes + a "compact this" prompt and rewrites
  them in place. Add when the notes start bloating.

## Lifecycle / reliability

- **Bounded broker.** Cap rows per recipient or auto-vacuum delivered
  messages older than a threshold. sqlite is growing unbounded.
- **Container crash events.** Watch `container@*.service` via D-Bus, push
  `HelperEvent::ContainerCrash` to the manager's inbox so the manager can
  react (restart, escalate, etc.).
- **`destroy --purge`.** Today `destroy` keeps state by design; add an
  opt-in flag (CLI + dashboard) to also wipe `/var/lib/hyperhive/agents/<name>/`
  and `/var/lib/hyperhive/applied/<name>/`.

## Cleanup / docs

- **Debug-only sub-commands.** `hive-ag3nt send/recv` and the analogous
  `hive-m1nd send/recv/...` exist only for ops debugging. Move them into a
  hidden `debug` sub-command to declutter `--help`, or drop entirely.