drop the one-shot send/recv/kill/start/restart/request-spawn/request- apply-commit subcommands from both in-container binaries. they were debug-only — the host admin socket (`hive-c0re ...`) exposes the same verbs and the manager mcp surface covers the rest from inside claude. now each binary's --help shows just `serve` and `mcp`, which are the only commands either is meant to be started with. removes the `one_shot` helper and the `render` / `check` glue.
122 lines
6.3 KiB
Markdown
122 lines
6.3 KiB
Markdown
# TODO
|
|
|
|
Pick anything from here when relevant. Cross-cutting design notes live in
|
|
[CLAUDE.md](CLAUDE.md); high-level project intro in [README.md](README.md).
|
|
|
|
## Security
|
|
|
|
- **Unprivileged containers (userns mapping).** Today the nspawn container
|
|
runs as a fully privileged root. Goal: `PrivateUsersChown=yes` (or the
|
|
nixos-container equivalent) so uid 0 inside maps to an unprivileged uid
|
|
on the host, and a container-root compromise lands the attacker on an
|
|
ordinary user account, not the host's root. Requires per-agent state
|
|
dirs to be chown'd to that uid on the host side.
|
|
- **Bash command allow-list.** Replace the blanket `Bash` allow with a
|
|
pattern allow-list (`Bash(git *)`, `Bash(nix build .*)`, etc.) per
|
|
claude-code's `--allowedTools` extended grammar. Likely lives in
|
|
`agent.nix` so each agent can scope its own shell surface.
|
|
|
|
## Per-agent settings
|
|
|
|
- **Model override.** Hard-coded to `haiku` in the turn loop right now.
|
|
Surface as a per-agent override: operator via dashboard, manager via
|
|
`request_apply_commit` setting an attr on the agent's flake (most natural
|
|
place since the flake already carries per-agent env/identity). Pair with
|
|
a **model status** indicator on the agent page (active / queued / last
|
|
switched) once the override is in place.
|
|
|
|
## UI / UX
|
|
|
|
- **Per-agent UI substance.** Show last N inbox messages, last turn timing,
|
|
link back to dashboard.
|
|
- **Delivered events history persistence.** The `events::Bus` ring
|
|
buffer (500 events, in-memory) backfills the terminal on page load
|
|
but dies on harness restart, and only ever holds the most recent
|
|
turn or two. Persist to sqlite (`events(agent, id, ts, kind,
|
|
payload_json)`) so the operator can scroll back through prior
|
|
turns, and so `/events/history` survives restart. Cap rows per
|
|
agent or auto-vacuum on age, same trade-off as the bounded broker
|
|
entry below.
|
|
- **Granular agent state badge** above the terminal: `idle 💤 / thinking 🧠 /
|
|
compacting 📦` with an age timer (`thinking · 12s`). Drives a state
|
|
channel from the harness: idle when waiting on the inbox, thinking
|
|
while claude's stream is open, compacting when `/compact` is in
|
|
flight. Replaces the binary "harness alive — turn loop running" line.
|
|
- **Terminal: slash commands beyond /help and /clear.** Operator-facing
|
|
in-terminal commands still to add: `/model`, `/compact`, `/cancel`.
|
|
Each needs harness-side support (model override, force compaction,
|
|
cancel current claude turn).
|
|
- **Terminal: bigger.** The 32em max-height is cramped on a 1080p+
|
|
screen. Grow it (e.g. `min(70vh, 60em)`) so the live tail is the
|
|
main visual element of the page rather than a strip.
|
|
- **Terminal: sticky-bottom auto-scroll.** Today every appended row
|
|
scrolls to bottom, so the view shifts while the operator is reading
|
|
scrolled-up. Track whether the user is *already* at the bottom
|
|
(within a small threshold), and only auto-scroll when that's true.
|
|
Show a small "↓ N new" indicator when not at bottom; click to jump.
|
|
- **Terminal: cancel-current-turn button.** Explicit "kill claude
|
|
process for this turn" control. Harness needs to track the
|
|
in-flight claude child PID and offer a `/cancel` endpoint that sends
|
|
SIGTERM; UI surfaces a button while the state badge is `thinking`.
|
|
Slash-command equivalent: `/cancel`.
|
|
- **`/compact` trigger.** Operator-initiated compaction of the current
|
|
claude session — `claude --print --continue` with `/compact` over the
|
|
same session id. Surfaces as a slash command in the terminal + a
|
|
toolbar button while the state badge is `idle`. Sets state to
|
|
`compacting` during the run.
|
|
- **xterm.js terminal** embedded per-agent, attached to a PTY exposed by
|
|
the harness. Pairs well with the unprivileged-container work — would let
|
|
the operator drop into the container without `nixos-container root-login`.
|
|
|
|
## Telemetry
|
|
|
|
- **Harness stats per agent in sqlite, charted on the agent page.**
|
|
bitburner-agent samples 18 series; for hyperhive the generally-applicable
|
|
ones are:
|
|
- turns/min, tool calls/turn, turn duration p50/p95
|
|
- claude exit code distribution (ok vs `--compact`-retry vs failure)
|
|
- inbox depth (current + max-over-window)
|
|
- messages sent/received per turn (split by recipient: peer / operator /
|
|
manager / system)
|
|
- approval queue length (across all agents — dashboard-level)
|
|
- per-tool usage counts (Read/Edit/Bash/send/recv/…)
|
|
- time-since-last-turn (helps spot stuck agents)
|
|
- notes file size growth (cues compaction)
|
|
Backend: a `stats` table with `(agent, ts, key, value)` written from
|
|
the harness on `TurnEnd`; `GET /api/stats?since=…` returns the
|
|
series; agent page renders with a small chart lib (uPlot is light).
|
|
|
|
## Manager → operator question channel
|
|
|
|
- **TTL / cancel on `ask_operator`.** Questions today block forever; the
|
|
manager turn stays alive until the operator answers. Add a per-question
|
|
`ttl_seconds` (or a dashboard "cancel" button that resolves the question
|
|
with a sentinel answer) so a long-idle question can time out and let the
|
|
manager fall back. Wire the timeout into `OperatorQuestions::wait_answered`
|
|
and surface remaining-time on the dashboard.
|
|
|
|
## Loop substance
|
|
|
|
- **`nap` tool.** Agent-side MCP tool `mcp__hyperhive__nap(seconds)` that
|
|
parks the turn loop for a short while before next-message processing.
|
|
Use cases: agent decides it has nothing useful to do, or wants to
|
|
throttle itself between rapid wake events. Implementation: harness
|
|
records a "wake-not-before" timestamp; `recv_blocking` skips the long
|
|
poll until that ts; the state badge reads `napping · MM:SS` during.
|
|
Operator can cancel via the same `/cancel` slash command or a
|
|
dashboard button.
|
|
- **Notes compaction.** `/state/` is bind-mounted persistently and agents
|
|
are told (in the system prompt) to keep `/state/notes.md` for durable
|
|
knowledge — but we don't currently nudge them to compact when notes
|
|
grow. Bitburner-agent's pattern: a short-lived secondary claude session
|
|
that takes the existing notes + a "compact this" prompt and rewrites
|
|
them in place. Add when the notes start bloating.
|
|
|
|
## Lifecycle / reliability
|
|
|
|
- **Bounded broker.** Cap rows per recipient or auto-vacuum delivered
|
|
messages older than a threshold. sqlite is growing unbounded.
|
|
- **Container crash events.** Watch `container@*.service` via D-Bus, push
|
|
`HelperEvent::ContainerCrash` to the manager's inbox so the manager can
|
|
react (restart, escalate, etc.).
|
|
|