title bar grows a '↑ DASHB04RD' link next to the rebuild button — opens the host dashboard in a new tab so the operator can pivot between agents without losing the live tail. uses the dashboardPort already plumbed via /api/state. state row picks up a 'last turn 12.3s' chip that fills in when state transitions away from thinking. format: ms / s.s / m s. hidden until the first turn completes.
126 lines
6.3 KiB
Markdown
126 lines
6.3 KiB
Markdown
# TODO
|
|
|
|
Pick anything from here when relevant. Cross-cutting design notes live in
|
|
[CLAUDE.md](CLAUDE.md); high-level project intro in [README.md](README.md).
|
|
|
|
## Security
|
|
|
|
- **Unprivileged containers (userns mapping).** Today the nspawn container
|
|
runs as a fully privileged root. Goal: `PrivateUsersChown=yes` (or the
|
|
nixos-container equivalent) so uid 0 inside maps to an unprivileged uid
|
|
on the host, and a container-root compromise lands the attacker on an
|
|
ordinary user account, not the host's root. Requires per-agent state
|
|
dirs to be chown'd to that uid on the host side.
|
|
- **Bash command allow-list.** Replace the blanket `Bash` allow with a
|
|
pattern allow-list (`Bash(git *)`, `Bash(nix build .*)`, etc.) per
|
|
claude-code's `--allowedTools` extended grammar. Likely lives in
|
|
`agent.nix` so each agent can scope its own shell surface.
|
|
|
|
## Per-agent settings
|
|
|
|
- **Model override.** Hard-coded to `haiku` in the turn loop right now.
|
|
Surface as a per-agent override: operator via dashboard, manager via
|
|
`request_apply_commit` setting an attr on the agent's flake (most natural
|
|
place since the flake already carries per-agent env/identity). Pair with
|
|
a **model status** indicator on the agent page (active / queued / last
|
|
switched) once the override is in place.
|
|
|
|
## UI / UX
|
|
|
|
- **Per-agent inbox view.** Show the last N messages addressed to
|
|
this agent on its page (the per-agent equivalent of the
|
|
dashboard's operator inbox). Needs a new wire request from agent
|
|
→ host (host has the broker; agent doesn't); reuse the broker's
|
|
`recent_for` query. Last-turn timing + dashboard back-link
|
|
already shipped.
|
|
- **State badge: compacting + napping states.** Idle/thinking already
|
|
ship (driven from SSE turn_start/turn_end). Add `compacting 📦` and
|
|
`napping 😴` once the `/compact` trigger and `nap` tool exist —
|
|
both need a harness signal (an explicit `LiveEvent::StateChange`
|
|
variant or piggyback on Note).
|
|
- **Server-side state badge.** Today the badge is computed client-side
|
|
from `turn_start`/`turn_end` events. On page reload mid-turn the
|
|
history replay re-derives it, but with a `compacting` / `napping`
|
|
state coming and a non-trivial state machine it's better to track
|
|
authoritative state in the harness and expose it via
|
|
`GET /api/state` (`status: "thinking" | "idle" | "compacting" |
|
|
"napping"`). JS just renders. Drops the
|
|
derive-from-events-and-pray code path.
|
|
- **Terminal: `/model` slash command.** Operator-typeable model
|
|
override from the terminal. Depends on the model-override work
|
|
above; once an override mechanism exists, wire a `/model <name>`
|
|
command that POSTs to a new endpoint.
|
|
- **xterm.js terminal** embedded per-agent, attached to a PTY exposed by
|
|
the harness. Pairs well with the unprivileged-container work — would let
|
|
the operator drop into the container without `nixos-container root-login`.
|
|
|
|
## Telemetry
|
|
|
|
- **Harness stats per agent in sqlite, charted on the agent page.**
|
|
bitburner-agent samples 18 series; for hyperhive the generally-applicable
|
|
ones are:
|
|
- turns/min, tool calls/turn, turn duration p50/p95
|
|
- claude exit code distribution (ok vs `--compact`-retry vs failure)
|
|
- inbox depth (current + max-over-window)
|
|
- messages sent/received per turn (split by recipient: peer / operator /
|
|
manager / system)
|
|
- approval queue length (across all agents — dashboard-level)
|
|
- per-tool usage counts (Read/Edit/Bash/send/recv/…)
|
|
- time-since-last-turn (helps spot stuck agents)
|
|
- notes file size growth (cues compaction)
|
|
Backend: a `stats` table with `(agent, ts, key, value)` written from
|
|
the harness on `TurnEnd`; `GET /api/stats?since=…` returns the
|
|
series; agent page renders with a small chart lib (uPlot is light).
|
|
|
|
## Manager → operator question channel
|
|
|
|
- **TTL on `ask_operator`.** Manual cancel via dashboard already
|
|
ships (✗ CANC3L button resolves the question with answer
|
|
`[cancelled]` and fires `OperatorAnswered` so the manager sees a
|
|
terminal state). Still missing: per-question `ttl_seconds` that
|
|
auto-cancels after a deadline. Spawn a tokio task per submitted
|
|
question that calls the same cancel path after the ttl expires
|
|
(cheap; rare). Surface remaining time on the dashboard.
|
|
|
|
## Spawn flow
|
|
|
|
- **Two-step spawn.** Today `request_spawn(name)` is one shot: manager
|
|
asks → operator approves → container is created with a default
|
|
`agent.nix` and empty `/state/`. Manager has no way to pre-stage
|
|
per-agent prompt material, package additions, or initial notes before
|
|
the agent first wakes. Split into:
|
|
1. `request_spawn_draft(name)` — host creates the per-agent
|
|
`proposed/` repo (initial commit) and `state/` dir with no
|
|
container; manager now has `/agents/<name>/{config,state}/` to
|
|
edit + commit just like an existing agent.
|
|
2. `request_spawn_commit(name, commit_ref)` — submits the queued
|
|
approval; operator sees the diff in the dashboard like a normal
|
|
`apply_commit`; on approve the container is created from that
|
|
commit.
|
|
Backwards-compat: keep the existing one-shot `request_spawn` for
|
|
trivial agents (operator can still type a name in the dashboard).
|
|
Surface "drafts" as a new section between K3PT ST4T3 and approvals.
|
|
|
|
## Loop substance
|
|
|
|
- **`nap` tool.** Agent-side MCP tool `mcp__hyperhive__nap(seconds)` that
|
|
parks the turn loop for a short while before next-message processing.
|
|
Use cases: agent decides it has nothing useful to do, or wants to
|
|
throttle itself between rapid wake events. Implementation: harness
|
|
records a "wake-not-before" timestamp; `recv_blocking` skips the long
|
|
poll until that ts; the state badge reads `napping · MM:SS` during.
|
|
Operator can cancel via the same `/cancel` slash command or a
|
|
dashboard button.
|
|
- **Notes compaction.** `/state/` is bind-mounted persistently and agents
|
|
are told (in the system prompt) to keep `/state/notes.md` for durable
|
|
knowledge — but we don't currently nudge them to compact when notes
|
|
grow. Bitburner-agent's pattern: a short-lived secondary claude session
|
|
that takes the existing notes + a "compact this" prompt and rewrites
|
|
them in place. Add when the notes start bloating.
|
|
|
|
## Lifecycle / reliability
|
|
|
|
- **Container crash events.** Watch `container@*.service` via D-Bus, push
|
|
`HelperEvent::ContainerCrash` to the manager's inbox so the manager can
|
|
react (restart, escalate, etc.).
|
|
|