crash_watch grows two more state-axes alongside running/stopped: - logged-in (claude session dir populated for the agent) - up-to-date (recorded flake rev matches current) per-tick transitions emit HelperEvent::NeedsLogin / LoggedIn / NeedsUpdate. seed-on-first-tick semantics retained — nothing fires on harness boot for agents that were already in their state. only needs_update fires the 'stale appeared' direction; the resolved direction is already covered by Rebuilt. new mcp__hyperhive__update(name) on the manager surface: idempotent rebuild via auto_update::rebuild_agent. transient-aware (Rebuilding) so the dashboard shows the spinner. login intentionally has NO tool — it's interactive OAuth, only the operator can complete it. prompts + approvals doc + turn-loop doc updated. todo grows a 'show per-agent applied config in dashboard' entry (separate follow-up).
104 lines
5.1 KiB
Markdown
104 lines
5.1 KiB
Markdown
# TODO
|
|
|
|
Pick anything from here when relevant. Cross-cutting design notes live in
|
|
[CLAUDE.md](CLAUDE.md); high-level project intro in [README.md](README.md).
|
|
|
|
## Permissions / policy
|
|
|
|
- **Per-agent send allow-list.** Today any agent can `send` to any
|
|
other recipient (peer, manager, operator). Add a per-agent
|
|
policy that constrains the `to` field — declared in `agent.nix`,
|
|
e.g. `hyperhive.allowedRecipients = [ "manager" "alice" ]`.
|
|
Broker rejects with an `Err { message }` when the policy denies.
|
|
Default: unrestricted (back-compat). The manager can still
|
|
always send anywhere. Useful for sandboxing untrusted sub-agents
|
|
so they can only talk to the manager, not other sub-agents.
|
|
|
|
## Security
|
|
|
|
- **Unprivileged containers (userns mapping).** Today the nspawn container
|
|
runs as a fully privileged root. Goal: `PrivateUsersChown=yes` (or the
|
|
nixos-container equivalent) so uid 0 inside maps to an unprivileged uid
|
|
on the host, and a container-root compromise lands the attacker on an
|
|
ordinary user account, not the host's root. Requires per-agent state
|
|
dirs to be chown'd to that uid on the host side.
|
|
- **Bash command allow-list.** Replace the blanket `Bash` allow with a
|
|
pattern allow-list (`Bash(git *)`, `Bash(nix build .*)`, etc.) per
|
|
claude-code's `--allowedTools` extended grammar. Likely lives in
|
|
`agent.nix` so each agent can scope its own shell surface.
|
|
|
|
## Per-agent extension
|
|
|
|
- **Custom per-agent MCP tools.** Today every sub-agent gets the
|
|
same fixed MCP surface (`send`, `recv`). To move bitburner-agent
|
|
(and anything else with rich domain tooling) into hyperhive, an
|
|
agent needs a way to ship its own tools alongside hyperhive's.
|
|
Sketch: `agent.nix` declares a list of extra MCP servers
|
|
(command + args + env), each registered into the agent's
|
|
`--mcp-config` blob at flake-render time. The harness MCP server
|
|
remains the hyperhive surface; new servers slot in as additional
|
|
entries under `mcpServers.<name>` so claude sees them as
|
|
`mcp__<name>__<tool>`. Per-agent tool whitelist (`allowedTools`)
|
|
derived from the same config so the operator stays in control of
|
|
what's exposed.
|
|
|
|
## UI / UX
|
|
|
|
- **Dashboard: show per-agent applied config.** Surface
|
|
`/var/lib/hyperhive/applied/<name>/agent.nix` (the file the
|
|
container actually builds from) as a collapsible `<details>`
|
|
block on each container row, alongside the journald viewer.
|
|
Backend: new `GET /api/agent-config/{name}` returns the file
|
|
contents (text/plain). Frontend: lazy-fetch on expand, render
|
|
inside a `<pre>` with the same theming as the journal panel.
|
|
Useful for spot-checking what `request_apply_commit` produced
|
|
without ssh-ing in.
|
|
- **xterm.js terminal** embedded per-agent, attached to a PTY exposed by
|
|
the harness. Pairs well with the unprivileged-container work — would let
|
|
the operator drop into the container without `nixos-container root-login`.
|
|
|
|
## Telemetry
|
|
|
|
- **Harness stats per agent in sqlite, charted on the agent page.**
|
|
bitburner-agent samples 18 series; for hyperhive the generally-applicable
|
|
ones are:
|
|
- turns/min, tool calls/turn, turn duration p50/p95
|
|
- claude exit code distribution (ok vs `--compact`-retry vs failure)
|
|
- inbox depth (current + max-over-window)
|
|
- messages sent/received per turn (split by recipient: peer / operator /
|
|
manager / system)
|
|
- approval queue length (across all agents — dashboard-level)
|
|
- per-tool usage counts (Read/Edit/Bash/send/recv/…)
|
|
- time-since-last-turn (helps spot stuck agents)
|
|
- notes file size growth (cues compaction)
|
|
Backend: a `stats` table with `(agent, ts, key, value)` written from
|
|
the harness on `TurnEnd`; `GET /api/stats?since=…` returns the
|
|
series; agent page renders with a small chart lib (uPlot is light).
|
|
|
|
## Spawn flow
|
|
|
|
- **Two-step spawn.** Today `request_spawn(name)` is one shot: manager
|
|
asks → operator approves → container is created with a default
|
|
`agent.nix` and empty `/state/`. Manager has no way to pre-stage
|
|
per-agent prompt material, package additions, or initial notes before
|
|
the agent first wakes. Split into:
|
|
1. `request_spawn_draft(name)` — host creates the per-agent
|
|
`proposed/` repo (initial commit) and `state/` dir with no
|
|
container; manager now has `/agents/<name>/{config,state}/` to
|
|
edit + commit just like an existing agent.
|
|
2. `request_spawn_commit(name, commit_ref)` — submits the queued
|
|
approval; operator sees the diff in the dashboard like a normal
|
|
`apply_commit`; on approve the container is created from that
|
|
commit.
|
|
Backwards-compat: keep the existing one-shot `request_spawn` for
|
|
trivial agents (operator can still type a name in the dashboard).
|
|
Surface "drafts" as a new section between K3PT ST4T3 and approvals.
|
|
|
|
## Loop substance
|
|
|
|
- **Notes compaction.** `/state/` is bind-mounted persistently and agents
|
|
are told (in the system prompt) to keep `/state/notes.md` for durable
|
|
knowledge — but we don't currently nudge them to compact when notes
|
|
grow. Bitburner-agent's pattern: a short-lived secondary claude session
|
|
that takes the existing notes + a "compact this" prompt and rewrites
|
|
them in place. Add when the notes start bloating.
|