From 3f2aba4adc5b26997eb62b0471afee11b70b6e94 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?m=C3=BCde?= Date: Fri, 15 May 2026 19:14:35 +0200 Subject: [PATCH] =?UTF-8?q?todo:=20parity=20gaps=20vs=20bitburner-agent=20?= =?UTF-8?q?=E2=80=94=20state=20badge,=20slash=20cmds,=20stats,=20nap,=20vi?= =?UTF-8?q?z=20polish,=20persistent=20event=20history?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- TODO.md | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 60 insertions(+), 1 deletion(-) diff --git a/TODO.md b/TODO.md index 6f5c5e6..5492d72 100644 --- a/TODO.md +++ b/TODO.md @@ -21,16 +21,67 @@ Pick anything from here when relevant. Cross-cutting design notes live in - **Model override.** Hard-coded to `haiku` in the turn loop right now. Surface as a per-agent override: operator via dashboard, manager via `request_apply_commit` setting an attr on the agent's flake (most natural - place since the flake already carries per-agent env/identity). + place since the flake already carries per-agent env/identity). Pair with + a **model status** indicator on the agent page (active / queued / last + switched) once the override is in place. ## UI / UX - **Per-agent UI substance.** Show last N inbox messages, last turn timing, link back to dashboard. +- **Delivered events history persistence.** The `events::Bus` ring + buffer (500 events, in-memory) backfills the terminal on page load + but dies on harness restart, and only ever holds the most recent + turn or two. Persist to sqlite (`events(agent, id, ts, kind, + payload_json)`) so the operator can scroll back through prior + turns, and so `/events/history` survives restart. Cap rows per + agent or auto-vacuum on age, same trade-off as the bounded broker + entry below. +- **Granular agent state badge** above the terminal: `idle ๐Ÿ’ค / thinking ๐Ÿง  / + compacting ๐Ÿ“ฆ` with an age timer (`thinking ยท 12s`). Drives a state + channel from the harness: idle when waiting on the inbox, thinking + while claude's stream is open, compacting when `/compact` is in + flight. Replaces the binary "harness alive โ€” turn loop running" line. +- **Terminal: slash commands + tab-completion.** Operator-facing + in-terminal commands: `/help`, `/model`, `/compact`, `/clear`. Tab + completes command names + model names (cf. bitburner-agent's pattern). +- **Terminal: multi-line input.** Replace the single-line `` with + an auto-growing textarea; Enter sends, Shift+Enter newlines. +- **Terminal: cancel-current-turn button.** Explicit "kill claude + process for this turn" control. Harness needs to track the + in-flight claude child PID and offer a `/cancel` endpoint that sends + SIGTERM; UI surfaces a button while the state badge is `thinking`. + Slash-command equivalent: `/cancel`. +- **`/compact` trigger.** Operator-initiated compaction of the current + claude session โ€” `claude --print --continue` with `/compact` over the + same session id. Surfaces as a slash command in the terminal + a + toolbar button while the state badge is `idle`. Sets state to + `compacting` during the run. +- **Visuals.** Frosted-glass backdrop blur on the terminal wrap, + per-event fade-in slide-up animation on new rows, badge pulse + animation on state-badge transitions. - **xterm.js terminal** embedded per-agent, attached to a PTY exposed by the harness. Pairs well with the unprivileged-container work โ€” would let the operator drop into the container without `nixos-container root-login`. +## Telemetry + +- **Harness stats per agent in sqlite, charted on the agent page.** + bitburner-agent samples 18 series; for hyperhive the generally-applicable + ones are: + - turns/min, tool calls/turn, turn duration p50/p95 + - claude exit code distribution (ok vs `--compact`-retry vs failure) + - inbox depth (current + max-over-window) + - messages sent/received per turn (split by recipient: peer / operator / + manager / system) + - approval queue length (across all agents โ€” dashboard-level) + - per-tool usage counts (Read/Edit/Bash/send/recv/โ€ฆ) + - time-since-last-turn (helps spot stuck agents) + - notes file size growth (cues compaction) + Backend: a `stats` table with `(agent, ts, key, value)` written from + the harness on `TurnEnd`; `GET /api/stats?since=โ€ฆ` returns the + series; agent page renders with a small chart lib (uPlot is light). + ## Manager โ†’ operator question channel - **TTL / cancel on `ask_operator`.** Questions today block forever; the @@ -42,6 +93,14 @@ Pick anything from here when relevant. Cross-cutting design notes live in ## Loop substance +- **`nap` tool.** Agent-side MCP tool `mcp__hyperhive__nap(seconds)` that + parks the turn loop for a short while before next-message processing. + Use cases: agent decides it has nothing useful to do, or wants to + throttle itself between rapid wake events. Implementation: harness + records a "wake-not-before" timestamp; `recv_blocking` skips the long + poll until that ts; the state badge reads `napping ยท MM:SS` during. + Operator can cancel via the same `/cancel` slash command or a + dashboard button. - **Notes compaction.** `/state/` is bind-mounted persistently and agents are told (in the system prompt) to keep `/state/notes.md` for durable knowledge โ€” but we don't currently nudge them to compact when notes