müde 5c6c607e25 agent badges: split into ctx (last-inference) + cost (cumulative)

the existing ctx badge was misnamed: it summed `result.usage`, which is
the cumulative tokens billed across every inference in the turn. for
tool-heavy turns that easily exceeds the model's context window (a 600k
cached prefix × 15 sub-calls = 9M cache_read), making it useless as a
"should i compact?" signal.

now two separate badges:

  ctx · N    last inference's prompt size = actual context window in
             use right now. parsed from each `assistant` event's
             `.message.usage`; the harness tracks the most recent one
             across the stream and snapshots it when the `result`
             event lands.

  cost · M   cumulative tokens billed across the whole turn (the
             previous behaviour, now correctly labelled).

both update via a single `TokenUsageChanged { ctx, cost }` SSE event at
turn-end. turn_stats grows four columns (`last_input_tokens`,
`last_output_tokens`, `last_cache_read_input_tokens`,
`last_cache_creation_input_tokens`) so the cold-load seed can paint both
badges on page load. migrations run try-and-ignore ALTERs so existing
agent dbs catch up; pre-migration rows have last-inference zeros and
yield no `ctx` seed (badge stays empty until next turn) rather than a
misleading 0.

2026-05-18 18:48:35 +02:00

Web UI

Two web surfaces share the same skeleton: the dashboard (port 7000) and the per-agent UIs (manager on :8000, sub-agents on a hashed :8100-8999). Both are SPAs — GET / returns a static shell, /api/state returns JSON, JS renders. No full-page reloads.

Shape (shared by both)

GET / → assets/index.html (placeholders for state-driven sections, shipped via include_str! so the binary has no runtime file dependency).
GET /static/*.css + GET /static/*.js → static assets. Both pages prepend hive_fr0nt::BASE_CSS + TERMINAL_CSS to their per-page stylesheet, and GET /static/hive-fr0nt.js serves the shared window.HiveTerminal.create runtime. The dashboard's #msgflow and the per-agent #live log are both backed by this terminal — sticky-bottom auto-scroll, "↓ N new" pill, history backfill, SSE plumbing all live there. Each page registers a kind→renderer map; unknown kinds fall through to a JSON-dump note row.
GET /api/state → JSON snapshot the JS app renders into the DOM. Includes a top-level seq (the dashboard event channel's high-water mark at the moment the snapshot was assembled); clients use it to dedupe their buffered SSE traffic against the snapshot (drop frames with seq <= snapshot.seq).
GET /dashboard/stream (dashboard) / GET /events/stream (per-agent) → text/event-stream SSE for live updates. The dashboard stream carries broker Sent / Delivered (mirrored by a forwarder task from the broker's intra-process channel) plus mutation events (approval_added / approval_resolved, question_added / question_resolved, transient_set / transient_cleared). Each frame carries a seq. The matching backfill endpoint is GET /dashboard/history (last ~200 broker messages wrapped in { seq, events }) on the dashboard and GET /events/history (last 2000 LiveEvents also wrapped in { seq, events }) on the agent.
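The snapshot/stream dedupe rule above can be sketched in a few lines. The real client is JavaScript; Rust is used here only to keep the examples in one language, and `Frame` and `frames_after_snapshot` are hypothetical names rather than anything from the codebase.

```rust
/// One buffered SSE frame. Only `seq` matters for dedupe; `kind` is kept
/// as an opaque string for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub seq: u64,
    pub kind: String,
}

/// Drop every buffered frame the snapshot already reflects
/// (`seq <= snapshot_seq`) and keep the rest in arrival order.
pub fn frames_after_snapshot(buffered: Vec<Frame>, snapshot_seq: u64) -> Vec<Frame> {
    buffered
        .into_iter()
        .filter(|frame| frame.seq > snapshot_seq)
        .collect()
}
```

A client that opens the stream before fetching /api/state can buffer frames in the meantime and run them through a filter like this once the snapshot arrives, so nothing is applied twice.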

The JS app handles all form[data-async] submissions via a delegated listener: read data-confirm, swap the button to a spinner, POST application/x-www-form-urlencoded, re-enable the button on success (refreshState may keep the form mounted, so we don't rely on a re-render), call refreshState(). State shapes live in dashboard.rs::StateSnapshot and web_ui.rs::StateSnapshot — when adding state fields, plumb through the snapshot struct and the relevant assets/app.js render function.

Focus preservation: refreshState checks whether document.activeElement sits inside one of the managed sections and, if so, skips the refresh and retries 2s later. The operator never has the form yanked out from under them mid-type; the update lands on the first retry after they blur.

<details> open-state preservation: any collapsible element tagged with data-restore-key="<stable-key>" survives the refresh. snapshotOpenDetails() walks managed sections before render, restoreOpenDetails() re-applies after. Used today for the journald viewer (journal:<container>), the agent-config viewer (agent-config:<name>), and approval diff blocks (approval-diff:<id>). Setting .open = true programmatically also fires the toggle event, so any lazy-fetch wired to it re-runs cleanly on restore.

Both bind their listeners with SO_REUSEADDR via tokio::net::TcpSocket plus a retry loop on AddrInUse (12 tries, exponential backoff capped at 2s) so an nspawn restart that races the previous process's socket release resolves itself.
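A minimal sketch of the retry schedule described above, using only the standard library. The 12 tries and the 2s cap come from the text; the 50 ms base delay, the blocking `thread::sleep`, and the names `backoff_delay` and `bind_with_retry` are assumptions made for illustration. The real code is async and sets SO_REUSEADDR through tokio::net::TcpSocket, which this sketch does not attempt.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Try count and cap come from the text; the base delay is an assumption.
pub const MAX_TRIES: u32 = 12;
pub const BASE_DELAY: Duration = Duration::from_millis(50);
pub const MAX_DELAY: Duration = Duration::from_secs(2);

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
pub fn backoff_delay(attempt: u32) -> Duration {
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    BASE_DELAY.saturating_mul(factor).min(MAX_DELAY)
}

/// Call `bind` until it succeeds, retrying only on `AddrInUse` and giving
/// up after `MAX_TRIES` attempts. Any other error is returned immediately.
pub fn bind_with_retry<T>(mut bind: impl FnMut() -> io::Result<T>) -> io::Result<T> {
    let mut attempt = 0;
    loop {
        match bind() {
            Err(e) if e.kind() == io::ErrorKind::AddrInUse && attempt + 1 < MAX_TRIES => {
                thread::sleep(backoff_delay(attempt));
                attempt += 1;
            }
            other => return other,
        }
    }
}
```

With these assumed constants the delays run 50 ms, 100 ms, 200 ms and so on, reaching the 2s cap from the seventh retry onward.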

Dashboard sections (top to bottom)

Notification row — 🔔 enable notifications button when permission ungranted; 🔕 mute / 🔔 unmute toggle once granted; inline "unsupported / blocked" message when applicable. Sits under the banner.
C0NTAINERS — live containers with their action surface. Pulsing red banner at the top of this section if any two sub-agents hash to the same port (port_conflicts from /api/state): the operator must rename one of them and rebuild. lifecycle::{spawn,rebuild} also preflight this and refuse with a clear error message naming the conflicting agent.
K3PT ST4T3 — destroyed-but-state-kept tombstones (size + age + claude-creds badge). Two actions: ⊕ R3V1V3 (queues a Spawn approval; existing state is reused), PURG3 (wipes state + applied dirs; POST /purge-tombstone/{name}).
M1ND H4S QU3STI0NS (amber pulsing border) — pending operator-targeted ask questions, i.e. rows with target IS NULL; peer-to-peer questions live in the same table but never surface here. Free-text fallback always rendered alongside any option list; multi=true renders options as checkboxes; submit merges selections + free text comma-joined. Each row has a ✗ CANC3L button that resolves the question with [cancelled]. Questions with a ttl_seconds show a ⏳ MM:SS chip; the host-side watchdog auto-cancels with [expired] when the deadline fires.
0PER4T0R 1NB0X — recent messages addressed to operator, derived client-side from the dashboard event stream (no longer a snapshot field). Cold load seeds from /dashboard/history's 200-message backfill; subsequent sent events with to == "operator" are appended live. Cap 50, newest-first.
P3NDING APPR0VALS — the queue. The R3QU3ST SP4WN form lives at the top of this section since submitting it immediately queues an approval that lands directly below.
MESS4GE FL0W — live broker tail wrapped in a .terminal-wrap (same chrome as the per-agent terminal). Cold load backfills the last ~200 messages from /dashboard/history; live frames arrive on /dashboard/stream and dispatch through HiveTerminal.create. Each row is one broker event — sent or delivered — with from → to: body; per-agent thinking / tool calls / claude chatter stay out of this view, only what passes through hive-c0re's broker. Sticky-bottom auto-scroll + "↓ N new" pill match the per-agent page. Below the stream sits a terminal-style compose box: @name picks the recipient (sticky across sends via localStorage; auto-complete from the live container list, Tab/Enter to confirm; @* broadcasts to every registered agent), starting a message with @<name> body retargets in one stroke, plain text sends to the sticky recipient. POST /op-send drops {from:"operator", to, body} into the broker and returns 200; the resulting SSE frame re-renders both the terminal row and the inbox section (no /api/state refetch). Manager is addressed as @manager (the broker recipient string), not @hm1nd (the container name); the auto-complete swaps automatically.
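The compose-box addressing rules under MESS4GE FL0W can be sketched as a small parser. The real logic lives in the JavaScript client; the types and function names below are hypothetical, and folding the hm1nd → manager swap into the parser (rather than the auto-complete) is an assumption made to keep the example self-contained.

```rust
/// Where a composed line should go. `Broadcast` corresponds to `@*`.
#[derive(Debug, Clone, PartialEq)]
pub enum Target {
    Agent(String),
    Broadcast,
}

/// Result of parsing one line from the compose box.
#[derive(Debug, Clone, PartialEq)]
pub enum Compose {
    /// `@name` alone: only switch the sticky recipient.
    Pick(Target),
    /// A message to send, already resolved to a recipient.
    Send { to: Target, body: String },
    /// Empty input, or plain text with no sticky recipient selected yet.
    Ignored,
}

fn target_from(name: &str) -> Target {
    match name {
        "*" => Target::Broadcast,
        // The container is called hm1nd, but the broker recipient is "manager".
        "hm1nd" => Target::Agent("manager".to_string()),
        other => Target::Agent(other.to_string()),
    }
}

/// Parse one compose line against the current sticky recipient.
pub fn parse_compose(line: &str, sticky: Option<&Target>) -> Compose {
    let line = line.trim();
    if let Some(rest) = line.strip_prefix('@') {
        let (name, body) = match rest.split_once(char::is_whitespace) {
            Some((name, body)) => (name, body.trim()),
            None => (rest, ""),
        };
        if !name.is_empty() {
            let to = target_from(name);
            return if body.is_empty() {
                Compose::Pick(to)
            } else {
                Compose::Send { to, body: body.to_string() }
            };
        }
    }
    match sticky {
        Some(to) if !line.is_empty() => Compose::Send {
            to: to.clone(),
            body: line.to_string(),
        },
        _ => Compose::Ignored,
    }
}
```

The caller would persist the target of a `Pick` (and, if retargeting is meant to stick, of a prefixed `Send`) as the new sticky recipient before POSTing to /op-send.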

Container row

Two-line layout (assets/app.js::renderContainers):

Line 1: agent name (link → new tab), m1nd/ag3nt chip, needs login / needs update warning badges, in-flight ◐ pending-state… pill (replaces buttons during start / stop / restart / rebuild / destroy), container name + port.
Line 2: action buttons — ↻ R3BU1LD always, DESTR0Y + PURG3 on sub-agents, ↺ R3ST4RT + (sub-agents) ■ ST0P when running, ▶ ST4RT when stopped. Buttons dim + disable while a transient lifecycle action is in flight.
Plus two collapsible <details> blocks:
- ↳ logs · <container> — lazy-fetches journald output via GET /api/journal/{name}?unit=...&lines=... (journalctl -M <container> -b --no-pager --output=short-iso). A unit dropdown switches between the harness service (default) and the full machine journal; refresh button re-fetches.
- ↳ agent.nix · <name> — lazy-fetches the applied config file via GET /api/agent-config/{name} (read-only mirror of /var/lib/hyperhive/applied/<name>/agent.nix). Mutating this still requires request_apply_commit + approval.

↻ UPD4TE 4LL button appears above the containers list when any agent is stale. Banner pulses on each broker SSE event (pulseBanner with a 4s grace timer).

Browser notifications

Pure frontend (Notification API). Three signals trigger them:

new pending approval (per id, delta on /api/state)
new pending operator question (per id)
new broker message sent to: "operator" (live via SSE)

First /api/state after page load seeds "seen" sets without firing — only items that arrive while the page is open count. Per-event tags (hyperhive:approval:<id>, hyperhive:question:<id>, hyperhive:msg:<at>:<rand>) so distinct events stack in the OS notification center instead of overwriting each other. console.debug logs at every block point (unsupported, permission ungranted, muted) for in-browser debugging. Click focuses the dashboard tab. localStorage-backed mute toggle silences without revoking the OS permission. Requires a secure context (HTTPS or localhost); on other origins the controls hide themselves. Browsers typically suppress notifications while the originating tab is focused — that's a browser-level decision, not ours.
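The "seed without firing" behaviour can be illustrated with a small seen-set. This is a sketch of the idea only: the real implementation is frontend JavaScript, and `SeenIds` and `observe` are hypothetical names.

```rust
use std::collections::HashSet;

/// Tracks which ids have already been observed for one notification signal.
#[derive(Debug, Default)]
pub struct SeenIds {
    seeded: bool,
    ids: HashSet<String>,
}

impl SeenIds {
    /// Feed the ids currently present. The first call only seeds the set
    /// and returns nothing; later calls return the ids not seen before,
    /// which are the ones that should raise a notification.
    pub fn observe<'a>(&mut self, current: &[&'a str]) -> Vec<&'a str> {
        let mut fresh = Vec::new();
        for &id in current {
            // `insert` returns true only when the id was not in the set yet.
            if self.ids.insert(id.to_string()) && self.seeded {
                fresh.push(id);
            }
        }
        self.seeded = true;
        fresh
    }
}
```

Keeping one such set per signal (approvals, questions) means items that were already pending when the page opened never fire.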

Dashboard endpoints

POST /approve/{id} — approve a pending approval. Fires ApprovalResolved on the dashboard event channel; client updates derived approvals state from the event.
POST /deny/{id} (note=<reason>, optional) — deny a pending approval with an optional operator-supplied reason. The reason travels to the manager as HelperEvent::ApprovalResolved.note and also rides on the dashboard's ApprovalResolved event. Dashboard prompts via window.prompt() on click.
POST /{rebuild,kill,restart,start,destroy}/{name} — lifecycle. destroy accepts purge=on to also wipe state dirs.
POST /purge-tombstone/{name} — wipe a tombstone's state dirs.
POST /answer-question/{id} — answer a pending operator question.
POST /cancel-question/{id} — cancel a pending question with the sentinel [cancelled]. Same code path as a real answer.
POST /request-spawn — queue a Spawn approval.
POST /update-all — rebuild every stale container.
POST /op-send (to=<name>, body=<text>) — drop an operator-authored message into <name>'s inbox. to=* fans out to every registered agent. Returns 200; the broker Sent event re-renders both the message-flow terminal and the operator inbox without a snapshot refetch. Used by the compose textbox under MESS4GE FL0W.
GET /api/journal/{name}?unit=&lines= — journalctl viewer for a managed container.
GET /api/agent-config/{name} — read-only view of the applied agent.nix.
GET /api/state-file?path=<host-or-container-path> — bounded text read of a file under the per-agent state/ subtree or the shared /var/lib/hyperhive/shared/. Accepts the container-view forms (/agents/<n>/state/..., /shared/...) and the host form. Canonicalises + verifies the path stays inside the allow-list, refuses anything but a regular file, refuses /agents/<n>/claude / config subtrees, truncates bodies at 1 MiB. Click-time backing for the inline path-link preview.

Detection of which tokens are path links is done server-side at broker-message ingest, not client-side: the broker forwarder calls scan_validated_paths(body) — same allow-list helper the read endpoint uses — and attaches the verified file tokens to the event as file_refs: Vec<String>. The client trusts that list and linkifies only those tokens, so directories, missing files, and forbidden subtrees never become anchors. No probe endpoint, no client-side regex heuristics. Historical messages get the same treatment on /dashboard/history backfill.
GET /api/reminders — list pending reminders for the dashboard's queued-reminders panel.
POST /cancel-reminder/{id} — hard-delete a pending reminder.
GET /dashboard/stream — unified live event channel: broker sent / delivered, plus the mutation events listed below. Each frame carries seq.
GET /dashboard/history — last ~200 broker messages (wrapped as { seq, events }) for the message-flow terminal's backfill on page load.
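To make the /api/state-file allow-list concrete, here is a purely lexical pre-check on the container-view path forms plus the 1 MiB body cap. It is a sketch under stated assumptions: the real endpoint canonicalises on the filesystem, also accepts the host form, and checks that the target is a regular file, none of which is modelled here. The function names are hypothetical and Unix-style paths are assumed.

```rust
use std::path::{Component, Path};

/// Body cap from the text: responses are truncated at 1 MiB.
pub const MAX_BODY_BYTES: usize = 1024 * 1024;

/// Accept only `/shared/<...>` and `/agents/<n>/state/<...>`.
/// Anything containing `..`, a relative path, or another subtree
/// (for example `/agents/<n>/claude`) is refused.
pub fn is_allowed_container_path(path: &str) -> bool {
    if !path.starts_with('/') {
        return false;
    }
    let mut parts = Vec::new();
    for component in Path::new(path).components() {
        match component {
            Component::RootDir => {}
            Component::Normal(segment) => match segment.to_str() {
                Some(segment) => parts.push(segment),
                None => return false,
            },
            // `..` and friends: refuse rather than try to resolve lexically.
            _ => return false,
        }
    }
    match parts.as_slice() {
        ["shared", _, ..] => true,
        ["agents", _name, "state", _, ..] => true,
        _ => false,
    }
}

/// Truncate a file body to the cap without copying.
pub fn truncate_body(body: &[u8]) -> &[u8] {
    &body[..body.len().min(MAX_BODY_BYTES)]
}
```

A lexical check like this cannot see symlinks, which is why canonicalising first, as the endpoint is described as doing, matters.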

Dashboard event channel

Wire vocabulary on /dashboard/stream (kind tag is in the JSON payload):

sent / delivered — broker traffic, mirrored from the intra-process channel by a forwarder task. Used by the message-flow terminal renderer and the operator-inbox derived state.
approval_added (id, agent, approval_kind, sha_short, diff, description) / approval_resolved (id, agent, approval_kind, sha_short, status, resolved_at, note, description) — pending queue + history mutations. Client mutates a derived store and re-renders only the approvals section.
question_added (id, asker, question, options, multi, asked_at, deadline_at, target) / question_resolved (id, answer, answerer, answered_at, cancelled, target) — both operator-targeted and peer (agent-to-agent) threads fire these. The dashboard's questions pane surfaces both, with filter chips (all / @operator / @peer / per-participant) and an 0V3RR1D3 button on peer rows so the operator can answer when an agent is stuck. The ttl watchdog fires question_resolved with answerer = "ttl-watchdog" on expiry.
transient_set (name, transient_kind, since_unix) / transient_cleared (name) — lifecycle action spinners. The client ticks the elapsed-seconds badge off since_unix client-side, no polling.
container_state_changed (container: ContainerView) / container_removed (name) — per-row container mutations, emitted by Coordinator::rescan_containers_and_emit from every mutation site (actions::approve post-spawn, actions::destroy, the lifecycle_action wrapper, auto_update::rebuild_agent) and from the 10s crash_watch poll. Client upserts/removes by name; the pending overlay is read from transientsState since the payload doesn't carry it.

/api/state is only fetched on cold-load and on the few forms that mutate non-event-derived state (PURG3 + meta-update, since tombstones + meta_inputs aren't event-shaped yet). Every other section — approvals, questions, transients, containers, operator inbox, message flow — derives from /dashboard/stream after the initial snapshot, maintaining its own client-side store and applying events on top. The 5s periodic poll is gone.
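The "client-side store plus events on top" pattern can be sketched like this. Only a trimmed subset of the wire vocabulary is modelled, the integer id type is an assumption, and `DashEvent` and `DerivedStore` are hypothetical names; the real store lives in the JavaScript app.

```rust
use std::collections::BTreeMap;

/// Trimmed-down subset of the wire events; real payloads carry more fields.
#[derive(Debug, Clone, PartialEq)]
pub enum DashEvent {
    ApprovalAdded { id: u64, agent: String },
    ApprovalResolved { id: u64 },
    TransientSet { name: String, since_unix: u64 },
    TransientCleared { name: String },
}

/// State derived purely from the event stream after the initial snapshot.
#[derive(Debug, Default)]
pub struct DerivedStore {
    /// Pending approvals: id -> requesting agent.
    pub pending_approvals: BTreeMap<u64, String>,
    /// In-flight lifecycle actions: container name -> start time (unix secs).
    pub transients: BTreeMap<String, u64>,
}

impl DerivedStore {
    /// Apply one event; each variant touches exactly one section's data.
    pub fn apply(&mut self, event: DashEvent) {
        match event {
            DashEvent::ApprovalAdded { id, agent } => {
                self.pending_approvals.insert(id, agent);
            }
            DashEvent::ApprovalResolved { id } => {
                self.pending_approvals.remove(&id);
            }
            DashEvent::TransientSet { name, since_unix } => {
                self.transients.insert(name, since_unix);
            }
            DashEvent::TransientCleared { name } => {
                self.transients.remove(&name);
            }
        }
    }

    /// Elapsed seconds for a spinner, ticked locally from `since_unix`.
    pub fn elapsed_secs(&self, name: &str, now_unix: u64) -> Option<u64> {
        self.transients
            .get(name)
            .map(|since| now_unix.saturating_sub(*since))
    }
}
```

Because each event maps to one section, the renderer can redraw just that section instead of the whole page.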

Generalised form helpers: form[data-confirm="…"] pops confirm() before submit; form[data-prompt="…"] pops prompt() and stashes the answer in a hidden input named by data-prompt-field (default note).

Per-agent page

Layout, top to bottom:

Banner (gradient shimmer while state=thinking).
Title with ↑ DASHB04RD back-link (new tab) + ↻ R3BU1LD.
Status section: empty when online (alive-badge in the state row carries the signal), populated with the login form / OAuth URL when status is needs_login_*.
State row: alive badge + state badge + model chip + ctx badge + cost badge + last-turn timing + cancel-turn button + new-session button. Every chip carries a title=... tooltip with the detailed breakdown.
- Alive badge: ● alive (green) / ◌ needs login (amber) / ◌ logging in / ○ offline / … connecting. Driven by LiveEvent::StatusChanged; replaces the old "harness alive — turn loop running" paragraph so the state row carries every reachability signal.
- State badge: 💤 idle / 🧠 thinking / 📦 compacting / ○ offline / … booting, with an age suffix (12s, 2m 14s). Driven by LiveEvent::TurnStateChanged ({state, since_unix}) — the bus emits on every Bus::set_state so the badge updates without a /api/state refetch. Cold-load via /api/state.turn_state + turn_state_since.
- Model chip: model · <name> (e.g. model · haiku). Driven by LiveEvent::ModelChanged; emitted from Bus::set_model.
- Ctx badge: ctx · 142k — last inference's prompt size (input + cache_read + cache_write of the most recent model call in the just-ended turn). This is the actual context window utilisation — the number to watch when deciding whether to compact.
- Cost badge: cost · 1.3M — cumulative tokens billed across every inference in the last turn (sum of all per-call prompts). Tool-heavy turns rebill the cached prefix per call, so this routinely exceeds the model's window — it's a cost signal, not a size signal.
- Both badges driven by LiveEvent::TokenUsageChanged { ctx, cost }, emitted once at turn-end from Bus::record_turn_usage. The harness tracks per-inference usage by walking assistant events in the stream-json and updating last_inference on each one; the result event supplies cost and triggers the emit.
- Last-turn chip: last turn 12.3s appears after the first turn ends, computed from the state-since deltas.
- ■ cancel turn button: visible only while state=thinking, POSTs /api/cancel.
- ↻ new session button: always visible, amber. Confirms via window.confirm() then POSTs /api/new-session to arm a one-shot Bus flag — the next turn drops --continue, starting a fresh claude session. Subsequent turns resume normal --continue.
Polling: /api/state is fetched once on cold load, and again while status === 'needs_login_in_progress' (login session output isn't event-shaped yet). Every other badge updates from SSE; no periodic refresh timer runs.
Inbox <details> block (collapsed): inbox · N — last 30 messages addressed to this agent, fetched via AgentRequest::Recent { limit: 30 }. (Separate from AgentRequest::Recv { wait_seconds } which the harness uses internally to long-poll the broker.)
Terminal-wrap: live event tail (sticky-bottom auto-scroll + ↓ N new pill when not at bottom) followed by an operator-input textarea acting as a prompt.
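The difference between the ctx and cost badges in the state row is easiest to see in code. The sketch below tracks the most recent per-inference usage and pairs it with the cumulative totals when the result event lands. The field names mirror the usage columns mentioned in the commit message; `TurnUsage`, `on_assistant`, `on_result`, and `format_tokens` are hypothetical, and both the reset on result and the badge rounding are assumptions.

```rust
/// Token counts for one inference, or summed over a whole turn.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct TokenUsage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_read_input_tokens: u64,
    pub cache_creation_input_tokens: u64,
}

impl TokenUsage {
    /// Prompt size as the ctx badge counts it:
    /// input + cache_read + cache_write (cache creation).
    pub fn prompt_tokens(&self) -> u64 {
        self.input_tokens + self.cache_read_input_tokens + self.cache_creation_input_tokens
    }
}

/// Per-turn tracker: remembers only the latest inference's usage.
#[derive(Debug, Default)]
pub struct TurnUsage {
    last_inference: Option<TokenUsage>,
}

impl TurnUsage {
    /// Called for each `assistant` event carrying `.message.usage`.
    pub fn on_assistant(&mut self, usage: TokenUsage) {
        self.last_inference = Some(usage);
    }

    /// Called when the `result` event lands. Returns `(ctx, cost)` and
    /// clears the tracker for the next turn (the reset is an assumption).
    pub fn on_result(&mut self, cumulative: TokenUsage) -> (Option<TokenUsage>, TokenUsage) {
        (self.last_inference.take(), cumulative)
    }
}

/// Compact badge text such as `142k` or `1.3M` (rounding rule assumed).
pub fn format_tokens(n: u64) -> String {
    if n >= 1_000_000 {
        format!("{:.1}M", n as f64 / 1e6)
    } else if n >= 1_000 {
        format!("{}k", n / 1_000)
    } else {
        n.to_string()
    }
}
```

With the commit message's own example, 15 sub-calls over a 600k cached prefix, the last inference yields a ctx of 600k while the cumulative cache reads add up to 9M, which is why only the former is useful as a compaction signal.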

Live view

Each agent runs an events::Bus: a tokio::sync::broadcast<LiveEvent> plus a sqlite-backed history at /state/hyperhive-events.sqlite. The harness emits TurnStart { from, body, unread }, Stream(value) (one per parsed stream-json line), Note, TurnEnd { ok, note }. The web UI:

fetches GET /events/history on page load and replays the last 2000 events (oldest first, with .no-anim so they don't stagger);
then subscribes to GET /events/stream (SSE) for live tail;
shows a granular state badge above the terminal, driven authoritatively from /api/state.turn_state. SSE turn_start / turn_end still flip the badge instantly between renders;
sticky-bottom auto-scroll: scrolling up parks the view; new rows surface a "↓ N new" pill instead of yanking;
terminal-themed: phosphor mauve glow, Crust bg, backdrop-filter blur, row fade-in slide-up.

Per-stream rendering:

Stream tool_use →
- Write / Edit: collapsed <details> with a +/- diff body (- lines from input.old_string, + lines from input.new_string or every line of input.content). Summary carries the path + line counts.
- others (Read /path, Bash $ cmd, mcp__hyperhive__send → operator: "...", etc.): flat one-line per-tool format.
Stream tool_result short → flat ← ...; long → collapsed <details> ▸ ← Nl · headline (click to expand full body).
Stream thinking → text content if claude provided one, otherwise the bare · thinking … indicator.
Stream system init, result, rate_limit_event are dropped — too noisy.
Note → · text.
TurnEnd → ✓ turn ok / ✗ turn fail — note, triggers a refreshState().
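The Write / Edit rendering rule above (removed lines from old_string, added lines from new_string or content) can be sketched as follows. The actual renderer is JavaScript; `render_edit` is a hypothetical helper, and the exact summary format is an assumption, since the text only says it carries the path and line counts.

```rust
/// Build the `(summary, body)` pair for a collapsed Write/Edit row.
///
/// `old_string` is `None` for a Write, which has nothing to remove.
/// `new_text` is `input.new_string` for an Edit or `input.content` for a Write.
pub fn render_edit(path: &str, old_string: Option<&str>, new_text: &str) -> (String, String) {
    let removed: Vec<&str> = old_string
        .map(|text| text.lines().collect::<Vec<_>>())
        .unwrap_or_default();
    let added: Vec<&str> = new_text.lines().collect();

    let mut body = String::new();
    for line in &removed {
        body.push_str("- ");
        body.push_str(line);
        body.push('\n');
    }
    for line in &added {
        body.push_str("+ ");
        body.push_str(line);
        body.push('\n');
    }

    // Summary layout is assumed; only "path + line counts" is documented.
    let summary = format!("{} (+{} -{})", path, added.len(), removed.len());
    (summary, body)
}
```

Note that this is a plain before/after listing rather than a computed diff: every old line is shown as removed and every new line as added, which matches the description above.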

Terminal-embedded prompt

The operator input lives inside the terminal-wrap as a prompt-style textarea below the live tail: multi-line (Enter sends, Shift+Enter newlines), tab-completes slash commands.

Slash commands today:

/help — list commands locally.
/clear — wipe the local terminal view (server history kept).
/cancel — POST /api/cancel → the host shells out to pkill -INT claude and emits a Note. Also surfaces as a ■ cancel turn button in the state row while state=thinking.
/compact — POST /api/compact → host spawns turn::compact_session in the background; output streams into the live panel.
/model <name> — POST /api/model flipping Bus::set_model. Takes effect on the next turn; persisted to /state/hyperhive-model so the override survives harness restart / rebuild.
/new-session — POST /api/new-session (confirms first). Arms a one-shot on the Bus; next turn runs without --continue, dropping the resume session entirely.

Unknown /foo shows an error row instead of being silently sent.
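The slash-command handling can be sketched as a parser that separates commands from ordinary messages. The command names come from the list above; the `Input` enum, `parse_input`, and the error wording are hypothetical, and the real prompt is implemented in the JavaScript client.

```rust
/// What one submitted prompt line turns into.
#[derive(Debug, Clone, PartialEq)]
pub enum Input {
    Help,
    Clear,
    Cancel,
    Compact,
    Model(String),
    NewSession,
    /// Anything not starting with `/` is sent to the agent as a message.
    Message(String),
    /// Shown as an error row instead of being sent.
    Error(String),
}

/// Parse the textarea contents into a command, a message, or an error.
pub fn parse_input(raw: &str) -> Input {
    let text = raw.trim();
    if !text.starts_with('/') {
        return Input::Message(text.to_string());
    }
    let mut words = text.split_whitespace();
    let command = words.next().unwrap_or("");
    match command {
        "/help" => Input::Help,
        "/clear" => Input::Clear,
        "/cancel" => Input::Cancel,
        "/compact" => Input::Compact,
        "/new-session" => Input::NewSession,
        "/model" => match words.next() {
            Some(name) => Input::Model(name.to_string()),
            None => Input::Error("usage: /model <name>".to_string()),
        },
        other => Input::Error(format!("unknown command: {}", other)),
    }
}
```

Returning an explicit `Error` variant rather than falling back to `Message` is what keeps a mistyped command from reaching the agent.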

Per-agent endpoints

All POSTs return 200 (no 303 redirects). The matching mutations fire LiveEvent variants on the per-agent bus, so the client doesn't refetch /api/state on submit — the SSE stream delivers the new state faster anyway. Only the login flow still polls (session output streams in updates that aren't event-shaped).

POST /send — operator-injected message into this agent's inbox.
POST /login/{start,code,cancel} — claude OAuth login flow. Start/cancel emit LiveEvent::StatusChanged to flip the badge to/from needs_login_in_progress.
POST /api/cancel — SIGINT the in-flight claude turn. Emits a LiveEvent::Note.
POST /api/compact — run /compact on the persistent session (same MCP config + system prompt + allowed tools as a normal turn — only the stdin payload differs). Flips state to Compacting via Bus::set_state, which emits TurnStateChanged.
POST /api/model (model=<name>) — switch the model for future turns. Bus::set_model emits ModelChanged.
POST /api/new-session — arm a one-shot for the next turn to drop --continue. Emits a LiveEvent::Note.
GET /events/history — replay buffer for the terminal.

Bus events (new vocabulary on /events/stream):

status_changed { status } — online / needs_login_idle / needs_login_in_progress. Drives the alive-badge.
model_changed { model } — drives the model chip.
token_usage_changed { ctx: TokenUsage, cost: TokenUsage } — drives the ctx + cost badges. Emitted from Bus::record_turn_usage at turn-end; ctx is the last inference's usage (current context size), cost is the cumulative across every inference (the result event's totals).
turn_state_changed { state, since_unix } — drives the state badge (idle/thinking/compacting).
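As an illustration of how a turn_state_changed payload could become the state badge with its age suffix, here is a small formatting sketch. The labels and the `12s` / `2m 14s` suffix style come from the state-row description earlier in this document; `format_age` and `state_badge` are hypothetical names, treating any unrecognised state as booting is an assumption, and the real rendering happens in the JavaScript client.

```rust
/// Age suffix in the style used by the state badge: `12s`, `2m 14s`.
pub fn format_age(secs: u64) -> String {
    if secs < 60 {
        format!("{}s", secs)
    } else {
        format!("{}m {}s", secs / 60, secs % 60)
    }
}

/// Badge text for a `turn_state_changed { state, since_unix }` payload.
/// The age is computed locally from `since_unix`, so no polling is needed.
pub fn state_badge(state: &str, since_unix: u64, now_unix: u64) -> String {
    let label = match state {
        "idle" => "💤 idle",
        "thinking" => "🧠 thinking",
        "compacting" => "📦 compacting",
        "offline" => "○ offline",
        // Assumption: anything else is shown as still booting.
        _ => "… booting",
    };
    format!("{} {}", label, format_age(now_unix.saturating_sub(since_unix)))
}
```

`saturating_sub` keeps the badge at `0s` instead of underflowing if the client clock is slightly behind the host's.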
