From 897e7c07ae06e39dd875270d6635aa6ef4914961 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?m=C3=BCde?= Date: Fri, 15 May 2026 20:02:54 +0200 Subject: [PATCH] dashboard: spawn form moves under approvals; docs synced MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit submitting R3QU3ST SP4WN immediately queues an approval that lands in the very next list. the form belonged with that list, not at the top of containers — the agent doesn't exist yet at form time anyway. docs: claude.md grows operator_questions.rs / events.rs sqlite / broker vacuum to the file map; web-ui shape lists the actual current endpoint set (per-agent cancel/compact/history, dashboard tombstone purge/answer/spawn); live-view section now describes the state badge, sticky-bottom scroll, history backfill, and the terminal- embedded prompt with its slash commands; dashboard-action-surface rewritten around the new six-section page (containers / kept-state / questions / inbox / approvals / message-flow) and the two-line container row. new 'persistence + retention' section documenting both sqlite databases and their vacuum cadences. readme picks up the new mgr mcp surface (start/restart/ask_operator) + operator-side features list + ask_operator answer flow. todo trimmed of shipped items (bigger terminal / sticky scroll / cancel button / /compact trigger / /cancel command). new entry for the two-step spawn-with-preconfig flow. --- CLAUDE.md | 144 +++++++++++++++++++++++++++++++--------- README.md | 19 ++++-- TODO.md | 45 +++++++------ hive-c0re/assets/app.js | 35 +++++----- 4 files changed, 169 insertions(+), 74 deletions(-) diff --git a/CLAUDE.md b/CLAUDE.md index fea64a4..be0efec 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -10,15 +10,19 @@ Operator + dev notes: conventions, gotchas, per-subsystem design. ``` hive-c0re/ host daemon + CLI (one binary, subcommand-dispatched) src/main.rs clap setup; serve / spawn / kill / rebuild / list / - pending / approve / deny / destroy / request-spawn + pending / approve / deny / destroy [--purge] / + request-spawn; periodic broker vacuum task src/server.rs host admin socket (HostRequest → dispatch) src/client.rs admin-socket client src/manager_server.rs manager-privileged socket (ManagerRequest) src/agent_server.rs per-sub-agent socket listener (long-poll Recv) - src/broker.rs sqlite Message store + broadcast channel for SSE + src/broker.rs sqlite Message store + broadcast channel for SSE + + hourly vacuum of delivered>30d src/approvals.rs sqlite Approval queue + kinds - src/coordinator.rs shared state (broker/approvals/transient/sockets) - src/actions.rs approve/deny/destroy + src/operator_questions.rs sqlite question queue backing `ask_operator` + src/coordinator.rs shared state (broker/approvals/questions/transient/ + sockets) + tombstone enumeration + src/actions.rs approve/deny/destroy (transient-aware) src/auto_update.rs startup rebuild scan + ensure_manager src/lifecycle.rs `nixos-container` shellouts, per-agent flake generator src/dashboard.rs axum HTTP: static shell + /api/state JSON + actions @@ -27,14 +31,16 @@ hive-c0re/ host daemon + CLI (one binary, subcommand-dispatched) hive-ag3nt/ in-container harness crate; produces TWO binaries src/lib.rs re-exports + DEFAULT_SOCKET, DEFAULT_WEB_PORT src/client.rs generic JSON-line request/response over unix socket - src/web_ui.rs per-container axum HTTP page - src/events.rs LiveEvent + broadcast Bus for the SSE stream + src/web_ui.rs per-container axum HTTP page (incl /api/cancel, + /api/compact, /events/history) + src/events.rs LiveEvent + broadcast Bus + sqlite-backed history + (/state/hyperhive-events.sqlite) + hourly vacuum src/turn.rs claude --print + stream-json pump; --compact retry src/mcp.rs embedded MCP server (rmcp): AgentServer + ManagerServer src/login.rs probe /root/.claude/ for a valid session src/login_session.rs drives `claude auth login` over stdio pipes - src/bin/hive-ag3nt.rs sub-agent main - src/bin/hive-m1nd.rs manager main + src/bin/hive-ag3nt.rs sub-agent main (Serve + Mcp subcommands) + src/bin/hive-m1nd.rs manager main (Serve + Mcp subcommands) assets/ index.html, agent.css, app.js (include_str!) prompts/ static role/tools/settings for claude (include_str!): agent.md — sub-agent system prompt @@ -138,16 +144,23 @@ Both the dashboard (port 7000) and the per-agent web UIs (8000 / - `GET /static/*.css` + `GET /static/*.js` → static assets shipped via `include_str!` so there's no runtime file dependency. - `GET /api/state` → JSON snapshot the JS app renders into the DOM. -- `POST /` (approve, deny, kill, restart, rebuild, destroy, - request-spawn, update-all, send, login/*) → idempotent action endpoints. - `GET /events/stream` (per-agent) and `GET /messages/stream` (dashboard) are `text/event-stream` SSE for live updates. +Per-agent endpoints: `POST /send`, `POST /login/{start,code,cancel}`, +`POST /api/cancel`, `POST /api/compact`, `GET /events/history`. + +Dashboard endpoints: `POST /{approve,deny}/{id}`, `POST +/{rebuild,kill,restart,start,destroy}/{name}`, `POST +/purge-tombstone/{name}`, `POST /answer-question/{id}`, `POST +/request-spawn`, `POST /update-all`. + The JS app handles all `form[data-async]` submissions via a delegated listener: read `data-confirm`, swap the button to a spinner, POST `application/x-www-form-urlencoded` (axum's `Form` extractor rejects -multipart), then on success call `refreshState()` (re-fetch `/api/state` -and re-render). No full-page reloads. +multipart), then on success re-enable the button (refreshState often +keeps the form mounted) and call `refreshState()` (re-fetch +`/api/state` and re-render). No full-page reloads. Per-agent + dashboard state shapes live in `dashboard.rs::StateSnapshot` and `web_ui.rs::StateSnapshot`. When adding new state fields, plumb @@ -226,12 +239,25 @@ into the wake prompt + UI header. New tools call this helper. `Bash` is on the allow-list pending a finer-grained pattern allow-list (`Bash(git *)`-style) — see [TODO.md](TODO.md). -**Live view.** Each agent runs an `events::Bus` (a -`tokio::sync::broadcast` wrapper). The harness emits -`TurnStart { from, body, unread }`, `Stream(value)` (one per parsed -stream-json line), `Note`, `TurnEnd { ok, note }`. The web UI subscribes -via `/events/stream` (SSE) and a JS panel (terminal-themed: Crust bg, inset -shadow, monospace) renders rows: +**Live view.** Each agent runs an `events::Bus` (broadcast channel + +sqlite-backed history at `/state/hyperhive-events.sqlite`). The harness +emits `TurnStart { from, body, unread }`, `Stream(value)` (one per +parsed stream-json line), `Note`, `TurnEnd { ok, note }`. The web UI: + +- fetches `GET /events/history` on page load and replays the last + 2000 events (oldest first, `.no-anim` so they don't stagger), +- then subscribes to `GET /events/stream` (SSE) for live tail, +- shows a granular state badge above the terminal (`💤 idle / 🧠 + thinking / ○ offline · `) driven from `turn_start`/`turn_end`, + with a flash animation on transition, +- sticky-bottom auto-scroll: scrolling up parks the view; new rows + surface a "↓ N new" pill instead of yanking. Scrolling back to + bottom clears the counter, +- terminal-themed: phosphor mauve glow, Crust bg, backdrop-filter + blur, row fade-in slide-up, banner gradient shimmer while + state=thinking. + +Per-tool rendering: - `TurnStart` → `◆ TURN ← · N unread` header + indented body. - `Stream` `tool_use` → `→ Read /path` / `→ Bash $ cmd` / @@ -247,8 +273,20 @@ shadow, monospace) renders rows: `refreshState()` so the page form view reflects state transitions (e.g. login just landed). -The operator send form sits below the live panel, so the tail is what -you read first. +The operator input lives *inside* the terminal-wrap as a prompt-style +textarea below the live tail: multi-line (Enter sends, Shift+Enter +newlines), tab-completes slash commands. Available slash commands: + +- `/help` — list commands locally +- `/clear` — wipe the visible terminal (server history kept) +- `/cancel` — POST `/api/cancel` (host shellouts `pkill -INT + claude`, emits a Note); also surfaces as a `■ cancel turn` button + in the state row while state=thinking +- `/compact` — POST `/api/compact` (host spawns + `turn::compact_session` in the background; output streams into the + live panel) + +Unknown `/foo` shows an error row instead of being silently sent. ## Manager (hm1nd) is hive-c0re-managed @@ -334,19 +372,40 @@ loops over every stale container. ## Dashboard action surface -Container row buttons (rendered per-state by `assets/app.js`): +Page sections (top to bottom): -- Always: `↻ R3BU1LD` (calls `lifecycle::rebuild`), and for sub-agents - `DESTR0Y` (container removed, state + creds kept) + `PURG3` - (DESTR0Y plus wipes `/var/lib/hyperhive/{agents,applied}//`; - no undo). -- Running: `↺ R3ST4RT` + (sub-agents only) `■ ST0P`. -- Stopped: `▶ ST4RT`. -- Stale marker: clickable `needs update ↻` badge (same target as rebuild - but only shown when out of date). +1. **C0NTAINERS** — live containers with their action surface (below). +2. **K3PT ST4T3** — destroyed-but-state-kept tombstones (size + + age + claude-creds badge). Two actions: `⊕ R3V1V3` (queues a + Spawn approval; existing state is reused), `PURG3` (wipes + state + applied dirs; `POST /purge-tombstone/{name}`). +3. **M1ND H4S QU3STI0NS** — pending `ask_operator` questions + (amber pulsing border). Always renders a free-text fallback + alongside any option list; `multi=true` renders options as + checkboxes; submit merges selections + free text comma-joined. +4. **0PER4T0R 1NB0X** — recent messages addressed to `operator` + (last 50, from the broker). +5. **P3NDING APPR0VALS** — the queue. The R3QU3ST SP4WN form + lives at the top of this section since submitting it immediately + queues an approval that lands directly below. +6. **MESS4GE FL0W** — live broker SSE tail. -Top of the containers list: `↻ UPD4TE 4LL` (when any stale) + the -"R3QU3ST SP4WN" form for queuing a new agent through the approval flow. +Container row (two-line layout, `assets/app.js::renderContainers`): + +- Line 1: agent name (link → new tab), m1nd/ag3nt chip, `needs + login` / `needs update` warning badges, in-flight `◐ pending-state…` + pill (replaces buttons during start/stop/restart/rebuild/destroy), + container name + port. +- Line 2: action buttons — `↻ R3BU1LD` always, `DESTR0Y` + `PURG3` + on sub-agents, `↺ R3ST4RT` + (sub-agents) `■ ST0P` when running, + `▶ ST4RT` when stopped. Buttons dim + disable while a transient + lifecycle action is in flight. + +`↻ UPD4TE 4LL` button appears above the containers list when any +agent is stale. + +Banner pulses on each broker SSE event (`pulseBanner` with a 4s +grace timer). ## Approval flow @@ -374,3 +433,26 @@ The container's `--flake` ref is `#default`. The flake extends an inline module setting `programs.git.config.user` (committer identity = the agent's name) and `systemd.services..environment` (HIVE_PORT, HIVE_LABEL, HIVE_DASHBOARD_PORT). + +## Persistence + retention + +Two sqlite files; both autovacuum on a 1h tokio task: + +- **`/var/lib/hyperhive/broker.sqlite`** (host) — `messages` + + `approvals` + `operator_questions` tables. `Broker::vacuum_delivered` + drops delivered messages older than 30 days; undelivered rows are + always kept. Approvals + questions are kept indefinitely (auditable). +- **`/state/hyperhive-events.sqlite`** (per-container, bind-mounted + from `/var/lib/hyperhive/agents//state/`) — every `LiveEvent` + emitted on the per-agent `Bus`. Hourly vacuum drops rows older than + 7 days, then trims to the most recent 2000. Path overridable via + `HYPERHIVE_EVENTS_DB` (for dev / no-`/state` setups; on open failure + the Bus falls back to no-store mode rather than crashing the + harness). Survives destroy/recreate; gone on `--purge`. + +State dirs (per agent, under `/var/lib/hyperhive/agents//`): +`config/` (proposed nix repo), `claude/` (creds, bind-mounted RW to +`/root/.claude`), `state/` (durable notes + events db, bind-mounted to +`/state`). Wiped only on explicit `--purge`. Tombstones (state dir +without a live container) surface in the dashboard's K3PT ST4T3 +section so the operator can either revive or purge. diff --git a/README.md b/README.md index 4878ebe..de64cce 100644 --- a/README.md +++ b/README.md @@ -29,8 +29,9 @@ host (NixOS, runs hive-c0re.service) │ manager additionally gets /agents RW) │ ├── hm1nd hive-m1nd serve : claude turn loop + - │ MCP (send / recv / request_spawn / kill / - │ request_apply_commit) + web UI on :8000 + │ MCP (send / recv / request_spawn / kill / start / + │ restart / request_apply_commit / ask_operator) + │ + web UI on :8000 │ └── h- hive-ag3nt serve : claude turn loop + MCP (send / recv) + web UI on a hashed :8100-8999 @@ -39,13 +40,23 @@ host (NixOS, runs hive-c0re.service) Each turn: harness pops one inbox message (Recv long-polls server-side and wakes on a broker Sent event) → builds a wake prompt → spawns `claude --print --continue --output-format stream-json --mcp-config …` → -streams JSON events into the per-agent SSE bus → claude drives any further -`recv`/`send` itself via the embedded MCP server. +streams JSON events into the per-agent SSE bus + a sqlite history db → +claude drives any further `recv`/`send` itself via the embedded MCP server. + +Operator surface per agent: terminal-themed live tail with a textarea +prompt; slash commands `/help` `/clear` `/cancel` `/compact`; granular +state badge (idle / thinking / offline) with age timer; cancel-turn +button while thinking; sticky-bottom auto-scroll with "↓ N new" pill; +event history backfilled on page load. Config changes flow the other way: manager edits `/agents//config/agent.nix` (bind-mounted from the host's proposed repo) → commits → submits the sha as an approval → operator clicks ◆ APPR0VE on the dashboard → hive-c0re copies the file into the applied repo and `nixos-container update`s the agent. +For decisions the manager needs human signal on, `ask_operator(question, +options?, multi?)` queues a free-text/checkbox/radio form on the +dashboard; the answer arrives later as a `HelperEvent::OperatorAnswered` +in the manager's inbox. ## Host config diff --git a/TODO.md b/TODO.md index c4bb48d..d5ccad3 100644 --- a/TODO.md +++ b/TODO.md @@ -34,28 +34,10 @@ Pick anything from here when relevant. Cross-cutting design notes live in `napping 😴` once the `/compact` trigger and `nap` tool exist — both need a harness signal (an explicit `LiveEvent::StateChange` variant or piggyback on Note). -- **Terminal: slash commands beyond /help and /clear.** Operator-facing - in-terminal commands still to add: `/model`, `/compact`, `/cancel`. - Each needs harness-side support (model override, force compaction, - cancel current claude turn). -- **Terminal: bigger.** The 32em max-height is cramped on a 1080p+ - screen. Grow it (e.g. `min(70vh, 60em)`) so the live tail is the - main visual element of the page rather than a strip. -- **Terminal: sticky-bottom auto-scroll.** Today every appended row - scrolls to bottom, so the view shifts while the operator is reading - scrolled-up. Track whether the user is *already* at the bottom - (within a small threshold), and only auto-scroll when that's true. - Show a small "↓ N new" indicator when not at bottom; click to jump. -- **Terminal: cancel-current-turn button.** Explicit "kill claude - process for this turn" control. Harness needs to track the - in-flight claude child PID and offer a `/cancel` endpoint that sends - SIGTERM; UI surfaces a button while the state badge is `thinking`. - Slash-command equivalent: `/cancel`. -- **`/compact` trigger.** Operator-initiated compaction of the current - claude session — `claude --print --continue` with `/compact` over the - same session id. Surfaces as a slash command in the terminal + a - toolbar button while the state badge is `idle`. Sets state to - `compacting` during the run. +- **Terminal: `/model` slash command.** Operator-typeable model + override from the terminal. Depends on the model-override work + above; once an override mechanism exists, wire a `/model ` + command that POSTs to a new endpoint. - **xterm.js terminal** embedded per-agent, attached to a PTY exposed by the harness. Pairs well with the unprivileged-container work — would let the operator drop into the container without `nixos-container root-login`. @@ -87,6 +69,25 @@ Pick anything from here when relevant. Cross-cutting design notes live in manager fall back. Wire the timeout into `OperatorQuestions::wait_answered` and surface remaining-time on the dashboard. +## Spawn flow + +- **Two-step spawn.** Today `request_spawn(name)` is one shot: manager + asks → operator approves → container is created with a default + `agent.nix` and empty `/state/`. Manager has no way to pre-stage + per-agent prompt material, package additions, or initial notes before + the agent first wakes. Split into: + 1. `request_spawn_draft(name)` — host creates the per-agent + `proposed/` repo (initial commit) and `state/` dir with no + container; manager now has `/agents//{config,state}/` to + edit + commit just like an existing agent. + 2. `request_spawn_commit(name, commit_ref)` — submits the queued + approval; operator sees the diff in the dashboard like a normal + `apply_commit`; on approve the container is created from that + commit. + Backwards-compat: keep the existing one-shot `request_spawn` for + trivial agents (operator can still type a name in the dashboard). + Surface "drafts" as a new section between K3PT ST4T3 and approvals. + ## Loop substance - **`nap` tool.** Agent-side MCP tool `mcp__hyperhive__nap(seconds)` that diff --git a/hive-c0re/assets/app.js b/hive-c0re/assets/app.js index 0596b6b..40e1590 100644 --- a/hive-c0re/assets/app.js +++ b/hive-c0re/assets/app.js @@ -83,23 +83,6 @@ )); } - const spawn = el('form', { - method: 'POST', action: '/request-spawn', - class: 'spawnform', 'data-async': '', - }); - spawn.append( - el('input', { - name: 'name', - placeholder: 'new agent name (≤9 chars)', - maxlength: '9', required: '', autocomplete: 'off', - }), - el('button', { type: 'submit', class: 'btn btn-spawn' }, '◆ R3QU3ST SP4WN'), - ); - root.append(spawn); - root.append(el('p', { class: 'meta' }, - 'spawn requests queue as approvals. operator approves below to actually create the container.', - )); - if (s.transients.length) { const ul = el('ul'); for (const t of s.transients) { @@ -337,6 +320,24 @@ function renderApprovals(s) { const root = $('approvals-section'); root.innerHTML = ''; + + // Spawn request form: submitting it queues a Spawn approval that + // lands in this same list, so the form belongs here rather than on + // the containers list (the agent doesn't exist yet). + const spawn = el('form', { + method: 'POST', action: '/request-spawn', + class: 'spawnform', 'data-async': '', + }); + spawn.append( + el('input', { + name: 'name', + placeholder: 'new agent name (≤9 chars)', + maxlength: '9', required: '', autocomplete: 'off', + }), + el('button', { type: 'submit', class: 'btn btn-spawn' }, '◆ R3QU3ST SP4WN'), + ); + root.append(spawn); + if (!s.approvals.length) { root.append(el('p', { class: 'empty' }, 'queue empty')); return;