Commit graph

39 commits

Author SHA1 Message Date
müde
0385d96bf3 dashboard: per-container journald viewer
new GET /api/journal/{name}?unit=&lines= shells out to journalctl -M
<container> -b --no-pager --output=short-iso --lines=<N> (cap 5000).
optional unit filter, restricted to hive-ag3nt.service /
hive-m1nd.service so the shell-out can't be coerced into reading
unrelated units. validates the container name against the live list
before invoking journalctl.
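
a minimal sketch of the validation described above, not the actual handler. journal_args, ALLOWED_UNITS, MAX_LINES and the HashSet standing in for the live container list are hypothetical names. clamping an oversized lines value (rather than rejecting it) and passing the filter as --unit= are assumptions.

```rust
use std::collections::HashSet;

/// Units the viewer may filter on, per the commit message.
const ALLOWED_UNITS: [&str; 2] = ["hive-ag3nt.service", "hive-m1nd.service"];
/// Upper bound on --lines, per the commit message.
const MAX_LINES: usize = 5000;

/// Build the journalctl argument list, or say why the request is rejected.
/// `live` stands in for the live container list (hypothetical shape).
fn journal_args(
    live: &HashSet<String>,
    container: &str,
    unit: Option<&str>,
    lines: usize,
) -> Result<Vec<String>, String> {
    // Reject names that are not live containers before building any command.
    if !live.contains(container) {
        return Err(format!("unknown container: {container}"));
    }
    let mut args = vec![
        "-M".to_string(),
        container.to_string(),
        "-b".to_string(),
        "--no-pager".to_string(),
        "--output=short-iso".to_string(),
        // Assumption: oversized requests are clamped to the cap.
        format!("--lines={}", lines.min(MAX_LINES)),
    ];
    if let Some(u) = unit {
        // Only the two harness units may be requested.
        if !ALLOWED_UNITS.contains(&u) {
            return Err(format!("unit not allowed: {u}"));
        }
        args.push(format!("--unit={u}"));
    }
    Ok(args)
}
```

the arguments are built as a Vec rather than a shell string, so nothing from the request is interpreted by a shell.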

frontend renders a collapsed '↳ logs · <container>' details block
on each container row. expanding triggers a lazy fetch; refresh
button re-fetches; unit dropdown switches between the harness
service (default) and the full machine journal. output sits in a
24em-tall monospace pre, auto-scrolled to the bottom on fresh
fetch.

hive-c0re's systemd unit already runs as root, so journalctl has
the access it needs.
2026-05-15 20:42:56 +02:00
müde
2146e47770 web ui: retry binding on AddrInUse during restart races
operator hit 'Address already in use (os error 98)' on a harness
restart — the new harness raced the old socket's release. add a
bind_with_retry helper that backs off (250ms doubling, capped at
2s, 12 tries ≈ 20s total) on AddrInUse before giving up. applied
to both the per-agent web UI and the hive-c0re dashboard.

proper fix would be SO_REUSEADDR via socket2 but retry covers the
TIME_WAIT case fine and keeps the dep count down. Other bind errors
still fail immediately (port permission, fd exhaustion).
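
a rough sketch of the backoff shape described above, using a blocking std::net listener for brevity; the real helper presumably binds an async listener. backoff_delay and this bind_with_retry signature are illustrative assumptions, not the code from the commit.

```rust
use std::io::ErrorKind;
use std::net::{SocketAddr, TcpListener};
use std::thread::sleep;
use std::time::Duration;

/// Delay to wait after failed attempt number `attempt` (0-based):
/// 250ms doubling, capped at 2s.
fn backoff_delay(attempt: u32) -> Duration {
    // Clamp the shift so large attempt numbers cannot overflow.
    let ms = 250u64.saturating_mul(1u64 << attempt.min(10));
    Duration::from_millis(ms.min(2000))
}

/// Try to bind up to `tries` times, retrying only on AddrInUse.
fn bind_with_retry(addr: SocketAddr, tries: u32) -> std::io::Result<TcpListener> {
    let mut attempt = 0;
    loop {
        match TcpListener::bind(addr) {
            Ok(listener) => return Ok(listener),
            Err(e) if e.kind() == ErrorKind::AddrInUse && attempt + 1 < tries => {
                sleep(backoff_delay(attempt));
                attempt += 1;
            }
            // Any other error, or the final AddrInUse, is returned as-is.
            Err(e) => return Err(e),
        }
    }
}
```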
2026-05-15 20:33:51 +02:00
müde
538e0446d7 agent page: inbox view of last 30 messages addressed to this agent
new wire request AgentRequest::Recent { limit } / ManagerRequest::Recent
(plus matching responses with Vec<InboxRow>). InboxRow moved to
hive-sh4re so it lives on both surfaces without an internal-to-wire
conversion. host-side dispatch in agent_server / manager_server
calls broker.recent_for(name, limit).
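
a small sketch of what a recent_for lookup could look like over an append-only message log. the InboxRow fields here are illustrative, not the hive-sh4re definition, and newest-first ordering is an assumption.

```rust
/// One inbox row; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct InboxRow {
    timestamp: u64,
    from: String,
    to: String,
    body: String,
}

/// Return up to `limit` rows addressed to `name`, newest first.
/// Assumes `log` is stored oldest-first (append order).
fn recent_for(log: &[InboxRow], name: &str, limit: usize) -> Vec<InboxRow> {
    log.iter()
        .rev() // walk from the newest entry backwards
        .filter(|row| row.to == name)
        .take(limit)
        .cloned()
        .collect()
}
```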

per-agent web_ui /api/state grew an inbox: Vec<InboxRow> populated
via the same per-agent socket (best-effort; transport failure
returns empty). frontend renders as a collapsible <details> section
between the state row and the terminal — fmt timestamp / from /
body in a tight grid, capped at 16em scrollable. only visible when
there are rows.
2026-05-15 20:32:19 +02:00
müde
ee5b85716d ask_operator: operator-side ✗ CANC3L on pending questions
new POST /cancel-question/{id} resolves a pending operator question
with the sentinel answer '[cancelled]' and fires the usual
HelperEvent::OperatorAnswered so the manager sees a terminal state
and can fall back. uses the same OperatorQuestions::answer path —
no special handling, the manager already has to deal with arbitrary
answer strings.
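
to illustrate why cancel needs no special handling, here is a sketch where cancel is just answer with the sentinel string. the struct layout, the ask method and the u64 id type are assumptions; only the '[cancelled]' sentinel and the OperatorAnswered name come from the commit.

```rust
use std::collections::HashMap;

/// Sentinel answer delivered when the operator cancels.
const CANCELLED: &str = "[cancelled]";

/// Stand-in for HelperEvent::OperatorAnswered.
#[derive(Debug, PartialEq)]
struct OperatorAnswered {
    id: u64,
    question: String,
    answer: String,
}

/// Hypothetical store of pending operator questions.
#[derive(Default)]
struct OperatorQuestions {
    pending: HashMap<u64, String>,
}

impl OperatorQuestions {
    /// Queue a question under the given id.
    fn ask(&mut self, id: u64, question: &str) {
        self.pending.insert(id, question.to_string());
    }

    /// Resolve a pending question. Returns the event to deliver to the
    /// manager, or None if the id is unknown or already resolved.
    fn answer(&mut self, id: u64, answer: &str) -> Option<OperatorAnswered> {
        let question = self.pending.remove(&id)?;
        Some(OperatorAnswered { id, question, answer: answer.to_string() })
    }

    /// Cancel goes through the same path with the sentinel answer.
    fn cancel(&mut self, id: u64) -> Option<OperatorAnswered> {
        self.answer(id, CANCELLED)
    }
}
```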

dashboard renders the cancel as a separate <form> below the main
qform so the answer-merge submit handler on the main form doesn't
inadvertently fire when the operator clicks cancel. confirm dialog
spells out what the manager will see.

ttl-based auto-cancel is still on the todo (would spawn a tokio task
per submitted question).
2026-05-15 20:25:11 +02:00
müde
2413d664a1 agents get a kickoff inbox message on start/restart/rebuild
new Coordinator::kick_agent(name, reason) drops a system message
into the agent's inbox so the next turn picks it up with a 'you
were just (re)started, check /state/ for notes, --continue session
is intact' hint. wakes the turn loop without any harness-side
handling needed — it's just another inbox message with sender =
'system'.
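
a toy sketch of the idea: the kickoff is an ordinary inbox message whose sender is 'system'. the Message and Coordinator shapes and the exact hint wording are assumptions for illustration.

```rust
use std::collections::{HashMap, VecDeque};

/// Illustrative inbox message.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    sender: String,
    body: String,
}

/// Hypothetical coordinator holding one inbox per agent.
#[derive(Default)]
struct Coordinator {
    inboxes: HashMap<String, VecDeque<Message>>,
}

impl Coordinator {
    /// Drop a system message into the agent's inbox so the next turn sees it.
    fn kick_agent(&mut self, name: &str, reason: &str) {
        let body = format!(
            "you were just {reason}; check /state/ for notes, --continue session is intact"
        );
        self.inboxes
            .entry(name.to_string())
            .or_default()
            .push_back(Message { sender: "system".to_string(), body });
    }
}
```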

wired from:
- dashboard /start /restart /rebuild handlers (via lifecycle_action's
  on-success tail)
- manager mcp_hyperhive_start / restart

dashboard: pending approvals + tombstones + questions now refresh on
a 5s heartbeat when nothing else is happening. previously refresh
only fired on async-form submit or on broker traffic addressed to
operator — manager-queued approvals went through neither, so the
operator had to reload to see them. 5s is the slow-path; 2s
remains for in-flight transients.
2026-05-15 20:19:36 +02:00
müde
c27111ac32 dashboard: split api_state into per-section builders
drops the #[allow(clippy::too_many_lines)] on api_state by extracting
four pure helpers:

- build_container_views — live containers + any_stale flag
- build_transient_views — agents in pre-creation Spawning state only
- build_approval_views — pending approvals with diff html
- build_tombstone_views — destroyed-but-kept state dirs

api_state itself is now ~30 lines of orchestration. zero behavior
change. each helper is independently readable + testable.
2026-05-15 20:13:08 +02:00
müde
7b4adea325 dashboard: lifecycle_action helper collapses start/stop/restart/rebuild
four POST handlers (post_kill / post_restart / post_start / post_rebuild)
were all repeating the same boilerplate: strip prefix, set_transient,
call lifecycle::X, clear_transient, match the result. extract one
helper that takes the transient kind, error-message verb, the work
body, and an optional 'on success' tail (used by kill to also
unregister + emit HelperEvent::Killed). each handler shrinks to a
single lifecycle_action(..) call. zero behavior change.
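
the helper's shape might look roughly like the sketch below. the Dashboard struct, the String error type and the argument order are assumptions, the prefix-stripping step is left out, and whether the real helper clears the transient before or after the success tail is not stated in the commit.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum TransientKind {
    Starting,
    Stopping,
    Restarting,
    Rebuilding,
}

/// Hypothetical dashboard state: in-flight actions plus an event log.
#[derive(Default)]
struct Dashboard {
    transients: HashMap<String, TransientKind>,
    log: Vec<String>,
}

/// Mark the action as in flight, run the work, clear the marker, then
/// either run the optional success tail or wrap the error with the verb.
fn lifecycle_action(
    dash: &mut Dashboard,
    name: &str,
    kind: TransientKind,
    verb: &str,
    work: impl FnOnce(&str) -> Result<(), String>,
    on_success: Option<&dyn Fn(&mut Dashboard, &str)>,
) -> Result<(), String> {
    dash.transients.insert(name.to_string(), kind); // set_transient
    let result = work(name);
    dash.transients.remove(name); // clear_transient
    match result {
        Ok(()) => {
            if let Some(tail) = on_success {
                tail(dash, name);
            }
            Ok(())
        }
        Err(e) => Err(format!("{verb} {name} failed: {e}")),
    }
}
```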
2026-05-15 20:12:03 +02:00
müde
5ee65d2f15 dashboard: K3PT ST4T3 section + agent links open in new tab
new section between containers and questions: lists every name with a
state dir under /var/lib/hyperhive/agents/ that doesn't correspond to
a live container. shows state size + last-modified age + whether
claude creds are kept. two actions per row:

- R3V1V3 — queues a spawn approval with the same name (operator
  approves to recreate; spawn flow reuses prior config + claude
  creds, no re-login needed)
- PURG3 — wipes the agent's state + applied dirs (post /purge-tombstone/
  endpoint; refuses if a live container with that name still exists)
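
the section's core rule is a set difference: names with a state dir minus names of live containers. a sketch under that reading, with hypothetical function names and the directory scan replaced by plain sets:

```rust
use std::collections::BTreeSet;

/// Names that have a state dir but no live container, in sorted order.
fn kept_state(state_dirs: &BTreeSet<String>, live: &BTreeSet<String>) -> Vec<String> {
    state_dirs.difference(live).cloned().collect()
}

/// Purge is refused while a live container with that name still exists.
fn can_purge(name: &str, live: &BTreeSet<String>) -> bool {
    !live.contains(name)
}
```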

dashboard also opens agent links in new tabs now (target=_blank +
rel=noopener) so the operator's overview tab stays put when they
dive into an agent.
2026-05-15 19:55:27 +02:00
müde
c337cc06f8 dashboard: spinners on in-flight lifecycle actions + cleaner row layout
backend:
- TransientKind grows Starting / Stopping / Restarting / Rebuilding /
  Destroying alongside the existing Spawning. each dashboard handler
  (start/restart/kill/rebuild/destroy) wraps the lifecycle call with
  set_transient + clear_transient so the dashboard knows what's in
  flight. transient kind is surfaced inline on ContainerView.pending
  (existing-container actions) — only Spawning (pre-creation) lands
  in the separate transients list.

frontend:
- container row is now two lines: identity + meta on top, action
  buttons below. less cluttered, leaves room for the pending state
  pill. pending rows dim their actions and surface a pulsing
  '◐ spawning… / starting… / stopping… / restarting… / rebuilding…
  / destroying…' indicator next to the name.
- 'needs login' / 'needs update' chips moved into a unified .badge
  styling for consistency.
- auto-refresh kicks in not only on transient spawn but on any
  container with a pending action.
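
for orientation, a sketch of the enum after this change. the variant names and label text follow the commit message; the is_pre_creation and label methods are hypothetical helpers, not confirmed API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum TransientKind {
    Spawning,
    Starting,
    Stopping,
    Restarting,
    Rebuilding,
    Destroying,
}

impl TransientKind {
    /// Only Spawning has no container yet, so only it belongs in the
    /// separate transients list; the rest ride along on the container row.
    fn is_pre_creation(self) -> bool {
        matches!(self, TransientKind::Spawning)
    }

    /// Text shown next to the pulsing indicator.
    fn label(self) -> &'static str {
        match self {
            TransientKind::Spawning => "spawning…",
            TransientKind::Starting => "starting…",
            TransientKind::Stopping => "stopping…",
            TransientKind::Restarting => "restarting…",
            TransientKind::Rebuilding => "rebuilding…",
            TransientKind::Destroying => "destroying…",
        }
    }
}
```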
2026-05-15 19:49:43 +02:00
müde
48ebfefd1a destroy --purge: also wipe agent state dirs
new --purge flag on the destroy verb (cli + admin socket + dashboard).
default destroy still keeps /var/lib/hyperhive/{agents,applied}/<name>/
so recreating with the same name reuses prior config + creds.
with --purge, both dirs go too (config history, claude creds, /state/
notes). no undo. dashboard adds a separate PURG3 button with an
explicit confirmation copy; the existing DESTR0Y button keeps the
soft semantics.
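
a sketch of the soft vs purge distinction: which extra directories a destroy would remove. purge_targets and ROOT are hypothetical names; the paths follow the layout quoted above.

```rust
use std::path::PathBuf;

/// State root quoted in the commit message.
const ROOT: &str = "/var/lib/hyperhive";

/// Directories removed in addition to the container itself.
/// A default (soft) destroy keeps both; --purge removes both.
fn purge_targets(name: &str, purge: bool) -> Vec<PathBuf> {
    if !purge {
        return Vec::new();
    }
    ["agents", "applied"]
        .iter()
        .map(|sub| PathBuf::from(ROOT).join(sub).join(name))
        .collect()
}
```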

claude.md dashboard-action-surface section updated; todo entry
dropped.
2026-05-15 19:29:14 +02:00
müde
2770630f33 ask_operator tool: non-blocking; operator answer arrives as helper event
new mcp tool on the manager surface that queues a question on the
dashboard and returns the question id immediately. operator submits an
answer via /answer-question/<id>; the dashboard fires
HelperEvent::OperatorAnswered { id, question, answer } into the manager
inbox so the next turn picks it up.

also: fix async-form button stuck on spinner after successful submit
(refreshState skipped re-rendering, so the button was never re-enabled).
2026-05-15 18:44:42 +02:00
müde
37c6504462 manager events: Spawned/Rebuilt/Killed/Destroyed + start button 2026-05-15 17:38:41 +02:00
müde
06ea0cf283 operator inbox view on dashboard; agent ui doesn't clobber typing 2026-05-15 17:23:53 +02:00
müde
6fc9862c3c dashboard: SPA shell — static index.html + app.js, /api/state JSON 2026-05-15 17:10:57 +02:00
müde
8428c693e0 dashboard: stop/restart per-container + update-all when any stale 2026-05-15 17:00:56 +02:00
müde
4f91dfef99 module: thread hyperhive package directly — operators don't apply overlays 2026-05-15 16:51:18 +02:00
müde
edf42b7e93 extract dashboard + agent CSS/JS to assets/ (include_str!) 2026-05-15 16:32:35 +02:00
müde
8fbee4fbf2 dashboard: async forms with spinner + rebuild button on every container 2026-05-15 16:21:25 +02:00
müde
e2ed58c1a7 dashboard: per-line color on approval diffs 2026-05-15 16:17:48 +02:00
müde
409263f1c9 operator input: per-agent /send form (dashboard T4LK removed) 2026-05-15 15:28:17 +02:00
müde
f99ed3fe7a manager: same lifecycle as agents; auto-spawn on hive-c0re start 2026-05-15 13:43:32 +02:00
müde
e777576528 auto-update: surface pending updates in dashboard + include manager 2026-05-15 13:31:33 +02:00
müde
d07f5eadaa dashboard: needs-login badge links to per-agent ui 2026-05-15 13:12:12 +02:00
müde
78fae44ee5 phase 8 step 3: needs-login partial-run mode + dashboard badge 2026-05-15 12:57:06 +02:00
müde
c59fa8541c phase 8 step 2: approval-gated spawn + dashboard spinner 2026-05-15 12:53:13 +02:00
müde
a42fdb3a5c phase 8 step 1: per-agent claude creds bind + destroy keeps state 2026-05-15 12:39:22 +02:00
müde
0fc287c768 fmt 2026-05-15 02:58:35 +02:00
müde
b711296460 destroy verb: CLI + admin socket + dashboard button; purges state + approvals 2026-05-15 02:57:22 +02:00
müde
fcd6563887 fmt 2026-05-15 02:02:20 +02:00
müde
1333532d3f dashboard: T4LK form — operator sends messages from the browser 2026-05-15 01:59:53 +02:00
müde
a146b147ff dashboard: GC orphan approvals on render (agent state dir missing) 2026-05-15 01:21:31 +02:00
müde
99867195e5 dashboard: distinguish missing-dir vs missing-git in approval_diff 2026-05-15 01:16:57 +02:00
müde
7c1ed07cf2 lifecycle: HYPERHIVE_GIT env override (bypass PATH); module sets it 2026-05-15 00:24:51 +02:00
müde
b20055293e fmt 2026-05-15 00:14:49 +02:00
müde
9133d9e1a3 Phase 7b: broker broadcast + dashboard SSE message-flow tail; pkgs.git in module 2026-05-15 00:13:34 +02:00
müde
46ff9c7aee dashboard: error_response takes &str 2026-05-15 00:06:51 +02:00
müde
c82d41728c Phase 7a: dashboard approve/deny + unified diff (similar crate) 2026-05-15 00:06:10 +02:00
müde
2e2989ef8c dashboard: writeln! instead of push_str(format!) 2026-05-14 23:44:19 +02:00
müde
8cf5d72798 Phase 6b: vibec0re-styled dashboard on hive-c0re + agent web UI restyled 2026-05-14 23:43:20 +02:00