Commit graph

211 commits

Author SHA1 Message Date
müde
6cf66e23dc actions: deny plants annotated denied/<id> tag
apply-commit denials now leave a git object behind: tag
denied/<id> annotated with the operator's note (or empty body
if they didn't supply one) at proposal/<id> inside the applied
repo. rejected configs become first-class git history — git
show denied/<id> in the manager's applied.git mount yields the
tree the operator rejected plus the reason. helper event
carries the tag for parity with deployed/failed. spawn denials
fall through unannotated since they have no proposal commit.
deny becomes async (single git plumbing call); dashboard +
admin-socket callers grow .await.
2026-05-15 23:01:22 +02:00
müde
315d4289c7 actions: tag-driven approve(ApplyCommit) flow
run_apply_commit walks the approval through the tag state
machine in applied: approved/<id> + building/<id> stamped
before the build, then git read-tree --reset to proposal/<id>
populates the working dir without moving HEAD. on rebuild
success deployed/<id> is planted and refs/heads/main fast-
forwards to the proposal. on failure failed/<id> is annotated
with the build error and the working tree resets back to main
so the agent stays evaluable. helper events Rebuilt +
ApprovalResolved both carry the terminal tag so the manager
can git-show the exact tree (and read the failure note from
an annotated tag) against its read-only applied.git mount.
finish_approval grows a terminal_tag param; spawn path passes
None. lifecycle::apply_commit deleted.
2026-05-15 23:00:01 +02:00
müde
35b0edaf27 manager_server: fetch+tag at request_apply_commit submit
submit_apply_commit (1) queues the approval row, (2) git-fetches
the manager-supplied sha from proposed into applied, pins it as
refs/tags/proposal/<id>, (3) persists the resolved sha on the
row via approvals.set_fetched_sha. from this point on the
proposal is immutable from the manager's perspective: amends
or force-pushes in proposed do not change what hive-c0re will
build. fetch failures mark the row failed and surface the error
to the manager so a phantom pending entry can't linger.
2026-05-15 22:57:43 +02:00
müde
8cb8fcedad lifecycle: setup_applied seeds via fetch + tags deployed/0
new shape: applied is git-init'd at first spawn, fetches
proposed's initial commit into its main, tags deployed/0 there.
the wrapper flake.nix is regenerated on every spawn/rebuild
but no longer tracked — apply churn vanishes, manager-authored
files in the proposal flow now survive untouched. setup_applied
gains an Option<&Path> for proposed (None on rebuild paths
that just refresh the flake). pre-overhaul applied dirs are
detected via the missing deployed/0 tag and bail loudly with
the destroy --purge migration hint. apply_commit is stubbed
with a clear error until the tag-driven approve flow lands.
2026-05-15 22:56:58 +02:00
müde
63ef69674b lifecycle: git helpers for tag-driven applied repo
new plumbing for the upcoming flow: git_fetch_to_tag (pulls a
sha from proposed into applied and pins it as a tag in one
shot), git_rev_parse (normalises shas + reads back tag
targets), git_tag / git_tag_annotated (lightweight vs body-
carrying for failed/denied), git_read_tree_reset (replace
working tree without moving HEAD — lets main stay on last
known-good across an in-flight build), git_update_ref (ff main
on deploy). annotated tag bodies go via stdin to avoid escape
games. all dead-code-allowed; callers land in subsequent
commits.
2026-05-15 22:52:23 +02:00
müde
b32c3d4f98 approvals: persist fetched_sha alongside the queue
new column fetched_sha records the canonical sha hive-c0re
plans to fetch from the proposed repo into applied at submit
time. distinct from commit_ref (manager-supplied, may be
amended out from under the queue). set_fetched_sha is unused
until manager_server wires the fetch step next commit.
2026-05-15 22:49:04 +02:00
müde
871e7bf3fa wire types: add sha + tag to Approval and HelperEvent
approval grows fetched_sha (canonical hive-c0re-vouched sha,
distinct from manager-supplied commit_ref). helperevent
{approvalresolved,spawned,rebuilt} grow optional sha + tag so
the manager can git-show the exact tree it's hearing about
(against the upcoming /agents/<n>/applied.git RO mount) and
know which terminal tag landed. all serde-defaulted; existing
construction sites pass none until the tag-driven flow lands.
2026-05-15 22:47:39 +02:00
müde
6a2ffd521b surface agent-vs-agent port collisions (manager:8000 can't collide)
manager is fixed at 8000, sub-agents are 8100-8999, so collisions
are strictly between two sub-agents hashing to the same value.
the colliding container's harness restart-loops on AddrInUse —
which the user just hit on :8945. previously the only sign was a
buried journalctl warn line.

now surfaced two ways:

- lifecycle::spawn / rebuild preflight: walks the live container
  list, computes each agent's hashed port, refuses with
  'port N already taken by <other> — rename one of them' if any
  running sub-agent shares the new agent's port. so the operator
  sees an actionable error in the dashboard's transient pill /
  approve-result instead of waiting for the harness to die.

- /api/state grows a port_conflicts: [{port, agents: [...]}]
  array; dashboard renders a pulsing red banner above the
  containers list listing each cluster. matches the questions
  panel pulse so it's hard to miss.
2026-05-15 22:08:19 +02:00
müde
2029840671 deny: operator can attach a reason that reaches the manager
clicking DENY on the dashboard now prompts for an optional reason
('reason for denying (optional, sent to manager):'). the value
rides along as a hidden 'note' form field; backend chain:

  POST /deny/{id} { note }
    → actions::deny(coord, id, Some(note))
    → Approvals::mark_denied writes it to the row
    → HelperEvent::ApprovalResolved { ..., note: Some("...") }

manager already had note: Option<String> on the event, just never
populated for denials before. host admin socket (hive-c0re deny)
still passes None.

generalized the prompt-on-submit pattern: any form with a
data-prompt attribute pops a window.prompt() before the POST and
stashes the answer in a hidden input named by data-prompt-field
(default 'note'). reusable for future opt-in note fields.
2026-05-15 21:58:42 +02:00
müde
91c78d626f dashboard: per-container applied agent.nix viewer
new GET /api/agent-config/{name} returns the contents of
/var/lib/hyperhive/applied/<name>/agent.nix — the file the
container actually builds against. validated against the live
container list to avoid arbitrary filesystem reads.

frontend mirrors the journald viewer: collapsed <details> on each
container row, lazy-fetches on expand, refresh button re-fetches.
restore-keyed (agent-config:<name>) so it survives the dashboard
heartbeat refresh.

read-only — mutating the applied config goes through the existing
request_apply_commit + operator approval flow.
2026-05-15 21:46:25 +02:00
müde
80229c6af9 manager: needs_login / logged_in / needs_update events + update tool
crash_watch grows two more state-axes alongside running/stopped:

- logged-in (claude session dir populated for the agent)
- up-to-date (recorded flake rev matches current)

per-tick transitions emit HelperEvent::NeedsLogin / LoggedIn /
NeedsUpdate. seed-on-first-tick semantics retained — nothing fires
on harness boot for agents that were already in their state. only
needs_update fires the 'stale appeared' direction; the resolved
direction is already covered by Rebuilt.

new mcp__hyperhive__update(name) on the manager surface: idempotent
rebuild via auto_update::rebuild_agent. transient-aware (Rebuilding)
so the dashboard shows the spinner. login intentionally has NO tool
— it's interactive OAuth, only the operator can complete it.

prompts + approvals doc + turn-loop doc updated. todo grows a
'show per-agent applied config in dashboard' entry (separate
follow-up).
2026-05-15 21:42:13 +02:00
müde
b374f39b0d dashboard: preserve <details open> across refresh via data-restore-key
generalises the focus-preservation pattern to expanded details
sections (journald viewer was collapsing on every 5s refresh; same
issue for approval diff blocks). before re-render we snapshot
which <details data-restore-key=...> are open; after render we
re-apply. setting .open = true programmatically also fires the
toggle event, so journald's lazy-fetch listener re-runs cleanly.

tagged: journal:<container>, approval-diff:<id>. anything else
that should survive a refresh just needs a stable data-restore-key
attribute.
2026-05-15 21:37:17 +02:00
müde
3b532753b3 notifications: per-event tags + debug logs
bug: all notifications used tag='hyperhive', so each new fire
replaced the previous — operator only ever saw one at a time and
might miss the fact that a second arrived. now per-event tags
(hyperhive:approval:<id>, hyperhive<id>,
hyperhive:msg:<at>:<rand>) so distinct events stack in the OS
notification center.

dropped the bogus icon (was pointing at dashboard.css) — some
browsers refuse to display a notification with an invalid icon.

added console.debug at every block point (not supported, permission
not granted, muted) and a 'shown' log on success, so the operator
can see in the browser console exactly why a notification didn't
fire.

note for the operator: most browsers also suppress notifications
while the originating tab is FOCUSED. that's a browser-level
decision, not ours.
2026-05-15 21:34:21 +02:00
müde
d275b50177 dashboard: don't yank the form away while operator is typing
every refreshState tick does root.innerHTML = '' across the managed
sections, which destroys any focused input. detect the case before
re-rendering: if document.activeElement is an INPUT / TEXTAREA /
SELECT inside one of the managed sections, skip this tick and try
again in 2s. eventually the operator blurs and the refresh lands.

managed section ids: containers / tombstones / questions / inbox /
approvals. msgflow + message-flow SSE rows don't have inputs so
they're not affected.
2026-05-15 21:19:01 +02:00
müde
acaa0eb895 agent_web_port: back to pure hash, drop port-file dance
operator's call: probing-forward + state-file machinery is more
brittle than the bug it tried to fix. revert to the original
deterministic FNV-1a hash mod 900. collisions are real but rare;
operator resolves by renaming (different name → different hash) and
rebuilding. no per-agent port file, no scan, no migration path,
nothing to drift out of sync with the running container.

existing port files on disk are silently ignored — operator
rebuilds affected agents to regenerate flakes from the deterministic
hash.
2026-05-15 21:17:31 +02:00
müde
c35f566d15 agent_web_port: actually resolve legacy collisions
previous attempt was wrong: the legacy branch returned port_hash
unconditionally, so two legacies hashing to the same port both
wrote that port and the collision persisted (test still trying to
bind coder's port).

new rule: always probe forward from port_hash, with scan_taken_ports
parameterised by include_implicit_hashes:

- legacy migration (applied dir exists, no port file): pass false.
  scan only counts other agents' port files. first-queried legacy
  claims its hash; subsequent colliders see the first's port file
  and probe forward. we don't know which legacy originally won
  the bind race, so first-write-wins; the loser was already
  crash-looping anyway and gets a fresh port to rebuild to.

- fresh spawn (no applied dir): pass true. counts port files AND
  implicit hashes for not-yet-migrated legacies, so a new spawn
  doesn't race with an unmigrated peer.

migration note for affected users: agents whose port file says
something other than their hashed port may have been corrupted by
the previous fix. Hit ↻ R3BU1LD on the offender to regenerate the
flake (uses the current port file) and the container will bind
the right port on restart.
2026-05-15 21:13:17 +02:00
müde
237b215c55 dashboard: browser notifications for operator-bound events
three signals fire OS notifications:
- new approval lands in the queue (per id, via /api/state delta)
- new ask_operator question queued (per id)
- broker message sent to operator (live via SSE)

first /api/state render after page load seeds the 'seen' sets
without firing — only items that arrive while the page is open
count. controls in a row under the banner: 🔔 enable
notifications (calls requestPermission, hides on grant), 🔕 mute /
🔔 unmute toggle (localStorage-backed so operator can silence
without revoking the permission), inline status text when blocked
or unsupported.

notification tag='hyperhive' collapses rapid bursts; onclick
focuses the dashboard tab. requires secure context (HTTPS or
localhost) — on other origins the API is unavailable and the
controls hide themselves.

todo: entry dropped.
2026-05-15 21:10:20 +02:00
müde
58c3cd853b container crash watcher → HelperEvent::ContainerCrash
new hive_c0re::crash_watch task polls every 10s, builds the set of
currently-running containers, and on running→stopped transitions
checks the transient snapshot: if no Stopping / Restarting /
Destroying / Rebuilding flag is set, the container exited
unexpectedly and we fire HelperEvent::ContainerCrash into the
manager's inbox so it can react (typically: start it again).

first poll is a seeding pass — no events on harness startup. dbus
subscription would be lower-latency but polling is honest and
debuggable, and a 10s delay on crash detection is fine for our
scale.

manager prompt + approvals doc updated to advertise the new
event variant. todo drops the entry (and the journald-viewer
entry that already shipped).
2026-05-15 21:02:05 +02:00
müde
6db38cf70c model: runtime override via /model slash; fixes for port + bind
- runtime model override: Bus::{model,set_model} + POST /api/model
  (form-encoded {model: name}). turn.rs reads bus.model() per turn
  so a flip lands on the next claude invocation. /api/state grows
  a model field; agent page shows a 'model · <name>' chip in the
  state row. '/model <name>' slash command POSTs to the endpoint
  and refreshes state.

- port regression fix: agent_web_port no longer probes forward for
  *existing* agents (the previous fix shifted ports for any agent
  without a port file, including legacy ones whose container was
  already bound to the bare hashed port — dashboard rendered the
  new port, container was still on the old one, conn errors). new
  rule: port file exists → use it; absent + applied flake present
  → legacy, persist port_hash without probing; absent + no applied
  flake → fresh spawn, probe forward.

- SO_REUSEADDR on both the dashboard and per-agent web UI binds
  via tokio::net::TcpSocket. operator hit 12 retries failing on
  manager :8000 — REUSEADDR handles the TIME_WAIT case cleanly
  without a new dep; retry still covers the genuine
  process-still-alive overlap.

todo: drops the model-override entry (shipped); adds two new
items — model persistence (optional, future), and custom
per-agent MCP tools (groundwork for moving bitburner-agent into
hyperhive).
2026-05-15 20:59:45 +02:00
müde
7d93dd9db4 no nap tool — recv with long wait_seconds replaces it; max raised to 180s
recv-with-timeout is strictly better than a fixed sleep because it
wakes instantly on incoming messages. drop the half-written nap MCP
tool, raise the recv wait_seconds cap from 60s to 180s on both
agent and manager sockets.

prompts updated: agent.md + manager.md now spell out the pattern —
when there's nothing else useful to do, call recv with
wait_seconds=180 to park the turn; do NOT use Bash sleep for the
same purpose. todo drops the nap entry and the napping-state-badge
follow-up; both replaced by 'just use a long recv'.
2026-05-15 20:53:15 +02:00
müde
f65ee88269 recv: optional wait_seconds parameter, capped at 60s
AgentRequest::Recv and ManagerRequest::Recv grow an optional
wait_seconds field (default None → 30s, capped at 60s server-side).
agent_server / manager_server clamp via recv_timeout(). MCP tool
schemas advertise the param so claude can pick its own poll window
— useful when an agent wants to throttle wakes without entering a
distinct nap state.

both harness loops still pass None, keeping the existing 30s
default behaviour for system-level Recvs.
2026-05-15 20:49:33 +02:00
müde
0385d96bf3 dashboard: per-container journald viewer
new GET /api/journal/{name}?unit=&lines= shells out journalctl -M
<container> -b --no-pager --output=short-iso --lines=<N> (cap 5000).
optional unit filter, restricted to hive-ag3nt.service /
hive-m1nd.service so the shell-out can't be coerced into reading
unrelated units. validates the container name against the live list
before invoking journalctl.

frontend renders a collapsed '↳ logs · <container>' details block
on each container row. expanding triggers a lazy fetch; refresh
button re-fetches; unit dropdown switches between the harness
service (default) and the full machine journal. output sits in a
24em-tall monospace pre, auto-scrolled to the bottom on fresh
fetch.

hive-c0re's systemd unit already runs as root, so journalctl has
the access it needs.
2026-05-15 20:42:56 +02:00
müde
79a46f359a agent_web_port: collision-aware sticky allocation
operator hit 'coder' and 'test' colliding on the same hashed port —
fnv-1a mod 900 has ~0.1% collision probability per pair and clearly
that's not enough. agent_web_port goes stateful:

- per-agent port persisted to /var/lib/hyperhive/agents/<name>/port
- on first call, look up the file; if absent, hash, then probe
  forward through the allocated range skipping any port other
  agents already claim, then write the chosen value back
- subsequent calls return the persisted port (sticky)

other agents' ports come from their port file if present, else the
fallback is the hashed value — that handles existing deployments
without forcing a rebuild-all just to migrate. rebuilding the
colliding agent re-runs agent_web_port, sees its peer's implicit
hash port as taken, picks the next free slot, persists.

range exhaustion (very unlikely — 900 slots) logs a warning and
returns the hash; the bind-with-retry on the harness will surface
the failure honestly rather than silently looping.
2026-05-15 20:41:18 +02:00
müde
754db7830e ask_operator: ttl_seconds auto-cancel + remaining-time chip
manager can pass ttl_seconds to ask_operator. on submit, host
stores deadline_at = now + ttl in operator_questions (new column,
migrated via existing pragma_table_info pattern), spawns a tokio
task that sleeps until the deadline then resolves the question with
answer '[expired]' and fires the same OperatorAnswered helper event.
already-resolved races no-op silently.

dashboard renders a ' MM:SS' chip on the question row when
deadline_at is set. format collapses seconds → s, < 1h → m s, ≥ 1h
→ h m. heartbeat refresh (5s) keeps the chip current; the operator
sees it tick down.

manager prompt + mcp tool description updated. journald viewer per
container queued in todo (separate task).
2026-05-15 20:38:02 +02:00
müde
2146e47770 web ui: retry binding on AddrInUse during restart races
operator hit 'Address already in use (os error 98)' on a harness
restart — the new harness raced the old socket's release. add a
bind_with_retry helper that backs off (250ms doubling, capped at
2s, 12 tries ≈ 22s total) on AddrInUse before giving up. applied
to both the per-agent web UI and the hive-c0re dashboard.

proper fix would be SO_REUSEADDR via socket2 but retry covers the
TIME_WAIT case fine and keeps the dep count down. Other bind errors
still fail immediately (port permission, fd exhaustion).
2026-05-15 20:33:51 +02:00
müde
538e0446d7 agent page: inbox view of last 30 messages addressed to this agent
new wire request AgentRequest::Recent { limit } / ManagerRequest::Recent
(plus matching responses with Vec<InboxRow>). InboxRow moved to
hive-sh4re so it lives on both surfaces without an internal-to-wire
conversion. host-side dispatch in agent_server / manager_server
calls broker.recent_for(name, limit).

per-agent web_ui /api/state grew an inbox: Vec<InboxRow> populated
via the same per-agent socket (best-effort; transport failure
returns empty). frontend renders as a collapsible <details> section
between the state row and the terminal — fmt timestamp / from /
body in a tight grid, capped at 16em scrollable. only visible when
there are rows.
2026-05-15 20:32:19 +02:00
müde
ee5b85716d ask_operator: operator-side ✗ CANC3L on pending questions
new POST /cancel-question/{id} resolves a pending operator question
with the sentinel answer '[cancelled]' and fires the usual
HelperEvent::OperatorAnswered so the manager sees a terminal state
and can fall back. uses the same OperatorQuestions::answer path —
no special handling, the manager already has to deal with arbitrary
answer strings.

dashboard renders the cancel as a separate <form> below the main
qform so the answer-merge submit handler on the main form doesn't
inadvertently fire when the operator clicks cancel. confirm dialog
spells out what the manager will see.

ttl-based auto-cancel is still on the todo (would spawn a tokio task
per submitted question).
2026-05-15 20:25:11 +02:00
müde
2413d664a1 agents get a kickoff inbox message on start/restart/rebuild
new Coordinator::kick_agent(name, reason) drops a system message
into the agent's inbox so the next turn picks it up with a 'you
were just (re)started, check /state/ for notes, --continue session
is intact' hint. wakes the turn loop without any harness-side
handling needed — it's just another inbox message with sender =
'system'.

wired from:
- dashboard /start /restart /rebuild handlers (via lifecycle_action's
  on-success tail)
- manager mcp_hyperhive_start / restart

dashboard: pending approvals + tombstones + questions now refresh on
a 5s heartbeat when nothing else is happening. previously refresh
only fired on async-form submit or on broker traffic addressed to
operator — manager-queued approvals went through neither, so the
operator had to reload to see them. 5s is the slow-path; 2s
remains for in-flight transients.
2026-05-15 20:19:36 +02:00
müde
c27111ac32 dashboard: split api_state into per-section builders
drops the #[allow(clippy::too_many_lines)] on api_state by extracting
four pure helpers:

- build_container_views — live containers + any_stale flag
- build_transient_views — agents in pre-creation Spawning state only
- build_approval_views — pending approvals with diff html
- build_tombstone_views — destroyed-but-kept state dirs

api_state itself is now ~30 lines of orchestration. zero behavior
change. each helper is independently readable + testable.
2026-05-15 20:13:08 +02:00
müde
7b4adea325 dashboard: lifecycle_action helper collapses start/stop/restart/rebuild
five POST handlers (post_kill / post_restart / post_start / post_rebuild)
were all repeating the same boilerplate: strip prefix, set_transient,
call lifecycle::X, clear_transient, match the result. extract one
helper that takes the transient kind, error-message verb, the work
body, and an optional 'on success' tail (used by kill to also
unregister + emit HelperEvent::Killed). each handler shrinks to a
single lifecycle_action(..) call. zero behavior change.
2026-05-15 20:12:03 +02:00
müde
89ccc5e6c5 events.sqlite vacuum moves host-side
retention is a host concern — agents have no business doing their
own cleanup, and a misbehaving harness could skip it. drop
spawn_events_vacuum from both hive-ag3nt and hive-m1nd, drop the
matching Bus::vacuum + EventStore::vacuum methods. new
hive_c0re::events_vacuum module sweeps every existing
agents/<name>/state/hyperhive-events.sqlite on the same hourly
cadence as the broker vacuum. same two-stage delete (older than 7
days, trim to 2000 newest). called from main alongside broker
vacuum.

also: server-side state badge entered into todo.md (today's badge
is derived client-side from sse, fine for idle/thinking but a
state machine that grows compacting/napping wants authoritative
status from the harness).
2026-05-15 20:10:34 +02:00
müde
897e7c07ae dashboard: spawn form moves under approvals; docs synced
submitting R3QU3ST SP4WN immediately queues an approval that lands
in the very next list. the form belonged with that list, not at the
top of containers — the agent doesn't exist yet at form time anyway.

docs: claude.md grows operator_questions.rs / events.rs sqlite /
broker vacuum to the file map; web-ui shape lists the actual current
endpoint set (per-agent cancel/compact/history, dashboard tombstone
purge/answer/spawn); live-view section now describes the state
badge, sticky-bottom scroll, history backfill, and the terminal-
embedded prompt with its slash commands; dashboard-action-surface
rewritten around the new six-section page (containers / kept-state /
questions / inbox / approvals / message-flow) and the two-line
container row. new 'persistence + retention' section documenting both
sqlite databases and their vacuum cadences. readme picks up the new
mgr mcp surface (start/restart/ask_operator) + operator-side
features list + ask_operator answer flow.

todo trimmed of shipped items (bigger terminal / sticky scroll /
cancel button / /compact trigger / /cancel command). new entry for
the two-step spawn-with-preconfig flow.
2026-05-15 20:02:54 +02:00
müde
5ee65d2f15 dashboard: K3PT ST4T3 section + agent links open in new tab
new section between containers and questions: lists every name with a
state dir under /var/lib/hyperhive/agents/ that doesn't correspond to
a live container. shows state size + last-modified age + whether
claude creds are kept. two actions per row:

- R3V1V3 — queues a spawn approval with the same name (operator
  approves to recreate; spawn flow reuses prior config + claude
  creds, no re-login needed)
- PURG3 — wipes the agent's state + applied dirs (post /purge-tombstone/
  endpoint; refuses if a live container with that name still exists)

dashboard also opens agent links in new tabs now (target=_blank +
rel=noopener) so the operator's overview tab stays put when they
dive into an agent.
2026-05-15 19:55:27 +02:00
müde
8344dd9ab7 ask_operator: multi-select + free-text fallback
ask_operator now accepts a multi: bool. when true and options is
non-empty, the dashboard renders the choices as checkboxes — operator
picks any subset, answer comes back as a ', '-joined string. when
false (default), options are radio buttons.

independent of multi, a free-text input ('or type your own…') is
always rendered alongside options so the operator is never trapped
by an incomplete list. submit merges checked options + free text into
the single 'answer' field.

schema migration: operator_questions grows a multi INTEGER column
with a one-shot ALTER TABLE on open. backward compatible — old rows
default to 0 (not multi).

prompt + mcp tool description updated; existing dashboard css for
.qform was rewritten around the new vertical layout.
2026-05-15 19:52:44 +02:00
müde
c337cc06f8 dashboard: spinners on in-flight lifecycle actions + cleaner row layout
backend:
- TransientKind grows Starting / Stopping / Restarting / Rebuilding /
  Destroying alongside the existing Spawning. each dashboard handler
  (start/restart/kill/rebuild/destroy) wraps the lifecycle call with
  set_transient + clear_transient so the dashboard knows what's in
  flight. transient kind is surfaced inline on ContainerView.pending
  (existing-container actions) — only Spawning (pre-creation) lands
  in the separate transients list.

frontend:
- container row is now two lines: identity + meta on top, action
  buttons below. less cluttered, leaves room for the pending state
  pill. pending rows dim their actions and surface a pulsing
  '◐ spawning… / starting… / stopping… / restarting… / rebuilding…
  / destroying…' indicator next to the name.
- 'needs login' / 'needs update' chips moved into a unified .badge
  styling for consistency.
- auto-refresh kicks in not only on transient spawn but on any
  container with a pending action.
2026-05-15 19:49:43 +02:00
müde
6d52f67292 broker: hourly vacuum of delivered messages older than 30 days
undelivered rows are always kept regardless of age (still in flight).
sweep runs immediately on serve start then every hour. logs row count
when non-zero. keep_secs is hard-coded for now (30 days); can be
config-driven later if a host wants to retain more / less for audit.
2026-05-15 19:40:38 +02:00
müde
48ebfefd1a destroy --purge: also wipe agent state dirs
new --purge flag on the destroy verb (cli + admin socket + dashboard).
default destroy still keeps /var/lib/hyperhive/{agents,applied}/<name>/
so recreating with the same name reuses prior config + creds.
with --purge, both dirs go too (config history, claude creds, /state/
notes). no undo. dashboard adds a separate PURG3 button with an
explicit confirmation copy; the existing DESTR0Y button keeps the
soft semantics.

claude.md dashboard-action-surface section updated; todo entry
dropped.
2026-05-15 19:29:14 +02:00
müde
fd39226883 visuals: frosted-glass terminal/msgflow, row fade-in, badge pulses
agent terminal-wrap + dashboard msgflow get a translucent bg with
backdrop-filter blur+saturate so page-bg glow softens behind them.
new rows in the live panel and the dashboard message flow fade in
with a 4px slide-up. unread badge pulses; pending-operator-questions
section pulses its glow. history-backfilled rows skip the animation
(.no-anim) so the page doesn't stagger 100 fade-ins on load.
2026-05-15 19:20:15 +02:00
müde
ac1b5fde8e manager: start/restart at will, no approval; refuse self
new manager tools mcp__hyperhive__{start,restart} that delegate to the
existing lifecycle::start / lifecycle::restart on the host. kill was
already at the manager's discretion; rounding out start + restart for
parity so day-to-day container care doesn't have to round-trip through
the operator.

guard: refuse self-targeting on kill/start/restart — the manager would
just be cutting its own legs. spawn (request_spawn) and config changes
(request_apply_commit) still go through the approval queue, since those
are the actual gate. prompt + claude.md updated to make the boundary
explicit. kill now also emits HelperEvent::Killed (it didn't before).
2026-05-15 18:57:25 +02:00
müde
d943bddd9e agent ui: input lives in terminal section, banner shimmer on activity
agent page restructure:
- send form moves into the terminal panel as a prompt-style row beneath
  the live tail (status line stays above so it still reads as a header).
- live panel + prompt share a single bordered 'terminal-wrap' box.
- harness-alive / login-state status lines drop their decorative ascii
  bookends; just a leading dot/glyph remains.
- banner gradient is now a real css gradient with a shimmer animation
  toggled by an .active class. turn_start adds it, turn_end removes it.
  dashboard side mirrors this: each broker sse event nudges a 4s
  shimmer window.
- dashboard container rows drop their static ▓█▓▒░ / ▒░▒░░ glyph
  prefixes; the role chips already disambiguate m1nd vs ag3nt.
- empty-state placeholders drop the ▓ bookends.

terminal pre-fill: hive-ag3nt::events::Bus grows a 500-event ring
buffer; new GET /events/history endpoint returns it. The agent JS
fetches history before opening the SSE stream so opening the page mid-
turn shows the last N events instead of a blank panel. The replay
walks turn_start/turn_end pairs to seed the banner-active state
correctly if a turn was still open.
2026-05-15 18:54:19 +02:00
müde
2770630f33 ask_operator tool: non-blocking; operator answer arrives as helper event
new mcp tool on the manager surface that queues a question on the
dashboard and returns the question id immediately. operator submits an
answer via /answer-question/<id>; the dashboard fires
HelperEvent::OperatorAnswered { id, question, answer } into the manager
inbox so the next turn picks it up.

also: fix async-form button stuck on spinner after successful submit
(refreshState skipped re-rendering, so the button was never re-enabled).
2026-05-15 18:44:42 +02:00
müde
ff8f8c7c56 per-agent /state dir for durable notes; manager sees them via /agents 2026-05-15 18:00:08 +02:00
müde
7be64c5e66 theme: bring back the vibec0re glow on catppuccin mocha 2026-05-15 17:51:36 +02:00
müde
f33fc3dd50 theme: catppuccin mocha across dashboard + agent UI 2026-05-15 17:49:54 +02:00
müde
37c6504462 manager events: Spawned/Rebuilt/Killed/Destroyed + start button 2026-05-15 17:38:41 +02:00
müde
06ea0cf283 operator inbox view on dashboard; agent ui doesn't clobber typing 2026-05-15 17:23:53 +02:00
müde
6fc9862c3c dashboard: SPA shell — static index.html + app.js, /api/state JSON 2026-05-15 17:10:57 +02:00
müde
8428c693e0 dashboard: stop/restart per-container + update-all when any stale 2026-05-15 17:00:56 +02:00
müde
4f91dfef99 module: thread hyperhive package directly — operators don't apply overlays 2026-05-15 16:51:18 +02:00
müde
970f645461 docs: README + TODO split; trim CLAUDE.md; fix async form 415 2026-05-15 16:41:15 +02:00