setup_proposed now lands a git remote named 'applied' on every
proposed/<n>/config pointing at /applied/<n>/.git — the path as
seen from inside the manager container, where the RO bind in
set_nspawn_flags makes the URL resolve. From the manager:
git fetch applied
git log applied/main
git show applied/refs/tags/deployed/<id>
git diff applied/main HEAD
git rebase applied/main
all work without manually constructing the path each time. The
RO bind blocks push at the kernel level so the remote can only
fetch. Idempotent — also applied to pre-existing proposed repos
(no-op if the remote is already correct, set-url if drifted)
so the startup migration picks up the wiring on existing
agents.
set_nspawn_flags now adds a third manager-only bind alongside
/agents (RW) and /applied (RO): --bind-ro=/var/lib/hyperhive/meta
:/meta. manager can git log /meta to see every deploy across the
swarm and cat /meta/flake.lock to introspect which sha each agent
is currently pinned at. defensive create_dir_all on the host
side so a cold start with no agents (meta repo not yet seeded)
doesn't trip systemd-nspawn's missing-bind-source check before
the migration plants the dir.
leaf module with no runtime callers yet (every public item is
#[allow(dead_code)] until lifecycle / actions / auto_update
rewire to use it). API surface:
- sync_agents — idempotent: render flake.nix for the given
agent set, git-init on first call, nix flake lock, commit if
anything changed.
- prepare_deploy / finalize_deploy / abort_deploy — two-phase
for the request_apply_commit path. prepare runs nix flake
lock --update-input agent-<n> without committing; finalize
commits with a 'deploy <n> deployed/<id> <sha12>' message;
abort git-restores the lock so a failed build leaves no
orphan commit.
- lock_update_hyperhive — one-shot for the auto-update path.
flake.nix template defines mkAgent that pulls each agent's
nixosModules.default from its input and wraps with the
identity / HIVE_PORT / HIVE_LABEL / HIVE_DASHBOARD_PORT
module — what setup_applied used to generate inline. nix
invocations carry --extra-experimental-features as a belt
in case flakes aren't enabled in nix.conf.
setup_proposed now seeds both agent.nix (a regular NixOS module
function) and flake.nix (boilerplate exporting nixosModules.default
= import ./agent.nix) into the manager-editable proposed repo,
committed together. setup_applied's hyperhive_flake + dashboard
port wrapper generation is deleted entirely — the meta flake at
/var/lib/hyperhive/meta/ now owns the wrapper module. setup_
applied just fetches proposed's main on first spawn and tags
deployed/0; subsequent rebuilds touch nothing in applied that
the manager didn't author. spawn + rebuild keep their old param
list with the now-unused hyperhive_flake + dashboard_port
underscored — call sites get cleaned up after the meta module
lands and consumes them.
approval_diff now runs git diff refs/heads/main..refs/tags/
proposal/<id> against the applied repo instead of cobbling a
single-file diff from proposed. consequences: multi-file
proposals show every change, manager amendments in proposed
cannot lie about what'll be deployed, no-op proposals render
an explicit '(proposal matches currently-deployed tree)'.
displayed sha prefers fetched_sha (hive-c0re-vouched) and
falls back to commit_ref only for the brief pre-fetch window.
unified_diff helper + similar dep dropped — git diff is the
source of truth now. dead-code allows on the lifecycle git
helpers + approvals.set_fetched_sha come off since all are
wired up. readme picks up the tag flow + /applied RO mount.
set_nspawn_flags now adds --bind-ro=/var/lib/hyperhive/applied
:/applied for the manager container alongside the existing
/agents RW mount. manager can git-fetch deployed/failed/denied
tags out of /applied/<n>/.git to mirror them into its proposed
clones; the read-only bind means git plumbing inside the
container cannot corrupt the authoritative repos. picked up by
the next rebuild of hm1nd (no spawn-time change needed since
set_nspawn_flags runs on every spawn + rebuild).
apply-commit denials now leave a git object behind: tag
denied/<id> annotated with the operator's note (or empty body
if they didn't supply one) at proposal/<id> inside the applied
repo. rejected configs become first-class git history — git
show denied/<id> in the manager's applied.git mount yields the
tree the operator rejected plus the reason. helper event
carries the tag for parity with deployed/failed. spawn denials
fall through unannotated since they have no proposal commit.
deny becomes async (single git plumbing call); dashboard +
admin-socket callers grow .await.
run_apply_commit walks the approval through the tag state
machine in applied: approved/<id> + building/<id> stamped
before the build, then git read-tree --reset to proposal/<id>
populates the working dir without moving HEAD. on rebuild
success deployed/<id> is planted and refs/heads/main fast-
forwards to the proposal. on failure failed/<id> is annotated
with the build error and the working tree resets back to main
so the agent stays evaluable. helper events Rebuilt +
ApprovalResolved both carry the terminal tag so the manager
can git-show the exact tree (and read the failure note from
an annotated tag) against its read-only applied.git mount.
finish_approval grows a terminal_tag param; spawn path passes
None. lifecycle::apply_commit deleted.
submit_apply_commit (1) queues the approval row, (2) git-fetches
the manager-supplied sha from proposed into applied, pins it as
refs/tags/proposal/<id>, (3) persists the resolved sha on the
row via approvals.set_fetched_sha. from this point on the
proposal is immutable from the manager's perspective: amends
or force-pushes in proposed do not change what hive-c0re will
build. fetch failures mark the row failed and surface the error
to the manager so a phantom pending entry can't linger.
new shape: applied is git-init'd at first spawn, fetches
proposed's initial commit into its main, tags deployed/0 there.
the wrapper flake.nix is regenerated on every spawn/rebuild
but no longer tracked — apply churn vanishes, manager-authored
files in the proposal flow now survive untouched. setup_applied
gains an Option<&Path> for proposed (None on rebuild paths
that just refresh the flake). pre-overhaul applied dirs are
detected via the missing deployed/0 tag and bail loudly with
the destroy --purge migration hint. apply_commit is stubbed
with a clear error until the tag-driven approve flow lands.
new plumbing for the upcoming flow: git_fetch_to_tag (pulls a
sha from proposed into applied and pins it as a tag in one
shot), git_rev_parse (normalises shas + reads back tag
targets), git_tag / git_tag_annotated (lightweight vs body-
carrying for failed/denied), git_read_tree_reset (replace
working tree without moving HEAD — lets main stay on last
known-good across an in-flight build), git_update_ref (ff main
on deploy). annotated tag bodies go via stdin to avoid escape
games. all dead-code-allowed; callers land in subsequent
commits.
new column fetched_sha records the canonical sha hive-c0re
plans to fetch from the proposed repo into applied at submit
time. distinct from commit_ref (manager-supplied, may be
amended out from under the queue). set_fetched_sha is unused
until manager_server wires the fetch step next commit.
approval grows fetched_sha (canonical hive-c0re-vouched sha,
distinct from manager-supplied commit_ref). helperevent
{approvalresolved,spawned,rebuilt} grow optional sha + tag so
the manager can git-show the exact tree it's hearing about
(against the upcoming /agents/<n>/applied.git RO mount) and
know which terminal tag landed. all serde-defaulted; existing
construction sites pass none until the tag-driven flow lands.
manager is fixed at 8000, sub-agents are 8100-8999, so collisions
are strictly between two sub-agents hashing to the same value.
the colliding container's harness restart-loops on AddrInUse —
which the user just hit on :8945. previously the only sign was a
buried journalctl warn line.
now surfaced two ways:
- lifecycle::spawn / rebuild preflight: walks the live container
list, computes each agent's hashed port, refuses with
'port N already taken by <other> — rename one of them' if any
running sub-agent shares the new agent's port. so the operator
sees an actionable error in the dashboard's transient pill /
approve-result instead of waiting for the harness to die.
- /api/state grows a port_conflicts: [{port, agents: [...]}]
array; dashboard renders a pulsing red banner above the
containers list listing each cluster. matches the questions
panel pulse so it's hard to miss.
clicking DENY on the dashboard now prompts for an optional reason
('reason for denying (optional, sent to manager):'). the value
rides along as a hidden 'note' form field; backend chain:
POST /deny/{id} { note }
→ actions::deny(coord, id, Some(note))
→ Approvals::mark_denied writes it to the row
→ HelperEvent::ApprovalResolved { ..., note: Some("...") }
manager already had note: Option<String> on the event, just never
populated for denials before. host admin socket (hive-c0re deny)
still passes None.
generalized the prompt-on-submit pattern: any form with a
data-prompt attribute pops a window.prompt() before the POST and
stashes the answer in a hidden input named by data-prompt-field
(default 'note'). reusable for future opt-in note fields.
new GET /api/agent-config/{name} returns the contents of
/var/lib/hyperhive/applied/<name>/agent.nix — the file the
container actually builds against. validated against the live
container list to avoid arbitrary filesystem reads.
frontend mirrors the journald viewer: collapsed <details> on each
container row, lazy-fetches on expand, refresh button re-fetches.
restore-keyed (agent-config:<name>) so it survives the dashboard
heartbeat refresh.
read-only — mutating the applied config goes through the existing
request_apply_commit + operator approval flow.
crash_watch grows two more state-axes alongside running/stopped:
- logged-in (claude session dir populated for the agent)
- up-to-date (recorded flake rev matches current)
per-tick transitions emit HelperEvent::NeedsLogin / LoggedIn /
NeedsUpdate. seed-on-first-tick semantics retained — nothing fires
on harness boot for agents that were already in their state. only
needs_update fires the 'stale appeared' direction; the resolved
direction is already covered by Rebuilt.
new mcp__hyperhive__update(name) on the manager surface: idempotent
rebuild via auto_update::rebuild_agent. transient-aware (Rebuilding)
so the dashboard shows the spinner. login intentionally has NO tool
— it's interactive OAuth, only the operator can complete it.
prompts + approvals doc + turn-loop doc updated. todo grows a
'show per-agent applied config in dashboard' entry (separate
follow-up).
generalises the focus-preservation pattern to expanded details
sections (journald viewer was collapsing on every 5s refresh; same
issue for approval diff blocks). before re-render we snapshot
which <details data-restore-key=...> are open; after render we
re-apply. setting .open = true programmatically also fires the
toggle event, so journald's lazy-fetch listener re-runs cleanly.
tagged: journal:<container>, approval-diff:<id>. anything else
that should survive a refresh just needs a stable data-restore-key
attribute.
bug: all notifications used tag='hyperhive', so each new fire
replaced the previous — operator only ever saw one at a time and
might miss the fact that a second arrived. now per-event tags
(hyperhive:approval:<id>, hyperhive❓<id>,
hyperhive:msg:<at>:<rand>) so distinct events stack in the OS
notification center.
dropped the bogus icon (was pointing at dashboard.css) — some
browsers refuse to display a notification with an invalid icon.
added console.debug at every block point (not supported, permission
not granted, muted) and a 'shown' log on success, so the operator
can see in the browser console exactly why a notification didn't
fire.
note for the operator: most browsers also suppress notifications
while the originating tab is FOCUSED. that's a browser-level
decision, not ours.
every refreshState tick does root.innerHTML = '' across the managed
sections, which destroys any focused input. detect the case before
re-rendering: if document.activeElement is an INPUT / TEXTAREA /
SELECT inside one of the managed sections, skip this tick and try
again in 2s. eventually the operator blurs and the refresh lands.
managed section ids: containers / tombstones / questions / inbox /
approvals. msgflow + message-flow SSE rows don't have inputs so
they're not affected.
operator's call: probing-forward + state-file machinery is more
brittle than the bug it tried to fix. revert to the original
deterministic FNV-1a hash mod 900. collisions are real but rare;
operator resolves by renaming (different name → different hash) and
rebuilding. no per-agent port file, no scan, no migration path,
nothing to drift out of sync with the running container.
existing port files on disk are silently ignored — operator
rebuilds affected agents to regenerate flakes from the deterministic
hash.
previous attempt was wrong: the legacy branch returned port_hash
unconditionally, so two legacies hashing to the same port both
wrote that port and the collision persisted (test still trying to
bind coder's port).
new rule: always probe forward from port_hash, with scan_taken_ports
parameterised by include_implicit_hashes:
- legacy migration (applied dir exists, no port file): pass false.
scan only counts other agents' port files. first-queried legacy
claims its hash; subsequent colliders see the first's port file
and probe forward. we don't know which legacy originally won
the bind race, so first-write-wins; the loser was already
crash-looping anyway and gets a fresh port to rebuild to.
- fresh spawn (no applied dir): pass true. counts port files AND
implicit hashes for not-yet-migrated legacies, so a new spawn
doesn't race with an unmigrated peer.
migration note for affected users: agents whose port file says
something other than their hashed port may have been corrupted by
the previous fix. Hit ↻ R3BU1LD on the offender to regenerate the
flake (uses the current port file) and the container will bind
the right port on restart.
three signals fire OS notifications:
- new approval lands in the queue (per id, via /api/state delta)
- new ask_operator question queued (per id)
- broker message sent to operator (live via SSE)
first /api/state render after page load seeds the 'seen' sets
without firing — only items that arrive while the page is open
count. controls in a row under the banner: 🔔 enable
notifications (calls requestPermission, hides on grant), 🔕 mute /
🔔 unmute toggle (localStorage-backed so operator can silence
without revoking the permission), inline status text when blocked
or unsupported.
notification tag='hyperhive' collapses rapid bursts; onclick
focuses the dashboard tab. requires secure context (HTTPS or
localhost) — on other origins the API is unavailable and the
controls hide themselves.
todo: entry dropped.
new hive_c0re::crash_watch task polls every 10s, builds the set of
currently-running containers, and on running→stopped transitions
checks the transient snapshot: if no Stopping / Restarting /
Destroying / Rebuilding flag is set, the container exited
unexpectedly and we fire HelperEvent::ContainerCrash into the
manager's inbox so it can react (typically: start it again).
first poll is a seeding pass — no events on harness startup. dbus
subscription would be lower-latency but polling is honest and
debuggable, and a 10s delay on crash detection is fine for our
scale.
manager prompt + approvals doc updated to advertise the new
event variant. todo drops the entry (and the journald-viewer
entry that already shipped).
- runtime model override: Bus::{model,set_model} + POST /api/model
(form-encoded {model: name}). turn.rs reads bus.model() per turn
so a flip lands on the next claude invocation. /api/state grows
a model field; agent page shows a 'model · <name>' chip in the
state row. '/model <name>' slash command POSTs to the endpoint
and refreshes state.
- port regression fix: agent_web_port no longer probes forward for
*existing* agents (the previous fix shifted ports for any agent
without a port file, including legacy ones whose container was
already bound to the bare hashed port — dashboard rendered the
new port, container was still on the old one, conn errors). new
rule: port file exists → use it; absent + applied flake present
→ legacy, persist port_hash without probing; absent + no applied
flake → fresh spawn, probe forward.
- SO_REUSEADDR on both the dashboard and per-agent web UI binds
via tokio::net::TcpSocket. operator hit 12 retries failing on
manager :8000 — REUSEADDR handles the TIME_WAIT case cleanly
without a new dep; retry still covers the genuine
process-still-alive overlap.
todo: drops the model-override entry (shipped); adds two new
items — model persistence (optional, future), and custom
per-agent MCP tools (groundwork for moving bitburner-agent into
hyperhive).
recv-with-timeout is strictly better than a fixed sleep because it
wakes instantly on incoming messages. drop the half-written nap MCP
tool, raise the recv wait_seconds cap from 60s to 180s on both
agent and manager sockets.
prompts updated: agent.md + manager.md now spell out the pattern —
when there's nothing else useful to do, call recv with
wait_seconds=180 to park the turn; do NOT use Bash sleep for the
same purpose. todo drops the nap entry and the napping-state-badge
follow-up; both replaced by 'just use a long recv'.
AgentRequest::Recv and ManagerRequest::Recv grow an optional
wait_seconds field (default None → 30s, capped at 60s server-side).
agent_server / manager_server clamp via recv_timeout(). MCP tool
schemas advertise the param so claude can pick its own poll window
— useful when an agent wants to throttle wakes without entering a
distinct nap state.
both harness loops still pass None, keeping the existing 30s
default behaviour for system-level Recvs.
new GET /api/journal/{name}?unit=&lines= shells out journalctl -M
<container> -b --no-pager --output=short-iso --lines=<N> (cap 5000).
optional unit filter, restricted to hive-ag3nt.service /
hive-m1nd.service so the shell-out can't be coerced into reading
unrelated units. validates the container name against the live list
before invoking journalctl.
frontend renders a collapsed '↳ logs · <container>' details block
on each container row. expanding triggers a lazy fetch; refresh
button re-fetches; unit dropdown switches between the harness
service (default) and the full machine journal. output sits in a
24em-tall monospace pre, auto-scrolled to the bottom on fresh
fetch.
hive-c0re's systemd unit already runs as root, so journalctl has
the access it needs.
operator hit 'coder' and 'test' colliding on the same hashed port —
fnv-1a mod 900 has ~0.1% collision probability per pair and clearly
that's not enough. agent_web_port goes stateful:
- per-agent port persisted to /var/lib/hyperhive/agents/<name>/port
- on first call, look up the file; if absent, hash, then probe
forward through the allocated range skipping any port other
agents already claim, then write the chosen value back
- subsequent calls return the persisted port (sticky)
other agents' ports come from their port file if present, else the
fallback is the hashed value — that handles existing deployments
without forcing a rebuild-all just to migrate. rebuilding the
colliding agent re-runs agent_web_port, sees its peer's implicit
hash port as taken, picks the next free slot, persists.
range exhaustion (very unlikely — 900 slots) logs a warning and
returns the hash; the bind-with-retry on the harness will surface
the failure honestly rather than silently looping.
manager can pass ttl_seconds to ask_operator. on submit, host
stores deadline_at = now + ttl in operator_questions (new column,
migrated via existing pragma_table_info pattern), spawns a tokio
task that sleeps until the deadline then resolves the question with
answer '[expired]' and fires the same OperatorAnswered helper event.
already-resolved races no-op silently.
dashboard renders a '⏳ MM:SS' chip on the question row when
deadline_at is set. format collapses seconds → s, < 1h → m s, ≥ 1h
→ h m. heartbeat refresh (5s) keeps the chip current; the operator
sees it tick down.
manager prompt + mcp tool description updated. journald viewer per
container queued in todo (separate task).
operator hit 'Address already in use (os error 98)' on a harness
restart — the new harness raced the old socket's release. add a
bind_with_retry helper that backs off (250ms doubling, capped at
2s, 12 tries ≈ 22s total) on AddrInUse before giving up. applied
to both the per-agent web UI and the hive-c0re dashboard.
proper fix would be SO_REUSEADDR via socket2 but retry covers the
TIME_WAIT case fine and keeps the dep count down. Other bind errors
still fail immediately (port permission, fd exhaustion).
new wire request AgentRequest::Recent { limit } / ManagerRequest::Recent
(plus matching responses with Vec<InboxRow>). InboxRow moved to
hive-sh4re so it lives on both surfaces without an internal-to-wire
conversion. host-side dispatch in agent_server / manager_server
calls broker.recent_for(name, limit).
per-agent web_ui /api/state grew an inbox: Vec<InboxRow> populated
via the same per-agent socket (best-effort; transport failure
returns empty). frontend renders as a collapsible <details> section
between the state row and the terminal — fmt timestamp / from /
body in a tight grid, capped at 16em scrollable. only visible when
there are rows.
new POST /cancel-question/{id} resolves a pending operator question
with the sentinel answer '[cancelled]' and fires the usual
HelperEvent::OperatorAnswered so the manager sees a terminal state
and can fall back. uses the same OperatorQuestions::answer path —
no special handling, the manager already has to deal with arbitrary
answer strings.
dashboard renders the cancel as a separate <form> below the main
qform so the answer-merge submit handler on the main form doesn't
inadvertently fire when the operator clicks cancel. confirm dialog
spells out what the manager will see.
ttl-based auto-cancel is still on the todo (would spawn a tokio task
per submitted question).
new Coordinator::kick_agent(name, reason) drops a system message
into the agent's inbox so the next turn picks it up with a 'you
were just (re)started, check /state/ for notes, --continue session
is intact' hint. wakes the turn loop without any harness-side
handling needed — it's just another inbox message with sender =
'system'.
wired from:
- dashboard /start /restart /rebuild handlers (via lifecycle_action's
on-success tail)
- manager mcp_hyperhive_start / restart
dashboard: pending approvals + tombstones + questions now refresh on
a 5s heartbeat when nothing else is happening. previously refresh
only fired on async-form submit or on broker traffic addressed to
operator — manager-queued approvals went through neither, so the
operator had to reload to see them. 5s is the slow-path; 2s
remains for in-flight transients.
drops the #[allow(clippy::too_many_lines)] on api_state by extracting
four pure helpers:
- build_container_views — live containers + any_stale flag
- build_transient_views — agents in pre-creation Spawning state only
- build_approval_views — pending approvals with diff html
- build_tombstone_views — destroyed-but-kept state dirs
api_state itself is now ~30 lines of orchestration. zero behavior
change. each helper is independently readable + testable.
five POST handlers (post_kill / post_restart / post_start / post_rebuild)
were all repeating the same boilerplate: strip prefix, set_transient,
call lifecycle::X, clear_transient, match the result. extract one
helper that takes the transient kind, error-message verb, the work
body, and an optional 'on success' tail (used by kill to also
unregister + emit HelperEvent::Killed). each handler shrinks to a
single lifecycle_action(..) call. zero behavior change.
retention is a host concern — agents have no business doing their
own cleanup, and a misbehaving harness could skip it. drop
spawn_events_vacuum from both hive-ag3nt and hive-m1nd, drop the
matching Bus::vacuum + EventStore::vacuum methods. new
hive_c0re::events_vacuum module sweeps every existing
agents/<name>/state/hyperhive-events.sqlite on the same hourly
cadence as the broker vacuum. same two-stage delete (older than 7
days, trim to 2000 newest). called from main alongside broker
vacuum.
also: server-side state badge entered into todo.md (today's badge
is derived client-side from sse, fine for idle/thinking but a
state machine that grows compacting/napping wants authoritative
status from the harness).
submitting R3QU3ST SP4WN immediately queues an approval that lands
in the very next list. the form belonged with that list, not at the
top of containers — the agent doesn't exist yet at form time anyway.
docs: claude.md grows operator_questions.rs / events.rs sqlite /
broker vacuum to the file map; web-ui shape lists the actual current
endpoint set (per-agent cancel/compact/history, dashboard tombstone
purge/answer/spawn); live-view section now describes the state
badge, sticky-bottom scroll, history backfill, and the terminal-
embedded prompt with its slash commands; dashboard-action-surface
rewritten around the new six-section page (containers / kept-state /
questions / inbox / approvals / message-flow) and the two-line
container row. new 'persistence + retention' section documenting both
sqlite databases and their vacuum cadences. readme picks up the new
mgr mcp surface (start/restart/ask_operator) + operator-side
features list + ask_operator answer flow.
todo trimmed of shipped items (bigger terminal / sticky scroll /
cancel button / /compact trigger / /cancel command). new entry for
the two-step spawn-with-preconfig flow.
new section between containers and questions: lists every name with a
state dir under /var/lib/hyperhive/agents/ that doesn't correspond to
a live container. shows state size + last-modified age + whether
claude creds are kept. two actions per row:
- R3V1V3 — queues a spawn approval with the same name (operator
approves to recreate; spawn flow reuses prior config + claude
creds, no re-login needed)
- PURG3 — wipes the agent's state + applied dirs (post /purge-tombstone/
endpoint; refuses if a live container with that name still exists)
dashboard also opens agent links in new tabs now (target=_blank +
rel=noopener) so the operator's overview tab stays put when they
dive into an agent.
ask_operator now accepts a multi: bool. when true and options is
non-empty, the dashboard renders the choices as checkboxes — operator
picks any subset, answer comes back as a ', '-joined string. when
false (default), options are radio buttons.
independent of multi, a free-text input ('or type your own…') is
always rendered alongside options so the operator is never trapped
by an incomplete list. submit merges checked options + free text into
the single 'answer' field.
schema migration: operator_questions grows a multi INTEGER column
with a one-shot ALTER TABLE on open. backward compatible — old rows
default to 0 (not multi).
prompt + mcp tool description updated; existing dashboard css for
.qform was rewritten around the new vertical layout.
backend:
- TransientKind grows Starting / Stopping / Restarting / Rebuilding /
Destroying alongside the existing Spawning. each dashboard handler
(start/restart/kill/rebuild/destroy) wraps the lifecycle call with
set_transient + clear_transient so the dashboard knows what's in
flight. transient kind is surfaced inline on ContainerView.pending
(existing-container actions) — only Spawning (pre-creation) lands
in the separate transients list.
frontend:
- container row is now two lines: identity + meta on top, action
buttons below. less cluttered, leaves room for the pending state
pill. pending rows dim their actions and surface a pulsing
'◐ spawning… / starting… / stopping… / restarting… / rebuilding…
/ destroying…' indicator next to the name.
- 'needs login' / 'needs update' chips moved into a unified .badge
styling for consistency.
- auto-refresh kicks in not only on transient spawn but on any
container with a pending action.
undelivered rows are always kept regardless of age (still in flight).
sweep runs immediately on serve start then every hour. logs row count
when non-zero. keep_secs is hard-coded for now (30 days); can be
config-driven later if a host wants to retain more / less for audit.
new --purge flag on the destroy verb (cli + admin socket + dashboard).
default destroy still keeps /var/lib/hyperhive/{agents,applied}/<name>/
so recreating with the same name reuses prior config + creds.
with --purge, both dirs go too (config history, claude creds, /state/
notes). no undo. dashboard adds a separate PURG3 button with an
explicit confirmation copy; the existing DESTR0Y button keeps the
soft semantics.
claude.md dashboard-action-surface section updated; todo entry
dropped.
agent terminal-wrap + dashboard msgflow get a translucent bg with
backdrop-filter blur+saturate so page-bg glow softens behind them.
new rows in the live panel and the dashboard message flow fade in
with a 4px slide-up. unread badge pulses; pending-operator-questions
section pulses its glow. history-backfilled rows skip the animation
(.no-anim) so the page doesn't stagger 100 fade-ins on load.
new manager tools mcp__hyperhive__{start,restart} that delegate to the
existing lifecycle::start / lifecycle::restart on the host. kill was
already at the manager's discretion; rounding out start + restart for
parity so day-to-day container care doesn't have to round-trip through
the operator.
guard: refuse self-targeting on kill/start/restart — the manager would
just be cutting its own legs. spawn (request_spawn) and config changes
(request_apply_commit) still go through the approval queue, since those
are the actual gate. prompt + claude.md updated to make the boundary
explicit. kill now also emits HelperEvent::Killed (it didn't before).
agent page restructure:
- send form moves into the terminal panel as a prompt-style row beneath
the live tail (status line stays above so it still reads as a header).
- live panel + prompt share a single bordered 'terminal-wrap' box.
- harness-alive / login-state status lines drop their decorative ascii
bookends; just a leading dot/glyph remains.
- banner gradient is now a real css gradient with a shimmer animation
toggled by an .active class. turn_start adds it, turn_end removes it.
dashboard side mirrors this: each broker sse event nudges a 4s
shimmer window.
- dashboard container rows drop their static ▓█▓▒░ / ▒░▒░░ glyph
prefixes; the role chips already disambiguate m1nd vs ag3nt.
- empty-state placeholders drop the ▓ bookends.
terminal pre-fill: hive-ag3nt::events::Bus grows a 500-event ring
buffer; new GET /events/history endpoint returns it. The agent JS
fetches history before opening the SSE stream so opening the page mid-
turn shows the last N events instead of a blank panel. The replay
walks turn_start/turn_end pairs to seed the banner-active state
correctly if a turn was still open.
new mcp tool on the manager surface that queues a question on the
dashboard and returns the question id immediately. operator submits an
answer via /answer-question/<id>; the dashboard fires
HelperEvent::OperatorAnswered { id, question, answer } into the manager
inbox so the next turn picks it up.
also: fix async-form button stuck on spinner after successful submit
(refreshState skipped re-rendering, so the button was never re-enabled).