The main dashboard had no favicon — PR #145 added them to the
per-agent pages but missed hive-c0re's index. Serve
branding/hyperhive.svg at /favicon.svg and declare it in the index
head.
The dashboard represents the whole hive, so it uses the project
mark (per-agent pages keep their own configurable /icon).
closes #173
- MessageEvent and DashboardEvent Sent/Delivered now carry id and in_reply_to
- broker.send() includes last_insert_rowid in the emitted event
- recent_all() and recv_batch() include id and in_reply_to from the DB
- deliver_reminders_batch() tracks per-row rowids within the transaction
- dashboard message flow: reply rows are indented with a border-left and a
  clickable '↳ reply' tag that scroll-jumps + briefly highlights the parent
- per-agent inbox: reply messages get a '↳ reply ·' prefix and indent
Closes #26
- forge nix option moves to hyperhive.forge.enable, defaults true;
  hive-c0re imports the forge module so it's on by default.
- drop the agent.nix container-row viewer + /api/agent-config; link
  to the agent-configs forge repo instead.
- restructure pending approvals into a card (identity header /
  what-changed body / decision actions) with a link to the proposal
  commit on the forge.
- diff opens in the side panel with a 3-way base toggle: vs applied
  (running) / vs last-approved / vs previous proposal, served by the
  new /api/approval-diff/{id}?base= endpoint.
clicking a .md / .markdown path reference now opens a marked-rendered
view in the slide-in panel instead of raw text; other files stay raw
in a <pre>. serves the vendored marked bundle at /static/marked.js and
scopes a .md stylesheet to the panel body.
loose-ends question rows get a textarea + send button; the operator
answers as operator by POSTing to the core dashboard's
/answer-question route, not the per-agent socket — keeps the
operator-authority path off the agent's own socket. cross-origin POST
needs a CORS shim on that route for now; drops out once the gateway
makes the page same-origin.
also splits deployment/ops/boundaries/gateway work into TODO-ops.md.
Broker schema gains attempt_count INTEGER + last_error TEXT
columns via idempotent ALTER TABLE migration (pragma-probed so
fresh + existing dbs converge). reminder_scheduler::tick calls
record_reminder_failure on every deliver_reminder error,
bumping the counter + stashing the message. get_due_reminders
filters out rows where attempt_count >= MAX_REMINDER_ATTEMPTS
(5) so the scheduler stops retrying a stuck row until the
operator intervenes.
new POST /retry-reminder/{id} → reset_reminder_failure clears
the counters; next 5s tick re-attempts. cancel-reminder
unchanged (hard-delete).
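a minimal in-memory sketch of the retry cap described above, assuming
a trimmed-down Reminder row. only attempt_count, last_error,
MAX_REMINDER_ATTEMPTS and the three function names come from this
commit; the struct layout and signatures are hypothetical, and the
real code works against sqlite rows rather than a slice.

```rust
// Hypothetical, trimmed-down reminder row. Only attempt_count and
// last_error mirror the columns named in the commit message.
#[derive(Debug, Clone, PartialEq)]
struct Reminder {
    id: i64,
    attempt_count: u32,
    last_error: Option<String>,
}

const MAX_REMINDER_ATTEMPTS: u32 = 5;

/// Keep only the rows the scheduler should still try.
fn get_due_reminders(rows: &[Reminder]) -> Vec<&Reminder> {
    rows.iter()
        .filter(|r| r.attempt_count < MAX_REMINDER_ATTEMPTS)
        .collect()
}

/// Bump the counter and stash the error text for the dashboard.
fn record_reminder_failure(r: &mut Reminder, err: &str) {
    r.attempt_count += 1;
    r.last_error = Some(err.to_string());
}

/// Clear the failure state so the next tick re-attempts delivery.
fn reset_reminder_failure(r: &mut Reminder) {
    r.attempt_count = 0;
    r.last_error = None;
}
```

after five recorded failures the row drops out of get_due_reminders
until reset_reminder_failure runs, which is the behaviour the retry
button relies on.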
dashboard renders failed rows with a red left rule, the error
text inline, and a ⚠ N failed badge. ↻ R3TRY button appears
when attempt_count > 0 — sits next to ✗ C4NC3L in a small
actions row below the body.
DashboardEvent::QuestionAdded gains question_refs and
QuestionResolved gains answer_refs — both populated via
scan_validated_paths at emit time, same helper the broker
forwarder uses for Sent/Delivered. cold-load snapshot wraps
each OpQuestion in QuestionView with the same fields computed
once per /api/state.
client threads refs through questionsState rows (pending +
history) and passes them to appendLinkified at every render
site (live pane, history details). path tokens in question and
answer bodies now linkify with the same server-vouched
guarantee broker messages already enjoyed.
new DashboardEvent::TombstonesChanged + MetaInputsChanged carry
full snapshots (lists are tiny; snapshot beats diff for race
avoidance). Coordinator-side helpers
emit_tombstones_snapshot + emit_meta_inputs_snapshot fire from
every mutation site: actions::destroy + post_purge_tombstone +
actions::approve (spawn finalise consumes tombstone) +
run_meta_update + auto_update::rebuild_agent (lock bumps).
client adds derived stores + apply* handlers + drops the
post-submit refetch on PURG3 (container row + tombstone row)
and meta-update.
after this commit /api/state is fetched exactly once per page
session (cold load); every other change rides the SSE channel.
drop the /api/state-file/check probe endpoint (which let any
dashboard visitor enumerate filesystem layout by feeding paths)
and the client's optimistic-then-downgrade dance. instead, the
broker forwarder calls scan_validated_paths(body) — same
allow-list helper as the read endpoint — and attaches the
verified file tokens to DashboardEvent::Sent/Delivered as
file_refs: Vec<String>. /dashboard/history backfill does the
same per-row.
client appendLinkified takes a (text, refs) pair, walks
left-to-right linkifying every occurrence of any ref token,
longest-first tie-break. no regex, no probe, no cache, no
queue. when refs is empty/absent the body emits as plain text
(question/answer/reminder rendering — refs for those are a
follow-up).
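the walk can be sketched as follows. the real appendLinkified lives in
the client javascript and builds DOM nodes; this is a Rust
transliteration of the same idea, and the Segment type and linkify
name are hypothetical. earliest occurrence wins, and when two refs
start at the same offset the longer one wins.

```rust
/// One piece of a rendered body: plain text or a server-vouched path.
#[derive(Debug, PartialEq)]
enum Segment<'a> {
    Text(&'a str),
    Link(&'a str),
}

/// Split `text` into plain and linked segments using only the tokens
/// in `refs`. No regex: every candidate is an exact substring match.
fn linkify<'a>(text: &'a str, refs: &[&str]) -> Vec<Segment<'a>> {
    let mut out = Vec::new();
    let mut rest = text;
    while !rest.is_empty() {
        // (start offset, length) of the best match in the remainder.
        let mut best: Option<(usize, usize)> = None;
        for &r in refs {
            if r.is_empty() {
                continue;
            }
            if let Some(pos) = rest.find(r) {
                let better = match best {
                    None => true,
                    Some((bpos, blen)) => {
                        pos < bpos || (pos == bpos && r.len() > blen)
                    }
                };
                if better {
                    best = Some((pos, r.len()));
                }
            }
        }
        match best {
            None => {
                // No ref left in the remainder: emit it as plain text.
                out.push(Segment::Text(rest));
                break;
            }
            Some((pos, len)) => {
                if pos > 0 {
                    out.push(Segment::Text(&rest[..pos]));
                }
                out.push(Segment::Link(&rest[pos..pos + len]));
                rest = &rest[pos + len..];
            }
        }
    }
    out
}
```

with empty refs the whole body comes back as a single Text segment,
matching the plain-text fallback described above.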
operator inbox stores file_refs from the sent event so its
renderer gets the same anchors as the message-flow terminal.
regex back to permissive ("looks like a path") — the server is
authoritative on whether each match is a file. anchors render
optimistically, paths queue for batch validation (50ms coalesce),
non-files downgrade to plain text + the sibling <details>
preview is dropped. session-scoped cache (pathValidity Map) so
repeated paths skip the roundtrip.
new endpoint POST /api/state-file/check accepts { paths } and
returns { results: {<path>: bool} }. shares resolve_state_path
helper with the read endpoint so security rules can't drift —
both refuse anything outside the allow-list, anything resolved
outside via symlink, or anything in a per-agent subdir other
than state/. capped at 64 paths/request.
drops the brittle client-side filename heuristic (the
.ext-required rule that missed README/Makefile and still matched bare
dirs without trailing slash). single source of truth.
new 'qu3u3d r3m1nd3rs' section between approvals and operator
inbox. lists every pending reminder with agent, due-relative
timestamp, body, payload path (path-linkified), and a cancel
button. drives off a new /api/reminders endpoint and a
POST /cancel-reminder/{id} that hard-deletes the row.
failure surface (last_error / attempt_count + retry) deferred —
needs a sqlite migration; tracked in TODO.md.
agents constantly emit pointer strings to /agents/<n>/state/foo.md
since broker bodies cap at 1 KiB. now those tokens linkify in the
message flow, question bodies, answer text, and operator inbox;
clicking expands an inline <details> that lazy-fetches via the
new /api/state-file?path=... endpoint.
endpoint allow-list: per-agent state dirs + shared docs, both
in their container-mount form (/agents/<n>/state, /shared) and
host form (/var/lib/hyperhive/...). 1 MiB read cap; canonicalises
before the prefix check so `..` / symlinks can't escape.
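a minimal sketch of the canonicalise-then-prefix-check order, assuming
the allow-list is a plain list of root directories. resolve_allowed is
a hypothetical name, not the helper in the tree, and the 1 MiB read
cap is left out.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Return the canonical path when `requested` resolves inside one of
/// `allowed_roots`, otherwise None.
fn resolve_allowed(requested: &Path, allowed_roots: &[PathBuf]) -> Option<PathBuf> {
    // Resolve `..` components and symlinks first. A missing file
    // fails here and is treated as not allowed.
    let real = fs::canonicalize(requested).ok()?;
    // Compare against canonical roots so a symlinked root still
    // matches. Roots that do not exist are skipped.
    let inside = allowed_roots
        .iter()
        .filter_map(|root| fs::canonicalize(root).ok())
        .any(|root| real.starts_with(&root));
    if inside {
        Some(real)
    } else {
        None
    }
}
```

Path::starts_with compares whole components, so a root of
/agents/a/state does not accidentally admit /agents/a/state-old.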
legacy bare `/state/...` is deliberately not matched — ambiguous
from the host's perspective (we'd need to know which agent the
message references to translate). agents should use the qualified
form going forward.
questions pane now shows both operator-targeted threads
(target IS NULL) and agent-to-agent threads (target = some
agent). filter chips above the list: all / @operator / @peer /
per-participant. peer rows get a mauve left rule + a 0V3RR1D3
button that POSTs the same /answer-question endpoint
(OperatorQuestions::answer already permits the operator as
answerer on any target).
wire changes: OperatorQuestions gains pending_all +
recent_answered_all; QuestionAdded + QuestionResolved events
carry target: Option<String>; emit sites drop their
target.is_none() guard. answered-history rows show the
answerer prefix so override answers are auditable at a glance.
new DashboardEvent::ContainerStateChanged + ContainerRemoved
close the last refetch loop on the dashboard. Coordinator's
rescan_containers_and_emit diffs a fresh container_view::build_all
against a cached last_containers map and fires per-row events.
called from actions::approve (post-spawn), actions::destroy,
the lifecycle_action wrapper, auto_update::rebuild_agent, and
the existing 10s crash_watch poll.
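the diff step can be sketched like this, assuming a two-field
ContainerView and event variants that carry the view or the name. the
field set, the payload shapes and the diff_containers name are
illustrative guesses; only the event and type names come from this
commit.

```rust
use std::collections::BTreeMap;

// Hypothetical minimal view; the real ContainerView has more fields.
#[derive(Debug, Clone, PartialEq)]
struct ContainerView {
    name: String,
    running: bool,
}

#[derive(Debug, PartialEq)]
enum DashboardEvent {
    ContainerStateChanged(ContainerView),
    ContainerRemoved(String),
}

/// Compare a fresh scan against the cached map, return one event per
/// changed or vanished row, and replace the cache with the fresh scan.
fn diff_containers(
    last: &mut BTreeMap<String, ContainerView>,
    fresh: Vec<ContainerView>,
) -> Vec<DashboardEvent> {
    let fresh: BTreeMap<String, ContainerView> =
        fresh.into_iter().map(|c| (c.name.clone(), c)).collect();
    let mut events = Vec::new();
    // New or changed rows.
    for (name, view) in &fresh {
        if last.get(name) != Some(view) {
            events.push(DashboardEvent::ContainerStateChanged(view.clone()));
        }
    }
    // Rows that were cached but are gone from the fresh scan.
    for name in last.keys() {
        if !fresh.contains_key(name) {
            events.push(DashboardEvent::ContainerRemoved(name.clone()));
        }
    }
    *last = fresh;
    events
}
```

an unchanged scan yields no events, so calling this from every
mutation site plus the periodic poll stays cheap on the SSE channel.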
ContainerView extracted to its own module so coordinator and
dashboard can both build it. dashboard endpoints flip to 200;
container-lifecycle forms carry data-no-refresh. client drops
the periodic poll entirely — initial cold load + SSE for
everything afterwards. pending overlay reads from the existing
transientsState since the new event payload doesn't carry it.
PURG3 + meta-update keep the post-submit refetch since
tombstones + meta_inputs aren't event-derived yet; tracked in
TODO.md.
bare set_transient/clear_transient pairs leak the in-memory transient
on task cancellation, panics, or any early return between the two
calls — dashboard then shows the agent stuck in 'rebuilding…'
forever (coder hit this today). add Coordinator::transient_guard
returning a TransientGuard whose Drop clears, and convert every
caller (dashboard lifecycle_action, auto_update::rebuild_agent,
manager_server Update, actions::destroy, actions Spawn task,
migrate phase 4). destroy() now takes &Arc<Coordinator> so it can
hold a guard. existing stuck transients clear on next hive-c0re
restart since transient state is in-memory only.
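the guard pattern, sketched under the assumption that transients live
in a mutex-protected map keyed by agent name. the field layout, the
label argument, and the transient and rebuild helpers are hypothetical;
Coordinator::transient_guard and TransientGuard are the names from
this commit.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct Coordinator {
    // agent name -> transient label such as "rebuilding"
    transients: Mutex<HashMap<String, String>>,
}

/// Clears the agent's transient when dropped, on every exit path.
struct TransientGuard {
    coord: Arc<Coordinator>,
    agent: String,
}

impl Coordinator {
    fn transient_guard(self: &Arc<Self>, agent: &str, label: &str) -> TransientGuard {
        self.transients
            .lock()
            .unwrap()
            .insert(agent.to_string(), label.to_string());
        TransientGuard { coord: Arc::clone(self), agent: agent.to_string() }
    }

    fn transient(&self, agent: &str) -> Option<String> {
        self.transients.lock().unwrap().get(agent).cloned()
    }
}

impl Drop for TransientGuard {
    fn drop(&mut self) {
        // Tolerate a poisoned mutex so unwinding still clears the entry.
        let mut map = self
            .coord
            .transients
            .lock()
            .unwrap_or_else(|e| e.into_inner());
        map.remove(&self.agent);
    }
}

/// Stand-in for a lifecycle operation with an early-return error path.
fn rebuild(coord: &Arc<Coordinator>, agent: &str, fail: bool) -> Result<(), String> {
    let _guard = coord.transient_guard(agent, "rebuilding");
    if fail {
        return Err("build failed".to_string());
    }
    Ok(())
}
```

because the clear runs in Drop, the early return in rebuild needs no
matching clear call, and the same holds when the enclosing future is
cancelled and dropped.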
agents weren't being woken with the 'you were rebuilt — check
/state/ for notes, --continue intact' system message after
rebuilds triggered through several recent surfaces:
- auto_update::rebuild_agent — used by the dashboard rebuild
button, admin-CLI rebuild via lifecycle_action, the startup
rev-scan, AND the new meta-input update batch loop. kick
moves *into* rebuild_agent's success arm so all four
paths benefit. (the dashboard's lifecycle_action extra
closure was already firing kick — now it's a no-op for the
rebuild path since rebuild_agent does it.)
- actions::run_apply_commit — apply-commit approve flow built
+ tagged deployed/<id> but never kicked. add kick on
success with the more specific 'config update applied' hint.
- server.rs::HostRequest::Rebuild — the admin-CLI direct path
calls lifecycle::rebuild bypassing rebuild_agent. add kick
on success.
dashboard's restart / start lifecycle_action extras still
kick via their own closures since they don't route through
rebuild_agent. stop / kill / destroy intentionally don't
kick — there's nothing to wake.
read_meta_inputs() previously only included direct inputs of
meta's root node — so a manager-added 'inputs.mcp-matrix' in
agent-dmatrix's flake.nix never surfaced in the dashboard
panel even though it's a real fetched input that nix can
update.
now: BFS the flake.lock graph from root to depth 2. emits
one MetaInputView per fetched (non-follows) node, names are
slash-paths from root — 'hyperhive', 'agent-coder',
'agent-dmatrix/mcp-matrix', 'hyperhive/nixpkgs', etc. that's
the same syntax 'nix flake update' accepts for transitive
inputs, so the existing POST /meta-update path needs no
nix-side change.
depth limit of 2 keeps the panel readable — deeper transitives
(nixpkgs's own deps etc.) would explode it; bumping a level-2
entry re-fetches its sub-inputs anyway.
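a sketch of the depth-limited walk, assuming the lock file has already
been parsed into an in-memory node map. the InputRef and Node types
and the fetched_inputs name are hypothetical; the real code reads
flake.lock JSON and emits MetaInputView rows rather than bare names.

```rust
use std::collections::{BTreeMap, VecDeque};

/// How a node refers to one of its inputs in the lock graph.
enum InputRef {
    /// Points at a fetched node by its key in the node map.
    Node(String),
    /// A `follows` redirect; nothing is fetched for it.
    #[allow(dead_code)]
    Follows(Vec<String>),
}

struct Node {
    inputs: BTreeMap<String, InputRef>,
}

/// Breadth-first walk from `root`, returning slash-path names for
/// every fetched input down to `max_depth` levels.
fn fetched_inputs(
    nodes: &BTreeMap<String, Node>,
    root: &str,
    max_depth: usize,
) -> Vec<String> {
    let mut out = Vec::new();
    let mut queue = VecDeque::new();
    // (node key, slash path from root, depth of this node)
    queue.push_back((root.to_string(), String::new(), 0usize));
    while let Some((key, path, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // do not expand below the depth limit
        }
        let node = match nodes.get(&key) {
            Some(n) => n,
            None => continue,
        };
        for (name, input) in &node.inputs {
            if let InputRef::Node(target) = input {
                let child = if path.is_empty() {
                    name.clone()
                } else {
                    format!("{}/{}", path, name)
                };
                out.push(child.clone());
                queue.push_back((target.clone(), child, depth + 1));
            }
        }
    }
    out
}
```

the depth check also bounds the walk if the graph happens to contain
a cycle, so no visited set is needed for this sketch.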
POST /meta-update's 'which agents to rebuild' derivation
updated for the slash names: anything under hyperhive/
fans out to all agents (shared base); 'agent-<n>/...' picks
out the agent name from before the first slash.
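the derivation above, sketched as a pure function. agents_to_rebuild
and its signature are hypothetical; the rules are the two stated in
this commit (hyperhive fans out, agent-<n> picks <n>), and any other
input name is assumed to trigger no rebuild.

```rust
use std::collections::BTreeSet;

/// Map selected input names (slash paths) to the agents to rebuild.
fn agents_to_rebuild(selected: &[&str], all_agents: &[&str]) -> BTreeSet<String> {
    let mut out = BTreeSet::new();
    for name in selected {
        // Everything before the first slash names the root-level input.
        let head = name.split('/').next().unwrap_or("");
        if head == "hyperhive" {
            // Shared base: every agent depends on it.
            out.extend(all_agents.iter().map(|a| a.to_string()));
        } else if let Some(agent) = head.strip_prefix("agent-") {
            out.insert(agent.to_string());
        }
    }
    out
}
```

returning a set keeps an agent from being rebuilt twice when both its
own input and one of its sub-inputs are selected.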
read_meta_locked_revs (used by the deployed:<sha> chip per
container) split out into its own straight root-input lookup
since the chip only cares about the agent's own input.
every snapshot source backing /api/state used .unwrap_or_default()
— sqlite errors, broker errors, nixos-container list failures,
operator_questions decode crashes all degraded to empty lists
without a log line. the 'pending question doesn't render'
bug we've been chasing was likely a row-decode panic in
OperatorQuestions::pending() being swallowed this way.
new log_default(what, result) replaces each call site: same
default value on Err but emits target=api_state warn with the
source name + dbg error first. five sources covered:
nixos-container list, approvals.pending,
approvals.recent_resolved, broker.recent_for(operator),
questions.pending. next time the question goes missing the
journal will say which source failed and how.
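a minimal sketch of the helper's shape, assuming generic bounds of
Default on the value and Debug on the error. eprintln! stands in for
the real warn-level log call with target=api_state, and the exact
signature in the tree may differ.

```rust
use std::fmt::Debug;

/// Same fallback as unwrap_or_default(), but the error is reported
/// together with the name of the snapshot source that failed.
fn log_default<T: Default, E: Debug>(what: &str, result: Result<T, E>) -> T {
    match result {
        Ok(value) => value,
        Err(err) => {
            // Stand-in for the structured warn log in the real code.
            eprintln!("WARN api_state: {} failed: {:?}", what, err);
            T::default()
        }
    }
}
```

call sites keep their shape: log_default("questions.pending", result)
replaces result.unwrap_or_default().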
todo updated — pending-question entry now points at the new
log instead of three suspect paths.
new section 'M3T4 1NPUTS' between approvals and message flow:
one row per input in meta/flake.lock (hyperhive first, then
agent-<n> alphabetically). each row shows the input name, the
first 12 chars of the locked sha, a relative timestamp from
locked.lastModified, and the original.url when available.
checkbox per row; submit button is disabled until at least one
box is checked; submitting confirms then POSTs the selected
names to /meta-update.
backend:
- meta::lock_update(inputs: &[String]) — runs 'nix flake update
<names>' in the meta dir, commits the lock change with a
combined message ('lock update: hyperhive, agent-coder').
preserves the existing META_LOCK serialization. existing
lock_update_for_rebuild / lock_update_hyperhive stay for
their single-input callers.
- POST /meta-update — comma-separated 'inputs' form field
(JS joins checkboxes since axum::Form doesn't natively
decode repeated keys); spawns a background task that runs
the lock update + per-agent rebuild loop. hyperhive
selection fans out to all agents; agent-<n> selection only
rebuilds <n>. each rebuild fires Rebuilt to the manager
exactly like dashboard / admin-CLI / auto-update.
rebuild loop is sequential, and auto_update::run now is too (was
parallel via tokio::spawn). parallel rebuilds collide on
nix-store's sqlite cache ('sqlite db busy, not using cache')
and contend on META_LOCK. nix-daemon serializes the
heavy build steps anyway, so this isn't a throughput loss.
new tabs above the approvals list: 'pending · N' and
'history · M'. active tab persists in localStorage so the
operator can park on history if they prefer. on a fresh
dashboard the default is pending (matches the prior shape).
history view shows the last 30 resolved approvals — newest
first by resolved_at — with one row per approval: status
glyph (✓ approved / ✗ denied / ⚠ failed), id, agent, kind,
short sha, status label, and a relative time chip. when the
row has a note (deny reason or build error), it renders
below in a muted block with line wraps preserved.
backend: Approvals::recent_resolved(limit) queries by
status IN ('approved', 'denied', 'failed') ORDER BY
resolved_at DESC. StateSnapshot gets approval_history (a
lean ApprovalHistoryView without diff_html — rendering 30
git diffs per state poll would be expensive and the operator
already saw the diff at decision time). dashboard's
history_view fn projects the sqlite row.
retires the matching TODO entry.
new section under MESS4GE FL0W. msgflow already tails only
broker traffic (sent + delivered), which is exactly the
'messages through core' view the operator wants; no
per-agent thinking leaks through. compose box below:
- a prompt span renders the sticky recipient ('@coder>'),
rendered outside the textarea so it can't be edited
inadvertently. on submit the recipient gets persisted to
localStorage so it survives reload.
- start the input with '@name body' to redirect — the parser
splits at the first whitespace and the new recipient
becomes sticky.
- typing '@' at the start opens a completion dropdown over
the textarea pulled from window.__hyperhive_state.containers;
arrow keys cycle, tab/enter selects, escape closes. clicking
works too.
- manager swap: agents flagged is_manager are surfaced as
'@manager' (the broker's recipient string) instead of
'@hm1nd' (the container name), so the message actually
routes to the manager's inbox.
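the recipient rules above, sketched as two pure functions. the real
parser is client-side javascript; parse_compose and broker_recipient
are hypothetical names, and trimming the whitespace after the name is
an assumption.

```rust
/// Split a compose-box input into (recipient, body). An input that
/// starts with '@name ' redirects to `name`; anything else goes to
/// the sticky recipient unchanged.
fn parse_compose<'a>(input: &'a str, sticky: &'a str) -> (&'a str, &'a str) {
    if let Some(rest) = input.strip_prefix('@') {
        // Split at the first whitespace: name before, body after.
        if let Some((name, body)) = rest.split_once(char::is_whitespace) {
            if !name.is_empty() {
                return (name, body.trim_start());
            }
        }
    }
    (sticky, input)
}

/// Agents flagged is_manager are addressed by the broker's recipient
/// string rather than by their container name.
fn broker_recipient(container: &str, is_manager: bool) -> &str {
    if is_manager {
        "manager"
    } else {
        container
    }
}
```

the caller would persist the first tuple element as the new sticky
recipient whenever it differs from the old one.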
backend: new POST /op-send accepts {to, body} and drops a
broker.send({from:'operator', to, body}) — same shape as the
per-agent web UI's OperatorMsg, but lets the operator choose
the recipient explicitly from the main dashboard.
new AgentRequest::AskOperator + AgentResponse::QuestionQueued on
the per-agent socket — same shape as the manager flavor, agent
gets the same wire surface (still uses the same operator_questions
table). agent_server::dispatch wires AskOperator through
coord.questions.submit(agent, ...) so the row's asker is the sub-agent
name; the ttl watchdog already in manager_server gets shared and
spawn_question_watchdog goes pub.
answer routing: operator_questions::answer now returns (question,
asker). post_answer_question + post_cancel_question + the watchdog
fire OperatorAnswered through new coord.notify_agent(asker, event)
instead of always notify_manager — the event lands in whichever
agent originally asked. notify_manager is now a thin wrapper.
agent socket plumbing: agent_server::start takes Arc<Coordinator>
instead of Arc<Broker> so dispatch has access to questions +
notify path; coordinator::{register_agent,ensure_runtime} take
self: &Arc<Self>. mcp::AgentServer grows the ask_operator tool;
allowed_mcp_tools(Agent) adds it; prompts/agent.md replaces the
'message the manager to ask the operator' guidance with the
direct tool description.
ContainerView grows deployed_sha (first 12 chars of the rev
that /var/lib/hyperhive/meta/flake.lock currently has locked
for agent-<name>). renderContainers appends a 'deployed:<sha12>'
chip next to the container name + port — title attribute
explains it's the meta-lock sha. degrades gracefully when the
meta repo isn't seeded yet (missing / unparsable lock = empty
map = no chip). new read_meta_locked_revs helper does the JSON
parsing without unwraps.
approval_diff now runs
git diff refs/heads/main..refs/tags/proposal/<id>
against the applied repo instead of cobbling a single-file diff
from proposed. consequences: multi-file
proposals show every change, manager amendments in proposed
cannot lie about what'll be deployed, no-op proposals render
an explicit '(proposal matches currently-deployed tree)'.
displayed sha prefers fetched_sha (hive-c0re-vouched) and
falls back to commit_ref only for the brief pre-fetch window.
unified_diff helper + similar dep dropped — git diff is the
source of truth now. dead-code allows on the lifecycle git
helpers + approvals.set_fetched_sha come off since all are
wired up. readme picks up the tag flow + /applied RO mount.
apply-commit denials now leave a git object behind: tag
denied/<id> annotated with the operator's note (or empty body
if they didn't supply one) at proposal/<id> inside the applied
repo. rejected configs become first-class git history — git
show denied/<id> in the manager's applied.git mount yields the
tree the operator rejected plus the reason. helper event
carries the tag for parity with deployed/failed. spawn denials
fall through unannotated since they have no proposal commit.
deny becomes async (single git plumbing call); dashboard +
admin-socket callers grow .await.
manager is fixed at 8000, sub-agents are 8100-8999, so collisions
are strictly between two sub-agents hashing to the same value.
the colliding container's harness restart-loops on AddrInUse —
which the user just hit on :8945. previously the only sign was a
buried journalctl warn line.
now surfaced two ways:
- lifecycle::spawn / rebuild preflight: walks the live container
list, computes each agent's hashed port, refuses with
'port N already taken by <other> — rename one of them' if any
running sub-agent shares the new agent's port. so the operator
sees an actionable error in the dashboard's transient pill /
approve-result instead of waiting for the harness to die.
- /api/state grows a port_conflicts: [{port, agents: [...]}]
array; dashboard renders a pulsing red banner above the
containers list listing each cluster. matches the questions
panel pulse so it's hard to miss.
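both checks can be sketched over a list of agent names, assuming the
name-to-port hash is passed in as a closure. the real hash is not
described here, so port_of is a placeholder; PortConflict, preflight
and port_conflicts are illustrative names, and the error text is
shortened.

```rust
use std::collections::BTreeMap;

/// One entry of the port_conflicts array in /api/state.
#[derive(Debug, PartialEq)]
struct PortConflict {
    port: u16,
    agents: Vec<String>,
}

/// Group agents by hashed port and keep only ports claimed by more
/// than one agent.
fn port_conflicts(agents: &[&str], port_of: impl Fn(&str) -> u16) -> Vec<PortConflict> {
    let mut by_port: BTreeMap<u16, Vec<String>> = BTreeMap::new();
    for &agent in agents {
        by_port.entry(port_of(agent)).or_default().push(agent.to_string());
    }
    by_port
        .into_iter()
        .filter(|(_, names)| names.len() > 1)
        .map(|(port, agents)| PortConflict { port, agents })
        .collect()
}

/// Refuse a spawn or rebuild when a running agent already hashes to
/// the same port as the new one.
fn preflight(
    new_agent: &str,
    running: &[&str],
    port_of: impl Fn(&str) -> u16,
) -> Result<u16, String> {
    let port = port_of(new_agent);
    for &other in running {
        if other != new_agent && port_of(other) == port {
            return Err(format!("port {} already taken by {}", port, other));
        }
    }
    Ok(port)
}
```

preflight skips the agent's own entry so that rebuilding an already
running agent does not report a conflict with itself.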
clicking DENY on the dashboard now prompts for an optional reason
('reason for denying (optional, sent to manager):'). the value
rides along as a hidden 'note' form field; backend chain:
  POST /deny/{id} { note }
    → actions::deny(coord, id, Some(note))
    → Approvals::mark_denied writes it to the row
    → HelperEvent::ApprovalResolved { ..., note: Some("...") }
manager already had note: Option<String> on the event, just never
populated for denials before. host admin socket (hive-c0re deny)
still passes None.
generalized the prompt-on-submit pattern: any form with a
data-prompt attribute pops a window.prompt() before the POST and
stashes the answer in a hidden input named by data-prompt-field
(default 'note'). reusable for future opt-in note fields.
new GET /api/agent-config/{name} returns the contents of
/var/lib/hyperhive/applied/<name>/agent.nix — the file the
container actually builds against. validated against the live
container list to avoid arbitrary filesystem reads.
frontend mirrors the journald viewer: collapsed <details> on each
container row, lazy-fetches on expand, refresh button re-fetches.
restore-keyed (agent-config:<name>) so it survives the dashboard
heartbeat refresh.
read-only — mutating the applied config goes through the existing
request_apply_commit + operator approval flow.