hyperhive

Author	SHA1	Message	Date
müde	c42ad1330c	lifecycle: pre-wire applied remote in proposed setup_proposed now lands a git remote named 'applied' on every proposed/<n>/config pointing at /applied/<n>/.git — the path as seen from inside the manager container, where the RO bind in set_nspawn_flags makes the URL resolve. From the manager: git fetch applied git log applied/main git show applied/refs/tags/deployed/<id> git diff applied/main HEAD git rebase applied/main all work without manually constructing the path each time. The RO bind blocks push at the kernel level so the remote can only fetch. Idempotent — also applied to pre-existing proposed repos (no-op if the remote is already correct, set-url if drifted) so the startup migration picks up the wiring on existing agents.	2026-05-16 00:25:43 +02:00
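A minimal sketch of the idempotent remote wiring described in c42ad1330c, assuming the decision reduces to comparing the current URL of the `applied` remote with the wanted one. `RemoteAction` and `plan_remote` are hypothetical names for illustration, not the actual hive-c0re helpers.

```rust
/// What to do with the `applied` remote of one proposed repo.
#[derive(Debug, PartialEq)]
enum RemoteAction {
    /// No remote named `applied` yet: add it.
    Add,
    /// Remote exists but points somewhere else: set-url.
    SetUrl,
    /// Remote already correct: no-op.
    Nothing,
}

/// URL as seen from inside the manager container (path taken from the commit message).
fn applied_remote_url(agent: &str) -> String {
    format!("/applied/{}/.git", agent)
}

/// Decide the action from the remote's current URL (None = remote missing).
fn plan_remote(current_url: Option<&str>, agent: &str) -> RemoteAction {
    let wanted = applied_remote_url(agent);
    match current_url {
        None => RemoteAction::Add,
        Some(url) if url == wanted => RemoteAction::Nothing,
        Some(_) => RemoteAction::SetUrl,
    }
}
```

Running the same plan twice yields `Nothing` on the second pass, which is what lets the startup migration apply it to pre-existing repos.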
müde	3d14ddeb7d	lifecycle: bind /meta RO into manager set_nspawn_flags now adds a third manager-only bind alongside /agents (RW) and /applied (RO): --bind-ro=/var/lib/hyperhive/meta:/meta. manager can git log /meta to see every deploy across the swarm and cat /meta/flake.lock to introspect which sha each agent is currently pinned at. defensive create_dir_all on the host side so a cold start with no agents (meta repo not yet seeded) doesn't trip systemd-nspawn's missing-bind-source check before the migration plants the dir.	2026-05-16 00:24:39 +02:00
müde	92822efe16	meta: new hive-c0re module owns /var/lib/hyperhive/meta/ leaf module with no runtime callers yet (every public item is #[allow(dead_code)] until lifecycle / actions / auto_update rewire to use it). API surface: - sync_agents — idempotent: render flake.nix for the given agent set, git-init on first call, nix flake lock, commit if anything changed. - prepare_deploy / finalize_deploy / abort_deploy — two-phase for the request_apply_commit path. prepare runs nix flake lock --update-input agent-<n> without committing; finalize commits with a 'deploy <n> deployed/<id> <sha12>' message; abort git-restores the lock so a failed build leaves no orphan commit. - lock_update_hyperhive — one-shot for the auto-update path. flake.nix template defines mkAgent that pulls each agent's nixosModules.default from its input and wraps with the identity / HIVE_PORT / HIVE_LABEL / HIVE_DASHBOARD_PORT module — what setup_applied used to generate inline. nix invocations carry --extra-experimental-features as a belt in case flakes aren't enabled in nix.conf.	2026-05-16 00:22:37 +02:00
müde	5b5a93e0c6	lifecycle: module-only agent flake.nix, tracked in proposed setup_proposed now seeds both agent.nix (a regular NixOS module function) and flake.nix (boilerplate exporting nixosModules.default = import ./agent.nix) into the manager-editable proposed repo, committed together. setup_applied's hyperhive_flake + dashboard port wrapper generation is deleted entirely — the meta flake at /var/lib/hyperhive/meta/ now owns the wrapper module. setup_applied just fetches proposed's main on first spawn and tags deployed/0; subsequent rebuilds touch nothing in applied that the manager didn't author. spawn + rebuild keep their old param list with the now-unused hyperhive_flake + dashboard_port underscored — call sites get cleaned up after the meta module lands and consumes them.	2026-05-16 00:10:06 +02:00
müde	e26143a412	dashboard: diff against applied/proposal/<id>, prefer fetched_sha approval_diff now runs git diff refs/heads/main..refs/tags/proposal/<id> against the applied repo instead of cobbling a single-file diff from proposed. consequences: multi-file proposals show every change, manager amendments in proposed cannot lie about what'll be deployed, no-op proposals render an explicit '(proposal matches currently-deployed tree)'. displayed sha prefers fetched_sha (hive-c0re-vouched) and falls back to commit_ref only for the brief pre-fetch window. unified_diff helper + similar dep dropped — git diff is the source of truth now. dead-code allows on the lifecycle git helpers + approvals.set_fetched_sha come off since all are wired up. readme picks up the tag flow + /applied RO mount.	2026-05-15 23:18:17 +02:00
müde	fc61cb9310	fmt: clippy doc_markdown backticks	2026-05-15 23:11:10 +02:00
müde	4a8204f035	lifecycle: bind /applied into manager read-only set_nspawn_flags now adds --bind-ro=/var/lib/hyperhive/applied:/applied for the manager container alongside the existing /agents RW mount. manager can git-fetch deployed/failed/denied tags out of /applied/<n>/.git to mirror them into its proposed clones; the read-only bind means git plumbing inside the container cannot corrupt the authoritative repos. picked up by the next rebuild of hm1nd (no spawn-time change needed since set_nspawn_flags runs on every spawn + rebuild).	2026-05-15 23:02:31 +02:00
müde	6cf66e23dc	actions: deny plants annotated denied/<id> tag apply-commit denials now leave a git object behind: tag denied/<id> annotated with the operator's note (or empty body if they didn't supply one) at proposal/<id> inside the applied repo. rejected configs become first-class git history — git show denied/<id> in the manager's applied.git mount yields the tree the operator rejected plus the reason. helper event carries the tag for parity with deployed/failed. spawn denials fall through unannotated since they have no proposal commit. deny becomes async (single git plumbing call); dashboard + admin-socket callers grow .await.	2026-05-15 23:01:22 +02:00
müde	315d4289c7	actions: tag-driven approve(ApplyCommit) flow run_apply_commit walks the approval through the tag state machine in applied: approved/<id> + building/<id> stamped before the build, then git read-tree --reset to proposal/<id> populates the working dir without moving HEAD. on rebuild success deployed/<id> is planted and refs/heads/main fast-forwards to the proposal. on failure failed/<id> is annotated with the build error and the working tree resets back to main so the agent stays evaluable. helper events Rebuilt + ApprovalResolved both carry the terminal tag so the manager can git-show the exact tree (and read the failure note from an annotated tag) against its read-only applied.git mount. finish_approval grows a terminal_tag param; spawn path passes None. lifecycle::apply_commit deleted.	2026-05-15 23:00:01 +02:00
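The tag state machine in 315d4289c7 can be sketched as a pure function over the build outcome. This only models which tags land and where main ends up; it runs no git commands. `tag_sequence` and `main_after` are hypothetical names, and the numeric id type is an assumption.

```rust
/// Tags stamped in the applied repo for one approval, in the order they land.
fn tag_sequence(id: u64, build_ok: bool) -> Vec<String> {
    // Terminal tag depends on whether the rebuild succeeded.
    let terminal = if build_ok { "deployed" } else { "failed" };
    vec![
        format!("approved/{}", id),
        format!("building/{}", id),
        format!("{}/{}", terminal, id),
    ]
}

/// Commit that refs/heads/main points at once the flow has finished.
/// Success fast-forwards main to the proposal; failure leaves it untouched.
fn main_after(build_ok: bool, old_main: &str, proposal: &str) -> String {
    let target = if build_ok { proposal } else { old_main };
    target.to_string()
}
```

Keeping main on the last known-good commit during the build is what allows the failure path to reset the working tree without rewriting any ref.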
müde	35b0edaf27	manager_server: fetch+tag at request_apply_commit submit submit_apply_commit (1) queues the approval row, (2) git-fetches the manager-supplied sha from proposed into applied, pins it as refs/tags/proposal/<id>, (3) persists the resolved sha on the row via approvals.set_fetched_sha. from this point on the proposal is immutable from the manager's perspective: amends or force-pushes in proposed do not change what hive-c0re will build. fetch failures mark the row failed and surface the error to the manager so a phantom pending entry can't linger.	2026-05-15 22:57:43 +02:00
müde	8cb8fcedad	lifecycle: setup_applied seeds via fetch + tags deployed/0 new shape: applied is git-init'd at first spawn, fetches proposed's initial commit into its main, tags deployed/0 there. the wrapper flake.nix is regenerated on every spawn/rebuild but no longer tracked — apply churn vanishes, manager-authored files in the proposal flow now survive untouched. setup_applied gains an Option<&Path> for proposed (None on rebuild paths that just refresh the flake). pre-overhaul applied dirs are detected via the missing deployed/0 tag and bail loudly with the destroy --purge migration hint. apply_commit is stubbed with a clear error until the tag-driven approve flow lands.	2026-05-15 22:56:58 +02:00
müde	63ef69674b	lifecycle: git helpers for tag-driven applied repo new plumbing for the upcoming flow: git_fetch_to_tag (pulls a sha from proposed into applied and pins it as a tag in one shot), git_rev_parse (normalises shas + reads back tag targets), git_tag / git_tag_annotated (lightweight vs body-carrying for failed/denied), git_read_tree_reset (replace working tree without moving HEAD — lets main stay on last known-good across an in-flight build), git_update_ref (ff main on deploy). annotated tag bodies go via stdin to avoid escape games. all dead-code-allowed; callers land in subsequent commits.	2026-05-15 22:52:23 +02:00
müde	b32c3d4f98	approvals: persist fetched_sha alongside the queue new column fetched_sha records the canonical sha hive-c0re plans to fetch from the proposed repo into applied at submit time. distinct from commit_ref (manager-supplied, may be amended out from under the queue). set_fetched_sha is unused until manager_server wires the fetch step next commit.	2026-05-15 22:49:04 +02:00
müde	871e7bf3fa	wire types: add sha + tag to Approval and HelperEvent approval grows fetched_sha (canonical hive-c0re-vouched sha, distinct from manager-supplied commit_ref). helperevent {approvalresolved,spawned,rebuilt} grow optional sha + tag so the manager can git-show the exact tree it's hearing about (against the upcoming /agents/<n>/applied.git RO mount) and know which terminal tag landed. all serde-defaulted; existing construction sites pass none until the tag-driven flow lands.	2026-05-15 22:47:39 +02:00
müde	6a2ffd521b	surface agent-vs-agent port collisions (manager:8000 can't collide) manager is fixed at 8000, sub-agents are 8100-8999, so collisions are strictly between two sub-agents hashing to the same value. the colliding container's harness restart-loops on AddrInUse — which the user just hit on :8945. previously the only sign was a buried journalctl warn line. now surfaced two ways: - lifecycle::spawn / rebuild preflight: walks the live container list, computes each agent's hashed port, refuses with 'port N already taken by <other> — rename one of them' if any running sub-agent shares the new agent's port. so the operator sees an actionable error in the dashboard's transient pill / approve-result instead of waiting for the harness to die. - /api/state grows a port_conflicts: [{port, agents: [...]}] array; dashboard renders a pulsing red banner above the containers list listing each cluster. matches the questions panel pulse so it's hard to miss.	2026-05-15 22:08:19 +02:00
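A sketch of the two collision surfaces from 6a2ffd521b, assuming each running sub-agent's port has already been computed. `PortConflict`, `port_conflicts` and `preflight` are illustrative names; the real /api/state shape is only known from the commit message, and the error text here is abbreviated.

```rust
use std::collections::BTreeMap;

/// One cluster of agents sharing a port, mirroring `{port, agents: [...]}`.
#[derive(Debug, PartialEq)]
struct PortConflict {
    port: u16,
    agents: Vec<String>,
}

/// Group agents by port and keep only ports claimed by more than one agent.
fn port_conflicts(agents: &[(&str, u16)]) -> Vec<PortConflict> {
    let mut by_port: BTreeMap<u16, Vec<String>> = BTreeMap::new();
    for (name, port) in agents {
        by_port.entry(*port).or_default().push(name.to_string());
    }
    by_port
        .into_iter()
        .filter(|(_, names)| names.len() > 1)
        .map(|(port, agents)| PortConflict { port, agents })
        .collect()
}

/// Spawn/rebuild preflight: refuse if another running agent already has the port.
fn preflight(new_agent: &str, new_port: u16, running: &[(&str, u16)]) -> Result<(), String> {
    match running
        .iter()
        .find(|(name, port)| *port == new_port && *name != new_agent)
    {
        Some((other, _)) => Err(format!("port {} already taken by {}", new_port, other)),
        None => Ok(()),
    }
}
```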
müde	2029840671	deny: operator can attach a reason that reaches the manager clicking DENY on the dashboard now prompts for an optional reason ('reason for denying (optional, sent to manager):'). the value rides along as a hidden 'note' form field; backend chain: POST /deny/{id} { note } → actions::deny(coord, id, Some(note)) → Approvals::mark_denied writes it to the row → HelperEvent::ApprovalResolved { ..., note: Some("...") } manager already had note: Option<String> on the event, just never populated for denials before. host admin socket (hive-c0re deny) still passes None. generalized the prompt-on-submit pattern: any form with a data-prompt attribute pops a window.prompt() before the POST and stashes the answer in a hidden input named by data-prompt-field (default 'note'). reusable for future opt-in note fields.	2026-05-15 21:58:42 +02:00
müde	91c78d626f	dashboard: per-container applied agent.nix viewer new GET /api/agent-config/{name} returns the contents of /var/lib/hyperhive/applied/<name>/agent.nix — the file the container actually builds against. validated against the live container list to avoid arbitrary filesystem reads. frontend mirrors the journald viewer: collapsed <details> on each container row, lazy-fetches on expand, refresh button re-fetches. restore-keyed (agent-config:<name>) so it survives the dashboard heartbeat refresh. read-only — mutating the applied config goes through the existing request_apply_commit + operator approval flow.	2026-05-15 21:46:25 +02:00
müde	80229c6af9	manager: needs_login / logged_in / needs_update events + update tool crash_watch grows two more state-axes alongside running/stopped: - logged-in (claude session dir populated for the agent) - up-to-date (recorded flake rev matches current) per-tick transitions emit HelperEvent::NeedsLogin / LoggedIn / NeedsUpdate. seed-on-first-tick semantics retained — nothing fires on harness boot for agents that were already in their state. only needs_update fires the 'stale appeared' direction; the resolved direction is already covered by Rebuilt. new mcp__hyperhive__update(name) on the manager surface: idempotent rebuild via auto_update::rebuild_agent. transient-aware (Rebuilding) so the dashboard shows the spinner. login intentionally has NO tool — it's interactive OAuth, only the operator can complete it. prompts + approvals doc + turn-loop doc updated. todo grows a 'show per-agent applied config in dashboard' entry (separate follow-up).	2026-05-15 21:42:13 +02:00
müde	b374f39b0d	dashboard: preserve <details open> across refresh via data-restore-key generalises the focus-preservation pattern to expanded details sections (journald viewer was collapsing on every 5s refresh; same issue for approval diff blocks). before re-render we snapshot which <details data-restore-key=...> are open; after render we re-apply. setting .open = true programmatically also fires the toggle event, so journald's lazy-fetch listener re-runs cleanly. tagged: journal:<container>, approval-diff:<id>. anything else that should survive a refresh just needs a stable data-restore-key attribute.	2026-05-15 21:37:17 +02:00
müde	3b532753b3	notifications: per-event tags + debug logs bug: all notifications used tag='hyperhive', so each new fire replaced the previous — operator only ever saw one at a time and might miss the fact that a second arrived. now per-event tags (hyperhive:approval:<id>, hyperhive:question:<id>, hyperhive:msg:<at>:<rand>) so distinct events stack in the OS notification center. dropped the bogus icon (was pointing at dashboard.css) — some browsers refuse to display a notification with an invalid icon. added console.debug at every block point (not supported, permission not granted, muted) and a 'shown' log on success, so the operator can see in the browser console exactly why a notification didn't fire. note for the operator: most browsers also suppress notifications while the originating tab is FOCUSED. that's a browser-level decision, not ours.	2026-05-15 21:34:21 +02:00
müde	d275b50177	dashboard: don't yank the form away while operator is typing every refreshState tick does root.innerHTML = '' across the managed sections, which destroys any focused input. detect the case before re-rendering: if document.activeElement is an INPUT / TEXTAREA / SELECT inside one of the managed sections, skip this tick and try again in 2s. eventually the operator blurs and the refresh lands. managed section ids: containers / tombstones / questions / inbox / approvals. msgflow + message-flow SSE rows don't have inputs so they're not affected.	2026-05-15 21:19:01 +02:00
müde	acaa0eb895	agent_web_port: back to pure hash, drop port-file dance operator's call: probing-forward + state-file machinery is more brittle than the bug it tried to fix. revert to the original deterministic FNV-1a hash mod 900. collisions are real but rare; operator resolves by renaming (different name → different hash) and rebuilding. no per-agent port file, no scan, no migration path, nothing to drift out of sync with the running container. existing port files on disk are silently ignored — operator rebuilds affected agents to regenerate flakes from the deterministic hash.	2026-05-15 21:17:31 +02:00
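A sketch of the deterministic port derivation that acaa0eb895 reverts to. The 32-bit FNV-1a variant and the base of 8100 are assumptions: the log only says "FNV-1a hash mod 900" and, in 6a2ffd521b, that sub-agents occupy 8100-8999.

```rust
// Standard 32-bit FNV-1a parameters.
const FNV_OFFSET: u32 = 0x811c_9dc5;
const FNV_PRIME: u32 = 0x0100_0193;

/// FNV-1a: xor each byte into the hash, then multiply by the prime.
fn fnv1a_32(bytes: &[u8]) -> u32 {
    let mut hash = FNV_OFFSET;
    for b in bytes {
        hash ^= u32::from(*b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Map an agent name onto one of 900 slots above the assumed base port.
fn agent_web_port(name: &str) -> u16 {
    8100 + (fnv1a_32(name.as_bytes()) % 900) as u16
}
```

Because the port depends only on the name, renaming an agent is enough to move it off a colliding slot, which is the resolution path the commit describes.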
müde	c35f566d15	agent_web_port: actually resolve legacy collisions previous attempt was wrong: the legacy branch returned port_hash unconditionally, so two legacies hashing to the same port both wrote that port and the collision persisted (test still trying to bind coder's port). new rule: always probe forward from port_hash, with scan_taken_ports parameterised by include_implicit_hashes: - legacy migration (applied dir exists, no port file): pass false. scan only counts other agents' port files. first-queried legacy claims its hash; subsequent colliders see the first's port file and probe forward. we don't know which legacy originally won the bind race, so first-write-wins; the loser was already crash-looping anyway and gets a fresh port to rebuild to. - fresh spawn (no applied dir): pass true. counts port files AND implicit hashes for not-yet-migrated legacies, so a new spawn doesn't race with an unmigrated peer. migration note for affected users: agents whose port file says something other than their hashed port may have been corrupted by the previous fix. Hit ↻ R3BU1LD on the offender to regenerate the flake (uses the current port file) and the container will bind the right port on restart.	2026-05-15 21:13:17 +02:00
müde	237b215c55	dashboard: browser notifications for operator-bound events three signals fire OS notifications: - new approval lands in the queue (per id, via /api/state delta) - new ask_operator question queued (per id) - broker message sent to operator (live via SSE) first /api/state render after page load seeds the 'seen' sets without firing — only items that arrive while the page is open count. controls in a row under the banner: 🔔 enable notifications (calls requestPermission, hides on grant), 🔕 mute / 🔔 unmute toggle (localStorage-backed so operator can silence without revoking the permission), inline status text when blocked or unsupported. notification tag='hyperhive' collapses rapid bursts; onclick focuses the dashboard tab. requires secure context (HTTPS or localhost) — on other origins the API is unavailable and the controls hide themselves. todo: entry dropped.	2026-05-15 21:10:20 +02:00
müde	58c3cd853b	container crash watcher → HelperEvent::ContainerCrash new hive_c0re::crash_watch task polls every 10s, builds the set of currently-running containers, and on running→stopped transitions checks the transient snapshot: if no Stopping / Restarting / Destroying / Rebuilding flag is set, the container exited unexpectedly and we fire HelperEvent::ContainerCrash into the manager's inbox so it can react (typically: start it again). first poll is a seeding pass — no events on harness startup. dbus subscription would be lower-latency but polling is honest and debuggable, and a 10s delay on crash detection is fine for our scale. manager prompt + approvals doc updated to advertise the new event variant. todo drops the entry (and the journald-viewer entry that already shipped).	2026-05-15 21:02:05 +02:00
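The polling logic in 58c3cd853b can be sketched with plain sets, leaving out the 10s timer and the event plumbing. `CrashWatch`, `tick` and `names` are hypothetical names; `transient` stands for the set of containers with a Stopping / Restarting / Destroying / Rebuilding flag.

```rust
use std::collections::HashSet;

/// Remembers which containers were running on the previous poll.
struct CrashWatch {
    previous: Option<HashSet<String>>,
}

impl CrashWatch {
    fn new() -> Self {
        CrashWatch { previous: None }
    }

    /// Returns containers that went running -> stopped without a transient flag.
    /// The first call only seeds the state and never reports anything.
    fn tick(&mut self, running: HashSet<String>, transient: &HashSet<String>) -> Vec<String> {
        let mut crashed = Vec::new();
        if let Some(prev) = &self.previous {
            for name in prev {
                if !running.contains(name) && !transient.contains(name) {
                    crashed.push(name.clone());
                }
            }
        }
        crashed.sort();
        self.previous = Some(running);
        crashed
    }
}

/// Small helper to build a set of owned names.
fn names(list: &[&str]) -> HashSet<String> {
    list.iter().map(|n| n.to_string()).collect()
}
```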
müde	6db38cf70c	model: runtime override via /model slash; fixes for port + bind - runtime model override: Bus::{model,set_model} + POST /api/model (form-encoded {model: name}). turn.rs reads bus.model() per turn so a flip lands on the next claude invocation. /api/state grows a model field; agent page shows a 'model · <name>' chip in the state row. '/model <name>' slash command POSTs to the endpoint and refreshes state. - port regression fix: agent_web_port no longer probes forward for existing agents (the previous fix shifted ports for any agent without a port file, including legacy ones whose container was already bound to the bare hashed port — dashboard rendered the new port, container was still on the old one, conn errors). new rule: port file exists → use it; absent + applied flake present → legacy, persist port_hash without probing; absent + no applied flake → fresh spawn, probe forward. - SO_REUSEADDR on both the dashboard and per-agent web UI binds via tokio::net::TcpSocket. operator hit 12 retries failing on manager :8000 — REUSEADDR handles the TIME_WAIT case cleanly without a new dep; retry still covers the genuine process-still-alive overlap. todo: drops the model-override entry (shipped); adds two new items — model persistence (optional, future), and custom per-agent MCP tools (groundwork for moving bitburner-agent into hyperhive).	2026-05-15 20:59:45 +02:00
müde	7d93dd9db4	no nap tool — recv with long wait_seconds replaces it; max raised to 180s recv-with-timeout is strictly better than a fixed sleep because it wakes instantly on incoming messages. drop the half-written nap MCP tool, raise the recv wait_seconds cap from 60s to 180s on both agent and manager sockets. prompts updated: agent.md + manager.md now spell out the pattern — when there's nothing else useful to do, call recv with wait_seconds=180 to park the turn; do NOT use Bash sleep for the same purpose. todo drops the nap entry and the napping-state-badge follow-up; both replaced by 'just use a long recv'.	2026-05-15 20:53:15 +02:00
müde	f65ee88269	recv: optional wait_seconds parameter, capped at 60s AgentRequest::Recv and ManagerRequest::Recv grow an optional wait_seconds field (default None → 30s, capped at 60s server-side). agent_server / manager_server clamp via recv_timeout(). MCP tool schemas advertise the param so claude can pick its own poll window — useful when an agent wants to throttle wakes without entering a distinct nap state. both harness loops still pass None, keeping the existing 30s default behaviour for system-level Recvs.	2026-05-15 20:49:33 +02:00
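A sketch of the clamp behind `recv_timeout()` as of f65ee88269: default 30s, cap 60s (7d93dd9db4 later raises the cap to 180s). The signature and constant names are assumptions; only the function name and the numbers come from the log.

```rust
use std::time::Duration;

const DEFAULT_WAIT_SECS: u64 = 30;
const MAX_WAIT_SECS: u64 = 60;

/// None falls back to the default; anything above the cap is clamped server-side.
fn recv_timeout(wait_seconds: Option<u64>) -> Duration {
    let secs = wait_seconds
        .unwrap_or(DEFAULT_WAIT_SECS)
        .min(MAX_WAIT_SECS);
    Duration::from_secs(secs)
}
```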
müde	0385d96bf3	dashboard: per-container journald viewer new GET /api/journal/{name}?unit=&lines= shells out journalctl -M <container> -b --no-pager --output=short-iso --lines=<N> (cap 5000). optional unit filter, restricted to hive-ag3nt.service / hive-m1nd.service so the shell-out can't be coerced into reading unrelated units. validates the container name against the live list before invoking journalctl. frontend renders a collapsed '↳ logs · <container>' details block on each container row. expanding triggers a lazy fetch; refresh button re-fetches; unit dropdown switches between the harness service (default) and the full machine journal. output sits in a 24em-tall monospace pre, auto-scrolled to the bottom on fresh fetch. hive-c0re's systemd unit already runs as root, so journalctl has the access it needs.	2026-05-15 20:42:56 +02:00
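A sketch of the validation that 0385d96bf3 describes before shelling out: name checked against the live list, unit restricted to the two harness services, line count capped at 5000. `journalctl_args` is a hypothetical helper that only builds the argument vector; passing the unit via `-u` is an assumption about how the filter is applied.

```rust
const MAX_LINES: usize = 5000;
const ALLOWED_UNITS: [&str; 2] = ["hive-ag3nt.service", "hive-m1nd.service"];

/// Build journalctl arguments, rejecting unknown containers and units.
fn journalctl_args(
    container: &str,
    unit: Option<&str>,
    lines: usize,
    live: &[&str],
) -> Result<Vec<String>, String> {
    // Only containers from the live list may be queried.
    if !live.contains(&container) {
        return Err(format!("unknown container: {}", container));
    }
    let mut args = vec![
        "-M".to_string(),
        container.to_string(),
        "-b".to_string(),
        "--no-pager".to_string(),
        "--output=short-iso".to_string(),
        format!("--lines={}", lines.min(MAX_LINES)),
    ];
    // No unit means the full machine journal.
    if let Some(u) = unit {
        if !ALLOWED_UNITS.contains(&u) {
            return Err(format!("unit not allowed: {}", u));
        }
        args.push("-u".to_string());
        args.push(u.to_string());
    }
    Ok(args)
}
```

Passing the result as separate arguments to a process spawn, rather than interpolating into a shell string, keeps the allowlist meaningful.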
müde	79a46f359a	agent_web_port: collision-aware sticky allocation operator hit 'coder' and 'test' colliding on the same hashed port — fnv-1a mod 900 has ~0.1% collision probability per pair and clearly that's not enough. agent_web_port goes stateful: - per-agent port persisted to /var/lib/hyperhive/agents/<name>/port - on first call, look up the file; if absent, hash, then probe forward through the allocated range skipping any port other agents already claim, then write the chosen value back - subsequent calls return the persisted port (sticky) other agents' ports come from their port file if present, else the fallback is the hashed value — that handles existing deployments without forcing a rebuild-all just to migrate. rebuilding the colliding agent re-runs agent_web_port, sees its peer's implicit hash port as taken, picks the next free slot, persists. range exhaustion (very unlikely — 900 slots) logs a warning and returns the hash; the bind-with-retry on the harness will surface the failure honestly rather than silently looping.	2026-05-15 20:41:18 +02:00
müde	754db7830e	ask_operator: ttl_seconds auto-cancel + remaining-time chip manager can pass ttl_seconds to ask_operator. on submit, host stores deadline_at = now + ttl in operator_questions (new column, migrated via existing pragma_table_info pattern), spawns a tokio task that sleeps until the deadline then resolves the question with answer '[expired]' and fires the same OperatorAnswered helper event. already-resolved races no-op silently. dashboard renders a '⏳ MM:SS' chip on the question row when deadline_at is set. format collapses seconds → s, < 1h → m s, ≥ 1h → h m. heartbeat refresh (5s) keeps the chip current; the operator sees it tick down. manager prompt + mcp tool description updated. journald viewer per container queued in todo (separate task).	2026-05-15 20:38:02 +02:00
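A sketch of the deadline arithmetic and chip formatting from 754db7830e, following the collapsing rule (seconds, then minutes + seconds, then hours + minutes). The exact spacing of the output and the function names are assumptions; timestamps are plain Unix seconds here.

```rust
/// Deadline stored on submit: now + ttl.
fn deadline_at(now: u64, ttl_seconds: u64) -> u64 {
    now + ttl_seconds
}

/// Seconds left until the deadline, never negative.
fn remaining(deadline_at: u64, now: u64) -> u64 {
    deadline_at.saturating_sub(now)
}

/// Collapse the remaining time: "42s", "2m 5s", "1h 2m".
fn format_remaining(secs: u64) -> String {
    if secs < 60 {
        format!("{}s", secs)
    } else if secs < 3600 {
        format!("{}m {}s", secs / 60, secs % 60)
    } else {
        format!("{}h {}m", secs / 3600, (secs % 3600) / 60)
    }
}
```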
müde	2146e47770	web ui: retry binding on AddrInUse during restart races operator hit 'Address already in use (os error 98)' on a harness restart — the new harness raced the old socket's release. add a bind_with_retry helper that backs off (250ms doubling, capped at 2s, 12 tries ≈ 22s total) on AddrInUse before giving up. applied to both the per-agent web UI and the hive-c0re dashboard. proper fix would be SO_REUSEADDR via socket2 but retry covers the TIME_WAIT case fine and keeps the dep count down. Other bind errors still fail immediately (port permission, fd exhaustion).	2026-05-15 20:33:51 +02:00
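A blocking, std-only sketch of the backoff in 2146e47770 (the real helper is presumably async). The schedule (250ms doubling, capped at 2s) comes from the commit message; `backoff_delay` and this signature of `bind_with_retry` are assumptions, and the sketch does not try to reproduce the ≈22s total.

```rust
use std::io;
use std::net::TcpListener;
use std::thread;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): 250ms doubling, capped at 2s.
fn backoff_delay(attempt: u32) -> Duration {
    let ms = 250u64.saturating_mul(1u64 << attempt.min(10));
    Duration::from_millis(ms.min(2000))
}

/// Retry only on AddrInUse; every other bind error fails immediately.
fn bind_with_retry(addr: &str, tries: u32) -> io::Result<TcpListener> {
    let mut attempt = 0;
    loop {
        match TcpListener::bind(addr) {
            Ok(listener) => return Ok(listener),
            Err(e) if e.kind() == io::ErrorKind::AddrInUse && attempt + 1 < tries => {
                thread::sleep(backoff_delay(attempt));
                attempt += 1;
            }
            Err(e) => return Err(e),
        }
    }
}
```

A call such as `bind_with_retry("127.0.0.1:8000", 12)` would mirror the 12-try budget named in the commit.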
müde	538e0446d7	agent page: inbox view of last 30 messages addressed to this agent new wire request AgentRequest::Recent { limit } / ManagerRequest::Recent (plus matching responses with Vec<InboxRow>). InboxRow moved to hive-sh4re so it lives on both surfaces without an internal-to-wire conversion. host-side dispatch in agent_server / manager_server calls broker.recent_for(name, limit). per-agent web_ui /api/state grew an inbox: Vec<InboxRow> populated via the same per-agent socket (best-effort; transport failure returns empty). frontend renders as a collapsible <details> section between the state row and the terminal — fmt timestamp / from / body in a tight grid, capped at 16em scrollable. only visible when there are rows.	2026-05-15 20:32:19 +02:00
müde	ee5b85716d	ask_operator: operator-side ✗ CANC3L on pending questions new POST /cancel-question/{id} resolves a pending operator question with the sentinel answer '[cancelled]' and fires the usual HelperEvent::OperatorAnswered so the manager sees a terminal state and can fall back. uses the same OperatorQuestions::answer path — no special handling, the manager already has to deal with arbitrary answer strings. dashboard renders the cancel as a separate <form> below the main qform so the answer-merge submit handler on the main form doesn't inadvertently fire when the operator clicks cancel. confirm dialog spells out what the manager will see. ttl-based auto-cancel is still on the todo (would spawn a tokio task per submitted question).	2026-05-15 20:25:11 +02:00
müde	2413d664a1	agents get a kickoff inbox message on start/restart/rebuild new Coordinator::kick_agent(name, reason) drops a system message into the agent's inbox so the next turn picks it up with a 'you were just (re)started, check /state/ for notes, --continue session is intact' hint. wakes the turn loop without any harness-side handling needed — it's just another inbox message with sender = 'system'. wired from: - dashboard /start /restart /rebuild handlers (via lifecycle_action's on-success tail) - manager mcp_hyperhive_start / restart dashboard: pending approvals + tombstones + questions now refresh on a 5s heartbeat when nothing else is happening. previously refresh only fired on async-form submit or on broker traffic addressed to operator — manager-queued approvals went through neither, so the operator had to reload to see them. 5s is the slow-path; 2s remains for in-flight transients.	2026-05-15 20:19:36 +02:00
müde	c27111ac32	dashboard: split api_state into per-section builders drops the #[allow(clippy::too_many_lines)] on api_state by extracting four pure helpers: - build_container_views — live containers + any_stale flag - build_transient_views — agents in pre-creation Spawning state only - build_approval_views — pending approvals with diff html - build_tombstone_views — destroyed-but-kept state dirs api_state itself is now ~30 lines of orchestration. zero behavior change. each helper is independently readable + testable.	2026-05-15 20:13:08 +02:00
müde	7b4adea325	dashboard: lifecycle_action helper collapses start/stop/restart/rebuild five POST handlers (post_kill / post_restart / post_start / post_rebuild) were all repeating the same boilerplate: strip prefix, set_transient, call lifecycle::X, clear_transient, match the result. extract one helper that takes the transient kind, error-message verb, the work body, and an optional 'on success' tail (used by kill to also unregister + emit HelperEvent::Killed). each handler shrinks to a single lifecycle_action(..) call. zero behavior change.	2026-05-15 20:12:03 +02:00
müde	89ccc5e6c5	events.sqlite vacuum moves host-side retention is a host concern — agents have no business doing their own cleanup, and a misbehaving harness could skip it. drop spawn_events_vacuum from both hive-ag3nt and hive-m1nd, drop the matching Bus::vacuum + EventStore::vacuum methods. new hive_c0re::events_vacuum module sweeps every existing agents/<name>/state/hyperhive-events.sqlite on the same hourly cadence as the broker vacuum. same two-stage delete (older than 7 days, trim to 2000 newest). called from main alongside broker vacuum. also: server-side state badge entered into todo.md (today's badge is derived client-side from sse, fine for idle/thinking but a state machine that grows compacting/napping wants authoritative status from the harness).	2026-05-15 20:10:34 +02:00
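The two-stage retention from 89ccc5e6c5 (drop rows older than 7 days, then trim to the 2000 newest) can be sketched on an in-memory list of timestamps. The real sweep runs against sqlite; `vacuum_events` and the Unix-seconds representation are assumptions.

```rust
const MAX_AGE_SECS: u64 = 7 * 24 * 60 * 60;
const MAX_ROWS: usize = 2000;

/// Stage 1: drop rows older than 7 days. Stage 2: keep only the newest 2000.
fn vacuum_events(timestamps: &mut Vec<u64>, now: u64) {
    let cutoff = now.saturating_sub(MAX_AGE_SECS);
    timestamps.retain(|ts| *ts >= cutoff);
    // Newest first, so truncating keeps the most recent rows.
    timestamps.sort_unstable_by(|a, b| b.cmp(a));
    timestamps.truncate(MAX_ROWS);
}
```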
müde	897e7c07ae	dashboard: spawn form moves under approvals; docs synced submitting R3QU3ST SP4WN immediately queues an approval that lands in the very next list. the form belonged with that list, not at the top of containers — the agent doesn't exist yet at form time anyway. docs: claude.md grows operator_questions.rs / events.rs sqlite / broker vacuum to the file map; web-ui shape lists the actual current endpoint set (per-agent cancel/compact/history, dashboard tombstone purge/answer/spawn); live-view section now describes the state badge, sticky-bottom scroll, history backfill, and the terminal-embedded prompt with its slash commands; dashboard-action-surface rewritten around the new six-section page (containers / kept-state / questions / inbox / approvals / message-flow) and the two-line container row. new 'persistence + retention' section documenting both sqlite databases and their vacuum cadences. readme picks up the new mgr mcp surface (start/restart/ask_operator) + operator-side features list + ask_operator answer flow. todo trimmed of shipped items (bigger terminal / sticky scroll / cancel button / /compact trigger / /cancel command). new entry for the two-step spawn-with-preconfig flow.	2026-05-15 20:02:54 +02:00
müde	5ee65d2f15	dashboard: K3PT ST4T3 section + agent links open in new tab new section between containers and questions: lists every name with a state dir under /var/lib/hyperhive/agents/ that doesn't correspond to a live container. shows state size + last-modified age + whether claude creds are kept. two actions per row: - R3V1V3 — queues a spawn approval with the same name (operator approves to recreate; spawn flow reuses prior config + claude creds, no re-login needed) - PURG3 — wipes the agent's state + applied dirs (post /purge-tombstone/ endpoint; refuses if a live container with that name still exists) dashboard also opens agent links in new tabs now (target=_blank + rel=noopener) so the operator's overview tab stays put when they dive into an agent.	2026-05-15 19:55:27 +02:00
müde	8344dd9ab7	ask_operator: multi-select + free-text fallback ask_operator now accepts a multi: bool. when true and options is non-empty, the dashboard renders the choices as checkboxes — operator picks any subset, answer comes back as a ', '-joined string. when false (default), options are radio buttons. independent of multi, a free-text input ('or type your own…') is always rendered alongside options so the operator is never trapped by an incomplete list. submit merges checked options + free text into the single 'answer' field. schema migration: operator_questions grows a multi INTEGER column with a one-shot ALTER TABLE on open. backward compatible — old rows default to 0 (not multi). prompt + mcp tool description updated; existing dashboard css for .qform was rewritten around the new vertical layout.	2026-05-15 19:52:44 +02:00
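A sketch of the answer merge from 8344dd9ab7: checked options plus the free-text field collapse into one ', '-joined string. The real merge happens in the dashboard's submit handler; `merge_answer` is a hypothetical stand-in, and trimming the free text is an assumption.

```rust
/// Join the checked options and any non-empty free text into one answer.
fn merge_answer(checked: &[&str], free_text: &str) -> String {
    let mut parts: Vec<&str> = checked.to_vec();
    let extra = free_text.trim();
    if !extra.is_empty() {
        parts.push(extra);
    }
    parts.join(", ")
}
```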
müde	c337cc06f8	dashboard: spinners on in-flight lifecycle actions + cleaner row layout backend: - TransientKind grows Starting / Stopping / Restarting / Rebuilding / Destroying alongside the existing Spawning. each dashboard handler (start/restart/kill/rebuild/destroy) wraps the lifecycle call with set_transient + clear_transient so the dashboard knows what's in flight. transient kind is surfaced inline on ContainerView.pending (existing-container actions) — only Spawning (pre-creation) lands in the separate transients list. frontend: - container row is now two lines: identity + meta on top, action buttons below. less cluttered, leaves room for the pending state pill. pending rows dim their actions and surface a pulsing '◐ spawning… / starting… / stopping… / restarting… / rebuilding… / destroying…' indicator next to the name. - 'needs login' / 'needs update' chips moved into a unified .badge styling for consistency. - auto-refresh kicks in not only on transient spawn but on any container with a pending action.	2026-05-15 19:49:43 +02:00
müde	6d52f67292	broker: hourly vacuum of delivered messages older than 30 days undelivered rows are always kept regardless of age (still in flight). sweep runs immediately on serve start then every hour. logs row count when non-zero. keep_secs is hard-coded for now (30 days); can be config-driven later if a host wants to retain more / less for audit.	2026-05-15 19:40:38 +02:00
müde	48ebfefd1a	destroy --purge: also wipe agent state dirs new --purge flag on the destroy verb (cli + admin socket + dashboard). default destroy still keeps /var/lib/hyperhive/{agents,applied}/<name>/ so recreating with the same name reuses prior config + creds. with --purge, both dirs go too (config history, claude creds, /state/ notes). no undo. dashboard adds a separate PURG3 button with an explicit confirmation copy; the existing DESTR0Y button keeps the soft semantics. claude.md dashboard-action-surface section updated; todo entry dropped.	2026-05-15 19:29:14 +02:00
müde	fd39226883	visuals: frosted-glass terminal/msgflow, row fade-in, badge pulses agent terminal-wrap + dashboard msgflow get a translucent bg with backdrop-filter blur+saturate so page-bg glow softens behind them. new rows in the live panel and the dashboard message flow fade in with a 4px slide-up. unread badge pulses; pending-operator-questions section pulses its glow. history-backfilled rows skip the animation (.no-anim) so the page doesn't stagger 100 fade-ins on load.	2026-05-15 19:20:15 +02:00
müde	ac1b5fde8e	manager: start/restart at will, no approval; refuse self new manager tools mcp__hyperhive__{start,restart} that delegate to the existing lifecycle::start / lifecycle::restart on the host. kill was already at the manager's discretion; rounding out start + restart for parity so day-to-day container care doesn't have to round-trip through the operator. guard: refuse self-targeting on kill/start/restart — the manager would just be cutting its own legs. spawn (request_spawn) and config changes (request_apply_commit) still go through the approval queue, since those are the actual gate. prompt + claude.md updated to make the boundary explicit. kill now also emits HelperEvent::Killed (it didn't before).	2026-05-15 18:57:25 +02:00
müde	d943bddd9e	agent ui: input lives in terminal section, banner shimmer on activity agent page restructure: - send form moves into the terminal panel as a prompt-style row beneath the live tail (status line stays above so it still reads as a header). - live panel + prompt share a single bordered 'terminal-wrap' box. - harness-alive / login-state status lines drop their decorative ascii bookends; just a leading dot/glyph remains. - banner gradient is now a real css gradient with a shimmer animation toggled by an .active class. turn_start adds it, turn_end removes it. dashboard side mirrors this: each broker sse event nudges a 4s shimmer window. - dashboard container rows drop their static ▓█▓▒░ / ▒░▒░░ glyph prefixes; the role chips already disambiguate m1nd vs ag3nt. - empty-state placeholders drop the ▓ bookends. terminal pre-fill: hive-ag3nt::events::Bus grows a 500-event ring buffer; new GET /events/history endpoint returns it. The agent JS fetches history before opening the SSE stream so opening the page mid-turn shows the last N events instead of a blank panel. The replay walks turn_start/turn_end pairs to seed the banner-active state correctly if a turn was still open.	2026-05-15 18:54:19 +02:00
müde	2770630f33	ask_operator tool: non-blocking; operator answer arrives as helper event new mcp tool on the manager surface that queues a question on the dashboard and returns the question id immediately. operator submits an answer via /answer-question/<id>; the dashboard fires HelperEvent::OperatorAnswered { id, question, answer } into the manager inbox so the next turn picks it up. also: fix async-form button stuck on spinner after successful submit (refreshState skipped re-rendering, so the button was never re-enabled).	2026-05-15 18:44:42 +02:00
müde	ff8f8c7c56	per-agent /state dir for durable notes; manager sees them via /agents	2026-05-15 18:00:08 +02:00
müde	7be64c5e66	theme: bring back the vibec0re glow on catppuccin mocha	2026-05-15 17:51:36 +02:00
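
The retention rule described in commit 6d52f67292 (delivered messages older than 30 days are swept, undelivered rows are always kept) can be sketched roughly as below. This is an illustration only, not the broker's actual code: the `MessageRow` struct, the `should_vacuum` and `vacuum` functions, and the choice to measure age from a `sent_at` timestamp are all assumptions; only the 30-day `keep_secs` value and the "undelivered rows are never removed" rule come from the commit message.

```rust
/// Retention window of 30 days, in seconds (hard-coded, per the commit message).
const KEEP_SECS: u64 = 30 * 24 * 60 * 60;

/// Hypothetical row shape; the real broker schema is not shown in this log.
struct MessageRow {
    /// Whether the message has already been delivered.
    delivered: bool,
    /// Unix timestamp in seconds (assumed reference point for the row's age).
    sent_at: u64,
}

/// Returns true if the sweep should delete this row at time `now`.
/// Undelivered rows are kept regardless of age, since they are still in flight.
fn should_vacuum(row: &MessageRow, now: u64, keep_secs: u64) -> bool {
    row.delivered && now.saturating_sub(row.sent_at) > keep_secs
}

/// Removes expired delivered rows in place and returns how many were dropped,
/// which is the count the sweep would log when non-zero.
fn vacuum(rows: &mut Vec<MessageRow>, now: u64, keep_secs: u64) -> usize {
    let before = rows.len();
    rows.retain(|row| !should_vacuum(row, now, keep_secs));
    before - rows.len()
}
```

Passing `keep_secs` as a parameter rather than reading the constant directly leaves room for the config-driven retention the commit message mentions as a possible follow-up.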


218 commits