hyperhive

Author	SHA1	Message	Date
müde	b32c3d4f98	approvals: persist fetched_sha alongside the queue new column fetched_sha records the canonical sha hive-c0re plans to fetch from the proposed repo into applied at submit time. distinct from commit_ref (manager-supplied, may be amended out from under the queue). set_fetched_sha is unused until manager_server wires the fetch step next commit.	2026-05-15 22:49:04 +02:00
müde	871e7bf3fa	wire types: add sha + tag to Approval and HelperEvent approval grows fetched_sha (canonical hive-c0re-vouched sha, distinct from manager-supplied commit_ref). helperevent {approvalresolved,spawned,rebuilt} grow optional sha + tag so the manager can git-show the exact tree it's hearing about (against the upcoming /agents/<n>/applied.git RO mount) and know which terminal tag landed. all serde-defaulted; existing construction sites pass none until the tag-driven flow lands.	2026-05-15 22:47:39 +02:00
müde	497cd15137	docs: tag-driven config-apply plan + migration story scratchpad in claude.md marks this as in-flight; docs/approvals.md gets the new tag state machine (proposal/approved/building/deployed/ failed/denied) and the manager applied.git read-only mount. todo picks up the unprivileged-containers git-identity caveat and a web ui for config repos as a downstream follow-up.	2026-05-15 22:43:47 +02:00
müde	75e7faff0c	docs: full sync ahead of compaction + config-management overhaul readme: manager mcp surface picks up update; operator-surface recap mentions /model + last-turn + model chip + the three collapsibles (inbox / journald / agent.nix). web-ui.md: details-restore-key story under shape; port-conflict banner mention on containers; agent.nix viewer alongside journald; notifications use per-event tags + console.debug log on block/show; deny endpoint takes note=<reason>; data-prompt / data-prompt-field generalisation noted. conventions.md: data-prompt and snapshot/restoreOpenDetails added to the async-forms section. persistence.md: operator_questions row picks up deadline_at (ttl) column with a migration note. todo.md: new 'Bugs' section captures the manager-question not-rendering issue with three suspect paths to chase. claude.md scratchpad rewritten as a clean handoff for the compaction + the upcoming config-git overhaul. flags the two-repo (proposed/ + applied/) split as the thing to reconsider.	2026-05-15 22:12:40 +02:00
müde	6a2ffd521b	surface agent-vs-agent port collisions (manager:8000 can't collide) manager is fixed at 8000, sub-agents are 8100-8999, so collisions are strictly between two sub-agents hashing to the same value. the colliding container's harness restart-loops on AddrInUse — which the user just hit on :8945. previously the only sign was a buried journalctl warn line. now surfaced two ways: - lifecycle::spawn / rebuild preflight: walks the live container list, computes each agent's hashed port, refuses with 'port N already taken by <other> — rename one of them' if any running sub-agent shares the new agent's port. so the operator sees an actionable error in the dashboard's transient pill / approve-result instead of waiting for the harness to die. - /api/state grows a port_conflicts: [{port, agents: [...]}] array; dashboard renders a pulsing red banner above the containers list listing each cluster. matches the questions panel pulse so it's hard to miss.	2026-05-15 22:08:19 +02:00
müde	2029840671	deny: operator can attach a reason that reaches the manager clicking DENY on the dashboard now prompts for an optional reason ('reason for denying (optional, sent to manager):'). the value rides along as a hidden 'note' form field; backend chain: POST /deny/{id} { note } → actions::deny(coord, id, Some(note)) → Approvals::mark_denied writes it to the row → HelperEvent::ApprovalResolved { ..., note: Some("...") } manager already had note: Option<String> on the event, just never populated for denials before. host admin socket (hive-c0re deny) still passes None. generalized the prompt-on-submit pattern: any form with a data-prompt attribute pops a window.prompt() before the POST and stashes the answer in a hidden input named by data-prompt-field (default 'note'). reusable for future opt-in note fields.	2026-05-15 21:58:42 +02:00
müde	91c78d626f	dashboard: per-container applied agent.nix viewer new GET /api/agent-config/{name} returns the contents of /var/lib/hyperhive/applied/<name>/agent.nix — the file the container actually builds against. validated against the live container list to avoid arbitrary filesystem reads. frontend mirrors the journald viewer: collapsed <details> on each container row, lazy-fetches on expand, refresh button re-fetches. restore-keyed (agent-config:<name>) so it survives the dashboard heartbeat refresh. read-only — mutating the applied config goes through the existing request_apply_commit + operator approval flow.	2026-05-15 21:46:25 +02:00
müde	80229c6af9	manager: needs_login / logged_in / needs_update events + update tool crash_watch grows two more state-axes alongside running/stopped: - logged-in (claude session dir populated for the agent) - up-to-date (recorded flake rev matches current) per-tick transitions emit HelperEvent::NeedsLogin / LoggedIn / NeedsUpdate. seed-on-first-tick semantics retained — nothing fires on harness boot for agents that were already in their state. only needs_update fires the 'stale appeared' direction; the resolved direction is already covered by Rebuilt. new mcp__hyperhive__update(name) on the manager surface: idempotent rebuild via auto_update::rebuild_agent. transient-aware (Rebuilding) so the dashboard shows the spinner. login intentionally has NO tool — it's interactive OAuth, only the operator can complete it. prompts + approvals doc + turn-loop doc updated. todo grows a 'show per-agent applied config in dashboard' entry (separate follow-up).	2026-05-15 21:42:13 +02:00
müde	b374f39b0d	dashboard: preserve <details open> across refresh via data-restore-key generalises the focus-preservation pattern to expanded details sections (journald viewer was collapsing on every 5s refresh; same issue for approval diff blocks). before re-render we snapshot which <details data-restore-key=...> are open; after render we re-apply. setting .open = true programmatically also fires the toggle event, so journald's lazy-fetch listener re-runs cleanly. tagged: journal:<container>, approval-diff:<id>. anything else that should survive a refresh just needs a stable data-restore-key attribute.	2026-05-15 21:37:17 +02:00
müde	fd0e493bf5	agent terminal: show full body for send tool calls send was truncating to 80 chars in the tool_use row, hiding anything past the first sentence. now renders as a collapsed <details> like Write/Edit — summary still shows the recipient + headline (so the operator can scan), expanding reveals the full body unchanged. recv side was already covered: the wake prompt shows the full incoming body, and explicit recv() tool_result rows expand to the full text via the existing collapsed-results path.	2026-05-15 21:35:48 +02:00
müde	3b532753b3	notifications: per-event tags + debug logs bug: all notifications used tag='hyperhive', so each new fire replaced the previous — operator only ever saw one at a time and might miss the fact that a second arrived. now per-event tags (hyperhive:approval:<id>, hyperhive:question:<id>, hyperhive:msg:<at>:<rand>) so distinct events stack in the OS notification center. dropped the bogus icon (was pointing at dashboard.css) — some browsers refuse to display a notification with an invalid icon. added console.debug at every block point (not supported, permission not granted, muted) and a 'shown' log on success, so the operator can see in the browser console exactly why a notification didn't fire. note for the operator: most browsers also suppress notifications while the originating tab is FOCUSED. that's a browser-level decision, not ours.	2026-05-15 21:34:21 +02:00
müde	62d1a74929	docs sync + revert auto-unfree removal revert the earlier 'operator must set allowUnfree' move: per-agent containers evaluate their own nixpkgs and the operator's host-level allowUnfree doesn't propagate in. restoring the scoped allowUnfreePredicate inside both the claude-unstable overlay and harness-base.nix; documented in README + gotchas as 'nothing to set on the operator side'. docs: - claude.md file map adds crash_watch.rs, kick_agent on coordinator, /api/model + journald viewer + bind-with-retry references. - scratchpad rewritten to reflect the recent run. - web-ui.md: notification row + browser notifications section, state row (badge + model chip + last-turn chip + cancel button), per-agent inbox, /model slash, /cancel-question + journald endpoints, focus-preservation on refresh. - turn-loop.md: --model is read from Bus::model() per turn (runtime override via /model); recv(wait_seconds) up to 180s with the rationale; ask_operator gains ttl_seconds; new TurnState section; kick_agent inbox-on-startup hint. - approvals.md: ttl/cancel resolution paths for operator questions. - persistence.md: /state/hyperhive-model file. - gotchas.md: web UI port collision policy (rename, don't probe); bind retry + SO_REUSEADDR shape; auto-unfree restored. - todo.md: cleaned up empty sections and stale entries; /model shipped, dropped from the list.	2026-05-15 21:26:13 +02:00
müde	d275b50177	dashboard: don't yank the form away while operator is typing every refreshState tick does root.innerHTML = '' across the managed sections, which destroys any focused input. detect the case before re-rendering: if document.activeElement is an INPUT / TEXTAREA / SELECT inside one of the managed sections, skip this tick and try again in 2s. eventually the operator blurs and the refresh lands. managed section ids: containers / tombstones / questions / inbox / approvals. msgflow + message-flow SSE rows don't have inputs so they're not affected.	2026-05-15 21:19:01 +02:00
müde	acaa0eb895	agent_web_port: back to pure hash, drop port-file dance operator's call: probing-forward + state-file machinery is more brittle than the bug it tried to fix. revert to the original deterministic FNV-1a hash mod 900. collisions are real but rare; operator resolves by renaming (different name → different hash) and rebuilding. no per-agent port file, no scan, no migration path, nothing to drift out of sync with the running container. existing port files on disk are silently ignored — operator rebuilds affected agents to regenerate flakes from the deterministic hash.	2026-05-15 21:17:31 +02:00
müde	c35f566d15	agent_web_port: actually resolve legacy collisions previous attempt was wrong: the legacy branch returned port_hash unconditionally, so two legacies hashing to the same port both wrote that port and the collision persisted (test still trying to bind coder's port). new rule: always probe forward from port_hash, with scan_taken_ports parameterised by include_implicit_hashes: - legacy migration (applied dir exists, no port file): pass false. scan only counts other agents' port files. first-queried legacy claims its hash; subsequent colliders see the first's port file and probe forward. we don't know which legacy originally won the bind race, so first-write-wins; the loser was already crash-looping anyway and gets a fresh port to rebuild to. - fresh spawn (no applied dir): pass true. counts port files AND implicit hashes for not-yet-migrated legacies, so a new spawn doesn't race with an unmigrated peer. migration note for affected users: agents whose port file says something other than their hashed port may have been corrupted by the previous fix. Hit ↻ R3BU1LD on the offender to regenerate the flake (uses the current port file) and the container will bind the right port on restart.	2026-05-15 21:13:17 +02:00
müde	237b215c55	dashboard: browser notifications for operator-bound events three signals fire OS notifications: - new approval lands in the queue (per id, via /api/state delta) - new ask_operator question queued (per id) - broker message sent to operator (live via SSE) first /api/state render after page load seeds the 'seen' sets without firing — only items that arrive while the page is open count. controls in a row under the banner: 🔔 enable notifications (calls requestPermission, hides on grant), 🔕 mute / 🔔 unmute toggle (localStorage-backed so operator can silence without revoking the permission), inline status text when blocked or unsupported. notification tag='hyperhive' collapses rapid bursts; onclick focuses the dashboard tab. requires secure context (HTTPS or localhost) — on other origins the API is unavailable and the controls hide themselves. todo: entry dropped.	2026-05-15 21:10:20 +02:00
müde	a67aada7c9	todo: browser notifications for approvals / questions / operator msgs pure frontend — Notification API + existing /api/state and /messages/stream signals. Caveats: secure-context requirement (HTTPS or localhost), per-browser permission grant. Includes a sketch of the implementation: request-permission button, count deltas on refreshState, SSE hook on operator-bound sends, localStorage 'muted' toggle.	2026-05-15 21:07:21 +02:00
müde	8b9f7d21b7	model persisted to /state; stop auto-allowing claude-code unfree model persistence: /model <name> now writes to /state/hyperhive-model (in-container), Bus::new reads it on init. operator override survives harness restart and container rebuild; gone on --purge like every other piece of agent state. path overridable via HYPERHIVE_MODEL_FILE for tests. failure to persist is a warn, not fatal — runtime override still applies, just won't survive a restart. unfree opt-in: drop the auto-allowUnfreePredicate from harness-base.nix and the claude-unstable overlay. operator now has to set nixpkgs.config.allowUnfree (or a predicate listing claude-code) in their own host config. silent unfree bypass was sketchy; this is honest. readme + gotchas updated to spell out the snippet. todo: drops model-persistence + container-crash + journald (all shipped); adds per-agent send allow-list (constrain who an agent can message).	2026-05-15 21:05:40 +02:00
müde	58c3cd853b	container crash watcher → HelperEvent::ContainerCrash new hive_c0re::crash_watch task polls every 10s, builds the set of currently-running containers, and on running→stopped transitions checks the transient snapshot: if no Stopping / Restarting / Destroying / Rebuilding flag is set, the container exited unexpectedly and we fire HelperEvent::ContainerCrash into the manager's inbox so it can react (typically: start it again). first poll is a seeding pass — no events on harness startup. dbus subscription would be lower-latency but polling is honest and debuggable, and a 10s delay on crash detection is fine for our scale. manager prompt + approvals doc updated to advertise the new event variant. todo drops the entry (and the journald-viewer entry that already shipped).	2026-05-15 21:02:05 +02:00
müde	6db38cf70c	model: runtime override via /model slash; fixes for port + bind - runtime model override: Bus::{model,set_model} + POST /api/model (form-encoded {model: name}). turn.rs reads bus.model() per turn so a flip lands on the next claude invocation. /api/state grows a model field; agent page shows a 'model · <name>' chip in the state row. '/model <name>' slash command POSTs to the endpoint and refreshes state. - port regression fix: agent_web_port no longer probes forward for existing agents (the previous fix shifted ports for any agent without a port file, including legacy ones whose container was already bound to the bare hashed port — dashboard rendered the new port, container was still on the old one, conn errors). new rule: port file exists → use it; absent + applied flake present → legacy, persist port_hash without probing; absent + no applied flake → fresh spawn, probe forward. - SO_REUSEADDR on both the dashboard and per-agent web UI binds via tokio::net::TcpSocket. operator hit 12 retries failing on manager :8000 — REUSEADDR handles the TIME_WAIT case cleanly without a new dep; retry still covers the genuine process-still-alive overlap. todo: drops the model-override entry (shipped); adds two new items — model persistence (optional, future), and custom per-agent MCP tools (groundwork for moving bitburner-agent into hyperhive).	2026-05-15 20:59:45 +02:00
müde	7d93dd9db4	no nap tool — recv with long wait_seconds replaces it; max raised to 180s recv-with-timeout is strictly better than a fixed sleep because it wakes instantly on incoming messages. drop the half-written nap MCP tool, raise the recv wait_seconds cap from 60s to 180s on both agent and manager sockets. prompts updated: agent.md + manager.md now spell out the pattern — when there's nothing else useful to do, call recv with wait_seconds=180 to park the turn; do NOT use Bash sleep for the same purpose. todo drops the nap entry and the napping-state-badge follow-up; both replaced by 'just use a long recv'.	2026-05-15 20:53:15 +02:00
müde	f65ee88269	recv: optional wait_seconds parameter, capped at 60s AgentRequest::Recv and ManagerRequest::Recv grow an optional wait_seconds field (default None → 30s, capped at 60s server-side). agent_server / manager_server clamp via recv_timeout(). MCP tool schemas advertise the param so claude can pick its own poll window — useful when an agent wants to throttle wakes without entering a distinct nap state. both harness loops still pass None, keeping the existing 30s default behaviour for system-level Recvs.	2026-05-15 20:49:33 +02:00
müde	637085644d	server-side TurnState in the harness, exposed via /api/state new TurnState { Idle, Thinking, Compacting } on hive_ag3nt::events::Bus with set_state + state_snapshot. the turn loops in hive-ag3nt and hive-m1nd flip Thinking before drive_turn and Idle after; the web_ui's /api/compact handler flips Compacting around compact_session. per-agent /api/state grows turn_state + turn_state_since (unix seconds). frontend prefers the server-reported state over the client-derived one — setStateAbs takes the absolute since-time so the 'last turn' chip reads the actual server-side duration instead of the client's perceived gap between SSE events. SSE turn_start / turn_end still drive state instantly between renders; /api/state re-anchors on each turn_end refresh. new compacting state gets its own purple badge with pulse animation (mirrors thinking's amber). napping will slot in the same way once the nap tool lands.	2026-05-15 20:46:38 +02:00
müde	0385d96bf3	dashboard: per-container journald viewer new GET /api/journal/{name}?unit=&lines= shells out journalctl -M <container> -b --no-pager --output=short-iso --lines=<N> (cap 5000). optional unit filter, restricted to hive-ag3nt.service / hive-m1nd.service so the shell-out can't be coerced into reading unrelated units. validates the container name against the live list before invoking journalctl. frontend renders a collapsed '↳ logs · <container>' details block on each container row. expanding triggers a lazy fetch; refresh button re-fetches; unit dropdown switches between the harness service (default) and the full machine journal. output sits in a 24em-tall monospace pre, auto-scrolled to the bottom on fresh fetch. hive-c0re's systemd unit already runs as root, so journalctl has the access it needs.	2026-05-15 20:42:56 +02:00
müde	79a46f359a	agent_web_port: collision-aware sticky allocation operator hit 'coder' and 'test' colliding on the same hashed port — fnv-1a mod 900 has ~0.1% collision probability per pair and clearly that's not enough. agent_web_port goes stateful: - per-agent port persisted to /var/lib/hyperhive/agents/<name>/port - on first call, look up the file; if absent, hash, then probe forward through the allocated range skipping any port other agents already claim, then write the chosen value back - subsequent calls return the persisted port (sticky) other agents' ports come from their port file if present, else the fallback is the hashed value — that handles existing deployments without forcing a rebuild-all just to migrate. rebuilding the colliding agent re-runs agent_web_port, sees its peer's implicit hash port as taken, picks the next free slot, persists. range exhaustion (very unlikely — 900 slots) logs a warning and returns the hash; the bind-with-retry on the harness will surface the failure honestly rather than silently looping.	2026-05-15 20:41:18 +02:00
müde	754db7830e	ask_operator: ttl_seconds auto-cancel + remaining-time chip manager can pass ttl_seconds to ask_operator. on submit, host stores deadline_at = now + ttl in operator_questions (new column, migrated via existing pragma_table_info pattern), spawns a tokio task that sleeps until the deadline then resolves the question with answer '[expired]' and fires the same OperatorAnswered helper event. already-resolved races no-op silently. dashboard renders a '⏳ MM:SS' chip on the question row when deadline_at is set. format collapses seconds → s, < 1h → m s, ≥ 1h → h m. heartbeat refresh (5s) keeps the chip current; the operator sees it tick down. manager prompt + mcp tool description updated. journald viewer per container queued in todo (separate task).	2026-05-15 20:38:02 +02:00
müde	2146e47770	web ui: retry binding on AddrInUse during restart races operator hit 'Address already in use (os error 98)' on a harness restart — the new harness raced the old socket's release. add a bind_with_retry helper that backs off (250ms doubling, capped at 2s, 12 tries ≈ 22s total) on AddrInUse before giving up. applied to both the per-agent web UI and the hive-c0re dashboard. proper fix would be SO_REUSEADDR via socket2 but retry covers the TIME_WAIT case fine and keeps the dep count down. Other bind errors still fail immediately (port permission, fd exhaustion).	2026-05-15 20:33:51 +02:00
müde	538e0446d7	agent page: inbox view of last 30 messages addressed to this agent new wire request AgentRequest::Recent { limit } / ManagerRequest::Recent (plus matching responses with Vec<InboxRow>). InboxRow moved to hive-sh4re so it lives on both surfaces without an internal-to-wire conversion. host-side dispatch in agent_server / manager_server calls broker.recent_for(name, limit). per-agent web_ui /api/state grew an inbox: Vec<InboxRow> populated via the same per-agent socket (best-effort; transport failure returns empty). frontend renders as a collapsible <details> section between the state row and the terminal — fmt timestamp / from / body in a tight grid, capped at 16em scrollable. only visible when there are rows.	2026-05-15 20:32:19 +02:00
müde	bd7d2d4860	agent page: dashboard back-link + last-turn timing chip title bar grows a '↑ DASHB04RD' link next to the rebuild button — opens the host dashboard in a new tab so the operator can pivot between agents without losing the live tail. uses the dashboardPort already plumbed via /api/state. state row picks up a 'last turn 12.3s' chip that fills in when state transitions away from thinking. format: ms / s.s / m s. hidden until the first turn completes.	2026-05-15 20:27:09 +02:00
müde	ee5b85716d	ask_operator: operator-side ✗ CANC3L on pending questions new POST /cancel-question/{id} resolves a pending operator question with the sentinel answer '[cancelled]' and fires the usual HelperEvent::OperatorAnswered so the manager sees a terminal state and can fall back. uses the same OperatorQuestions::answer path — no special handling, the manager already has to deal with arbitrary answer strings. dashboard renders the cancel as a separate <form> below the main qform so the answer-merge submit handler on the main form doesn't inadvertently fire when the operator clicks cancel. confirm dialog spells out what the manager will see. ttl-based auto-cancel is still on the todo (would spawn a tokio task per submitted question).	2026-05-15 20:25:11 +02:00
müde	bc87ff80d2	agent terminal: inline +/- diffs on Write and Edit tool calls Write and Edit tool_use rows used to render as the bare file path. now they're collapsed <details> blocks with the actual change inside — Write shows every content line prefixed '+', Edit shows old_string as '-' lines then new_string as '+' lines. summary carries the file path + counts ('→ Edit /foo · -3 +5'). lines colored via diff-add / diff-del / diff-ctx; click to expand the full body. renderFileWriteEdit returns null for any other tool so the existing flat-row path (fmtToolUse) is untouched.	2026-05-15 20:23:22 +02:00
müde	2413d664a1	agents get a kickoff inbox message on start/restart/rebuild new Coordinator::kick_agent(name, reason) drops a system message into the agent's inbox so the next turn picks it up with a 'you were just (re)started, check /state/ for notes, --continue session is intact' hint. wakes the turn loop without any harness-side handling needed — it's just another inbox message with sender = 'system'. wired from: - dashboard /start /restart /rebuild handlers (via lifecycle_action's on-success tail) - manager mcp_hyperhive_start / restart dashboard: pending approvals + tombstones + questions now refresh on a 5s heartbeat when nothing else is happening. previously refresh only fired on async-form submit or on broker traffic addressed to operator — manager-queued approvals went through neither, so the operator had to reload to see them. 5s is the slow-path; 2s remains for in-flight transients.	2026-05-15 20:19:36 +02:00
müde	8b10731aa4	split claude.md into docs/ — per-topic, human-readable claude.md was eating 400 lines of subsystem detail that's useful when you're working on that subsystem and noise the rest of the time. split into: - docs/conventions.md naming, identity, async forms, commit style - docs/gotchas.md nspawn / nixos-container quirks - docs/web-ui.md dashboard + per-agent layouts and endpoints - docs/turn-loop.md claude invocation, wake prompt, mcp surface - docs/approvals.md approval flow, manager policy, helper events - docs/persistence.md sqlite dbs, retention, state dir layout claude.md is now the entry point — file map, reading paths ("pick the doc that matches your task"), quick reminders that fit on one screen, and a small scratchpad section for in-flight context. references the docs; the docs don't reference claude.md. no content was lost — the docs/ files cover everything the old claude.md did, plus things i wrote up better while extracting.	2026-05-15 20:17:11 +02:00
müde	c27111ac32	dashboard: split api_state into per-section builders drops the #[allow(clippy::too_many_lines)] on api_state by extracting four pure helpers: - build_container_views — live containers + any_stale flag - build_transient_views — agents in pre-creation Spawning state only - build_approval_views — pending approvals with diff html - build_tombstone_views — destroyed-but-kept state dirs api_state itself is now ~30 lines of orchestration. zero behavior change. each helper is independently readable + testable.	2026-05-15 20:13:08 +02:00
müde	7b4adea325	dashboard: lifecycle_action helper collapses start/stop/restart/rebuild five POST handlers (post_kill / post_restart / post_start / post_rebuild) were all repeating the same boilerplate: strip prefix, set_transient, call lifecycle::X, clear_transient, match the result. extract one helper that takes the transient kind, error-message verb, the work body, and an optional 'on success' tail (used by kill to also unregister + emit HelperEvent::Killed). each handler shrinks to a single lifecycle_action(..) call. zero behavior change.	2026-05-15 20:12:03 +02:00
müde	89ccc5e6c5	events.sqlite vacuum moves host-side retention is a host concern — agents have no business doing their own cleanup, and a misbehaving harness could skip it. drop spawn_events_vacuum from both hive-ag3nt and hive-m1nd, drop the matching Bus::vacuum + EventStore::vacuum methods. new hive_c0re::events_vacuum module sweeps every existing agents/<name>/state/hyperhive-events.sqlite on the same hourly cadence as the broker vacuum. same two-stage delete (older than 7 days, trim to 2000 newest). called from main alongside broker vacuum. also: server-side state badge entered into todo.md (today's badge is derived client-side from sse, fine for idle/thinking but a state machine that grows compacting/napping wants authoritative status from the harness).	2026-05-15 20:10:34 +02:00
müde	897e7c07ae	dashboard: spawn form moves under approvals; docs synced submitting R3QU3ST SP4WN immediately queues an approval that lands in the very next list. the form belonged with that list, not at the top of containers — the agent doesn't exist yet at form time anyway. docs: claude.md grows operator_questions.rs / events.rs sqlite / broker vacuum to the file map; web-ui shape lists the actual current endpoint set (per-agent cancel/compact/history, dashboard tombstone purge/answer/spawn); live-view section now describes the state badge, sticky-bottom scroll, history backfill, and the terminal- embedded prompt with its slash commands; dashboard-action-surface rewritten around the new six-section page (containers / kept-state / questions / inbox / approvals / message-flow) and the two-line container row. new 'persistence + retention' section documenting both sqlite databases and their vacuum cadences. readme picks up the new mgr mcp surface (start/restart/ask_operator) + operator-side features list + ask_operator answer flow. todo trimmed of shipped items (bigger terminal / sticky scroll / cancel button / /compact trigger / /cancel command). new entry for the two-step spawn-with-preconfig flow.	2026-05-15 20:02:54 +02:00
müde	c9647f4106	operator control: /compact slash command + endpoint new POST /api/compact on the per-agent web UI: spawns turn::compact_session in the background so the http handler returns immediately. claude runs '/compact' over the persistent --continue session; output streams into the live panel like any other turn. slash command /compact wired to the new endpoint. SLASH_COMMANDS list now lists all four (/help /clear /cancel /compact). postCancelTurn + postCompact share a postSimple() helper. deliberately not gated against an in-flight turn — claude's own session lock will reject a concurrent compact and the failure surfaces as a Note in the live panel.	2026-05-15 19:56:53 +02:00
müde	5ee65d2f15	dashboard: K3PT ST4T3 section + agent links open in new tab new section between containers and questions: lists every name with a state dir under /var/lib/hyperhive/agents/ that doesn't correspond to a live container. shows state size + last-modified age + whether claude creds are kept. two actions per row: - R3V1V3 — queues a spawn approval with the same name (operator approves to recreate; spawn flow reuses prior config + claude creds, no re-login needed) - PURG3 — wipes the agent's state + applied dirs (post /purge-tombstone/ endpoint; refuses if a live container with that name still exists) dashboard also opens agent links in new tabs now (target=_blank + rel=noopener) so the operator's overview tab stays put when they dive into an agent.	2026-05-15 19:55:27 +02:00
müde	8344dd9ab7	ask_operator: multi-select + free-text fallback ask_operator now accepts a multi: bool. when true and options is non-empty, the dashboard renders the choices as checkboxes — operator picks any subset, answer comes back as a ', '-joined string. when false (default), options are radio buttons. independent of multi, a free-text input ('or type your own…') is always rendered alongside options so the operator is never trapped by an incomplete list. submit merges checked options + free text into the single 'answer' field. schema migration: operator_questions grows a multi INTEGER column with a one-shot ALTER TABLE on open. backward compatible — old rows default to 0 (not multi). prompt + mcp tool description updated; existing dashboard css for .qform was rewritten around the new vertical layout.	2026-05-15 19:52:44 +02:00
müde	c337cc06f8	dashboard: spinners on in-flight lifecycle actions + cleaner row layout backend: - TransientKind grows Starting / Stopping / Restarting / Rebuilding / Destroying alongside the existing Spawning. each dashboard handler (start/restart/kill/rebuild/destroy) wraps the lifecycle call with set_transient + clear_transient so the dashboard knows what's in flight. transient kind is surfaced inline on ContainerView.pending (existing-container actions) — only Spawning (pre-creation) lands in the separate transients list. frontend: - container row is now two lines: identity + meta on top, action buttons below. less cluttered, leaves room for the pending state pill. pending rows dim their actions and surface a pulsing '◐ spawning… / starting… / stopping… / restarting… / rebuilding… / destroying…' indicator next to the name. - 'needs login' / 'needs update' chips moved into a unified .badge styling for consistency. - auto-refresh kicks in not only on transient spawn but on any container with a pending action.	2026-05-15 19:49:43 +02:00
müde	300be8afa9	operator control: /cancel slash command + cancel button new POST /api/cancel on the per-agent web UI: shells out pkill -INT claude (procps added to harness-base.nix). emits a Note on the bus so the operator sees the cancel landed; state goes back to idle when run_claude wakes and emits TurnEnd as usual. frontend: - /cancel slash command in the terminal input - ■ cancel turn button in the state row, visible only while state === 'thinking' (driven from the same SSE-based state machine). disabled briefly during the POST. claude gets SIGINT (not TERM) so it flushes anything in-flight and emits a final result row before exiting.	2026-05-15 19:45:37 +02:00
müde	de09503b59	events: persist to sqlite, survive harness restart hive_ag3nt::events::Bus replaces its in-memory VecDeque with a sqlite- backed store at /state/hyperhive-events.sqlite (overridable via HYPERHIVE_EVENTS_DB). emit() inserts a row; history() reads back the most recent 2000 events. survives harness restart now — operator reload mid-investigation no longer wipes the trail. vacuum runs hourly (immediate first sweep): drop rows older than 7 days, then trim to 2000 newest. two-stage so a quiet agent keeps a useful tail and a chatty one stays bounded. wired into both hive-ag3nt and hive-m1nd via spawn_events_vacuum. if the db open fails (e.g. no /state mount in dev), Bus runs in no-store mode — events still broadcast, just nothing persisted.	2026-05-15 19:42:57 +02:00
müde	6d52f67292	broker: hourly vacuum of delivered messages older than 30 days undelivered rows are always kept regardless of age (still in flight). sweep runs immediately on serve start then every hour. logs row count when non-zero. keep_secs is hard-coded for now (30 days); can be config-driven later if a host wants to retain more / less for audit.	2026-05-15 19:40:38 +02:00
müde	a9ed33d94f	todo: trim state-badge entry to what's left (compacting/napping)	2026-05-15 19:36:42 +02:00
müde	211599c589	agent state badge: idle / thinking / offline + age timer new badge between the status line and the terminal. shows current state with a glyph + label + age suffix (e.g. '🧠 thinking · 12s'). state transitions are driven from existing SSE turn_start/turn_end — no harness changes needed. on page load, history backfill detects an in-flight turn (turn_start without matching turn_end) and starts in thinking. state-just-changed flash kicks in on each transition. age timer ticks client-side every 1s. compacting/napping states will be added when /compact and nap land — their slots are reserved in the state enum, just unused for now.	2026-05-15 19:36:29 +02:00
müde	0cc25d33d8	drop debug-only cli subcommands from hive-ag3nt + hive-m1nd drop the one-shot send/recv/kill/start/restart/request-spawn/request- apply-commit subcommands from both in-container binaries. they were debug-only — the host admin socket (`hive-c0re ...`) exposes the same verbs and the manager mcp surface covers the rest from inside claude. now each binary's --help shows just `serve` and `mcp`, which are the only commands either is meant to be started with. removes the `one_shot` helper and the `render` / `check` glue.	2026-05-15 19:34:58 +02:00
müde	08f2ec5232	agent terminal: sticky-bottom auto-scroll with new-row pill new rows no longer yank the view if the operator is scrolled up. threshold for 'near bottom' is 48px. when not near bottom, an amber '↓ N new' pill appears in the bottom-right of the terminal-wrap; clicking it jumps to bottom. scrolling back near bottom clears the counter. backfilled (history-replay) rows always scroll to bottom since the operator hasn't started reading yet.	2026-05-15 19:30:34 +02:00
müde	875a8f5be4	agent terminal: take up real screen space terminal height is now min(72vh, 60em) instead of a 32em strip — on a 1080p screen that's ~3x more visible lines. body max-width raised to 110em so a wide window doesn't waste the available width on the margin.	2026-05-15 19:29:36 +02:00
müde	48ebfefd1a	destroy --purge: also wipe agent state dirs new --purge flag on the destroy verb (cli + admin socket + dashboard). default destroy still keeps /var/lib/hyperhive/{agents,applied}/<name>/ so recreating with the same name reuses prior config + creds. with --purge, both dirs go too (config history, claude creds, /state/ notes). no undo. dashboard adds a separate PURG3 button with an explicit confirmation copy; the existing DESTR0Y button keeps the soft semantics. claude.md dashboard-action-surface section updated; todo entry dropped.	2026-05-15 19:29:14 +02:00
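
The two-stage events vacuum described in de09503b59 (drop rows older than 7 days, then trim to the 2000 newest) can be sketched roughly as below. This is a minimal in-memory illustration, not the code in the repository: the real store is sqlite-backed, and `EventRow`, `vacuum`, `MAX_AGE_SECS` and `MAX_ROWS` are hypothetical names chosen for the example. Keeping a row whose timestamp sits exactly on the cutoff is also an assumption.

```rust
/// Hypothetical stand-in for one persisted event row.
#[derive(Debug, Clone, PartialEq)]
pub struct EventRow {
    pub id: u64,
    /// Unix timestamp in seconds.
    pub ts_secs: u64,
}

/// Retention limits taken from the commit message (7 days, 2000 rows).
pub const MAX_AGE_SECS: u64 = 7 * 24 * 60 * 60;
pub const MAX_ROWS: usize = 2000;

/// Two-stage sweep over an in-memory list of rows.
/// Returns the surviving rows ordered oldest to newest.
pub fn vacuum(
    mut rows: Vec<EventRow>,
    now_secs: u64,
    max_age_secs: u64,
    max_rows: usize,
) -> Vec<EventRow> {
    // Stage 1: drop everything older than the age cutoff.
    // Assumption: a row exactly at the cutoff is kept.
    let cutoff = now_secs.saturating_sub(max_age_secs);
    rows.retain(|r| r.ts_secs >= cutoff);

    // Stage 2: keep only the newest `max_rows` of what is left.
    rows.sort_by_key(|r| (r.ts_secs, r.id));
    if rows.len() > max_rows {
        let excess = rows.len() - max_rows;
        rows.drain(..excess);
    }
    rows
}
```

Calling `vacuum(rows, now, MAX_AGE_SECS, MAX_ROWS)` on a quiet agent's rows only ever hits stage 1, so it keeps its full 7-day tail. A chatty agent is bounded by stage 2 even when all of its rows are recent.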

249 commits