agents constantly emit pointer strings to /agents/<n>/state/foo.md since broker bodies cap at 1 KiB. now those tokens linkify in the message flow, question bodies, answer text, and operator inbox; clicking expands an inline <details> that lazy-fetches via the new /api/state-file?path=... endpoint. endpoint allow-list: per-agent state dirs + shared docs, both in their container-mount form (/agents/<n>/state, /shared) and host form (/var/lib/hyperhive/...). 1 MiB read cap; canonicalises before the prefix check so `..` / symlinks can't escape. legacy bare `/state/...` is deliberately not matched — ambiguous from the host's perspective (we'd need to know which agent the message references to translate). agents should use the qualified form going forward.
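a minimal sketch of the canonicalise-then-prefix-check idea, assuming nothing about the real handler: `resolve_allowed`, `read_capped`, and the `roots` parameter are hypothetical names, and truncating at the cap (rather than rejecting oversized files) is an assumption too.

```rust
use std::fs;
use std::io::{self, Read};
use std::path::{Path, PathBuf};

/// The 1 MiB read cap mentioned above.
const MAX_READ_BYTES: u64 = 1024 * 1024;

/// Return the real path of `requested` only if it sits under an allowed root.
/// `roots` (hypothetical) must already be canonical, e.g. the per-agent state
/// dirs and the shared docs dir in both mount forms.
fn resolve_allowed(requested: &Path, roots: &[PathBuf]) -> io::Result<PathBuf> {
    // canonicalize() resolves `..` and follows symlinks, so the prefix check
    // below sees the on-disk location instead of the string as typed.
    let real = fs::canonicalize(requested)?;
    // Path::starts_with compares whole components, so `/agents/a/state-x`
    // does not count as being under `/agents/a/state`.
    if roots.iter().any(|root| real.starts_with(root)) {
        Ok(real)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path outside allow-list",
        ))
    }
}

/// Read at most MAX_READ_BYTES from an already-vetted path.
/// Assumption: longer files are truncated rather than refused.
fn read_capped(path: &Path) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    fs::File::open(path)?
        .take(MAX_READ_BYTES)
        .read_to_end(&mut buf)?;
    Ok(buf)
}
```

checking the canonical path rather than the raw string is what makes `..` and symlink hops harmless: whatever the request looked like, only its resolved location is compared against the roots.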
Hyperhive TODOs
Architecture / Features
- Shared space for all agents to access documents/files without manager routing
- Private git forge agents can push to and create new repos in
- Move bind mounts in agents to `/agents/<name>/state` so path for agent = path for manager
- Broadcast messaging: allow sending messages with recipient "*" to all agents; deliver with hint "this was a broadcast and may not need any action from you"
- Multi-agent restart coordination: when rebuilding all agents, manager should start first so it can coordinate post-restart confusion (notify agents, suppress unnecessary retries, etc)
- Shared docs/skills repo (RO): a single repo on the hive forge that every agent has read-only access to, holding common references, prompts, runbooks, and "skills" the operator wants every agent to inherit without baking into the system prompt or `/shared`. Implementation likely: seed an `org-shared/docs` repo on first hive-forge boot, grant every per-agent user a read membership in the org. Agents `git clone` it (or use the API) to read; only the manager + operator can push.
- Loose-ends tracker + `get_open_threads` tool: hive-c0re already knows about pending approvals + unanswered questions; soon will also know about open PRs on hive-forge. Aggregate these into a per-agent "open threads" view (e.g. `[{kind: "approval", id: 7, summary: "spawn alice"}, {kind: "question", id: 12, asker: "alice", summary: "deploy now?"}]`). New MCP tool `mcp__hyperhive__get_open_threads` returns the list so an agent can see what's still pending against it without rebuilding context from inbox history. Manager's version includes hive-wide threads. Also surface this list on the per-agent web UI so the operator can see at a glance what each agent has hanging open: same data source as the MCP tool, just rendered into the existing per-agent dashboard page (next to inbox view / model chip / etc).
  - Scope per agent X (confirmed with operator): include BOTH (a) unanswered questions where `asker == X` (X is waiting on someone) AND (b) unanswered questions where `target == X` (X owes an answer). Distinguish via a `role: "asker" | "target"` field on the question variant so the agent can render "waiting on" vs "owe a reply" appropriately. Approvals: include rows where the submitter is X (waiting on the operator). Forge PRs (future): include open PRs where X is author OR reviewer.
  - Wire shape sketch: new `AgentRequest::GetOpenThreads` / `ManagerRequest::GetOpenThreads` returning `Response::OpenThreads { threads: Vec<OpenThread> }` with `OpenThread` as a tagged enum (`{kind: "approval", id, summary, age_seconds}` / `{kind: "question", id, role, counterparty, summary, age_seconds}` / future `{kind: "pr", ...}`). Manager flavour returns hive-wide threads (no asker/target filter). MCP tool `get_open_threads` takes no args.
  - Aggregator location: new helper on `Coordinator` (or a dedicated `open_threads.rs`) so both surfaces share the query logic; queries `approvals` + `operator_questions` tables with a single per-call sweep (no caching; call frequency is low).
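A rough sketch of the per-agent scoping rule from the open-threads item, using in-memory rows instead of the real tables. `ApprovalRow`, `QuestionRow`, and `open_threads_for` are hypothetical names, the rows are assumed to be pre-filtered to pending/unanswered, and serde tagging of the enum is left out to keep the example std-only.

```rust
/// Which side of a question the agent is on.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    Asker,  // the agent is waiting on someone
    Target, // the agent owes a reply
}

/// Mirrors the wire shape sketched in the TODO (the future PR variant is omitted).
#[derive(Debug, Clone, PartialEq)]
enum OpenThread {
    Approval { id: u64, summary: String, age_seconds: u64 },
    Question { id: u64, role: Role, counterparty: String, summary: String, age_seconds: u64 },
}

// Hypothetical row shapes; the real table columns may differ.
struct ApprovalRow { id: u64, submitter: String, summary: String, age_seconds: u64 }
struct QuestionRow { id: u64, asker: String, target: String, summary: String, age_seconds: u64 }

/// One sweep over both row sets, no caching.
fn open_threads_for(
    agent: &str,
    approvals: &[ApprovalRow],
    questions: &[QuestionRow],
) -> Vec<OpenThread> {
    let mut threads = Vec::new();
    // Approvals: only rows this agent submitted (waiting on the operator).
    for a in approvals.iter().filter(|a| a.submitter == agent) {
        threads.push(OpenThread::Approval {
            id: a.id,
            summary: a.summary.clone(),
            age_seconds: a.age_seconds,
        });
    }
    // Questions: include both directions and record which side the agent is on.
    for q in questions {
        let (role, counterparty) = if q.asker == agent {
            (Role::Asker, q.target.clone())
        } else if q.target == agent {
            (Role::Target, q.asker.clone())
        } else {
            continue; // unrelated to this agent
        };
        threads.push(OpenThread::Question {
            id: q.id,
            role,
            counterparty,
            summary: q.summary.clone(),
            age_seconds: q.age_seconds,
        });
    }
    threads
}
```

The manager flavour would presumably skip the submitter/asker/target filters and return every pending row.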
Reminder Tool
- Per-agent reminder limits (burst capacity, rate limiting)
- Scheduler shutdown: add graceful shutdown signal when coordinator is destroyed (currently runs forever)
- DB lock contention: under high reminder volume, the broker's `Mutex<Connection>` serializes every delivery transaction. Consider batching multiple deliveries into one tx, or moving reminders onto a separate sqlite connection.
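One possible shape for the per-agent reminder limits item is a token bucket per agent: burst capacity is the bucket size, and the rate limit is the refill speed. This is only a sketch; `TokenBucket`, `ReminderLimiter`, and the numbers passed to them are hypothetical, and the TODO does not commit to this algorithm.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Classic token bucket: holds up to `capacity` tokens, refilled continuously.
struct TokenBucket {
    capacity: f64,
    refill_per_sec: f64,
    tokens: f64,
    last: Instant,
}

impl TokenBucket {
    fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        let capacity = f64::from(capacity);
        // Start full so an idle agent gets its whole burst.
        TokenBucket { capacity, refill_per_sec, tokens: capacity, last: now }
    }

    /// Refill for the elapsed time, then try to spend one token.
    fn try_take(&mut self, now: Instant) -> bool {
        let elapsed = now.saturating_duration_since(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

/// One bucket per agent name, created lazily on first use.
struct ReminderLimiter {
    capacity: u32,
    refill_per_sec: f64,
    buckets: HashMap<String, TokenBucket>,
}

impl ReminderLimiter {
    fn new(capacity: u32, refill_per_sec: f64) -> Self {
        ReminderLimiter { capacity, refill_per_sec, buckets: HashMap::new() }
    }

    /// True if `agent` may schedule another reminder at time `now`.
    fn allow(&mut self, agent: &str, now: Instant) -> bool {
        let (cap, rate) = (self.capacity, self.refill_per_sec);
        self.buckets
            .entry(agent.to_string())
            .or_insert_with(|| TokenBucket::new(cap, rate, now))
            .try_take(now)
    }
}
```

Passing `now` in explicitly keeps the limiter deterministic under test; a caller would normally supply `Instant::now()`.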
Dashboard
- UI for pending reminders: show pending/queued reminders in dashboard, allow operator to view/debug/cancel
- Per-agent reminder status (pending, delivered)
- Reminder query interface for debugging
- Display reminder delivery errors (failed sends, mark failures)
- Phase 6 follow-ups: dashboard side is fully event-driven (Phase 6 leftovers landed); the per-agent web UI's lifecycle endpoints (`/api/{cancel,compact,model,new-session}`, `/login/*`) still 303-redirect-and-poll. Convert them to 200 + `data-no-refresh` so the per-agent page stops refetching `/api/state` on every operator click. `LiveEvent::Note` already covers cancel/compact/model/new-session; login state needs its own `NeedsLogin` / `LoggedIn` events on the per-agent bus.
- Tombstones + meta_inputs events: not yet event-derived. PURG3 + meta-update still trigger a post-submit `/api/state` refetch on the dashboard. Add `TombstoneAdded` / `TombstoneRemoved` + `MetaInputsChanged` so those forms can drop their refetch too and the cold-load is the only `/api/state` fetch in normal operation.
Bugs
- Post-rebuild system-message missed wake: at 09:13:14 the dashboard showed `system → damocles container rebuilt` as ✓ delivered, but the agent harness never ran a turn for it (no claude invocation, no operator-visible activity). A subsequent `recv()` from inside the agent returned `(empty)`, confirming the message was popped + marked delivered server-side, yet it drove no turn. Most likely cause: the agent_server `serve_agent_stdio` task is up and answering MCP/socket calls, but the `hive-ag3nt::serve` long-poll loop that drives `drive_turn` either died silently during rebuild or never restarted. Investigate: (a) does hive-ag3nt's serve loop survive `nixos-container update` cleanly, or does its tokio runtime get torn down mid-loop? (b) is there an early-exit path on a transient socket error during rebuild that drops the serve task without notifying the manager? (c) compare timeline with manager's own post-rebuild wake to see if this is rebuilt-agents-only or universal. Could be related to the `recv_blocking` fix in `e423d57` if the rebuild restarts the broker mid-subscribe.