agents constantly emit pointer strings to /agents/<n>/state/foo.md since broker bodies cap at 1 KiB. now those tokens linkify in the message flow, question bodies, answer text, and operator inbox; clicking expands an inline <details> that lazy-fetches via the new /api/state-file?path=... endpoint. endpoint allow-list: per-agent state dirs + shared docs, both in their container-mount form (/agents/<n>/state, /shared) and host form (/var/lib/hyperhive/...). 1 MiB read cap; canonicalises before the prefix check so `..` / symlinks can't escape. legacy bare `/state/...` is deliberately not matched — ambiguous from the host's perspective (we'd need to know which agent the message references to translate). agents should use the qualified form going forward.
44 lines
6.3 KiB
Markdown
44 lines
6.3 KiB
Markdown
# Hyperhive TODOs
|
|
|
|
## Architecture / Features
|
|
|
|
- Shared space for all agents to access documents/files without manager routing
|
|
- Private git forge agents can push to and create new repos in
|
|
- Move bind mounts in agents to `/agents/<name>/state` so path for agent = path for manager
|
|
- **Broadcast messaging**: allow sending messages with recipient "*" to all agents; deliver with hint "this was a broadcast and may not need any action from you"
|
|
- **Multi-agent restart coordination**: when rebuilding all agents, manager should start first so it can coordinate post-restart confusion (notify agents, suppress unnecessary retries, etc)
|
|
- **Shared docs/skills repo (RO)**: a single repo on the hive forge that every agent has read-only access to — common references, prompts, runbooks, "skills" the operator wants every agent to inherit without baking into the system prompt or `/shared`. Implementation likely: seed an `org-shared/docs` repo on first hive-forge boot, grant every per-agent user a read membership in the org. Agents `git clone` it (or use the API) to read; only the manager + operator can push.
|
|
- **Loose-ends tracker + `get_open_threads` tool**: hive-c0re already knows about pending approvals + unanswered questions; soon will also know about open PRs on hive-forge. Aggregate these into a per-agent "open threads" view (e.g. `[{kind: "approval", id: 7, summary: "spawn alice"}, {kind: "question", id: 12, asker: "alice", summary: "deploy now?"}]`). New MCP tool `mcp__hyperhive__get_open_threads` returns the list so an agent can see what's still pending against it without rebuilding context from inbox history. Manager's version includes hive-wide threads. **Also surface this list on the per-agent web UI** so the operator can see at a glance what each agent has hanging open — same data source as the MCP tool, just rendered into the existing per-agent dashboard page (next to inbox view / model chip / etc).
|
|
- **Scope per agent X (confirmed with operator):** include BOTH (a) unanswered questions where `asker == X` (X is waiting on someone) AND (b) unanswered questions where `target == X` (X owes an answer). Distinguish via a `role: "asker" | "target"` field on the question variant so the agent can render "waiting on" vs "owe a reply" appropriately. Approvals: include rows where the submitter is X (waiting on the operator). Forge PRs (future): include open PRs where X is author OR reviewer.
|
|
- **Wire shape sketch:** new `AgentRequest::GetOpenThreads` / `ManagerRequest::GetOpenThreads` returning `Response::OpenThreads { threads: Vec<OpenThread> }` with `OpenThread` as a tagged enum (`{kind: "approval", id, summary, age_seconds}` / `{kind: "question", id, role, counterparty, summary, age_seconds}` / future `{kind: "pr", ...}`). Manager flavour returns hive-wide threads (no asker/target filter). MCP tool `get_open_threads` takes no args.
|
|
- **Aggregator location:** new helper on `Coordinator` (or a dedicated `open_threads.rs`) so both surfaces share the query logic; queries `approvals` + `operator_questions` tables with a single per-call sweep (no caching — call frequency is low).
|
|
|
|
## Reminder Tool
|
|
|
|
- Per-agent reminder limits (burst capacity, rate limiting)
|
|
- **Scheduler shutdown**: add graceful shutdown signal when coordinator is destroyed (currently runs forever)
|
|
- **DB lock contention**: under high reminder volume, the broker's `Mutex<Connection>` serializes every delivery transaction. Consider batching multiple deliveries into one tx, or moving reminders onto a separate sqlite connection.
|
|
|
|
## Dashboard
|
|
|
|
<!-- Landed: dashboard questions pane now shows both operator-targeted
|
|
and peer (agent-to-agent) threads, with filter chips (all /
|
|
@operator / @peer / per-participant) and an 0V3RR1D3 button on
|
|
peer rows so the operator can answer when an agent is stuck. -->
|
|
<!-- Landed: PATH_RE-detected pointer strings in message + question +
|
|
answer + inbox bodies linkify to inline <details> previews
|
|
fetched from /api/state-file. Allow-list: `/agents/<n>/state/...`
|
|
+ `/var/lib/hyperhive/agents/<n>/state/...` + `/shared/...` +
|
|
`/var/lib/hyperhive/shared/...`. Legacy bare `/state/...` is
|
|
intentionally NOT matched (ambiguous from host's perspective);
|
|
prefer `/agents/<n>/state/...` in agent outputs. -->
|
|
- **UI for pending reminders**: show pending/queued reminders in dashboard, allow operator to view/debug/cancel
|
|
- Per-agent reminder status (pending, delivered)
|
|
- Reminder query interface for debugging
|
|
- Display reminder delivery errors (failed sends, mark failures)
|
|
- **Phase 6 follow-ups** — dashboard side is fully event-driven (Phase 6 leftovers landed); the per-agent web UI's lifecycle endpoints (`/api/{cancel,compact,model,new-session}`, `/login/*`) still 303-redirect-and-poll. Convert them to 200 + `data-no-refresh` so the per-agent page stops refetching `/api/state` on every operator click — `LiveEvent::Note` already covers cancel/compact/model/new-session, login state needs its own `NeedsLogin` / `LoggedIn` events on the per-agent bus.
|
|
- **Tombstones + meta_inputs events**: not yet event-derived. PURG3 + meta-update still trigger a post-submit `/api/state` refetch on the dashboard. Add `TombstoneAdded`/`TombstoneRemoved` + `MetaInputsChanged` so those forms can drop their refetch too and the cold-load is the only `/api/state` fetch in normal operation.
|
|
|
|
## Bugs
|
|
|
|
- **Post-rebuild system-message missed wake**: at 09:13:14 the dashboard showed `system → damocles container rebuilt` as ✓ delivered, but the agent harness never ran a turn for it (no claude invocation, no operator-visible activity). A subsequent `recv()` from inside the agent returned `(empty)`, confirming the message was popped + marked delivered server-side — yet drove no turn. Most likely cause: the agent_server `serve_agent_stdio` task is up and answering MCP/socket calls, but the `hive-ag3nt::serve` long-poll loop that drives `drive_turn` either died silently during rebuild or never restarted. Investigate: (a) does hive-ag3nt's serve loop survive `nixos-container update` cleanly, or does its tokio runtime get torn down mid-loop? (b) is there an early-exit path on a transient socket error during rebuild that drops the serve task without notifying the manager? (c) compare timeline with manager's own post-rebuild wake to see if this is rebuilt-agents-only or universal. Could be related to the `recv_blocking` fix in `e423d57` if the rebuild restarts the broker mid-subscribe.
|