Commit graph

51 commits

Author SHA1 Message Date
iris
91bfa269fd add reminder rollup RPC and broker query
Surface reminder activity statistics (scheduled, delivered, pending counts)
for each agent over configurable time windows. Needed by the per-agent
stats page to display reminder metrics.

Adds:
- ReminderStats struct and ReminderRollup request/response variants
- Broker::reminder_rollup_for(agent, since_secs) method
- Agent and manager socket handlers for the new RPC
- SocketReply mapping for response conversion
2026-05-20 13:24:17 +02:00
damocles
0a79912b67 get_logs: resolve machine name via container_name like every other verb 2026-05-20 10:48:24 +02:00
damocles
f8795dc029 fix: request_apply_commit resolves sha locally + rejects non-sha refs 2026-05-20 09:48:05 +02:00
damocles
5d27ae3048 recv: fold batch drain into recv(max) — one tool, uniform list response 2026-05-19 01:07:30 +02:00
damocles
77b89bf2c6 broker: recv_batch(max) — drain a bursty inbox in one round-trip 2026-05-19 00:47:21 +02:00
damocles
690cb5ab5b broker: lease-style delivery — ack_turn + requeue_inflight close the no-drop loop 2026-05-18 22:01:48 +02:00
damocles
6e23d087d2 rename: open_threads → loose_ends + cancel_thread → cancel_loose_end across wire / tools / web ui 2026-05-18 18:24:09 +02:00
damocles
b1d0a62cb9 cancel_thread: new mcp tool — unify reminder + question cancel on both surfaces 2026-05-18 18:24:09 +02:00
damocles
d395bdc945 whoami: drop operator_pronouns (redundant — already in system prompts at boot) 2026-05-18 00:04:58 +02:00
damocles
3c66cb6707 whoami: new mcp tool returning name/role/pronouns/hyperhive_rev on both surfaces 2026-05-18 00:04:58 +02:00
müde
8f5752980f turn_stats: per-turn analytics sink
new sqlite table at /state/hyperhive-turn-stats.sqlite on each
agent's state dir. one row per claude turn captures identity
(model, wake_from, result_kind), timing (started/ended_at,
duration_ms), cost (input/output/cache_read/cache_creation token
counts), behaviour (tool_call_count + per-tool breakdown JSON),
and post-turn snapshot metrics (open_threads_count,
open_reminders_count).

wire additions:
- AgentRequest/ManagerRequest::CountPendingReminders +
  Broker::count_pending_reminders_for(agent)
- Bus::observe_stream + take_tool_calls — pumps the existing
  stdout stream-json, picks out tool_use blocks, accumulates per
  turn. bin loops fold the breakdown into each row.
- TurnStats::open_default + TurnStatRow + record() — best-effort
  inserts; failures log + don't block the harness.

both ag3nt and m1nd bins capture started_at + duration via
Instant::elapsed, fetch open-thread + reminder counts from
hive-c0re via the existing socket (post-turn, best-effort), and
record one row at turn_end. record_kind splits ok / failed /
prompt_too_long; failures carry the error message in note.

todo entries for host-side vacuum sweep + reading the table back
into agent/dashboard badges.
2026-05-17 23:00:41 +02:00
damocles
dc1ce1f236 open_threads: new get_open_threads MCP tool on agent + manager surfaces 2026-05-17 22:52:08 +02:00
damocles
15f141801b limits: raise message body cap 1k → 4k (catches ~95% of conversational overflow) 2026-05-17 22:15:25 +02:00
damocles
82b0877c47 ask: rename ask_operator → ask + optional 'to' for agent-to-agent Q&A 2026-05-17 13:20:32 +02:00
damocles
1770b51845 manager mcp: expose 'remind' tool sharing storage helper with agent surface 2026-05-17 11:43:14 +02:00
damocles
1023acf69f add get_logs tool to manager mcp surface 2026-05-16 20:45:19 +02:00
damocles
4a8a668348 feat: add optional description to request_apply_commit and request_spawn 2026-05-16 15:18:32 +02:00
damocles
7e9fd8e978 agent: add Remind request + ReminderTiming enum (stub implementation) 2026-05-16 12:49:59 +02:00
damocles
862bc1de44 Revert "agent: add Wake command - co-process self-wake via agent socket"
This reverts commit 68a9b8575b1647643c87bd753767acabf96528c3.
2026-05-16 12:49:59 +02:00
damocles
f0e87f0bc5 agent: add Wake command - co-process self-wake via agent socket 2026-05-16 12:49:59 +02:00
müde
90df2106bf agent socket: external wake-up path for in-container MCP servers
new AgentRequest::Wake { from, body } drops a message into
this agent's inbox via the per-agent socket. matrix-style MCP
servers can use it when they receive an external event
(matrix message, webhook, scrape result) to nudge claude
into running a turn. broker.send wakes whatever Recv is
currently long-polling, the harness picks the message up,
formats a wake prompt with the caller's chosen from label
('matrix: new dm', 'webhook: deploy succeeded', etc.).

new `hive-ag3nt wake --from <label> --body <text>` subcommand
on the harness binary so MCP servers can shell out instead of
implementing the line-JSON protocol themselves; body=='-'
reads from stdin for multi-line / quoting-friendly payloads.

identity = socket: anything that can connect to /run/hive/mcp
.sock is implicitly trusted to inject. that's fine because the
bind-mount is the agent's own container; no new auth surface
opens up.

docs/turn-loop.md gets a new 'Waking the agent from inside
the container' section pointing at both paths (CLI + raw
JSON).
2026-05-16 03:15:58 +02:00
müde
2a6d084718 ask_operator: any agent can call it, answer routes by asker
new AgentRequest::AskOperator + AgentResponse::QuestionQueued on
the per-agent socket — same shape as the manager flavor, agent
gets the same wire surface (still uses the same operator_questions
table). agent_server::dispatch wires AskOperator through coord
.questions.submit(agent, ...) so the row's asker is the sub-agent
name; the ttl watchdog already in manager_server gets shared and
spawn_question_watchdog goes pub.

answer routing: operator_questions::answer now returns (question,
asker). post_answer_question + post_cancel_question + the watchdog
fire OperatorAnswered through new coord.notify_agent(asker, event)
instead of always notify_manager — the event lands in whichever
agent originally asked. notify_manager is now a thin wrapper.

agent socket plumbing: agent_server::start takes Arc<Coordinator>
instead of Arc<Broker> so dispatch has access to questions +
notify path; coordinator::{register_agent,ensure_runtime} take
self: &Arc<Self>. mcp::AgentServer grows the ask_operator tool;
allowed_mcp_tools(Agent) adds it; prompts/agent.md replaces the
'message the manager to ask the operator' guidance with the
direct tool description.
2026-05-16 01:48:10 +02:00
müde
fc61cb9310 fmt: clippy doc_markdown backticks 2026-05-15 23:11:10 +02:00
müde
871e7bf3fa wire types: add sha + tag to Approval and HelperEvent
approval grows fetched_sha (canonical hive-c0re-vouched sha,
distinct from manager-supplied commit_ref). helperevent
{approvalresolved,spawned,rebuilt} grow optional sha + tag so
the manager can git-show the exact tree it's hearing about
(against the upcoming /agents/<n>/applied.git RO mount) and
know which terminal tag landed. all serde-defaulted; existing
construction sites pass none until the tag-driven flow lands.
2026-05-15 22:47:39 +02:00
müde
80229c6af9 manager: needs_login / logged_in / needs_update events + update tool
crash_watch grows two more state-axes alongside running/stopped:

- logged-in (claude session dir populated for the agent)
- up-to-date (recorded flake rev matches current)

per-tick transitions emit HelperEvent::NeedsLogin / LoggedIn /
NeedsUpdate. seed-on-first-tick semantics retained — nothing fires
on harness boot for agents that were already in their state. only
needs_update fires the 'stale appeared' direction; the resolved
direction is already covered by Rebuilt.

new mcp__hyperhive__update(name) on the manager surface: idempotent
rebuild via auto_update::rebuild_agent. transient-aware (Rebuilding)
so the dashboard shows the spinner. login intentionally has NO tool
— it's interactive OAuth, only the operator can complete it.

prompts + approvals doc + turn-loop doc updated. todo grows a
'show per-agent applied config in dashboard' entry (separate
follow-up).
2026-05-15 21:42:13 +02:00
müde
58c3cd853b container crash watcher → HelperEvent::ContainerCrash
new hive_c0re::crash_watch task polls every 10s, builds the set of
currently-running containers, and on running→stopped transitions
checks the transient snapshot: if no Stopping / Restarting /
Destroying / Rebuilding flag is set, the container exited
unexpectedly and we fire HelperEvent::ContainerCrash into the
manager's inbox so it can react (typically: start it again).

first poll is a seeding pass — no events on harness startup. dbus
subscription would be lower-latency but polling is honest and
debuggable, and a 10s delay on crash detection is fine for our
scale.

manager prompt + approvals doc updated to advertise the new
event variant. todo drops the entry (and the journald-viewer
entry that already shipped).
2026-05-15 21:02:05 +02:00
müde
f65ee88269 recv: optional wait_seconds parameter, capped at 60s
AgentRequest::Recv and ManagerRequest::Recv grow an optional
wait_seconds field (default None → 30s, capped at 60s server-side).
agent_server / manager_server clamp via recv_timeout(). MCP tool
schemas advertise the param so claude can pick its own poll window
— useful when an agent wants to throttle wakes without entering a
distinct nap state.

both harness loops still pass None, keeping the existing 30s
default behaviour for system-level Recvs.
2026-05-15 20:49:33 +02:00
müde
754db7830e ask_operator: ttl_seconds auto-cancel + remaining-time chip
manager can pass ttl_seconds to ask_operator. on submit, host
stores deadline_at = now + ttl in operator_questions (new column,
migrated via existing pragma_table_info pattern), spawns a tokio
task that sleeps until the deadline then resolves the question with
answer '[expired]' and fires the same OperatorAnswered helper event.
already-resolved races no-op silently.

dashboard renders a ' MM:SS' chip on the question row when
deadline_at is set. format collapses seconds → s, < 1h → m s, ≥ 1h
→ h m. heartbeat refresh (5s) keeps the chip current; the operator
sees it tick down.

manager prompt + mcp tool description updated. journald viewer per
container queued in todo (separate task).
2026-05-15 20:38:02 +02:00
müde
538e0446d7 agent page: inbox view of last 30 messages addressed to this agent
new wire request AgentRequest::Recent { limit } / ManagerRequest::Recent
(plus matching responses with Vec<InboxRow>). InboxRow moved to
hive-sh4re so it lives on both surfaces without an internal-to-wire
conversion. host-side dispatch in agent_server / manager_server
calls broker.recent_for(name, limit).

per-agent web_ui /api/state grew an inbox: Vec<InboxRow> populated
via the same per-agent socket (best-effort; transport failure
returns empty). frontend renders as a collapsible <details> section
between the state row and the terminal — fmt timestamp / from /
body in a tight grid, capped at 16em scrollable. only visible when
there are rows.
2026-05-15 20:32:19 +02:00
müde
8344dd9ab7 ask_operator: multi-select + free-text fallback
ask_operator now accepts a multi: bool. when true and options is
non-empty, the dashboard renders the choices as checkboxes — operator
picks any subset, answer comes back as a ', '-joined string. when
false (default), options are radio buttons.

independent of multi, a free-text input ('or type your own…') is
always rendered alongside options so the operator is never trapped
by an incomplete list. submit merges checked options + free text into
the single 'answer' field.

schema migration: operator_questions grows a multi INTEGER column
with a one-shot ALTER TABLE on open. backward compatible — old rows
default to 0 (not multi).

prompt + mcp tool description updated; existing dashboard css for
.qform was rewritten around the new vertical layout.
2026-05-15 19:52:44 +02:00
müde
48ebfefd1a destroy --purge: also wipe agent state dirs
new --purge flag on the destroy verb (cli + admin socket + dashboard).
default destroy still keeps /var/lib/hyperhive/{agents,applied}/<name>/
so recreating with the same name reuses prior config + creds.
with --purge, both dirs go too (config history, claude creds, /state/
notes). no undo. dashboard adds a separate PURG3 button with an
explicit confirmation copy; the existing DESTR0Y button keeps the
soft semantics.

claude.md dashboard-action-surface section updated; todo entry
dropped.
2026-05-15 19:29:14 +02:00
müde
ac1b5fde8e manager: start/restart at will, no approval; refuse self
new manager tools mcp__hyperhive__{start,restart} that delegate to the
existing lifecycle::start / lifecycle::restart on the host. kill was
already at the manager's discretion; rounding out start + restart for
parity so day-to-day container care doesn't have to round-trip through
the operator.

guard: refuse self-targeting on kill/start/restart — the manager would
just be cutting its own legs. spawn (request_spawn) and config changes
(request_apply_commit) still go through the approval queue, since those
are the actual gate. prompt + claude.md updated to make the boundary
explicit. kill now also emits HelperEvent::Killed (it didn't before).
2026-05-15 18:57:25 +02:00
müde
2770630f33 ask_operator tool: non-blocking; operator answer arrives as helper event
new mcp tool on the manager surface that queues a question on the
dashboard and returns the question id immediately. operator submits an
answer via /answer-question/<id>; the dashboard fires
HelperEvent::OperatorAnswered { id, question, answer } into the manager
inbox so the next turn picks it up.

also: fix async-form button stuck on spinner after successful submit
(refreshState skipped re-rendering, so the button was never re-enabled).
2026-05-15 18:44:42 +02:00
müde
37c6504462 manager events: Spawned/Rebuilt/Killed/Destroyed + start button 2026-05-15 17:38:41 +02:00
müde
06ea0cf283 operator inbox view on dashboard; agent ui doesn't clobber typing 2026-05-15 17:23:53 +02:00
müde
e1289a3e4c nix templates: factor harness-base.nix (shared scaffolding incl. gitconfig) 2026-05-15 16:10:55 +02:00
müde
409263f1c9 operator input: per-agent /send form (dashboard T4LK removed) 2026-05-15 15:28:17 +02:00
müde
accb1445e3 claude: pipe prompt via stdin (variadic --allowedTools was eating it); + ManagerRequest::Status 2026-05-15 15:06:09 +02:00
müde
3c9d42b2a7 agent loop: claude drives; tool envelope (log/run/status/log) 2026-05-15 14:54:10 +02:00
müde
c59fa8541c phase 8 step 2: approval-gated spawn + dashboard spinner 2026-05-15 12:53:13 +02:00
müde
a42fdb3a5c phase 8 step 1: per-agent claude creds bind + destroy keeps state 2026-05-15 12:39:22 +02:00
müde
b711296460 destroy verb: CLI + admin socket + dashboard button; purges state + approvals 2026-05-15 02:57:22 +02:00
müde
7c1ed07cf2 lifecycle: HYPERHIVE_GIT env override (bypass PATH); module sets it 2026-05-15 00:24:51 +02:00
müde
fef2dee92a clippy pedantic clean + wired into flake checks 2026-05-14 22:57:47 +02:00
müde
f12837fe32 Phase 5a: approval queue (request_apply_commit, pending/approve/deny) 2026-05-14 22:50:19 +02:00
müde
4a73340150 fmt 2026-05-14 22:38:07 +02:00
müde
aa67e5a481 Phase 4: manager socket + manager_server with privileged tool surface 2026-05-14 22:35:08 +02:00
müde
4545c08908 hive-sh4re: per-agent socket protocol (Message/AgentRequest/AgentResponse) 2026-05-14 21:40:38 +02:00
müde
0ec54ecf89 hive-c0re: admin socket server + client (stub dispatch) 2026-05-14 20:49:11 +02:00
müde
c67584c7c1 flake: expose hyperhive package + nixos module + agent-base container 2026-05-14 20:33:25 +02:00