müde 90df2106bf agent socket: external wake-up path for in-container MCP servers

new AgentRequest::Wake { from, body } drops a message into
this agent's inbox via the per-agent socket. matrix-style MCP
servers can use it when they receive an external event
(matrix message, webhook, scrape result) to nudge claude
into running a turn. broker.send wakes whatever Recv is
currently long-polling, the harness picks the message up,
formats a wake prompt with the caller's chosen from label
('matrix: new dm', 'webhook: deploy succeeded', etc.).

new `hive-ag3nt wake --from <label> --body <text>` subcommand
on the harness binary so MCP servers can shell out instead of
implementing the line-JSON protocol themselves; body=='-'
reads from stdin for multi-line / quoting-friendly payloads.

identity = socket: anything that can connect to /run/hive/mcp
.sock is implicitly trusted to inject. that's fine because the
bind-mount is the agent's own container; no new auth surface
opens up.

docs/turn-loop.md gets a new 'Waking the agent from inside
the container' section pointing at both paths (CLI + raw
JSON).

2026-05-16 03:15:58 +02:00

9.4 KiB

Raw Blame History

Turn loop + MCP

How the harness wakes up, what it asks claude to do, and what tools claude has access to in return.

The loop

Each agent harness (hive-ag3nt serve or hive-m1nd serve) runs:

Long-poll Recv on its socket. The host-side broker (broker.rs::recv_blocking) returns immediately if there's a pending message, otherwise waits up to 30 s for a broker Sent event for this recipient.
Pop one message. Peek the remaining inbox depth with Status.
Emit LiveEvent::TurnStart { from, body, unread } onto the SSE bus.
Spawn claude (one process per turn) and pipe the wake prompt over stdin.
Stream stdout (JSON lines) into the bus as LiveEvent::Stream(value). Pump stderr as Note.
Wait for claude to exit. On Prompt is too long, run /compact on the session once and retry the turn.
Emit LiveEvent::TurnEnd { ok, note }. Sleep poll_ms to avoid tight loops on transient failures.

The claude invocation

claude --print --verbose --output-format stream-json --model <name> \
  --continue --settings /run/hive/claude-settings.json \
  --system-prompt-file /run/hive/claude-system-prompt.md \
  --mcp-config /run/hive/claude-mcp-config.json --strict-mcp-config \
  --tools <builtins> --allowedTools <builtins+mcp>
# wake prompt piped over stdin

<name> is read from Bus::model() on each turn, default haiku. Operator can flip it at runtime with /model <name> in the web terminal — the next turn picks it up. The choice is persisted to /state/hyperhive-model so it survives restart; override path: HYPERHIVE_MODEL_FILE env var for tests.

--continue keeps a persistent session per agent (claude stores sessions in ~/.claude/projects/, which is bind-mounted persistently). Auto-compact and auto-memory are disabled via --settings because hyperhive owns compaction (/compact on overflow, retry once; operator can also force one via /api/compact). A one-shot --continue suppression is available via POST /api/new-session (or /new-session slash command in the per-agent terminal) — Bus::take_skip_continue() flips an AtomicBool once per turn, the next claude invocation drops --continue, every subsequent turn resumes normal behaviour.

The child runs with cwd = /state (when the bind exists; falls back to the parent's cwd in dev), so any relative path in a tool call (Read foo.md, Bash ls, Write notes.md) lands in the agent's durable bind-mounted dir. CLAUDE.md auto-load walks upward from /state — drop a per-agent CLAUDE.md there if you want long-term hints that survive destroy/recreate.

The wake prompt is intentionally minimal: just the popped message's from/body, plus an inline ({unread} more pending — drain via …) hint when unread > 0. Claude drives any further recv/send itself via the embedded MCP server.

Whenever hive-c0re starts / restarts / rebuilds a container, it also drops a system message into the agent's inbox via Coordinator::kick_agent — a one-line "you were just (re)started, check /state/ for your notes, --continue session is intact". The next turn picks it up like any other inbox message.

On-boot files

hive_ag3nt::turn::write_* writes three files next to the per-agent socket at /run/hive/ once at startup:

claude-mcp-config.json — re-invokes the running binary as mcp child (so the same binary serves as harness + as claude's MCP child process).
claude-settings.json — the --settings blob (auto-compact and auto-memory off, effortLevel medium).
claude-system-prompt.md — rendered from hive-ag3nt/prompts/{agent,manager}.md with {label} and {operator_pronouns} substituted. Pronouns come from HIVE_OPERATOR_PRONOUNS env (set by the meta flake from services.hive-c0re.operatorPronouns, default she/her). Passed via --system-prompt-file.

The shared per-turn plumbing lives in hive_ag3nt::turn::{write_mcp_config, write_settings, write_system_prompt, run_turn, drive_turn, emit_turn_end, wait_for_login, compact_session} so the two binaries can't drift.

MCP surface

The harness ships an embedded MCP server (rmcp 1.7). Claude launches it as a stdio child via --mcp-config. The hyperhive socket name is hyperhive, so the tools land in claude as mcp__hyperhive__<tool>.

Sub-agent tools

send(to, body) — message a peer (logical agent name), another agent, or the operator (recipient operator, surfaces in the dashboard inbox).
recv(wait_seconds?) — drain one inbox message. Long-polls server-side; wait_seconds is capped at 180 (default 30 when omitted). Agents use a long wait to park their turn waiting for work instead of busy-looping with short polls — they wake instantly when a message arrives.
ask_operator(question, options?, multi?, ttl_seconds?) — surface a question on the dashboard. Same shape as the manager's; answer routes back to the asker's own inbox as HelperEvent::OperatorAnswered via coord.notify_agent.

Waking the agent from inside the container

External MCP servers (and any other in-container process) can inject a wake-up event into the agent's inbox via the per-agent socket at /run/hive/mcp.sock. Two equivalent paths:

Shell out to hive-ag3nt wake --from <label> --body <text> (use --body - to read body from stdin). Already on the container's PATH since the harness binary is in systemPackages. Convenient for shell-script integrations.
Speak the wire protocol directly — JSON-line over the unix socket: {"cmd":"wake","from":"matrix","body":"new dm from @alice"}\n. Same shape any other AgentRequest uses; see hive-sh4re::AgentRequest::Wake.

The wake event lands in the broker as {from:<label>, to:<agent>, body}, which wakes whatever recv call the harness is currently blocked on. Next turn fires with the wake prompt formed from that message — claude sees "from: matrix" (or whatever label) and reacts.

Identity = socket: anything that can connect to /run/hive/mcp.sock is implicitly trusted to inject these, which is fine because the bind-mount is the agent's own container only.

Extra MCP servers (per-agent)

Each agent's NixOS config can declare additional MCP servers via hyperhive.extraMcpServers.<key> = { command, args, env, allowedTools }. The module writes the map to /etc/hyperhive/extra-mcp.json; the harness reads it at boot and merges every entry into --mcp-config (under mcpServers.<key>) and --allowedTools (as mcp__<key>__<pattern>). The agent's flake.nix forwards every flake input to agent.nix as the flakeInputs module arg, so external MCP-server flakes are pulled in by adding them to inputs.* and referenced as flakeInputs.<name>.packages.${pkgs.system}.default — the resolved sha lands in the agent's own flake.lock and rolls up to meta's.

Manager tools (in addition to send/recv)

request_spawn(name) — queue a Spawn approval for a brand-new sub-agent (≤9 char name). Operator approves on the dashboard.
kill(name) — graceful stop. No approval required.
start(name) — start a stopped sub-agent. No approval.
restart(name) — stop + start. No approval.
update(name) — rebuild (re-applies the current hyperhive flake
- agent.nix, restarts). No approval, idempotent. Manager calls this on receipt of a needs_update system event.
request_apply_commit(agent, commit_ref) — submit a config change for any agent (hm1nd for the manager's own config) for operator approval.
ask_operator(question, options?, multi?, ttl_seconds?) — surface a question on the dashboard. Non-blocking — returns the queued question id; the operator's answer arrives later as HelperEvent::OperatorAnswered in the manager inbox. Options always render alongside a free-text fallback; multi=true renders options as checkboxes. ttl_seconds auto-cancels with answer [expired] after the deadline (useful for time-sensitive decisions that become moot if the operator hasn't responded). The operator can also manually cancel with [cancelled] via the dashboard.

The boundary: lifecycle ops on existing sub-agents (kill/start/restart) are at the manager's discretion — no operator approval. Creating a new agent (request_spawn) and changing any agent's config (request_apply_commit) still go through the approval queue.

Authoritative state

hive_ag3nt::events::Bus carries the current turn-loop state in addition to the broadcast channel and the events history. Variants:

Idle — sitting on Recv waiting for mail.
Thinking — claude --print is running for a turn.
Compacting — operator-triggered /compact is in flight.

The harness flips state at the relevant transitions (set_state(Thinking) before drive_turn, set_state(Idle) after; set_state(Compacting) around compact_session). Exposed via /api/state.turn_state + turn_state_since (unix seconds); the agent page renders this rather than deriving from SSE events.

Tool envelope

mcp::run_tool_envelope: every MCP tool handler logs the request, runs the body, logs the result. Pre-/post-log only — the inbox status hint moved to the wake prompt + UI header.

Tool whitelist (`mcp::ALLOWED_BUILTIN_TOOLS`)

Allowed built-ins: Bash, Edit, Glob, Grep, Read, TodoWrite, Write.
Denied by omission: WebFetch, WebSearch, Task, NotebookEdit.
Allowed MCP tools: as listed above per flavor.

Bash is on the allow-list pending a finer-grained pattern allow-list (Bash(git *)-style) — see TODO.

9.4 KiB Raw Blame History