ask_operator: any agent can call it, answer routes by asker

new AgentRequest::AskOperator + AgentResponse::QuestionQueued on
the per-agent socket — same shape as the manager flavor, agent
gets the same wire surface (still uses the same operator_questions
table). agent_server::dispatch wires AskOperator through coord
.questions.submit(agent, ...) so the row's asker is the sub-agent
name; the ttl watchdog already in manager_server gets shared and
spawn_question_watchdog goes pub.

answer routing: operator_questions::answer now returns (question,
asker). post_answer_question + post_cancel_question + the watchdog
fire OperatorAnswered through new coord.notify_agent(asker, event)
instead of always notify_manager — the event lands in whichever
agent originally asked. notify_manager is now a thin wrapper.

agent socket plumbing: agent_server::start takes Arc<Coordinator>
instead of Arc<Broker> so dispatch has access to questions +
notify path; coordinator::{register_agent,ensure_runtime} take
self: &Arc<Self>. mcp::AgentServer grows the ask_operator tool;
allowed_mcp_tools(Agent) adds it; prompts/agent.md replaces the
'message the manager to ask the operator' guidance with the
direct tool description.
This commit is contained in:
müde 2026-05-16 01:48:10 +02:00
parent 6b3ef4549c
commit 2a6d084718
9 changed files with 156 additions and 43 deletions

View file

@ -4,11 +4,10 @@ Tools (hyperhive surface):
- `mcp__hyperhive__recv(wait_seconds?)` — drain one more message from your inbox (returns `(empty)` if nothing pending after the wait). Without `wait_seconds` it long-polls 30s. To **wait** for work when you have nothing else useful to do this turn, call with a long wait (e.g. `wait_seconds: 180`, the max) — you'll be woken instantly when a message arrives, otherwise return after the timeout. That is strictly better than calling `recv` repeatedly with short waits: lower latency on new work, fewer turns, no busy-loop. Never use a fixed `sleep` shell command for the same purpose.
- `mcp__hyperhive__send(to, body)` — message a peer (by their name) or the operator (recipient `operator`, surfaces in the dashboard).
- `mcp__hyperhive__ask_operator(question, options?, multi?, ttl_seconds?)` — surface a question to the human operator on the dashboard. Returns immediately with a question id — do NOT wait inline. When the operator answers, a system message with event `operator_answered { id, question, answer }` lands in your inbox; handle it on a future turn. Use this for clarifications, permission for risky actions, or choice between options. `options` is advisory: a short fixed-choice list when applicable, otherwise leave empty for free text. `multi: true` lets the operator pick multiple (checkboxes), answer comes back comma-joined. `ttl_seconds` auto-cancels with answer `[expired]` when the decision becomes moot.
Need new packages, env vars, or other NixOS config for yourself? You can't edit your own config directly — message the manager (recipient `manager`) describing what you need + why. The manager evaluates the request (it doesn't rubber-stamp), edits `/agents/{label}/config/agent.nix` on your behalf, commits, and submits an approval that the operator can accept on the dashboard; on approve hive-c0re rebuilds your container with the new config.
Need to ask the human operator a question (clarification, permission, choice)? You don't have direct operator access — ask the manager to surface the question on your behalf ("please ask the operator: …"). The manager has a channel for this.
Durable knowledge: write to `/state/notes.md` (free-form) or any other path under `/state/`. That directory is bind-mounted from the host and persists across container destroy/recreate — claude's `--continue` session only carries short-term context, but `/state/` is forever. Read it back at the start of relevant turns to remember things across resets.
Keep messages short — a few sentences each. For anything big (file listings, long diffs, transcripts, analysis): write the payload to `/state/<descriptive-name>` and `send` a short pointer ("dropped the cluster audit in /state/cluster-audit-2026-05.md, headline: 3 nodes over 80% mem"). The manager + operator can read your `/state/` from the host as `/agents/{label}/state/`. Sub-agent peers can't read each other's `/state/` directly — go through the manager if a payload needs to reach another sub-agent.

View file

@ -125,7 +125,10 @@ async fn serve(
bus.set_state(TurnState::Idle);
}
Ok(AgentResponse::Empty) => {}
Ok(AgentResponse::Ok | AgentResponse::Status { .. } | AgentResponse::Recent { .. }) => {
Ok(AgentResponse::Ok
| AgentResponse::Status { .. }
| AgentResponse::Recent { .. }
| AgentResponse::QuestionQueued { .. }) => {
tracing::warn!("recv produced unexpected response kind");
}
Ok(AgentResponse::Err { message }) => {

View file

@ -50,6 +50,7 @@ impl From<hive_sh4re::AgentResponse> for SocketReply {
hive_sh4re::AgentResponse::Empty => Self::Empty,
hive_sh4re::AgentResponse::Status { unread } => Self::Status(unread),
hive_sh4re::AgentResponse::Recent { rows } => Self::Recent(rows),
hive_sh4re::AgentResponse::QuestionQueued { id } => Self::QuestionQueued(id),
}
}
}
@ -163,6 +164,45 @@ impl AgentServer {
.await
}
#[tool(
description = "Surface a question to the operator on the dashboard. Returns immediately \
with a question id do NOT wait inline. When the operator answers, a system message \
with event `operator_answered { id, question, answer }` lands in your inbox; handle it \
on a future turn. Use this when a decision needs human signal (ambiguous scope, \
permission to do something risky, choosing between options). `options` is advisory: \
pass a short fixed-choice list when applicable, otherwise leave empty for free text. \
Set `multi: true` to let the operator pick multiple options (checkboxes); the answer \
comes back as a comma-separated string. Set `ttl_seconds` to auto-cancel a \
no-longer-relevant question on expiry the answer is `[expired]` and the same \
`operator_answered` event fires."
)]
async fn ask_operator(&self, Parameters(args): Parameters<AskOperatorArgs>) -> String {
let log = format!("{args:?}");
run_tool_envelope("ask_operator", log, async move {
let resp = client::request::<_, hive_sh4re::AgentResponse>(
&self.socket,
&hive_sh4re::AgentRequest::AskOperator {
question: args.question,
options: args.options,
multi: args.multi,
ttl_seconds: args.ttl_seconds,
},
)
.await
.map(SocketReply::from);
match resp {
Ok(SocketReply::QuestionQueued(id)) => format!(
"question queued (id={id}); operator's answer will arrive as a system \
`operator_answered` event in your inbox"
),
Ok(SocketReply::Err(m)) => format!("ask_operator failed: {m}"),
Ok(other) => format!("ask_operator unexpected response: {other:?}"),
Err(e) => format!("ask_operator transport error: {e:#}"),
}
})
.await
}
#[tool(
description = "Pop one message from this agent's inbox. Returns the sender and body, \
or an empty marker if nothing is waiting. Optional `wait_seconds` long-polls \
@ -527,7 +567,7 @@ pub enum Flavor {
#[must_use]
pub fn allowed_mcp_tools(flavor: Flavor) -> Vec<String> {
let names: &[&str] = match flavor {
Flavor::Agent => &["send", "recv"],
Flavor::Agent => &["send", "recv", "ask_operator"],
Flavor::Manager => &[
"send",
"recv",