recv: None = peek, positive value = opt-in long-poll

old behavior: omitted wait_seconds fell through to the 30s RECV_LONG_POLL_DEFAULT, so claude asking 'is there anything in my inbox right now?' between actions blocked the turn for half a minute.

flip the semantics: None (or 0) returns immediately, a positive value parks up to MAX (180s, unchanged). cleaner 'peek vs wait' distinction; tool descriptions + agent/manager prompts updated to point at the new shape.

the harness's own serve loops in hive-ag3nt + hive-m1nd relied on the old default for their inbox poll. they now explicitly pass wait_seconds: Some(180) to opt into the full park. same effective behavior as before (block until a message arrives), just spelled out, with a 180s window per poll instead of the old 30s default.

retires the matching TODO under Turn loop.
parent 90df2106bf
commit 06af23c8a4
9 changed files with 53 additions and 45 deletions
15
TODO.md
@@ -3,21 +3,6 @@
 Pick anything from here when relevant. Cross-cutting design notes live in
 [CLAUDE.md](CLAUDE.md); high-level project intro in [README.md](README.md).
 
-## Turn loop
-
-- **`recv` with no `wait_seconds` should return immediately.**
-  Today omitting the argument falls through to the 30s
-  default long-poll (`RECV_LONG_POLL_DEFAULT` in
-  `hive-c0re/src/agent_server.rs`); a manager that wants a
-  cheap "anything in the inbox right now?" peek has to
-  explicitly pass `wait_seconds: 0`. Flip the semantics so
-  `None` = no sleep, returning `None` (or the empty inbox
-  shape) right away. The agent opts into the long-poll by
-  setting a positive value. Update both `AgentRequest::Recv`
-  and `ManagerRequest::Recv` handlers + the prompt language
-  in `prompts/{agent,manager}.md`. Tighten the cap (180s)
-  too — only meaningful when the agent is choosing to wait.
-
 ## Permissions / policy
 
 - **Per-agent send allow-list.** Today any agent can `send` to any
@@ -102,11 +102,11 @@ it as a stdio child via `--mcp-config`. The hyperhive socket name is
 - `send(to, body)` — message a peer (logical agent name), another
   agent, or the operator (recipient `operator`, surfaces in the
   dashboard inbox).
-- `recv(wait_seconds?)` — drain one inbox message. Long-polls
-  server-side; `wait_seconds` is capped at 180 (default 30 when
-  omitted). Agents use a long wait to park their turn waiting for
-  work instead of busy-looping with short polls — they wake
-  instantly when a message arrives.
+- `recv(wait_seconds?)` — drain one inbox message. Without
+  `wait_seconds` (or with `0`) returns immediately, a cheap
+  "anything pending?" peek. Positive value parks the turn up
+  to that many seconds (cap 180) — incoming messages wake
+  instantly, otherwise returns empty at the timeout.
 - `ask_operator(question, options?, multi?, ttl_seconds?)` —
   surface a question on the dashboard. Same shape as the manager's;
   answer routes back to the asker's own inbox as
@@ -2,7 +2,7 @@ You are hyperhive agent `{label}` in a multi-agent system. The operator (recipie
 
 Tools (hyperhive surface):
 
-- `mcp__hyperhive__recv(wait_seconds?)` — drain one more message from your inbox (returns `(empty)` if nothing pending after the wait). Without `wait_seconds` it long-polls 30s. To **wait** for work when you have nothing else useful to do this turn, call with a long wait (e.g. `wait_seconds: 180`, the max) — you'll be woken instantly when a message arrives, otherwise return after the timeout. That is strictly better than calling `recv` repeatedly with short waits: lower latency on new work, fewer turns, no busy-loop. Never use a fixed `sleep` shell command for the same purpose.
+- `mcp__hyperhive__recv(wait_seconds?)` — drain one more message from your inbox (returns `(empty)` if nothing pending). Without `wait_seconds` (or with `0`) it returns immediately — a cheap "anything pending?" peek you can sprinkle between tool calls. To **wait** for work when you have nothing else useful to do this turn, call with a long wait (e.g. `wait_seconds: 180`, the max) — incoming messages wake you instantly, otherwise the call returns empty at the timeout. That's strictly better than a fixed `sleep` shell command: lower latency on new work, no busy-loop.
 - `mcp__hyperhive__send(to, body)` — message a peer (by their name) or the operator (recipient `operator`, surfaces in the dashboard).
 - (some agents only) **extra MCP tools** surfaced as `mcp__<server>__<tool>` — these are agent-specific (matrix client, scraper, db connector, etc.) declared in your `agent.nix` under `hyperhive.extraMcpServers`. Treat them as first-class tools alongside the hyperhive surface; the operator already auto-approved them at deploy time.
 - `mcp__hyperhive__ask_operator(question, options?, multi?, ttl_seconds?)` — surface a question to the human operator on the dashboard. Returns immediately with a question id — do NOT wait inline. When the operator answers, a system message with event `operator_answered { id, question, answer }` lands in your inbox; handle it on a future turn. Use this for clarifications, permission for risky actions, or choice between options. `options` is advisory: a short fixed-choice list when applicable, otherwise leave empty for free text. `multi: true` lets the operator pick multiple (checkboxes), answer comes back comma-joined. `ttl_seconds` auto-cancels with answer `[expired]` when the decision becomes moot.
@@ -2,7 +2,7 @@ You are the hyperhive manager `{label}` in a multi-agent system. You coordinate
 
 Tools (hyperhive surface):
 
-- `mcp__hyperhive__recv(wait_seconds?)` — drain one more message from your inbox. Without `wait_seconds` it long-polls 30s. To **wait** when you have nothing else to do, call with a long wait (e.g. `wait_seconds: 180`, the max) — you'll wake instantly on new work, otherwise return after the timeout. Use this instead of ending the turn or sleeping in a Bash command.
+- `mcp__hyperhive__recv(wait_seconds?)` — drain one more message from your inbox. Without `wait_seconds` (or with `0`) it returns immediately — a cheap inbox peek you can drop between actions. To **wait** when you have nothing else to do, call with a long wait (e.g. `wait_seconds: 180`, the max) — you'll wake instantly on new work, otherwise return after the timeout. Use that instead of ending the turn or sleeping in a Bash command.
 - `mcp__hyperhive__send(to, body)` — message an agent (by name), another peer, or the operator (`operator` surfaces in the dashboard).
 - `mcp__hyperhive__request_spawn(name)` — queue a brand-new sub-agent for operator approval (≤9 char name).
 - `mcp__hyperhive__kill(name)` — graceful stop on a sub-agent. No approval required.
@@ -140,7 +140,17 @@ async fn serve(
     let _ = state; // reserved for future state transitions (turn-loop -> needs-login)
     loop {
         let recv: Result<AgentResponse> =
-            client::request(socket, &AgentRequest::Recv { wait_seconds: None }).await;
+            // Explicit long-poll: the new agent_server semantics treat
+            // `None` as "peek, don't wait", which would tight-loop on
+            // sleep(interval). The harness wants to park until a
+            // message arrives, so opt into the full 180s cap.
+            client::request(
+                socket,
+                &AgentRequest::Recv {
+                    wait_seconds: Some(180),
+                },
+            )
+            .await;
         match recv {
             Ok(AgentResponse::Message { from, body }) => {
                 tracing::info!(%from, %body, "inbox");
@@ -92,7 +92,16 @@ async fn serve(
     tracing::info!(socket = %socket.display(), "hive-m1nd serve");
     loop {
         let recv: Result<ManagerResponse> =
-            client::request(socket, &ManagerRequest::Recv { wait_seconds: None }).await;
+            // Explicit long-poll: see hive-ag3nt's serve loop for the
+            // rationale — recv now defaults to peek when wait_seconds
+            // is None.
+            client::request(
+                socket,
+                &ManagerRequest::Recv {
+                    wait_seconds: Some(180),
+                },
+            )
+            .await;
         match recv {
             Ok(ManagerResponse::Message { from, body }) => {
                 if from == SYSTEM_SENDER {
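
The two serve-loop hunks above can be illustrated with a small stand-alone sketch. This is not the project's code: a `std::sync::mpsc` channel stands in for the inbox, and `recv`, `MAX_WAIT`, and `park_for_one_message` are hypothetical names chosen for illustration. It shows why a loop that wants to block should pass a positive wait: with `Some(180)` the call wakes as soon as a message lands and the loop sees no empty polls, whereas `None` would return immediately every time.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;
use std::time::Duration;

// Assumed cap, mirroring the 180s maximum described in the commit.
const MAX_WAIT: Duration = Duration::from_secs(180);

/// Stand-in for the server-side handler (hypothetical, simplified):
/// `None` or `Some(0)` peeks, a positive value parks up to the cap.
fn recv(inbox: &Receiver<String>, wait_seconds: Option<u64>) -> Option<String> {
    let timeout = wait_seconds
        .map(|s| Duration::from_secs(s).min(MAX_WAIT))
        .unwrap_or(Duration::ZERO);
    if timeout.is_zero() {
        // Peek: never block, return whatever is already queued.
        inbox.try_recv().ok()
    } else {
        // Park: wake when a message lands, or give up at the timeout.
        inbox.recv_timeout(timeout).ok()
    }
}

/// Simulates a harness loop that parks for one message sent 50 ms later.
/// Returns the message and the number of empty polls the loop observed.
fn park_for_one_message() -> (String, u32) {
    let (tx, rx) = mpsc::channel();
    let sender = thread::spawn(move || {
        thread::sleep(Duration::from_millis(50));
        tx.send("hello".to_string()).unwrap();
    });
    let mut empty_polls = 0;
    let msg = loop {
        match recv(&rx, Some(180)) {
            Some(m) => break m,
            None => empty_polls += 1,
        }
    };
    sender.join().unwrap();
    (msg, empty_polls)
}
```

Swapping `Some(180)` for `None` in `park_for_one_message` would turn the loop into a busy-poll until the message arrives, which is the tight-loop the harness comment warns about.
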
@@ -205,11 +205,13 @@ impl AgentServer {
 
     #[tool(
         description = "Pop one message from this agent's inbox. Returns the sender and body, \
-        or an empty marker if nothing is waiting. Optional `wait_seconds` long-polls \
-        for that many seconds (capped at 180) before returning empty — default 30. \
-        Use a long wait_seconds (e.g. 120 or 180) when you have nothing else to do — \
-        it parks the turn until either a message arrives or the timeout fires, which \
-        is strictly better than a fixed sleep because incoming work wakes you instantly."
+        or an empty marker if nothing is waiting. Without `wait_seconds` (or with 0) the \
+        call returns immediately — a cheap 'anything pending?' peek. Pass a positive \
+        `wait_seconds` (capped at 180) to park the turn waiting for new work — incoming \
+        messages wake you instantly, otherwise the call returns empty at the timeout. \
+        That's strictly better than a fixed shell `sleep`. Typical pattern: when you have \
+        nothing else useful to do, call `recv(wait_seconds: 180)` to park until \
+        something arrives."
     )]
     async fn recv(&self, Parameters(args): Parameters<RecvArgs>) -> String {
         let log = format!("{args:?}");
@@ -363,10 +365,10 @@ impl ManagerServer {
 
     #[tool(
         description = "Pop one message from the manager inbox. Returns sender + body, or \
-        empty. Optional `wait_seconds` long-polls (capped at 180, default 30) so the \
-        manager can sit on Recv when there's nothing to do without burning turns — \
-        prefer a long wait (120 or 180) over ending a turn early; you'll wake \
-        instantly when work arrives."
+        empty. Without `wait_seconds` (or 0) returns immediately — a cheap inbox peek. \
+        Pass a positive value (capped at 180) to park until either a message arrives \
+        or the timeout fires; prefer a long wait (120 or 180) over ending a turn \
+        early when you have nothing else to do."
     )]
     async fn recv(&self, Parameters(args): Parameters<RecvArgs>) -> String {
         let log = format!("{args:?}");
@@ -81,19 +81,20 @@ async fn serve(stream: UnixStream, agent: String, coord: Arc<Coordinator>) -> Re
     }
 }
 
-/// Default and max long-poll window for `Recv`. Caller can request a
-/// shorter (or longer up to `RECV_LONG_POLL_MAX`) wait via the
-/// `wait_seconds` field; values above the cap are clamped. 180s
-/// max keeps us under typical TCP/proxy idle limits while letting
-/// agents park their turn until a message lands instead of busy-
-/// looping with short waits.
-const RECV_LONG_POLL_DEFAULT: std::time::Duration = std::time::Duration::from_secs(30);
+/// Max long-poll window the caller can ask for; values above the
+/// cap are clamped. 180s keeps us under typical TCP/proxy idle
+/// limits while still letting agents park their turn until a
+/// message arrives. Omitting `wait_seconds` (or passing `0`) means
+/// "peek, don't wait" — claude can call recv whenever it wants a
+/// cheap "is there anything pending?" check without blocking the
+/// turn for 30 seconds. To actually park, the caller passes a
+/// positive `wait_seconds`.
 const RECV_LONG_POLL_MAX: std::time::Duration = std::time::Duration::from_secs(180);
 
 fn recv_timeout(wait_seconds: Option<u64>) -> std::time::Duration {
     match wait_seconds {
         Some(s) => std::time::Duration::from_secs(s).min(RECV_LONG_POLL_MAX),
-        None => RECV_LONG_POLL_DEFAULT,
+        None => std::time::Duration::ZERO,
     }
 }
 
@@ -69,15 +69,16 @@ async fn serve(stream: UnixStream, coord: Arc<Coordinator>) -> Result<()> {
     }
 }
 
-/// Default and max long-poll window for manager `Recv`. Caller can
-/// request a shorter or longer (up to MAX) wait via `wait_seconds`.
-const MANAGER_RECV_LONG_POLL_DEFAULT: std::time::Duration = std::time::Duration::from_secs(30);
+/// Max long-poll window for manager `Recv`. Same semantics as the
+/// sub-agent socket: omitted `wait_seconds` (or `0`) = peek and
+/// return immediately, positive value = park up to that many
+/// seconds (clamped at MAX).
 const MANAGER_RECV_LONG_POLL_MAX: std::time::Duration = std::time::Duration::from_secs(180);
 
 fn manager_recv_timeout(wait_seconds: Option<u64>) -> std::time::Duration {
     match wait_seconds {
         Some(s) => std::time::Duration::from_secs(s).min(MANAGER_RECV_LONG_POLL_MAX),
-        None => std::time::Duration::ZERO,
+        None => std::time::Duration::ZERO,
     }
 }
 
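
After this commit both timeout helpers share one mapping from `wait_seconds` to a duration. The sketch below restates the post-change `recv_timeout` as a stand-alone function so the four interesting inputs (omitted, zero, in range, above the cap) can be checked in isolation. `is_peek` is a hypothetical helper added only for illustration; it does not appear in the commit.

```rust
use std::time::Duration;

/// Cap taken from the diff (`RECV_LONG_POLL_MAX`, 180s).
const RECV_LONG_POLL_MAX: Duration = Duration::from_secs(180);

/// Same mapping as the post-change `recv_timeout`:
/// omitted means zero wait, anything else is clamped to the cap.
fn recv_timeout(wait_seconds: Option<u64>) -> Duration {
    match wait_seconds {
        Some(s) => Duration::from_secs(s).min(RECV_LONG_POLL_MAX),
        None => Duration::ZERO,
    }
}

/// Hypothetical helper (not in the commit): a zero timeout is a peek.
fn is_peek(wait_seconds: Option<u64>) -> bool {
    recv_timeout(wait_seconds).is_zero()
}
```

Note that `Some(0)` needs no special case: `Duration::from_secs(0)` is already zero, so it falls out of the clamp as a peek, matching the "None (or 0)" wording in the commit message.
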