no nap tool — recv with long wait_seconds replaces it; max raised to 180s

recv-with-timeout is strictly better than a fixed sleep because it
wakes instantly on incoming messages. drop the half-written nap MCP
tool, raise the recv wait_seconds cap from 60s to 180s on both
agent and manager sockets.

prompts updated: agent.md + manager.md now spell out the pattern —
when there's nothing else useful to do, call recv with
wait_seconds=180 to park the turn; do NOT use Bash sleep for the
same purpose. todo drops the nap entry and the napping-state-badge
follow-up; both replaced by 'just use a long recv'.
müde 2026-05-15 20:53:15 +02:00
parent f65ee88269
commit 7d93dd9db4
6 changed files with 16 additions and 21 deletions
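
The "wakes instantly" claim in the commit message can be illustrated with a small standalone sketch. It uses `std::sync::mpsc` as a stand-in for the hyperhive inbox; `recv_with_timeout` is a hypothetical helper for illustration, not the project's actual `recv` implementation.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a long-poll recv: block until a message
/// arrives or `wait` elapses, whichever comes first.
fn recv_with_timeout(inbox: &mpsc::Receiver<String>, wait: Duration) -> Option<String> {
    inbox.recv_timeout(wait).ok()
}

fn main() {
    let (tx, rx) = mpsc::channel::<String>();

    // A peer sends a message roughly 50 ms from now.
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(50));
        let _ = tx.send("hello".to_string());
    });

    let start = Instant::now();
    // Park for up to 2 s; the send above wakes us long before the timeout.
    let msg = recv_with_timeout(&rx, Duration::from_secs(2));
    let waited = start.elapsed();

    println!("got {:?} after {:?}", msg, waited);
    assert_eq!(msg.as_deref(), Some("hello"));
    assert!(waited < Duration::from_secs(2));
}
```

A fixed `thread::sleep(Duration::from_secs(2))` in the same spot would always cost the full two seconds, even though the message was ready after about 50 ms.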

12
TODO.md

@@ -27,10 +27,6 @@ Pick anything from here when relevant. Cross-cutting design notes live in
## UI / UX
-- **State badge: napping state.** Idle / thinking / compacting
-already ship from server-side `TurnState`. Add `napping 😴`
-once the `nap` tool exists — it just adds a new `TurnState`
-variant the harness flips into for the duration of the nap.
- **Terminal: `/model` slash command.** Operator-typeable model
override from the terminal. Depends on the model-override work
above; once an override mechanism exists, wire a `/model <name>`
@@ -81,14 +77,6 @@ Pick anything from here when relevant. Cross-cutting design notes live in
## Loop substance
-- **`nap` tool.** Agent-side MCP tool `mcp__hyperhive__nap(seconds)` that
-parks the turn loop for a short while before next-message processing.
-Use cases: agent decides it has nothing useful to do, or wants to
-throttle itself between rapid wake events. Implementation: harness
-records a "wake-not-before" timestamp; `recv_blocking` skips the long
-poll until that ts; the state badge reads `napping · MM:SS` during.
-Operator can cancel via the same `/cancel` slash command or a
-dashboard button.
- **Notes compaction.** `/state/` is bind-mounted persistently and agents
are told (in the system prompt) to keep `/state/notes.md` for durable
knowledge — but we don't currently nudge them to compact when notes


@@ -2,7 +2,7 @@ You are hyperhive agent `{label}` in a multi-agent system.
Tools (hyperhive surface):
-- `mcp__hyperhive__recv()` — drain one more message from your inbox (returns `(empty)` if nothing pending).
+- `mcp__hyperhive__recv(wait_seconds?)` — drain one more message from your inbox (returns `(empty)` if nothing pending after the wait). Without `wait_seconds` it long-polls 30s. To **wait** for work when you have nothing else useful to do this turn, call with a long wait (e.g. `wait_seconds: 180`, the max) — you'll be woken instantly when a message arrives, otherwise return after the timeout. That is strictly better than calling `recv` repeatedly with short waits: lower latency on new work, fewer turns, no busy-loop. Never use a fixed `sleep` shell command for the same purpose.
- `mcp__hyperhive__send(to, body)` — message a peer (by their name) or the operator (recipient `operator`, surfaces in the dashboard).
Need new packages, env vars, or other NixOS config for yourself? You can't edit your own config directly — message the manager (recipient `manager`) describing what you need + why. The manager evaluates the request (it doesn't rubber-stamp), edits `/agents/{label}/config/agent.nix` on your behalf, commits, and submits an approval that the operator can accept on the dashboard; on approve hive-c0re rebuilds your container with the new config.
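
The "fewer turns" part of the prompt change is simple arithmetic, sketched below. `recv_calls_needed` is a hypothetical helper for illustration and does not exist in the repository; it only counts how many `recv` calls it takes to sit through a quiet period at a given `wait_seconds`.

```rust
/// Number of `recv` calls needed to cover `idle_secs` of silence when
/// each call waits at most `wait_secs` (ceiling division).
fn recv_calls_needed(idle_secs: u64, wait_secs: u64) -> u64 {
    assert!(wait_secs > 0, "wait_secs must be positive");
    (idle_secs + wait_secs - 1) / wait_secs
}

fn main() {
    let idle_secs = 900; // 15 minutes with no incoming messages
    for wait_secs in [5u64, 30, 60, 180] {
        println!(
            "wait_seconds={:>3} -> {} recv calls",
            wait_secs,
            recv_calls_needed(idle_secs, wait_secs)
        );
    }
}
```

Under these assumptions, the old 60s cap needs 15 calls to cover 15 idle minutes, while the new 180s cap needs 5.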


@@ -2,7 +2,7 @@ You are the hyperhive manager `{label}` in a multi-agent system. You coordinate
Tools (hyperhive surface):
-- `mcp__hyperhive__recv()` — drain one more message from your inbox.
+- `mcp__hyperhive__recv(wait_seconds?)` — drain one more message from your inbox. Without `wait_seconds` it long-polls 30s. To **wait** when you have nothing else to do, call with a long wait (e.g. `wait_seconds: 180`, the max) — you'll wake instantly on new work, otherwise return after the timeout. Use this instead of ending the turn or sleeping in a Bash command.
- `mcp__hyperhive__send(to, body)` — message an agent (by name), another peer, or the operator (`operator` surfaces in the dashboard).
- `mcp__hyperhive__request_spawn(name)` — queue a brand-new sub-agent for operator approval (≤9 char name).
- `mcp__hyperhive__kill(name)` — graceful stop on a sub-agent. No approval required.


@@ -166,7 +166,10 @@ impl AgentServer {
#[tool(
description = "Pop one message from this agent's inbox. Returns the sender and body, \
or an empty marker if nothing is waiting. Optional `wait_seconds` long-polls \
-for that many seconds (capped at 60) before returning empty; default 30."
+for that many seconds (capped at 180) before returning empty; default 30. \
+Use a long wait_seconds (e.g. 120 or 180) when you have nothing else to do: \
+it parks the turn until either a message arrives or the timeout fires, which \
+is strictly better than a fixed sleep because incoming work wakes you instantly."
)]
async fn recv(&self, Parameters(args): Parameters<RecvArgs>) -> String {
let log = format!("{args:?}");
@@ -314,8 +317,10 @@ impl ManagerServer {
#[tool(
description = "Pop one message from the manager inbox. Returns sender + body, or \
-empty. Optional `wait_seconds` long-polls (capped at 60, default 30) so the \
-manager can sit on Recv when there's nothing to do without burning turns."
+empty. Optional `wait_seconds` long-polls (capped at 180, default 30) so the \
+manager can sit on Recv when there's nothing to do without burning turns. \
+Prefer a long wait (120 or 180) over ending a turn early; you'll wake \
+instantly when work arrives."
)]
async fn recv(&self, Parameters(args): Parameters<RecvArgs>) -> String {
let log = format!("{args:?}");


@@ -79,10 +79,12 @@ async fn serve(stream: UnixStream, agent: String, broker: Arc<Broker>) -> Result
/// Default and max long-poll window for `Recv`. Caller can request a
/// shorter (or longer up to `RECV_LONG_POLL_MAX`) wait via the
-/// `wait_seconds` field; values above the cap are clamped. Set well
-/// below typical TCP / proxy idle limits.
+/// `wait_seconds` field; values above the cap are clamped. 180s
+/// max keeps us under typical TCP/proxy idle limits while letting
+/// agents park their turn until a message lands instead of busy-
+/// looping with short waits.
const RECV_LONG_POLL_DEFAULT: std::time::Duration = std::time::Duration::from_secs(30);
-const RECV_LONG_POLL_MAX: std::time::Duration = std::time::Duration::from_secs(60);
+const RECV_LONG_POLL_MAX: std::time::Duration = std::time::Duration::from_secs(180);
fn recv_timeout(wait_seconds: Option<u64>) -> std::time::Duration {
match wait_seconds {
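
The hunk cuts off inside `recv_timeout`, so the real match arms are not visible here. Going only by the doc comment above (default 30s, values above the cap are clamped), one plausible shape is sketched below. The match body is an assumption, not the commit's code, and the treatment of `Some(0)` as an immediate return is likewise a guess.

```rust
use std::time::Duration;

const RECV_LONG_POLL_DEFAULT: Duration = Duration::from_secs(30);
const RECV_LONG_POLL_MAX: Duration = Duration::from_secs(180);

/// Assumed completion: use the default when no wait is given,
/// otherwise honor the request but never exceed the cap.
fn recv_timeout(wait_seconds: Option<u64>) -> Duration {
    match wait_seconds {
        // `Duration` implements `Ord`, so `min` clamps to the cap.
        Some(secs) => Duration::from_secs(secs).min(RECV_LONG_POLL_MAX),
        None => RECV_LONG_POLL_DEFAULT,
    }
}

fn main() {
    for request in [None, Some(10), Some(180), Some(600)] {
        println!("{:?} -> {:?}", request, recv_timeout(request));
    }
}
```

With this shape, a caller asking for 600 seconds silently gets 180, which matches the "clamped" wording in the doc comment rather than returning an error.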


@@ -72,7 +72,7 @@ async fn serve(stream: UnixStream, coord: Arc<Coordinator>) -> Result<()> {
/// Default and max long-poll window for manager `Recv`. Caller can
/// request a shorter or longer (up to MAX) wait via `wait_seconds`.
const MANAGER_RECV_LONG_POLL_DEFAULT: std::time::Duration = std::time::Duration::from_secs(30);
-const MANAGER_RECV_LONG_POLL_MAX: std::time::Duration = std::time::Duration::from_secs(60);
+const MANAGER_RECV_LONG_POLL_MAX: std::time::Duration = std::time::Duration::from_secs(180);
fn manager_recv_timeout(wait_seconds: Option<u64>) -> std::time::Duration {
match wait_seconds {