diff --git a/CLAUDE.md b/CLAUDE.md index 40c0b57..16c815c 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -200,6 +200,23 @@ read them à la carte. In-flight or recent context that hasn't earned a section yet. Prune freely. +- **Just landed:** proactive context-size compaction. `turn:: + drive_turn` now has two compaction paths. The old *reactive* + one (claude prints `Prompt is too long` → `/compact` + retry) + is unchanged — by then the session is already past the window + and no turn can run on it. The new *proactive* one fires after + a clean turn whose last-inference context + (`Bus::last_ctx_usage().context_tokens()`) crossed a watermark: + `drive_turn` injects one synthetic notes-checkpoint turn + (`CHECKPOINT_PROMPT` — "flush durable state into /state now") + then runs `/compact`, so the agent can persist in-flight state + before the detail collapses into a summary. Watermark is + `HIVE_COMPACT_WATERMARK_TOKENS` (default 150k, ~75% of a 200k + window; `0` disables). New `maybe_checkpoint_and_compact` + helper, all best-effort (a failed checkpoint/compact is a Note, + never fails the turn). Both ag3nt + m1nd loops get it for free + via the shared `drive_turn`. `docs/turn-loop.md` gained a + Compaction section. - **Just landed:** dashboard side panel + forge-linked config/approvals + per-bucket model stats. (a) Long content (file previews, approval diffs, journald logs) opens in a @@ -608,8 +625,8 @@ Prune freely. agent_web_port + collision banner + spawn/rebuild preflight, browser notifications, focus-preserving refresh, generalised
survival, prompt-on-submit pattern. -- **Open threads:** two-step spawn, notes compaction, - unprivileged containers, Bash allow-list, xterm.js. The +- **Open threads:** two-step spawn, unprivileged + containers, Bash allow-list, xterm.js. The deployment / gateway / privsep cluster is tracked as `area:ops` forge issues. (Landed since this note was first written: extra per-agent MCP servers, per-agent send allow-list, diff --git a/docs/turn-loop.md b/docs/turn-loop.md index 718bcc0..bfcfdbc 100644 --- a/docs/turn-loop.md +++ b/docs/turn-loop.md @@ -18,8 +18,9 @@ Each agent harness (`hive-ag3nt serve` or `hive-m1nd serve`) runs: over stdin. 5. Stream stdout (JSON lines) into the bus as `LiveEvent::Stream(value)`. Pump stderr as `Note`. -6. Wait for claude to exit. On `Prompt is too long`, run `/compact` - on the session once and retry the turn. +6. Wait for claude to exit. Compaction is two-pronged — *reactive* + on `Prompt is too long` and *proactive* on a context watermark + (see [Compaction](#compaction) below). 7. Emit `LiveEvent::TurnEnd { ok, note }`. Sleep `poll_ms` to avoid tight loops on transient failures. @@ -43,14 +44,40 @@ override path: `HYPERHIVE_MODEL_FILE` env var for tests. `--continue` keeps a persistent session per agent (claude stores sessions in `~/.claude/projects/`, which is bind-mounted persistently). Auto-compact and auto-memory are disabled via -`--settings` because hyperhive owns compaction (`/compact` on -overflow, retry once; operator can also force one via `/api/compact`). +`--settings` because hyperhive owns compaction — see +[Compaction](#compaction) below. A one-shot `--continue` suppression is available via `POST /api/new-session` (or `/new-session` slash command in the per-agent terminal) — `Bus::take_skip_continue()` flips an `AtomicBool` once per turn, the next claude invocation drops `--continue`, every subsequent turn resumes normal behaviour. +### Compaction + +claude's own in-session auto-compact is off (`--settings`); hyperhive +owns it explicitly in `turn::drive_turn`. There are two triggers: + +- **Reactive** — claude-code prints `Prompt is too long` (the + `PROMPT_TOO_LONG_MARKER`). The session is *already* past the context + window, so no turn can run on it — `drive_turn` runs `/compact` + straight away and retries the same wake-up prompt once. No + notes-checkpoint turn is possible here: the detail is gone. +- **Proactive** — a turn finishes cleanly but the last inference's + context size (`Bus::last_ctx_usage().context_tokens()`) is at or + above a watermark. While the session is still healthy, `drive_turn` + injects one synthetic *notes-checkpoint* turn (`CHECKPOINT_PROMPT` + — "context is filling up, flush durable state into `/state` now") + and *then* runs `/compact`. This gives the agent a chance to + persist in-flight task state, decisions, and file paths before the + conversation detail collapses into a summary. + +The watermark is `HIVE_COMPACT_WATERMARK_TOKENS` (default `150_000`, +~75% of a 200k window); set it to `0` to disable proactive compaction +entirely (the reactive path always applies). The proactive path is +best-effort — a failed checkpoint turn or `/compact` is surfaced as a +`Note` but never fails the turn that already succeeded. The operator +can also force a compaction any time via `/api/compact`. + The child runs with `cwd = /state` (when the bind exists; falls back to the parent's cwd in dev), so any relative path in a tool call (`Read foo.md`, `Bash ls`, `Write notes.md`) lands in the diff --git a/hive-ag3nt/src/turn.rs b/hive-ag3nt/src/turn.rs index 99ca80e..3cb28fc 100644 --- a/hive-ag3nt/src/turn.rs +++ b/hive-ag3nt/src/turn.rs @@ -34,6 +34,33 @@ const CLAUDE_SETTINGS: &str = include_str!("../prompts/claude-settings.json"); /// claude exit with a useful error in the live view. const PROMPT_TOO_LONG_MARKER: &str = "Prompt is too long"; +/// Token watermark for *proactive* compaction. Once a turn finishes with +/// the last inference's context size at or above this many tokens, +/// `drive_turn` runs one dedicated notes-checkpoint turn (so the agent +/// can flush durable state into `/state`) and then `/compact` — while the +/// session is still healthy enough to run a turn at all. This is distinct +/// from the reactive `PROMPT_TOO_LONG_MARKER` path, which only fires once +/// the session is *already* past the window: at that point no turn can +/// run on it, so the reactive path just compacts + retries with no +/// checkpoint. Default is ~75% of a 200k-token window; override via +/// `HIVE_COMPACT_WATERMARK_TOKENS`, or set that to `0` to disable +/// proactive compaction entirely (the reactive path always applies). +const DEFAULT_COMPACT_WATERMARK_TOKENS: u64 = 150_000; + +/// Synthetic wake prompt for the proactive notes-checkpoint turn. Not an +/// inbox message — the harness injects it directly so the agent gets one +/// turn to persist durable state before `/compact` collapses the +/// turn-by-turn history into a summary. +const CHECKPOINT_PROMPT: &str = "[system] Context checkpoint — no inbox message to handle.\n\n\ +Your conversation context has grown large and the harness is about to run `/compact`, \ +which collapses the detailed turn-by-turn history into a short summary. Anything you \ +do not persist now is effectively lost after the next turn.\n\n\ +Use THIS turn to flush anything worth keeping into your durable `/state` files: update \ +your notes / CLAUDE.md / TODO.md with in-flight task state, decisions made, important \ +file paths, and whatever you would need to resume cleanly with only a summary of this \ +conversation to go on. Do not start new work or reply to anyone — just write your notes \ +and end the turn."; + /// The set of files claude reads on every invocation: the MCP server /// config (`--mcp-config`), static settings (`--settings`), and the /// pre-rendered role/tools system prompt (`--system-prompt-file`). @@ -129,10 +156,32 @@ pub enum TurnOutcome { Failed(anyhow::Error), } -/// Drive one turn end-to-end, transparently compacting + retrying once on -/// `Prompt is too long`. Both the sub-agent and manager loops call this. +/// Resolve the proactive-compaction watermark: `HIVE_COMPACT_WATERMARK_TOKENS` +/// if set to a valid integer, else `DEFAULT_COMPACT_WATERMARK_TOKENS`. A +/// value of `0` disables proactive compaction. +fn compact_watermark_tokens() -> u64 { + std::env::var("HIVE_COMPACT_WATERMARK_TOKENS") + .ok() + .and_then(|s| s.trim().parse::().ok()) + .unwrap_or(DEFAULT_COMPACT_WATERMARK_TOKENS) +} + +/// Drive one turn end-to-end. Two compaction paths layer on top of the +/// raw `run_turn`: +/// +/// - **Reactive** — `run_turn` returns `PromptTooLong`: the session is +/// already past the context window and *no* turn can run on it, so we +/// compact immediately and retry the same wake-up prompt once. No +/// notes-checkpoint turn is possible here — the detail is gone. +/// - **Proactive** — the turn finished cleanly but its context size has +/// crept past the watermark: while the session is still healthy we +/// give the agent one dedicated turn to checkpoint its `/state` notes, +/// then compact. This keeps a later turn from hitting the reactive +/// path (where there is no chance to save anything first). +/// +/// Both the sub-agent and manager loops call this. pub async fn drive_turn(prompt: &str, files: &TurnFiles, bus: &Bus) -> TurnOutcome { - match run_turn(prompt, files, bus).await { + let outcome = match run_turn(prompt, files, bus).await { TurnOutcome::PromptTooLong => { if let Err(e) = compact_session(files, bus).await { tracing::warn!(error = %format!("{e:#}"), "compact failed"); @@ -141,6 +190,56 @@ pub async fn drive_turn(prompt: &str, files: &TurnFiles, bus: &Bus) -> TurnOutco run_turn(prompt, files, bus).await } other => other, + }; + // Proactive: a turn just completed on a still-healthy session. If its + // context crossed the watermark, checkpoint + compact before a later + // turn overflows into the reactive path. Best-effort — never changes + // the outcome of the turn that already succeeded. + if matches!(outcome, TurnOutcome::Ok) { + maybe_checkpoint_and_compact(files, bus).await; + } + outcome +} + +/// Proactive post-turn compaction. If the last inference's context size +/// has crossed the watermark, run one notes-checkpoint turn so the agent +/// can persist durable state, then `/compact`. Best-effort: a failed +/// checkpoint or compaction is logged + surfaced as a Note but never +/// fails the turn that already succeeded. +async fn maybe_checkpoint_and_compact(files: &TurnFiles, bus: &Bus) { + let watermark = compact_watermark_tokens(); + if watermark == 0 { + return; // proactive compaction disabled + } + let Some(used) = bus.last_ctx_usage().map(|u| u.context_tokens()) else { + return; // no usage reading yet — nothing to compare against + }; + if used < watermark { + return; + } + bus.emit(LiveEvent::Note { + text: format!( + "context at {used} tokens (watermark {watermark}) — running a \ + notes-checkpoint turn before /compact" + ), + }); + // Give the agent one turn to flush durable state into /state. If the + // session is somehow already too far gone to run even this, fall + // through to compaction anyway — the checkpoint is best-effort. + match run_turn(CHECKPOINT_PROMPT, files, bus).await { + TurnOutcome::Ok => {} + TurnOutcome::PromptTooLong => bus.emit(LiveEvent::Note { + text: "checkpoint turn overflowed the window — compacting without it".into(), + }), + TurnOutcome::Failed(e) => bus.emit(LiveEvent::Note { + text: format!("checkpoint turn failed ({e:#}) — compacting anyway"), + }), + } + if let Err(e) = compact_session(files, bus).await { + tracing::warn!(error = %format!("{e:#}"), "post-checkpoint compact failed"); + bus.emit(LiveEvent::Note { + text: format!("/compact after checkpoint failed: {e:#}"), + }); } }