add proactive context-size compaction with a notes-checkpoint turn
This commit is contained in:
parent
f2015954d9
commit
9cbb05bb86
3 changed files with 152 additions and 9 deletions
21
CLAUDE.md
21
CLAUDE.md
|
|
@ -200,6 +200,23 @@ read them à la carte.
|
|||
In-flight or recent context that hasn't earned a section yet.
|
||||
Prune freely.
|
||||
|
||||
- **Just landed:** proactive context-size compaction. `turn::
|
||||
drive_turn` now has two compaction paths. The old *reactive*
|
||||
one (claude prints `Prompt is too long` → `/compact` + retry)
|
||||
is unchanged — by then the session is already past the window
|
||||
and no turn can run on it. The new *proactive* one fires after
|
||||
a clean turn whose last-inference context
|
||||
(`Bus::last_ctx_usage().context_tokens()`) crossed a watermark:
|
||||
`drive_turn` injects one synthetic notes-checkpoint turn
|
||||
(`CHECKPOINT_PROMPT` — "flush durable state into /state now")
|
||||
then runs `/compact`, so the agent can persist in-flight state
|
||||
before the detail collapses into a summary. Watermark is
|
||||
`HIVE_COMPACT_WATERMARK_TOKENS` (default 150k, ~75% of a 200k
|
||||
window; `0` disables). New `maybe_checkpoint_and_compact`
|
||||
helper, all best-effort (a failed checkpoint/compact is a Note,
|
||||
never fails the turn). Both ag3nt + m1nd loops get it for free
|
||||
via the shared `drive_turn`. `docs/turn-loop.md` gained a
|
||||
Compaction section.
|
||||
- **Just landed:** dashboard side panel + forge-linked
|
||||
config/approvals + per-bucket model stats. (a) Long content
|
||||
(file previews, approval diffs, journald logs) opens in a
|
||||
|
|
@ -608,8 +625,8 @@ Prune freely.
|
|||
agent_web_port + collision banner + spawn/rebuild preflight,
|
||||
browser notifications, focus-preserving refresh, generalised
|
||||
<details data-restore-key> survival, prompt-on-submit pattern.
|
||||
- **Open threads:** two-step spawn, notes compaction,
|
||||
unprivileged containers, Bash allow-list, xterm.js. The
|
||||
- **Open threads:** two-step spawn, unprivileged
|
||||
containers, Bash allow-list, xterm.js. The
|
||||
deployment / gateway / privsep cluster is tracked as
|
||||
`area:ops` forge issues. (Landed since this note was first written:
|
||||
extra per-agent MCP servers, per-agent send allow-list,
|
||||
|
|
|
|||
|
|
@ -18,8 +18,9 @@ Each agent harness (`hive-ag3nt serve` or `hive-m1nd serve`) runs:
|
|||
over stdin.
|
||||
5. Stream stdout (JSON lines) into the bus as
|
||||
`LiveEvent::Stream(value)`. Pump stderr as `Note`.
|
||||
6. Wait for claude to exit. On `Prompt is too long`, run `/compact`
|
||||
on the session once and retry the turn.
|
||||
6. Wait for claude to exit. Compaction is two-pronged — *reactive*
|
||||
on `Prompt is too long` and *proactive* on a context watermark
|
||||
(see [Compaction](#compaction) below).
|
||||
7. Emit `LiveEvent::TurnEnd { ok, note }`. Sleep `poll_ms` to avoid
|
||||
tight loops on transient failures.
|
||||
|
||||
|
|
@ -43,14 +44,40 @@ override path: `HYPERHIVE_MODEL_FILE` env var for tests.
|
|||
`--continue` keeps a persistent session per agent (claude stores
|
||||
sessions in `~/.claude/projects/`, which is bind-mounted
|
||||
persistently). Auto-compact and auto-memory are disabled via
|
||||
`--settings` because hyperhive owns compaction (`/compact` on
|
||||
overflow, retry once; operator can also force one via `/api/compact`).
|
||||
`--settings` because hyperhive owns compaction — see
|
||||
[Compaction](#compaction) below.
|
||||
A one-shot `--continue` suppression is available via
|
||||
`POST /api/new-session` (or `/new-session` slash command in the
|
||||
per-agent terminal) — `Bus::take_skip_continue()` flips an
|
||||
`AtomicBool` once per turn, the next claude invocation drops
|
||||
`--continue`, every subsequent turn resumes normal behaviour.
|
||||
|
||||
### Compaction
|
||||
|
||||
claude's own in-session auto-compact is off (`--settings`); hyperhive
|
||||
owns it explicitly in `turn::drive_turn`. There are two triggers:
|
||||
|
||||
- **Reactive** — claude-code prints `Prompt is too long` (the
|
||||
`PROMPT_TOO_LONG_MARKER`). The session is *already* past the context
|
||||
window, so no turn can run on it — `drive_turn` runs `/compact`
|
||||
straight away and retries the same wake-up prompt once. No
|
||||
notes-checkpoint turn is possible here: the detail is gone.
|
||||
- **Proactive** — a turn finishes cleanly but the last inference's
|
||||
context size (`Bus::last_ctx_usage().context_tokens()`) is at or
|
||||
above a watermark. While the session is still healthy, `drive_turn`
|
||||
injects one synthetic *notes-checkpoint* turn (`CHECKPOINT_PROMPT`
|
||||
— "context is filling up, flush durable state into `/state` now")
|
||||
and *then* runs `/compact`. This gives the agent a chance to
|
||||
persist in-flight task state, decisions, and file paths before the
|
||||
conversation detail collapses into a summary.
|
||||
|
||||
The watermark is `HIVE_COMPACT_WATERMARK_TOKENS` (default `150_000`,
|
||||
~75% of a 200k window); set it to `0` to disable proactive compaction
|
||||
entirely (the reactive path always applies). The proactive path is
|
||||
best-effort — a failed checkpoint turn or `/compact` is surfaced as a
|
||||
`Note` but never fails the turn that already succeeded. The operator
|
||||
can also force a compaction any time via `/api/compact`.
|
||||
|
||||
The child runs with `cwd = /state` (when the bind exists; falls
|
||||
back to the parent's cwd in dev), so any relative path in a tool
|
||||
call (`Read foo.md`, `Bash ls`, `Write notes.md`) lands in the
|
||||
|
|
|
|||
|
|
@ -34,6 +34,33 @@ const CLAUDE_SETTINGS: &str = include_str!("../prompts/claude-settings.json");
|
|||
/// claude exit with a useful error in the live view.
|
||||
const PROMPT_TOO_LONG_MARKER: &str = "Prompt is too long";
|
||||
|
||||
/// Token watermark for *proactive* compaction. Once a turn finishes with
|
||||
/// the last inference's context size at or above this many tokens,
|
||||
/// `drive_turn` runs one dedicated notes-checkpoint turn (so the agent
|
||||
/// can flush durable state into `/state`) and then `/compact` — while the
|
||||
/// session is still healthy enough to run a turn at all. This is distinct
|
||||
/// from the reactive `PROMPT_TOO_LONG_MARKER` path, which only fires once
|
||||
/// the session is *already* past the window: at that point no turn can
|
||||
/// run on it, so the reactive path just compacts + retries with no
|
||||
/// checkpoint. Default is ~75% of a 200k-token window; override via
|
||||
/// `HIVE_COMPACT_WATERMARK_TOKENS`, or set that to `0` to disable
|
||||
/// proactive compaction entirely (the reactive path always applies).
|
||||
const DEFAULT_COMPACT_WATERMARK_TOKENS: u64 = 150_000;
|
||||
|
||||
/// Synthetic wake prompt for the proactive notes-checkpoint turn. Not an
|
||||
/// inbox message — the harness injects it directly so the agent gets one
|
||||
/// turn to persist durable state before `/compact` collapses the
|
||||
/// turn-by-turn history into a summary.
|
||||
const CHECKPOINT_PROMPT: &str = "[system] Context checkpoint — no inbox message to handle.\n\n\
|
||||
Your conversation context has grown large and the harness is about to run `/compact`, \
|
||||
which collapses the detailed turn-by-turn history into a short summary. Anything you \
|
||||
do not persist now is effectively lost after the next turn.\n\n\
|
||||
Use THIS turn to flush anything worth keeping into your durable `/state` files: update \
|
||||
your notes / CLAUDE.md / TODO.md with in-flight task state, decisions made, important \
|
||||
file paths, and whatever you would need to resume cleanly with only a summary of this \
|
||||
conversation to go on. Do not start new work or reply to anyone — just write your notes \
|
||||
and end the turn.";
|
||||
|
||||
/// The set of files claude reads on every invocation: the MCP server
|
||||
/// config (`--mcp-config`), static settings (`--settings`), and the
|
||||
/// pre-rendered role/tools system prompt (`--system-prompt-file`).
|
||||
|
|
@ -129,10 +156,32 @@ pub enum TurnOutcome {
|
|||
Failed(anyhow::Error),
|
||||
}
|
||||
|
||||
/// Drive one turn end-to-end, transparently compacting + retrying once on
|
||||
/// `Prompt is too long`. Both the sub-agent and manager loops call this.
|
||||
/// Resolve the proactive-compaction watermark: `HIVE_COMPACT_WATERMARK_TOKENS`
|
||||
/// if set to a valid integer, else `DEFAULT_COMPACT_WATERMARK_TOKENS`. A
|
||||
/// value of `0` disables proactive compaction.
|
||||
fn compact_watermark_tokens() -> u64 {
|
||||
std::env::var("HIVE_COMPACT_WATERMARK_TOKENS")
|
||||
.ok()
|
||||
.and_then(|s| s.trim().parse::<u64>().ok())
|
||||
.unwrap_or(DEFAULT_COMPACT_WATERMARK_TOKENS)
|
||||
}
|
||||
|
||||
/// Drive one turn end-to-end. Two compaction paths layer on top of the
|
||||
/// raw `run_turn`:
|
||||
///
|
||||
/// - **Reactive** — `run_turn` returns `PromptTooLong`: the session is
|
||||
/// already past the context window and *no* turn can run on it, so we
|
||||
/// compact immediately and retry the same wake-up prompt once. No
|
||||
/// notes-checkpoint turn is possible here — the detail is gone.
|
||||
/// - **Proactive** — the turn finished cleanly but its context size has
|
||||
/// crept past the watermark: while the session is still healthy we
|
||||
/// give the agent one dedicated turn to checkpoint its `/state` notes,
|
||||
/// then compact. This keeps a later turn from hitting the reactive
|
||||
/// path (where there is no chance to save anything first).
|
||||
///
|
||||
/// Both the sub-agent and manager loops call this.
|
||||
pub async fn drive_turn(prompt: &str, files: &TurnFiles, bus: &Bus) -> TurnOutcome {
|
||||
match run_turn(prompt, files, bus).await {
|
||||
let outcome = match run_turn(prompt, files, bus).await {
|
||||
TurnOutcome::PromptTooLong => {
|
||||
if let Err(e) = compact_session(files, bus).await {
|
||||
tracing::warn!(error = %format!("{e:#}"), "compact failed");
|
||||
|
|
@ -141,6 +190,56 @@ pub async fn drive_turn(prompt: &str, files: &TurnFiles, bus: &Bus) -> TurnOutco
|
|||
run_turn(prompt, files, bus).await
|
||||
}
|
||||
other => other,
|
||||
};
|
||||
// Proactive: a turn just completed on a still-healthy session. If its
|
||||
// context crossed the watermark, checkpoint + compact before a later
|
||||
// turn overflows into the reactive path. Best-effort — never changes
|
||||
// the outcome of the turn that already succeeded.
|
||||
if matches!(outcome, TurnOutcome::Ok) {
|
||||
maybe_checkpoint_and_compact(files, bus).await;
|
||||
}
|
||||
outcome
|
||||
}
|
||||
|
||||
/// Proactive post-turn compaction. If the last inference's context size
|
||||
/// has crossed the watermark, run one notes-checkpoint turn so the agent
|
||||
/// can persist durable state, then `/compact`. Best-effort: a failed
|
||||
/// checkpoint or compaction is logged + surfaced as a Note but never
|
||||
/// fails the turn that already succeeded.
|
||||
async fn maybe_checkpoint_and_compact(files: &TurnFiles, bus: &Bus) {
|
||||
let watermark = compact_watermark_tokens();
|
||||
if watermark == 0 {
|
||||
return; // proactive compaction disabled
|
||||
}
|
||||
let Some(used) = bus.last_ctx_usage().map(|u| u.context_tokens()) else {
|
||||
return; // no usage reading yet — nothing to compare against
|
||||
};
|
||||
if used < watermark {
|
||||
return;
|
||||
}
|
||||
bus.emit(LiveEvent::Note {
|
||||
text: format!(
|
||||
"context at {used} tokens (watermark {watermark}) — running a \
|
||||
notes-checkpoint turn before /compact"
|
||||
),
|
||||
});
|
||||
// Give the agent one turn to flush durable state into /state. If the
|
||||
// session is somehow already too far gone to run even this, fall
|
||||
// through to compaction anyway — the checkpoint is best-effort.
|
||||
match run_turn(CHECKPOINT_PROMPT, files, bus).await {
|
||||
TurnOutcome::Ok => {}
|
||||
TurnOutcome::PromptTooLong => bus.emit(LiveEvent::Note {
|
||||
text: "checkpoint turn overflowed the window — compacting without it".into(),
|
||||
}),
|
||||
TurnOutcome::Failed(e) => bus.emit(LiveEvent::Note {
|
||||
text: format!("checkpoint turn failed ({e:#}) — compacting anyway"),
|
||||
}),
|
||||
}
|
||||
if let Err(e) = compact_session(files, bus).await {
|
||||
tracing::warn!(error = %format!("{e:#}"), "post-checkpoint compact failed");
|
||||
bus.emit(LiveEvent::Note {
|
||||
text: format!("/compact after checkpoint failed: {e:#}"),
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue