agent badges: split into ctx (last-inference) + cost (cumulative)

the existing ctx badge was misnamed: it summed `result.usage`, which is
the cumulative tokens billed across every inference in the turn. for
tool-heavy turns that easily exceeds the model's context window (a 600k
cached prefix × 15 sub-calls = 9M cache_read), making it useless as a
"should i compact?" signal.

now two separate badges:

  ctx · N    last inference's prompt size = actual context window in
             use right now. parsed from each `assistant` event's
             `.message.usage`; the harness tracks the most recent one
             across the stream and snapshots it when the `result`
             event lands.

  cost · M   cumulative tokens billed across the whole turn (the
             previous behaviour, now correctly labelled).

both update via a single `TokenUsageChanged { ctx, cost }` SSE event at
turn-end. turn_stats grows four columns (`last_input_tokens`,
`last_output_tokens`, `last_cache_read_input_tokens`,
`last_cache_creation_input_tokens`) so the cold-load seed can paint both
badges on page load. migrations run try-and-ignore ALTERs so existing
agent dbs catch up; pre-migration rows have last-inference zeros and
yield no `ctx` seed (badge stays empty until next turn) rather than a
misleading 0.
This commit is contained in:
müde 2026-05-18 18:48:35 +02:00
parent 14549dd8a9
commit 5c6c607e25
9 changed files with 267 additions and 101 deletions

View file

@ -279,14 +279,28 @@ async fn run_claude(prompt: &str, files: &TurnFiles, bus: &Bus) -> Result<bool>
let bus_err = bus.clone();
let pump_stdout = tokio::spawn(async move {
let mut reader = BufReader::new(stdout).lines();
// Track usage as the turn unfolds. `last_inference` overwrites on
// every assistant event so at result-time it holds the most recent
// model call's usage — the actual context size. The `result` event
// carries the cumulative-across-the-turn usage (cost signal). Both
// get handed to `record_turn_usage` together so a single SSE
// event updates both badges.
let mut last_inference: Option<crate::events::TokenUsage> = None;
while let Ok(Some(line)) = reader.next_line().await {
if line.contains(PROMPT_TOO_LONG_MARKER) {
flag_out.store(true, Ordering::Relaxed);
}
match serde_json::from_str::<serde_json::Value>(&line) {
Ok(v) => {
if let Some(usage) = crate::events::TokenUsage::from_stream_event(&v) {
bus_out.record_usage(usage);
if let Some(u) = crate::events::TokenUsage::from_assistant_event(&v) {
last_inference = Some(u);
}
if let Some(cost) = crate::events::TokenUsage::from_stream_event(&v) {
// Fallback to `cost` if the turn somehow produced
// a result without any assistant event — keeps the
// ctx badge from going stale on a degenerate turn.
let ctx = last_inference.unwrap_or(cost);
bus_out.record_turn_usage(ctx, cost);
}
bus_out.observe_stream(&v);
bus_out.emit(LiveEvent::Stream(v));