agent badges: split into ctx (last-inference) + cost (cumulative)

the existing ctx badge was misnamed: it summed `result.usage`, which is
the cumulative tokens billed across every inference in the turn. for
tool-heavy turns that easily exceeds the model's context window (a 600k
cached prefix × 15 sub-calls = 9M cache_read), making it useless as a
"should i compact?" signal.

now two separate badges:

  ctx · N   last inference's prompt size = actual context window in
            use right now. parsed from each `assistant` event's
            `.message.usage`; the harness tracks the most recent one
            across the stream and snapshots it when the `result`
            event lands.
  cost · M  cumulative tokens billed across the whole turn (the
            previous behaviour, now correctly labelled).
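
a minimal sketch of the tracking described above, not the harness code itself. `StreamEvent`, `UsageTracker`, and this cut-down `TokenUsage` are hypothetical stand-ins, and clearing the tracked value once `result` lands is an assumption:

```rust
/// Token counts for one usage block. Field names mirror the usage
/// fields named in this commit; the struct is a hypothetical stand-in
/// for the real `TokenUsage`.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct TokenUsage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_read_input_tokens: u64,
    pub cache_creation_input_tokens: u64,
}

impl TokenUsage {
    /// Prompt size: input + cache_read + cache_write (output excluded).
    pub fn prompt_tokens(&self) -> u64 {
        self.input_tokens + self.cache_read_input_tokens + self.cache_creation_input_tokens
    }
}

/// Simplified stream events (hypothetical; real stream-json carries more).
pub enum StreamEvent {
    /// One model call; `usage` is that call's own usage block.
    Assistant { usage: TokenUsage },
    /// Terminal event; `usage` is cumulative across the whole turn.
    Result { usage: TokenUsage },
}

/// Remembers the most recent per-inference usage seen in the stream.
#[derive(Default)]
pub struct UsageTracker {
    last_inference: Option<TokenUsage>,
}

impl UsageTracker {
    /// Returns `Some((ctx, cost))` only when the `result` event lands.
    /// `ctx` is `None` if no `assistant` event was seen in the turn.
    pub fn observe(&mut self, event: &StreamEvent) -> Option<(Option<TokenUsage>, TokenUsage)> {
        match event {
            StreamEvent::Assistant { usage } => {
                // Overwrite, never accumulate: only the latest call matters for ctx.
                self.last_inference = Some(*usage);
                None
            }
            // Assumption: the snapshot is cleared so the next turn starts fresh.
            StreamEvent::Result { usage } => Some((self.last_inference.take(), *usage)),
        }
    }
}
```

feeding this 15 `assistant` events that each read a 600k cached prefix, then a `result` reporting 9M cache_read, gives a ctx prompt size of about 600k and a cost of 9M, which is the gap the old badge hid.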

both update via a single `TokenUsageChanged { ctx, cost }` SSE event at
turn-end. `turn_stats` grows four columns (`last_input_tokens`,
`last_output_tokens`, `last_cache_read_input_tokens`,
`last_cache_creation_input_tokens`) so the cold-load seed can paint both
badges on page load. migrations run try-and-ignore ALTERs so existing
agent dbs catch up; pre-migration rows have last-inference zeros and
yield no `ctx` seed (badge stays empty until next turn) rather than a
misleading 0.
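
a rough sketch of the migration and seed behaviour, under stated assumptions: the column type and default in the ALTERs, the `exec` callback standing in for the db handle, the `LastInferenceRow` struct, and the "all four columns zero means pre-migration" test are all hypothetical, not taken from the code:

```rust
/// The four columns named in this commit.
pub const LAST_INFERENCE_COLUMNS: [&str; 4] = [
    "last_input_tokens",
    "last_output_tokens",
    "last_cache_read_input_tokens",
    "last_cache_creation_input_tokens",
];

/// Builds one ALTER per column. `INTEGER NOT NULL DEFAULT 0` is an assumption.
pub fn alter_statements() -> Vec<String> {
    LAST_INFERENCE_COLUMNS
        .iter()
        .map(|col| format!("ALTER TABLE turn_stats ADD COLUMN {} INTEGER NOT NULL DEFAULT 0", col))
        .collect()
}

/// Try-and-ignore: runs every statement and swallows failures, so a db
/// that already has a column is left as-is. Returns how many succeeded.
pub fn run_try_and_ignore<E>(
    stmts: &[String],
    mut exec: impl FnMut(&str) -> Result<(), E>,
) -> usize {
    stmts.iter().filter(|s| exec(s.as_str()).is_ok()).count()
}

/// Last-inference columns of one `turn_stats` row (hypothetical shape).
#[derive(Default)]
pub struct LastInferenceRow {
    pub last_input_tokens: u64,
    pub last_output_tokens: u64,
    pub last_cache_read_input_tokens: u64,
    pub last_cache_creation_input_tokens: u64,
}

/// Cold-load seed for the ctx badge. A row whose last-inference columns
/// are all zero is treated as pre-migration and yields no seed, so the
/// badge stays empty instead of showing 0.
pub fn ctx_seed(row: &LastInferenceRow) -> Option<u64> {
    let all_zero = row.last_input_tokens == 0
        && row.last_output_tokens == 0
        && row.last_cache_read_input_tokens == 0
        && row.last_cache_creation_input_tokens == 0;
    if all_zero {
        None
    } else {
        // Prompt size only: output tokens are not part of the ctx figure.
        Some(
            row.last_input_tokens
                + row.last_cache_read_input_tokens
                + row.last_cache_creation_input_tokens,
        )
    }
}
```

returning `Option` keeps "no data yet" distinct from a real count, which is what lets the badge stay empty for old rows.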

parent 14549dd8a9
commit 5c6c607e25
9 changed files with 267 additions and 101 deletions

@@ -310,13 +310,22 @@ Layout, top to bottom:
   `turn_state_since`.
 - Model chip: `model · <name>` (e.g. `model · haiku`). Driven
   by `LiveEvent::ModelChanged`; emitted from `Bus::set_model`.
-- Ctx badge: `ctx · 142k` — total prompt tokens in the
-  current context window (input + cache_read + cache_write),
-  mirroring claude code's bottom-right indicator. Hover for
-  the breakdown including output. Driven by
-  `LiveEvent::TokenUsageChanged`; emitted from
-  `Bus::record_usage` whenever the terminal `result` event
-  delivers a fresh usage block.
+- Ctx badge: `ctx · 142k` — last inference's prompt size
+  (input + cache_read + cache_write of the most recent
+  model call in the just-ended turn). This is the **actual
+  context window utilisation** — the number to watch when
+  deciding whether to compact.
+- Cost badge: `cost · 1.3M` — cumulative tokens billed
+  across **every inference** in the last turn (sum of all
+  per-call prompts). Tool-heavy turns rebill the cached
+  prefix per call, so this routinely exceeds the model's
+  window — it's a cost signal, not a size signal.
+- Both badges driven by `LiveEvent::TokenUsageChanged {
+  ctx, cost }`, emitted once at turn-end from
+  `Bus::record_turn_usage`. The harness tracks per-inference
+  usage by walking `assistant` events in the stream-json
+  and updating `last_inference` on each one; the `result`
+  event supplies `cost` and triggers the emit.
 - Last-turn chip: `last turn 12.3s` appears after the first
   turn ends, computed from the state-since deltas.
 - `■ cancel turn` button: visible only while state=thinking,
@@ -437,8 +446,11 @@ Bus events (new vocabulary on `/events/stream`):
   `needs_login_idle` / `needs_login_in_progress`. Drives the
   alive-badge.
 - `model_changed { model }` — drives the model chip.
-- `token_usage_changed { usage: TokenUsage }` — drives the
-  ctx-badge. Emitted from `Bus::record_usage` whenever the
-  stream-json `result` event delivers a fresh usage block.
+- `token_usage_changed { ctx: TokenUsage, cost: TokenUsage }`
+  — drives the ctx + cost badges. Emitted from
+  `Bus::record_turn_usage` at turn-end; `ctx` is the last
+  inference's usage (current context size), `cost` is the
+  cumulative across every inference (the `result` event's
+  totals).
 - `turn_state_changed { state, since_unix }` — drives the
   state badge (`idle`/`thinking`/`compacting`).