agent badges: split into ctx (last-inference) + cost (cumulative)
the existing ctx badge was misnamed: it summed `result.usage`, which is
the cumulative tokens billed across every inference in the turn. for
tool-heavy turns that easily exceeds the model's context window (a 600k
cached prefix × 15 sub-calls = 9M cache_read), making it useless as a
"should i compact?" signal.
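
a minimal sketch of the arithmetic above, not the harness code. `Usage`, `prompt_tokens`, `cumulative_prompt_tokens` and `last_prompt_tokens` are hypothetical names, and treating "prompt size" as input + cache_read + cache_creation is an assumption made only for this illustration:

```rust
/// Per-inference token counts (hypothetical stand-in for the real usage type).
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Usage {
    input: u64,
    output: u64,
    cache_read: u64,
    cache_creation: u64,
}

impl Usage {
    /// Tokens that made up the prompt of this one call.
    /// Assumption: prompt = input + cache_read + cache_creation.
    fn prompt_tokens(&self) -> u64 {
        self.input + self.cache_read + self.cache_creation
    }
}

/// Sum over every call in the turn: roughly what the old badge showed.
fn cumulative_prompt_tokens(calls: &[Usage]) -> u64 {
    calls.iter().map(Usage::prompt_tokens).sum()
}

/// Prompt size of the most recent call only: what a context badge wants.
/// Returns None when the turn made no calls at all.
fn last_prompt_tokens(calls: &[Usage]) -> Option<u64> {
    calls.last().map(Usage::prompt_tokens)
}
```

with 15 calls that each re-read a 600k cached prefix, the cumulative figure is 9,000,000 while the last-call figure stays at 600,000, which is the only one of the two that can be compared against a context window.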

now two separate badges:

  ctx · N   last inference's prompt size = actual context window in
            use right now. parsed from each `assistant` event's
            `.message.usage`; the harness tracks the most recent one
            across the stream and snapshots it when the `result`
            event lands.
  cost · M  cumulative tokens billed across the whole turn (the
            previous behaviour, now correctly labelled).

both update via a single `TokenUsageChanged { ctx, cost }` SSE event at
turn-end. `turn_stats` grows four columns (`last_input_tokens`,
`last_output_tokens`, `last_cache_read_input_tokens`,
`last_cache_creation_input_tokens`) so the cold-load seed can paint both
badges on page load. migrations run try-and-ignore ALTERs so existing
agent dbs catch up; pre-migration rows have last-inference zeros and
yield no `ctx` seed (badge stays empty until next turn) rather than a
misleading 0.
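
a rough sketch of the migration and seed behaviour described above, under stated assumptions: the column type (`INTEGER NOT NULL DEFAULT 0`), the `LastInferenceRow` struct, and the `alter_statements`, `migrate` and `ctx_seed` helpers are all hypothetical, and the database call is abstracted as a closure so no particular driver API is implied:

```rust
/// The four columns this commit adds to `turn_stats`.
const NEW_COLUMNS: [&str; 4] = [
    "last_input_tokens",
    "last_output_tokens",
    "last_cache_read_input_tokens",
    "last_cache_creation_input_tokens",
];

/// One ALTER per new column. The column type here is an assumption.
fn alter_statements() -> Vec<String> {
    NEW_COLUMNS
        .iter()
        .map(|c| format!("ALTER TABLE turn_stats ADD COLUMN {c} INTEGER NOT NULL DEFAULT 0"))
        .collect()
}

/// Try-and-ignore: run every ALTER through `exec` and drop any error
/// (for example "column already exists"). Returns how many succeeded.
fn migrate<E>(mut exec: impl FnMut(&str) -> Result<(), E>) -> usize {
    let mut applied = 0;
    for sql in alter_statements() {
        if exec(&sql).is_ok() {
            applied += 1;
        }
    }
    applied
}

/// Hypothetical in-memory shape of the last-inference columns of one row.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct LastInferenceRow {
    last_input_tokens: u64,
    last_output_tokens: u64,
    last_cache_read_input_tokens: u64,
    last_cache_creation_input_tokens: u64,
}

/// Rows written before the migration hold all zeros; those yield no seed,
/// so the badge stays empty instead of showing 0.
fn ctx_seed(row: LastInferenceRow) -> Option<LastInferenceRow> {
    if row == LastInferenceRow::default() {
        None
    } else {
        Some(row)
    }
}
```

returning `Option` rather than a zeroed value is what lets the UI tell "no data yet" apart from a real reading.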
parent 14549dd8a9
commit 5c6c607e25
9 changed files with 267 additions and 101 deletions

@@ -279,14 +279,28 @@ async fn run_claude(prompt: &str, files: &TurnFiles, bus: &Bus) -> Result<bool>
     let bus_err = bus.clone();
     let pump_stdout = tokio::spawn(async move {
         let mut reader = BufReader::new(stdout).lines();
+        // Track usage as the turn unfolds. `last_inference` overwrites on
+        // every assistant event so at result-time it holds the most recent
+        // model call's usage — the actual context size. The `result` event
+        // carries the cumulative-across-the-turn usage (cost signal). Both
+        // get handed to `record_turn_usage` together so a single SSE
+        // event updates both badges.
+        let mut last_inference: Option<crate::events::TokenUsage> = None;
         while let Ok(Some(line)) = reader.next_line().await {
             if line.contains(PROMPT_TOO_LONG_MARKER) {
                 flag_out.store(true, Ordering::Relaxed);
             }
             match serde_json::from_str::<serde_json::Value>(&line) {
                 Ok(v) => {
-                    if let Some(usage) = crate::events::TokenUsage::from_stream_event(&v) {
-                        bus_out.record_usage(usage);
+                    if let Some(u) = crate::events::TokenUsage::from_assistant_event(&v) {
+                        last_inference = Some(u);
                     }
+                    if let Some(cost) = crate::events::TokenUsage::from_stream_event(&v) {
+                        // Fallback to `cost` if the turn somehow produced
+                        // a result without any assistant event — keeps the
+                        // ctx badge from going stale on a degenerate turn.
+                        let ctx = last_inference.unwrap_or(cost);
+                        bus_out.record_turn_usage(ctx, cost);
+                    }
                     bus_out.observe_stream(&v);
                     bus_out.emit(LiveEvent::Stream(v));
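
The tracking logic in this hunk can be sketched in isolation. This is a simplified model under stated assumptions, not the harness code: `StreamEvent` and `turn_usages` are hypothetical, the two-field `TokenUsage` here is a stand-in for `crate::events::TokenUsage`, and JSON parsing is replaced by an enum so the example needs only the standard library.

```rust
/// Simplified stand-in for the real usage type. It is `Copy`, which is what
/// allows `unwrap_or` on the tracked `Option` inside a loop without moving it.
#[derive(Clone, Copy, Debug, PartialEq)]
struct TokenUsage {
    input: u64,
    output: u64,
}

/// Hypothetical, already-parsed view of one stream line.
enum StreamEvent {
    /// An `assistant` event carrying that single inference's usage.
    Assistant(TokenUsage),
    /// The `result` event carrying cumulative usage for the turn.
    Result(TokenUsage),
    /// Anything without usage information.
    Other,
}

/// Walks the stream and returns one `(ctx, cost)` pair per `result` event.
fn turn_usages(events: &[StreamEvent]) -> Vec<(TokenUsage, TokenUsage)> {
    let mut last_inference: Option<TokenUsage> = None;
    let mut pairs = Vec::new();
    for event in events {
        match event {
            // Overwrite on every assistant event: only the latest matters.
            StreamEvent::Assistant(u) => last_inference = Some(*u),
            StreamEvent::Result(cost) => {
                // Fall back to the cumulative figure when no assistant
                // event was seen, mirroring the degenerate-turn comment.
                let ctx = last_inference.unwrap_or(*cost);
                pairs.push((ctx, *cost));
            }
            StreamEvent::Other => {}
        }
    }
    pairs
}
```

Like the hunk, this sketch does not clear `last_inference` after a `result` event; whether one process ever emits more than one `result` is not stated in the commit.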