todo + terminal-rendering: ctx-badge cold-load, auto session-reset, more coherence-pass gripes
This commit is contained in:
parent
9995bbc891
commit
8a3e8bfb7f
2 changed files with 98 additions and 0 deletions
6
TODO.md
6
TODO.md
|
|
@ -61,6 +61,12 @@ how often the friction bites in normal use.
|
||||||
- **Surface per-turn stats on the agent web UI**: "N turns today" chip + rolling tool-call histogram tooltip on the model chip. (`open_threads` and `open_reminders` chips already landed via other paths — open-threads section on the page + reminder count chip on the container row.) Reads the per-agent `turn_stats.sqlite`.
|
- **Surface per-turn stats on the agent web UI**: "N turns today" chip + rolling tool-call histogram tooltip on the model chip. (`open_threads` and `open_reminders` chips already landed via other paths — open-threads section on the page + reminder count chip on the container row.) Reads the per-agent `turn_stats.sqlite`.
|
||||||
- **Stats UI on the main dashboard**: per-agent rollups (avg turn duration, tokens-since-boot, top 5 tools) on the container row. Same data source, host-side aggregation query.
|
- **Stats UI on the main dashboard**: per-agent rollups (avg turn duration, tokens-since-boot, top 5 tools) on the container row. Same data source, host-side aggregation query.
|
||||||
|
|
||||||
|
## Harness Behaviour
|
||||||
|
|
||||||
|
- **Persist + cold-load current context size on the per-agent page**: the `ctx-badge` (Claude Code's bottom-right "N tokens" indicator) currently only populates after the first `TokenUsageChanged` SSE event arrives, which is the *next* turn — until then the badge is empty. Operator can't see "this agent is at 78% context" before deciding to manually compact / reset / message it. Last known token usage should be persisted (likely a small `/state/hyperhive-token-usage` blob, or a row in turn_stats already has it — pull last row's totals on cold load) and returned by `/api/state` so the badge paints with real numbers on first render.
|
||||||
|
|
||||||
|
- **Auto session-reset when context is large and cache is cold**: today every turn uses `--continue`, so a long-lived agent carries its entire transcript forward indefinitely. When the next turn's context is above some threshold (rough starting point: ~50% of the session limit — hive startup alone burned ~15%, so the headroom disappears fast) *and* the prompt cache is no longer warm (last turn ended past the cache TTL), it's cheaper to start fresh than to re-send the whole history uncached. Open question: drop `--continue` vs. trigger `--compact` first — needs measurement of what each actually costs (uncached re-read of N tokens vs. a compact turn's own token spend + the post-compact uncached re-read). Decision should be data-driven, not guessed. Needs: a context-size estimate per turn (turn_stats already tracks token usage), a cache-warmth heuristic (time since last turn vs. cache TTL), and a one-shot fresh-session path in `turn.rs` mirroring the existing `↻ new session` button.
|
||||||
|
|
||||||
## Bugs
|
## Bugs
|
||||||
|
|
||||||
- **Token-budget exhaustion crashes the harness**: when claude's account hits its rate/token cap, the in-flight `claude --print` invocation returns an error the harness doesn't recognise as recoverable, the serve loop exits, and the container stays up with a dead daemon. Operator only notices when an unrelated wake fails to drive a turn. Want: detect the budget-exceeded class of failure (likely a specific stderr line or stream-json `rate_limit_event` shape), fire a `LiveEvent::StatusChanged("rate_limited")` or new status, surface as a red badge + banner on the dashboard + per-agent UI, and have the serve loop park (sleep N minutes, retry) instead of returning Err. Operator can also see "this agent is rate-limited until ~HH:MM" if claude tells us when. Inspect `crate::turn::run_claude`'s `bail!` paths + claude's stderr conventions for the budget error string.
|
- **Token-budget exhaustion crashes the harness**: when claude's account hits its rate/token cap, the in-flight `claude --print` invocation returns an error the harness doesn't recognise as recoverable, the serve loop exits, and the container stays up with a dead daemon. Operator only notices when an unrelated wake fails to drive a turn. Want: detect the budget-exceeded class of failure (likely a specific stderr line or stream-json `rate_limit_event` shape), fire a `LiveEvent::StatusChanged("rate_limited")` or new status, surface as a red badge + banner on the dashboard + per-agent UI, and have the serve loop park (sleep N minutes, retry) instead of returning Err. Operator can also see "this agent is rate-limited until ~HH:MM" if claude tells us when. Inspect `crate::turn::run_claude`'s `bail!` paths + claude's stderr conventions for the budget error string.
|
||||||
|
|
|
||||||
|
|
@ -71,6 +71,56 @@ shared `.live .<class>` styling).
|
||||||
autonomous harness chatter; nothing flags them as "operator
|
autonomous harness chatter; nothing flags them as "operator
|
||||||
initiated."
|
initiated."
|
||||||
|
|
||||||
|
7. **Turn-start / turn-end are visually overweight**: the
|
||||||
|
triggering event row + the closing row both get bold text,
|
||||||
|
top margins, and a full coloured background tint. The
|
||||||
|
coloured left rule alone already says "this is a turn
|
||||||
|
boundary" — the heavy chrome adds noise without information.
|
||||||
|
Drop the bold/margin/tint, keep the left rule.
|
||||||
|
|
||||||
|
8. **Left-alignment is incoherent across row types**: the `▸`
|
||||||
|
disclosure marker on expandable rows doesn't sit in the
|
||||||
|
same column as flat-row prefix glyphs (`→`, `←`, `·`), so
|
||||||
|
the prefix column wanders by row kind. Expandable tool_use
|
||||||
|
(`→ Name …`) and expandable tool_result (`▸ ← Nl …`) use
|
||||||
|
different layouts from each other too. Pick one prefix
|
||||||
|
column and align every row kind into it; the disclosure
|
||||||
|
marker should be visually inside that column, not pushed
|
||||||
|
to the side.
|
||||||
|
|
||||||
|
9. **Continuation lines aren't inset**: when a row's body
|
||||||
|
wraps (e.g. `→ foo bar baz quux …`), the second visual
|
||||||
|
line starts at column 0 rather than under the `f` of
|
||||||
|
`foo`, so wrapped content blurs into the next row. Want a
|
||||||
|
`text-indent` / hanging-indent rule so continuation lines
|
||||||
|
align with the start of the body, not the prefix glyph.
|
||||||
|
|
||||||
|
10. **Most message-bearing rows are collapsed by default**
|
||||||
|
even when the body is the whole point: `send`, `recv`,
|
||||||
|
`ask`, and friends hide their text behind a `<details>`
|
||||||
|
summary. The summary headline is rarely enough context.
|
||||||
|
Want these expanded by default, with collapse reserved
|
||||||
|
for genuinely heavy payloads (multi-page diffs, long
|
||||||
|
tool_result blocks).
|
||||||
|
|
||||||
|
11. **Tools from extra MCP servers aren't pretty-printed**:
|
||||||
|
`renderRichToolUse` only special-cases the built-in
|
||||||
|
hyperhive tools (Write/Edit/send). Anything coming from
|
||||||
|
`extraMcpServers` (matrix, bitburner, …) falls through
|
||||||
|
to the generic `→ Name args…` JSON dump. Either a
|
||||||
|
plugin-style hook in the renderer map, or at minimum a
|
||||||
|
nicer generic args-pretty-printer that handles common
|
||||||
|
shapes (single string arg, single dict arg, etc.).
|
||||||
|
|
||||||
|
12. **No markdown rendering on message bodies**: `send` /
|
||||||
|
`recv` / `ask` / `answer` / agent text content all
|
||||||
|
arrive as markdown-flavoured prose (bullets, fenced
|
||||||
|
code, bold, links) but render as raw text. Want a
|
||||||
|
minimal markdown pass — at least lists, fenced code,
|
||||||
|
inline code, bold/italic, links — applied to message
|
||||||
|
bodies and probably to the `assistant.content[].text`
|
||||||
|
`.text` row.
|
||||||
|
|
||||||
## Suggested coherence pass
|
## Suggested coherence pass
|
||||||
|
|
||||||
Pick one scheme and audit all renderers to match. A concrete
|
Pick one scheme and audit all renderers to match. A concrete
|
||||||
|
|
@ -96,6 +146,48 @@ The `.sys` catch-all should escalate visually — a louder
|
||||||
shapes for future fix-up rather than hiding them in the muted
|
shapes for future fix-up rather than hiding them in the muted
|
||||||
note stream.
|
note stream.
|
||||||
|
|
||||||
|
### Layout rules to apply uniformly
|
||||||
|
|
||||||
|
- **One prefix column for every row kind** — flat rows and
|
||||||
|
expandable rows alike. The disclosure marker (`▸`) lives
|
||||||
|
inside that column, not as a separate gutter, so glyph
|
||||||
|
alignment doesn't shift between row types.
|
||||||
|
- **Hanging indent on wrapped bodies** so continuation lines
|
||||||
|
start under the first character of the body, not under the
|
||||||
|
prefix glyph. Probably `display: grid` with
|
||||||
|
`grid-template-columns: <prefix-col> 1fr` per row, or
|
||||||
|
`text-indent`/`padding-left` with negative `text-indent`.
|
||||||
|
- **Turn boundaries are rule-only** — drop the bold + margin
|
||||||
|
+ tint on `.turn-start` / `.turn-body` / `.turn-end-*`.
|
||||||
|
The coloured left rule alone carries the boundary signal.
|
||||||
|
- **Default-expanded message rows**: `send`, `recv`, `ask`,
|
||||||
|
`answer`, and short-ish `text` rows render their body
|
||||||
|
inline (no `<details>`). Reserve collapse for genuinely
|
||||||
|
heavy bodies — multi-page diffs, long tool_results,
|
||||||
|
thinking blocks past N lines.
|
||||||
|
|
||||||
|
### Renderer surface for extra MCP tools
|
||||||
|
|
||||||
|
`renderRichToolUse` currently switches on hard-coded tool
|
||||||
|
names. Replace with a registry keyed by
|
||||||
|
`mcp__<server>__<tool>` (and the built-in claude tool names)
|
||||||
|
that maps to a `(toolUse) -> Node` renderer. Per-agent extra
|
||||||
|
MCPs can register their own renderers via a small JS hook
|
||||||
|
(loaded the same way `extra-mcp.json` is loaded server-side);
|
||||||
|
the fallback is a generic args-pretty-printer that handles
|
||||||
|
single-string and single-dict shapes nicely instead of dumping
|
||||||
|
raw JSON.
|
||||||
|
|
||||||
|
### Markdown rendering
|
||||||
|
|
||||||
|
Apply a minimal markdown pass to message bodies (`send` /
|
||||||
|
`recv` / `ask` / `answer`) and to assistant `text` rows.
|
||||||
|
Scope: paragraphs, lists, fenced + inline code, bold/italic,
|
||||||
|
links. No HTML passthrough, no images, no tables —
|
||||||
|
explicitly bounded so we don't import a kitchen-sink parser
|
||||||
|
into the per-agent page. A small handwritten pass or a tiny
|
||||||
|
dep (e.g. `micromark` / `marked`) both work.
|
||||||
|
|
||||||
## Dashboard side (not covered here)
|
## Dashboard side (not covered here)
|
||||||
|
|
||||||
The main dashboard's message-flow pane is a different beast:
|
The main dashboard's message-flow pane is a different beast:
|
||||||
|
|
|
||||||
Loading…
Add table
Add a link
Reference in a new issue