dashboard: surface silent unwrap_or_default in api_state

every snapshot source backing /api/state used .unwrap_or_default(),
so sqlite errors, broker errors, nixos-container list failures, and
operator_questions decode errors all degraded to empty lists
without a log line. the 'pending question doesn't render'
bug we've been chasing was likely a row-decode error in
OperatorQuestions::pending() being swallowed this way.

new log_default(what, result) replaces each call site: it still
returns the default value on Err, but first emits a warn
(target=api_state) with the source name and the debug-formatted
error. five sources covered:
nixos-container list, approvals.pending,
approvals.recent_resolved, broker.recent_for(operator),
questions.pending. next time the question goes missing, the
journal will say which source failed and how.
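
a minimal sketch of what such a helper could look like, assuming only the
log_default(what, result) shape named above. the trait bounds, the message
format, and the pending_questions stand-in are hypothetical, and eprintln!
substitutes for whatever logging macro the real code uses:

```rust
use std::fmt::Debug;

/// Hypothetical reconstruction: returns the Ok value unchanged, or logs the
/// error with the source name and falls back to `T::default()`.
fn log_default<T: Default, E: Debug>(what: &str, result: Result<T, E>) -> T {
    match result {
        Ok(value) => value,
        Err(err) => {
            // Stand-in for a warn-level log line with target `api_state`.
            eprintln!("WARN api_state: {what} failed: {err:?}");
            T::default()
        }
    }
}

/// Hypothetical snapshot source that can fail the way a bad row decode might.
fn pending_questions(fail: bool) -> Result<Vec<String>, String> {
    if fail {
        Err("invalid column type at index 3".to_string())
    } else {
        Ok(vec!["deploy now?".to_string()])
    }
}
```

a call site would then read `log_default("questions.pending", pending_questions(fail))`
where it previously read `pending_questions(fail).unwrap_or_default()`: the
response shape is unchanged, but the Err arm is no longer silent.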

todo updated: the pending-question entry now points at the new
log instead of listing three suspect paths.
müde 2026-05-16 03:49:49 +02:00
parent 74ba8a63e1
commit 40938d8b54
2 changed files with 45 additions and 29 deletions

TODO.md (22 lines changed)

@@ -57,19 +57,15 @@ Pick anything from here when relevant. Cross-cutting design notes live in
Repro: manager calls `ask_operator`, tool result is
`question queued (id=N)` (so the row is in sqlite), but the
M1ND H4S QU3STI0NS section keeps showing "no pending
- questions". Last seen with id=5. Suspected paths:
- - `OperatorQuestions::pending()` returns Err and the
- `unwrap_or_default()` in `api_state` hides it. Surface the
- error (warn-log) and check.
- - serialization: a new field in `OpQuestion` (e.g.
- `deadline_at: Option<i64>`) deserializes wrong against an
- old row whose columns don't match the new SELECT order →
- `row.get(N)?` panics for that row, the whole iterator
- errors, `pending()` returns Err. Diagnose by curl
- `/api/state | jq '.questions'` and compare with sqlite
- counts.
- - dashboard JS swallows a render error. Open browser console
- and look for exceptions during `renderQuestions`.
+ questions". Last seen with id=5. Diagnostic step landed:
+ `api_state` now warn-logs (target=`api_state`) when any of
+ its source queries fails instead of silently
+ `unwrap_or_default`-ing; the next repro should print the
+ underlying error in journald and tell us whether this is
+ sqlite (likely `OperatorQuestions::pending()` row-decode
+ error on a migrated column) or dashboard-JS-side
+ (`renderQuestions` exception). Re-investigate with the new
+ log once the bug fires.
## UI / UX