ask_operator: ttl_seconds auto-cancel + remaining-time chip

manager can pass ttl_seconds to ask_operator. on submit, the host stores deadline_at = now + ttl_seconds in operator_questions (new column, migrated via the existing pragma_table_info pattern) and spawns a tokio task that sleeps until the deadline, then resolves the question with answer '[expired]' and fires the same OperatorAnswered helper event. if the question was already resolved by then, the expiry no-ops silently.

dashboard renders a '⏳ MM:SS' chip on the question row when deadline_at is set. the format collapses by magnitude: under a minute → s, under 1h → m s, 1h or more → h m. the 5s heartbeat refresh keeps the chip current, so the operator sees it tick down.

manager prompt + mcp tool description updated. journald viewer per container queued in todo (separate task).
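
a minimal sketch of the remaining-time formatting described above, in Rust with std only. the function names (`remaining_secs`, `format_remaining`), the use of unix-second timestamps, and the exact spacing of the output are assumptions for illustration; the actual dashboard code may live elsewhere (possibly in the frontend) and may format differently.

```rust
/// Hypothetical helper: seconds left until `deadline_at`, clamped at zero
/// so an already-passed deadline never yields a negative remainder.
/// Both arguments are assumed to be unix timestamps in seconds.
fn remaining_secs(deadline_at: i64, now: i64) -> u64 {
    deadline_at.saturating_sub(now).max(0) as u64
}

/// Hypothetical formatter following the collapse rule from the commit
/// message: under a minute -> "s", under an hour -> "m s", otherwise "h m".
/// The exact separators are an assumption.
fn format_remaining(secs: u64) -> String {
    if secs < 60 {
        format!("{secs}s")
    } else if secs < 3600 {
        format!("{}m {}s", secs / 60, secs % 60)
    } else {
        format!("{}h {}m", secs / 3600, (secs % 3600) / 60)
    }
}
```

with ttl_seconds = 90 and no time elapsed, `format_remaining(remaining_secs(now + 90, now))` gives `1m 30s`; once the deadline has passed, `remaining_secs` stays at 0 rather than going negative.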
2026-05-15 20:38:02 +02:00 · 754db7830e
commit 754db7830e
parent 2146e47770
8 changed files with 133 additions and 36 deletions
--- a/TODO.md
+++ b/TODO.md
@@ -68,13 +68,6 @@ Pick anything from here when relevant. Cross-cutting design notes live in

 ## Manager → operator question channel

-- **TTL on `ask_operator`.** Manual cancel via dashboard already
-  ships (✗ CANC3L button resolves the question with answer
-  `[cancelled]` and fires `OperatorAnswered` so the manager sees a
-  terminal state). Still missing: per-question `ttl_seconds` that
-  auto-cancels after a deadline. Spawn a tokio task per submitted
-  question that calls the same cancel path after the ttl expires
-  (cheap; rare). Surface remaining time on the dashboard.

 ## Spawn flow

@@ -114,6 +107,17 @@ Pick anything from here when relevant. Cross-cutting design notes live in

 ## Lifecycle / reliability

+- **journald viewer per container in the dashboard.** Surface the
+  equivalent of `journalctl -M h-coder -b` in the dashboard so the
+  operator can see container logs without ssh-ing in. Optional
+  filter by hive-specific systemd unit (`hive-ag3nt.service`,
+  `hive-m1nd.service`). Implementation: backend shells out to
+  `journalctl -M <container> -b --output=short-iso --no-pager`
+  (optionally `-u <unit>`), streams or paginates the result over a
+  new dashboard endpoint. Could be a `<details>` per container row
+  or a dedicated page. Honest journalctl, not the in-container
+  events stream — those are different surfaces (events = claude turn
+  loop; journalctl = systemd-wide logs incl. boot, network, etc.).
 - **Container crash events.** Watch `container@*.service` via D-Bus, push
  `HelperEvent::ContainerCrash` to the manager's inbox so the manager can
  react (restart, escalate, etc.).