container crash watcher → HelperEvent::ContainerCrash
new hive_c0re::crash_watch task polls every 10s, builds the set of currently-running containers, and on running→stopped transitions checks the transient snapshot: if no Stopping / Restarting / Destroying / Rebuilding flag is set, the container exited unexpectedly and we fire HelperEvent::ContainerCrash into the manager's inbox so it can react (typically: start it again). first poll is a seeding pass: no events on harness startup.

dbus subscription would be lower-latency but polling is honest and debuggable, and a 10s delay on crash detection is fine for our scale.

manager prompt + approvals doc updated to advertise the new event variant. todo drops the entry (and the journald-viewer entry that already shipped).
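a minimal sketch of the transition check described above, not the actual hive_c0re::crash_watch code. the `CrashWatch` struct, the `Transient` enum, the `name` field on the event, and the `poll` signature are all hypothetical; the real task also runs on a 10s timer and reads the running set from the container runtime, which this sketch leaves out.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the transient lifecycle flags in the snapshot.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Transient {
    Stopping,
    Restarting,
    Destroying,
    Rebuilding,
}

/// Hypothetical shape of the event; the real variant's fields are not shown in this commit.
#[derive(Debug, PartialEq, Eq)]
enum HelperEvent {
    ContainerCrash { name: String },
}

/// Remembers the running set seen on the previous poll.
#[derive(Default)]
struct CrashWatch {
    // None until the first (seeding) poll has completed.
    prev_running: Option<HashSet<String>>,
}

impl CrashWatch {
    /// One poll: compare the current running set with the previous one and
    /// report every container that stopped without a transient flag set.
    fn poll(
        &mut self,
        running: HashSet<String>,
        transient: &HashMap<String, Transient>,
    ) -> Vec<HelperEvent> {
        let mut events = Vec::new();
        // The first poll only seeds prev_running, so it never emits events.
        if let Some(prev) = &self.prev_running {
            // Containers that were running last time but are not running now.
            let mut gone: Vec<&String> = prev.difference(&running).collect();
            gone.sort(); // deterministic ordering of emitted events
            for name in gone {
                // Any transient flag means the stop was requested, not a crash.
                if !transient.contains_key(name) {
                    events.push(HelperEvent::ContainerCrash { name: name.clone() });
                }
            }
        }
        self.prev_running = Some(running);
        events
    }
}
```

in this sketch a container that disappears while flagged (for example Stopping) is silently dropped from the tracked set, so it only produces an event if it is seen running again and later stops without a flag.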
parent 6db38cf70c
commit 58c3cd853b
6 changed files with 92 additions and 7 deletions
6
TODO.md
@@ -99,9 +99,3 @@ Pick anything from here when relevant. Cross-cutting design notes live in
 that takes the existing notes + a "compact this" prompt and rewrites
 them in place. Add when the notes start bloating.
 
-## Lifecycle / reliability
-
-- **Container crash events.** Watch `container@*.service` via D-Bus, push
-`HelperEvent::ContainerCrash` to the manager's inbox so the manager can
-react (restart, escalate, etc.).
-