manager: needs_login / logged_in / needs_update events + update tool

crash_watch grows two more state-axes alongside running/stopped:

- logged-in (claude session dir populated for the agent)
- up-to-date (recorded flake rev matches current)

per-tick transitions emit HelperEvent::NeedsLogin / LoggedIn /
NeedsUpdate. seed-on-first-tick semantics retained — nothing fires
on harness boot for agents that were already in their state. on the
up-to-date axis only the 'stale appeared' direction fires (as
NeedsUpdate); the resolved direction is already covered by Rebuilt.
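
a minimal sketch of the seed-on-first-tick edge detection described above. the `Observed` struct, the `Watch` type and its `tick` signature are hypothetical, and the trimmed `HelperEvent` enum only mirrors the three new variants; the real crash_watch internals are not shown in this commit. the sketch also assumes seeding happens per agent on its first observation.

```rust
use std::collections::HashMap;

/// Hypothetical snapshot of the two new state axes for one agent.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Observed {
    pub logged_in: bool,
    pub up_to_date: bool,
}

/// Trimmed stand-in for the real enum; only the new variants appear here.
#[derive(Debug, PartialEq, Eq)]
pub enum HelperEvent {
    NeedsLogin { agent: String },
    LoggedIn { agent: String },
    NeedsUpdate { agent: String },
}

/// Hypothetical watcher state: the last observation per agent.
#[derive(Default)]
pub struct Watch {
    last: HashMap<String, Observed>,
}

impl Watch {
    /// Record `now` for `agent` and return the events for this tick.
    /// The first observation of an agent only seeds the map (assumption),
    /// so nothing fires for agents that were already in their state.
    pub fn tick(&mut self, agent: &str, now: Observed) -> Vec<HelperEvent> {
        let mut events = Vec::new();
        // `insert` returns the previous value, or None on the first tick.
        if let Some(prev) = self.last.insert(agent.to_string(), now) {
            if prev.logged_in && !now.logged_in {
                events.push(HelperEvent::NeedsLogin { agent: agent.to_string() });
            }
            if !prev.logged_in && now.logged_in {
                events.push(HelperEvent::LoggedIn { agent: agent.to_string() });
            }
            // Only the "stale appeared" direction fires on this axis;
            // the resolved direction is left to Rebuilt.
            if prev.up_to_date && !now.up_to_date {
                events.push(HelperEvent::NeedsUpdate { agent: agent.to_string() });
            }
        }
        events
    }
}
```

calling `tick` twice with the same observation yields no events on either call; events appear only when an axis flips between two consecutive ticks.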

new mcp__hyperhive__update(name) on the manager surface: idempotent
rebuild via auto_update::rebuild_agent. transient-aware (Rebuilding)
so the dashboard shows the spinner. login intentionally has NO tool
— it's interactive OAuth, only the operator can complete it.
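
a rough sketch of what "idempotent, transient-aware" could look like. everything here is hypothetical: the `Host` struct, its fields, the `Outcome` enum and the rev-comparison shortcut are assumptions for illustration, and the real work is done by auto_update::rebuild_agent, whose signature is not shown in this commit.

```rust
use std::collections::HashMap;

/// Hypothetical transient marker the dashboard could render as a spinner.
#[derive(Debug, PartialEq, Eq)]
pub enum Transient {
    Rebuilding,
}

/// Hypothetical result of an update call.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Rebuilt,
    AlreadyCurrent,
}

/// Hypothetical host-side state: current rev plus per-agent bookkeeping.
pub struct Host {
    pub current_rev: String,
    pub recorded: HashMap<String, String>,
    pub transient: HashMap<String, Transient>,
}

impl Host {
    /// Idempotent update: calling it again after a successful rebuild
    /// changes nothing and reports AlreadyCurrent.
    pub fn update(&mut self, name: &str) -> Result<Outcome, String> {
        let recorded = self
            .recorded
            .get(name)
            .ok_or_else(|| format!("unknown agent: {name}"))?;
        if *recorded == self.current_rev {
            return Ok(Outcome::AlreadyCurrent);
        }
        // Mark the agent as rebuilding for the duration of the work.
        self.transient.insert(name.to_string(), Transient::Rebuilding);
        // Placeholder for the actual rebuild + container restart.
        self.recorded.insert(name.to_string(), self.current_rev.clone());
        self.transient.remove(name);
        Ok(Outcome::Rebuilt)
    }
}
```

the point of the shape is that the manager can call update freely: a second call against an already-current agent is cheap and has no side effects, which is why no approval step is needed.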

prompts + approvals doc + turn-loop doc updated. todo grows a
'show per-agent applied config in dashboard' entry (separate
follow-up).
müde 2026-05-15 21:42:13 +02:00
parent b374f39b0d
commit 80229c6af9
8 changed files with 230 additions and 34 deletions

@@ -259,6 +259,20 @@ pub enum HelperEvent {
/// A sub-agent's container was torn down (container removed; state
/// dirs preserved per `destroy` semantics).
Destroyed { agent: String },
/// A sub-agent's container has no claude session yet (first
/// spawn, or `--purge` wiped creds). Manager can't do anything
/// about it directly — login is interactive OAuth — but it
/// surfaces so the manager knows the agent is in partial-run
/// mode and can flag the operator.
NeedsLogin { agent: String },
/// An agent successfully completed claude login — the session
/// dir now contains creds. Transition fires once per login.
LoggedIn { agent: String },
/// An agent's recorded flake rev is stale relative to the
/// current hyperhive rev. The manager has the `update` tool to
/// trigger a rebuild without operator approval (it's a no-op
/// when nothing actually changed).
NeedsUpdate { agent: String },
/// Container exited without an operator-initiated stop. Fired by
/// the crash watcher when an agent's container transitions from
/// running → stopped and no `Stopping` / `Restarting` /
@@ -326,6 +340,12 @@ pub enum ManagerRequest {
Restart {
name: String,
},
/// Rebuild a sub-agent: re-applies the current hyperhive flake +
/// agent.nix, restarts the container. No approval required —
/// it's idempotent and the manager owns its own update cadence.
Update {
name: String,
},
/// Submit a config commit for the user to approve. `commit_ref` is opaque
/// to the host (typically a git sha pointing into the agent's config repo).
/// On approval the host applies the change via `nixos-container update`.