manager: needs_login / logged_in / needs_update events + update tool
crash_watch grows two more state-axes alongside running/stopped: - logged-in (claude session dir populated for the agent) - up-to-date (recorded flake rev matches current) per-tick transitions emit HelperEvent::NeedsLogin / LoggedIn / NeedsUpdate. seed-on-first-tick semantics retained — nothing fires on harness boot for agents that were already in their state. only needs_update fires the 'stale appeared' direction; the resolved direction is already covered by Rebuilt. new mcp__hyperhive__update(name) on the manager surface: idempotent rebuild via auto_update::rebuild_agent. transient-aware (Rebuilding) so the dashboard shows the spinner. login intentionally has NO tool — it's interactive OAuth, only the operator can complete it. prompts + approvals doc + turn-loop doc updated. todo grows a 'show per-agent applied config in dashboard' entry (separate follow-up).
This commit is contained in:
parent
b374f39b0d
commit
80229c6af9
8 changed files with 230 additions and 34 deletions
9
TODO.md
9
TODO.md
|
|
@ -44,6 +44,15 @@ Pick anything from here when relevant. Cross-cutting design notes live in
|
||||||
|
|
||||||
## UI / UX
|
## UI / UX
|
||||||
|
|
||||||
|
- **Dashboard: show per-agent applied config.** Surface
|
||||||
|
`/var/lib/hyperhive/applied/<name>/agent.nix` (the file the
|
||||||
|
container actually builds from) as a collapsible `<details>`
|
||||||
|
block on each container row, alongside the journald viewer.
|
||||||
|
Backend: new `GET /api/agent-config/{name}` returns the file
|
||||||
|
contents (text/plain). Frontend: lazy-fetch on expand, render
|
||||||
|
inside a `<pre>` with the same theming as the journal panel.
|
||||||
|
Useful for spot-checking what `request_apply_commit` produced
|
||||||
|
without ssh-ing in.
|
||||||
- **xterm.js terminal** embedded per-agent, attached to a PTY exposed by
|
- **xterm.js terminal** embedded per-agent, attached to a PTY exposed by
|
||||||
the harness. Pairs well with the unprivileged-container work — would let
|
the harness. Pairs well with the unprivileged-container work — would let
|
||||||
the operator drop into the container without `nixos-container root-login`.
|
the operator drop into the container without `nixos-container root-login`.
|
||||||
|
|
|
||||||
|
|
@ -129,6 +129,14 @@ regular claude turn so the manager can react. Variants
|
||||||
running container went away with no operator-initiated transient
|
running container went away with no operator-initiated transient
|
||||||
state (Stopping / Restarting / Destroying / Rebuilding). Manager
|
state (Stopping / Restarting / Destroying / Rebuilding). Manager
|
||||||
can `start` it again or escalate.
|
can `start` it again or escalate.
|
||||||
|
- `NeedsLogin { agent }` — sub-agent has no claude session yet.
|
||||||
|
Manager can't act directly (interactive OAuth); typically flags
|
||||||
|
the operator.
|
||||||
|
- `LoggedIn { agent }` — sub-agent just completed login. Manager
|
||||||
|
often greets the agent on this event.
|
||||||
|
- `NeedsUpdate { agent }` — sub-agent's recorded flake rev is
|
||||||
|
stale. Manager calls `update(name)` to rebuild — idempotent,
|
||||||
|
no approval required.
|
||||||
- `OperatorAnswered { id, question, answer }` — dashboard
|
- `OperatorAnswered { id, question, answer }` — dashboard
|
||||||
`/answer-question/{id}` after the operator submits the answer
|
`/answer-question/{id}` after the operator submits the answer
|
||||||
form.
|
form.
|
||||||
|
|
|
||||||
|
|
@ -100,6 +100,9 @@ it as a stdio child via `--mcp-config`. The hyperhive socket name is
|
||||||
- `kill(name)` — graceful stop. No approval required.
|
- `kill(name)` — graceful stop. No approval required.
|
||||||
- `start(name)` — start a stopped sub-agent. No approval.
|
- `start(name)` — start a stopped sub-agent. No approval.
|
||||||
- `restart(name)` — stop + start. No approval.
|
- `restart(name)` — stop + start. No approval.
|
||||||
|
- `update(name)` — rebuild (re-applies the current hyperhive flake
|
||||||
|
+ agent.nix, restarts). No approval, idempotent. Manager calls
|
||||||
|
this on receipt of a `needs_update` system event.
|
||||||
- `request_apply_commit(agent, commit_ref)` — submit a config
|
- `request_apply_commit(agent, commit_ref)` — submit a config
|
||||||
change for any agent (`hm1nd` for the manager's own config) for
|
change for any agent (`hm1nd` for the manager's own config) for
|
||||||
operator approval.
|
operator approval.
|
||||||
|
|
|
||||||
|
|
@ -8,6 +8,7 @@ Tools (hyperhive surface):
|
||||||
- `mcp__hyperhive__kill(name)` — graceful stop on a sub-agent. No approval required.
|
- `mcp__hyperhive__kill(name)` — graceful stop on a sub-agent. No approval required.
|
||||||
- `mcp__hyperhive__start(name)` — start a stopped sub-agent. No approval required.
|
- `mcp__hyperhive__start(name)` — start a stopped sub-agent. No approval required.
|
||||||
- `mcp__hyperhive__restart(name)` — stop + start a sub-agent. No approval required.
|
- `mcp__hyperhive__restart(name)` — stop + start a sub-agent. No approval required.
|
||||||
|
- `mcp__hyperhive__update(name)` — rebuild a sub-agent (re-applies the current hyperhive flake + agent.nix, restarts the container). No approval required — idempotent. Use when you receive a `needs_update` system event.
|
||||||
- `mcp__hyperhive__request_apply_commit(agent, commit_ref)` — submit a config change for any agent (`hm1nd` for self) for operator approval.
|
- `mcp__hyperhive__request_apply_commit(agent, commit_ref)` — submit a config change for any agent (`hm1nd` for self) for operator approval.
|
||||||
- `mcp__hyperhive__ask_operator(question, options?, multi?, ttl_seconds?)` — surface a question on the dashboard. Returns immediately with a question id; the operator's answer arrives later as a system `operator_answered` event in your inbox. Options are advisory: the dashboard always lets the operator type a free-text answer in addition. Set `multi: true` to render options as checkboxes (operator can pick multiple); the answer comes back as `, `-separated. Set `ttl_seconds` to auto-cancel after a deadline — useful when the decision becomes moot if the operator hasn't responded in time; on expiry the answer is `[expired]`. Do not poll inside the same turn — finish the current work and react when the event lands.
|
- `mcp__hyperhive__ask_operator(question, options?, multi?, ttl_seconds?)` — surface a question on the dashboard. Returns immediately with a question id; the operator's answer arrives later as a system `operator_answered` event in your inbox. Options are advisory: the dashboard always lets the operator type a free-text answer in addition. Set `multi: true` to render options as checkboxes (operator can pick multiple); the answer comes back as `, `-separated. Set `ttl_seconds` to auto-cancel after a deadline — useful when the decision becomes moot if the operator hasn't responded in time; on expiry the answer is `[expired]`. Do not poll inside the same turn — finish the current work and react when the event lands.
|
||||||
|
|
||||||
|
|
@ -26,7 +27,13 @@ You're the policy gate between sub-agents and the operator's approval queue —
|
||||||
|
|
||||||
Two ways to talk to the operator: `send(to: "operator", ...)` for fire-and-forget status / pointers (surfaces in the operator inbox), or `ask_operator(question, options?)` when you need a decision. `ask_operator` is non-blocking — it queues the question and returns an id immediately; the answer arrives on a future turn as an `operator_answered` system event. Prefer `ask_operator` over an open-ended `send` for anything you actually need to wait on.
|
Two ways to talk to the operator: `send(to: "operator", ...)` for fire-and-forget status / pointers (surfaces in the operator inbox), or `ask_operator(question, options?)` when you need a decision. `ask_operator` is non-blocking — it queues the question and returns an id immediately; the answer arrives on a future turn as an `operator_answered` system event. Prefer `ask_operator` over an open-ended `send` for anything you actually need to wait on.
|
||||||
|
|
||||||
Messages from sender `system` are hyperhive helper events (JSON body, `event` field discriminates): `approval_resolved`, `spawned`, `rebuilt`, `killed`, `destroyed`, `container_crash`, `operator_answered`. Use these to react to lifecycle changes — e.g. greet a freshly-spawned agent, retry a failed rebuild, restart an agent whose container crashed, or pick up the operator's answer to a question you previously asked.
|
Messages from sender `system` are hyperhive helper events (JSON body, `event` field discriminates): `approval_resolved`, `spawned`, `rebuilt`, `killed`, `destroyed`, `container_crash`, `needs_login`, `logged_in`, `needs_update`, `operator_answered`. Use these to react to lifecycle changes:
|
||||||
|
|
||||||
|
- `needs_login` — agent has no claude session yet. You can't help directly (login is interactive OAuth on the operator side); flag the operator if it's been long.
|
||||||
|
- `logged_in` — agent just completed login; first useful turn is imminent. Good time to brief them on what to do.
|
||||||
|
- `needs_update` — agent's flake rev is stale. Call `update(name)` to rebuild — it's idempotent and doesn't need approval.
|
||||||
|
- `container_crash` — restart with `start(name)`. If it crashes again, ask the operator.
|
||||||
|
- otherwise greet freshly-spawned agents, retry failed rebuilds, pick up the operator's answer to questions you asked.
|
||||||
|
|
||||||
Durable knowledge:
|
Durable knowledge:
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -240,6 +240,12 @@ pub struct RestartArgs {
|
||||||
pub name: String,
|
pub name: String,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, serde::Deserialize, schemars::JsonSchema)]
|
||||||
|
pub struct UpdateArgs {
|
||||||
|
/// Sub-agent name (without the `h-` container prefix).
|
||||||
|
pub name: String,
|
||||||
|
}
|
||||||
|
|
||||||
#[derive(Debug, serde::Deserialize, schemars::JsonSchema)]
|
#[derive(Debug, serde::Deserialize, schemars::JsonSchema)]
|
||||||
pub struct AskOperatorArgs {
|
pub struct AskOperatorArgs {
|
||||||
/// The question to surface on the dashboard.
|
/// The question to surface on the dashboard.
|
||||||
|
|
@ -400,6 +406,23 @@ impl ManagerServer {
|
||||||
.await
|
.await
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#[tool(
|
||||||
|
description = "Rebuild a sub-agent: re-applies the current hyperhive flake + agent.nix \
|
||||||
|
and restarts the container. No approval required — idempotent. Use when you receive a \
|
||||||
|
`needs_update` system event for an agent."
|
||||||
|
)]
|
||||||
|
async fn update(&self, Parameters(args): Parameters<UpdateArgs>) -> String {
|
||||||
|
let log = format!("{args:?}");
|
||||||
|
let name = args.name.clone();
|
||||||
|
run_tool_envelope("update", log, async move {
|
||||||
|
let resp = self
|
||||||
|
.dispatch(hive_sh4re::ManagerRequest::Update { name: args.name })
|
||||||
|
.await;
|
||||||
|
format_ack(resp, "update", format!("updated {name}"))
|
||||||
|
})
|
||||||
|
.await
|
||||||
|
}
|
||||||
|
|
||||||
#[tool(
|
#[tool(
|
||||||
description = "Surface a question to the operator on the dashboard. Returns immediately \
|
description = "Surface a question to the operator on the dashboard. Returns immediately \
|
||||||
with a question id — do NOT wait inline. When the operator answers, a system message \
|
with a question id — do NOT wait inline. When the operator answers, a system message \
|
||||||
|
|
@ -512,6 +535,7 @@ pub fn allowed_mcp_tools(flavor: Flavor) -> Vec<String> {
|
||||||
"kill",
|
"kill",
|
||||||
"start",
|
"start",
|
||||||
"restart",
|
"restart",
|
||||||
|
"update",
|
||||||
"request_apply_commit",
|
"request_apply_commit",
|
||||||
"ask_operator",
|
"ask_operator",
|
||||||
],
|
],
|
||||||
|
|
|
||||||
|
|
@ -1,15 +1,22 @@
|
||||||
//! Container crash watcher. Polls every managed container's running
|
//! Per-container state watcher. Polls every managed container on a
|
||||||
//! state on a fixed interval; when a previously-running container is
|
//! fixed interval, tracks three orthogonal state-sets across ticks,
|
||||||
//! suddenly stopped AND no operator-initiated transient (`Stopping`,
|
//! and emits a `HelperEvent` to the manager on each transition:
|
||||||
//! `Restarting`, `Destroying`) was set, fire `HelperEvent::ContainerCrash`
|
|
||||||
//! into the manager's inbox. The manager can then react — usually
|
|
||||||
//! a `start` or a config rebuild.
|
|
||||||
//!
|
//!
|
||||||
//! D-Bus subscription would be lower-latency, but polling is far
|
//! - **running**: container is up. running → stopped without an
|
||||||
//! simpler and the failure modes are honest (a crash discovered 10s
|
//! operator-initiated transient (`Stopping` / `Restarting` /
|
||||||
//! late is fine for our scale).
|
//! `Destroying` / `Rebuilding`) → `ContainerCrash`.
|
||||||
|
//! - **logged-in**: claude session dir is populated. ! → ✓ →
|
||||||
|
//! `LoggedIn`; ✓ → ! → `NeedsLogin` (rare — usually only fires
|
||||||
|
//! on a fresh spawn / purge).
|
||||||
|
//! - **up-to-date**: agent's recorded flake rev matches current. ✓
|
||||||
|
//! → ! → `NeedsUpdate`. The reverse direction (`NeedsUpdate`
|
||||||
|
//! resolved) is covered by `Rebuilt`, so no separate event.
|
||||||
|
//!
|
||||||
|
//! D-Bus subscription would be lower-latency for the first axis,
|
||||||
|
//! but polling is simpler and a 10s detection delay is fine.
|
||||||
|
|
||||||
use std::collections::HashSet;
|
use std::collections::HashSet;
|
||||||
|
use std::path::Path;
|
||||||
use std::sync::Arc;
|
use std::sync::Arc;
|
||||||
use std::time::Duration;
|
use std::time::Duration;
|
||||||
|
|
||||||
|
|
@ -20,14 +27,17 @@ const POLL_INTERVAL: Duration = Duration::from_secs(10);
|
||||||
|
|
||||||
pub fn spawn(coord: Arc<Coordinator>) {
|
pub fn spawn(coord: Arc<Coordinator>) {
|
||||||
tokio::spawn(async move {
|
tokio::spawn(async move {
|
||||||
// Seed the running-set from the first poll so we don't emit a
|
|
||||||
// crash for every agent on startup. First tick fills it; only
|
|
||||||
// running→stopped transitions across subsequent ticks count.
|
|
||||||
let mut prev_running: HashSet<String> = HashSet::new();
|
let mut prev_running: HashSet<String> = HashSet::new();
|
||||||
|
let mut prev_logged_in: HashSet<String> = HashSet::new();
|
||||||
|
let mut prev_updated: HashSet<String> = HashSet::new();
|
||||||
let mut seeded = false;
|
let mut seeded = false;
|
||||||
loop {
|
loop {
|
||||||
let raw = lifecycle::list().await.unwrap_or_default();
|
let raw = lifecycle::list().await.unwrap_or_default();
|
||||||
|
let current_rev = crate::auto_update::current_flake_rev(&coord.hyperhive_flake);
|
||||||
let mut current_running = HashSet::new();
|
let mut current_running = HashSet::new();
|
||||||
|
let mut current_logged_in = HashSet::new();
|
||||||
|
let mut current_updated = HashSet::new();
|
||||||
|
let mut sub_agents: Vec<String> = Vec::new();
|
||||||
for c in &raw {
|
for c in &raw {
|
||||||
let logical = if c == MANAGER_NAME {
|
let logical = if c == MANAGER_NAME {
|
||||||
MANAGER_NAME.to_owned()
|
MANAGER_NAME.to_owned()
|
||||||
|
|
@ -36,14 +46,42 @@ pub fn spawn(coord: Arc<Coordinator>) {
|
||||||
} else {
|
} else {
|
||||||
continue;
|
continue;
|
||||||
};
|
};
|
||||||
|
if logical != MANAGER_NAME {
|
||||||
|
sub_agents.push(logical.clone());
|
||||||
|
}
|
||||||
if lifecycle::is_running(&logical).await {
|
if lifecycle::is_running(&logical).await {
|
||||||
current_running.insert(logical);
|
current_running.insert(logical.clone());
|
||||||
|
}
|
||||||
|
if logical != MANAGER_NAME
|
||||||
|
&& claude_has_session(&Coordinator::agent_claude_dir(&logical))
|
||||||
|
{
|
||||||
|
current_logged_in.insert(logical.clone());
|
||||||
|
}
|
||||||
|
if let Some(rev) = current_rev.as_deref()
|
||||||
|
&& !crate::auto_update::agent_needs_update(&logical, rev)
|
||||||
|
{
|
||||||
|
current_updated.insert(logical.clone());
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if seeded {
|
if seeded {
|
||||||
|
emit_crash_transitions(&coord, &prev_running, ¤t_running);
|
||||||
|
emit_login_transitions(&coord, &prev_logged_in, ¤t_logged_in, &sub_agents);
|
||||||
|
emit_update_transitions(&coord, &prev_updated, ¤t_updated, &sub_agents);
|
||||||
|
}
|
||||||
|
prev_running = current_running;
|
||||||
|
prev_logged_in = current_logged_in;
|
||||||
|
prev_updated = current_updated;
|
||||||
|
seeded = true;
|
||||||
|
|
||||||
|
tokio::time::sleep(POLL_INTERVAL).await;
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
fn emit_crash_transitions(coord: &Coordinator, prev: &HashSet<String>, current: &HashSet<String>) {
|
||||||
let transients = coord.transient_snapshot();
|
let transients = coord.transient_snapshot();
|
||||||
for stopped in prev_running.difference(¤t_running) {
|
for stopped in prev.difference(current) {
|
||||||
let deliberate = transients.get(stopped).is_some_and(|st| {
|
let deliberate = transients.get(stopped).is_some_and(|st| {
|
||||||
matches!(
|
matches!(
|
||||||
st.kind,
|
st.kind,
|
||||||
|
|
@ -63,10 +101,76 @@ pub fn spawn(coord: Arc<Coordinator>) {
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
prev_running = current_running;
|
|
||||||
seeded = true;
|
|
||||||
|
|
||||||
tokio::time::sleep(POLL_INTERVAL).await;
|
fn emit_login_transitions(
|
||||||
}
|
coord: &Coordinator,
|
||||||
|
prev: &HashSet<String>,
|
||||||
|
current: &HashSet<String>,
|
||||||
|
sub_agents: &[String],
|
||||||
|
) {
|
||||||
|
for agent in current.difference(prev) {
|
||||||
|
tracing::info!(%agent, "agent logged in");
|
||||||
|
coord.notify_manager(&hive_sh4re::HelperEvent::LoggedIn {
|
||||||
|
agent: agent.clone(),
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
// Only count NeedsLogin transitions for agents that exist and
|
||||||
|
// are *not* logged in — the difference set above already gives
|
||||||
|
// us "was in prev, gone from current" but we also want to fire
|
||||||
|
// for agents that newly appeared as not-logged-in (post-spawn /
|
||||||
|
// post-purge). Treat sub_agents minus current as the
|
||||||
|
// currently-needs-login set; emit when an agent enters it.
|
||||||
|
let prev_needs: HashSet<&str> = sub_agents
|
||||||
|
.iter()
|
||||||
|
.map(String::as_str)
|
||||||
|
.filter(|n| !prev.contains(*n))
|
||||||
|
.collect();
|
||||||
|
let current_needs: HashSet<&str> = sub_agents
|
||||||
|
.iter()
|
||||||
|
.map(String::as_str)
|
||||||
|
.filter(|n| !current.contains(*n))
|
||||||
|
.collect();
|
||||||
|
for agent in current_needs.difference(&prev_needs) {
|
||||||
|
tracing::info!(%agent, "agent needs login");
|
||||||
|
coord.notify_manager(&hive_sh4re::HelperEvent::NeedsLogin {
|
||||||
|
agent: (*agent).to_owned(),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn emit_update_transitions(
|
||||||
|
coord: &Coordinator,
|
||||||
|
prev_updated: &HashSet<String>,
|
||||||
|
current_updated: &HashSet<String>,
|
||||||
|
sub_agents: &[String],
|
||||||
|
) {
|
||||||
|
// Fired on the "was up-to-date, now isn't" transition. The
|
||||||
|
// reverse (rebuilt) is already covered by HelperEvent::Rebuilt.
|
||||||
|
let prev_stale: HashSet<&str> = sub_agents
|
||||||
|
.iter()
|
||||||
|
.map(String::as_str)
|
||||||
|
.filter(|n| !prev_updated.contains(*n))
|
||||||
|
.collect();
|
||||||
|
let current_stale: HashSet<&str> = sub_agents
|
||||||
|
.iter()
|
||||||
|
.map(String::as_str)
|
||||||
|
.filter(|n| !current_updated.contains(*n))
|
||||||
|
.collect();
|
||||||
|
for agent in current_stale.difference(&prev_stale) {
|
||||||
|
tracing::info!(%agent, "agent needs update");
|
||||||
|
coord.notify_manager(&hive_sh4re::HelperEvent::NeedsUpdate {
|
||||||
|
agent: (*agent).to_owned(),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Mirrors `dashboard::claude_has_session`. Lives here too so the
|
||||||
|
/// watcher doesn't depend on dashboard internals.
|
||||||
|
fn claude_has_session(dir: &Path) -> bool {
|
||||||
|
let Ok(entries) = std::fs::read_dir(dir) else {
|
||||||
|
return false;
|
||||||
|
};
|
||||||
|
entries
|
||||||
|
.flatten()
|
||||||
|
.any(|e| e.file_type().is_ok_and(|t| t.is_file()))
|
||||||
|
}
|
||||||
|
|
|
||||||
|
|
@ -204,6 +204,27 @@ async fn dispatch(req: &ManagerRequest, coord: &Arc<Coordinator>) -> ManagerResp
|
||||||
},
|
},
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
ManagerRequest::Update { name } => {
|
||||||
|
tracing::info!(%name, "manager: update");
|
||||||
|
let Some(current_rev) = crate::auto_update::current_flake_rev(&coord.hyperhive_flake)
|
||||||
|
else {
|
||||||
|
return ManagerResponse::Err {
|
||||||
|
message: "update: hyperhive_flake has no canonical path".into(),
|
||||||
|
};
|
||||||
|
};
|
||||||
|
coord.set_transient(name, crate::coordinator::TransientKind::Rebuilding);
|
||||||
|
let result = crate::auto_update::rebuild_agent(coord, name, ¤t_rev).await;
|
||||||
|
coord.clear_transient(name);
|
||||||
|
match result {
|
||||||
|
Ok(()) => {
|
||||||
|
coord.kick_agent(name, "container rebuilt");
|
||||||
|
ManagerResponse::Ok
|
||||||
|
}
|
||||||
|
Err(e) => ManagerResponse::Err {
|
||||||
|
message: format!("{e:#}"),
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
ManagerRequest::AskOperator {
|
ManagerRequest::AskOperator {
|
||||||
question,
|
question,
|
||||||
options,
|
options,
|
||||||
|
|
|
||||||
|
|
@ -259,6 +259,20 @@ pub enum HelperEvent {
|
||||||
/// A sub-agent's container was torn down (container removed; state
|
/// A sub-agent's container was torn down (container removed; state
|
||||||
/// dirs preserved per `destroy` semantics).
|
/// dirs preserved per `destroy` semantics).
|
||||||
Destroyed { agent: String },
|
Destroyed { agent: String },
|
||||||
|
/// A sub-agent's container has no claude session yet (first
|
||||||
|
/// spawn, or `--purge` wiped creds). Manager can't do anything
|
||||||
|
/// about it directly — login is interactive OAuth — but it
|
||||||
|
/// surfaces so the manager knows the agent is in partial-run
|
||||||
|
/// mode and can flag the operator.
|
||||||
|
NeedsLogin { agent: String },
|
||||||
|
/// An agent successfully completed claude login — the session
|
||||||
|
/// dir now contains creds. Transition fires once per login.
|
||||||
|
LoggedIn { agent: String },
|
||||||
|
/// An agent's recorded flake rev is stale relative to the
|
||||||
|
/// current hyperhive rev. The manager has the `update` tool to
|
||||||
|
/// trigger a rebuild without operator approval (it's a no-op
|
||||||
|
/// when nothing actually changed).
|
||||||
|
NeedsUpdate { agent: String },
|
||||||
/// Container exited without an operator-initiated stop. Fired by
|
/// Container exited without an operator-initiated stop. Fired by
|
||||||
/// the crash watcher when an agent's container transitions from
|
/// the crash watcher when an agent's container transitions from
|
||||||
/// running → stopped and no `Stopping` / `Restarting` /
|
/// running → stopped and no `Stopping` / `Restarting` /
|
||||||
|
|
@ -326,6 +340,12 @@ pub enum ManagerRequest {
|
||||||
Restart {
|
Restart {
|
||||||
name: String,
|
name: String,
|
||||||
},
|
},
|
||||||
|
/// Rebuild a sub-agent: re-applies the current hyperhive flake +
|
||||||
|
/// agent.nix, restarts the container. No approval required —
|
||||||
|
/// it's idempotent and the manager owns its own update cadence.
|
||||||
|
Update {
|
||||||
|
name: String,
|
||||||
|
},
|
||||||
/// Submit a config commit for the user to approve. `commit_ref` is opaque
|
/// Submit a config commit for the user to approve. `commit_ref` is opaque
|
||||||
/// to the host (typically a git sha pointing into the agent's config repo).
|
/// to the host (typically a git sha pointing into the agent's config repo).
|
||||||
/// On approval the host applies the change via `nixos-container update`.
|
/// On approval the host applies the change via `nixos-container update`.
|
||||||
|
|
|
||||||
Loading…
Add table
Add a link
Reference in a new issue