topology: meta-repo agent hierarchy + ContainerView.parent (#361)
parent e931c08739
commit 0b03d5bcfb
6 changed files with 403 additions and 3 deletions
@@ -82,6 +82,14 @@ hive-c0re/ host daemon + CLI (one binary, subcommand-dispatched)
                      prepare/finalize/abort, lock_update_*
   src/migrate.rs     startup auto-migration from pre-meta layout
                      (idempotent, marker-guarded phase 4)
+  src/topology.rs    agent parent/child storage at
+                     /var/lib/hyperhive/meta/topology.json — sole
+                     source of truth for who's the parent of whom
+                     (single source the dashboard, render_flake,
+                     and the eventual cap-enforcement plumbing all
+                     read). Reconciled by `meta::sync_agents`;
+                     operator/manager edits land via the
+                     eventual write API (#361 follow-ups).
   src/forge.rs       optional Forgejo wiring: per-agent users +
                      tokens, the `agent-configs` org (`push_config`),
                      and meta read access; mirrors each applied repo
@@ -171,6 +179,7 @@ docs/
   persistence.md          sqlite dbs, retention, state dir layout
   terminal-rendering.md   per-agent terminal row taxonomy (as built)
   boundary.md             operator/agent trust model rationale
+  agent-hierarchy.md      tree-shape topology design + manager-privilege audit (#361)
   damocles-migration.md   future migration plan for damocles → hyperhive
 ```

docs/agent-hierarchy.md (new file, 178 lines)
@@ -0,0 +1,178 @@
# Agent hierarchy & privileges

Design + audit doc for milestone #6 (the
[issue](http://localhost:3000/hyperhive/hyperhive/issues/361) tree).
The implementation lands in pieces; this doc tracks what's done, what's
planned, and what currently special-cases the manager.

## Current state (as of this PR)

Topology lives in the hive-c0re-owned **meta repo**, alongside
`flake.nix`, at `/var/lib/hyperhive/meta/topology.json`:

```json
{
  "manager": null,
  "alice": "manager",
  "bob": "alice"
}
```

`null` = root-level agent. Today only the manager qualifies. Other
agents default to `"manager"` as parent on first sync. Operator/manager
re-parenting via the write API + dashboard UI lands in a follow-up.
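
The defaulting rule can be sketched as a pure function over the parent map. This is an illustration only, not the code in this PR: `reconcile_in_memory`, the `Topology` alias, and the `root_name` parameter are hypothetical names, and the sketch assumes the root agent is keyed as `"manager"`, as in the JSON example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Same shape as `topology.json`: agent name -> parent name,
/// with `None` marking a root-level agent. (Hypothetical alias.)
type Topology = BTreeMap<String, Option<String>>;

/// Hypothetical in-memory reconcile: adds a default entry for every
/// live agent missing from the map, drops entries for agents that are
/// gone, and leaves existing entries untouched.
/// Returns true when the map changed.
fn reconcile_in_memory(topology: &mut Topology, live_agents: &[&str], root_name: &str) -> bool {
    let mut changed = false;
    for &name in live_agents {
        if !topology.contains_key(name) {
            // The root agent has no parent; everyone else defaults to the root.
            let parent = if name == root_name {
                None
            } else {
                Some(root_name.to_owned())
            };
            topology.insert(name.to_owned(), parent);
            changed = true;
        }
    }
    // Drop entries whose agent no longer exists.
    let live: BTreeSet<&str> = live_agents.iter().copied().collect();
    let before = topology.len();
    topology.retain(|name, _| live.contains(name.as_str()));
    changed || topology.len() != before
}
```

Because existing entries are never rewritten, a hand-edited parent (for example `bob` under `alice`) survives repeated calls, and a second call with the same agent set reports no change.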

### Why meta, not per-agent `agent.nix`

An agent shouldn't be able to claim a parent without that parent's
consent, and operator-driven re-parenting shouldn't require touching
the moved agent's config. Topology IS a system-level concern; meta is
where system-level facts live.

### Flow

1. **Read**: `topology::read()` parses `topology.json` into a
   `BTreeMap<String, Option<String>>`. Missing / unparsable file →
   empty map → every agent treated as root (safe degradation for
   fresh installs that haven't run `meta::sync_agents` yet).
2. **Reconcile**: `meta::sync_agents` calls `topology::reconcile`
   alongside its `flake.nix` regeneration. New agents land at their
   default position (manager as parent, manager itself as root);
   removed agents drop. Existing entries are preserved as-is so
   operator overrides stick across regenerations.
3. **Inject**: `meta::render_flake` looks up each agent's parent and
   passes it to `mkAgent`. When non-null, the mkAgent body sets
   `HIVE_PARENT = parent` in the agent's systemd service environment
   so the harness / claude prompts can see it.
4. **Surface**: `container_view::build_all` reads `topology.json` and
   populates `ContainerView.parent: Option<String>` on every rescan.
   The dashboard renders the field as a tree (#363 follow-up).
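
For the Surface step, one way a consumer could turn the flat parent map into an indented tree is sketched here. `render_tree` and the `Topology` alias are hypothetical names, not part of hive-c0re; the actual dashboard rendering is the #363 follow-up and may look quite different.

```rust
use std::collections::BTreeMap;

/// Same shape as `topology.json`. (Hypothetical alias.)
type Topology = BTreeMap<String, Option<String>>;

/// Render the parent map as an indented tree, one agent per line,
/// two spaces of indentation per level.
fn render_tree(topology: &Topology) -> String {
    // Invert child -> parent into parent -> children.
    let mut children: BTreeMap<Option<&str>, Vec<&str>> = BTreeMap::new();
    for (name, parent) in topology {
        children
            .entry(parent.as_deref())
            .or_default()
            .push(name.as_str());
    }

    // Depth-first walk with an explicit stack, starting at the roots.
    // Agents that sit in a cycle or under an unknown parent are never
    // reached from a root, so they are skipped and the walk terminates.
    let mut stack: Vec<(&str, usize)> = Vec::new();
    if let Some(roots) = children.get(&None::<&str>) {
        stack.extend(roots.iter().rev().map(|r| (*r, 0)));
    }

    let mut out = String::new();
    while let Some((name, depth)) = stack.pop() {
        out.push_str(&"  ".repeat(depth));
        out.push_str(name);
        out.push('\n');
        if let Some(kids) = children.get(&Some(name)) {
            // Push in reverse so siblings pop in sorted order.
            for kid in kids.iter().rev() {
                stack.push((*kid, depth + 1));
            }
        }
    }
    out
}
```

For the three-agent example map this yields `manager` at depth 0, `alice` at depth 1, and `bob` at depth 2.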

## Target topology semantics

Once enforcement lands the rules collapse into:

| operation | who can do it |
|---|---|
| `kill` / `start` / `restart` / `update` (any descendant) | any ancestor |
| `request_init_config` (spawn a new child) | any agent, child added under self |
| `request_apply_commit` (any descendant's config) | any ancestor |
| `get_logs` (any descendant) | any ancestor |
| moderate questions / reminders (cancel any open thread of a descendant) | any ancestor |
| `send` / `recv` routing | parent ↔ same-parent siblings ↔ self ↔ descendants; explicit allow-list for anyone else |
| `request_update_meta_inputs` (bump meta lock) | root agents only (today: just `manager`) |

"Ancestor" walks `ContainerView.parent` chains; cycles are guarded by a
visited-set at dispatch time (a malformed topology.json can't lock the
dispatcher into a loop).
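
A minimal sketch of such an ancestor check follows, assuming the parent map has the same shape as `topology.json`. Enforcement is not wired yet, so `is_ancestor` and the `Topology` alias are hypothetical names and the real dispatcher may differ.

```rust
use std::collections::{BTreeMap, HashSet};

/// Same shape as `topology.json`. (Hypothetical alias.)
type Topology = BTreeMap<String, Option<String>>;

/// Returns true when `candidate` appears somewhere on `agent`'s parent
/// chain. An agent is never treated as its own ancestor.
fn is_ancestor(topology: &Topology, candidate: &str, agent: &str) -> bool {
    if candidate == agent {
        return false;
    }
    let mut visited: HashSet<&str> = HashSet::new();
    let mut current = agent;
    // `insert` returns false on a repeat, so a cyclic map ends the walk
    // instead of looping forever.
    while visited.insert(current) {
        match topology.get(current).and_then(|p| p.as_deref()) {
            Some(parent) if parent == candidate => return true,
            Some(parent) => current = parent,
            // Root-level agent or missing entry: top of the chain.
            None => return false,
        }
    }
    false
}
```

With the example map, `manager` and `alice` are both ancestors of `bob`, while `bob` is an ancestor of nobody. A two-agent cycle (`a` under `b`, `b` under `a`) simply returns `false` for any outside candidate.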

## Current manager special-casings — the audit

What currently makes the manager different from every other agent, and
which axis the post-milestone version reads each special-case along:

### A — naming + bootstrap

- `MANAGER_AGENT = "manager"` (broker recipient name) and
  `MANAGER_NAME = "hm1nd"` (container name). ~28 grep hits across
  `hive-c0re/src/`. **Just a name** — the rename plan is `manager` →
  `root`, executed via the one-shot migration script in
  `migrate.rs` (idempotent, marker-guarded).
- `auto_update::ensure_manager` runs at hive-c0re boot and spawns
  `hm1nd` if missing. Becomes "ensure the root agent exists" once any
  agent can be at the root. **Topology**: root has no parent, so
  hive-c0re itself owns its lifecycle (no parent to delegate to).

### B — wire-protocol privileges

The `ManagerRequest::*` variants in `hive-sh4re/src/lib.rs` are
operations the manager flavour socket can make that sub-agent sockets
can't:

| variant | semantic | post-milestone |
|---|---|---|
| `RequestInitConfig` | seed an agent's proposed config repo | **topology** — descendants only |
| `RequestApplyCommit` | submit a commit sha for operator approval | **topology** — descendants only |
| `RequestSpawn` (deprecated) | shortcut for spawn | **topology** — descendants only |
| `Kill` / `Start` / `Restart` / `Update` | container lifecycle on an existing agent | **topology** — descendants only |
| `RequestUpdateMetaInputs` | bump meta `flake.lock` | **per-agent cap** (root-only today; a future "let coder bump its own input" might grant it) |
| `GetLogs` | journalctl scrape of a sub-agent | **topology** — descendants only |
| `Wake` | inject a `from: <X>` message into self's inbox | **not really privileged** — the wire surface exists because daemon co-processes (e.g. `forge_notify`) need it. Sub-agents have the same via their own socket. |

### C — storage / mounts (`hive-c0re::lifecycle`)

The manager container's nspawn bind set:

- `HOST_AGENTS_ROOT (/var/lib/hyperhive/agents) → /agents` RW — so the
  manager can edit any agent's proposed config repo
- `HOST_APPLIED_ROOT (/var/lib/hyperhive/applied) → /applied` RO — so
  the manager can diff against what's deployed
- `HOST_META_ROOT (/var/lib/hyperhive/meta) → /meta` RO — so the
  manager can read the system-wide deploy log

Tree-shape version:
- Each agent gets RW to `/agents/<descendant>/` for every descendant in
  its subtree. The root agent (today: manager) gets RW to the full
  forest as a special case of "the root has every other agent as a
  descendant".
- RO `/meta` access if the agent holds a "meta read" cap.
- `request_update_meta_inputs` is the only path that actually writes
  `flake.lock`, gated by the cap; everyone else stays RO.
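
To make the tree-shape mount rule concrete, here is a hedged sketch that derives the RW bind targets for one agent from the parent map. `descendants`, `rw_bind_targets`, and the `Topology` alias are hypothetical helpers for illustration; the actual nspawn bind wiring in `lifecycle` is not part of this PR.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Same shape as `topology.json`. (Hypothetical alias.)
type Topology = BTreeMap<String, Option<String>>;

/// Collect every agent in `agent`'s subtree, excluding `agent` itself.
fn descendants(topology: &Topology, agent: &str) -> BTreeSet<String> {
    let mut found: BTreeSet<String> = BTreeSet::new();
    let mut frontier: Vec<String> = vec![agent.to_owned()];
    while let Some(current) = frontier.pop() {
        for (name, parent) in topology {
            let is_child = parent.as_deref() == Some(current.as_str());
            // `found.insert` returns false for an already-seen agent, so a
            // cyclic map cannot make the frontier grow without bound.
            if is_child && name.as_str() != agent && found.insert(name.clone()) {
                frontier.push(name.clone());
            }
        }
    }
    found
}

/// Hypothetical helper: in-container paths the agent would get RW,
/// one per descendant, in sorted order.
fn rw_bind_targets(topology: &Topology, agent: &str) -> Vec<String> {
    descendants(topology, agent)
        .into_iter()
        .map(|d| format!("/agents/{d}/"))
        .collect()
}
```

For the example map, the root agent ends up with a bind for every other agent, which matches the "root has every other agent as a descendant" special case, while a leaf such as `bob` gets none.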

### D — drop legacy `/state` for manager

`lifecycle.rs::notes_mount` currently uses a ternary to pick `/state`
for the manager and `/agents/<name>/state` for everyone else (because
the manager pre-dates the per-agent state-dir layout). Milestone bullet:
unify on `/agents/<name>/state` for everyone. One-time `mv` of
`/var/lib/hyperhive/manager/state` → `/var/lib/hyperhive/agents/manager/state`
in `migrate.rs` (idempotent, marker-guarded).

### E — prompt + tools

- `prompts/manager.md` vs `prompts/agent.md` — two separate system
  prompts. **Per-agent cap list** of what the agent can do, rendered
  into a single parametrised prompt at boot.
- `mcp.rs::Flavor::{Agent, Manager}` controls which MCP tools claude
  sees. Already structured this way internally — the per-flavour
  allow-list becomes a per-cap-set lookup.

### F — drive-by checks across c0re

(`grep -n MANAGER_AGENT` produced ~28 hits)

- `loose_ends.rs`: manager sees hive-wide loose-ends, sub-agents only
  their own. **Topology** — every agent sees its own + its
  descendants'.
- `operator_questions.rs` + `broker.rs`: "manager can cancel any
  question" override on the owner check. **Topology** — agents can
  moderate threads of their descendants. (per mara's comment at
  https://localhost:3000/hyperhive/hyperhive/issues/361#issuecomment-3344)
- `reminder_scheduler.rs`: same override pattern for reminder cancel.
  **Topology** — descendants only.
- `actions.rs`: `destroy` refuses to act on `MANAGER_NAME` (no
  foot-shooting). **Topology** — agents can destroy descendants but
  never themselves or ancestors.
- `crash_watch.rs`: skips `ContainerCrash` for the manager (it
  auto-restarts via systemd). **Topology** — the root container has
  different recovery semantics, every other agent falls into the same
  watch loop.

### G — sub-agents inside the same container

Future work mentioned in #361: when enabled for an agent, it can spawn
temporary "sub-agents" that run inside its own container. Lighter than
a full nspawn agent. Open questions, not yet wired:

- Inherit caps from parent, or take an explicit narrower set?
- Survive container restart, or always ephemeral?
- Inbox: separate from parent, or shared?
- Filesystem: share parent's `/state` RW, or a sub-dir?
- Identity: distinct broker recipient name, or address the parent?

## Cross-references

- Milestone: [#361 "Agent privileges and sub-agents"](http://localhost:3000/hyperhive/hyperhive/issues/361)
- Dashboard render: [#363 "show agent topology in container list"](http://localhost:3000/hyperhive/hyperhive/issues/363)
- Audit table source: [comment 3335 on #361](http://localhost:3000/hyperhive/hyperhive/issues/361#issuecomment-3335)
- Operator/agent trust boundary (orthogonal axis): [`boundary.md`](boundary.md)

@@ -87,6 +87,14 @@ pub struct ContainerView {
     /// status is set.
     #[serde(default, skip_serializing_if = "Option::is_none")]
     pub status_set_at: Option<i64>,
+    /// Name of this agent's parent in the agent hierarchy (#361). `None`
+    /// marks the agent as root-level; the dashboard renders it without
+    /// indentation. Sourced from `meta/topology.json` (single source of
+    /// truth, hive-c0re-owned) — NOT from per-agent agent.nix, because
+    /// an agent shouldn't be able to unilaterally declare its own place
+    /// in the tree.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub parent: Option<String>,
 }

 /// Build the full container list. Wraps `lifecycle::list()` and
@@ -94,6 +102,10 @@ pub struct ContainerView {
 pub async fn build_all(coord: &Coordinator) -> Vec<ContainerView> {
     let raw = lifecycle::list().await.unwrap_or_default();
     let locked = read_meta_locked_revs();
+    // Pull the topology map once and look up each agent's parent below.
+    // Empty / absent topology.json → every agent root-level (matches
+    // the pre-#361 status quo for fresh installs).
+    let topology = crate::topology::read();
     let mut out = Vec::new();
     for c in &raw {
         let (logical, is_manager) = if c == MANAGER_NAME {
@@ -130,6 +142,7 @@ pub async fn build_all(coord: &Coordinator) -> Vec<ContainerView> {
         let rate_limited = is_rate_limited(&logical);
         let extra_links = read_dashboard_links(&logical);
         let (status_text, status_set_at) = read_status(&logical);
+        let parent = topology.get(&logical).cloned().flatten();
         out.push(ContainerView {
             port: lifecycle::agent_web_port(&logical),
             running: lifecycle::is_running(&logical).await,
@@ -146,6 +159,7 @@ pub async fn build_all(coord: &Coordinator) -> Vec<ContainerView> {
             extra_links,
             status_text,
             status_set_at,
+            parent,
         });
     }
     out

@@ -28,6 +28,7 @@ mod migrate;
 mod operator_questions;
 mod questions;
 mod rebuild_queue;
+mod topology;
 mod reminder_scheduler;
 mod server;

@@ -85,6 +85,16 @@ pub async fn sync_agents(
     std::fs::write(&flake_path, &new_flake)
         .with_context(|| format!("write {}", flake_path.display()))?;
+
+    // Reconcile topology.json against the live agent set — adds
+    // entries for newly-spawned agents (default: manager as parent,
+    // manager itself as root) and drops removed agents. Operator
+    // overrides via the write API (#361 follow-up) are preserved
+    // because reconcile only fills in missing entries. Idempotent;
+    // when nothing changed the file isn't touched.
+    let agent_names: Vec<String> = agents.iter().map(|a| a.name.clone()).collect();
+    let topology_changed = crate::topology::reconcile(&agent_names)
+        .with_context(|| format!("reconcile {}", crate::topology::topology_path().display()))?;

     if initial {
         git(&dir, &["init", "--initial-branch=main"]).await?;
     }
@@ -96,12 +106,20 @@ pub async fn sync_agents(
     // contain '/flake.nix'". Lock then commit once with both
     // flake.nix and flake.lock — single commit per change.
     git(&dir, &["add", "flake.nix"]).await?;
+    // Stage topology.json on every sync (regenerated by reconcile
+    // above when the agent set changed). git add is a no-op when the
+    // file content is unchanged.
+    if crate::topology::topology_path().exists() {
+        git(&dir, &["add", "topology.json"]).await?;
+    }
     nix(&dir, &["flake", "lock"]).await?;
     if std::path::Path::new(&dir).join("flake.lock").exists() {
         git(&dir, &["add", "flake.lock"]).await?;
     }
     let msg = if initial {
         format!("seed meta from {} agent(s)", agents.len())
+    } else if topology_changed {
+        "regenerate meta flake + topology".to_owned()
     } else {
         "regenerate meta flake".to_owned()
     };
@@ -348,7 +366,7 @@ where
     let pronouns_escaped = operator_pronouns.replace('\\', "\\\\").replace('"', "\\\"");
     let _ = writeln!(
         out,
-        " dashboardPort = {dashboard_port};\n operatorPronouns = \"{pronouns_escaped}\";\n mkAgent = {{ name, isManager, port }}:"
+        " dashboardPort = {dashboard_port};\n operatorPronouns = \"{pronouns_escaped}\";\n mkAgent = {{ name, isManager, port, parent ? null }}:"
     );
     out.push_str(
         r#" let
@@ -357,6 +375,7 @@ where
             else hyperhive.nixosConfigurations.agent-base;
           input = inputs."agent-${name}";
           service = if isManager then "hive-m1nd" else "hive-ag3nt";
+          parentEnv = if parent == null then {} else { HIVE_PARENT = parent; };
         in
         base.extendModules {
           modules = [
@@ -372,7 +391,7 @@ where
                 HIVE_LABEL = name;
                 HYPERHIVE_STATE_DIR = "/agents/${name}/state";
               };
-              systemd.services.${service}.environment = {
+              systemd.services.${service}.environment = parentEnv // {
                 HIVE_PORT = toString port;
                 HIVE_LABEL = name;
                 HIVE_DASHBOARD_PORT = toString dashboardPort;
@@ -406,14 +425,25 @@ where
     nixosConfigurations = {
 "#,
     );
+    // Pull the topology map once and look up each agent's parent. An
+    // empty / absent topology.json yields `parent = null` for everyone
+    // — equivalent to the pre-#361 status quo (every container at root).
+    // `meta::sync_agents` seeds the file on first run with manager as
+    // root + everyone else under manager.
+    let topology = crate::topology::read();
     for spec in agents {
+        let parent_attr = topology
+            .get(&spec.name)
+            .and_then(|p| p.as_ref())
+            .map_or_else(|| "null".to_owned(), |p| format!("\"{p}\""));
         let _ = writeln!(
             out,
-            " {} = mkAgent {{ name = \"{}\"; isManager = {}; port = {}; }};",
+            " {} = mkAgent {{ name = \"{}\"; isManager = {}; port = {}; parent = {}; }};",
             spec.name,
             spec.name,
             if spec.is_manager { "true" } else { "false" },
             spec.port,
+            parent_attr,
         );
     }
     out.push_str(" };\n };\n}\n");

hive-c0re/src/topology.rs (new file, 168 lines)
@@ -0,0 +1,168 @@
//! Agent topology storage — single source of truth for parent/child
//! relations in the hive. Lives in the hive-c0re-owned meta repo at
//! `/var/lib/hyperhive/meta/topology.json`, alongside `flake.nix`, so
//! topology changes thread through the same git commit log as deploys.
//!
//! Why meta, not per-agent: an agent shouldn't be able to claim a
//! parent without that parent's consent, and an operator-driven
//! re-parenting shouldn't require touching the moved agent's own
//! config. Topology IS a system-level concern; meta is where
//! system-level facts live.
//!
//! Format — flat JSON map keyed by agent name, values are the parent
//! agent's name or `null` for root:
//!
//! ```json
//! {
//!   "manager": null,
//!   "alice": "manager",
//!   "bob": "alice"
//! }
//! ```
//!
//! Agents present in `nixos-container list` but absent from the file
//! default to root-level (`parent = None`). This file is operator/
//! manager-managed via approval-gated writes (write API lands in a
//! follow-up PR on the #361 milestone); for the bootstrap commit
//! `meta::sync_agents` seeds it with the existing implicit topology
//! (manager as root, all current sub-agents as direct children).

use std::collections::BTreeMap;
use std::path::PathBuf;

const TOPOLOGY_FILE: &str = "topology.json";

#[must_use]
pub fn topology_path() -> PathBuf {
    crate::meta::meta_dir().join(TOPOLOGY_FILE)
}

/// Snapshot of the topology map. Read on every `container_view::build_all`
/// and every `render_flake` call. The file is small (one line per agent),
/// so we re-read rather than caching — keeps the source of truth on disk.
///
/// Returns an empty map when the file is absent or unparsable; callers
/// treat that as "no recorded parents", which falls back to every agent
/// being root-level. Safe degradation for fresh installs that haven't
/// run through `meta::sync_agents` yet.
#[must_use]
pub fn read() -> BTreeMap<String, Option<String>> {
    let path = topology_path();
    let Ok(raw) = std::fs::read_to_string(&path) else {
        return BTreeMap::new();
    };
    serde_json::from_str(&raw).unwrap_or_default()
}

/// Look up one agent's parent. Returns `None` when the agent is root
/// or absent from the file. Cheap convenience over `read()` for
/// callers that want a single entry.
#[must_use]
pub fn parent_of(name: &str) -> Option<String> {
    read().get(name).cloned().flatten()
}

/// Persist the topology map. Sorted JSON output (BTreeMap is sorted by
/// key) keeps git diffs minimal across re-writes. Best-effort —
/// returns an `io::Error` so callers can decide whether a failure
/// should abort their op (sync_agents, RequestSetParent) or just log.
pub fn write(topology: &BTreeMap<String, Option<String>>) -> std::io::Result<()> {
    let path = topology_path();
    if let Some(parent) = path.parent() {
        std::fs::create_dir_all(parent)?;
    }
    let text = serde_json::to_string_pretty(topology)
        .map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidData, e))?;
    std::fs::write(&path, format!("{text}\n"))
}

/// Compute the default topology for a fresh install: every non-manager
/// agent has the manager as parent; manager itself is root. Used by
/// `meta::sync_agents` on first call to seed `topology.json`.
///
/// As soon as an explicit write lands (#361 follow-up: dashboard /
/// `RequestSetParent` API), this seeding stops touching pre-existing
/// entries — `sync_agents` only adds rows for newly-spawned agents
/// against whatever the operator has configured.
#[must_use]
pub fn default_seed(agent_names: &[String]) -> BTreeMap<String, Option<String>> {
    let mut out = BTreeMap::new();
    for name in agent_names {
        if name == crate::lifecycle::MANAGER_NAME {
            out.insert(name.clone(), None);
        } else {
            out.insert(name.clone(), Some(crate::lifecycle::MANAGER_NAME.to_owned()));
        }
    }
    out
}

/// Reconcile `topology.json` against the current agent set. Adds an
/// entry (default: parent = manager, manager itself = root) for any
/// agent missing from the file; removes entries for agents no longer
/// present. Existing entries are preserved as-is — operator/manager
/// choices stick across regenerations. Returns true when the file
/// changed and should be re-committed by the caller.
pub fn reconcile(agent_names: &[String]) -> std::io::Result<bool> {
    let mut current = read();
    let mut changed = false;
    // Add missing agents at their default position.
    for name in agent_names {
        if !current.contains_key(name) {
            let parent = if name == crate::lifecycle::MANAGER_NAME {
                None
            } else {
                Some(crate::lifecycle::MANAGER_NAME.to_owned())
            };
            current.insert(name.clone(), parent);
            changed = true;
        }
    }
    // Drop entries for agents that no longer exist.
    let known: std::collections::HashSet<_> = agent_names.iter().collect();
    current.retain(|name, _| {
        let keep = known.contains(name);
        if !keep {
            changed = true;
        }
        keep
    });
    if changed {
        write(&current)?;
    }
    Ok(changed)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn default_seed_makes_manager_root_others_children() {
        let agents = vec![
            "alice".to_owned(),
            crate::lifecycle::MANAGER_NAME.to_owned(),
            "bob".to_owned(),
        ];
        let seed = default_seed(&agents);
        assert_eq!(
            seed.get(crate::lifecycle::MANAGER_NAME),
            Some(&None),
            "manager should be root"
        );
        assert_eq!(
            seed.get("alice"),
            Some(&Some(crate::lifecycle::MANAGER_NAME.to_owned()))
        );
        assert_eq!(
            seed.get("bob"),
            Some(&Some(crate::lifecycle::MANAGER_NAME.to_owned()))
        );
    }

    #[test]
    fn default_seed_handles_empty_input() {
        let seed = default_seed(&[]);
        assert!(seed.is_empty());
    }
}