8.3 KiB
Agent hierarchy & privileges
Design + audit doc for milestone #6 (the issue tree). The implementation lands in pieces; this doc tracks what's done, what's planned, and what currently special-cases the manager.
Current state (as of this PR)
Topology lives in the hive-c0re-owned meta repo, alongside
flake.nix, at /var/lib/hyperhive/meta/topology.json:
{
"manager": null,
"alice": "manager",
"bob": "alice"
}
null = root-level agent. Today only the manager qualifies. Other
agents default to "manager" as parent on first sync. Operator/manager
re-parenting via the write API + dashboard UI lands in a follow-up.
Why meta, not per-agent agent.nix
An agent shouldn't be able to claim a parent without that parent's consent, and operator-driven re-parenting shouldn't require touching the moved agent's config. Topology IS a system-level concern; meta is where system-level facts live.
Flow
- Read:
topology::read()parsestopology.jsoninto aBTreeMap<String, Option<String>>. Missing / unparsable file → empty map → every agent treated as root (safe degradation for fresh installs that haven't runmeta::sync_agentsyet). - Reconcile:
meta::sync_agentscallstopology::reconcilealongside itsflake.nixregeneration. New agents land at their default position (manager as parent, manager itself as root); removed agents drop. Existing entries are preserved as-is so operator overrides stick across regenerations. - Inject:
meta::render_flakelooks up each agent's parent and passes it tomkAgent. When non-null, the mkAgent body setsHIVE_PARENT = parentin the agent's systemd service environment so the harness / claude prompts can see it. - Surface:
container_view::build_allreadstopology.jsonand populatesContainerView.parent: Option<String>on every rescan. The dashboard renders the field as a tree (#363 follow-up).
Target topology semantics
Once enforcement lands the rules collapse into:
| operation | who can do it |
|---|---|
kill / start / restart / update (any descendant) |
any ancestor |
request_init_config (spawn a new child) |
any agent, child added under self |
request_apply_commit (any descendant's config) |
any ancestor |
get_logs (any descendant) |
any ancestor |
| moderate questions / reminders (cancel any open thread of a descendant) | any ancestor |
send / recv routing |
parent ↔ same-parent siblings ↔ self ↔ descendants; explicit allow-list for anyone else |
request_update_meta_inputs (bump meta lock) |
root agents only (today: just manager) |
"Ancestor" walks ContainerView.parent chains; cycles are guarded by a
visited-set at dispatch time (a malformed topology.json can't lock the
dispatcher into a loop).
Current manager special-casings — the audit
What currently makes the manager different from every other agent, and which axis the post-milestone version reads each special-case along:
A — naming + bootstrap
MANAGER_AGENT = "manager"(broker recipient name) andMANAGER_NAME = "hm1nd"(container name). ~28 grep hits acrosshive-c0re/src/. Just a name — the rename plan ismanager→root, executed via the one-shot migration script inmigrate.rs(idempotent, marker-guarded).auto_update::ensure_managerruns at hive-c0re boot and spawnshm1ndif missing. Becomes "ensure the root agent exists" once any agent can be at the root. Topology: root has no parent, so hive-c0re itself owns its lifecycle (no parent to delegate to).
B — wire-protocol privileges
The ManagerRequest::* variants in hive-sh4re/src/lib.rs are
operations the manager flavour socket can make that sub-agent sockets
can't:
| variant | semantic | post-milestone |
|---|---|---|
RequestInitConfig |
seed an agent's proposed config repo | topology — descendants only |
RequestApplyCommit |
submit a commit sha for operator approval | topology — descendants only |
RequestSpawn (deprecated) |
shortcut for spawn | topology — descendants only |
Kill / Start / Restart / Update |
container lifecycle on an existing agent | topology — descendants only |
RequestUpdateMetaInputs |
bump meta flake.lock |
per-agent cap (root-only today; a future "let coder bump its own input" might grant it) |
GetLogs |
journalctl scrape of a sub-agent | topology — descendants only |
Wake |
inject a from: <X> message into self's inbox |
not really privileged — the wire surface exists because daemon co-processes (e.g. forge_notify) need it. Sub-agents have the same via their own socket. |
C — storage / mounts (hive-c0re::lifecycle)
The manager container's nspawn bind set:
HOST_AGENTS_ROOT (/var/lib/hyperhive/agents) → /agentsRW — so the manager can edit any agent's proposed config repoHOST_APPLIED_ROOT (/var/lib/hyperhive/applied) → /appliedRO — so the manager can diff against what's deployedHOST_META_ROOT (/var/lib/hyperhive/meta) → /metaRO — so the manager can read the system-wide deploy log
Tree-shape version:
- Each agent gets RW to
/agents/<descendant>/for every descendant in its subtree. The root agent (today: manager) gets RW to the full forest as a special case of "the root has every other agent as a descendant". - RO
/metaaccess if the agent holds a "meta read" cap. request_update_meta_inputsis the only path that actually writesflake.lock, gated by the cap; everyone else stays RO.
D — drop legacy /state for manager
lifecycle.rs::notes_mount currently ternary's /state for the
manager and /agents/<name>/state for everyone else (because the
manager pre-dates the per-agent state-dir layout). Milestone bullet:
unify on /agents/<name>/state for everyone. One-time mv of
/var/lib/hyperhive/manager/state → /var/lib/hyperhive/agents/manager/state
in migrate.rs (idempotent, marker-guarded).
E — prompt + tools
prompts/manager.mdvsprompts/agent.md— two separate system prompts. Per-agent cap list of what the agent can do, rendered into a single parametrised prompt at boot.mcp.rs::Flavor::{Agent, Manager}controls which MCP tools claude sees. Already structured this way internally — the per-flavour allow-list becomes a per-cap-set lookup.
F — drive-by checks across c0re
(grep -n MANAGER_AGENT produced ~28 hits)
loose_ends.rs: manager sees hive-wide loose-ends, sub-agents only their own. Topology — every agent sees its own + its descendants'.operator_questions.rs+broker.rs: "manager can cancel any question" override on the owner check. Topology — agents can moderate threads of their descendants. (per mara's https://localhost:3000/hyperhive/hyperhive/issues/361#issuecomment-3344)reminder_scheduler.rs: same override pattern for reminder cancel. Topology — descendants only.actions.rs:destroyrefuses to act onMANAGER_NAME(no foot-shooting). Topology — agents can destroy descendants but never themselves or ancestors.crash_watch.rs: skipsContainerCrashfor the manager (it auto-restarts via systemd). Topology — the root container has different recovery semantics, every other agent falls into the same watch loop.
G — sub-agents inside the same container
Future work mentioned in #361: when enabled for an agent, it can spawn temporary "sub-agents" that run inside its own container. Lighter than a full nspawn agent. Open questions, not yet wired:
- Inherit caps from parent, or take an explicit narrower set?
- Survive container restart, or always ephemeral?
- Inbox: separate from parent, or shared?
- Filesystem: share parent's
/stateRW, or a sub-dir? - Identity: distinct broker recipient name, or address the parent?
Cross-references
- Milestone: #361 "Agent privileges and sub-agents"
- Dashboard render: #363 "show agent topology in container list"
- Audit table source: comment 3335 on #361
- Operator/agent trust boundary (orthogonal axis):
boundary.md