diff --git a/PLAN.md b/PLAN.md
index fe5fd7a..a6e74be 100644
--- a/PLAN.md
+++ b/PLAN.md
@@ -54,7 +54,7 @@ A multi-Claude-Code-agent setup on a single host:
 │ ├── /run/hyperhive/manager.sock → manager container │
 │ └── /run/hyperhive/agents/.sock → that agent │
 │ │
-│ ┌─ nixos-container: hive-manager ───────────────┐ │
+│ ┌─ nixos-container: hm1nd ──────────────────────┐ │
 │ │ hive-m1nd (Rust, hive-ag3nt crate) │ │
 │ │ ├ MCP client → /run/hyperhive/manager.sock │ │
 │ │ ├ turn loop driving `claude` │ │
@@ -68,7 +68,7 @@ A multi-Claude-Code-agent setup on a single host:
 │ │ update_shared_instructions │ │
 │ └────────────────────────────────────────────────┘ │
 │ │
-│ ┌─ nixos-container: hive-agent- ──────────┐ │
+│ ┌─ nixos-container: h- ───────────────────┐ │
 │ │ hive-ag3nt (Rust) │ │
 │ │ MCP client → /run/hyperhive/agents/.sock │
 │ │ state/ RW, config/ + prompts/ + shared RO │ │
@@ -82,7 +82,7 @@ A multi-Claude-Code-agent setup on a single host:
 
 **Sockets, not identity gating.** One unix socket per principal, bind-mounted into the right place. The socket *is* the principal — no `SO_PEERCRED` lookups, no token plumbing. Perms are filesystem perms on the host side, plus the fact that only the matching container has the bind-mount. Each socket runs the MCP tool surface appropriate to its principal.
 
-**Container naming.** All nixos-containers managed by `hive-c0re` are prefixed `hive-`. The manager runs as `hive-manager`. Sub-agents run as `hive-agent-` — the extra `-agent-` segment keeps sub-agents from colliding with the manager slot (even if someone names an agent `manager`) and makes ownership obvious. On-disk paths (`/var/lib/hyperhive/agents/foo/`) and socket paths (`/run/hyperhive/agents/foo.sock`) use the bare logical name; the prefixing lives only at the nixos-container layer.
+**Container naming.** `nixos-container` caps the total container name at 11 chars (it gets encoded into network interface names). The manager runs as `hm1nd` (a compressed form of `hive-m1nd`). Sub-agents run as `h-` with `` capped to 9 chars (`MAX_AGENT_NAME`). The two namespaces don't collide. On-disk paths (`/var/lib/hyperhive/agents/foo/`) and socket paths (`/run/hyperhive/agents/foo/mcp.sock`) use the bare logical name; the prefixing lives only at the nixos-container layer.
 
 **Approvals are git commits.** `hive-c0re` maintains a `state-repo` on host that records the world (which agents exist, their roles, etc.). Per-agent flake configs (`agents//config/`) are themselves git repos. The manager edits clones with plain `git` CLI inside its container and asks `hive-c0re` to apply a commit via `request_apply_commit(agent, sha)`. `hive-c0re` queues it; once approved, fast-forwards `main` and reconciles state (rebuild containers, etc.). **No abstract "approval token" — the commit hash is the token.**
 
@@ -117,7 +117,7 @@ A multi-Claude-Code-agent setup on a single host:
 - **Exit:** `nixos-container create test-agent --flake .#agent-base && nixos-container start test-agent` brings up a container whose `hive-ag3nt` prints "hello" and exits.
 
 ### Phase 1 — container lifecycle + Risk 1
-- `hive-c0re`: open host admin socket (`/run/hyperhive/host.sock`); verbs `spawn(name)`, `kill(name)`, `rebuild(name)`, `list()`. Uses `nixos-container` underneath; container name on the host is `hive-agent-`.
+- `hive-c0re`: open host admin socket (`/run/hyperhive/host.sock`); verbs `spawn(name)`, `kill(name)`, `rebuild(name)`, `list()`. Uses `nixos-container` underneath; container name on the host is `h-` (sub-agents) or `hm1nd` (manager).
 - CLI tool talking to the admin socket (same `hive-c0re` binary, subcommand-driven).
 - Manually mutate an agent's config flake, call `rebuild`, observe whether `hive-ag3nt` survives.
 - **Decision:** if hot-reload doesn't preserve the harness, that becomes a hard requirement of `hive-ag3nt`'s design (resume from disk state). Document the outcome.
@@ -136,7 +136,7 @@ A multi-Claude-Code-agent setup on a single host:
 
 ### Phase 4 — `hive-m1nd` + privileged surface
 - `hive-m1nd` binary (second `[[bin]]` in `hive-ag3nt`) wires the manager tool surface.
-- Manager container (`hive-manager`) declared in host NixOS module (auto-restart). Bind-mount `agents/**` RW.
+- Manager container (`hm1nd`) declared in host NixOS module (auto-restart). Bind-mount `agents/**` RW.
 - Manager socket gets the privileged tool surface: `request_spawn`/`request_kill`, `request_apply_commit`, `inject_peer_info`, `send(..., wait_for_reply=true)`.
 - Smoke: attach a terminal to the manager container (`nixos-container root-login`); ask `hive-m1nd` to spawn an agent and route a message to it.
 - **Exit:** manager spawns, routes, kills a child agent end-to-end; lifecycle still gated by manual CLI approval (no GUI yet).
diff --git a/hive-c0re/src/lifecycle.rs b/hive-c0re/src/lifecycle.rs
index 20d51f6..06453d8 100644
--- a/hive-c0re/src/lifecycle.rs
+++ b/hive-c0re/src/lifecycle.rs
@@ -5,8 +5,13 @@ use std::path::Path;
 use anyhow::{Context, Result, bail};
 use tokio::process::Command;
 
-pub const AGENT_PREFIX: &str = "hive-agent-";
-pub const HIVE_PREFIX: &str = "hive-";
+/// Sub-agent container prefix. `nixos-container` caps the total container name
+/// at 11 chars (it gets encoded into network interface names), so the agent
+/// name itself can be at most `MAX_AGENT_NAME` chars.
+pub const AGENT_PREFIX: &str = "h-";
+pub const MAX_AGENT_NAME: usize = 9;
+/// Container name of the manager (a separate slot from sub-agents).
+pub const MANAGER_NAME: &str = "hm1nd";
 
 /// Mount point of the per-agent runtime directory inside the container.
 pub const CONTAINER_RUNTIME_MOUNT: &str = "/run/hive";
@@ -15,7 +20,21 @@ pub fn container_name(name: &str) -> String {
     format!("{AGENT_PREFIX}{name}")
 }
 
+fn validate(name: &str) -> Result<()> {
+    if name.is_empty() {
+        bail!("agent name must not be empty");
+    }
+    if name.len() > MAX_AGENT_NAME {
+        bail!(
+            "agent name '{name}' is too long ({} chars); max {MAX_AGENT_NAME}",
+            name.len()
+        );
+    }
+    Ok(())
+}
+
 pub async fn spawn(name: &str, agent_flake: &str, agent_dir: &Path) -> Result<()> {
+    validate(name)?;
     let container = container_name(name);
     run(&["create", &container, "--flake", agent_flake]).await?;
     write_nspawn_override(&container, agent_dir)?;
@@ -39,11 +58,13 @@ fn write_nspawn_override(container: &str, agent_dir: &Path) -> Result<()> {
 }
 
 pub async fn kill(name: &str) -> Result<()> {
+    validate(name)?;
     let container = container_name(name);
     run(&["stop", &container]).await
 }
 
 pub async fn rebuild(name: &str, agent_flake: &str) -> Result<()> {
+    validate(name)?;
     let container = container_name(name);
     run(&["update", &container, "--flake", agent_flake]).await
 }
@@ -64,7 +85,7 @@ pub async fn list() -> Result> {
     Ok(String::from_utf8_lossy(&out.stdout)
         .lines()
         .map(str::trim)
-        .filter(|line| line.starts_with(HIVE_PREFIX))
+        .filter(|line| line.starts_with(AGENT_PREFIX) || *line == MANAGER_NAME)
         .map(str::to_owned)
         .collect())
 }
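
Review note: the naming rules in this patch can be exercised in isolation. The block below is a minimal sketch, not the code in `lifecycle.rs`. It uses only `std` (returning `Result<_, String>` instead of `anyhow`), and several pieces are assumptions for illustration: the `MAX_CONTAINER_NAME` constant, deriving `MAX_AGENT_NAME` from it, the ASCII-only character check, the `is_managed` helper, and having `container_name` return a `Result`. One thing it makes explicit is that `str::len` counts bytes, so the "chars" wording in the patch's error message is only accurate for ASCII names.

```rust
/// Hypothetical constant: the patch's doc comment states that `nixos-container`
/// caps the total container name at 11 chars.
pub const MAX_CONTAINER_NAME: usize = 11;
pub const AGENT_PREFIX: &str = "h-";
/// Derived here instead of hard-coded; evaluates to 9, matching the patch.
pub const MAX_AGENT_NAME: usize = MAX_CONTAINER_NAME - AGENT_PREFIX.len();
pub const MANAGER_NAME: &str = "hm1nd";

// Compile-time check that the manager's name also fits the assumed cap.
const _: () = assert!(MANAGER_NAME.len() <= MAX_CONTAINER_NAME);

/// Validate a bare agent name (without the container prefix).
pub fn validate(name: &str) -> Result<(), String> {
    if name.is_empty() {
        return Err("agent name must not be empty".to_owned());
    }
    // Assumption: restrict names to ASCII alphanumerics and '-'.
    // With this restriction, byte length equals character count.
    if !name.chars().all(|c| c.is_ascii_alphanumeric() || c == '-') {
        return Err(format!("agent name '{name}' contains unsupported characters"));
    }
    if name.len() > MAX_AGENT_NAME {
        return Err(format!(
            "agent name '{name}' is too long ({} chars); max {MAX_AGENT_NAME}",
            name.len()
        ));
    }
    Ok(())
}

/// Build the container name for a sub-agent, validating the bare name first.
pub fn container_name(name: &str) -> Result<String, String> {
    validate(name)?;
    Ok(format!("{AGENT_PREFIX}{name}"))
}

/// Hypothetical helper mirroring the filter used in `list()`.
pub fn is_managed(container: &str) -> bool {
    container.starts_with(AGENT_PREFIX) || container == MANAGER_NAME
}
```

Under these assumptions, `container_name("foo")` yields `h-foo`, a ten-character name is rejected, and `hm1nd` is recognised as managed even though it does not start with `h-`. Note that the prefix filter would also match any unrelated container whose name happens to start with `h-`.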