hyperhive/README.md
müde 14aa7c7acc final docs + cleanup sync for meta-flake era
claude.md flips 'in flight' → 'just landed' for the meta
overhaul + extends the file map with meta.rs and migrate.rs.
docs/approvals.md replaces the in-flight callout with a
proper 'Meta flake' section (two-phase deploy walkthrough,
sync_agents semantics, single-phase variants), updates the
two-repo box diagram to include the /var/lib/hyperhive/meta/
tree and tracks flake.nix in applied, rewrites the
container --flake reference to meta#<name>, replaces the
'Manager view of applied' section with a unified
'/agents + /applied + /meta' inventory listing every useful
git incantation, and explains the in-place no-state-loss
migration that now runs on hive-c0re startup.
docs/persistence.md grows entries for the meta repo + the
.meta-migration-done marker. readme box diagram picks up the
/meta RO bind; approval-flow paragraph rewritten end to end
to describe the meta lock dance.

lifecycle::flake_base deleted — the meta render hardcodes
the manager vs agent-base choice as nix expression.
2026-05-16 00:40:06 +02:00

145 lines
6.3 KiB
Markdown

# hyperhive
Multi-Claude-Code-agent orchestration on **nixos-containers**.
A host-side Rust daemon (`hive-c0re`) spawns nspawn-isolated agent
containers and brokers messages between them. A manager agent (`hm1nd`)
coordinates the swarm and gates lifecycle changes on user approval via git
commits, surfaced through a vibec0re-styled HTTP dashboard.
```
host (NixOS, runs hive-c0re.service)
├── operator
│ ├── browser → :7000 hive-c0re dashboard (containers, approvals)
│ ├── browser → :8000 / :8100-8999 per-agent web UIs (live SSE, send, login)
│ └── CLI → /run/hyperhive/host.sock JSON-line admin protocol
├── hive-c0re (Rust daemon)
│ ├── lifecycle nixos-container CRUD + per-agent flake generation
│ ├── broker sqlite messages + tokio broadcast (powers SSE + wake-ups)
│ ├── approvals sqlite queue, two kinds: ApplyCommit (config) + Spawn
│ ├── auto_update rebuilds any container whose recorded flake rev is stale
│ ├── dashboard axum HTTP + async-form actions + SSE message flow
│ └── sockets /run/hyperhive/{host,manager,agents/<n>}/mcp.sock
└── nixos-containers (each bind-mounts its socket dir → /run/hive,
│ credentials dir → /root/.claude,
│ durable notes dir → /state;
│ manager additionally gets /agents RW,
│ /applied RO (deployed-tag mirror),
│ /meta RO (swarm-wide deploy flake))
├── hm1nd hive-m1nd serve : claude turn loop +
│ MCP (send / recv / request_spawn / kill / start /
│ restart / update / request_apply_commit /
│ ask_operator) + web UI on :8000
└── h-<name> hive-ag3nt serve : claude turn loop +
MCP (send / recv) + web UI on a hashed :8100-8999
```
Each turn: harness pops one inbox message (Recv long-polls server-side and
wakes on a broker Sent event) → builds a wake prompt → spawns
`claude --print --continue --output-format stream-json --mcp-config …`
streams JSON events into the per-agent SSE bus + a sqlite history db →
claude drives any further `recv`/`send` itself via the embedded MCP server.
Operator surface per agent: terminal-themed live tail with a textarea
prompt; slash commands `/help` `/clear` `/cancel` `/compact`
`/model <name>`; granular state badge (idle / thinking /
compacting / offline) with age timer + last-turn duration chip +
model chip; cancel-turn button while thinking; sticky-bottom
auto-scroll with "↓ N new" pill; event history backfilled on page
load; collapsible inbox + collapsible journald viewer + collapsible
`agent.nix` viewer per agent on the dashboard.
Config changes flow the other way: manager edits files under
`/agents/<name>/config/``agent.nix` is a plain NixOS module function
`{ config, pkgs, lib, ... }: { ... }`, and arbitrary sibling files in
the commit are preserved → commits → submits the sha via
`request_apply_commit`. Hive-c0re immediately fetches that commit from
the proposed repo into the applied repo and pins it as `proposal/<id>`
— immutable from the manager's side from then on. Operator clicks
◆ APPR0VE → hive-c0re fast-forwards `applied/<n>/main` to the proposal,
runs `nix flake lock --update-input agent-<n>` against the host-wide
meta flake at `/var/lib/hyperhive/meta/`, builds via
`nixos-container update <c> --flake meta#<name>`, and either commits
the lock + tags `deployed/<id>` on success or `git restore`s the lock +
annotates `failed/<id>` with the build error + rolls back
`applied/<n>/main` on failure. Denials leave a `denied/<id>` annotated
tag carrying the operator's note.
Meta's git log is the swarm-wide deploy audit trail (one commit per
successful deploy). Per-agent applied repos carry the tag-rich state
machine for inside-baseball decisions. The manager sees both — proposed
repos ship with an `applied` remote pre-wired, and `/meta/` is RO-bound
inside the container — so `git fetch applied`,
`git show applied/refs/tags/deployed/<id>`, `git log /meta`,
`cat /meta/flake.lock` all just work without constructing paths by
hand. See [`docs/approvals.md`](docs/approvals.md) for the full state
machine + lock-flow walkthrough.
For decisions the manager needs human signal on, `ask_operator(question,
options?, multi?)` queues a free-text/checkbox/radio form on the
dashboard; the answer arrives later as a `HelperEvent::OperatorAnswered`
in the manager's inbox.
## Host config
Minimal `flake.nix` for a host that runs hive-c0re:
```nix
{
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/nixos-25.11";
hyperhive.url = "git+https://git.berlin.ccc.de/vinzenz/hyperhive";
};
outputs = { nixpkgs, hyperhive, ... }: {
nixosConfigurations.my-host = nixpkgs.lib.nixosSystem {
system = "x86_64-linux";
modules = [
hyperhive.nixosModules.hive-c0re
({ ... }: {
services.hive-c0re.enable = true;
# ... rest of your host config (hardware, networking, users, …)
system.stateVersion = "25.11";
})
];
};
};
}
```
hive-c0re will then:
- open its admin socket at `/run/hyperhive/host.sock` + dashboard on
`:7000`,
- auto-create the manager container (`hm1nd`) if missing,
- auto-rebuild any managed container whose hyperhive rev is stale.
`claude-code` is unfree; hyperhive whitelists it for itself
(scoped: only `claude-code`, nothing else) inside the
`claude-unstable` overlay and `harness-base.nix`. Per-agent
containers evaluate their own nixpkgs instance so the operator's
host-level `allowUnfree` doesn't propagate in — the predicate has
to live inline. Nothing to set on the operator side.
## Build / deploy
```sh
# inside the repo (devshell first; no global cargo)
nix develop -c cargo check
nix develop -c cargo clippy --workspace --all-targets -- -D warnings
# evaluate everything (rust+nix+toml fmt + clippy)
nix flake check
# deploy to a host that imports `hyperhive.nixosModules.hive-c0re`
cd ~/Repos/<nixos-config-repo>
nix flake update --update-input hyperhive
sudo nixos-rebuild switch --flake .#<host>
```
No overlays on the host's `pkgs` — the module pulls hive-c0re's package
straight from `hyperhive.packages.<system>.default`. Just import the
module and the service is wired up.