hyperhive/CLAUDE.md
müde a1cfb60fd0 docs: pre-load meta-flake design
scratchpad in claude.md and an in-flight callout at the top of
docs/approvals.md describe the upcoming overhaul so subsequent
commits can cite the design. covers: module-only agent flake
shape, /var/lib/hyperhive/meta/ as a hive-c0re-owned single
repo, applied remote pre-wired in proposed for manager git
plumbing, /meta RO bind for the system-wide deploy log,
auto-migration on hive-c0re startup with HIVE_SKIP_META_MIGRATION
kill-switch.
2026-05-16 00:06:42 +02:00

9.1 KiB

hyperhive — claude entry point

Hey claude. This is your starting page. The detailed docs live in docs/ and are written for humans + you both — read them when you need depth on a subsystem. This file is the index + scratchpad.

File map

hive-c0re/         host daemon + CLI (one binary, subcommand-dispatched)
  src/main.rs           clap setup; serve / spawn / kill / rebuild / list /
                         pending / approve / deny / destroy [--purge] /
                         request-spawn; periodic vacuum tasks
  src/server.rs         host admin socket (HostRequest → dispatch)
  src/client.rs         admin-socket client
  src/manager_server.rs manager-privileged socket (ManagerRequest)
  src/agent_server.rs   per-sub-agent socket listener (long-poll Recv)
  src/broker.rs         sqlite Message store + broadcast channel for SSE +
                         hourly vacuum of delivered>30d
  src/approvals.rs      sqlite Approval queue + kinds
  src/operator_questions.rs  sqlite question queue backing `ask_operator`
  src/events_vacuum.rs  host-side hourly sweep of every agent's
                         /state/hyperhive-events.sqlite
  src/crash_watch.rs    poll every 10s; fire HelperEvent::ContainerCrash
                         when a previously-running container disappears
                         without an operator-initiated transient
  src/coordinator.rs    shared state (broker/approvals/questions/transient/
                         sockets) + tombstone enumeration + kick_agent
  src/actions.rs        approve/deny/destroy (transient-aware)
  src/auto_update.rs    startup rebuild scan + ensure_manager
  src/lifecycle.rs      `nixos-container` shellouts, per-agent flake generator
  src/dashboard.rs      axum HTTP: static shell + /api/state JSON + actions
                         + journald viewer + bind-with-retry (SO_REUSEADDR)
  assets/               index.html, dashboard.css, app.js (include_str!)

hive-ag3nt/        in-container harness crate; produces TWO binaries
  src/lib.rs            re-exports + DEFAULT_SOCKET, DEFAULT_WEB_PORT
  src/client.rs         generic JSON-line request/response over unix socket
  src/web_ui.rs         per-container axum HTTP page (incl /api/cancel,
                         /api/compact, /api/model, /events/history)
  src/events.rs         LiveEvent + broadcast Bus + sqlite-backed history
                         (/state/hyperhive-events.sqlite) + TurnState +
                         model selection (persisted at /state/hyperhive-model)
  src/turn.rs           claude --print + stream-json pump; --compact retry
  src/mcp.rs            embedded MCP server (rmcp): AgentServer + ManagerServer
  src/login.rs          probe /root/.claude/ for a valid session
  src/login_session.rs  drives `claude auth login` over stdio pipes
  src/bin/hive-ag3nt.rs sub-agent main (Serve + Mcp subcommands)
  src/bin/hive-m1nd.rs  manager main (Serve + Mcp subcommands)
  assets/               index.html, agent.css, app.js (include_str!)
  prompts/              static role/tools/settings for claude (include_str!):
                          agent.md  — sub-agent system prompt
                          manager.md — manager system prompt
                          claude-settings.json — --settings JSON

hive-sh4re/        wire types (HostRequest/Response, AgentRequest/Response,
                   ManagerRequest/Response, Message, Approval, HelperEvent)

nix/
  modules/hive-c0re.nix         systemd service + firewall + git wiring
  templates/harness-base.nix    shared scaffolding for sub-agents + manager
  templates/agent-base.nix      sub-agent nixosConfiguration
  templates/manager.nix         manager nixosConfiguration

docs/
  conventions.md       naming, identity=socket, async forms, commit style
  gotchas.md           NixOS / nspawn lessons learned the hard way
  web-ui.md            dashboard + per-agent page layouts and endpoints
  turn-loop.md         claude invocation, wake prompt, MCP tool surface
  approvals.md         approval flow, manager policy, helper events
  persistence.md       sqlite dbs, retention, state dir layout

Reading paths

Pick the doc that matches your task. None depend on the others — read them à la carte.

Quick reminders

  • Commit before test. Stage and commit when work looks ready, then run validation. Failures get a follow-up commit rather than an amend.
  • Commit messages: short, lowercase, no Co-Authored-By trailer. Imperative mood.
  • rebuild is the reconcile verb. Anything that changes per-container state on the host should be re-applied there so the dashboard's ↻ R3BU1LD is sufficient to recover.
  • Identity = socket. No auth tokens — the socket path identifies the principal.
  • Actions are factored between admin socket and dashboard via actions.rs and dashboard.rs::lifecycle_action, so the two surfaces never drift.

Scratchpad

In-flight or recent context that hasn't earned a section yet. Prune freely.

  • In flight: meta-flake overhaul. Each agent's applied repo becomes a tiny module-only flake (nixosModules.default = import ./agent.nix); agent.nix is just a NixOS module function { config, pkgs, lib, ... }: { ... } — no extendModules, no hyperhive input visible to the manager. A single hive-c0re-owned repo at /var/lib/hyperhive/meta/ declares one input per agent (pointing at that agent's applied repo via git+file://) and one nixosConfigurations.<n> output per agent, wrapping inputs.agent-<n>.nixosModules.default with the identity
    • HIVE_PORT / HIVE_LABEL / HIVE_DASHBOARD_PORT injection that today's per-agent setup_applied does inline. Containers run against meta#<n> instead of applied/<n>#default. Every approval that lands does nix flake lock --update-input agent-<n> in meta and commits the lock — meta's git log is the system-wide deploy audit trail; per-agent tags stay as before for inside-baseball state.
  • Companion change: the manager's /agents/<n>/config/ (proposed) gets applied pre-configured as a git remote pointing at /applied/<n>/.git (the RO bind already there). git fetch applied / git show applied/refs/tags/deployed/<id> / git rebase applied/main etc. all just work from inside the manager. The manager additionally gets /meta RO-bound, so git -C /meta log --oneline and cat /meta/flake.lock answer "what's actually deployed across the swarm right now."
  • Auto-migration on startup: new phase before auto_update::run rewrites each existing applied/<n>/flake.nix to the module-only shape + relocates deployed/0, adds the applied remote to each proposed repo, bootstraps the meta repo from the agent list if missing, and nixos-container updates every container to point at meta#<n> (no fs wipe, no re-login). Idempotent; HIVE_SKIP_META_MIGRATION=1 defers it.
  • Just landed (prior overhaul still in place): tag-driven config-apply. Two-repo split (proposed = manager RW, applied = core-only); request_apply_commit fetches the manager's commit into applied and pins it as proposal/<id>; approve / deny / build walk through tags on the same commit; applied/main only fast-forwards on deployed/. failed/ + denied/ are annotated. See docs/approvals.md for the state machine.
  • Recent (since last compaction): inline +/- diffs on Write/Edit, send full body via collapsed details, operator cancel + ttl on questions, deny-with-reason, dashboard back-link + last-turn timing + model chip, per-agent inbox view, bind-retry + SO_REUSEADDR, journald viewer, agent.nix viewer, server-side TurnState, recv(wait_seconds) max 180s, runtime /model switch + persistence to /state, crash watcher + ContainerCrash / NeedsLogin / LoggedIn / NeedsUpdate events, manager update tool, pure-hash agent_web_port + collision banner + spawn/rebuild preflight, browser notifications, focus-preserving refresh, generalised
    survival, prompt-on-submit pattern.
  • Open threads: custom per-agent MCP tools (groundwork for moving bitburner-agent into hyperhive), two-step spawn, per-agent send allow-list, telemetry/charts, notes compaction, unprivileged containers, Bash allow-list, xterm.js. Known bug (in TODO.md): question id=5 was queued but didn't render — likely a pending() row-decode error swallowed by unwrap_or_default; investigate by curl /api/state | jq '.questions' + browser console.