hyperhive/CLAUDE.md
müde 497cd15137 docs: tag-driven config-apply plan + migration story
scratchpad in claude.md marks this as in-flight; docs/approvals.md
gets the new tag state machine (proposal/approved/building/deployed/
failed/denied) and the manager applied.git read-only mount. todo
picks up the unprivileged-containers git-identity caveat and a web
ui for config repos as a downstream follow-up.
2026-05-15 22:43:47 +02:00

155 lines
8 KiB
Markdown

# hyperhive — claude entry point
Hey claude. This is your starting page. The detailed docs live in
[`docs/`](docs/) and are written for humans + you both — read them
when you need depth on a subsystem. This file is the index +
scratchpad.
- High-level project intro: **[README.md](README.md)**.
- Open work + backlog: **[TODO.md](TODO.md)**.
## File map
```
hive-c0re/ host daemon + CLI (one binary, subcommand-dispatched)
src/main.rs clap setup; serve / spawn / kill / rebuild / list /
pending / approve / deny / destroy [--purge] /
request-spawn; periodic vacuum tasks
src/server.rs host admin socket (HostRequest → dispatch)
src/client.rs admin-socket client
src/manager_server.rs manager-privileged socket (ManagerRequest)
src/agent_server.rs per-sub-agent socket listener (long-poll Recv)
src/broker.rs sqlite Message store + broadcast channel for SSE +
hourly vacuum of delivered>30d
src/approvals.rs sqlite Approval queue + kinds
src/operator_questions.rs sqlite question queue backing `ask_operator`
src/events_vacuum.rs host-side hourly sweep of every agent's
/state/hyperhive-events.sqlite
src/crash_watch.rs poll every 10s; fire HelperEvent::ContainerCrash
when a previously-running container disappears
without an operator-initiated transient
src/coordinator.rs shared state (broker/approvals/questions/transient/
sockets) + tombstone enumeration + kick_agent
src/actions.rs approve/deny/destroy (transient-aware)
src/auto_update.rs startup rebuild scan + ensure_manager
src/lifecycle.rs `nixos-container` shellouts, per-agent flake generator
src/dashboard.rs axum HTTP: static shell + /api/state JSON + actions
+ journald viewer + bind-with-retry (SO_REUSEADDR)
assets/ index.html, dashboard.css, app.js (include_str!)
hive-ag3nt/ in-container harness crate; produces TWO binaries
src/lib.rs re-exports + DEFAULT_SOCKET, DEFAULT_WEB_PORT
src/client.rs generic JSON-line request/response over unix socket
src/web_ui.rs per-container axum HTTP page (incl /api/cancel,
/api/compact, /api/model, /events/history)
src/events.rs LiveEvent + broadcast Bus + sqlite-backed history
(/state/hyperhive-events.sqlite) + TurnState +
model selection (persisted at /state/hyperhive-model)
src/turn.rs claude --print + stream-json pump; --compact retry
src/mcp.rs embedded MCP server (rmcp): AgentServer + ManagerServer
src/login.rs probe /root/.claude/ for a valid session
src/login_session.rs drives `claude auth login` over stdio pipes
src/bin/hive-ag3nt.rs sub-agent main (Serve + Mcp subcommands)
src/bin/hive-m1nd.rs manager main (Serve + Mcp subcommands)
assets/ index.html, agent.css, app.js (include_str!)
prompts/ static role/tools/settings for claude (include_str!):
agent.md — sub-agent system prompt
manager.md — manager system prompt
claude-settings.json — --settings JSON
hive-sh4re/ wire types (HostRequest/Response, AgentRequest/Response,
ManagerRequest/Response, Message, Approval, HelperEvent)
nix/
modules/hive-c0re.nix systemd service + firewall + git wiring
templates/harness-base.nix shared scaffolding for sub-agents + manager
templates/agent-base.nix sub-agent nixosConfiguration
templates/manager.nix manager nixosConfiguration
docs/
conventions.md naming, identity=socket, async forms, commit style
gotchas.md NixOS / nspawn lessons learned the hard way
web-ui.md dashboard + per-agent page layouts and endpoints
turn-loop.md claude invocation, wake prompt, MCP tool surface
approvals.md approval flow, manager policy, helper events
persistence.md sqlite dbs, retention, state dir layout
```
## Reading paths
Pick the doc that matches your task. None depend on the others —
read them à la carte.
- **"What does the dashboard look like?"** →
[`docs/web-ui.md`](docs/web-ui.md).
- **"How does claude get its prompt and what tools does it have?"** →
[`docs/turn-loop.md`](docs/turn-loop.md).
- **"How do config changes flow from manager to operator to
container?"** → [`docs/approvals.md`](docs/approvals.md).
- **"What state survives destroy / purge / restart?"** →
[`docs/persistence.md`](docs/persistence.md).
- **"Naming, commit style, wire protocol, the `data-async`
pattern."** → [`docs/conventions.md`](docs/conventions.md).
- **"Why does the nspawn flag look like that?"** →
[`docs/gotchas.md`](docs/gotchas.md).
## Quick reminders
- **Commit before test.** Stage and commit when work *looks*
ready, then run validation. Failures get a follow-up commit
rather than an amend.
- **Commit messages: short, lowercase, no `Co-Authored-By`
trailer.** Imperative mood.
- **`rebuild` is the reconcile verb.** Anything that changes
per-container state on the host should be re-applied there so
the dashboard's `↻ R3BU1LD` is sufficient to recover.
- **Identity = socket.** No auth tokens — the socket path
identifies the principal.
- **Actions are factored** between admin socket and dashboard via
`actions.rs` and `dashboard.rs::lifecycle_action`, so the two
surfaces never drift.
## Scratchpad
In-flight or recent context that hasn't earned a section yet.
Prune freely.
- **In flight:** tag-driven config-apply overhaul. Keep the
two-repo split (proposed = manager RW, applied = core-only)
for safety — agent can rm -rf its own repo but never reaches
applied. New flow: at `request_apply_commit` time hive-c0re
fetches the manager's commit into applied and tags it
`proposal/<id>`; the manager's repo is then dead to core for
that approval. Approve/deny/build are encoded as more tags
(`approved/`, `building/`, `deployed/`, `failed/`, `denied/`)
on the same commit; `applied/main` only fast-forwards on
`deployed/`. Failure tags are annotated with the build error;
deny tags with the operator note. Manager gets `applied/.git`
bind-mounted RO at `/agents/<n>/applied.git` so it can `git
show` deployed/failed/denied trees and diff against its own
working tree. agent.nix stays the entry point but arbitrary
files in the manager's commit are now preserved; `flake.nix`
becomes hive-c0re-generated, gitignored, regenerated only on
spawn/rebuild. Migration: no in-place. Each existing agent
needs `destroy --purge` + re-spawn; tombstones lose their
history. See `docs/approvals.md` for the tag state machine.
- **Recent (since last compaction):** inline +/- diffs on
Write/Edit, send full body via collapsed details, operator
cancel + ttl on questions, deny-with-reason, dashboard
back-link + last-turn timing + model chip, per-agent inbox
view, bind-retry + SO_REUSEADDR, journald viewer,
agent.nix viewer, server-side TurnState, recv(wait_seconds)
max 180s, runtime /model switch + persistence to /state,
crash watcher + ContainerCrash / NeedsLogin / LoggedIn /
NeedsUpdate events, manager `update` tool, pure-hash
agent_web_port + collision banner + spawn/rebuild preflight,
browser notifications, focus-preserving refresh, generalised
<details data-restore-key> survival, prompt-on-submit pattern.
- **Open threads:** custom per-agent MCP tools (groundwork for
moving bitburner-agent into hyperhive), two-step spawn,
per-agent send allow-list, telemetry/charts, notes
compaction, unprivileged containers, Bash allow-list,
xterm.js. **Known bug** (in TODO.md): question id=5 was
queued but didn't render — likely a `pending()` row-decode
error swallowed by `unwrap_or_default`; investigate by curl
/api/state | jq '.questions' + browser console.