hyperhive

Author	SHA1	Message	Date
damocles	b86c0a2217	reminder: atomic delivery transaction + per-tick batch cap	2026-05-17 02:59:51 +02:00
damocles	e45d161cb8	todo: mark recv_blocking race bug as fixed	2026-05-17 02:59:51 +02:00
damocles	f78c6085b9	fix: subscribe-before-check in recv_blocking to avoid missed-wake race	2026-05-17 02:59:51 +02:00
damocles	4f56954422	extract TokenUsage::from_stream_event helper to keep run_claude under clippy line limit	2026-05-17 02:59:51 +02:00
damocles	ce740483c6	show token usage on per-agent web ui after each turn	2026-05-17 02:59:51 +02:00
damocles	ca86bcf4bd	add claudePluginsAutoUpdate NixOS option, default false	2026-05-17 02:59:51 +02:00
müde	6652ae90ab	tea-login: never fail switch-to-configuration. a failed tea-login oneshot used to abort `nixos-container update` (switch-to-configuration exits 4), which blocked every rebuild whether the agent needed tea or not. drop `set -e`, exit 0 on every failure path (mkdir, tea login add, missing forge). also fix the unit description, which hardcoded /state (manager-only) — sub-agents have /agents/<name>/state.	2026-05-17 02:58:39 +02:00
müde	600ed509f4	forge: ensure core/meta repo + mirror meta commits to forge. startup sweep adds ensure_repo('meta', core_token) after the orgs so the first push isn't a 404. meta::git_commit now calls forge::push_meta after every successful commit — token-in-URL `git push http://core:$token@localhost:3000/core/meta.git` — gated on the core token file existing (no-op when forge isn't seeded). push failures log warn, don't bubble up. no tea needed on the host; git is already on the hive-c0re service PATH via /run/current-system/sw.	2026-05-17 01:52:00 +02:00
müde	68020a15c9	forge: drop redundant 'core' org — meta repo lives under core user	2026-05-17 01:50:12 +02:00
müde	db87167469	forge: seed core admin user + 'core'/'agents' orgs on startup. new ensure_core_user_and_token mints a site-admin 'core' user with its token at /var/lib/hyperhive/forge-core-token (root 0600) — hive-c0re's own forge identity for pushing the meta repo + driving the admin API. that token then drives ensure_org for 'core' (meta repo lives here) and 'agents' (per-agent applied config repos). both org-create calls are idempotent: HTTP 422/409 treated as success. failures log but don't abort the rest of the sweep. curl is shelled out from the host — already on the hive-c0re service PATH via /run/current-system/sw, no new dep.	2026-05-17 01:47:54 +02:00
müde	bf20d99142	kick_agent: use /agents/<name>/state uniformly. manager has /agents bind-mounted too, so /agents/hm1nd/state resolves there alongside the legacy /state. one canonical path in the wake message instead of branching on MANAGER_NAME.	2026-05-17 01:43:42 +02:00
müde	90f5162076	kick_agent: use per-recipient state path. manager keeps /state (legacy mount); sub-agents see their state at /agents/<name>/state. wake message hardcoded /state/ for everyone, which is wrong for sub-agents post-refactor — they get a path they can't ls. switch on MANAGER_NAME and format the right path.	2026-05-17 01:43:03 +02:00
damocles	6ba4241a45	show answered question history on dashboard	2026-05-17 01:41:59 +02:00
müde	411cf86632	nix fmt + rustfmt sweep	2026-05-17 01:40:28 +02:00
müde	0cf120e9e9	harness: default claudeMarketplaces to anthropics/claude-plugins-official so every agent has the official Anthropic marketplace registered out of the box and plugin specs like 'foo@claude-plugins-official' resolve without per-agent.nix wiring. operators add more entries (community marketplace, etc.) or override to [] to opt out.	2026-05-17 01:38:29 +02:00
müde	597351ca4e	harness: declarative claude plugin marketplaces. new `hyperhive.claudeMarketplaces` option (list of strings — URL, path, or github:owner/repo). harness boot adds each via `claude plugin marketplace add` before updating + installing the configured plugins, so specs like `foo@some-marketplace` resolve on a fresh container. idempotent: 'already exists' stderr is treated as success.	2026-05-17 01:36:18 +02:00
müde	608de57924	hive-forge: default to pkgs.forgejo (15.x), expose package option. nixpkgs's services.forgejo defaults to forgejo-lts (11.0.13 today); LTS lags far enough behind that any prior non-LTS run against the same state dir leaves the DB at a migration the LTS binary can't read ('database newer than binary, refusing to start'). default to the latest release line and let operators opt down to LTS by overriding services.hive-forge.package.	2026-05-17 01:29:19 +02:00
müde	2192cb5148	forge-login: don't die on RO ~/.config/git/config. home-manager / nix-managed git configs ship the file from the nix store, so `git config --global` errors out. catch the failure and print the equivalent home-manager snippet instead of aborting — the tea + netrc steps still want to run.	2026-05-17 01:22:31 +02:00
müde	33f7408ef1	scripts: forge-migrate.sh — run pending DB migrations + restart for the 'table X has no column Y' class of schema-lag errors that showed up generating an access token on a fresh 11.0.13 install.	2026-05-17 01:21:40 +02:00
müde	a1c4d37bc9	scripts: forge-login.sh + forge-create-token.sh. forge-create-token.sh mints an access token for an existing user (prints to stdout — forgejo only shows it once). forge-login.sh configures the operator's shell: git config --global user.name / user.email, ~/.netrc entry for HTTP clones, and `tea login add` when tea is on PATH. takes the token interactively (hidden input) so it doesn't land in shell history.	2026-05-17 01:18:27 +02:00
müde	d8b05a9eb9	scripts: forge-create-user.sh wrapper. one-liner-as-a-script: `forge-create-user.sh mara --admin`. wraps the nixos-container run + runuser + --work-path dance + sensible defaults (random password, no force-change, email = <user>@hive.local) so copy-paste line-continuations don't bite.	2026-05-17 00:43:02 +02:00
müde	2b076f8ce4	forge: pass --work-path to admin CLI so app.ini is found. without --work-path, forgejo's admin CLI defaults WorkPath to the binary's directory (RO nix store), can't find custom/conf/app.ini there, falls back to defaults, and F3 init mkdir-fails inside the store. systemd unit sets WORK_PATH for the daemon; mirror it here for every nixos-container-driven 'forgejo admin' invocation.	2026-05-17 00:42:03 +02:00
müde	fed943a04e	hive-forge: pin F3 PATH absolute (init runs even when disabled). forgejo's F3 init resolves data-dir before checking ENABLED, so `forgejo admin user create` still fataled on the RO nix-store default. set [F3] PATH = /var/lib/forgejo/data/f3 alongside the disable.	2026-05-17 00:25:55 +02:00
müde	3e3c27ac48	hive-forge: disable F3 (federation) — defaults to RO nix-store path. forgejo's F3 federation subsystem resolves its data dir relative to the binary, which under nixos lands at /run/current-system/sw/bin/data/f3 (read-only nix store) and fatals the daemon at boot. we don't federate; turn it off.	2026-05-17 00:03:41 +02:00
müde	4a06615c5c	fix /state paths: sub-agents use /agents/<name>/state, not /state. sub-agent containers post-refactor bind their state at /agents/<name>/state (manager keeps the legacy /state — see lifecycle.rs:751). agent.md still said /state/forge-token; corrected to /agents/{label}/state/forge-token (template-substituted at boot). tea-login systemd unit now walks both candidates so the same harness module works for the manager and sub-agents.	2026-05-16 23:37:49 +02:00
müde	9fc7cae132	prompts: tell agents + manager about the code forge; todo: shared docs repo. system prompts now describe the hyperhive Forgejo at localhost:3000, the per-agent user, the pre-configured tea CLI, and the REST API fallback with /state/forge-token. todo gains the shared docs/skills RO-repo follow-up (org-shared + per-agent read membership).	2026-05-16 23:36:05 +02:00
müde	787c058c71	harness: install tea + auto-login from /state/forge-token. agents get `pkgs.tea` (gitea/forgejo CLI) and a tea-login oneshot that runs `tea login add --url <hyperhive.forge.url> --token $(cat /state/forge-token)` before the harness starts. idempotent: exits 0 when the token file is absent (hive-forge not on) or when ~/.config/tea/config.yml already exists. new `hyperhive.forge.url` option (default http://localhost:3000) so operators can point at a non-default forge port. claude can now shell out to `tea repos create`, `tea pulls create`, etc.	2026-05-16 23:35:28 +02:00
müde	dccbd99b0c	forge: broaden token scopes for repo create / PRs / orgs / misc. bumped from (read:user,write:repository,write:issue) to also include write:user (own profile + create repos under own namespace), write:organization (share namespaces between agents), write:misc (hooks/attachments). still excludes admin and package scopes.	2026-05-16 20:58:20 +02:00
müde	480d646f69	forge: auto-create a user + token per agent on spawn / startup. new forge module probes the hive-forge nixos-container (no-op when absent), and ensures every agent + the manager has a forgejo user named after them with an access token at `<state>/forge-token` (visible inside the container as `/state/forge-token`). idempotent: skips user creation when forgejo reports 'already exists', skips token issuance when the file is present, scopes the token to read:user,write:repository,write:issue. token-name suffixed with a clock so re-issuing doesn't collide with a stale name. shells out via `nixos-container run hive-forge -- runuser -u forgejo -- forgejo admin` (runuser instead of sudo since sudo isn't in the container by default). hooks: ensure_all sweeps existing containers at hive-c0re startup (backgrounded), and the actions.rs spawn task calls ensure_user_for the new agent right after lifecycle::spawn succeeds. failures log a warning but don't abort spawn — a missing token is recoverable from the next startup sweep.	2026-05-16 20:55:13 +02:00
müde	6e9c67dd94	hive-forge: wrap forgejo in a nixos-container. avoids fighting an operator-side `services.forgejo` over the singleton module options. container shares host netns (`privateNetwork = false`) so agents still dial the forge via plain `localhost:<httpPort>` and the host firewall is the only layer that matters. container name is `hive-forge` (no `h-` prefix) so hive-c0re's lifecycle scanner ignores it — operator manages it with the standard `nixos-container` CLI. state lives at `/var/lib/nixos-containers/hive-forge/var/lib/forgejo/` and survives restarts.	2026-05-16 20:52:36 +02:00
müde	c2d176ed13	add hive-forge module: private forgejo for agents. new `services.hive-forge.enable` (off by default) wraps `services.forgejo` with hyperhive-friendly defaults: sqlite (no extra db service), built-in ssh on 2222 so it doesn't fight the host's openssh, http on 3000 (outside hyperhive's 7000/8000/8100-8999 ranges), registration off (operator seeds agent users), private repos by default. exported as `nixosModules.hive-forge` — operator imports it on the host alongside hive-c0re. container-side wiring (MCP tools or a bind-mounted token) is deferred; containers already share the host netns so they can reach http://localhost:3000 today.	2026-05-16 20:50:36 +02:00
damocles	824acee134	include agent label in turn failure notification body	2026-05-16 20:45:19 +02:00
damocles	1023acf69f	add get_logs tool to manager mcp surface	2026-05-16 20:45:19 +02:00
damocles	fca480b86e	add turn lock to prevent /compact racing with in-flight turns	2026-05-16 20:45:19 +02:00
damocles	25508d7399	fix manager loop: pending wake + move sleep into Empty arm only	2026-05-16 20:45:19 +02:00
müde	f2a0dc4107	re-apply TodoWrite removal + deny list (lost in subsequent merge)	2026-05-16 19:47:55 +02:00
müde	313121a6e9	fix: transient state leak via RAII guard. bare set_transient/clear_transient pairs leak the in-memory transient on task cancellation, panics, or any early return between the two calls — dashboard then shows the agent stuck in 'rebuilding…' forever (coder hit this today). add Coordinator::transient_guard returning a TransientGuard whose Drop clears, and convert every caller (dashboard lifecycle_action, auto_update::rebuild_agent, manager_server Update, actions::destroy, actions Spawn task, migrate phase 4). destroy() now takes &Arc<Coordinator> so it can hold a guard. existing stuck transients clear on next hive-c0re restart since transient state is in-memory only.	2026-05-16 19:47:52 +02:00
damocles	1a36c38a54	fix broadcast send for manager, deduplicate into coordinator.broadcast_send	2026-05-16 19:31:53 +02:00
damocles	f3739d2b8e	update plugin marketplaces before install at harness boot	2026-05-16 18:51:06 +02:00
damocles	dc53615686	fix stale /state refs in agent and manager prompts	2026-05-16 18:50:15 +02:00
müde	772fdd8320	forward plugin install failures to manager from sub-agents. install_configured now takes an optional notify recipient. on a non-zero or spawn-failed 'claude plugin install', sub-agents send the spec + stderr to manager via the hyperhive socket; manager passes None so it doesn't message itself. boot still proceeds either way — notification is best-effort.	2026-05-16 17:24:04 +02:00
müde	3e040d5b16	agent: forward unhandled turn failures to manager. run_claude now keeps a 20-line stderr ring buffer and bails with it inline (was just 'exit <status>'). agent serve loop, on Failed (not PromptTooLong — that's already absorbed by drive_turn's compaction retry), sends the error body to manager via the normal hyperhive send. swallows transport errors — failure is already in journald and the events sqlite. manager-only harness (hive-m1nd) is unchanged so it doesn't try to notify itself.	2026-05-16 16:04:35 +02:00
müde	7ec658851a	back out bypassPermissions: claude refuses it under root uid. claude-code rejects --dangerously-skip-permissions / defaultMode=bypassPermissions when running as root, which all hyperhive containers do. revert to the previous explicit allow-list plumbing (per-flavor list spliced into permissions.allow + --tools enable list), keep TodoWrite out of the built-in allow set, and keep the deny list (TodoWrite, WebFetch, WebSearch, Task) as belt-and-braces in case anything sneaks past the allow gate.	2026-05-16 15:58:41 +02:00
müde	36c7f3d1c7	mirror claude stderr to tracing so journald captures it. bus-only note made post-mortems require the web UI / events sqlite; now stderr lines also land in 'journalctl -M <container> -b' alongside the existing LiveEvent::Note for the dashboard.	2026-05-16 15:30:03 +02:00
müde	7d33da3727	retry hive socket up to 5x over 60s, surface retry count to claude. socket client now retries connect/IO failures with 2-4-8-16-30s backoffs (60s total budget). transparent for non-tool callers via request(); tool handlers go through request_retried() which also returns the retry count, then annotate_retries() appends a one-line note to the tool result so claude knows the slow round-trip was a c0re flicker, not a content failure — avoids burning tokens on an LLM-level retry.	2026-05-16 15:28:18 +02:00
damocles	4a8a668348	feat: add optional description to request_apply_commit and request_spawn	2026-05-16 15:18:32 +02:00
damocles	a6d1464071	refactor: per-agent state paths (/agents/{label}/state), centralize in paths.rs	2026-05-16 15:18:32 +02:00
damocles	a82009cf8c	docs: update agent prompt to reference /agents/{label}/state and /agents/{label}/claude	2026-05-16 15:18:19 +02:00
damocles	ecaa178199	refactor: compute per-agent mount points for /agents/<name>/ structure	2026-05-16 15:18:19 +02:00
müde	6dd17864ac	auto-install claude plugins at harness boot. new hyperhive.claudePlugins NixOS option (list of strings) rendered to /etc/hyperhive/claude-plugins.json. both hive-ag3nt and hive-m1nd shell out 'claude plugin install <spec>' for each entry once at startup before the turn loop opens. failures log a warning but don't abort boot.	2026-05-16 15:17:34 +02:00
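
Commit 313121a6e9 replaces bare set_transient/clear_transient pairs with a guard whose Drop clears the transient. The sketch below shows roughly how such a guard could look. Only the names `Coordinator`, `transient_guard`, and `TransientGuard` come from the commit message; the fields, the `transient` accessor, the `rebuild` helper, and all signatures are assumptions made for illustration, not the repository's actual code.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the coordinator: one in-memory transient
/// label (for example "rebuilding") per agent name.
#[derive(Default)]
pub struct Coordinator {
    transients: Mutex<HashMap<String, String>>,
}

/// Clears the agent's transient when dropped, so early returns,
/// `?` propagation, and unwinding panics all release it.
pub struct TransientGuard {
    coord: Arc<Coordinator>,
    agent: String,
}

impl Coordinator {
    /// Sets the transient and hands back a guard that owns the cleanup.
    pub fn transient_guard(self: &Arc<Self>, agent: &str, label: &str) -> TransientGuard {
        self.transients
            .lock()
            .unwrap()
            .insert(agent.to_string(), label.to_string());
        TransientGuard {
            coord: Arc::clone(self),
            agent: agent.to_string(),
        }
    }

    /// Assumed read accessor, e.g. for a dashboard.
    pub fn transient(&self, agent: &str) -> Option<String> {
        self.transients.lock().unwrap().get(agent).cloned()
    }
}

impl Drop for TransientGuard {
    fn drop(&mut self) {
        // Tolerate a poisoned lock: Drop may run while a panic unwinds.
        let mut map = match self.coord.transients.lock() {
            Ok(map) => map,
            Err(poisoned) => poisoned.into_inner(),
        };
        map.remove(&self.agent);
    }
}

/// Hypothetical rebuild step with an early return; the guard still clears.
pub fn rebuild(coord: &Arc<Coordinator>, agent: &str, fail: bool) -> Result<(), String> {
    let _guard = coord.transient_guard(agent, "rebuilding");
    if fail {
        return Err("build failed".to_string());
    }
    Ok(())
}
```

Because the cleanup lives in Drop, a caller cannot forget it on an error path, which is the failure mode the commit describes for the dashboard's stuck 'rebuilding…' state.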


379 commits
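
Commit 7d33da3727 describes a socket client that retries with 2-4-8-16-30s backoffs and reports the retry count back to the tool result. A minimal sketch of that pattern follows. The names `request_retried` and `annotate_retries` appear in the commit message, but their signatures here, the injected `sleep` callback, the `BACKOFFS_SECS` constant, and the wording of the note are all assumptions for illustration.

```rust
use std::time::Duration;

/// Backoff delays taken from the commit message: 2, 4, 8, 16, 30 seconds.
pub const BACKOFFS_SECS: [u64; 5] = [2, 4, 8, 16, 30];

/// Runs `op`, retrying after each backoff delay while it keeps failing.
/// Returns the value plus the number of retries that were needed.
/// `sleep` is injected so the sketch can be exercised without waiting.
pub fn request_retried<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<(T, usize), E> {
    let mut retries = 0;
    loop {
        match op() {
            Ok(value) => return Ok((value, retries)),
            Err(err) => {
                // No backoff slot left: give up and surface the last error.
                let Some(&secs) = BACKOFFS_SECS.get(retries) else {
                    return Err(err);
                };
                sleep(Duration::from_secs(secs));
                retries += 1;
            }
        }
    }
}

/// Appends a one-line note when retries happened (wording is hypothetical).
pub fn annotate_retries(result: &str, retries: usize) -> String {
    if retries == 0 {
        result.to_string()
    } else {
        format!("{result}\n(note: hive socket needed {retries} retries; transport hiccup, not a content failure)")
    }
}
```

In real use the `sleep` argument would be something like `std::thread::sleep` or an async timer; passing it in keeps the retry logic testable and independent of the runtime.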