new `hyperhive.claudeMarketplaces` option (list of strings — URL,
path, or github:owner/repo). harness boot adds each via
`claude plugin marketplace add` before updating + installing the
configured plugins, so specs like `foo@some-marketplace` resolve
on a fresh container. idempotent: 'already exists' stderr is
treated as success.
sub-agent containers post-refactor bind their state at
/agents/<name>/state (manager keeps the legacy /state — see
lifecycle.rs:751). agent.md still said /state/forge-token; corrected
to /agents/{label}/state/forge-token (template-substituted at
boot). tea-login systemd unit now walks both candidates so the same
harness module works for the manager and sub-agents.
system prompts now describe the hyperhive Forgejo at localhost:3000,
the per-agent user, the pre-configured tea CLI, and the REST API
fallback with /state/forge-token. todo gains the shared docs/skills
RO-repo follow-up (org-shared + per-agent read membership).
install_configured now takes an optional notify recipient. on a
non-zero or spawn-failed 'claude plugin install', sub-agents send
the spec + stderr to manager via the hyperhive socket; manager
passes None so it doesn't message itself. boot still proceeds either
way — notification is best-effort.
run_claude now keeps a 20-line stderr ring buffer and bails with it
inline (was just 'exit <status>'). agent serve loop, on Failed (not
PromptTooLong — that's already absorbed by drive_turn's compaction
retry), sends the error body to manager via the normal hyperhive
send. swallows transport errors — failure is already in journald
and the events sqlite. manager-only harness (hive-m1nd) is unchanged
so it doesn't try to notify itself.
claude-code rejects --dangerously-skip-permissions / defaultMode=
bypassPermissions when running as root, which all hyperhive
containers do. revert to the previous explicit allow-list plumbing
(per-flavor list spliced into permissions.allow + --tools enable
list), keep TodoWrite out of the built-in allow set, and keep the
deny list (TodoWrite, WebFetch, WebSearch, Task) as belt-and-braces
in case anything sneaks past the allow gate.
bus-only note made post-mortems require the web UI / events sqlite;
now stderr lines also land in 'journalctl -M <container> -b' alongside
the existing LiveEvent::Note for the dashboard.
socket client now retries connect/IO failures with 2-4-8-16-30s
backoffs (60s total budget). transparent for non-tool callers via
request(); tool handlers go through request_retried() which also
returns the retry count, then annotate_retries() appends a one-line
note to the tool result so claude knows the slow round-trip was a
c0re flicker, not a content failure — avoids burning tokens on an
LLM-level retry.
new hyperhive.claudePlugins NixOS option (list of strings) rendered
to /etc/hyperhive/claude-plugins.json. both hive-ag3nt and hive-m1nd
shell out 'claude plugin install <spec>' for each entry once at
startup before the turn loop opens. failures log a warning but don't
abort boot.
claude-settings.json now sets permissions.defaultMode=bypassPermissions
with a small deny list (WebFetch, WebSearch, Task, TodoWrite). The
per-flavor allow list and --tools / --allowedTools CLI flags are gone
— anything not denied auto-approves. mcp.rs loses ALLOWED_BUILTIN_TOOLS,
builtin_tools_arg, allow_list, allowed_mcp_tools. The extraMcpServers
allowedTools field is parsed for back-compat but no longer wired
anywhere; restrict via permissions.deny instead.
new NixOS option in harness-base.nix:
hyperhive.allowedRecipients = [ 'alice' 'manager' ]; # whitelist
hyperhive.allowedRecipients = [ ]; # default = unrestricted
module writes the list as JSON to /etc/hyperhive/send-allow
.json at activation. AgentServer::send reads the file before
issuing the broker request; if the list is non-empty and
`to` isn't on it, the tool returns a claude-readable refusal
string without touching the broker. the manager is always
implicitly permitted regardless of the list — otherwise a
misconfigured allow-list could strand a sub-agent without an
escalation path.
enforcement is in the in-container MCP server (not on the
host's per-agent socket) because the agent's nix config is the
trust boundary anyway — the operator audits agent.nix at
deploy time, the activation-time /etc/hyperhive/send-allow
.json is r/o under /nix/store, so the agent can't tamper at
runtime without going through a new approval.
agent prompt mentions the option + tells claude to route
through the manager when refused. retires the matching TODO
under Permissions / policy.
old behavior: omitted wait_seconds fell through to the 30s
RECV_LONG_POLL_DEFAULT — claude calling 'is there anything in
my inbox right now?' between actions blocked the turn for half
a minute. flip the semantics: None (or 0) returns immediately,
positive value parks up to MAX (180s, unchanged). cleaner
'peek vs wait' distinction; tool descriptions + agent/manager
prompts updated to point at the new shape.
harness's own serve loops in hive-ag3nt + hive-m1nd relied on
the old default for their inbox poll. they now explicitly pass
wait_seconds: Some(180) to opt into the full park — same
effective behavior as before, just spelled out.
retires the matching TODO under Turn loop.
new AgentRequest::Wake { from, body } drops a message into
this agent's inbox via the per-agent socket. matrix-style MCP
servers can use it when they receive an external event
(matrix message, webhook, scrape result) to nudge claude
into running a turn. broker.send wakes whatever Recv is
currently long-polling, the harness picks the message up,
formats a wake prompt with the caller's chosen from label
('matrix: new dm', 'webhook: deploy succeeded', etc.).
new `hive-ag3nt wake --from <label> --body <text>` subcommand
on the harness binary so MCP servers can shell out instead of
implementing the line-JSON protocol themselves; body=='-'
reads from stdin for multi-line / quoting-friendly payloads.
identity = socket: anything that can connect to /run/hive/mcp
.sock is implicitly trusted to inject. that's fine because the
bind-mount is the agent's own container; no new auth surface
opens up.
docs/turn-loop.md gets a new 'Waking the agent from inside
the container' section pointing at both paths (CLI + raw
JSON).
new boilerplate wraps agent.nix as a sub-module + passes every
flake input (minus self) through to it via _module.args.flake
Inputs. manager edits the inputs block of flake.nix to pull in
out-of-tree flakes (MCP servers etc.) and references them in
agent.nix as flakeInputs.<name>.packages.${pkgs.system}.default
— the new input's pinned sha lands in the agent's own flake
.lock (already tracked + part of the proposal flow), and
transitively rolls up into meta's lock.
migrate's MODULE_FLAKE_MARKER swaps to _module.args.flakeInputs
so existing agents on the old 'nixosModules.default = import
./agent.nix' template get re-rendered onto the new shape on
next hive-c0re start.
manager_server's flake.nix tamper-check goes away — the build
path's failed/<id> annotated tag already provides the safety
net when a manager edit breaks the flake; enforcing 'no
flake.nix edits at all' was overly strict (blocks the inputs-
addition pattern that's the whole point of this change).
manager prompt updated with a worked example for adding an
MCP-server flake input + wiring it through agent.nix.
new NixOS option in harness-base.nix:
hyperhive.extraMcpServers.<key> = {
command = "/path/to/server";
args = [ ... ];
env = { KEY = "value"; };
allowedTools = [ "send_message" "join_room" ]; # or ["*"]
};
declared as attrsOf submodule so agents stack arbitrarily many.
the module writes the whole map as JSON to
/etc/hyperhive/extra-mcp.json at activation; the harness reads
that file in mcp::render_claude_config and merges each entry
into the rendered --mcp-config under its own mcpServers.<key>
block. allowed_mcp_tools(flavor) extends the --allowedTools
arg with mcp__<key>__<pattern> for every entry — "*" (the
default) becomes mcp__<key>__* so every tool from that server
is auto-approved, or pass a concrete list to tighten.
collision guard: an extra server keyed "hyperhive" is dropped
with a warn-log so user config can't shadow the built-in
surface. malformed JSON / missing file fall back to "no
extras" silently.
prompt note added: agents see "(some agents only) extra MCP
tools surfaced as mcp__<server>__<tool>" and learn they're
declared via agent.nix. retires the matching TODO under
Per-agent extension. matrix-chat agents + bitburner-agent
migration unblocked.
new NixOS module option services.hive-c0re.operatorPronouns
(free text, default 'she/her', example 'they/them'). hive-c0re
takes it as a CLI flag (--operator-pronouns, lib.escapeShellArg'd
in the systemd unit), stores it on Coordinator, threads it into
the meta flake's mkAgent so each agent's systemd service gets
HIVE_OPERATOR_PRONOUNS set. the harness reads the env at boot
and substitutes {operator_pronouns} into the agent / manager
system prompt alongside {label}. nix string is escaped against
backslash + double-quote so non-ascii / quoted values
round-trip safely. prompt addendum: both agent.md and
manager.md mention the operator's pronouns up front so claude
uses them naturally in third-person reference. propagates on
next ↻ R3BU1LD (meta lock bump, no per-agent approval).
new AgentRequest::AskOperator + AgentResponse::QuestionQueued on
the per-agent socket — same shape as the manager flavor, agent
gets the same wire surface (still uses the same operator_questions
table). agent_server::dispatch wires AskOperator through coord
.questions.submit(agent, ...) so the row's asker is the sub-agent
name; the ttl watchdog already in manager_server gets shared and
spawn_question_watchdog goes pub.
answer routing: operator_questions::answer now returns (question,
asker). post_answer_question + post_cancel_question + the watchdog
fire OperatorAnswered through new coord.notify_agent(asker, event)
instead of always notify_manager — the event lands in whichever
agent originally asked. notify_manager is now a thin wrapper.
agent socket plumbing: agent_server::start takes Arc<Coordinator>
instead of Arc<Broker> so dispatch has access to questions +
notify path; coordinator::{register_agent,ensure_runtime} take
self: &Arc<Self>. mcp::AgentServer grows the ask_operator tool;
allowed_mcp_tools(Agent) adds it; prompts/agent.md replaces the
'message the manager to ask the operator' guidance with the
direct tool description.
new TurnFiles bundle (mcp_config + settings + system_prompt +
flavor) materialised once per harness boot, passed to drive_turn
and compact_session alike. operator-initiated /compact now uses
the exact same session shape as a normal turn — same MCP
surface, same allowed tools, same role prompt — only the stdin
payload differs (/compact vs the wake-up body). web_ui's
AppState carries the TurnFiles instead of (label + socket +
flavor + ad-hoc file writes per click). bin/hive-ag3nt and
bin/hive-m1nd prepare TurnFiles before spawning the web UI and
pass them to both surfaces. web_ui::Flavor folds into a type
alias for mcp::Flavor — no two-stage enum mapping.
removes ClaudeMode + the run_claude variant fork (system prompt
was Option, mcp args were skipped on Compact). dead 'mode'
plumbing gone.
every claude invocation now runs with current_dir set to
/state — relative paths in tool calls (Read notes.md, Bash ls,
Write blob) land in the agent's durable bind-mounted dir
instead of the harness's systemd cwd. /state is RW + survives
destroy/recreate so this matches where the agent is told to
keep notes anyway. dev/test setups without the bind silently
fall back to the parent cwd.
bus carries a one-shot AtomicBool armed by POST
/api/new-session (or the /new-session slash command). next
turn drops --continue, starting a fresh claude session; the
flag clears automatically so subsequent turns resume normal
behavior. /compact still always uses --continue — compacting
a non-existent session is a no-op anyway.
per-agent page grows an ↻ new session button next to the
cancel-turn one (always visible, amber, confirms before
posting since dropping --continue context isn't reversible).
slash-command surface picks up /new-session for parity with
the button. note row emitted on the live feed both at arm-
time and again when the turn actually consumes the flag, so
the operator can confirm it landed.
agent.nix becomes a plain NixOS module function — flake.nix is
fixed boilerplate the manager mustn't edit; meta flake at /meta/
owns the wrapper. proposed repos ship with an 'applied' remote
pre-wired, so 'git fetch applied' / 'git log applied/main' /
'git show applied/refs/tags/deployed/<id>' all just work without
constructing paths by hand. /meta/ exposes the system-wide
deploy log (git log /meta) + flake.lock for cross-agent sha
introspection.
manager prompt: explain that arbitrary files now travel with
the proposal, document the /applied/<n>/.git RO mount and the
tag scheme (git show applied/deployed/<id> etc.), call out
that applied/main only advances on deployed so a failed build
isn't terminal. approvals.md: drop the old per-agent
applied.git phrasing in favour of the single /applied RO
bind, mention both manager binds together. claude.md
scratchpad flips from in-flight to just-landed.
crash_watch grows two more state-axes alongside running/stopped:
- logged-in (claude session dir populated for the agent)
- up-to-date (recorded flake rev matches current)
per-tick transitions emit HelperEvent::NeedsLogin / LoggedIn /
NeedsUpdate. seed-on-first-tick semantics retained — nothing fires
on harness boot for agents that were already in their state. only
needs_update fires the 'stale appeared' direction; the resolved
direction is already covered by Rebuilt.
new mcp__hyperhive__update(name) on the manager surface: idempotent
rebuild via auto_update::rebuild_agent. transient-aware (Rebuilding)
so the dashboard shows the spinner. login intentionally has NO tool
— it's interactive OAuth, only the operator can complete it.
prompts + approvals doc + turn-loop doc updated. todo grows a
'show per-agent applied config in dashboard' entry (separate
follow-up).
send was truncating to 80 chars in the tool_use row, hiding
anything past the first sentence. now renders as a collapsed
<details> like Write/Edit — summary still shows the recipient +
headline (so the operator can scan), expanding reveals the full
body unchanged.
recv side was already covered: the wake prompt shows the full
incoming body, and explicit recv() tool_result rows expand to
the full text via the existing collapsed-results path.
model persistence: /model <name> now writes to /state/hyperhive-model
(in-container), Bus::new reads it on init. operator override survives
harness restart and container rebuild; gone on --purge like every
other piece of agent state. path overridable via HYPERHIVE_MODEL_FILE
for tests. failure to persist is a warn, not fatal — runtime override
still applies, just won't survive a restart.
unfree opt-in: drop the auto-allowUnfreePredicate from
harness-base.nix and the claude-unstable overlay. operator now has to
set nixpkgs.config.allowUnfree (or a predicate listing claude-code)
in their own host config. silent unfree bypass was sketchy; this is
honest. readme + gotchas updated to spell out the snippet.
todo: drops model-persistence + container-crash + journald (all
shipped); adds per-agent send allow-list (constrain who an agent can
message).
new hive_c0re::crash_watch task polls every 10s, builds the set of
currently-running containers, and on running→stopped transitions
checks the transient snapshot: if no Stopping / Restarting /
Destroying / Rebuilding flag is set, the container exited
unexpectedly and we fire HelperEvent::ContainerCrash into the
manager's inbox so it can react (typically: start it again).
first poll is a seeding pass — no events on harness startup. dbus
subscription would be lower-latency but polling is honest and
debuggable, and a 10s delay on crash detection is fine for our
scale.
manager prompt + approvals doc updated to advertise the new
event variant. todo drops the entry (and the journald-viewer
entry that already shipped).