Commit graph

9 commits

Author SHA1 Message Date
müde
edb0108ae7 docs+prompt: tag-driven flow + /applied RO mount
manager prompt: explain that arbitrary files now travel with
the proposal, document the /applied/<n>/.git RO mount and the
tag scheme (git show applied/deployed/<id> etc.), call out
that applied/main only advances on deployed so a failed build
isn't terminal. approvals.md: drop the old per-agent
applied.git phrasing in favour of the single /applied RO
bind, mention both manager binds together. claude.md
scratchpad flips from in-flight to just-landed.
2026-05-15 23:03:48 +02:00
müde
497cd15137 docs: tag-driven config-apply plan + migration story
scratchpad in claude.md marks this as in-flight; docs/approvals.md
gets the new tag state machine (proposal/approved/building/deployed/
failed/denied) and the manager applied.git read-only mount. todo
picks up the unprivileged-containers git-identity caveat and a web
ui for config repos as a downstream follow-up.
2026-05-15 22:43:47 +02:00
müde
75e7faff0c docs: full sync ahead of compaction + config-management overhaul
readme: manager mcp surface picks up update; operator-surface
recap mentions /model + last-turn + model chip + the three
collapsibles (inbox / journald / agent.nix).

web-ui.md: details-restore-key story under shape; port-conflict
banner mention on containers; agent.nix viewer alongside journald;
notifications use per-event tags + console.debug log on
block/show; deny endpoint takes note=<reason>; data-prompt /
data-prompt-field generalisation noted.

conventions.md: data-prompt and snapshot/restoreOpenDetails added
to the async-forms section.

persistence.md: operator_questions row picks up deadline_at (ttl)
column with a migration note.

todo.md: new 'Bugs' section captures the manager-question
not-rendering issue with three suspect paths to chase.

claude.md scratchpad rewritten as a clean handoff for the
compaction + the upcoming config-git overhaul. flags the
two-repo (proposed/ + applied/) split as the thing to
reconsider.
2026-05-15 22:12:40 +02:00
müde
80229c6af9 manager: needs_login / logged_in / needs_update events + update tool
crash_watch grows two more state-axes alongside running/stopped:

- logged-in (claude session dir populated for the agent)
- up-to-date (recorded flake rev matches current)

per-tick transitions emit HelperEvent::NeedsLogin / LoggedIn /
NeedsUpdate. seed-on-first-tick semantics retained — nothing fires
on harness boot for agents that were already in their state. only
needs_update fires the 'stale appeared' direction; the resolved
direction is already covered by Rebuilt.

new mcp__hyperhive__update(name) on the manager surface: idempotent
rebuild via auto_update::rebuild_agent. transient-aware (Rebuilding)
so the dashboard shows the spinner. login intentionally has NO tool
— it's interactive OAuth, only the operator can complete it.

prompts + approvals doc + turn-loop doc updated. todo grows a
'show per-agent applied config in dashboard' entry (separate
follow-up).
2026-05-15 21:42:13 +02:00
müde
62d1a74929 docs sync + revert auto-unfree removal
revert the earlier 'operator must set allowUnfree' move:
per-agent containers evaluate their own nixpkgs and the operator's
host-level allowUnfree doesn't propagate in. restoring the scoped
allowUnfreePredicate inside both the claude-unstable overlay and
harness-base.nix; documented in README + gotchas as 'nothing to
set on the operator side'.

docs:
- claude.md file map adds crash_watch.rs, kick_agent on coordinator,
  /api/model + journald viewer + bind-with-retry references.
- scratchpad rewritten to reflect the recent run.
- web-ui.md: notification row + browser notifications section,
  state row (badge + model chip + last-turn chip + cancel button),
  per-agent inbox, /model slash, /cancel-question + journald
  endpoints, focus-preservation on refresh.
- turn-loop.md: --model is read from Bus::model() per turn (runtime
  override via /model); recv(wait_seconds) up to 180s with the
  rationale; ask_operator gains ttl_seconds; new TurnState section;
  kick_agent inbox-on-startup hint.
- approvals.md: ttl/cancel resolution paths for operator questions.
- persistence.md: /state/hyperhive-model file.
- gotchas.md: web UI port collision policy (rename, don't probe);
  bind retry + SO_REUSEADDR shape; auto-unfree restored.
- todo.md: cleaned up empty sections and stale entries; /model
  shipped, dropped from the list.
2026-05-15 21:26:13 +02:00
müde
8b9f7d21b7 model persisted to /state; stop auto-allowing claude-code unfree
model persistence: /model <name> now writes to /state/hyperhive-model
(in-container), Bus::new reads it on init. operator override survives
harness restart and container rebuild; gone on --purge like every
other piece of agent state. path overridable via HYPERHIVE_MODEL_FILE
for tests. failure to persist is a warn, not fatal — runtime override
still applies, just won't survive a restart.

unfree opt-in: drop the auto-allowUnfreePredicate from
harness-base.nix and the claude-unstable overlay. operator now has to
set nixpkgs.config.allowUnfree (or a predicate listing claude-code)
in their own host config. silent unfree bypass was sketchy; this is
honest. readme + gotchas updated to spell out the snippet.

todo: drops model-persistence + container-crash + journald (all
shipped); adds per-agent send allow-list (constrain who an agent can
message).
2026-05-15 21:05:40 +02:00
müde
58c3cd853b container crash watcher → HelperEvent::ContainerCrash
new hive_c0re::crash_watch task polls every 10s, builds the set of
currently-running containers, and on running→stopped transitions
checks the transient snapshot: if no Stopping / Restarting /
Destroying / Rebuilding flag is set, the container exited
unexpectedly and we fire HelperEvent::ContainerCrash into the
manager's inbox so it can react (typically: start it again).

first poll is a seeding pass — no events on harness startup. dbus
subscription would be lower-latency but polling is honest and
debuggable, and a 10s delay on crash detection is fine for our
scale.

manager prompt + approvals doc updated to advertise the new
event variant. todo drops the entry (and the journald-viewer
entry that already shipped).
2026-05-15 21:02:05 +02:00
müde
8b10731aa4 split claude.md into docs/ — per-topic, human-readable
claude.md was eating 400 lines of subsystem detail that's useful
when you're working on that subsystem and noise the rest of the
time. split into:

- docs/conventions.md   naming, identity, async forms, commit style
- docs/gotchas.md       nspawn / nixos-container quirks
- docs/web-ui.md        dashboard + per-agent layouts and endpoints
- docs/turn-loop.md     claude invocation, wake prompt, mcp surface
- docs/approvals.md     approval flow, manager policy, helper events
- docs/persistence.md   sqlite dbs, retention, state dir layout

claude.md is now the entry point — file map, reading paths
("pick the doc that matches your task"), quick reminders that
fit on one screen, and a small scratchpad section for in-flight
context. references the docs; the docs don't reference claude.md.

no content was lost — the docs/ files cover everything the old
claude.md did, plus things i wrote up better while extracting.
2026-05-15 20:17:11 +02:00
müde
9af5234f74 Phase 7e: damocles migration plan; CLAUDE.md phase status 2026-05-15 00:32:26 +02:00