diff --git a/TODO.md b/TODO.md
index fc1d8c4..23277de 100644
--- a/TODO.md
+++ b/TODO.md
@@ -1,141 +1,17 @@
-# TODO
+# Hyperhive TODOs
 
-Pick anything from here when relevant. Cross-cutting design notes live in
-[CLAUDE.md](CLAUDE.md); high-level project intro in [README.md](README.md).
+## Architecture / Features
 
-## Security
+- Shared space for all agents to access documents/files without manager routing
+- Private git forge agents can push to and create new repos in
+- Move bind mounts in agents to `/agents//state` so path for agent = path for manager
 
-- **Unprivileged containers (userns mapping).** Today the nspawn container
-  runs as a fully privileged root. Goal: `PrivateUsersChown=yes` (or the
-  nixos-container equivalent) so uid 0 inside maps to an unprivileged uid
-  on the host, and a container-root compromise lands the attacker on an
-  ordinary user account, not the host's root. Requires per-agent state
-  dirs to be chown'd to that uid on the host side. The per-agent git
-  identity (currently injected via `programs.git.config.user` against
-  the root user in `setup_applied`'s generated flake) also needs to be
-  provisioned for whatever non-root user claude runs as, or commits
-  the manager makes against `/agents//config` will fall back to a
-  generic `nixos@…` identity.
-- **Bash command allow-list.** Replace the blanket `Bash` allow with a
-  pattern allow-list (`Bash(git *)`, `Bash(nix build .*)`, etc.) per
-  claude-code's `--allowedTools` extended grammar. Likely lives in
-  `agent.nix` so each agent can scope its own shell surface.
+## Reminder Tool
 
-## Operational hygiene (post-meta-flake)
+- Handle text overflow → suggest file_path option for long messages
+- Per-agent reminder limits (burst capacity, rate limiting)
 
-- **Tag retention.** Every approval mints up to 5 tags in
-  `applied//.git` (`proposal/`, `approved/`, `building/`,
-  `deployed/`, plus `failed/` or `denied/`). Every successful
-  deploy adds one commit to `/var/lib/hyperhive/meta/.git`.
-  Both grow unbounded. A retention policy — keep all
-  `deployed/*` indefinitely, age-out `failed/` + `denied/`
-  after N days, drop `proposal/` + `approved/` + `building/`
-  once a terminal sibling lands — would keep the audit
-  trails browsable without forever-growth.
+## Dashboard
 
-- **Inert `nix flake lock` no-args call in `meta::sync_agents`.**
-  Still valid in current nix (resolves missing inputs without
-  bumping existing ones) but parallel to the deprecated
-  `--update-input` we just had to migrate. Worth keeping an
-  eye on; if it gets renamed too, sync_agents stops being
-  able to seed a fresh meta repo.
-
-## Bugs
-
-- **Pending question doesn't always appear on the dashboard.**
-  Repro: manager calls `ask_operator`, tool result is
-  `question queued (id=N)` (so the row is in sqlite), but the
-  M1ND H4S QU3STI0NS section keeps showing "no pending
-  questions". Last seen with id=5. Diagnostic step landed:
-  `api_state` now warn-logs (target=`api_state`) when any of
-  its source queries fail instead of silently
-  `unwrap_or_default`-ing — next repro should print the
-  underlying error in journald and tell us whether this is
-  sqlite (likely `OperatorQuestions::pending()` row-decode
-  panic on a migrated column) or dashboard-JS-side
-  (`renderQuestions` exception). Re-investigate with the new
-  log once the bug fires.
-
-## UI / UX
-
-- **Dashboard layout overhaul.** A 3-column attempt (swarm
-  / 0per4t0r 1n / m3ss4g3s) landed + was reverted in 74ba8a6
-  — looked worse in practice (sticky col-heads fighting the
-  banner, sub-heads too small, columns too narrow for the
-  container rows). Sections are now ordered semantically in
-  a single column (swarm bits first, then decisions, then
-  messages) which is a no-cost improvement. The bigger
-  restructure is still worth doing; next attempt should:
-  - keep current widths usable (don't crunch container
-    rows < ~36em — they have a lot inline)
-  - default the heavy-but-rare sections (kept-state, meta-
-    inputs, msg-flow history) into a collapsed `<details>`
-    so they don't dominate when empty
-  - drop the per-section banner divider lines in favour of
-    something quieter (a single border-top on the h2?)
-  - try a *masonry-ish* layout (CSS `grid-template-rows:
-    masonry` once browsers support it; or just two columns
-    where messages floats on the right at wide viewports
-    while the rest stacks left). avoid sticky headers — they
-    fought the page banner last time.
-
-
-
-- **Web UI for config repos + meta deploy log.** Browse
-  per-agent proposed / applied tags
-  (`proposal/* / approved/* / building/* / deployed/* /
-  failed/* / denied/*`) plus the swarm-wide meta repo's git
-  log on the dashboard. Read-only log + diff + raw-file view
-  is enough — something lighter than a full forge. The meta
-  log already answers "what's deployed where + when"; this
-  surfaces it without an ssh-to-host detour.
-
-- **xterm.js terminal** embedded per-agent, attached to a PTY exposed by
-  the harness. Pairs well with the unprivileged-container work — would let
-  the operator drop into the container without `nixos-container root-login`.
-
-## Telemetry
-
-- **Harness stats per agent in sqlite, charted on the agent page.**
-  bitburner-agent samples 18 series; for hyperhive the generally-applicable
-  ones are:
-  - turns/min, tool calls/turn, turn duration p50/p95
-  - claude exit code distribution (ok vs `--compact`-retry vs failure)
-  - inbox depth (current + max-over-window)
-  - messages sent/received per turn (split by recipient: peer / operator /
-    manager / system)
-  - approval queue length (across all agents — dashboard-level)
-  - per-tool usage counts (Read/Edit/Bash/send/recv/…)
-  - time-since-last-turn (helps spot stuck agents)
-  - notes file size growth (cues compaction)
-  Backend: a `stats` table with `(agent, ts, key, value)` written from
-  the harness on `TurnEnd`; `GET /api/stats?since=…` returns the
-  series; agent page renders with a small chart lib (uPlot is light).
-
-## Spawn flow
-
-- **Two-step spawn.** Today `request_spawn(name)` is one shot: manager
-  asks → operator approves → container is created with a default
-  `agent.nix` and empty `/state/`. Manager has no way to pre-stage
-  per-agent prompt material, package additions, or initial notes before
-  the agent first wakes. Split into:
-  1. `request_spawn_draft(name)` — host creates the per-agent
-     `proposed/` repo (initial commit) and `state/` dir with no
-     container; manager now has `/agents//{config,state}/` to
-     edit + commit just like an existing agent.
-  2. `request_spawn_commit(name, commit_ref)` — submits the queued
-     approval; operator sees the diff in the dashboard like a normal
-     `apply_commit`; on approve the container is created from that
-     commit.
-  Backwards-compat: keep the existing one-shot `request_spawn` for
-  trivial agents (operator can still type a name in the dashboard).
-  Surface "drafts" as a new section between K3PT ST4T3 and approvals.
-
-## Loop substance
-
-- **Notes compaction.** `/state/` is bind-mounted persistently and agents
-  are told (in the system prompt) to keep `/state/notes.md` for durable
-  knowledge — but we don't currently nudge them to compact when notes
-  grow. Bitburner-agent's pattern: a short-lived secondary claude session
-  that takes the existing notes + a "compact this" prompt and rewrites
-  them in place. Add when the notes start bloating.
+- Per-agent reminder status (pending, delivered)
+- Reminder query interface for debugging