hyperhive/docs/boundary.md

The operator/agent boundary

Design rationale for hyperhive's two-principal trust model. The implementation work (container network isolation, the unifying gateway, and privilege separation, or privsep, for the core daemon) is tracked as area:ops issues on the forge.

Today "the operator surface" and "the agent surface" are a convention, not a boundary — nothing stops a container from curling the core daemon on localhost:<port>, or another agent's web UI. Network isolation, the gateway, and privsep together turn that convention into an enforced boundary.

Two principals, two paths

  • Operator — reaches every UI (the dashboard + every per-agent page) through the gateway, on one origin. Operator-authority actions (approve / deny, answer-as-operator, lifecycle POSTs) are served by the core daemon and only reachable via the gateway.
  • Agent — speaks only for itself, only over its per-agent unix socket. The socket's identity is the agent (see docs/conventions.md, "identity = socket"). An agent must not be able to reach the core daemon's HTTP surface, another agent's socket, or another agent's web UI.
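The operator path can be pictured as a single routing decision at the gateway. The sketch below is illustrative only: the `/agents/<name>/` URL layout, the `Upstream` tuple, and the `route` function are assumptions made for this example, not hyperhive's actual gateway configuration.

```python
from typing import NamedTuple


class Upstream(NamedTuple):
    kind: str  # "core" or "agent"
    name: str  # agent name, or "" for the core
    path: str  # path forwarded to the upstream


# Hypothetical URL layout: /agents/<name>/... is a per-agent page,
# everything else belongs to the core dashboard.
AGENT_PREFIX = "/agents/"


def route(path: str) -> Upstream:
    """Map a request path on the single gateway origin to an upstream."""
    if path.startswith(AGENT_PREFIX):
        rest = path[len(AGENT_PREFIX):]
        name, _, tail = rest.partition("/")
        if name:
            return Upstream("agent", name, "/" + tail)
    # Operator-authority actions fall through to the core daemon.
    return Upstream("core", "", path)
```

Under these assumptions, `route("/answer-question/7")` resolves to the core, while `route("/agents/alice/loose-ends")` resolves to alice's page. Both are served from the same origin, which is the property the operator path relies on.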

Design rule

Operator-authority actions never get a per-agent-socket entry point. They live on the core backend.

Worked example: answering an operator-targeted question is a POST /answer-question/{id} on the core dashboard, never an AgentRequest variant. If it were a per-agent-socket request, an agent could curl its own socket and spoof an operator answer. For these actions the per-agent web UI POSTs cross-origin to the core (see the inline-answer feature in the loose-ends section on each agent page).
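A minimal sketch of the design rule, assuming a simplified request model. Only the name `AgentRequest` and the `/answer-question/{id}` path come from this document; the variant names and both handler functions are hypothetical stand-ins, not the real hyperhive types.

```python
from enum import Enum, auto


class AgentRequest(Enum):
    """Hypothetical requests accepted on a per-agent unix socket."""

    SEND_MESSAGE = auto()
    ASK_OPERATOR = auto()
    # Deliberately no ANSWER_QUESTION variant: that is operator authority.


def handle_agent_socket(agent: str, request_name: str) -> str:
    """Dispatch a request that arrived on `agent`'s socket.

    The caller's identity is the socket, so `agent` is supplied by the
    listener and never read from the request body.
    """
    try:
        request = AgentRequest[request_name]
    except KeyError:
        return f"rejected: {request_name} is not an agent request"
    return f"{agent}: {request.name} accepted"


def handle_core_http(method: str, path: str) -> str:
    """Core-daemon side: the only place an operator answer can land."""
    prefix = "/answer-question/"
    question_id = path[len(prefix):] if path.startswith(prefix) else ""
    if method == "POST" and question_id:
        return f"answered question {question_id} as operator"
    return "404"
```

Because the socket protocol has no variant for answering, an agent that sends `ANSWER_QUESTION` on its own socket is rejected at parse time; the only route to an operator answer is the core's HTTP surface.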

Why network isolation is the load-bearing step

Containers currently share the host network namespace, so a container can reach localhost:<core-port>, the dashboard, and every other agent's web port. Until that changes, the operator/agent split is on the honour system — every boundary claim above is aspirational. Network isolation is what makes the boundary real; the gateway and privsep are ergonomics and defence-in-depth layered on top.
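One way to check whether the boundary is enforced is a reachability probe run from inside an agent container. The helper below is a sketch using only the Python standard library; `can_reach` is a name invented here, and the core port is left as a parameter because this document does not fix it.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False
```

With the shared host network namespace described above, `can_reach("localhost", core_port)` from a container would be expected to return True. Once network isolation lands, the same probe should return False for the core port and for every other agent's web port.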

Suggested sequencing of the area:ops issues:

  1. Gateway first: a pure ergonomics win. It puts every UI on one origin, which lets the cross-origin CORS shim on /answer-question/{id} go away, and it carries no behavioural risk.
  2. Network isolation next: the step that makes the boundary real. Everything before it is honour-system.
  3. Privsep last: defence in depth on the core process itself. It is valuable independent of the other two, but it is the biggest refactor.