agent ui: answer questions inline from the per-agent page
loose-ends question rows get a textarea + send button; the operator answers as operator by POSTing to the core dashboard's /answer-question route, not the per-agent socket — keeps the operator-authority path off the agent's own socket. cross-origin POST needs a CORS shim on that route for now; drops out once the gateway makes the page same-origin. also splits deployment/ops/boundaries/gateway work into TODO-ops.md.
parent f8795dc029
commit 56e7eb6e73
5 changed files with 221 additions and 8 deletions
119
TODO-ops.md
Normal file
@@ -0,0 +1,119 @@
# Hyperhive — deployment, ops & boundaries

Tracking the deployment-shape + operational-hardening work:
container network isolation, the unifying gateway, the
operator-vs-agent trust boundary, and process privilege
separation.

These items interlock. Today "the operator surface" and "the
agent surface" are a *convention*, not a boundary — nothing
stops a container from curling the core daemon on
`localhost:<port>`, or another agent's web UI. The gateway,
network isolation, and privsep together turn that convention
into an enforced boundary. Sequencing matters; see the order at
the bottom.

## The boundary we're building toward

Two principals, two paths:

- **Operator** — reaches every UI (the dashboard + every
  per-agent page) through the gateway, on one origin.
  Operator-authority actions (approve / deny, answer-as-operator,
  lifecycle POSTs) are served by the core daemon and only
  reachable via the gateway.
- **Agent** — speaks only for itself, only over its per-agent
  unix socket. The socket's identity *is* the agent (see
  `docs/conventions.md`, "identity = socket"). An agent must not
  be able to reach the core daemon's HTTP surface, another
  agent's socket, or another agent's web UI.

Design rule that falls out of this: **operator-authority
actions never get a per-agent-socket entry point.** They live on
the core backend. Worked example — answering an
operator-targeted question is a `POST /answer-question/{id}` on
the core dashboard, *never* an `AgentRequest` variant. If it
were a per-agent-socket request, an agent could `curl` its own
socket and spoof an operator answer. The per-agent web UI POSTs
cross-origin to the core for these (see the inline-answer
feature — the loose-ends section on each agent page).
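
The design rule above can be sketched in types. This is a minimal illustration, not the actual hive-c0re code: the `Principal`, `RecordedAnswer`, `handle_agent_request` and `handle_operator_answer` names are hypothetical, and the `AgentRequest` variants shown are assumed for the example. The point it tries to show is that the answerer is stamped by the entry point, so nothing sent over a per-agent socket can come out attributed to the operator.

```rust
/// Who an action is attributed to. The entry point decides this,
/// never the request payload. (Hypothetical type for this sketch.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Principal {
    Operator,
    Agent(String),
}

/// Requests an agent may send over its own socket. The variants are
/// assumptions for illustration; note there is deliberately no
/// "answer as operator" variant.
#[derive(Debug)]
pub enum AgentRequest {
    Ask { question: String },
    Answer { id: i64, answer: String },
}

/// An answer as the core would record it.
#[derive(Debug, PartialEq, Eq)]
pub struct RecordedAnswer {
    pub id: i64,
    pub answer: String,
    pub answered_by: Principal,
}

/// Per-agent socket path: the socket's owner is the identity, so every
/// answer arriving here is attributed to that agent.
pub fn handle_agent_request(socket_owner: &str, req: AgentRequest) -> Option<RecordedAnswer> {
    match req {
        AgentRequest::Answer { id, answer } => Some(RecordedAnswer {
            id,
            answer,
            answered_by: Principal::Agent(socket_owner.to_string()),
        }),
        AgentRequest::Ask { .. } => None,
    }
}

/// Core dashboard path (stand-in for `POST /answer-question/{id}`):
/// the only place that stamps `Operator` as the answerer.
pub fn handle_operator_answer(id: i64, answer: &str) -> Result<RecordedAnswer, String> {
    let answer = answer.trim();
    if answer.is_empty() {
        return Err("answer: required".to_string());
    }
    Ok(RecordedAnswer {
        id,
        answer: answer.to_string(),
        answered_by: Principal::Operator,
    })
}
```

With this shape, an agent that `curl`s its own socket can at most produce an answer attributed to itself. The guarantee still depends on the agent being unable to reach the core route, which is what the workstreams below are for.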

## Workstreams

### 1. Container network isolation

Today containers share the host network namespace, so a
container can reach `localhost:<core-port>`, the dashboard, and
every other agent's web port. **Until this changes, nothing
below is actually enforced** — the operator/agent split is on
the honour system.

- Give each container a private veth / bridge with no route to
  the host's loopback-bound services.
- The per-agent unix socket stays the only host-bound channel
  (it already is the intended one).
- Open question: the per-agent web UI still needs to be
  reachable *by the operator's browser* — that is what the
  gateway is for (below). The container itself should not be
  able to reach the gateway or the core daemon.
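
One way to check that the isolation holds is a small probe run from inside a container. The sketch below is an assumption about how such a check could look, not existing tooling: `can_reach` and `reachable_loopback_ports` are hypothetical helpers, and the port list is whatever the host's core daemon, dashboard and agent web UIs actually listen on.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
pub fn can_reach(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

/// Probe the given ports on 127.0.0.1 and return the ones that accept
/// a connection. Run inside a container: with a shared host network
/// namespace the host's loopback services show up here; with a private
/// namespace the result should be empty for host-only ports.
pub fn reachable_loopback_ports(ports: &[u16], timeout: Duration) -> Vec<u16> {
    ports
        .iter()
        .copied()
        .filter(|&p| can_reach(SocketAddr::from(([127, 0, 0, 1], p)), timeout))
        .collect()
}
```

A non-empty result for the core or dashboard port means the operator/agent split is still on the honour system for that container.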

### 2. Unifying gateway / reverse proxy

(Moved here from TODO.md "Dashboard".)

Today every agent's web UI is reached at
`<host>:<per-agent-port>/`, so operators juggle a port list.
Stand up nginx (or similar) terminating one domain that fans
requests to `/agent/<name>/...` out to each container's web
port, and `/` to the main dashboard. Touches: a NixOS module on
the host, the dashboard's per-agent link rendering, and the
per-agent web server's base-path handling (currently assumes
root). Lets bookmarks survive port reshuffles and unblocks
per-agent stats links being relative URLs instead of hard-coded
ports.
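
The routing rule and the base-path change can be sketched as plain functions. This only models the decision the proxy would make (the real gateway would be nginx or similar), and `Upstream`, `route` and `with_base` are hypothetical names introduced for the example.

```rust
/// Where the gateway would send a request. (Hypothetical type.)
#[derive(Debug, PartialEq, Eq)]
pub enum Upstream<'a> {
    /// Forward to the main dashboard with the path unchanged.
    Dashboard { path: &'a str },
    /// Forward to the named agent's web port, with the
    /// `/agent/<name>` prefix stripped from the path.
    Agent { name: &'a str, path: &'a str },
}

/// Map an incoming path to an upstream: `/agent/<name>/...` goes to
/// that agent, everything else goes to the dashboard.
pub fn route(path: &str) -> Upstream<'_> {
    if let Some(rest) = path.strip_prefix("/agent/") {
        let (name, tail) = match rest.find('/') {
            Some(i) => (&rest[..i], &rest[i..]),
            None => (rest, "/"),
        };
        if !name.is_empty() {
            return Upstream::Agent { name, path: tail };
        }
    }
    Upstream::Dashboard { path }
}

/// Join a base path and a link so a per-agent page served under
/// `/agent/<name>/` emits links that stay under that prefix instead
/// of assuming it is served from the root.
pub fn with_base(base: &str, link: &str) -> String {
    format!("{}/{}", base.trim_end_matches('/'), link.trim_start_matches('/'))
}
```

For example, `route("/agent/alice/stats")` selects agent `alice` with path `/stats`, and `with_base("/agent/alice/", "/stats")` yields `/agent/alice/stats`, which is the kind of link the per-agent server would need to render once it no longer assumes root.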

Boundary payoff: once the dashboard and the per-agent pages are
same-origin behind the gateway, the cross-origin CORS shim on
`POST /answer-question/{id}` (added with the inline-answer
feature) can be deleted — the per-agent page's POST becomes a
plain same-origin request. Grep for `with_cors` /
`Access-Control-Allow-Origin` in `hive-c0re/src/dashboard.rs`
and remove it when this lands.
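
A small model of the browser rules behind that payoff, as a hedged sketch. `Origin` and `is_simple_request` are hypothetical names, and the "simple request" check is a simplification of the Fetch standard: it looks only at the method and `Content-Type` and ignores other request headers, which can also force a preflight.

```rust
/// A web origin is the (scheme, host, port) triple. Two pages on the
/// same host but different ports are different origins.
#[derive(Debug, PartialEq, Eq)]
pub struct Origin<'a> {
    pub scheme: &'a str,
    pub host: &'a str,
    pub port: u16,
}

/// Simplified check for a CORS "simple" request, one the browser sends
/// without an OPTIONS preflight. Only method and Content-Type are
/// considered here.
pub fn is_simple_request(method: &str, content_type: Option<&str>) -> bool {
    let method_ok = matches!(
        method.to_ascii_uppercase().as_str(),
        "GET" | "HEAD" | "POST"
    );
    let type_ok = match content_type {
        None => true,
        Some(ct) => {
            // Compare only the media type, dropping parameters such as charset.
            let essence = ct.split(';').next().unwrap_or("").trim().to_ascii_lowercase();
            matches!(
                essence.as_str(),
                "application/x-www-form-urlencoded" | "multipart/form-data" | "text/plain"
            )
        }
    };
    method_ok && type_ok
}
```

Under these rules a form-encoded POST from a per-agent port to the dashboard port is cross-origin but not preflighted, so a response header alone lets the page read the result. Behind a single gateway origin the two `Origin` values compare equal and no CORS header is needed.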

The gateway is also the natural home for auth, if/when the
operator surface ever needs it.

### 3. Privsep the core daemon from the web UI

(Moved here from TODO.md "Security".)

hive-c0re runs as root (it has to — `nixos-container` create /
start / destroy, the meta git repo, every per-agent bind
mount). The HTTP server lives in the same process, so every
read-endpoint (`/api/state-file`, `/api/journal/{name}`,
`/api/agent-config/{name}`) is one allow-list bug away from
serving arbitrary host files. Split it: keep the privileged
daemon doing lifecycle + git + ipc, run the web UI as an
unprivileged user that talks to the daemon over a unix socket
with a narrow request surface (`ReadAgentStateFile { agent,
rel_path }` etc.). The unprivileged process can't read
`/etc/shadow` even if every check in `get_state_file` is
bypassed — it doesn't have the bits. Container-lifecycle POSTs
(`/restart`, `/destroy`, etc.) become forwarded RPCs the
privileged side authorises on its terms.
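
A sketch of what the narrow request surface could look like on the privileged side. Only the `ReadAgentStateFile { agent, rel_path }` shape comes from the text above; the `WebRequest` enum name, `resolve_state_path`, the error strings and the `<root>/<agent>/state/` layout are assumptions for illustration. It validates the path lexically and does not address symlinks inside the state directory.

```rust
use std::path::{Component, Path, PathBuf};

/// Requests the unprivileged web UI may send to the privileged daemon.
/// (Hypothetical enum; one variant shown.)
#[derive(Debug)]
pub enum WebRequest {
    ReadAgentStateFile { agent: String, rel_path: String },
}

/// Resolve a state-file request to a path under `<root>/<agent>/state/`,
/// rejecting anything that could step outside it.
pub fn resolve_state_path(root: &Path, agent: &str, rel_path: &str) -> Result<PathBuf, String> {
    // The agent name must be a single plain path segment.
    if agent.is_empty() || agent.contains('/') || agent == "." || agent == ".." {
        return Err(format!("bad agent name: {agent:?}"));
    }
    // Every component must be a normal name: no root, no `..`, no leading `.`.
    let rel = Path::new(rel_path);
    if rel_path.is_empty() || !rel.components().all(|c| matches!(c, Component::Normal(_))) {
        return Err(format!("bad rel_path: {rel_path:?}"));
    }
    Ok(root.join(agent).join("state").join(rel))
}

/// Dispatch a request from the web UI. The privileged side decides
/// what each variant is allowed to touch.
pub fn handle(root: &Path, req: WebRequest) -> Result<PathBuf, String> {
    match req {
        WebRequest::ReadAgentStateFile { agent, rel_path } => {
            resolve_state_path(root, &agent, &rel_path)
        }
    }
}
```

The checks here are the same kind of allow-list that can have bugs today. The difference privsep makes is that the process parsing HTTP no longer holds root, so a bypass in the web layer is bounded by what this request enum can express.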

Cheaper once the harness/state split lands (see TODO.md "Split
harness-internal state from agent-visible state") — the
unprivileged web server then only needs read access to
`/agents/<n>/state/`, not `/agents/<n>/harness/`.

## Suggested sequencing

1. **Gateway** first — pure ergonomics win, unblocks
   same-origin, no behavioural risk.
2. **Network isolation** next — the step that makes the
   operator/agent boundary *real*. Everything before it is
   honour-system.
3. **Privsep** last — defence in depth on the core process
   itself; valuable independent of the other two, but the
   biggest refactor.
9
TODO.md
@@ -5,6 +5,10 @@
 > for the operator is not. Use that as a hint when picking up items,
 > not a hard rule.
 
+**Deployment / ops / boundaries:** the unifying gateway, container
+network isolation, the operator-vs-agent trust boundary, and process
+privsep are tracked separately in [`TODO-ops.md`](TODO-ops.md).
+
 ## Architecture / Features
 
 - Shared space for all agents to access documents/files without manager routing
@@ -23,13 +27,8 @@
 
 ## Dashboard
 
-- **Unified URL scheme via reverse proxy**: today every agent's web UI is reached at `<host>:<per-agent-port>/`, so operators juggle a port list. Stand up nginx (or similar) terminating one domain that fans requests to `/agent/<name>/...` out to each container's web port, and to `/` for the main dashboard. Touches: a NixOS module on the host, the dashboard's per-agent link rendering, and the per-agent web server's base-path handling (currently assumes root). Lets bookmarks survive port reshuffles and unblocks per-agent stats links being relative URLs instead of hard-coded ports.
 - **Delivered-reminder rollup on the per-agent stats page**: surface attempt / success / failure counts for reminders this agent fired (in the existing `/stats` page). Needs an `AgentRequest::ReminderRollup { since_secs }` / matching `ManagerRequest::ReminderRollup` RPC so the agent can pull the counts from the host's broker DB (the reminders table is host-owned; agent state doesn't have them). Deferred from the initial stats page so the first cut stays self-contained to data the agent already owns.
 
-## Security
-
-- **Privsep the dashboard from the privileged daemon**: hive-c0re runs as root (it has to — `nixos-container` create / start / destroy, the meta git repo, every per-agent bind mount). The HTTP server lives in the same process, so every read-endpoint (`/api/state-file`, `/api/journal/{name}`, `/api/agent-config/{name}`) is one allow-list bug away from serving arbitrary host files. Split the architecture: keep the privileged daemon doing lifecycle + git + ipc, run the web UI as an unprivileged user that talks to the daemon over a unix socket with a narrow request surface (`ReadAgentStateFile { agent, rel_path }` etc.). The unprivileged process can't read `/etc/shadow` even if every check in `get_state_file` is bypassed — it doesn't have the bits. Container-lifecycle POSTs (`/restart`, `/destroy`, etc.) become forwarded RPCs the privileged side authorises on its terms.
-
 ## Harness Ergonomics (agent-side wishlist)
 
 Filed by damocles, who actually lives in this thing. Loosely ranked by
@@ -151,6 +151,42 @@ pre.diff {
 .agent-inbox .inbox-sep { color: var(--muted); }
 .agent-inbox .inbox-body { color: var(--fg); white-space: pre-wrap; word-break: break-word; }
 
+.agent-inbox .answer-form {
+  grid-column: 1 / -1;
+  display: flex;
+  gap: 0.4em;
+  align-items: flex-start;
+  margin-top: 0.25em;
+}
+.agent-inbox .answer-form textarea {
+  flex: 1;
+  font-family: inherit;
+  font-size: inherit;
+  background: var(--bg);
+  color: var(--fg);
+  border: 1px solid var(--border);
+  border-radius: 3px;
+  padding: 0.3em;
+  resize: vertical;
+}
+.agent-inbox .answer-form button {
+  font-family: inherit;
+  font-size: inherit;
+  background: var(--bg-elev);
+  color: var(--fg);
+  border: 1px solid var(--border);
+  border-radius: 3px;
+  padding: 0.3em 0.7em;
+  cursor: pointer;
+  white-space: nowrap;
+}
+.agent-inbox .answer-form button:hover:not(:disabled) {
+  border-color: var(--purple);
+  color: var(--purple);
+}
+.agent-inbox .answer-form button:disabled { opacity: 0.5; cursor: default; }
+.agent-inbox .answer-status { color: var(--muted); align-self: center; }
+
 .last-turn {
   color: var(--muted);
   font-size: 0.8em;
@@ -22,6 +22,12 @@
   return e;
 };
 
+// Base URL of the host dashboard (core backend). Set once the first
+// /api/state lands. Operator-authority actions (answering a question
+// as the operator) POST here rather than to this agent's own socket —
+// see TODO-ops.md for why the boundary lives on the core side.
+let dashboardBase = '';
+
 // ─── async-form submit (shared with dashboard) ──────────────────────────
 document.addEventListener('submit', async (e) => {
   const f = e.target;
@@ -68,6 +74,7 @@
   // ↑ DASHB04RD — back-link to the host dashboard. Opens in a new
   // tab to keep the agent page anchored where the operator is.
   const dashUrl = `${location.protocol}//${location.hostname}:${dashboardPort}/`;
+  dashboardBase = dashUrl;
   title.append(
     el('a', {
       href: dashUrl, target: '_blank', rel: 'noopener',
@@ -454,6 +461,7 @@
       el('span', { class: 'inbox-sep' }, t.asker + ' → ' + target), ' ',
       el('span', { class: 'inbox-ts' }, fmtAge(t.age_seconds || 0) + ' ago'),
       el('div', { class: 'inbox-body' }, t.question || ''),
+      buildAnswerForm(t.id),
     );
   } else if (t.kind === 'reminder') {
     // due_at is an absolute unix-seconds value; show time-until-fire
@@ -474,6 +482,42 @@
   }
 }
 
+// Inline "answer as operator" form for a question loose-end. POSTs to
+// the host dashboard (core backend), never this agent's socket — the
+// core is the only place that can stamp `operator` as the answerer.
+function buildAnswerForm(id) {
+  const wrap = el('div', { class: 'answer-form' });
+  const ta = el('textarea', { rows: '2', placeholder: 'answer as operator…' });
+  const btn = el('button', { type: 'button' }, 'send answer');
+  const status = el('span', { class: 'answer-status' });
+  btn.addEventListener('click', async () => {
+    const answer = ta.value.trim();
+    if (!answer) { status.textContent = 'answer required'; return; }
+    if (!dashboardBase) { status.textContent = 'dashboard url unknown'; return; }
+    btn.disabled = true;
+    status.textContent = 'sending…';
+    try {
+      const resp = await fetch(dashboardBase + 'answer-question/' + id, {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
+        body: 'answer=' + encodeURIComponent(answer),
+      });
+      if (resp.ok) {
+        status.textContent = 'answered ✓';
+        refreshLooseEnds();
+      } else {
+        status.textContent = 'failed: ' + (await resp.text());
+        btn.disabled = false;
+      }
+    } catch (err) {
+      status.textContent = 'failed: ' + err;
+      btn.disabled = false;
+    }
+  });
+  wrap.append(ta, btn, status);
+  return wrap;
+}
+
 function renderInbox(rows) {
   const root = $('inbox-section');
   const list = $('inbox-list');
@@ -759,6 +759,20 @@ struct AnswerForm {
     answer: String,
 }
 
+/// Attach a permissive CORS header so the per-agent web UI — served on
+/// a different port — can POST an operator answer here and read the
+/// result. The dashboard has no auth, so `*` exposes nothing a plain
+/// cross-origin form-POST couldn't already reach. This shim disappears
+/// once the unifying gateway makes the agent page same-origin; see
+/// `TODO-ops.md`.
+fn with_cors(mut resp: Response) -> Response {
+    resp.headers_mut().insert(
+        axum::http::header::ACCESS_CONTROL_ALLOW_ORIGIN,
+        axum::http::HeaderValue::from_static("*"),
+    );
+    resp
+}
+
 async fn post_answer_question(
     State(state): State<AppState>,
     AxumPath(id): AxumPath<i64>,
@@ -766,9 +780,9 @@ async fn post_answer_question(
 ) -> Response {
     let answer = form.answer.trim();
     if answer.is_empty() {
-        return error_response("answer: required");
+        return with_cors(error_response("answer: required"));
     }
-    match state
+    let resp = match state
         .coord
         .questions
         .answer(id, answer, hive_sh4re::OPERATOR_RECIPIENT)
@@ -794,7 +808,8 @@ async fn post_answer_question(
             (StatusCode::OK, "ok").into_response()
         }
         Err(e) => error_response(&format!("answer {id} failed: {e:#}")),
-    }
+    };
+    with_cors(resp)
 }
 
 /// Resolve a pending operator question with a sentinel answer when