reminders: persist + surface delivery failures

Broker schema gains attempt_count INTEGER + last_error TEXT
columns via idempotent ALTER TABLE migration (pragma-probed so
fresh + existing dbs converge). reminder_scheduler::tick calls
record_reminder_failure on every deliver_reminder error,
bumping the counter + stashing the message. get_due_reminders
filters out rows where attempt_count >= MAX_REMINDER_ATTEMPTS
(5) so the scheduler stops retrying a stuck row until the
operator intervenes.
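A minimal sketch of the two ideas above, using only the standard library rather than the real sqlite-backed broker. Assumptions, none taken from the commit: the table is called `reminders`, the column declarations are as written below, `pending_migrations` is a hypothetical helper that receives the column names a `PRAGMA table_info` probe would report, and the in-memory `Reminder` struct and function signatures are stand-ins for the broker's actual rows and methods.

```rust
/// Retry cap named in the commit message.
const MAX_REMINDER_ATTEMPTS: i64 = 5;

/// Hypothetical helper: given the column names that a
/// `PRAGMA table_info(reminders)` probe reported, return the
/// ALTER TABLE statements still needed. Once both columns exist it
/// returns nothing, which is what makes the migration idempotent.
fn pending_migrations(existing_columns: &[&str]) -> Vec<String> {
    // Assumed column declarations; the real DDL is not shown in the commit.
    let wanted = [
        ("attempt_count", "INTEGER NOT NULL DEFAULT 0"),
        ("last_error", "TEXT"),
    ];
    wanted
        .iter()
        .filter(|(name, _)| !existing_columns.contains(name))
        .map(|(name, decl)| format!("ALTER TABLE reminders ADD COLUMN {name} {decl}"))
        .collect()
}

/// In-memory stand-in for a broker row (assumed shape).
#[derive(Debug, Clone, PartialEq)]
struct Reminder {
    id: i64,
    attempt_count: i64,
    last_error: Option<String>,
}

/// Bump the counter and stash the error message.
fn record_reminder_failure(row: &mut Reminder, err: &str) {
    row.attempt_count += 1;
    row.last_error = Some(err.to_string());
}

/// Skip rows that have reached the cap, so a stuck row is not
/// retried on every tick. Due-time filtering is omitted here.
fn get_due_reminders(rows: &[Reminder]) -> Vec<&Reminder> {
    rows.iter()
        .filter(|r| r.attempt_count < MAX_REMINDER_ATTEMPTS)
        .collect()
}
```

With this shape, a row that fails five times in a row drops out of `get_due_reminders` while its `last_error` stays available for display.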
new POST /retry-reminder/{id} → reset_reminder_failure clears
the counters; next 5s tick re-attempts. cancel-reminder
unchanged (hard-delete).
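The reset path can be sketched the same way. This is an in-memory model, not the broker's SQL: the `Status` enum, the `Reminder` shape, and the choice to clear both `attempt_count` and `last_error` are assumptions. The return value imitates an UPDATE row count, so 0 means the id is missing or the row is no longer pending.

```rust
/// Assumed delivery states; the real schema may differ.
#[derive(Debug, PartialEq)]
enum Status {
    Pending,
    Delivered,
}

/// In-memory stand-in for a broker row (assumed shape).
#[derive(Debug)]
struct Reminder {
    id: i64,
    status: Status,
    attempt_count: i64,
    last_error: Option<String>,
}

/// Clear the failure state of a pending reminder and report how
/// many rows changed, like the affected-row count of an UPDATE.
fn reset_reminder_failure(rows: &mut [Reminder], id: i64) -> usize {
    let mut changed = 0;
    for row in rows
        .iter_mut()
        .filter(|r| r.id == id && r.status == Status::Pending)
    {
        row.attempt_count = 0;
        row.last_error = None;
        changed += 1;
    }
    changed
}
```

A caller can treat a count of 0 as "nothing to retry" and report it to the operator, while any non-zero count means the row is eligible again on the next tick.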
dashboard renders failed rows with a red left rule, the error
text inline, and a ⚠ N failed badge. ↻ R3TRY button appears
when attempt_count > 0 — sits next to ✗ C4NC3L in a small
actions row below the body.
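The display rules can be sketched as a pure function. Everything here is illustrative: `RowView` and `render_row` are hypothetical names, the real dashboard markup is not part of this excerpt, and the sketch assumes that "failed" means `attempt_count > 0`, that N in the badge is `attempt_count`, and that retry is listed before cancel.

```rust
/// Hypothetical view model for one reminder row on the dashboard.
#[derive(Debug)]
struct RowView {
    /// True when the row should get the red left rule.
    failed: bool,
    /// Badge text, present only for failed rows.
    badge: Option<String>,
    /// Error text shown inline, present only for failed rows.
    error_text: Option<String>,
    /// Button labels for the actions row, in display order.
    actions: Vec<&'static str>,
}

/// Decide what a row shows from its failure columns.
fn render_row(attempt_count: i64, last_error: Option<&str>) -> RowView {
    let failed = attempt_count > 0;
    let mut actions = Vec::new();
    if failed {
        // Retry is only offered once at least one attempt has failed.
        actions.push("↻ R3TRY");
    }
    // Cancel is always available.
    actions.push("✗ C4NC3L");
    RowView {
        failed,
        badge: failed.then(|| format!("⚠ {attempt_count} failed")),
        error_text: if failed {
            last_error.map(str::to_string)
        } else {
            None
        },
        actions,
    }
}
```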
parent d395bdc945
commit 978a3cf391
5 changed files with 173 additions and 8 deletions
@@ -57,6 +57,7 @@ pub async fn serve(port: u16, coord: Arc<Coordinator>) -> Result<()> {
         .route("/api/state-file", get(get_state_file))
         .route("/api/reminders", get(api_reminders))
         .route("/cancel-reminder/{id}", post(post_cancel_reminder))
+        .route("/retry-reminder/{id}", post(post_retry_reminder))
         .route("/api/agent-config/{name}", get(get_agent_config))
         .route("/request-spawn", post(post_request_spawn))
         .route("/op-send", post(post_op_send))
@@ -1126,6 +1127,25 @@ async fn post_cancel_reminder(
     }
 }
 
+/// Reset a pending reminder's failure state so the scheduler
+/// retries it on the next tick. Useful when the failure was
+/// transient (sqlite lock contention, disk full → freed up) and
+/// the operator wants delivery to resume immediately instead of
+/// the row sitting in attempt-count-capped purgatory.
+async fn post_retry_reminder(
+    State(state): State<AppState>,
+    AxumPath(id): AxumPath<i64>,
+) -> Response {
+    match state.coord.broker.reset_reminder_failure(id) {
+        Ok(0) => error_response(&format!("reminder {id} not pending (already delivered?)")),
+        Ok(_) => {
+            tracing::info!(%id, "operator reset reminder failure for retry");
+            (StatusCode::OK, "ok").into_response()
+        }
+        Err(e) => error_response(&format!("retry reminder {id} failed: {e:#}")),
+    }
+}
+
 async fn post_purge_tombstone(
     State(state): State<AppState>,
     AxumPath(name): AxumPath<String>,