Operations
These pages cover day-2 operations, once Ampora is installed and configured. Each page is self-contained: you should be able to read just the one runbook you need.
| Page | When you need it |
|---|---|
| High availability | Running multiple Ampora instances behind a load balancer |
| Scaling out | Past a few hundred agents per instance |
| Backup & restore | Disaster preparation (and recovery) |
| Self-observability | Wiring up Ampora's own metrics, traces, logs |
| Upgrades | Rolling new versions, downgrade policy, breaking changes |
| Disaster recovery | RTO/RPO planning, full recovery procedure |
| Audit retention | Tuning hot/archive windows for compliance |
On-call cheat sheet
When something is broken at 3am, check the following, in this order:
1. `/health/live`: is the process alive?
2. `/health/ready`: is it ready to take traffic? If not, read the response body; it lists the probes that failed.
3. Pod / container logs: Ampora logs are structured JSON; pipe through `jq` for human reading.
4. Postgres: is the DB reachable? `kubectl -n ampora exec deploy/ampora-web -- nc -z db.acme.svc 5432`.
5. The audit log itself: every incident leaves prints in the audit table. `audit_events` filtered by `EntityType` is often the fastest way to ground a "what happened?" question.
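The network checks in this list (liveness, readiness, database reachability) can be scripted so they run the same way every time. The following is a minimal sketch using only the Python standard library, not an official Ampora tool. `BASE_URL` and the function names `http_probe`, `tcp_reachable`, and `triage` are assumptions made for illustration; only the two `/health` paths and the `db.acme.svc:5432` example come from this page. The TCP check is a rough stand-in for `nc -z`, and it only reflects what the instance sees if it runs from the same network, for example inside the cluster.

```python
import http.client
import socket
import urllib.error
import urllib.request

# Assumed defaults: replace with the values for your deployment.
BASE_URL = "http://localhost:8080"
DB_HOST, DB_PORT = "db.acme.svc", 5432


def http_probe(url, timeout=3.0):
    """Return (ok, status, body); ok is True only for a 2xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return True, resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # Non-2xx response: keep the body, because a failing readiness
        # check reports the failed probes there.
        return False, err.code, err.read().decode("utf-8", "replace")
    except (urllib.error.URLError, http.client.HTTPException, OSError) as err:
        # No HTTP response at all (refused, DNS failure, timeout).
        return False, None, str(err)


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def triage(base_url=BASE_URL, db_host=DB_HOST, db_port=DB_PORT):
    """Run every check in cheat-sheet order and return the failed steps."""
    failed = []
    for path in ("/health/live", "/health/ready"):
        ok, status, body = http_probe(base_url + path)
        print(f"{path}: {'ok' if ok else 'FAILED'} (status={status})")
        if not ok:
            print(body)
            failed.append(path)
    if tcp_reachable(db_host, db_port):
        print(f"postgres: {db_host}:{db_port} reachable")
    else:
        print(f"postgres: cannot connect to {db_host}:{db_port}")
        failed.append("postgres")
    return failed
```

Calling `triage()` prints one line per check and returns the list of failed steps, so an empty list means all three passed. The sketch deliberately runs every check instead of stopping at the first failure, since a failing readiness probe and an unreachable database are often the same incident.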
If you have not yet set up self-observability, do so before anything else: graphs and traces beat console-grepping every time.