How Ampora Works

This page walks through what happens between an OpAMP agent and the Ampora server, from the first connection to a successfully rolled-out config change. If you already know OpAMP, skim it. If you do not, this is the page to read first.

The high-level picture

flowchart LR
    subgraph Cluster["Your environment"]
      A1[Collector agent A]
      A2[Collector agent B]
      A3[Collector agent C]
    end
    subgraph Ampora["Ampora server"]
      WS[OpAMP WebSocket endpoint]
      App[Application services]
      DB[(PostgreSQL)]
      UI[Blazor Web UI]
    end
    User[Operator] --> UI
    UI --> App
    App --> DB
    App --> WS
    A1 -- "WebSocket (OpAMP)" --> WS
    A2 -- "WebSocket (OpAMP)" --> WS
    A3 -- "WebSocket (OpAMP)" --> WS

Operators interact with a Blazor web UI that runs in the same process as the OpAMP server.
Agents open a long-lived WebSocket to Ampora and stay connected. The protocol is bidirectional.
Persistence is PostgreSQL (or SQLite in dev). Configurations, rollouts, agent status snapshots, and audit events live there.

The lifecycle of an agent

1. Bootstrap

A new agent has no certificate yet. It opens a TLS connection to Ampora and upgrades to a WebSocket. The HTTP upgrade carries a bootstrap token: short-lived (default 24 h), single-use, generated in the UI when an operator expects an agent to register.

If the token is valid:

Ampora issues an mTLS client certificate for the agent. Issuance is handled by Ampora's persisted CA (or a configured HSM/KMS, see HSM/KMS integration).
The agent receives the certificate as part of ConnectionSettings and reconnects with it.
The bootstrap token is consumed and never accepted again.
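
The token rules above (short-lived, single-use) can be sketched as follows. This is a minimal illustration, not Ampora's implementation: `BootstrapTokens`, its in-memory dictionary, and the method names are hypothetical, and a real store would live in the database and would likely keep only a hash of each token.

```python
import secrets
from datetime import datetime, timedelta, timezone


class BootstrapTokens:
    """In-memory stand-in for a token store (hypothetical, not Ampora's schema)."""

    def __init__(self, ttl: timedelta = timedelta(hours=24)):
        self.ttl = ttl
        self._expiry = {}  # token -> expiry timestamp

    def issue(self, now: datetime) -> str:
        """Create a random token that expires after the configured TTL."""
        token = secrets.token_urlsafe(32)
        self._expiry[token] = now + self.ttl
        return token

    def consume(self, token: str, now: datetime) -> bool:
        """Return True once for a valid token; every later attempt fails."""
        # pop() removes the token on first presentation, so a replay cannot succeed.
        expiry = self._expiry.pop(token, None)
        return expiry is not None and now < expiry
```

Removing the token before checking its expiry means an expired token is also discarded on first presentation, which keeps the store from accumulating dead entries.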

2. Steady-state connection

On every reconnect the agent presents its mTLS certificate. The server identifies the agent by certificate fingerprint, not by the agent-supplied instance_uid — a hostile agent cannot impersonate another by choosing the same UID.
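
A minimal sketch of fingerprint-based identity, assuming the fingerprint is a SHA-256 digest over the DER-encoded certificate (the page does not say which digest Ampora uses). `AgentRegistry` and its methods are hypothetical names for illustration.

```python
import hashlib
from typing import Optional


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 over the DER-encoded certificate, as lowercase hex (assumed digest)."""
    return hashlib.sha256(cert_der).hexdigest()


class AgentRegistry:
    """Maps certificate fingerprints to server-side agent identities."""

    def __init__(self):
        self._by_fingerprint = {}  # fingerprint -> agent id

    def enroll(self, cert_der: bytes, agent_id: str) -> None:
        self._by_fingerprint[fingerprint(cert_der)] = agent_id

    def identify(self, cert_der: bytes, claimed_instance_uid: str) -> Optional[str]:
        # The claimed instance_uid is deliberately ignored: identity comes
        # only from the certificate presented in the TLS handshake.
        return self._by_fingerprint.get(fingerprint(cert_der))
```

An agent holding certificate B that claims agent A's `instance_uid` is still resolved to B, which is the property the paragraph above describes.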

The agent advertises its capabilities — the OpAMP capability bitfield that says what it accepts ("AcceptsRemoteConfig", "AcceptsPackages", …) and what it reports ("ReportsEffectiveConfig", "ReportsHealth", …). Ampora persists the capability set and uses it to gate every server-to-agent push: a config is only sent to an agent that has signalled AcceptsRemoteConfig.
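
Capability gating comes down to a bit test. The sketch below assumes bit values that are intended to mirror the `AgentCapabilities` enum in the OpAMP proto; verify them against the published definitions before relying on them. `may_push_config` is a hypothetical helper, not an Ampora API.

```python
from enum import IntFlag


class AgentCapabilities(IntFlag):
    # Subset of the OpAMP capability bits; values are assumptions to verify.
    REPORTS_STATUS = 0x01
    ACCEPTS_REMOTE_CONFIG = 0x02
    REPORTS_EFFECTIVE_CONFIG = 0x04
    ACCEPTS_PACKAGES = 0x08
    REPORTS_PACKAGE_STATUSES = 0x10
    REPORTS_HEALTH = 0x800


def may_push_config(capabilities: int) -> bool:
    """Only agents that signalled AcceptsRemoteConfig may receive a config."""
    return (capabilities & int(AgentCapabilities.ACCEPTS_REMOTE_CONFIG)) != 0
```

Testing the raw integer rather than constructing the enum keeps unknown or future bits from causing errors.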

3. Reporting

While connected, the agent streams:

| Field | What it tells the server |
| --- | --- |
| `AgentDescription` | OS, hostname, agent type/version, identifying labels |
| `EffectiveConfig` | the YAML the agent is actually running, as text |
| `RemoteConfigStatus` | `APPLIED`, `APPLYING`, `FAILED` for the last assigned config + error string |
| `ComponentHealth` | per-component health (`receivers/otlp`, `exporters/otlp`, …) and overall status |
| `PackageStatuses` | versions and hashes of installed packages, if `ReportsPackageStatuses` is signalled |
| `Heartbeat` | implicit; "last seen" timestamp is updated on every keepalive |

Ampora stores both the current snapshot (for fast reads on the Fleet page) and an event history (for audit and time-travel debugging). Large payloads are gzipped before they hit the JSONB columns.
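
One way to compress large payloads before storing them in a JSON document is sketched below. The 8 KiB threshold, the base64 wrapping (JSON cannot hold raw bytes), and the `encoding`/`body` field names are all assumptions for illustration; the page does not describe Ampora's actual column layout.

```python
import base64
import gzip

GZIP_THRESHOLD_BYTES = 8 * 1024  # hypothetical cut-off for "large"


def encode_payload(text: str) -> dict:
    """Return a JSON-serialisable document, gzipping bodies above the threshold."""
    raw = text.encode("utf-8")
    if len(raw) < GZIP_THRESHOLD_BYTES:
        return {"encoding": "plain", "body": text}
    packed = base64.b64encode(gzip.compress(raw)).decode("ascii")
    return {"encoding": "gzip+base64", "body": packed}


def decode_payload(doc: dict) -> str:
    """Inverse of encode_payload."""
    if doc["encoding"] == "plain":
        return doc["body"]
    return gzip.decompress(base64.b64decode(doc["body"])).decode("utf-8")
```

Collector YAML is repetitive text, so it usually compresses well even after the base64 overhead.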

4. Drift detection

Ampora's domain model separates assigned config (what the operator told the agent to run) from effective config (what the agent reports it is running). When the two diverge, the agent is in drift. That happens for one of three reasons:

the agent hasn't applied the new config yet,
the agent tried to apply the config and rejected it (RemoteConfigStatus = FAILED),
something or someone changed the config on the agent host out of band.

The Drift dashboard surfaces all three cases with the same query.
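
The single condition behind all three cases is "assigned differs from effective"; telling them apart needs the agent's last reported status. A hedged sketch, assuming both configs are compared by hash and that the agent's status report names the config hash it refers to. The `Drift` enum and `classify` function are hypothetical, not Ampora's domain types.

```python
from enum import Enum


class Drift(Enum):
    NONE = "none"
    PENDING = "pending"          # assigned config not applied yet
    REJECTED = "rejected"        # agent reported FAILED for the assigned config
    OUT_OF_BAND = "out_of_band"  # agent applied it, then the config changed locally


def classify(assigned_hash: str, effective_hash: str,
             status: str, status_hash: str) -> Drift:
    """Classify drift from the assigned/effective hashes and the last status report."""
    if assigned_hash == effective_hash:
        return Drift.NONE
    if status_hash == assigned_hash and status == "FAILED":
        return Drift.REJECTED
    if status_hash == assigned_hash and status == "APPLIED":
        # The agent claims to have applied the assigned config, yet runs another.
        return Drift.OUT_OF_BAND
    return Drift.PENDING
```

The first comparison alone is enough to list drifted agents; the remaining branches only label the reason.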

The lifecycle of a configuration

1. Authoring

An operator authors a config in the Configurations UI. They can:

paste or upload OpenTelemetry Collector YAML,
import it from a registered Git source (read-only sync, see the GitOps tutorial),
or build it visually with the drawflow-based pipeline editor.

The config is parsed, validated, optionally linted against tenant-specific rules, and rendered as a swimlane. It lives as a draft with a row-version that prevents lost-update conflicts on concurrent edits.
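
The row-version check is a form of optimistic concurrency: a save succeeds only if the draft has not changed since it was loaded. The sketch below uses an in-memory store; `DraftStore`, `StaleDraftError`, and the integer version counter are hypothetical stand-ins for whatever the database provides.

```python
from dataclasses import dataclass, replace


class StaleDraftError(Exception):
    """Raised when a save is based on an outdated copy of the draft."""


@dataclass
class Draft:
    yaml: str
    row_version: int = 1


class DraftStore:
    def __init__(self):
        self._drafts = {}  # draft id -> Draft

    def create(self, draft_id: str, yaml: str) -> None:
        self._drafts[draft_id] = Draft(yaml)

    def load(self, draft_id: str) -> Draft:
        # Return a copy so callers cannot mutate stored state directly.
        return replace(self._drafts[draft_id])

    def save(self, draft_id: str, yaml: str, expected_version: int) -> int:
        current = self._drafts[draft_id]
        if current.row_version != expected_version:
            raise StaleDraftError(f"{draft_id} changed since version {expected_version}")
        current.yaml = yaml
        current.row_version += 1
        return current.row_version
```

Two editors who load the same version cannot both save: the second write fails instead of silently overwriting the first.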

2. Publishing

Publishing makes a draft immutable and assigns it a content-addressable hash. The hash is what flows over the wire to agents — Ampora deduplicates on hash, so an agent already running that exact config will not be told to re-apply it. Published versions are never edited again; the next change becomes a new version.
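
Hash-based deduplication can be illustrated in a few lines. The sketch assumes SHA-256 over the UTF-8 bytes of the published YAML; the page does not state which hash function or canonicalisation Ampora uses, and `needs_push` is a hypothetical helper.

```python
import hashlib
from typing import Optional


def content_hash(config_yaml: str) -> str:
    """Content-addressable identifier for a published config (assumed SHA-256)."""
    return hashlib.sha256(config_yaml.encode("utf-8")).hexdigest()


def needs_push(published_yaml: str, agent_reported_hash: Optional[str]) -> bool:
    """Skip the push when the agent already runs exactly this content."""
    return agent_reported_hash != content_hash(published_yaml)
```

Because the identifier depends only on content, publishing the same YAML twice yields the same hash and triggers no re-apply.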

3. Rollout

A rollout binds a published configuration version to a target group (static or dynamic) and a strategy. Strategies include:

Batch — fixed N agents per batch, manually advanced.
Percentage — 5 % → 25 % → 50 % → 100 % steps, advanced manually or on a schedule.
Canary step-up — like percentage but with time-based dwell between steps.
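
To make the percentage strategy concrete, the sketch below turns cumulative percentage steps into per-step batches. Rounding up at each step and the stable agent ordering are assumptions; `percentage_batches` is a hypothetical function, not part of Ampora.

```python
def percentage_batches(agent_ids: list, steps=(5, 25, 50, 100)) -> list:
    """Split agents into batches so each step reaches its cumulative percentage."""
    batches = []
    done = 0  # number of agents already covered by earlier steps
    for pct in steps:
        # Integer ceiling of len * pct / 100, avoiding float rounding.
        target = (len(agent_ids) * pct + 99) // 100
        batches.append(agent_ids[done:target])
        done = max(done, target)
    return batches
```

With 40 agents the steps cover 2, 10, 20, and 40 agents cumulatively, so the batches contain 2, 8, 10, and 20 agents. On very small fleets a step can be empty because an earlier step already reached its target.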

A rollout walks through pending → in_progress → completed. Health gates (see Health gates) can move it to paused automatically when too many agents fail to apply, drop offline, or report unhealthy. The operator then decides whether to resume, roll back, or abort.
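
The rollout states form a small state machine. In the sketch below, the `rolled_back` and `aborted` state names, the transition table, and the 20 % failure threshold are assumptions made for illustration; only pending, in_progress, paused, and completed come from the description above.

```python
from enum import Enum


class RolloutState(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    PAUSED = "paused"
    COMPLETED = "completed"
    ROLLED_BACK = "rolled_back"  # hypothetical terminal state
    ABORTED = "aborted"          # hypothetical terminal state


ALLOWED = {
    RolloutState.PENDING: {RolloutState.IN_PROGRESS},
    RolloutState.IN_PROGRESS: {RolloutState.PAUSED, RolloutState.COMPLETED},
    # From paused, the operator resumes, rolls back, or aborts.
    RolloutState.PAUSED: {RolloutState.IN_PROGRESS, RolloutState.ROLLED_BACK,
                          RolloutState.ABORTED},
}


def transition(current: RolloutState, target: RolloutState) -> RolloutState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


def gate_tripped(failed: int, total: int, max_failure_ratio: float = 0.2) -> bool:
    """A simple health gate: pause when too large a share of agents failed."""
    return total > 0 and failed / total > max_failure_ratio
```

A scheduler would call `gate_tripped` after each batch and, if it returns true, move the rollout to `PAUSED` through `transition`.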

4. Audit

Every state-changing action is logged: who did what, against which row, with the old and new values. Audit records land in a hot table, move to an archive table once the hot retention window expires, and are finally purged (default: 90 days hot, 7 years archived).
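
The retention tiers can be expressed as a function of record age. This sketch assumes the 7-year archive period starts when the 90-day hot window ends and approximates a year as 365 days; the page does not specify either detail, and `tier` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

HOT_WINDOW = timedelta(days=90)
ARCHIVE_WINDOW = timedelta(days=7 * 365)  # assumption: counted after the hot window


def tier(event_time: datetime, now: datetime) -> str:
    """Return where an audit record of this age belongs: hot, archive, or purge."""
    age = now - event_time
    if age < HOT_WINDOW:
        return "hot"
    if age < HOT_WINDOW + ARCHIVE_WINDOW:
        return "archive"
    return "purge"
```

A periodic job could evaluate this per record, or express the same boundaries as two range predicates in SQL.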

The OpAMP protocol, in a paragraph

OpAMP — Open Agent Management Protocol — is a protocol specification for managing agents that emit telemetry. It defines:

a wire format (Protobuf over WebSocket — Ampora uses the 1-byte-header binary format described in ADR-016)
a set of messages (AgentToServer and ServerToAgent, plus the payloads they carry, such as AgentDescription and EffectiveConfig)
a capability bitfield that says what each side can do
a trust model that the server defines — OpAMP itself does not mandate one. Ampora's trust model is bootstrap-token-then-mTLS (see Threat model).
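
As an illustration of the wire format, the sketch below frames a serialized Protobuf payload with a single header byte. It assumes the header value is zero, which matches the OpAMP WebSocket transport as publicly specified (a varint header that is currently zero); ADR-016 is not reproduced here, so treat the details as assumptions. `frame` and `unframe` are hypothetical helpers.

```python
HEADER = b"\x00"  # assumed: single zero byte preceding the Protobuf payload


def frame(payload: bytes) -> bytes:
    """Prefix a serialized AgentToServer/ServerToAgent message with the header."""
    return HEADER + payload


def unframe(message: bytes) -> bytes:
    """Strip and validate the header, returning the Protobuf payload."""
    if not message or message[:1] != HEADER:
        raise ValueError("missing or unsupported OpAMP header")
    return message[1:]
```

The payload itself would come from the generated Protobuf classes; here it is treated as opaque bytes.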

Most teams encounter OpAMP through the OpenTelemetry Collector's opamp_extension. Ampora is an implementation of the server side of this spec, plus a fleet-management UI and a pipeline analyzer on top.

Where this is going next

The exact shapes of Configuration, Rollout, AgentSession etc. live on Core Concepts.
The deployment topology — single instance, HA, scale-out, federation — is covered on Architecture.
A working server in five minutes is on Quickstart.