Policies¶
A policy is a rule the rollout engine evaluates before pushing a configuration to an agent. If the policy denies, the push does not happen and the audit log records why.
Policies are tenant-scoped — what one tenant requires is none of another tenant's business.
Built-in policies¶
Two ship out of the box (and are enforced by default):
| Policy | What it forbids |
|---|---|
| `default-deny-exporter-removal` | Removing, without a manual override, an exporter that agents are currently sending data through |
| `default-deny-tls-insecure-on-non-localhost` | An exporter with `tls.insecure: true` that points at a non-localhost endpoint |
Both can be relaxed per tenant if your environment has a real reason.
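To make the second built-in rule concrete, here is a minimal Python sketch of the kind of check it implies. It is an illustration only, not the engine's implementation: the function names, the `LOCAL_HOSTS` set, and the assumption that an exporter is a dict with `endpoint` and `tls` keys are all hypothetical.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: which hosts count as "localhost".
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def endpoint_host(endpoint: str) -> str:
    """Extract the host from 'host:port' or 'scheme://host:port/path'."""
    # urlsplit only fills in the host when a scheme or a '//' prefix is present.
    parts = urlsplit(endpoint if "://" in endpoint else f"//{endpoint}")
    return (parts.hostname or "").lower()


def violates_tls_insecure_rule(exporter: dict) -> bool:
    """True when tls.insecure is set and the endpoint is not local."""
    tls = exporter.get("tls") or {}
    if not tls.get("insecure", False):
        return False
    # An endpoint with no recognisable host is treated as non-local here.
    return endpoint_host(exporter.get("endpoint", "")) not in LOCAL_HOSTS
```

For example, `violates_tls_insecure_rule({"endpoint": "localhost:4317", "tls": {"insecure": True}})` returns `False`, while the same exporter pointed at `otel.example.com:4317` returns `True`.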
Where policies run¶
The policy engine runs:
- when a rollout is created (precondition check),
- when a rollout starts pushing to each agent (per-agent evaluation),
- when an auto-apply is queued.
If a precondition check fails, the rollout does not start. If a per-agent evaluation fails, the engine skips that one agent, records a Rollout.PushDenied audit event, and continues with the remaining agents.
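The difference between the two failure modes can be sketched in a few lines of Python. Everything here is hypothetical (the `Rollout` class, `run_rollout`, and the callable-based checks); only the Rollout.PushDenied event name comes from this page.

```python
from dataclasses import dataclass, field
from typing import Callable

Precondition = Callable[[dict], bool]      # sees the rollout's config
AgentCheck = Callable[[dict, str], bool]   # sees the config and one agent ID


@dataclass
class Rollout:
    config: dict
    agents: list[str]
    audit_log: list[tuple[str, str]] = field(default_factory=list)


def run_rollout(
    rollout: Rollout,
    preconditions: list[Precondition],
    agent_checks: list[AgentCheck],
) -> list[str]:
    """Return the agents that received the push."""
    # Precondition failure: the whole rollout is refused.
    if not all(check(rollout.config) for check in preconditions):
        raise PermissionError("rollout refused: precondition denied")

    pushed = []
    for agent in rollout.agents:
        if all(check(rollout.config, agent) for check in agent_checks):
            pushed.append(agent)
        else:
            # Per-agent failure: skip this agent only and leave an audit event.
            rollout.audit_log.append(("Rollout.PushDenied", agent))
    return pushed
```

With three agents and a per-agent check that denies the second one, `run_rollout` returns the first and third agent and leaves one `Rollout.PushDenied` entry in `audit_log`.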
Listing policies¶
Settings → Policies lists every policy on the active tenant:
- name,
- type (Built-in or Custom),
- status (Draft / Approved / Active / Retired),
- last edit author and approver,
- audit-log link.
Creating a custom policy¶
Settings → Policies → New policy:
- Name — unique per tenant.
- Description — free text. Reviewers will read this; explain intent.
- Body — the DSL expression (see Custom policy DSL).
- Save as draft.
Drafts run in the policy engine in shadow mode: the engine evaluates them but never blocks. Every would-deny is instead recorded as a Policy.WouldDeny audit event, so you can dry-run a new policy before turning it on.
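Shadow mode can be pictured as the same evaluation loop with a different consequence for a deny. The sketch below is an assumption-laden illustration: the `Policy` class and `evaluate_all` are hypothetical, and it assumes that only Draft and Active policies are evaluated, since this page does not say how an Approved policy that is not yet Active is treated.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Policy:
    name: str
    status: str                      # "Draft", "Approved", "Active" or "Retired"
    denies: Callable[[dict], bool]   # True when the policy would deny this push


def evaluate_all(policies: list[Policy], context: dict):
    """Return (allowed, audit_events) for one push."""
    allowed = True
    events = []
    for policy in policies:
        # Assumption: only Draft (shadow) and Active (blocking) are evaluated.
        if policy.status not in ("Draft", "Active"):
            continue
        if not policy.denies(context):
            continue
        if policy.status == "Draft":
            # Shadow mode: record the would-deny, never block.
            events.append(("Policy.WouldDeny", policy.name))
        else:
            allowed = False
    return allowed, events
```

A draft that would deny leaves `allowed` as `True` and adds one `Policy.WouldDeny` event; the same rule with status `Active` flips `allowed` to `False`.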
Approval flow¶
A draft policy must be Approved by a different operator (the four-eyes principle) before it goes Active. See Approval flows.
Active policies¶
Active policies block in real time. Once a policy is Active, shadow evaluation stops: a deny now blocks the push instead of producing a Policy.WouldDeny event. The audit-log row for the activation includes the hash of the policy version, for tamper-evidence.
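This page does not specify how the version hash is computed. As a rough sketch of the idea, assuming SHA-256 over a canonical JSON encoding of hypothetical fields (`name`, `version`, `body`), a tamper-evident hash could look like this:

```python
import hashlib
import json


def policy_version_hash(name: str, version: int, body: str) -> str:
    """Hash a policy version. Field names and SHA-256 are assumptions."""
    # Canonical form: sorted keys and fixed separators, so the same content
    # always serialises to the same bytes.
    canonical = json.dumps(
        {"name": name, "version": version, "body": body},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Any later change to the stored body produces a different digest, so comparing the stored policy against the hash in the activation row reveals an edit made outside the approval flow.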
Retiring a policy¶
Retire keeps the policy in the database for the audit trail but takes it out of evaluation. Retire is reversible (re-Activate). Hard delete is not offered — the audit history of what the policy did is itself audit material.
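Taken together, the statuses form a small state machine. The following Python sketch is one way to model it, under stated assumptions: the `Policy` class and `transition` function are hypothetical, the four-eyes rule is modelled as "approver differs from author", and a retired policy is assumed to return directly to Active, since this page does not say whether re-activation needs a fresh approval.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed status transitions. There is deliberately no "Deleted" state.
ALLOWED = {
    "Draft": {"Approved"},
    "Approved": {"Active"},
    "Active": {"Retired"},
    "Retired": {"Active"},   # assumption: re-Activate goes straight to Active
}


@dataclass
class Policy:
    name: str
    author: str
    status: str = "Draft"
    approver: Optional[str] = None


def transition(policy: Policy, new_status: str, operator: str) -> None:
    """Move a policy to new_status, enforcing the four-eyes rule on approval."""
    if new_status not in ALLOWED[policy.status]:
        raise ValueError(f"{policy.status} -> {new_status} is not allowed")
    if new_status == "Approved":
        if operator == policy.author:
            raise PermissionError("four-eyes: approver must differ from author")
        policy.approver = operator
    policy.status = new_status
```

A policy written by `alice` can be approved by `bob` but not by `alice`, and a jump from Draft straight to Active raises `ValueError`.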
Performance and safety¶
The policy engine evaluates expressions with a 50 ms wall-clock budget per evaluation. An expression that times out is treated as deny, not allow: the engine is fail-closed by design. Expressions are cached in compiled form, so every evaluation after the first reuses the parse tree.
For 1 000 agents in a single rollout step, total policy evaluation overhead is a few hundred milliseconds. Custom policies that walk the entire YAML AST are slower; the rule of thumb is "tens of milliseconds for a tight rule, low hundreds for a heavy one".
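The fail-closed budget and the compiled-expression cache described above can be sketched in Python as follows. This is not the engine's code: Python's own `compile` and `eval` stand in for the policy DSL compiler (and are not a sandbox), the thread pool is just one way to enforce a wall-clock limit, and all names are hypothetical.

```python
import concurrent.futures as cf
import time
from functools import lru_cache

BUDGET_SECONDS = 0.050  # the 50 ms wall-clock budget from the text

_pool = cf.ThreadPoolExecutor(max_workers=4)


@lru_cache(maxsize=1024)
def compile_expression(source: str):
    # Stand-in for the DSL compiler: parse once, reuse on later evaluations.
    return compile(source, "<policy>", "eval")


def evaluate(source: str, context: dict, budget: float = BUDGET_SECONDS) -> bool:
    """Return True for allow, False for deny. A timeout counts as deny."""
    code = compile_expression(source)
    future = _pool.submit(eval, code, {"__builtins__": {}}, dict(context))
    try:
        return bool(future.result(timeout=budget))
    except cf.TimeoutError:
        # Fail closed. The worker is not interrupted in this sketch;
        # its late result is simply ignored.
        return False


def slow_check() -> bool:
    time.sleep(0.3)  # deliberately exceeds the budget
    return True
```

Calling `evaluate("slow()", {"slow": slow_check})` returns `False` even though the rule would eventually have allowed the push, which is the fail-closed behavior. Evaluating the same source string a second time is served from the `lru_cache` rather than recompiled.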
Examples¶
A handful of common custom policies tenants write:
- "No exporter to a non-EU endpoint." Checks the destination hostname against an allow-list.
- "Every traces pipeline must go through `tail_sampling`." Checks pipeline composition.
- "Hostmetrics interval ≥ 60 s." Caps cardinality.
- "No `debug` exporter in production." Gated by group label.
Walkthroughs of each are on Custom policy DSL.
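To show the logic behind two of these rules without anticipating the DSL syntax, here is a plain-Python sketch. It is not DSL code. It assumes the configuration is a parsed collector-style YAML with a `service.pipelines` section, and that the group label is a hypothetical `env` key.

```python
def component_type(component_id: str) -> str:
    """'tail_sampling/eu' -> 'tail_sampling' (type before the optional name)."""
    return component_id.split("/")[0]


def traces_pipelines_missing_tail_sampling(config: dict) -> list[str]:
    """Names of traces pipelines that do not include a tail_sampling processor."""
    pipelines = config.get("service", {}).get("pipelines", {})
    return [
        name
        for name, pipeline in pipelines.items()
        if component_type(name) == "traces"
        and not any(
            component_type(proc) == "tail_sampling"
            for proc in pipeline.get("processors", [])
        )
    ]


def uses_debug_exporter(config: dict) -> bool:
    """True when any pipeline exports through a debug exporter."""
    pipelines = config.get("service", {}).get("pipelines", {})
    return any(
        component_type(exp) == "debug"
        for pipeline in pipelines.values()
        for exp in pipeline.get("exporters", [])
    )


def denies(config: dict, group_labels: dict) -> bool:
    """Combine both rules; the 'env' label name is an assumption."""
    if traces_pipelines_missing_tail_sampling(config):
        return True
    return group_labels.get("env") == "production" and uses_debug_exporter(config)
```

The second rule only applies when the group label marks the agents as production, which is what "gated by group label" means in the list above.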