Alerts

Alerts are emitted by the monitoring engine when a rule's trigger condition is met for a monitored resource. The engine evaluates every active rule once per minute and keeps at most one active alert per (rule, resource) pair: repeated triggers update the existing alert rather than creating a new one.
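The one-alert-per-pair behaviour can be pictured as an upsert keyed on (rule, resource). The following is a minimal sketch, not the engine's implementation; the `Alert` class, the `active_alerts` store, and `record_trigger` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Alert:
    rule_id: str
    resource_id: str
    created: datetime
    last_updated: datetime
    status: str = "Open"


# Hypothetical in-memory store holding only unresolved alerts,
# keyed by the (rule, resource) pair.
active_alerts: dict[tuple[str, str], Alert] = {}


def record_trigger(rule_id: str, resource_id: str, now: datetime) -> Alert:
    """Open a new alert, or update the existing one for the same pair."""
    key = (rule_id, resource_id)
    alert = active_alerts.get(key)
    if alert is None:
        # First trigger for this pair: open a new alert.
        alert = Alert(rule_id, resource_id, created=now, last_updated=now)
        active_alerts[key] = alert
    else:
        # Repeated trigger: refresh the existing alert instead of duplicating it.
        alert.last_updated = now
    return alert
```

Calling `record_trigger` twice for the same pair returns the same object with a newer `last_updated`; a different resource under the same rule gets its own alert.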

Accessing Alerts

Navigate to Monitoring > Alerts for the full list. The Overview page shows the alerts dashboard with summary widgets and the most recent activity.

Alerts List

The alerts table presents every alert across the tenant.

Columns:

Column | Description
Created | Timestamp the alert first opened
Last Updated | Timestamp of the most recent trigger, severity change, or status change
Severity | Critical / High / Medium / Low; the current level after any overrides
Rule Name | Rule that produced the alert (clickable, opens the rule detail)
Alert | Rule type that fired (for example, Disk utilization, No data received)
Resource Group | Directors / Devices / Targets
Resource | Name of the affected resource
Status | Open / Acknowledged / Resolved

Controls:

  • Search alerts — free-text search across rule name and resource name
  • Severity filter — All, Critical, High, Medium, Low
  • Status filter — All, Open, Acknowledged, Resolved
  • Alert Type filter — All, Director, Device, Target
  • Pagination at the foot of the table

Severity and Status

Severity levels rank the urgency of an alert. A multi-severity rule sets the level from its configured thresholds; a single-severity rule applies the same level to every trigger.

Severity | Badge
Critical | red
High | red (lighter)
Medium | orange
Low | yellow

Status values describe where the alert sits in its lifecycle.

Status | Description
Open | Alert was opened by a trigger and has not been acknowledged or resolved
Acknowledged | An operator has taken ownership; the alert is still active but is being worked
Resolved | The alert is closed, either manually or automatically by the engine

Alert Detail

Clicking a row opens the alert detail page. The main panel surfaces:

  • Current status badge and severity badge
  • An alert summary line naming the rule type and resource (for example, "Disk utilization alert for web-01"), followed by a state-dependent description
  • Rule Name and Rule Type
  • A See alert rule details link that opens the rule's detail page
  • Acknowledged By and Acknowledgement Note (shown once the alert is acknowledged)
  • Resolved By and Resolution Note (shown once the alert is resolved)

A side info panel lists the Created and Last updated timestamps, plus Acknowledged and Resolved timestamps when those states have been reached.

Below the panel, a timeline section records every state change — the original trigger, severity upgrades, acknowledgements, and the resolution. Its columns are Event Time, Event, Severity, and Trigger Condition.

Acknowledging and Resolving

Both actions are available from the alerts list row menu and from the Actions menu on the alert detail page. The menu items are state-conditional: Acknowledge alert appears while the alert is Open, Remove acknowledgement while it is Acknowledged, and Resolve alert as long as it is not Resolved. These actions are gated by the alert-edit permission; opening the detail page itself requires the alert-read permission.
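The state-conditional menu can be summarised as a small lookup. This is an illustrative sketch only: `available_actions` and the `can_edit` flag are hypothetical, and whether the product hides or disables the items without the alert-edit permission is an assumption here (the sketch hides them).

```python
def available_actions(status: str, can_edit: bool) -> list[str]:
    """Return the menu items shown for an alert in the given status."""
    if not can_edit:
        # Assumption: without alert-edit, no action items are offered.
        return []
    actions = []
    if status == "Open":
        actions.append("Acknowledge alert")
    if status == "Acknowledged":
        actions.append("Remove acknowledgement")
    if status != "Resolved":
        actions.append("Resolve alert")
    return actions
```

For example, an Open alert offers Acknowledge alert and Resolve alert, while a Resolved alert offers nothing.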

Acknowledge

Acknowledge alert moves the alert from Open to Acknowledged and records the operator and timestamp. The acknowledgement note is optional. Remove acknowledgement returns the alert to Open.

Resolve

Resolve alert moves the alert to Resolved and closes the lifecycle. A resolution note is required and is captured in the timeline alongside the operator and timestamp. Resolution is final: a resolved alert cannot be reopened, but a new trigger against the same (rule, resource) pair opens a fresh alert.
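The acknowledge and resolve transitions described above form a small state machine. The sketch below models them under stated assumptions: the `Alert` class, its method names, and the timeline tuple layout are hypothetical, and rejecting a whitespace-only resolution note is a guess at how "required" is enforced.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Alert:
    status: str = "Open"
    # Each entry: (timestamp, event, operator, note). Layout is illustrative.
    timeline: list[tuple[datetime, str, str, str]] = field(default_factory=list)

    def _log(self, event: str, operator: str, note: str = "") -> None:
        self.timeline.append((datetime.now(timezone.utc), event, operator, note))

    def acknowledge(self, operator: str, note: str = "") -> None:
        # Only Open alerts can be acknowledged; the note is optional.
        if self.status != "Open":
            raise ValueError("only an Open alert can be acknowledged")
        self.status = "Acknowledged"
        self._log("acknowledged", operator, note)

    def remove_acknowledgement(self, operator: str) -> None:
        # Removing the acknowledgement returns the alert to Open.
        if self.status != "Acknowledged":
            raise ValueError("alert is not Acknowledged")
        self.status = "Open"
        self._log("acknowledgement removed", operator)

    def resolve(self, operator: str, note: str) -> None:
        # Resolution is final, and the note is mandatory.
        if self.status == "Resolved":
            raise ValueError("a resolved alert cannot be changed")
        if not note.strip():
            raise ValueError("a resolution note is required")
        self.status = "Resolved"
        self._log("resolved", operator, note)
```

Because `resolve` accepts both Open and Acknowledged alerts, an operator can close an alert without acknowledging it first, which matches the menu behaviour described earlier.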

Alert Lifecycle

Evaluation tick

The monitoring engine re-evaluates every active rule once a minute. Threshold and metric checks run against the most recent observation window; status checks look at the current connection state.

Severity override

When a multi-severity rule fires at a higher level against a resource that already has an Open alert, the existing alert's severity is upgraded in place — the timeline gains an entry recording the change, but no new alert is created. Triggers at the same or a lower severity update the last-triggered timestamp and increment the occurrence count without changing the severity.
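As a rough illustration of the override rule, the sketch below compares the incoming severity against the alert's current one. All names (`SEVERITY_RANK`, `Alert`, `apply_trigger`) are hypothetical, and the timestamps are plain integers for brevity. The text does not say whether an upgrading trigger also bumps the occurrence count and last-triggered timestamp; this sketch assumes it does.

```python
from dataclasses import dataclass, field

# Higher number means more urgent.
SEVERITY_RANK = {"Low": 1, "Medium": 2, "High": 3, "Critical": 4}


@dataclass
class Alert:
    severity: str
    occurrences: int = 1
    last_triggered: int = 0  # illustrative: seconds since some epoch
    timeline: list[str] = field(default_factory=list)


def apply_trigger(alert: Alert, severity: str, at: int) -> None:
    """Fold a repeated trigger into an existing alert."""
    if SEVERITY_RANK[severity] > SEVERITY_RANK[alert.severity]:
        # Upgrade in place and record the change; no new alert is created.
        alert.timeline.append(
            f"Severity upgraded from {alert.severity} to {severity}"
        )
        alert.severity = severity
    # Same or lower severity leaves the level untouched.
    # Assumption: every trigger, including an upgrade, updates these fields.
    alert.occurrences += 1
    alert.last_triggered = at
```

A Medium alert that later sees a Low trigger stays Medium; a subsequent Critical trigger upgrades it and adds one timeline entry.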

Auto-resolve

Some rule types support a resolve condition, configured on the rule itself:

Rule type | Auto-resolve source
Crash detection, Backpressure | Operator-configured Resolve after period: when the period elapses with no further triggers, the alert closes automatically
Threshold rules (utilization, data volume, event volume, total ingest amount, queue usage) | The alert auto-resolves once the triggering condition stops being met for the configured stale grace window

Auto-resolved alerts carry the system-generated resolution note "This alert was automatically resolved by the system after the defined resolution period expired." and are flagged in the timeline.
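The Resolve after behaviour amounts to a quiet-period check on each evaluation tick. The sketch below is one way to express it, with hypothetical names (`Alert`, `maybe_auto_resolve`, `resolve_after`). Two details are assumptions rather than documented behaviour: the alert closes once the elapsed time is equal to or greater than the period, and Acknowledged alerts are eligible as well as Open ones.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

AUTO_RESOLVE_NOTE = (
    "This alert was automatically resolved by the system "
    "after the defined resolution period expired."
)


@dataclass
class Alert:
    last_triggered: datetime
    status: str = "Open"
    resolution_note: str = ""


def maybe_auto_resolve(alert: Alert, resolve_after: timedelta, now: datetime) -> bool:
    """Close the alert if no trigger has arrived within resolve_after."""
    if alert.status == "Resolved":
        return False
    if now - alert.last_triggered < resolve_after:
        # A trigger arrived too recently; keep the alert active.
        return False
    alert.status = "Resolved"
    alert.resolution_note = AUTO_RESOLVE_NOTE
    return True
```

Each new trigger moves `last_triggered` forward, so the quiet period restarts whenever the condition fires again.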

Multi-resource fan-out

A rule scoped to All directors, All devices, or All targets — with or without exceptions — fans out to every resource it matches. If five Directors breach the rule simultaneously, five independent alerts open — one per resource — and each follows its own acknowledge / resolve lifecycle.
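Fan-out can be sketched as expanding a scope into one alert key per breaching resource. The function name `alerts_to_open` and its parameters are hypothetical, and the sketch assumes that a resource listed as an exception is simply skipped.

```python
def alerts_to_open(
    rule_id: str,
    scope: list[str],
    exceptions: set[str],
    breaching: set[str],
) -> list[tuple[str, str]]:
    """Return one (rule, resource) key per in-scope resource that breaches."""
    return [
        (rule_id, resource)
        for resource in scope
        if resource not in exceptions and resource in breaching
    ]
```

With six Directors in scope, one of them excepted, and all six breaching, the function yields five keys, each of which would follow its own acknowledge / resolve lifecycle.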