Planekeeper is currently in alpha development. Features and APIs may change. Feedback is welcome! Request early access to get started.

How it works

The five-stage pipeline that powers Planekeeper's version lag detection, from gathering releases to delivering notifications.

Planekeeper detects version lag through a five-stage pipeline. Each stage feeds the next, and the whole system runs automatically on schedules you define.

  Gather          Scrape           Rules           Alerts          Notify
+-----------+   +-----------+   +-----------+   +-----------+   +-----------+
| Fetch     |   | Extract   |   | Compare   |   | Create or |   | Send to   |
| upstream  |-->| deployed  |-->| versions  |-->| update    |-->| webhooks  |
| releases  |   | versions  |   | against   |   | alerts    |   | (Slack,   |
|           |   |           |   | rules     |   |           |   |  Discord) |
+-----------+   +-----------+   +-----------+   +-----------+   +-----------+
      |               |               |               |               |
      v               v               v               v               v
  upstream        version         evaluation       alerts        deliveries
  releases       snapshots        results          table           table

Stage 1: Gather

Gather jobs fetch the latest available versions from upstream sources. Planekeeper currently supports three source types:

  • GitHub Releases – pulls release data from any public GitHub repository
  • Helm Repository – reads chart versions from a Helm repository index
  • OCI Container Registry – reads image tags from an OCI-compliant container registry

Each gather job runs on a cron schedule (for example, every 6 hours) and stores every release it finds. This builds a complete version history for the artifact, including release dates and prerelease flags.

When a gather job completes, the rules engine re-evaluates all alert configs that reference it. If the latest upstream version has changed, alerts update automatically.

Stage 2: Scrape

Scrape jobs extract the version you currently have deployed. They work by cloning a Git repository, reading a specific file, and parsing a version string from it.

Three parser types are available:

Parser   Best for        Example expression
------   -------------   ------------------------------------
YQ       YAML files      .version or .dependencies[0].version
JQ       JSON files      .version or .dependencies.react
Regex    Any text file   version:\s*([\d.]+)

Each scrape creates a version snapshot – a point-in-time record of the version found. Planekeeper keeps a configurable history of snapshots so you can track version changes over time, including rollbacks.
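A snapshot history with a retention limit can be sketched as a bounded queue; the default retention value here is hypothetical:

```python
from collections import deque


class SnapshotHistory:
    """Keeps the most recent version snapshots for one scrape job.

    A bounded deque gives the configurable history: old snapshots
    fall off the back once the limit is reached.
    """

    def __init__(self, retain: int = 50) -> None:
        self.snapshots: deque[str] = deque(maxlen=retain)

    def add(self, version: str) -> None:
        self.snapshots.append(version)

    def changed(self) -> bool:
        # True when the newest snapshot differs from the previous one --
        # this catches rollbacks as well as upgrades.
        return (len(self.snapshots) >= 2
                and self.snapshots[-1] != self.snapshots[-2])
```

Note that `changed` only compares adjacent snapshots, which is why a rollback to an older version registers as a change just like an upgrade does.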

For environments where agent deployment is not practical, or for demos and testing, scrape jobs can use the manual parse type – users enter the deployed version directly via the UI or API, and the same rule evaluation pipeline applies. See Manual version entry for details.

When a scrape job completes (or a manual version is set), the rules engine re-evaluates all related alert configs immediately.

Stage 3: Rules

The rules engine compares your deployed version (from a scrape job) against the latest upstream version (from a gather job) and calculates a “behind-by” value.

Three rule types measure staleness differently:

Rule type       What it measures                                        Example
---------       ----------------                                        -------
Days behind     How many days since the deployed version was released   Your version is 90 days old
Majors behind   Major version distance                                  You are on v1.x, latest is v3.x (2 majors behind)
Minors behind   Minor version distance, counting actual releases        You are on 1.2, latest is 1.8 (6 minors behind)
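The three measurements could be sketched as follows. Version parsing here is deliberately naive (plain major.minor strings), and the function names are hypothetical:

```python
from datetime import date


def days_behind(deployed_released: date, today: date) -> int:
    """Days since the deployed version was released."""
    return (today - deployed_released).days


def majors_behind(deployed: str, latest: str) -> int:
    """Major version distance, e.g. v1.x vs v3.x -> 2."""
    def major(v: str) -> int:
        return int(v.lstrip("v").split(".")[0])
    return major(latest) - major(deployed)


def minors_behind(deployed: str, releases: list[str]) -> int:
    """Counts actual minor releases published after the deployed version,
    rather than just subtracting minor numbers."""
    def key(v: str) -> tuple[int, int]:
        major, minor = v.lstrip("v").split(".")[:2]
        return int(major), int(minor)
    return len({key(r) for r in releases if key(r) > key(deployed)})
```

Counting actual releases matters when upstream skips minor numbers: the gap is measured in releases you could have adopted, not in arithmetic distance.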

Each rule defines three severity thresholds:

behind-by value:    0 .... 5 ........ 10 ........... 20
                    |      |          |              |
                    OK     MODERATE   HIGH           CRITICAL

If the behind-by value is below the moderate threshold, there is no violation and no alert. Once it crosses a threshold, Planekeeper assigns the corresponding severity.

Rules also support a stable only flag. When enabled, prerelease versions (alpha, beta, RC, dev, snapshot, canary, nightly, and pre) are excluded from the latest version lookup.

Stage 4: Alerts

When a rule evaluation finds a violation, Planekeeper creates or updates an alert. Key behaviors:

  • One alert per config. Each alert config has at most one active alert at any time. If the situation changes (version updates, severity escalates), the existing alert is updated in place.
  • Automatic acknowledgment reset. If you acknowledge an alert and then the deployed version changes, the acknowledgment clears so you can review the new state.
  • Automatic resolution. When a future scrape finds that you have upgraded and no longer violate the rule, the alert is resolved automatically. Resolved alerts are kept in history.
  • Severity escalation. If the version gap increases and crosses a higher threshold, the alert escalates (for example, from moderate to high). This triggers a new notification.

Stage 5: Notify

Alert lifecycle events trigger notifications to your configured channels. Five events can generate notifications:

Event            When it fires
-----            -------------
Created          A new alert is generated for the first time
Escalated        An existing alert’s severity increases
Acknowledged     Someone marks the alert as reviewed
Unacknowledged   Acknowledgment is reset due to a version change
Resolved         The deployed version is updated and no longer violates the rule

Notification rules control which events go to which channels, filtered by severity. The notifier service delivers webhooks with retry logic and exponential backoff. Failed deliveries are retried for approximately 16 hours before being placed in a dead letter queue.
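The exact retry schedule is not specified here, but a doubling backoff starting at one minute gives the right order of magnitude: ten attempts span 1023 minutes, roughly the 16-hour window quoted above. The base delay and attempt count below are illustrative, not Planekeeper's actual configuration:

```python
def backoff_delays(base_minutes: float = 1.0, attempts: int = 10) -> list[float]:
    """Exponential backoff: the n-th retry waits base * 2**n minutes."""
    return [base_minutes * 2 ** n for n in range(attempts)]


delays = backoff_delays()          # [1, 2, 4, ..., 512] minutes
total_hours = sum(delays) / 60     # 1023 minutes, about 17 hours
```

Once the schedule is exhausted, the delivery moves to the dead letter queue instead of retrying forever.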

The execution layer

Two components run the pipeline behind the scenes:

Agents are lightweight workers that execute gather and scrape tasks. They poll the API server for work, run the task, and report results. You can deploy agents alongside the server (server agent) or at remote sites (client agent) to access private repositories behind firewalls.

TaskEngine handles scheduling and coordination. It activates jobs when their cron schedule fires, detects timed-out tasks, recovers orphaned jobs from disconnected agents, and processes results.

You do not interact with these components directly. They operate automatically once you create your jobs and rules.

Event-driven evaluation

The pipeline is not just scheduled – it is event-driven. Several actions trigger immediate rule re-evaluation:

  • A scrape job completes
  • A gather job completes
  • An alert config is created, updated, or toggled on

This means you do not wait for the next scheduled run to see updated alerts. Changes propagate within seconds.