Source: https://docs.planekeeper.com/reference/parser-types/
Site: Planekeeper Docs
Title: Parser types
Description: Reference for the four parser types — YQ, JQ, Regex, and Manual — used by scrape jobs to extract or set version strings.

***

# Parser types


Scrape jobs extract version strings from files in your Git repositories, or accept manually entered versions. Choose a parser type based on the file format you are targeting or whether you need agent-free operation.

| Parser | File format | Best for |
|--------|-------------|----------|
| YQ | YAML and JSON | Helm Chart.yaml, Kubernetes manifests, package.json with arrays |
| JQ | JSON (simple) | package.json, composer.json — simple key lookups only |
| Regex | Any text | Dockerfiles, Makefiles, .env files, arbitrary formats |
| Manual | None (user-provided) | Manual version entry — no agent or repository needed. Users set the version string directly via the UI or API. |

> **info:** 
**About the parser implementations**

The YQ and JQ parsers are **lightweight, built-in path navigators** — they do not shell out to the `yq` or `jq` CLI tools. This keeps Planekeeper dependency-free and fast, but it means the parsers support a subset of what the full CLI tools offer. Both use dot-notation path expressions (`.field.subfield`) rather than the full query languages.

The **YQ parser** supports array indexing (`.dependencies[0].version`) and can parse both YAML and JSON files, making it the more capable of the two. The **JQ parser** only supports simple key-based navigation and does not handle arrays.

We are actively exploring a more feature-rich parser implementation with broader query support. For now, if you need array access in JSON files, use the YQ parser.


---

## YQ parser

Use the YQ parser for YAML files **and JSON files that require array access**. Expressions use dot-notation to navigate the document structure. Since YAML is a superset of JSON, the YQ parser handles both formats.

### Expression syntax

```yaml title="Basic field access"
.version
```

```yaml title="Nested field access"
.metadata.version
```

```yaml title="Array element access"
.dependencies[0].version
```

```yaml title="Deeply nested with arrays"
.items[2].spec.containers[0].image
```

### Example: Helm Chart.yaml

Given this `Chart.yaml`:

```yaml title="Chart.yaml"
apiVersion: v2
name: my-app
version: 1.4.2
appVersion: 3.1.0
dependencies:
  - name: postgresql
    version: 12.1.5
    repository: https://charts.bitnami.com/bitnami
```

| Expression | Extracted value |
|-----------|----------------|
| `.version` | `1.4.2` |
| `.appVersion` | `3.1.0` |
| `.dependencies[0].version` | `12.1.5` |
| `.dependencies[0].name` | `postgresql` |

### Example: Kubernetes manifest

Given a `deployment.yaml`:

```yaml title="deployment.yaml"
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  template:
    spec:
      containers:
        - name: app
          image: myregistry/app:2.5.0
```

| Expression | Extracted value |
|-----------|----------------|
| `.spec.template.spec.containers[0].image` | `myregistry/app:2.5.0` |

> **success:** 
**Combine with version transforms**

If the extracted value includes a prefix or suffix (like `myregistry/app:2.5.0`), pair the YQ parser with a regex version transform or use the Regex parser instead to isolate the version number.


---

## JQ parser

Use the JQ parser for **simple key lookups** in JSON files. Expressions use dot-notation to navigate the document.

> **warning:** 
**Limited functionality**

The JQ parser only supports simple dot-notation key access (`.field.subfield`). It does **not** support array indexing, filters, pipes, or any other JQ query syntax. If your JSON file requires array access (for example, `.items[0].version`), use the **YQ parser** instead — it handles both YAML and JSON with full array support.


### Expression syntax

```json title="Top-level field"
.version
```

```json title="Nested field"
.dependencies.react
```

```json title="Deeply nested"
.engines.node
```

### Example: package.json

Given this `package.json`:

```json title="package.json"
{
  "name": "my-app",
  "version": "2.1.0",
  "dependencies": {
    "express": "^4.18.2",
    "react": "^18.2.0"
  },
  "engines": {
    "node": ">=18.0.0"
  }
}
```

| Expression | Extracted value |
|-----------|----------------|
| `.version` | `2.1.0` |
| `.dependencies.express` | `^4.18.2` |
| `.dependencies.react` | `^18.2.0` |
| `.engines.node` | `>=18.0.0` |

> **success:** 
**Stripping version prefixes**

If the extracted value includes a prefix like `^` or `>=`, use a version transform or switch to the Regex parser to extract only the numeric portion.


### Example: composer.json

Given a `composer.json`:

```json title="composer.json"
{
  "name": "myorg/api",
  "require": {
    "php": ">=8.1",
    "laravel/framework": "^10.0"
  }
}
```

| Expression | Extracted value |
|-----------|----------------|
| `.require.php` | `>=8.1` |

---

## Regex parser

Use the Regex parser for any text file. The parser applies a regular expression to the file content and returns the first capture group. If there is no capture group, it returns the full match.

> **info:** 
**Go RE2 regex engine**

Planekeeper uses Go's [RE2 regex engine](https://github.com/google/re2/wiki/Syntax), which differs from the PCRE engine used by most online regex testers (e.g., [regexr.com](https://regexr.com), [regex101.com](https://regex101.com)). Key differences:

- **No lookahead/lookbehind** (`(?=...)`, `(?<=...)`) — RE2 does not support these.
- **No backreferences** (`\1`) — use explicit capture groups instead.
- **No possessive quantifiers** (`a++`) — use standard greedy/lazy quantifiers.

For accurate testing, use regex101.com with the **Golang** flavor selected, or use the **Test** button in the scrape job form which validates against the actual Go engine.


### Test button

The **Test** button in the scrape job form checks whether your pattern is valid Go RE2 syntax. It does **not** test the pattern against any file content — it only verifies that the pattern compiles without errors. Any string that is valid RE2 syntax will pass (e.g., plain text like `hello` is a valid regex that matches the literal word "hello"). To verify that your pattern extracts the correct version, test it against a local copy of your target file using a Go-compatible tool or regex101.com with the Golang flavor.

### Expression syntax

Enclose the version portion in parentheses to create a capture group:

```text title="Capture group extracts only the version"
version:\s*([\d.]+)
```

```text title="Full semver match with v prefix stripped"
^v(\d+\.\d+\.\d+)$
```

```text title="No capture group -- returns full match"
\d+\.\d+\.\d+
```

### Example: Dockerfile

Given this `Dockerfile`:

```dockerfile title="Dockerfile"
FROM node:18.17.1-alpine
WORKDIR /app
COPY . .
RUN npm install
```

| Expression | Extracted value |
|-----------|----------------|
| `FROM node:([\d.]+)` | `18.17.1` |
| `node:(\S+)` | `18.17.1-alpine` |

### Example: Makefile

Given this `Makefile`:

```makefile title="Makefile"
VERSION := 3.2.1
BINARY_NAME := myapp
```

| Expression | Extracted value |
|-----------|----------------|
| `VERSION\s*:=\s*([\d.]+)` | `3.2.1` |

### Example: .env file

Given a `.env` file:

```text title=".env"
APP_VERSION=1.0.5
DATABASE_URL=postgres://localhost/mydb
```

| Expression | Extracted value |
|-----------|----------------|
| `APP_VERSION=([\d.]+)` | `1.0.5` |

### Example: Terraform

Given a `versions.tf`:

```hcl title="versions.tf"
terraform {
  required_version = ">= 1.5.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.20"
    }
  }
}
```

| Expression | Extracted value |
|-----------|----------------|
| `required_version\s*=\s*"[>=~]*\s*([\d.]+)"` | `1.5.0` |

> **warning:** 
**Multiline matching**

The regex is applied to the entire file content. Make sure your pattern is specific enough to match only the intended version string, especially in files with multiple version references.


---

## Version transforms

After a parser extracts a value, an optional version transform normalizes the string before comparison with upstream releases.

| Transform | Input | Output |
|-----------|-------|--------|
| `none` (default) | `1.2.3` | `1.2.3` |
| `add_v_lower` | `1.2.3` | `v1.2.3` |
| `add_v_upper` | `1.2.3` | `V1.2.3` |
| `strip_v_lower` | `v1.2.3` | `1.2.3` |
| `strip_v_upper` | `V1.2.3` | `1.2.3` |

Use transforms when upstream releases use a `v` prefix (for example, `v1.2.3`) but your configuration file stores the bare version number (`1.2.3`), or vice versa.

---

## Manual parser

The Manual parser does not extract a version from a file. Instead, the user enters the version string directly via the UI or the API. No agent, repository, or expression is needed.

Use the Manual parser when:

- You want to demo or test the monitoring pipeline without deploying an agent
- The deployed version is known but not stored in a Git-accessible file
- You want to report versions from a CI/CD pipeline via the API

When you set a version on a manual scrape job, it is stored as a version snapshot and triggers the same rule evaluation, alert generation, and notification pipeline as agent-discovered versions. If a version transform is configured on the job, it is applied to the entered value.

**API endpoint:**

```
POST /api/v1/client/scrape-jobs/{id}/set-version
Content-Type: application/json

{
  "version": "1.2.3"
}
```

See the [Scrape jobs guide](https://docs.planekeeper.com/guides/scrape-jobs/#manual-version-entry) for step-by-step instructions.

---

## Choosing the right parser

| Your file | Recommended parser | Expression example | Notes |
|-----------|-------------------|--------------------|-------|
| `Chart.yaml` | YQ | `.version` | |
| `Chart.yaml` (dependency) | YQ | `.dependencies[0].version` | Array access required |
| `package.json` (simple) | JQ | `.version` | Top-level key only |
| `package.json` (nested array) | YQ | `.dependencies[0].version` | Use YQ for arrays in JSON |
| `Dockerfile` | Regex | `FROM image:([\d.]+)` | |
| `docker-compose.yml` | YQ | `.services.app.image` | |
| `.tool-versions` | Regex | `nodejs\s+([\d.]+)` | |
| `go.mod` | Regex | `^go\s+([\d.]+)` | |
| `requirements.txt` | Regex | `flask==([\d.]+)` | |
| No file (manual entry) | Manual | *(none)* | Version entered via UI or API |

> **success:** 
**When in doubt, use YQ**

The YQ parser handles YAML and JSON, supports array indexing, and covers the most use cases. Use JQ only for straightforward JSON key lookups, Regex for non-structured text files, and Manual when you do not need agent-based extraction at all.


