Regex Extract
Synopsis
Extracts named fields from text using regular expressions with named capture groups.
Schema
regex_extract:
- field: <ident>
- regex: <string>
- additional_regex: <string[]>
- description: <text>
- field_name_format: <string>
- if: <script>
- ignore_failure: <boolean>
- ignore_missing: <boolean>
- max_exec: <integer>
- on_failure: <processor[]>
- on_success: <processor[]>
- overwrite_existing: <boolean>
- tag: <string>
Configuration
Field | Required | Default | Description |
---|---|---|---|
field | Y | - | Field containing text to extract from |
regex | Y | - | Regular expression with named capture groups |
additional_regex | N | - | Additional patterns to match after primary regex |
description | N | - | Explanatory note |
field_name_format | N | - | Template for formatting extracted field names (${name}) |
if | N | - | Condition to run |
ignore_failure | N | false | Continue on regex match failures |
ignore_missing | N | false | Continue if source field doesn't exist |
max_exec | N | 100 | Maximum number of matches to process |
on_failure | N | - | See Handling Failures |
on_success | N | - | See Handling Success |
overwrite_existing | N | false | Replace existing fields instead of converting to array |
tag | N | - | Identifier |
Details
The processor supports dynamic field naming using _NAME_ and _VALUE_ pattern pairs, field name formatting, and handling of multiple matches.
Golang regular expressions provide named capture groups for extracting fields.
Complex regular expressions on large texts may impact performance.
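As a rough illustration of how named capture groups can be read out with Go's standard `regexp` package, the sketch below collects every named group of the first match into a map. The helper name `extractNamed`, the pattern, and the sample input are hypothetical and only show the general technique, not the processor's own implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// extractNamed compiles pattern and returns one entry per named capture
// group found in the first match against text.
// Hypothetical helper for illustration; not part of the processor.
func extractNamed(pattern, text string) (map[string]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	match := re.FindStringSubmatch(text)
	if match == nil {
		return nil, fmt.Errorf("pattern did not match")
	}
	fields := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name == "" {
			continue // skip the whole match (index 0) and unnamed groups
		}
		fields[name] = match[i]
	}
	return fields, nil
}

func main() {
	// Assumed sample pattern and input.
	fields, err := extractNamed(
		`(?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3})`,
		"GET /index.html 200",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(fields["method"], fields["path"], fields["status"])
	// prints: GET /index.html 200
}
```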
Each named group becomes a field in the output. Special _NAME_n and _VALUE_n pairs allow dynamic field naming based on extracted content.
The _NAME_n and _VALUE_n pairs must use matching indices, e.g. _NAME_0 with _VALUE_0.
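The index matching between _NAME_n and _VALUE_n can be sketched in Go as follows. This is a minimal illustration under assumptions: the helper `pairDynamic` is hypothetical, and skipping a name that has no value with the same index is a choice made here, not documented processor behavior.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pairDynamic pairs each _NAME_n group with the _VALUE_n group that carries
// the same index n. The captured name becomes the key of the captured value.
// Hypothetical helper for illustration only.
func pairDynamic(pattern, text string) (map[string]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	match := re.FindStringSubmatch(text)
	if match == nil {
		return nil, fmt.Errorf("pattern did not match")
	}
	names := map[string]string{}  // index -> captured field name
	values := map[string]string{} // index -> captured field value
	for i, group := range re.SubexpNames() {
		switch {
		case strings.HasPrefix(group, "_NAME_"):
			names[strings.TrimPrefix(group, "_NAME_")] = match[i]
		case strings.HasPrefix(group, "_VALUE_"):
			values[strings.TrimPrefix(group, "_VALUE_")] = match[i]
		}
	}
	fields := map[string]string{}
	for idx, name := range names {
		// Assumption: a name without a value of the same index is ignored.
		if value, ok := values[idx]; ok && name != "" {
			fields[name] = value
		}
	}
	return fields, nil
}

func main() {
	pattern := `(?P<_NAME_0>\w+)=(?P<_VALUE_0>\w+) (?P<_NAME_1>\w+)=(?P<_VALUE_1>\w+)`
	fields, err := pairDynamic(pattern, "user=alice role=admin")
	if err != nil {
		panic(err)
	}
	fmt.Println(fields["user"], fields["role"])
	// prints: alice admin
}
```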
Multiple regex patterns, array conversion for duplicate fields, field name templating, and match count limiting are also supported.
Field names are automatically sanitized to remove invalid characters. However, the field_name_format should produce valid field names. Also, when overwrite_existing is set to false, duplicate matches are converted to arrays.
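The difference between overwriting and array conversion can be pictured with a small Go sketch. The `store` helper and the `overwrite` flag are hypothetical stand-ins for the behavior described above, and the exact output shape the processor produces is an assumption.

```go
package main

import "fmt"

// store writes value under key in doc. With overwrite=false, a repeated key
// is collected into a slice instead of replacing the earlier value.
// Hypothetical helper; it only approximates the described behavior.
func store(doc map[string]interface{}, key, value string, overwrite bool) {
	existing, seen := doc[key]
	if !seen || overwrite {
		doc[key] = value
		return
	}
	switch v := existing.(type) {
	case []string:
		doc[key] = append(v, value) // already an array: append
	case string:
		doc[key] = []string{v, value} // second occurrence: convert to array
	}
}

func main() {
	doc := map[string]interface{}{}
	for _, ip := range []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"} {
		store(doc, "ip", ip, false)
	}
	fmt.Println(doc["ip"])
	// prints: [10.0.0.1 10.0.0.2 10.0.0.3]
}
```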
Be careful with the max_exec setting when dealing with high-frequency matches: it caps the number of matches processed (default 100).
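One way to picture a match cap in Go is the count argument of `FindAllStringSubmatch`, which stops after a given number of matches. The sketch below is an assumption about how such a limit could work; `findLimited` and its `maxExec` parameter are hypothetical names, and the processor may enforce max_exec differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// findLimited returns the values captured by the named group for at most
// maxExec matches of pattern in text. maxExec is expected to be positive.
// Hypothetical helper for illustration only.
func findLimited(pattern, group, text string, maxExec int) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	idx := re.SubexpIndex(group)
	if idx < 0 {
		return nil, fmt.Errorf("no capture group named %q", group)
	}
	var out []string
	// The second argument limits how many matches are returned.
	for _, m := range re.FindAllStringSubmatch(text, maxExec) {
		out = append(out, m[idx])
	}
	return out, nil
}

func main() {
	codes, err := findLimited(`(?P<code>\d{3})`, "code", "200 404 500 302 200", 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(codes)
	// prints: [200 404 500]
}
```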
Consider using ignore_failure when regex patterns might not match all inputs.
Examples
Basic
Extracting a numeric value with a static field name...
creates a new field:
Complex Logs
Extracting multiple fields from structured log...
yields HTTP log components:
Dynamic Fields
Extracting key-value pairs as dynamic fields...
creates new fields based on the extracted names:
Formatting
Formatting extracted field names...
adds suffixes:
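As a separate Go illustration of the `${name}` substitution (not the processor's implementation), the sketch below applies a format template to every extracted field name. The helper `applyFormat` and the `_extracted` suffix are hypothetical choices made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// applyFormat renames every key in fields by substituting it for the
// ${name} placeholder in format. An empty format leaves names unchanged.
// Hypothetical helper for illustration only.
func applyFormat(format string, fields map[string]string) map[string]string {
	out := make(map[string]string, len(fields))
	for name, value := range fields {
		key := name
		if format != "" {
			key = strings.ReplaceAll(format, "${name}", name)
		}
		out[key] = value
	}
	return out
}

func main() {
	fields := map[string]string{"status": "200", "method": "GET"}
	renamed := applyFormat("${name}_extracted", fields)
	fmt.Println(renamed["status_extracted"], renamed["method_extracted"])
	// prints: 200 GET
}
```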
Multi-Match
Extracting multiple matches with array conversion...
creates an array of up to
Structured Data
Using multiple regexes with structured data...
extracts nested key-value pairs:
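Applying a primary pattern followed by additional patterns can be sketched in Go as below. This is an illustration under assumptions: `extractAll` is a hypothetical helper, the sample text is invented, and letting the first pattern that captures a name win is a choice made here rather than documented processor behavior.

```go
package main

import (
	"fmt"
	"regexp"
)

// extractAll applies each pattern to text in order and merges the named
// groups of every pattern that matches. Patterns that do not match are
// skipped. Assumption: a name captured earlier is not replaced later.
func extractAll(text string, patterns ...string) (map[string]string, error) {
	fields := map[string]string{}
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, err
		}
		match := re.FindStringSubmatch(text)
		if match == nil {
			continue // this pattern contributes nothing
		}
		for i, name := range re.SubexpNames() {
			if name == "" {
				continue
			}
			if _, exists := fields[name]; !exists {
				fields[name] = match[i]
			}
		}
	}
	return fields, nil
}

func main() {
	text := "service=auth details=[user=alice ip=10.0.0.5]"
	fields, err := extractAll(text,
		`service=(?P<service>\w+)`, // primary pattern
		`user=(?P<user>\w+)`,       // additional patterns
		`ip=(?P<ip>[\d.]+)`,
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(fields["service"], fields["user"], fields["ip"])
	// prints: auth alice 10.0.0.5
}
```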