Grok

Parse | Elastic Compatible

Synopsis

Extracts structured fields from unstructured log messages using predefined patterns.

Schema

grok:
  - field: <ident>
  - patterns: <string[]>
  - description: <text>
  - if: <script>
  - ignore_failure: <boolean>
  - ignore_missing: <boolean>
  - on_failure: <processor[]>
  - on_success: <processor[]>
  - pattern_definitions: <map>
  - tag: <string>
  - trace_match: <boolean>

Configuration

| Field | Required | Default | Description |
|---|---|---|---|
| field | Y | - | Text field to extract patterns from |
| patterns | Y | - | List of patterns to try matching (first match wins) |
| description | N | - | Documentation note |
| if | N | - | Conditional expression |
| ignore_failure | N | false | Skip pattern match failures |
| ignore_missing | N | false | Skip if input field missing |
| on_failure | N | - | Error handling processors |
| on_success | N | - | Success handling processors |
| pattern_definitions | N | - | Custom pattern definitions |
| tag | N | - | Identifier for logging |
| trace_match | N | false | Track which pattern matched |

Details

The Grok processor combines predefined patterns to match and extract values from text fields. Each pattern reference has the form %{PATTERN_NAME:FIELD_NAME}, where PATTERN_NAME names the pattern to apply and FIELD_NAME names the field that receives the matched text.
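
To make the %{PATTERN_NAME:FIELD_NAME} syntax concrete, here is a minimal Python sketch of how such references can be expanded into an ordinary regular expression with named groups. It is an illustration only, not the processor's implementation: the three-entry `PATTERNS` table, the simplified `IP` regex, and the helper names `expand` and `grok_match` are all assumptions made for this example.

```python
import re
from typing import Optional

# Tiny illustrative pattern library (assumption); real Grok libraries ship
# many more patterns, and their IP pattern is far stricter than this one.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "NUMBER": r"-?\d+(?:\.\d+)?",
}

# Matches %{PATTERN_NAME} and %{PATTERN_NAME:FIELD_NAME}.
REFERENCE = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def expand(grok: str) -> str:
    """Replace each pattern reference with a (named) regex group."""
    def substitute(match):
        name, field = match.group(1), match.group(2)
        body = PATTERNS[name]
        # A field name turns the pattern into a named capture group.
        return f"(?P<{field}>{body})" if field else f"(?:{body})"
    return REFERENCE.sub(substitute, grok)


def grok_match(grok: str, text: str) -> Optional[dict]:
    """Return the extracted fields, or None when the pattern does not match."""
    match = re.search(expand(grok), text)
    return match.groupdict() if match else None


print(grok_match("%{IP:client} %{WORD:verb}", "10.0.0.5 GET /index.html"))
# {'client': '10.0.0.5', 'verb': 'GET'}
```

Every extracted value in this sketch is a string, which is why a separate conversion step is needed for numeric fields.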

The processor can also convert matched values when you append :type to the field name, e.g. %{NUMBER:duration:int}. It supports two conversions:

Integer (:int): converts matched values to 32-bit integers
Long (:long): converts matched values to 64-bit integers
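
The difference between the two widths matters for large values such as byte counts. The following Python sketch shows the ranges involved. It assumes signed integers and assumes that an out-of-range value is rejected; this page does not say how the processor itself handles overflow, so treat the `ValueError` as a property of the sketch only. The helper name `convert` is hypothetical.

```python
# Value ranges for the two conversions, assuming signed integers.
INT_RANGES = {
    "int": (-2**31, 2**31 - 1),    # 32-bit
    "long": (-2**63, 2**63 - 1),   # 64-bit
}


def convert(value: str, type_name: str) -> int:
    """Parse a matched string and check that it fits the requested width."""
    low, high = INT_RANGES[type_name]
    number = int(value)
    if not low <= number <= high:
        # Assumption: this sketch rejects out-of-range values; the
        # processor may behave differently.
        raise ValueError(f"{value} does not fit in :{type_name}")
    return number


print(convert("85", "int"))            # 85
print(convert("3000000000", "long"))   # 3000000000
```

In this sketch, `convert("3000000000", "int")` raises `ValueError`, because the value exceeds the 32-bit maximum of 2147483647.
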

Note: The processor tries the patterns in the order listed and stops at the first successful match, so order your patterns from most specific to most general.
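
The effect of ordering can be sketched in a few lines of Python. The example uses plain regular expressions with named groups in place of Grok patterns, and the helper name `first_match` is hypothetical; it shows the first-match-wins rule only, not the processor's internals.

```python
import re


def first_match(patterns, text):
    """Try each pattern in order; return (index, fields) for the first hit."""
    for index, pattern in enumerate(patterns):
        match = re.search(pattern, text)
        if match:
            return index, match.groupdict()
    return None


text = "Error: Connection timeout after 30s"

# The catch-all pattern comes first, so the specific one is never tried.
general_first = [r"(?P<message>.*)", r"Error: (?P<error_message>.*)"]
print(first_match(general_first, text))
# (0, {'message': 'Error: Connection timeout after 30s'})

# The specific pattern comes first; the catch-all remains as a fallback.
specific_first = [r"Error: (?P<error_message>.*)", r"(?P<message>.*)"]
print(first_match(specific_first, text))
# (0, {'error_message': 'Connection timeout after 30s'})
```

With the general pattern first, the more useful `error_message` field is never produced, which is the situation the ordering advice is meant to prevent.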

Warning: Complex patterns may impact performance. Monitor matching time, and consider optimizing patterns for frequently processed fields.
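
One way to get a rough feel for pattern cost before deployment is to time candidate expressions offline against sample lines. The Python sketch below does this with the standard `re` and `time` modules. It is an assumption-laden illustration: the helper `average_match_time` and the sample lines are hypothetical, Python's regex engine is not the processor's, and the numbers are useful only for comparing variants of the same pattern with each other.

```python
import re
import time


def average_match_time(pattern, lines, repeats=200):
    """Return the mean time in seconds for one search of `pattern` per line."""
    compiled = re.compile(pattern)
    start = time.perf_counter()
    for _ in range(repeats):
        for line in lines:
            compiled.search(line)
    return (time.perf_counter() - start) / (repeats * len(lines))


# Hypothetical sample lines; in practice use lines taken from the real field.
sample = [
    "Error: Connection timeout after 30s",
    "Info: cache warmed",
    "x" * 2000,
]

# Compare two variants of the same expression, here with and without a
# leading anchor (anchoring is one commonly suggested optimization).
unanchored = average_match_time(r"Error: (?P<error_message>.*)", sample)
anchored = average_match_time(r"^Error: (?P<error_message>.*)", sample)
print(f"unanchored: {unanchored:.2e} s/line, anchored: {anchored:.2e} s/line")
```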

Examples

Basic

Parsing an Apache log entry...

{
  "message": "192.168.1.1 - - [07/Jul/2023:12:34:56 +0000] \"GET /index.html HTTP/1.1\" 200 1234"
}

grok:
  - field: message
  - patterns:
      - "%{COMMONAPACHELOG}"

extracts structured fields:

{
  "clientip": "192.168.1.1",
  "timestamp": "07/Jul/2023:12:34:56 +0000",
  "verb": "GET",
  "request": "/index.html",
  "httpversion": "1.1",
  "response": 200,
  "bytes": 1234
}

Custom Patterns

Defining and using custom patterns...

{
  "log_entry": "User[123] Action=LOGIN Status=SUCCESS"
}

grok:
  - field: log_entry
  - pattern_definitions:
      USER_ID: "User\\[(?<user_id>\\d+)\\]"
      ACTION: "Action=(?<action>\\w+)"
      STATUS: "Status=(?<status>\\w+)"
  - patterns:
      - "%{USER_ID} %{ACTION} %{STATUS}"

creates structured output:

{
  "user_id": "123",
  "action": "LOGIN",
  "status": "SUCCESS"
}
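
The way custom definitions slot into a pattern can be reproduced outside the processor. The Python sketch below substitutes each bare %{NAME} reference with its definition and then runs the resulting regular expression. It is an illustration under stated assumptions: the helper name `to_python_regex` is hypothetical, and the rewrite from (?<name>...) to (?P<name>...) is needed only because Python spells named groups differently.

```python
import re

# The definitions from the example above, with YAML escaping removed.
pattern_definitions = {
    "USER_ID": r"User\[(?<user_id>\d+)\]",
    "ACTION": r"Action=(?<action>\w+)",
    "STATUS": r"Status=(?<status>\w+)",
}


def to_python_regex(grok, definitions):
    """Expand bare %{NAME} references and adapt named groups for Python."""
    expanded = re.sub(r"%\{(\w+)\}", lambda m: definitions[m.group(1)], grok)
    # Rewrite (?<name> to (?P<name>, leaving lookbehinds (?<= and (?<! alone.
    return re.sub(r"\(\?<(?![=!])", "(?P<", expanded)


regex = to_python_regex("%{USER_ID} %{ACTION} %{STATUS}", pattern_definitions)
match = re.search(regex, "User[123] Action=LOGIN Status=SUCCESS")
fields = match.groupdict() if match else None
print(fields)
# {'user_id': '123', 'action': 'LOGIN', 'status': 'SUCCESS'}
```

Here the field names come from the named groups inside each definition, so the references in the pattern need no :FIELD_NAME suffix.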

Conversion

Converting matched values to specific types...

{
  "metric": "cpu_usage=85 memory=1024"
}

grok:
  - field: metric
  - patterns:
      - "cpu_usage=%{NUMBER:cpu:int} memory=%{NUMBER:memory:long}"

is automatically handled:

{
  "cpu": 85,
  "memory": 1024
}

Syslog Parsing

Parsing a structured syslog message...

{
  "message": "<134>1 2022-07-06T15:53:08Z host app 2700 - [action:\"login\"; status:\"failed\"]"
}

grok:
  - field: message
  - patterns:
      - "%{SYSLOG5424PRI}%{NONNEGINT:ver} +(?:%{TIMESTAMP_ISO8601:ts}|-) +(?:%{IPORHOST:host}|-) +(?:%{SYSLOG5424PRINTASCII:app}) +(?:%{NONNEGINT:pid}|-) +(?:%{SYSLOG5424PRINTASCII:msgid}|-) +\\[%{GREEDYDATA:structured_data}\\]"

extracts the syslog fields:

{
  "ver": "1",
  "ts": "2022-07-06T15:53:08Z",
  "host": "host",
  "app": "app",
  "pid": "2700",
  "structured_data": "action:\"login\"; status:\"failed\""
}

Multiple Patterns

Trying multiple patterns in turn...

{
  "message": "Error: Connection timeout after 30s"
}

grok:
  - field: message
  - patterns:
      - "Error: %{GREEDYDATA:error_message}"
      - "Warning: %{GREEDYDATA:warning_message}"
      - "Info: %{GREEDYDATA:info_message}"

where the first matches:

{
  "error_message": "Connection timeout after 30s"
}