Grok

Parse Elastic Compatible

Synopsis

Extracts structured fields from unstructured log messages using predefined patterns.

Reusable grok patterns can be created and managed in the Library.

Schema

- grok:
    field: <ident>
    patterns: <string[]>
    break_on_match: <boolean>
    description: <text>
    if: <script>
    ignore_failure: <boolean>
    ignore_missing: <boolean>
    on_failure: <processor[]>
    on_success: <processor[]>
    pattern_definitions: <map>
    tag: <string>
    trace_match: <boolean>

Configuration

The following fields are used to define the processor:

| Field | Required | Default | Description |
|---|---|---|---|
| field | Y | - | Text field to extract patterns from |
| patterns | Y | - | List of patterns to try matching. See below |
| break_on_match | N | true | Stop at first matching pattern (true) or evaluate all patterns and merge captures (false) |
| description | N | - | Documentation note |
| if | N | - | Conditional expression |
| ignore_failure | N | false | Skip pattern match failures |
| ignore_missing | N | false | Skip if input field missing |
| on_failure | N | - | Error handling processors |
| on_success | N | - | Success handling processors |
| pattern_definitions | N | - | Custom pattern definitions |
| tag | N | - | Identifier for logging |
| trace_match | N | false | Track which pattern matched |

Built-in Patterns

| Category | Patterns |
|---|---|
| General | DATA, GREEDYDATA, NOTSPACE, SPACE, WORD |
| Numeric | BASE10NUM, INT, NUMBER |
| Networking | HOSTNAME, IP, IPV4, IPV6, MAC |
| Date and Time | DATESTAMP, DATESTAMP_RFC822, TIMESTAMP_ISO8601 |
| File System | FILENAME, PATH |
| HTTP | HTTPDATE, HTTPDERRORLOG, HTTPDUSER |
| System | SYSLOGBASE, SYSLOGHOST, SYSLOGTIMESTAMP |
| Other | EMAILADDRESS, URIPARAM, URIPATH, UUID |

The Grok processor combines pre-defined patterns to match and extract values from text fields. It uses a pattern syntax that combines pattern names with field names in the format %{PATTERN_NAME:FIELD_NAME}.
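
As a rough illustration of the %{PATTERN_NAME:FIELD_NAME} syntax, the following Python sketch expands each reference into a regular-expression fragment with a named capture group. The pattern table is a simplified stand-in, not the processor's real built-in definitions, and the helper names (`to_regex`, `grok`) are hypothetical. The optional :type suffix is not handled here.

```python
import re
from typing import Optional

# Simplified stand-ins for a few built-in patterns (assumption: the real
# definitions are more thorough and are not reproduced here).
SAMPLE_PATTERNS = {
    "WORD": r"\b\w+\b",
    "INT": r"[+-]?\d+",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
    "IPV4": r"\d{1,3}(?:\.\d{1,3}){3}",
    "GREEDYDATA": r".*",
}

# Matches %{NAME} and %{NAME:field}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def to_regex(expression: str) -> str:
    """Replace every grok reference with a regex fragment."""
    def expand(ref: re.Match) -> str:
        name, field = ref.group(1), ref.group(2)
        body = SAMPLE_PATTERNS[name]
        # A reference with a field name becomes a named capture group;
        # one without a field name only has to match.
        return f"(?P<{field}>{body})" if field else f"(?:{body})"
    return GROK_REF.sub(expand, expression)


def grok(expression: str, text: str) -> Optional[dict]:
    """Return the named captures, or None when the text does not match."""
    match = re.search(to_regex(expression), text)
    return match.groupdict() if match else None


print(grok("%{IPV4:client} %{WORD:method} %{NUMBER:duration}", "10.0.0.1 GET 0.25"))
# {'client': '10.0.0.1', 'method': 'GET', 'duration': '0.25'}
```

Every capture comes back as a string, which is why a type suffix is needed when a numeric value is wanted.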

The processor provides type conversion by appending :type to field names, e.g. %{NUMBER:duration:int}. It supports two types of conversion:

- Integer (:int): converts matched values to 32-bit integers
- Long (:long): converts matched values to 64-bit integers
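
The practical difference between the two suffixes is the range of values each can hold. The Python sketch below models that with explicit range checks; the `convert` helper is hypothetical, and rejecting out-of-range values with an error is an assumption, since the processor's overflow behaviour is not described here.

```python
# Signed ranges for the two supported conversions.
RANGES = {
    "int": (-2**31, 2**31 - 1),   # 32-bit
    "long": (-2**63, 2**63 - 1),  # 64-bit
}


def convert(value, type_name=None):
    """Return the capture unchanged, or as an integer when a type is given."""
    if type_name is None:
        return value  # untyped captures stay strings
    low, high = RANGES[type_name]
    number = int(value)
    if not low <= number <= high:
        # Assumption: out-of-range values are rejected.
        raise ValueError(f"{value} does not fit in {type_name}")
    return number


print(convert("85"))                  # 85 as a string
print(convert("85", "int"))           # 85 as an integer
print(convert("3000000000", "long"))  # too large for :int, fits in :long
```

Values that can exceed 2147483647, such as byte counters, are safer with :long.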
Note: By default (break_on_match: true), pattern matching stops at the first successful match, so order your patterns from most specific to most general. Set break_on_match: false to evaluate every pattern and merge captures from each successful match.

Caution: Complex patterns may impact performance. Monitor matching time, and consider optimizing patterns for frequently processed fields.

Examples

Basic

Parsing an Apache log entry...

{
  "message": "192.168.1.1 - - [07/Jul/2023:12:34:56 +0000] \"GET /index.html HTTP/1.1\" 200 1234"
}

- grok:
    field: message
    patterns:
      - "%{COMMONAPACHELOG}"

extracts structured fields:

{
  "clientip": "192.168.1.1",
  "timestamp": "07/Jul/2023:12:34:56 +0000",
  "verb": "GET",
  "request": "/index.html",
  "httpversion": "1.1",
  "response": 200,
  "bytes": 1234
}

Custom

Defining and using custom patterns...

{
  "log_entry": "User[123] Action=LOGIN Status=SUCCESS"
}

- grok:
    field: log_entry
    pattern_definitions:
      USER_ID: "User\\[(?<user_id>\\d+)\\]"
      ACTION: "Action=(?<action>\\w+)"
      STATUS: "Status=(?<status>\\w+)"
    patterns:
      - "%{USER_ID} %{ACTION} %{STATUS}"

creates structured output:

{
  "user_id": "123",
  "action": "LOGIN",
  "status": "SUCCESS"
}
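
One way to picture what pattern_definitions does is as text substitution: each %{NAME} reference is replaced by its definition until a plain regular expression remains. The Python sketch below follows that idea for the example above. The `expand` helper is hypothetical, and the definitions use Python's `(?P<name>...)` spelling for named groups because Python's `re` module does not accept `(?<name>...)`.

```python
import re

REF = re.compile(r"%\{(\w+)\}")


def expand(expression, definitions):
    """Inline %{NAME} references until none remain.

    Assumes the definitions are not circular; a circular definition
    would make this loop run forever.
    """
    while REF.search(expression):
        expression = REF.sub(lambda ref: definitions[ref.group(1)], expression)
    return expression


# The custom definitions from the example, in Python regex syntax.
definitions = {
    "USER_ID": r"User\[(?P<user_id>\d+)\]",
    "ACTION": r"Action=(?P<action>\w+)",
    "STATUS": r"Status=(?P<status>\w+)",
}

regex = expand("%{USER_ID} %{ACTION} %{STATUS}", definitions)
print(regex)
# User\[(?P<user_id>\d+)\] Action=(?P<action>\w+) Status=(?P<status>\w+)

print(re.search(regex, "User[123] Action=LOGIN Status=SUCCESS").groupdict())
# {'user_id': '123', 'action': 'LOGIN', 'status': 'SUCCESS'}
```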

Multiple

Trying multiple patterns in turn...

{
  "message": "Error: Connection timeout after 30s"
}

- grok:
    field: message
    patterns:
      - "Error: %{GREEDYDATA:error_message}"
      - "Warning: %{GREEDYDATA:warning_message}"
      - "Info: %{GREEDYDATA:info_message}"

where the first matches:

{
  "error_message": "Connection timeout after 30s"
}

Cumulative

Evaluating all patterns to capture multiple segments...

{
  "message": "user=alice action=login src=10.0.0.1"
}

- grok:
    field: message
    break_on_match: false
    patterns:
      - "user=%{WORD:user}"
      - "action=%{WORD:action}"
      - "src=%{IP:src_ip}"

Every successful pattern contributes its captures:

{
  "user": "alice",
  "action": "login",
  "src_ip": "10.0.0.1"
}
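
The effect of break_on_match can be modelled as a loop over the pattern list that either stops at the first hit or keeps going and merges the captures. The Python sketch below is a minimal model under stated assumptions: the patterns are hand-written regular expressions standing in for the grok expressions above, the `match_patterns` helper is hypothetical, and letting a later capture overwrite an earlier one with the same name is an assumption rather than documented behaviour.

```python
import re


def match_patterns(patterns, text, break_on_match=True):
    """Try each pattern in order and collect named captures."""
    captures = {}
    matched = False
    for pattern in patterns:
        match = re.search(pattern, text)  # unanchored, may match mid-string
        if match is None:
            continue
        matched = True
        # Assumption: on a name clash the later capture wins.
        captures.update(match.groupdict())
        if break_on_match:
            break  # stop at the first successful pattern
    return captures if matched else None


# Regex stand-ins for the three grok patterns in the example.
KV_PATTERNS = [
    r"user=(?P<user>\w+)",
    r"action=(?P<action>\w+)",
    r"src=(?P<src_ip>\d{1,3}(?:\.\d{1,3}){3})",
]
MESSAGE = "user=alice action=login src=10.0.0.1"

print(match_patterns(KV_PATTERNS, MESSAGE))
# {'user': 'alice'}
print(match_patterns(KV_PATTERNS, MESSAGE, break_on_match=False))
# {'user': 'alice', 'action': 'login', 'src_ip': '10.0.0.1'}
```

With the default setting only the first pattern's captures survive, which is why the cumulative configuration sets break_on_match: false.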

Conversion

Converting matched values to specific types...

{
  "metric": "cpu_usage=85 memory=1024"
}

- grok:
    field: metric
    patterns:
      - "cpu_usage=%{NUMBER:cpu:int} memory=%{NUMBER:memory:long}"

is automatically handled:

{
  "cpu": 85,
  "memory": 1024
}

Parsing

Parsing a structured syslog message...

{
  "message": "<134>1 2022-07-06T15:53:08Z host app 2700 - [action:\"login\"; status:\"failed\"]"
}

- grok:
    field: message
    patterns:
      - "%{SYSLOG5424PRI}%{NONNEGINT:ver} +(?:%{TIMESTAMP_ISO8601:ts}|-) +(?:%{IPORHOST:host}|-) +(?:%{SYSLOG5424PRINTASCII:app}) +(?:%{NONNEGINT:pid}|-) +(?:%{SYSLOG5424PRINTASCII:msgid}|-) +\\[%{GREEDYDATA:structured_data}\\]"

extracts the syslog fields:

{
  "ver": "1",
  "ts": "2022-07-06T15:53:08Z",
  "host": "host",
  "app": "app",
  "pid": "2700",
  "structured_data": "action:\"login\"; status:\"failed\""
}