Normalization
Normalization is the stage between ingestion from sources and forwarding to targets. It coalesces log data from diverse sources into consistent formats, enabling unified handling across different logging systems.
Log Formats
The processor supports several widely used log formats:
Generic
Format | Notation | Key Identifier | Layout Characteristics | Example Fields |
---|---|---|---|---|
Elastic Common Schema (ECS) | Dot notation with lowercase | @timestamp | Hierarchical structure | source.ip , network.direction |
Splunk Common Information Model (CIM) | Underscore with lowercase | _time | Flat structure | src_ip , network_direction |
Advanced Security Information Model (ASIM) | PascalCase | TimeGenerated | Explicit names | SourceIp , NetworkDirection |
Security-specific
Format | Description | Key Identifier | Example Fields |
---|---|---|---|
Common Event Format (CEF) | ArcSight's standard format | rt (receiptTime) | networkUser , sourceAddress |
Log Event Extended Format (LEEF) | IBM QRadar's format | devTime | networkUser , srcAddr |
Common Security Log (CSL) | Microsoft Sentinel's format | TimeGenerated | NetworkUser , SourceAddress |
Format Detection
Source formats can be detected automatically from characteristic fields, for example:
Context | Field | Format |
---|---|---|
Timestamp | @timestamp | ECS |
Timestamp | _time | CIM |
Timestamp | TimeGenerated | ASIM/CSL |
Security | rt | CEF |
Security | devTime | LEEF |
CSL detection | TimeGenerated + LogSeverity | CSL |
CSL detection | TimeGenerated only | ASIM |
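The detection rules in the table can be sketched in a few lines of Python. This is an illustration only, not the processor's implementation: the function name `detect_format`, the order in which fields are checked, and the assumption that an event is a flat mapping of field names to values are all hypothetical.

```python
from typing import Mapping, Optional


def detect_format(event: Mapping[str, object]) -> Optional[str]:
    """Guess the source format of a flat event from its characteristic fields.

    The fields come from the detection table; the check order is an assumption.
    """
    if "@timestamp" in event:
        return "ecs"
    if "_time" in event:
        return "cim"
    if "TimeGenerated" in event:
        # TimeGenerated is shared by ASIM and CSL; LogSeverity marks the event as CSL.
        return "csl" if "LogSeverity" in event else "asim"
    if "rt" in event:
        return "cef"
    if "devTime" in event:
        return "leef"
    # No characteristic field found, so the format stays undetermined.
    return None
```

For example, `detect_format({"TimeGenerated": "2024-01-01T00:00:00Z", "LogSeverity": 5})` returns `"csl"`, while the same event without `LogSeverity` returns `"asim"`.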
Conversion
Casing and Delimiters
Each format follows specific naming conventions:
Format | Example Fields |
---|---|
ECS | source.ip , event.severity |
CIM | src_ip , event_severity |
ASIM | SourceIp , EventSeverity |
CEF | sourceAddress , eventSeverity |
LEEF | srcAddr , evtSev |
CSL | SourceIP , EventSeverity |
Complex format conversions may impact performance.
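The casing and delimiter side of a conversion can be illustrated with a small Python sketch. It is not the processor's code: all function names are hypothetical, and it handles only the naming style (dot notation, underscores, PascalCase, camelCase). Abbreviated names such as `src_ip` or `srcAddr` cannot be derived by recasing alone and need an explicit field mapping.

```python
import re
from typing import List


def split_words(name: str) -> List[str]:
    """Split a field name into lowercase words on dots, underscores, and case boundaries."""
    words: List[str] = []
    for part in re.split(r"[._]", name):
        # Matches acronym runs ("IP"), capitalized or lowercase words, and digit runs.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [word.lower() for word in words]


def to_dotted(name: str) -> str:
    """ECS style: lowercase words joined by dots."""
    return ".".join(split_words(name))


def to_underscore(name: str) -> str:
    """CIM style: lowercase words joined by underscores."""
    return "_".join(split_words(name))


def to_pascal(name: str) -> str:
    """ASIM style: every word capitalized, no delimiter."""
    return "".join(word.capitalize() for word in split_words(name))


def to_camel(name: str) -> str:
    """CEF-like style: first word lowercase, remaining words capitalized."""
    words = split_words(name)
    if not words:
        return ""
    return words[0] + "".join(word.capitalize() for word in words[1:])
```

With these helpers, `to_pascal("source.ip")` gives `SourceIp`, `to_underscore("event.severity")` gives `event_severity`, and `to_dotted("EventSeverity")` gives `event.severity`.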
Field Mapping
Common network fields can be identified by context across the supported formats:
Format | Source IP | Destination IP | Direction |
---|---|---|---|
ecs | source.ip | destination.ip | network.direction |
cim | src_ip | dest_ip | network_direction |
asim | SourceIp | DstIp | NetworkDirection |
cef | src | dst | networkDirection |
leef | srcAddr | dstAddr | netDir |
csl | SourceIp | DestinationIp | NetworkDirection |
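One way to picture this mapping is as a lookup table keyed by format. The Python sketch below copies the rows of the table as they appear here; the names `FIELD_NAMES` and `remap_fields` are hypothetical, and events are assumed to be flat dictionaries whose keys are the literal field names (for example `"source.ip"` as a single key rather than a nested object).

```python
from typing import Dict, Tuple

# Rows copied from the table: format -> (source IP, destination IP, direction).
FIELD_NAMES: Dict[str, Tuple[str, str, str]] = {
    "ecs": ("source.ip", "destination.ip", "network.direction"),
    "cim": ("src_ip", "dest_ip", "network_direction"),
    "asim": ("SourceIp", "DstIp", "NetworkDirection"),
    "cef": ("src", "dst", "networkDirection"),
    "leef": ("srcAddr", "dstAddr", "netDir"),
    "csl": ("SourceIp", "DestinationIp", "NetworkDirection"),
}


def remap_fields(
    event: Dict[str, object], source_format: str, target_format: str
) -> Dict[str, object]:
    """Rename the known network fields of a flat event; other fields pass through."""
    # Pair each source field name with the target name in the same column.
    rename = dict(zip(FIELD_NAMES[source_format], FIELD_NAMES[target_format]))
    return {rename.get(key, key): value for key, value in event.items()}
```

Calling `remap_fields({"src": "10.0.0.1", "dst": "10.0.0.2", "msg": "x"}, "cef", "ecs")` returns `{"source.ip": "10.0.0.1", "destination.ip": "10.0.0.2", "msg": "x"}`. Fields that are not in the table keep their original names.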
Configuration
Basic
Convert from ECS to ASIM format:
normalize:
  source_format: ecs
  target_format: asim
Field-specific
Convert a specific network field:
normalize:
  field: network_data
  source_format: cef
  target_format: ecs
Auto-detection
Let the processor detect the source format:
normalize:
  target_format: cim
Preprocessing
Fields are standardized with normalize for conversion between the ECS, CIM, ASIM, CEF, LEEF and CSL formats (see the Log Formats and Conversion sections above). Values are formatted for uniform casing with uppercase and lowercase processors when required by the target format's naming conventions.
Postprocessing
Fields are optimized for storage and queries using format conversion with the normalize processor (see the Conversion and Field Mapping sections above). For Microsoft Sentinel integration, data is prepared by converting to the ASIM format with normalize (see the Log Formats tables).
Complex format conversions may impact processing performance and delivery latency.