Normalization
Normalization is the pipeline stage between ingestion from sources and forwarding to targets. It coalesces log data from diverse sources into consistent formats, enabling unified handling across different logging systems.
Log Formats
The processor supports several widely used log formats:
Generic
Format | Notation | Key Identifier | Layout Characteristics | Example Fields |
---|---|---|---|---|
Elastic Common Schema (ECS) | Dot notation with lowercase | @timestamp | Hierarchical structure | source.ip , network.direction |
Splunk Common Information Model (CIM) | Underscore with lowercase | _time | Flat structure | src_ip , network_direction |
Advanced Security Information Model (ASIM) | PascalCase | TimeGenerated | Explicit names | SourceIp , NetworkDirection |
Security-specific
Format | Description | Key Identifier | Example Fields |
---|---|---|---|
Common Event Format (CEF) | ArcSight's standard format | rt (receiptTime) | networkUser , sourceAddress |
Log Event Extended Format (LEEF) | IBM QRadar's format | devTime | networkUser , srcAddr |
Common Security Log (CSL) | Microsoft Sentinel's format | TimeGenerated | NetworkUser , SourceAddress |
Format Detection
Source formats can be automatically detected using certain characteristic fields, e.g.:
Context | Field | Format |
---|---|---|
Timestamp | @timestamp | ECS |
Timestamp | _time | CIM |
Timestamp | TimeGenerated | ASIM/CSL |
Security | rt | CEF |
Security | devTime | LEEF |
CSL detection | TimeGenerated + LogSeverity | CSL |
CSL detection | TimeGenerated only | ASIM |
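The detection rules in the table above can be sketched as a small lookup. This is only an illustration, not the processor's actual implementation: the function name `detect_format`, the rule order, and the assumption that records are flat dictionaries are all hypothetical.

```python
from typing import Any, Mapping, Optional

# Characteristic fields taken from the table above.
# Checked in order; the first matching rule wins (the order is an assumption).
_SINGLE_FIELD_RULES = (
    ("@timestamp", "ecs"),
    ("_time", "cim"),
    ("rt", "cef"),
    ("devTime", "leef"),
)


def detect_format(record: Mapping[str, Any]) -> Optional[str]:
    """Guess the source format of a flat log record, or return None."""
    for field, fmt in _SINGLE_FIELD_RULES:
        if field in record:
            return fmt
    if "TimeGenerated" in record:
        # TimeGenerated is shared by ASIM and CSL; LogSeverity breaks the tie.
        return "csl" if "LogSeverity" in record else "asim"
    return None
```

For example, `detect_format({"TimeGenerated": "2024-01-01T00:00:00Z", "LogSeverity": 3})` returns `"csl"`, while the same record without `LogSeverity` is classified as `"asim"`.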
Conversion
Casing and Delimiters
Each format follows specific naming conventions:
Format | Example Fields |
---|---|
ECS | source.ip , event.severity |
CIM | src_ip , event_severity |
ASIM | SourceIp , EventSeverity |
CEF | sourceAddress , eventSeverity |
LEEF | srcAddr , evtSev |
CSL | SourceIP , EventSeverity |
Complex format conversions may impact performance.
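The purely mechanical part of these conventions (delimiter and casing) can be sketched as follows. The helper names are hypothetical and this is not the processor's own code. Note that it only re-cases names: abbreviated forms such as `src_ip` or `srcAddr` cannot be derived this way and need an explicit field mapping, and leading symbols such as the `@` in `@timestamp` are dropped.

```python
import re


def split_words(name: str) -> list[str]:
    """Split a field name on dots, underscores, and case boundaries."""
    words = []
    for part in re.split(r"[._]", name):
        # Either an optionally capitalized lowercase word, or an acronym run.
        words.extend(re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", part))
    return [w.lower() for w in words]


def to_dotted(name: str) -> str:
    """Dot notation with lowercase (ECS-style)."""
    return ".".join(split_words(name))


def to_underscore(name: str) -> str:
    """Underscore with lowercase (CIM-style)."""
    return "_".join(split_words(name))


def to_pascal(name: str) -> str:
    """PascalCase (ASIM-style)."""
    return "".join(w.capitalize() for w in split_words(name))


def to_camel(name: str) -> str:
    """camelCase (CEF-style)."""
    words = split_words(name)
    if not words:
        return ""
    return words[0] + "".join(w.capitalize() for w in words[1:])
```

With these helpers, `to_pascal("source.ip")` gives `SourceIp` and `to_camel("event.severity")` gives `eventSeverity`, matching the ASIM and CEF examples above.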
Field Mapping
Common network fields can be matched across formats by their context:
Format | Source IP | Destination IP | Direction |
---|---|---|---|
ECS | source.ip | destination.ip | network.direction |
CIM | src_ip | dest_ip | network_direction |
ASIM | SourceIp | DstIp | NetworkDirection |
CEF | src | dst | networkDirection |
LEEF | srcAddr | dstAddr | netDir |
CSL | SourceIp | DestinationIp | NetworkDirection |
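A mapping like the one in the table above can be expressed as a lookup keyed by logical field. The sketch below is illustrative only: `FIELD_MAP`, `rename_fields`, and the logical keys are hypothetical names, records are assumed to be flat dictionaries whose keys are the literal field names (including dotted ones such as `source.ip`), and fields without a mapping are passed through unchanged.

```python
from typing import Any, Dict

# Field names copied from the table above: logical field -> format -> name.
FIELD_MAP: Dict[str, Dict[str, str]] = {
    "source_ip": {
        "ecs": "source.ip", "cim": "src_ip", "asim": "SourceIp",
        "cef": "src", "leef": "srcAddr", "csl": "SourceIp",
    },
    "destination_ip": {
        "ecs": "destination.ip", "cim": "dest_ip", "asim": "DstIp",
        "cef": "dst", "leef": "dstAddr", "csl": "DestinationIp",
    },
    "direction": {
        "ecs": "network.direction", "cim": "network_direction",
        "asim": "NetworkDirection", "cef": "networkDirection",
        "leef": "netDir", "csl": "NetworkDirection",
    },
}


def rename_fields(record: Dict[str, Any], source: str, target: str) -> Dict[str, Any]:
    """Rename known fields from the source format to the target format."""
    renames = {names[source]: names[target] for names in FIELD_MAP.values()}
    # Unknown keys are kept as-is so no data is dropped.
    return {renames.get(key, key): value for key, value in record.items()}
```

For instance, `rename_fields({"src": "10.0.0.1", "dst": "10.0.0.2"}, "cef", "ecs")` yields a record keyed by `source.ip` and `destination.ip`.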
Configuration
Basic
Convert from ECS to ASIM format:
normalize:
  - source_format: ecs
  - target_format: asim
Field-specific
Convert a specific network field:
normalize:
  - field: network_data
  - source_format: cef
  - target_format: ecs
Auto-detection
Let the processor detect the source format:
normalize:
  - target_format: cim
Best Practices
For data integrity, always validate transformed logs against originals, keep original fields when possible for debugging, and document format-specific transformations.
For performance, do the normalization early in the pipeline, cache results for lookup when possible, and monitor transformation overhead.
For error handling, use ignore_failure and implement fallback mechanisms. Also, do not forget to test with diverse samples.
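Two of these practices, keeping the original fields and falling back on failure, can be sketched in a few lines. This is a generic illustration and not how the processor's ignore_failure option is implemented; the wrapper name `normalize_safely` and the keys `original` and `normalize_error` are hypothetical.

```python
from typing import Any, Callable, Dict

Record = Dict[str, Any]


def normalize_safely(record: Record, convert: Callable[[Record], Record]) -> Record:
    """Apply convert; on failure, pass the record through with an error marker."""
    try:
        result = dict(convert(record))
    except Exception as exc:
        # Fallback: forward the untouched record, flagged for later review.
        return {**record, "normalize_error": str(exc)}
    # Keep the original alongside the converted fields for debugging.
    result["original"] = dict(record)
    return result
```

A record that converts cleanly carries its source under `original`, which also makes it straightforward to validate the transformed log against the input; a record that fails is forwarded unchanged apart from the error marker.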