Check Schema

Flow Control

Synopsis

Validates event data against schemas defined in Parquet or Avro format, including ASIM, OCSF, UDM, and custom schemas. Detects schema drift by identifying missing fields, extra fields, and type mismatches.

Custom schemas can be created and managed in the Library.

Schema

- check_schema:
    schema: <string>
    target_field: <string>
    schema_type: <string>
    requirement_filter: <string>
    check_mode: <string>
    validate_recommended: <boolean>
    validate_optional: <boolean>
    disabled: <boolean>
    on_missing: <processor[]>
    on_extra: <processor[]>
    on_type_mismatch: <processor[]>
    description: <text>
    if: <script>
    ignore_failure: <boolean>
    on_failure: <processor[]>
    on_success: <processor[]>
    tag: <string>

Configuration

| Field | Required | Default | Description |
|---|---|---|---|
| schema | Y | - | Schema name to validate against (e.g., ASimNetworkSessionLogs, ASimAuthenticationEventLogs, ocsf_network_activity) |
| target_field | Y | - | Field name to store validation results |
| schema_type | N | parquet | Schema format: parquet or avro |
| requirement_filter | N | all | Filter for schema field requirements |
| check_mode | N | missing | Validation mode: missing, extra, or both |
| validate_recommended | N | false | Include recommended fields in validity check |
| validate_optional | N | false | Include optional fields in validity check |
| disabled | N | false | Disable the processor without removing it from the pipeline |
| on_missing | N | - | Processors to execute when missing fields are detected |
| on_extra | N | - | Processors to execute when extra fields are detected |
| on_type_mismatch | N | - | Processors to execute when type mismatches are detected |
| description | N | - | Explanatory note |
| if | N | - | Condition to run |
| ignore_failure | N | false | See Handling Failures |
| on_failure | N | - | See Handling Failures |
| on_success | N | - | See Handling Success |
| tag | N | - | Identifier |

Details

The check_schema processor validates events against schema definitions and detects the schema drift that occurs when vendor log formats change unexpectedly. Schemas are loaded as Parquet or Avro definitions, selected by schema_type. This covers ASIM, OCSF, and UDM schemas as well as custom schemas placed in the package or user schema directories.

Validation Result Structure: Results are written to the target_field as a structured object:

{
  "is_valid": false,
  "missing_required_fields": ["EventSchema", "EventVendor"],
  "missing_recommended_fields": ["DvcAction", "EventSeverity"],
  "missing_optional_fields": ["SrcNatIpAddr"],
  "extra_fields": ["CustomField1"],
  "type_mismatches": [
    {
      "field": "EventCount",
      "expected_type": "INT32",
      "actual_type": "STRING"
    }
  ]
}
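A downstream consumer may want to flatten this object into readable findings. The following is a minimal Python sketch of that idea, not part of the processor itself. The function name summarize_drift is hypothetical, and the sketch assumes that keys can be absent from the result (as in some of the examples later on this page), so every lookup defaults to an empty list.

```python
from __future__ import annotations


def summarize_drift(result: dict) -> list[str]:
    """Turn a check_schema result object into human-readable findings."""
    findings = []
    for level in ("required", "recommended", "optional"):
        # Assumption: a key may be missing from the result, so default to [].
        for name in result.get(f"missing_{level}_fields", []):
            findings.append(f"missing {level} field: {name}")
    for name in result.get("extra_fields", []):
        findings.append(f"extra field: {name}")
    for mismatch in result.get("type_mismatches", []):
        findings.append(
            f"type mismatch: {mismatch['field']} "
            f"expected {mismatch['expected_type']}, got {mismatch['actual_type']}"
        )
    return findings


# The result object shown above, as a Python dict.
example_result = {
    "is_valid": False,
    "missing_required_fields": ["EventSchema", "EventVendor"],
    "missing_recommended_fields": ["DvcAction", "EventSeverity"],
    "missing_optional_fields": ["SrcNatIpAddr"],
    "extra_fields": ["CustomField1"],
    "type_mismatches": [
        {"field": "EventCount", "expected_type": "INT32", "actual_type": "STRING"}
    ],
}

for finding in summarize_drift(example_result):
    print(finding)
```
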

Validation Levels:

  • Required fields: Always checked when check_mode includes missing. Missing required fields make is_valid false.
  • Recommended fields: Reported but don't affect validity unless validate_recommended: true.
  • Optional fields: Reported but don't affect validity unless validate_optional: true.
  • Extra fields: Detected when check_mode includes extra. Never affect validity (informational only).
  • Type mismatches: Checked only for fields that are present in the event. A mismatch affects is_valid according to the requirement level of the field it occurs on.
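The rules above can be sketched as a small Python function. This is an illustration of the documented behavior under stated assumptions, not the processor's actual implementation. The function name compute_is_valid and the mismatch_levels parameter (the requirement level of each field that has a type mismatch) are hypothetical. Extra fields are left out on purpose because they never affect validity.

```python
from typing import Iterable


def compute_is_valid(
    missing_required: Iterable[str],
    missing_recommended: Iterable[str] = (),
    missing_optional: Iterable[str] = (),
    mismatch_levels: Iterable[str] = (),
    validate_recommended: bool = False,
    validate_optional: bool = False,
) -> bool:
    """Return True when no enforced requirement level has a problem."""
    # Required fields are always enforced; the other levels are opt-in.
    enforced = {"required"}
    if validate_recommended:
        enforced.add("recommended")
    if validate_optional:
        enforced.add("optional")

    problems = {
        "required": list(missing_required),
        "recommended": list(missing_recommended),
        "optional": list(missing_optional),
    }
    # Assumption: a type mismatch counts at the level of the field it is on.
    for level in mismatch_levels:
        problems.setdefault(level, []).append("type mismatch")

    return not any(problems[level] for level in enforced)


# Missing recommended fields are only reported by default...
print(compute_is_valid([], ["DvcAction", "EventSeverity"]))
# ...but count against validity once validate_recommended is enabled.
print(compute_is_valid([], ["DvcAction", "EventSeverity"], validate_recommended=True))
```

With the default flags the first call yields True and the second yields False, which matches the Basic ASIM Validation and Strict Validation examples on this page.
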

Check Modes:

  • missing: Only detect missing fields
  • extra: Only detect extra fields not in schema
  • both: Detect both missing and extra fields
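Conceptually, the three modes correspond to set differences between the event's field names and the schema's field names. The Python sketch below illustrates this with a hypothetical helper, detect_drift, and a made-up three-field schema. It ignores requirement levels and types, and it is not how the processor is implemented.

```python
def detect_drift(event_fields, schema_fields, check_mode="missing"):
    """Compare field names by set difference according to check_mode."""
    if check_mode not in ("missing", "extra", "both"):
        raise ValueError(f"unsupported check_mode: {check_mode!r}")

    event, schema = set(event_fields), set(schema_fields)
    drift = {}
    if check_mode in ("missing", "both"):
        # In the schema but not in the event.
        drift["missing"] = sorted(schema - event)
    if check_mode in ("extra", "both"):
        # In the event but not in the schema.
        drift["extra"] = sorted(event - schema)
    return drift


# Hypothetical schema and event, for illustration only.
schema_fields = ["TimeGenerated", "EventCount", "EventSchema"]
event_fields = ["TimeGenerated", "EventCount", "ExtraField1"]

print(detect_drift(event_fields, schema_fields, "missing"))
print(detect_drift(event_fields, schema_fields, "extra"))
print(detect_drift(event_fields, schema_fields, "both"))
```
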

Metadata Fields: Fields prefixed with @ (like @timestamp, @metadata) are automatically ignored during validation.
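As a rough illustration of this rule, the Python sketch below drops @-prefixed keys before any comparison takes place. The helper name validatable_fields is hypothetical, and the sketch assumes the rule applies to top-level keys only, which the text above does not specify.

```python
def validatable_fields(event: dict) -> dict:
    """Return the event without keys that start with '@'."""
    # Assumption: only top-level keys are inspected.
    return {key: value for key, value in event.items() if not key.startswith("@")}


event = {
    "@timestamp": "2024-01-01T12:00:00Z",
    "@metadata": {"pipeline": "example"},
    "TimeGenerated": "2024-01-01T12:00:00Z",
    "EventCount": 1,
}

print(sorted(validatable_fields(event)))
```

Under this sketch, @timestamp and @metadata would never be reported as extra fields, even with check_mode set to extra or both.
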

For integration patterns, see Schema Drift Detection and Multi-Tier Pipelines.

Examples

Basic ASIM Validation

Validating network session event against ASIM schema...

{
  "TimeGenerated": "2024-01-01T12:00:00Z",
  "EventCount": 1,
  "EventSchema": "NetworkSession",
  "EventSchemaVersion": "0.2.6",
  "EventStartTime": "2024-01-01T12:00:00Z",
  "EventEndTime": "2024-01-01T12:00:01Z",
  "Dvc": "firewall01",
  "EventType": "NetworkSession",
  "EventResult": "Success",
  "EventProduct": "Firewall",
  "EventVendor": "Microsoft"
}

- check_schema:
    schema: "ASimNetworkSessionLogs"
    target_field: "schema_check"
    check_mode: "both"

All required fields present, validation passes...

{
  "schema_check": {
    "is_valid": true,
    "missing_required_fields": [],
    "missing_recommended_fields": ["DvcAction", "EventSeverity"],
    "missing_optional_fields": [],
    "extra_fields": [],
    "type_mismatches": []
  }
}

Detecting Missing Fields

Event missing required fields triggers validation failure...

{
  "TimeGenerated": "2024-01-01T12:00:00Z",
  "EventCount": 1,
  "EventType": "NetworkSession"
}

- check_schema:
    schema: "ASimNetworkSessionLogs"
    target_field: "schema_check"
    check_mode: "missing"

Missing required fields listed in result...

{
  "schema_check": {
    "is_valid": false,
    "missing_required_fields": [
      "EventSchema",
      "EventSchemaVersion",
      "EventEndTime",
      "Dvc",
      "EventResult",
      "EventProduct",
      "EventVendor"
    ],
    "type_mismatches": []
  }
}

Detecting Extra Fields

Detecting fields not defined in schema...

{
  "TimeGenerated": "2024-01-01T12:00:00Z",
  "EventCount": 1,
  "ExtraField1": "unexpected",
  "CustomData": {"key": "value"},
  "UnknownMetric": 999.99
}

- check_schema:
    schema: "ASimNetworkSessionLogs"
    target_field: "schema_check"
    check_mode: "extra"

Extra fields detected but don't affect validity...

{
  "schema_check": {
    "is_valid": true,
    "extra_fields": [
      "ExtraField1",
      "CustomData",
      "UnknownMetric"
    ]
  }
}

Conditional Processing Chains

Triggering alerts when schema drift is detected...

- check_schema:
    schema: "ASimNetworkSessionLogs"
    target_field: "schema_check"
    check_mode: "both"
    on_missing:
      - set:
          field: "drift_type"
          value: "missing_fields"
      - set:
          field: "alert_required"
          value: true
    on_extra:
      - set:
          field: "drift_type"
          value: "extra_fields"
    on_type_mismatch:
      - set:
          field: "drift_type"
          value: "type_mismatch"

Conditional chains execute based on drift type...

{
  "drift_type": "missing_fields",
  "alert_required": true,
  "schema_check": {
    "is_valid": false,
    "missing_required_fields": ["EventSchema"]
  }
}

Strict Validation

Including recommended fields in validity check...

{
  "TimeGenerated": "2024-01-01T12:00:00Z",
  "EventCount": 1,
  "EventSchema": "NetworkSession",
  "EventSchemaVersion": "0.2.6",
  "EventStartTime": "2024-01-01T12:00:00Z",
  "EventEndTime": "2024-01-01T12:00:01Z",
  "Dvc": "firewall01",
  "EventType": "NetworkSession",
  "EventResult": "Success",
  "EventProduct": "Firewall",
  "EventVendor": "Microsoft"
}

- check_schema:
    schema: "ASimNetworkSessionLogs"
    target_field: "schema_check"
    check_mode: "both"
    validate_recommended: true

Missing recommended fields now affect validity...

{
  "schema_check": {
    "is_valid": false,
    "missing_required_fields": [],
    "missing_recommended_fields": ["DvcAction", "EventSeverity"]
  }
}