Version: 1.3.0

File

Long Term Storage

Synopsis

Creates a file target that writes log messages to files in formats such as JSON, MultiJSON, Avro, and Parquet, with support for several compression methods and schemas.

Schema

- name: <string>
  description: <string>
  type: file
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    location: <string>
    name: <string>
    format: <string>
    compression: <string>
    extension: <string>
    schema: <string>
    field_format: <string>
    no_buffer: <boolean>
    batch_size: <integer>
    max_size: <integer>
    locations: <location[]>

Configuration

The following fields are used to define the target:

| Field | Required | Default | Description |
|---|---|---|---|
| name | Y | | Target name |
| description | N | - | Optional description |
| type | Y | | Must be file |
| pipelines | N | - | Optional post-processor pipelines |
| status | N | true | Enable/disable the target |

Files

Files can have the following properties:

| Field | Required | Default | Description |
|---|---|---|---|
| location | N | <service-root> | File output directory |
| name | N | "vmetric.{{.Timestamp}}.{{.Extension}}" | File name template |
| format | N | "json" | File format. See Formats below |
| compression | N | - | Compression algorithm. See Compression below |
| extension | N | Matches type | Custom file extension |
| schema | N | - | Data schema for Avro / Parquet formats or schema template name |
| batch_size | N | 100000 | Maximum number of messages per file |
| max_size | N | 32MB | Maximum file size before rotating |
| no_buffer | N | false | Disable write buffering |
| field_format | N | - | Data normalization format. See applicable Normalization section |
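
The interplay of batch_size and max_size can be pictured with a short sketch. This is illustrative Python, not the product's implementation: should_rotate is a hypothetical helper, and it assumes that a file is rotated as soon as either limit is reached and that 32MB means 32 * 1024 * 1024 bytes.

```python
# Hypothetical sketch: decide when the current output file should be rotated.
# Assumption: rotation happens when either limit is reached.
DEFAULT_BATCH_SIZE = 100_000         # default batch_size from the table above
DEFAULT_MAX_SIZE = 32 * 1024 * 1024  # default max_size (32MB), assumed binary megabytes


def should_rotate(message_count, bytes_written,
                  batch_size=DEFAULT_BATCH_SIZE, max_size=DEFAULT_MAX_SIZE):
    """Return True when the file has reached its message or size limit."""
    return message_count >= batch_size or bytes_written >= max_size
```

Under these assumptions, a file holding many small messages is closed by batch_size, while a file holding a few very large messages is closed by max_size.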

Multiple Locations

You can define multiple output locations with different settings:

targets:
  - name: multi_location_logs
    type: file
    properties:
      locations:
        - id: "security_logs"
          path: "/var/log/security"
          schema: "{{CommonSecurityLog}}"
          format: "parquet"
        - id: "system_logs"
          path: "/var/log/system"
          schema: "{{CommonSystemLog}}"
          format: "json"

Details

The file target supports writing to multiple file locations with different formats and schemas. When a log message contains the SystemS3 field, its value is used to route the message to the location with a matching ID.
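
The routing rule can be sketched as a simple lookup. This is illustrative Python rather than the product's code: route is a hypothetical helper, the location entries mirror the example configuration, and the handling of messages with no matching ID (returning None here) is an assumption, since this page does not specify it.

```python
# Hypothetical sketch of location routing; not the product's implementation.
locations = [
    {"id": "security_logs", "path": "/var/log/security", "format": "parquet"},
    {"id": "system_logs", "path": "/var/log/system", "format": "json"},
]


def route(message, locations):
    """Return the location whose id equals the message's SystemS3 value, or None."""
    wanted = message.get("SystemS3")
    for location in locations:
        if location["id"] == wanted:
            return location
    # Assumption: the behavior for unmatched messages is not documented here.
    return None
```

For example, a message carrying SystemS3 set to "system_logs" would be written under /var/log/system in this sketch.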

If no schema is specified for Avro or Parquet formats, a default schema will be used that captures epoch timestamp and message content.

The target supports the following built-in schema templates:

  • {{Syslog}} - Standard schema for Syslog messages
  • {{CommonSecurityLog}} - Schema compatible with Common Event Format (CEF)

Note: Files with no messages (i.e. with counter=0) are automatically removed when the target is disposed.

Caution: When no_buffer is enabled, each write operation is immediately flushed to disk. This provides durability but may impact performance.
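
The trade-off behind no_buffer can be illustrated with a small sketch. This is illustrative Python, not the product's implementation: write_message is a hypothetical helper, and mapping "flushed to disk" to a flush followed by os.fsync is an assumption.

```python
import os


def write_message(fh, line, no_buffer=False):
    """Write one line; when no_buffer is set, force it to disk before returning."""
    fh.write(line)
    if no_buffer:
        fh.flush()             # push Python's in-process buffer to the OS
        os.fsync(fh.fileno())  # ask the OS to commit the data to disk
```

With no_buffer left at false, many writes share one flush, which is faster but can lose the buffered tail if the process stops unexpectedly.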

Templates

The following template variables can be used in the file name:

| Variable | Description | Example |
|---|---|---|
| {{.Year}} | Current year | 2024 |
| {{.Month}} | Current month | 01 |
| {{.Day}} | Current day | 15 |
| {{.Timestamp}} | Current timestamp in nanoseconds | 1703688533123456789 |
| {{.Format}} | File format | json |
| {{.Extension}} | File extension | json |
| {{.Compression}} | Compression type | zstd |
| {{.TargetName}} | Target name | my_logs |
| {{.TargetType}} | Target type | file |
| {{.Table}} | Location ID | security_logs |
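
To show how a file name template expands, here is a minimal sketch in Python. It is not the product's template engine: render_name and template_values are hypothetical helpers, only a subset of the variables is covered, and leaving unknown placeholders untouched is an assumption.

```python
import re
from datetime import datetime

# Hypothetical sketch: expand {{.Variable}} placeholders in a file name template.
PLACEHOLDER = re.compile(r"\{\{\.(\w+)\}\}")


def render_name(template, values):
    """Replace each known placeholder; unknown ones are left as-is (assumption)."""
    return PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), m.group(0))), template)


def template_values(now, fmt="json", extension="json", table=""):
    """Build values for a subset of the documented variables."""
    return {
        "Year": f"{now.year:04d}",
        "Month": f"{now.month:02d}",
        "Day": f"{now.day:02d}",
        "Format": fmt,
        "Extension": extension,
        "Table": table,
    }


values = template_values(datetime(2024, 1, 15))
print(render_name("app_{{.Year}}_{{.Month}}_{{.Day}}.json", values))
# prints app_2024_01_15.json
```

Because the date parts are zero-padded, names built from {{.Year}}, {{.Month}}, and {{.Day}} sort chronologically in a directory listing.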

Formats

| Format | Description |
|---|---|
| json | Each log entry is written as a separate JSON line (JSONL format) |
| multijson | All log entries are written as a single JSON array |
| ocf | OCF format with schema |
| avro | Apache Avro format with schema |
| parquet | Apache Parquet columnar format with schema |
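
The difference between the json and multijson layouts can be seen in a short sketch. This is illustrative Python using the standard json module; to_json_lines and to_multijson are hypothetical helpers, and the exact whitespace the target emits is an assumption.

```python
import json

entries = [{"message": "login ok"}, {"message": "disk full"}]


def to_json_lines(entries):
    """json format: one JSON document per line (JSONL)."""
    return "".join(json.dumps(entry) + "\n" for entry in entries)


def to_multijson(entries):
    """multijson format: all entries inside a single JSON array."""
    return json.dumps(entries)
```

A JSONL file can be appended to and read line by line, whereas a single JSON array has to be parsed as a whole.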

Compression

Files can use the following compression algorithms:

| Format | Default | Compression Codecs |
|---|---|---|
| Avro | zstd | See Appendix |
| Parquet | zstd | See Appendix |

Examples

JSON

Configuration for JSON output. Because "json" is the default format, it does not need to be specified:

targets:
  - name: json_logs
    type: file
    properties:
      location: "/var/log/vmetric"
      compression: "zstd"

Multiple Locations

Configuration for multiple output locations; a different format is used for each location:

targets:
  - name: multi_location_logs
    type: file
    properties:
      locations:
        - id: "security"
          path: "/var/log/vmetric/security"
          format: "parquet"
          schema: "{{CommonSecurityLog}}"
        - id: "system"
          path: "/var/log/vmetric/system"
          format: "json"
        - id: "application"
          path: "/var/log/vmetric/app"
          format: "multijson"
          name: "app_{{.Year}}_{{.Month}}_{{.Day}}.json"

Parquet

Configuration for a Parquet output with compression and schema:

targets:
  - name: parquet_logs
    type: file
    properties:
      location: "/var/log/vmetric"
      format: "parquet"
      compression: "zstd"
      schema: |
        {
          "timestamp": {
            "type": "INT",
            "bitWidth": 64,
            "signed": true
          },
          "message": {
            "type": "STRING",
            "compression": "ZSTD"
          }
        }

Using built-in schema templates:

targets:
  - name: cef_logs
    type: file
    properties:
      location: "/var/log/vmetric"
      format: "parquet"
      schema: "{{CommonSecurityLog}}"

OCF

Configuration for an OCF output with daily file rotation:

targets:
  - name: ocf_logs
    type: file
    properties:
      location: "/var/log/vmetric"
      format: "ocf"
      compression: "zstd"
      name: "logs_{{.Year}}_{{.Month}}_{{.Day}}.ocf"

Windows

Configuration for a Windows environment with a proper path structure:

targets:
  - name: windows_logs
    type: file
    properties:
      location: "C:\\ProgramData\\VMetric\\Logs"
      compression: "zstd"
      name: "windows_{{.Year}}\\{{.Month}}\\system_logs.json"