Overview
Datasets and Profiles provide reusable data collection rule templates that standardize how telemetry is collected across device fleets. Instead of configuring each device's data collection individually, you define a dataset once and assign it to multiple devices.
To create and assign datasets and profiles through the web interface, see Management. For the collector types available on each platform, see Windows Datasets, WEC Datasets, and Linux Datasets.
Definitions
A Dataset is a reusable data collection rule template that defines what data to collect and how to collect it. Each dataset specifies a collection type (Windows Event Logs, DNS Logs, etc.) and the configuration parameters for that type. Datasets can optionally reference a preprocessing pipeline for inline processing of collected data.
A Profile is a grouping layer that composes multiple datasets into a single assignable unit. Profiles allow you to bundle related collection rules and apply them to devices as a set.
Type hierarchy: Each dataset has a type (category) and a definition type (specific collector). The type groups datasets into categories — windows, wec, or linux — while the definition type identifies the exact collector implementation (e.g., windows_security_log_collector). This two-level classification drives device compatibility and determines which configuration interface is presented.
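The two-level classification can be pictured with a small data model. The sketch below is illustrative only: the class names, field names, and the `is_compatible` rule (a dataset matches a device of the same category) are assumptions made for the example, not the product's actual schema or compatibility logic.

```python
from dataclasses import dataclass
from enum import Enum


class DatasetType(Enum):
    """Category level of the hierarchy (values taken from the text above)."""
    WINDOWS = "windows"
    WEC = "wec"
    LINUX = "linux"


@dataclass(frozen=True)
class Dataset:
    """Hypothetical dataset record; field names are assumptions."""
    name: str
    type: DatasetType        # category: windows, wec, or linux
    definition_type: str     # specific collector, e.g. "windows_security_log_collector"


def is_compatible(dataset: Dataset, device_category: DatasetType) -> bool:
    # Assumed rule: a dataset applies only to devices in the same category.
    return dataset.type is device_category
```

Under these assumptions, a dataset of type `windows` with definition type `windows_security_log_collector` would match a Windows device and be rejected for a Linux one.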
Status lifecycle: Datasets and profiles have a status of active, passive, or deleted. Active items are applied to their assigned devices. Passive items remain configured but are not actively applied. Deleted items are soft-deleted and no longer visible in the UI.
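As a rough illustration of the three statuses, the following sketch encodes the behavior described above. The `Status` enum and the two helper functions are hypothetical names chosen for this example; they do not reflect an actual API.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"
    DELETED = "deleted"


def is_applied(status: Status) -> bool:
    # Only active items are applied to their assigned devices.
    return status is Status.ACTIVE


def is_visible_in_ui(status: Status) -> bool:
    # Soft-deleted items are hidden; active and passive items stay visible.
    return status is not Status.DELETED
```

A passive item is therefore visible but not applied, which distinguishes it from both of the other states.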
Relationship to Devices: Datasets and profiles have a many-to-many relationship with devices. A single dataset can be assigned to multiple devices, and a single device can have multiple datasets assigned to it. This eliminates repetitive per-device configuration and ensures consistent data collection across your fleet.
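One way to picture the many-to-many relationship is a two-way index between dataset IDs and device IDs. This is a minimal in-memory sketch; `AssignmentIndex` and its methods are hypothetical and say nothing about how assignments are actually stored.

```python
from collections import defaultdict


class AssignmentIndex:
    """Hypothetical two-way lookup between datasets and devices."""

    def __init__(self) -> None:
        self._devices_by_dataset: dict[str, set[str]] = defaultdict(set)
        self._datasets_by_device: dict[str, set[str]] = defaultdict(set)

    def assign(self, dataset_id: str, device_id: str) -> None:
        # Record the link in both directions so either side can be queried.
        self._devices_by_dataset[dataset_id].add(device_id)
        self._datasets_by_device[device_id].add(dataset_id)

    def devices_for(self, dataset_id: str) -> set[str]:
        return set(self._devices_by_dataset.get(dataset_id, ()))

    def datasets_for(self, device_id: str) -> set[str]:
        return set(self._datasets_by_device.get(device_id, ()))
```

For example, assigning one dataset to two devices and a second dataset to one of them leaves that device with two datasets, while the first dataset still resolves to both devices.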
Processing flow context: Datasets and profiles operate at the device layer of the DataStream processing flow. They govern what data a device collects before it enters preprocessing and pipeline stages.
Provider → Device (dataset rules applied here) → Preprocessing → Pipeline → Postprocessing → Target → Consumer
Permissions
Access to datasets and profiles is controlled by the following permission scopes:
| Scope | Description |
|---|---|
| DATASET_READ | View datasets and their configurations |
| DATASET_CREATE | Create new datasets |
| DATASET_EDIT | Modify existing datasets and device assignments |
| DATASET_DELETE | Delete datasets |
| PROFILE_READ | View profiles and their configurations |
| PROFILE_CREATE | Create new profiles |
| PROFILE_EDIT | Modify existing profiles, dataset selection, and device assignments |
| PROFILE_DELETE | Delete profiles |
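A scope check against this table might look like the sketch below. The scope names come from the table; the `required_scope` and `is_allowed` helpers, and the idea of deriving a scope name from a resource and an action, are assumptions made for illustration rather than a documented API.

```python
# Scope names as listed in the permissions table.
SCOPES = frozenset({
    "DATASET_READ", "DATASET_CREATE", "DATASET_EDIT", "DATASET_DELETE",
    "PROFILE_READ", "PROFILE_CREATE", "PROFILE_EDIT", "PROFILE_DELETE",
})


def required_scope(resource: str, action: str) -> str:
    """Hypothetical helper: map e.g. ("dataset", "edit") to "DATASET_EDIT"."""
    scope = f"{resource}_{action}".upper()
    if scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope}")
    return scope


def is_allowed(granted: set[str], resource: str, action: str) -> bool:
    # The caller is allowed only if the exact scope has been granted.
    return required_scope(resource, action) in granted
```

Note that, per the table, changing device assignments falls under the EDIT scopes, so a caller holding only `DATASET_READ` could view a dataset but not reassign it.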