Microsoft Sentinel Integration
VirtualMetric Director supports Microsoft Sentinel integration through two different approaches: automatic discovery and manual configuration. Choose the method that best fits your environment and requirements.
Prerequisites
Both integration approaches require:
- An Azure subscription with permissions to create resources
- A Log Analytics workspace with Microsoft Sentinel enabled
Autodiscovery Setup
VirtualMetric Director provides an autodiscovery feature for Microsoft Sentinel integration. This enables automatic detection and configuration of Data Collection Rules (DCRs) and their associated streams, simplifying the setup process and providing dynamic updates as your Sentinel environment changes.
Open a terminal with administrative access and navigate to <vm_root>. Then type the following command and press Enter:

PowerShell:

C:\vmetric-director -sentinel -autodiscovery

Bash:

vmetric-director -sentinel -autodiscovery
Follow the on-screen prompts to complete the setup process. For detailed step-by-step instructions, refer to Microsoft Sentinel Overview.
Manual Setup
Manual integration requires step-by-step configuration of Microsoft Sentinel components. This approach provides full control over the integration process and is ideal for environments with specific configuration needs.
Service Principal Setup
Create a service principal for DataStream authentication:
- Navigate to Azure Active Directory > App registrations
- Select New registration
- Enter DataStream as the application name
- Select Accounts in this organizational directory only
- Click Register
- Record the Application (client) ID and Directory (tenant) ID
- Go to Certificates & secrets > New client secret
- Create a secret and record the Client secret value
Data Collection Endpoint Setup
- Navigate to Azure Portal > Monitor > Data Collection Endpoints
- Select Create
- Configure the DCE:
| Field | Value |
|---|---|
| Name | datastream-dce |
| Resource group | Select your resource group |
| Region | Same region as your Log Analytics workspace |

- Click Review + create > Create
- Record the Logs Ingestion endpoint URL
Data Collection Rule Creation
- Navigate to Monitor > Data Collection Rules
- Select Create
- Configure basic settings:
| Field | Value |
|---|---|
| Rule name | datastream-dcr |
| Resource group | Same as your DCE |
| Region | Same as your DCE |
| Platform Type | Windows or Linux, based on your data sources |

- In the Resources tab, add your Log Analytics workspace
- In the Collect and deliver tab, configure the data source:

| Field | Value |
|---|---|
| Data source type | Custom Text Logs or Windows Event Logs |
| Data source name | DataStreamLogs |
| File pattern | Configure based on your log sources |

- Configure the destination:

| Field | Value |
|---|---|
| Destination type | Azure Monitor Logs |
| Destination | Your Log Analytics workspace |
| Table | Create or select the target table |

- Click Review + create > Create
- Record the DCR Immutable ID
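The Immutable ID can also be read programmatically: in the ARM representation of a DCR (for example, the JSON returned by a GET on the DCR resource), it appears under `properties.immutableId`. A minimal sketch, using a made-up response:

```python
import json

# Hypothetical ARM response for a DCR resource; only the fields used
# here are shown. properties.immutableId holds the value Director
# later needs as dcr_id.
arm_response = json.dumps({
    "name": "datastream-dcr",
    "properties": {"immutableId": "dcr-00000000000000000000000000000000"},
})

dcr = json.loads(arm_response)
immutable_id = dcr["properties"]["immutableId"]
print(immutable_id)  # record this value for the configuration below
```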
Required Permissions
Director needs the following permissions for Microsoft Sentinel integration.
If you used the Automation tool with App Registration, these permissions are already configured.
Autodiscovery

For Data Collection - For each DCR whose name is prefixed with vmetric:
- Navigate to the DCR in Azure Portal
- Go to Access Control (IAM)
- Select Add > Add role assignment
- Assign the following permissions:
| Role | Assignee |
|---|---|
| Monitoring Metrics Publisher | Your Managed Identity or Application |
For Autodiscovery - To enable the DCR autodiscovery features:
- Navigate to the Resource Group containing your DCRs
- Go to Access Control (IAM)
- Select Add > Add role assignment
- Assign the following permissions:
| Role | Assignee |
|---|---|
| Monitoring Reader | Your Managed Identity or Application |

important: The Monitoring Reader role should be assigned at the Resource Group level only. Assigning it at the Subscription level is not recommended: it is not required for the functionality to work, and it increases the autodiscovery scan duration.
Manual

For Data Collection - For each manually created DCR:
- Navigate to the DCR in Azure Portal
- Go to Access Control (IAM)
- Select Add > Add role assignment
- Assign the following permissions:
| Role | Assignee |
|---|---|
| Monitoring Metrics Publisher | Your Service Principal or Application |
Configuration
Autodiscovery

Basic Configuration
targets:
  - name: sentinel
    type: sentinel
    properties:
      tenant_id: "<your-tenant-id>"
      client_id: "<your-client-id>"
      client_secret: "<your-client-secret>"
      endpoint: "/subscriptions/.../dataCollectionEndpoints/<your-dce-name>" # Use Resource ID
Filtered Streams
You can filter the autodiscovered streams that you intend to use:
targets:
  - name: sentinel
    type: sentinel
    properties:
      tenant_id: "<your-tenant-id>"
      client_id: "<your-client-id>"
      client_secret: "<your-client-secret>"
      endpoint: "/subscriptions/.../dataCollectionEndpoints/<your-dce-name>"
      streams:
        - name: "Custom-WindowsEvent"
        - name: "Custom-SecurityEvent"
Cache Configuration
You can optionally adjust the cache timeout (in seconds):
targets:
  - name: sentinel
    type: sentinel
    properties:
      endpoint: "/subscriptions/.../dataCollectionEndpoints/<your-dce-name>"
      cache:
        timeout: 300
Manual

Basic Configuration
targets:
  - name: sentinel
    type: sentinel
    properties:
      tenant_id: "<your-tenant-id>"
      client_id: "<your-client-id>"
      client_secret: "<your-client-secret>"
      endpoint: "https://<dce-name>-<region>.ingest.monitor.azure.com" # Direct URL
      streams:
        - name: "Custom-DataStreamLogs"
          dcr_id: "dcr-<immutable-id>"
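Under the hood, these values map onto a Logs Ingestion API request: records are POSTed to the DCE endpoint with the DCR immutable ID and stream name in the path. A rough sketch of how the request URL is assembled (the DCE name and immutable ID below are hypothetical, and the api-version reflects the API as of this writing):

```python
def ingestion_url(endpoint: str, dcr_id: str, stream: str) -> str:
    """Build the Logs Ingestion API URL from a DCE endpoint,
    a DCR immutable ID, and a stream name."""
    return (f"{endpoint}/dataCollectionRules/{dcr_id}"
            f"/streams/{stream}?api-version=2023-01-01")

url = ingestion_url(
    "https://my-dce-westeurope.ingest.monitor.azure.com",  # hypothetical DCE
    "dcr-00000000000000000000000000000000",                # hypothetical immutable ID
    "Custom-DataStreamLogs",
)
print(url)
```

Director obtains a bearer token for the https://monitor.azure.com scope using the tenant, client ID, and client secret, then sends batches of records as a JSON array to this URL.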
Multi-Stream Configuration
For multiple tables and streams:
targets:
  - name: sentinel
    type: sentinel
    properties:
      tenant_id: "<your-tenant-id>"
      client_id: "<your-client-id>"
      client_secret: "<your-client-secret>"
      endpoint: "https://<dce-name>-<region>.ingest.monitor.azure.com"
      streams:
        - name: "Custom-SecurityEvents"
          dcr_id: "dcr-<security-dcr-id>"
        - name: "Custom-NetworkEvents"
          dcr_id: "dcr-<network-dcr-id>"
        - name: "Custom-SystemEvents"
          dcr_id: "dcr-<system-dcr-id>"
Field Filtering Configuration
Prevent phantom fields with explicit field management:
targets:
  - name: sentinel
    type: sentinel
    properties:
      tenant_id: "<your-tenant-id>"
      client_id: "<your-client-id>"
      client_secret: "<your-client-secret>"
      endpoint: "https://<dce-name>-<region>.ingest.monitor.azure.com"
      streams:
        - name: "Custom-FilteredLogs"
          dcr_id: "dcr-<filtered-dcr-id>"
          # Use with pipeline processors to filter fields
          field_mapping:
            allowed_fields:
              - "TimeGenerated"
              - "EventID"
              - "Level"
              - "Message"
              - "Computer"
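The effect of allowed_fields can be sketched as a simple projection: any key not on the allow-list is dropped before the record is sent. A minimal illustration (the field names follow the config above; the sample record is made up):

```python
ALLOWED_FIELDS = {"TimeGenerated", "EventID", "Level", "Message", "Computer"}

def filter_record(record: dict) -> dict:
    """Keep only allow-listed fields, dropping would-be phantom fields."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

event = {
    "TimeGenerated": "2024-01-01T00:00:00Z",
    "EventID": 4624,
    "Message": "An account was successfully logged on.",
    "DebugScratch": "temp value",  # not part of the table schema
}
clean = filter_record(event)
print(sorted(clean))  # DebugScratch has been removed
```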
Verification
After completing the setup and assigning permissions:
- Wait a few minutes for Azure RBAC to propagate (can take up to 30 minutes)
- Start Director with your configuration and monitor the startup logs for connection status
- Check the logs for any permission-related or configuration errors
- Verify data appears in your Log Analytics workspace table
For autodiscovery issues:
- Verify that all role assignments are properly configured
- Ensure the identity has the correct access scope
- Check that Azure RBAC changes have propagated
For manual configuration issues:
- Verify the DCE endpoint URL is correct
- Confirm the DCR Immutable ID matches your configuration
- Ensure the service principal has proper permissions on both the DCR and Log Analytics workspace
How It Works
Autodiscovery

Resource ID-Based Discovery
Instead of manually configuring the Data Collection Endpoint (DCE) URL, you can provide the DCE Resource ID. For example:
/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Insights/dataCollectionEndpoints/<dce-name>
When using a Resource ID, Director will discover all DCRs associated with the specified DCE, and collect detailed stream information including table names, table schemas (column definitions), and stream configurations.
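The Resource ID itself encodes everything needed to locate the DCE through the Azure Resource Manager API. A sketch of how such an ID decomposes (the parsing helper is illustrative, not Director's actual code, and the ID values are hypothetical):

```python
def parse_dce_resource_id(resource_id: str) -> dict:
    """Split an ARM resource ID into its path segments."""
    parts = resource_id.strip("/").split("/")
    # ARM IDs alternate key/value segments, e.g. .../resourceGroups/<rg>/...
    pairs = dict(zip(parts[0::2], parts[1::2]))
    return {
        "subscription": pairs["subscriptions"],
        "resource_group": pairs["resourceGroups"],
        "dce_name": pairs["dataCollectionEndpoints"],
    }

rid = ("/subscriptions/00000000-0000-0000-0000-000000000000"
       "/resourceGroups/my-rg/providers/Microsoft.Insights"
       "/dataCollectionEndpoints/my-dce")
print(parse_dce_resource_id(rid))
```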
Caching Mechanism
The default cache duration is 5 minutes. The cache is automatically invalidated when the configuration file (sentinel.yml) is modified or the cache timeout is reached.
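The invalidation rule can be sketched as: a cached discovery result is reusable only while it is younger than the timeout and the config file has not been modified since it was cached. A minimal illustration (class and variable names are hypothetical, not Director internals):

```python
import os
import tempfile
import time

class DiscoveryCache:
    """Cache a value until a timeout elapses or a config file changes."""

    def __init__(self, config_path: str, timeout: float = 300.0):
        self.config_path = config_path
        self.timeout = timeout
        self.value = None
        self.cached_at = None
        self.config_mtime = None

    def put(self, value):
        self.value = value
        self.cached_at = time.monotonic()
        self.config_mtime = os.path.getmtime(self.config_path)

    def get(self):
        """Return the cached value, or None if a refresh is needed."""
        if self.cached_at is None:
            return None
        if time.monotonic() - self.cached_at > self.timeout:
            return None  # cache timeout reached
        if os.path.getmtime(self.config_path) != self.config_mtime:
            return None  # config file was modified
        return self.value

# Demonstration with a throwaway config file standing in for sentinel.yml
cfg = tempfile.NamedTemporaryFile(suffix=".yml", delete=False)
cfg.close()
cache = DiscoveryCache(cfg.name, timeout=300.0)
cache.put({"streams": ["Custom-WindowsEvent"]})
fresh = cache.get()          # served from cache
os.utime(cfg.name, (0, 0))   # simulate editing the config file
stale = cache.get()          # invalidated by the modification
os.unlink(cfg.name)
print(fresh, stale)
```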
Dynamic Updates
The autodiscovery feature continuously adapts to changes in your Sentinel environment, enabling automatic detection of new DCRs, recognition of table schema changes, and discovery and integration of custom tables and columns.
Manual

Direct Endpoint Configuration
Manual integration uses direct DCE endpoint URLs and explicit stream configuration. You must specify:
- DCE Endpoint URL: Direct HTTPS URL to your Data Collection Endpoint
- DCR Immutable ID: The specific Data Collection Rule identifier
- Stream Names: Exact stream names matching your DCR configuration
Static Configuration
Manual configuration is static and requires updates when:
- New DCRs are created
- Stream configurations change
- Table schemas are modified
- Endpoint URLs change
Schema Management
For manual integration, you must define table schemas in advance, map data fields to table columns, handle schema mismatches in your pipeline configuration, and monitor for phantom fields manually.
Phantom Field Prevention
Microsoft Sentinel has moved to DCR-based log ingestion and manual schema management. This change, while powerful, can lead to phantom fields: data fields that are ingested, billed, and stored even though they are not part of the table schema, which leaves them inaccessible for querying while still incurring costs.
For a comprehensive understanding of phantom fields, see Sentinel Phantom Fields by ManagedSentinel.
Common scenarios that cause phantom fields include log splitting with mismatched schemas, temporary fields in transformations, duplicate fields emerging from improper field mapping, and schema modifications without proper cleanup.
Autodiscovery

Director's autodiscovery feature includes a built-in phantom field prevention mechanism based on the following:
Schema Validation - Automatically discovers table schemas from DCRs, validates each field against the known schema, and discards fields not present in the table schema.
Dynamic Field Mapping - Fields that exist in the schema or are required are kept while others are discarded.
Cost Optimization - Prevents unnecessary data ingestion thereby reducing storage costs while maintaining data accessibility.
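The mechanism reduces to validating each outgoing record against the discovered column set. An illustrative sketch (the schema dict stands in for what autodiscovery reads from the DCR; Director's actual implementation is more involved):

```python
# Column set as it might be discovered from a DCR's table schema
DISCOVERED_SCHEMA = {
    "TimeGenerated": "datetime",
    "Computer": "string",
    "EventID": "int",
}

def validate(record: dict, schema: dict) -> tuple[dict, list]:
    """Split a record into schema-valid fields and discarded ones."""
    kept = {k: v for k, v in record.items() if k in schema}
    dropped = [k for k in record if k not in schema]
    return kept, dropped

kept, dropped = validate(
    {"TimeGenerated": "2024-01-01T00:00:00Z", "EventID": 1, "TempField": "x"},
    DISCOVERED_SCHEMA,
)
print(dropped)  # fields that would otherwise become phantom fields
```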
The guiding principles here are:
- For schema management, regularly review table schemas, update schemas when adding new fields, and use autodiscovery to validate field usage.
- For field mapping, let autodiscovery handle field validation, define critical fields explicitly when needed, and monitor the logs for any dropped fields.
- For cost monitoring, track ingestion volumes, monitor field usage patterns, and verify data accessibility.
Manual

Manual integration requires proactive phantom field prevention, since automatic schema validation is not available.
Schema Pre-Definition - Define table schemas in advance and ensure all pipeline processors output only schema-defined fields.
Pipeline Field Filtering - Use remove processors to eliminate fields not present in your target table schema.
Manual Validation - Regularly review ingested data to identify phantom fields and update pipeline configuration.
The most salient reason for preventing phantom fields is to reduce their impact on cost. Some environments show up to 65% of table data as phantom fields.