Skip to main content

Autodiscovery

Overview

VirtualMetric Director provides an autodiscovery feature for Microsoft Sentinel integration. This feature enables automatic detection and configuration of Data Collection Rules (DCRs) and their associated streams, simplifying the setup process and providing dynamic updates as your Sentinel environment changes.

Prerequisites

VirtualMetric Sentinel Automation tool requires the following prerequisites:

  • An Azure subscription with permissions to create resources
  • Microsoft Sentinel requires a Log Analytics workspace - we'll create one during this setup
  • Administrative access to run PowerShell or Bash commands

Open a PowerShell or Terminal with Admin rights, and navigate to <vm_root>. Then, type the following command and press Enter.

C:\<vm_root>\vmetric-director -sentinel -autodiscovery

Follow the on-screen prompts to complete the setup process. For detailed step-by-step instructions, refer to our Microsoft Sentinel Automation documentation.

tip

The Autodiscovery configuration is handled automatically when using the Automation tool. You only need to complete the "Required Permissions" step if you're using Managed Identity.

Required Permissions

VirtualMetric Director needs the following permissions to fetch Data Collection Rules and their associated streams:

note

If you used the Automation tool with App Registration, these permissions are already configured automatically.

Data Collection Rules (DCR) Permissions

For each DCR with a name starting with "vmetric":

  1. Navigate to the DCR in Azure Portal
  2. Go to Access Control (IAM)
  3. Select Add > Add role assignment
  4. Assign these permissions:
    • Role: Monitoring Metrics Publisher
    • Assignee: Your Managed Identity or Application

Autodiscovery Permissions

To enable DCR autodiscovery features:

  1. Navigate to the Resource Group containing your DCRs
  2. Go to Access Control (IAM)
  3. Select Add > Add role assignment
  4. Assign these permissions:
    • Role: Monitoring Reader
    • Assignee: Your Managed Identity or Application
important

The Monitoring Reader role should be assigned at the Resource Group level only.

Assigning this role at the Subscription level is not recommended as it:

  • Is unnecessary for functionality
  • Increases the autodiscovery scan duration

Verification Steps

After assigning permissions:

  1. Wait a few minutes for Azure RBAC to propagate
  2. Test the connection using VirtualMetric Director
  3. Check the logs for any permission-related errors
tip

If you encounter permission issues, verify that:

  • All role assignments are properly configured
  • Azure RBAC changes have propagated (can take up to 30 minutes)
  • The identity has the correct scope of access

How It Works

Resource ID-Based Discovery

Instead of manually configuring the Data Collection Endpoint (DCE) URL, you can provide the DCE Resource ID. For example:

/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Insights/dataCollectionEndpoints/<dce-name>

When using a Resource ID, VirtualMetric Director will:

  • Discover all DCRs associated with the specified DCE
  • Collect detailed stream information including:
    • Table names
    • Table schemas (column definitions)
    • Stream configurations

Caching Mechanism

  • Default cache duration: 5 minutes
  • Cache is automatically invalidated when:
    • The configuration file (sentinel.yml) is modified
    • The cache timeout is reached

Dynamic Updates

The autodiscovery feature continuously adapts to changes in your Sentinel environment:

  • New DCRs are automatically detected
  • Changes to table schemas are recognized
  • Custom tables and columns are discovered and integrated

Phantom Fields Prevention

Understanding Phantom Fields

Microsoft Sentinel has moved to DCR-based log ingestion and manual schema management. This change, while powerful, can lead to "phantom fields" - data fields that are:

  • Ingested and billed
  • Not part of the table schema
  • Inaccessible for querying
  • Still incurring storage costs

Credit: For a comprehensive understanding of phantom fields, see Sentinel Phantom Fields by ManagedSentinel.

How Autodiscovery Prevents Phantom Fields

VirtualMetric Director's autodiscovery feature includes built-in phantom field prevention:

  1. Schema Validation

    • Automatically discovers table schemas from DCRs
    • Validates each field against the known schema
    • Discards fields not present in the table schema
  2. Dynamic Field Mapping

    # Example of how fields are mapped
    input_event:
    field1: "value1" # Exists in schema - Kept
    field2: "value2" # Not in schema - Discarded
    TimeGenerated: "..." # Required field - Always kept
  3. Cost Optimization

    • Prevents unnecessary data ingestion
    • Reduces storage costs
    • Maintains data accessibility

Why Phantom Fields Matter

  1. Cost Impact

    • Each phantom field increases ingestion costs
    • Data remains inaccessible despite being paid for
    • Some environments show up to 65% of table data as phantom fields
  2. Common Scenarios Prevented

    • Log splitting with mismatched schemas
    • Temporary fields in transformations
    • Duplicate fields from improper field mapping
    • Schema modifications without proper cleanup
  3. Automatic Protection

    // Example of how Director handles fields internally
    if !schemaContainsField(tableName, fieldName) {
    // Field not in schema - discard to prevent phantom field
    continue
    }

Best Practices with Autodiscovery

  1. Schema Management

    • Regularly review table schemas
    • Update schemas when adding new fields
    • Use autodiscovery to validate field usage
  2. Field Mapping

    • Let autodiscovery handle field validation
    • Define critical fields explicitly when needed
    • Monitor for any dropped fields in logs
  3. Cost Monitoring

    • Track ingestion volumes
    • Monitor field usage patterns
    • Verify data accessibility

Configuration

Basic Setup

targets:
- name: sentinel
type: sentinel
properties:
tenant_id: "<your-tenant-id>"
client_id: "<your-client-id>"
client_secret: "<your-client-secret>"
endpoint: "/subscriptions/...</dataCollectionEndpoints/<dce-name>" # Use Resource ID instead of URL

Filtering Specific Streams

You can filter which autodiscovered streams to use:

targets:
- name: sentinel
type: sentinel
properties:
tenant_id: "<your-tenant-id>"
client_id: "<your-client-id>"
client_secret: "<your-client-secret>"
endpoint: "/subscriptions/...</dataCollectionEndpoints/<dce-name>"
streams:
- name: "Custom-WindowsEvent"
- name: "Custom-SecurityEvent"

Cache Configuration

Optionally adjust the cache timeout (in seconds):

targets:
- name: sentinel
type: sentinel
properties:
endpoint: "/subscriptions/...</dataCollectionEndpoints/<dce-name>"
cache:
timeout: 300 # 5 minutes (default)