Autodiscovery
Overview
VirtualMetric Director provides an autodiscovery feature for Microsoft Sentinel integration. This feature enables automatic detection and configuration of Data Collection Rules (DCRs) and their associated streams, simplifying the setup process and providing dynamic updates as your Sentinel environment changes.
Prerequisites
The VirtualMetric Sentinel Automation tool requires the following:
- An Azure subscription with permissions to create resources
- A Log Analytics workspace, which Microsoft Sentinel requires (one is created during this setup)
- Administrative access to run PowerShell or Bash commands
Open PowerShell or a terminal with administrative rights and navigate to <vm_root>. Then type the following command and press Enter:
- PowerShell:
  C:\<vm_root>\vmetric-director -sentinel -autodiscovery
- Bash:
  ~/path/to/<vm_root>: ./vmetric-director -sentinel -autodiscovery
Follow the on-screen prompts to complete the setup process. For detailed step-by-step instructions, refer to our Microsoft Sentinel Automation documentation.
The Autodiscovery configuration is handled automatically when using the Automation tool. You only need to complete the "Required Permissions" step if you're using Managed Identity.
Required Permissions
VirtualMetric Director needs the following permissions to fetch Data Collection Rules and their associated streams:
If you used the Automation tool with App Registration, these permissions are already configured automatically.
Data Collection Rules (DCR) Permissions
For each DCR with a name starting with "vmetric":
- Navigate to the DCR in Azure Portal
- Go to Access Control (IAM)
- Select Add > Add role assignment
- Assign these permissions:
  - Role: Monitoring Metrics Publisher
  - Assignee: Your Managed Identity or Application
Autodiscovery Permissions
To enable DCR autodiscovery features:
- Navigate to the Resource Group containing your DCRs
- Go to Access Control (IAM)
- Select Add > Add role assignment
- Assign these permissions:
  - Role: Monitoring Reader
  - Assignee: Your Managed Identity or Application
The Monitoring Reader role should be assigned at the Resource Group level only.
Assigning this role at the Subscription level is not recommended as it:
- Is unnecessary for functionality
- Increases the autodiscovery scan duration
Verification Steps
After assigning permissions:
- Wait a few minutes for Azure RBAC to propagate
- Test the connection using VirtualMetric Director
- Check the logs for any permission-related errors
If you encounter permission issues, verify that:
- All role assignments are properly configured
- Azure RBAC changes have propagated (can take up to 30 minutes)
- The identity has the correct scope of access
How It Works
Resource ID-Based Discovery
Instead of manually configuring the Data Collection Endpoint (DCE) URL, you can provide the DCE Resource ID. For example:
/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Insights/dataCollectionEndpoints/<dce-name>
When using a Resource ID, VirtualMetric Director will:
- Discover all DCRs associated with the specified DCE
- Collect detailed stream information, including:
  - Table names
  - Table schemas (column definitions)
  - Stream configurations
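To make the Resource ID layout concrete, here is a minimal Python sketch that splits a DCE Resource ID into its segments and tells it apart from a URL. The names `DceResourceId`, `parse_dce_resource_id`, and `is_resource_id` are hypothetical helpers for illustration; this is not how Director itself parses the value.

```python
from typing import NamedTuple


class DceResourceId(NamedTuple):
    subscription_id: str
    resource_group: str
    dce_name: str


def is_resource_id(endpoint: str) -> bool:
    """Return True when the endpoint looks like a Resource ID rather than a URL."""
    return endpoint.lower().startswith("/subscriptions/")


def parse_dce_resource_id(resource_id: str) -> DceResourceId:
    """Split a DCE Resource ID into subscription, resource group, and DCE name."""
    parts = resource_id.strip("/").split("/")
    # Expected layout: subscriptions/<id>/resourceGroups/<rg>/providers/
    #                  Microsoft.Insights/dataCollectionEndpoints/<name>
    if len(parts) != 8:
        raise ValueError(f"unexpected segment count: {len(parts)}")
    fixed = [parts[i].lower() for i in (0, 2, 4, 5, 6)]
    expected = ["subscriptions", "resourcegroups", "providers",
                "microsoft.insights", "datacollectionendpoints"]
    if fixed != expected:
        raise ValueError("not a Data Collection Endpoint Resource ID")
    return DceResourceId(parts[1], parts[3], parts[7])
```

The fixed segments are compared case-insensitively, while the subscription, resource group, and DCE name are returned exactly as written.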
Caching Mechanism
- Default cache duration: 5 minutes
- Cache is automatically invalidated when:
  - The configuration file (sentinel.yml) is modified
  - The cache timeout is reached
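The two invalidation rules can be sketched as a small cache that checks both the elapsed time and the modification time of the configuration file. This is an illustrative Python sketch under stated assumptions: the class name `DiscoveryCache`, its methods, and the use of the file modification time to detect changes are hypothetical and do not describe Director's internal implementation.

```python
import os
import time
from typing import Any, Callable, Optional


class DiscoveryCache:
    """Holds one discovery result until it times out or the config file changes."""

    def __init__(self, config_path: str, timeout: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.config_path = config_path
        self.timeout = timeout          # seconds; 300 matches the 5-minute default
        self._clock = clock             # injectable so the timeout can be tested
        self._value: Optional[Any] = None
        self._stored_at = 0.0
        self._config_mtime: Optional[float] = None

    def _mtime(self) -> Optional[float]:
        try:
            return os.path.getmtime(self.config_path)
        except OSError:
            return None  # a missing file is treated as "no known version"

    def put(self, value: Any) -> None:
        """Store a discovery result together with the current time and file state."""
        self._value = value
        self._stored_at = self._clock()
        self._config_mtime = self._mtime()

    def get(self) -> Optional[Any]:
        """Return the cached value, or None when discovery must run again."""
        if self._value is None:
            return None
        expired = self._clock() - self._stored_at >= self.timeout
        config_changed = self._mtime() != self._config_mtime
        if expired or config_changed:
            self._value = None
            return None
        return self._value
```

A caller would run discovery whenever `get()` returns `None` and then store the fresh result with `put()`.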
Dynamic Updates
The autodiscovery feature continuously adapts to changes in your Sentinel environment:
- New DCRs are automatically detected
- Changes to table schemas are recognized
- Custom tables and columns are discovered and integrated
Phantom Fields Prevention
Understanding Phantom Fields
Microsoft Sentinel has moved to DCR-based log ingestion and manual schema management. This change, while powerful, can lead to "phantom fields" - data fields that are:
- Ingested and billed
- Not part of the table schema
- Inaccessible for querying
- Still incurring storage costs
Credit: For a comprehensive understanding of phantom fields, see Sentinel Phantom Fields by ManagedSentinel.
How Autodiscovery Prevents Phantom Fields
VirtualMetric Director's autodiscovery feature includes built-in phantom field prevention:
- Schema Validation
  - Automatically discovers table schemas from DCRs
  - Validates each field against the known schema
  - Discards fields not present in the table schema
- Dynamic Field Mapping
  # Example of how fields are mapped
  input_event:
    field1: "value1" # Exists in schema - Kept
    field2: "value2" # Not in schema - Discarded
    TimeGenerated: "..." # Required field - Always kept
- Cost Optimization
  - Prevents unnecessary data ingestion
  - Reduces storage costs
  - Maintains data accessibility
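The validation step described above can be sketched in a few lines of Python. This is an illustration only: `filter_event` and `REQUIRED_FIELDS` are hypothetical names, and the schema is assumed to be available as a plain set of column names once discovery has run.

```python
from typing import Any, Dict, List, Set, Tuple

# Assumed to be always kept, following the mapping example above.
REQUIRED_FIELDS: Set[str] = {"TimeGenerated"}


def filter_event(event: Dict[str, Any],
                 schema_columns: Set[str]) -> Tuple[Dict[str, Any], List[str]]:
    """Split an event into the fields to send and the names of dropped fields."""
    kept: Dict[str, Any] = {}
    dropped: List[str] = []
    for name, value in event.items():
        if name in schema_columns or name in REQUIRED_FIELDS:
            kept[name] = value
        else:
            dropped.append(name)  # would have become a phantom field
    return kept, dropped
```

Returning the dropped names alongside the kept fields makes it straightforward to log what was discarded, which helps when a schema is missing a column you expected.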
Why Phantom Fields Matter
- Cost Impact
  - Each phantom field increases ingestion costs
  - Data remains inaccessible despite being paid for
  - Some environments show up to 65% of table data as phantom fields
- Common Scenarios Prevented
  - Log splitting with mismatched schemas
  - Temporary fields in transformations
  - Duplicate fields from improper field mapping
  - Schema modifications without proper cleanup
- Automatic Protection
  // Example of how Director handles fields internally
  if !schemaContainsField(tableName, fieldName) {
      // Field not in schema - discard to prevent phantom field
      continue
  }
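To estimate the cost impact for your own data, one rough approach is to measure what share of a sample of events falls outside the table schema. The Python sketch below is a hypothetical helper, not a Director feature. It approximates size by the JSON-encoded length of each field, which is an assumption and will not match billed ingestion volume exactly.

```python
import json
from typing import Any, Dict, Iterable, Set


def phantom_share(events: Iterable[Dict[str, Any]],
                  schema_columns: Set[str]) -> float:
    """Return the fraction of encoded bytes that belong to fields outside the schema."""
    total = 0
    phantom = 0
    for event in events:
        for name, value in event.items():
            # Approximate the field's size by its JSON encoding (name and value).
            size = len(json.dumps({name: value}).encode("utf-8"))
            total += size
            if name not in schema_columns:
                phantom += size
    return phantom / total if total else 0.0
```

Running this on a representative sample before and after a schema change gives a quick indication of whether new fields are reaching the table or being lost.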
Best Practices with Autodiscovery
- Schema Management
  - Regularly review table schemas
  - Update schemas when adding new fields
  - Use autodiscovery to validate field usage
- Field Mapping
  - Let autodiscovery handle field validation
  - Define critical fields explicitly when needed
  - Monitor for any dropped fields in logs
- Cost Monitoring
  - Track ingestion volumes
  - Monitor field usage patterns
  - Verify data accessibility
Configuration
Basic Setup
targets:
  - name: sentinel
    type: sentinel
    properties:
      tenant_id: "<your-tenant-id>"
      client_id: "<your-client-id>"
      client_secret: "<your-client-secret>"
      endpoint: "/subscriptions/.../dataCollectionEndpoints/<dce-name>" # Use Resource ID instead of URL
Filtering Specific Streams
You can filter which autodiscovered streams to use:
targets:
  - name: sentinel
    type: sentinel
    properties:
      tenant_id: "<your-tenant-id>"
      client_id: "<your-client-id>"
      client_secret: "<your-client-secret>"
      endpoint: "/subscriptions/.../dataCollectionEndpoints/<dce-name>"
      streams:
        - name: "Custom-WindowsEvent"
        - name: "Custom-SecurityEvent"
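Conceptually, the streams list acts as an allow-list over whatever autodiscovery finds. The Python sketch below illustrates that idea; `select_streams` and `missing_streams` are hypothetical helpers, and the behavior when no streams are configured (use every discovered stream) is an assumption rather than documented behavior.

```python
from typing import List, Optional


def select_streams(discovered: List[str],
                   configured: Optional[List[str]]) -> List[str]:
    """Keep only the discovered streams that appear in the configured list."""
    if not configured:
        # Assumption: with no filter configured, all discovered streams are used.
        return list(discovered)
    wanted = set(configured)
    return [name for name in discovered if name in wanted]


def missing_streams(discovered: List[str], configured: List[str]) -> List[str]:
    """Return configured stream names that autodiscovery did not find."""
    found = set(discovered)
    return [name for name in configured if name not in found]
```

Checking `missing_streams` is a quick way to catch a misspelled stream name in the configuration.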
Cache Configuration
Optionally adjust the cache timeout (in seconds):
targets:
  - name: sentinel
    type: sentinel
    properties:
      endpoint: "/subscriptions/.../dataCollectionEndpoints/<dce-name>"
      cache:
        timeout: 300 # 5 minutes (default)