
Federated Search

Modern security and observability workflows generate data everywhere — on-premises file systems, object storage buckets, edge collectors, and dozens of other locations. Traditionally, acting on that data meant ingesting it into a central SIEM first. That creates cost, latency, and complexity: you pay to move and store data before you even know whether it is worth keeping.

Federated Search changes that. It lets you query data directly where it lives, using the same query language you already know, without requiring any upfront ingestion or centralization. You get answers faster, spend less on storage pipelines, and retain full control over where your data resides.


How It Works

VirtualMetric Federated Search runs queries against local and remote storage targets — such as local filesystems, S3-compatible buckets, and other structured data sources — at query time. Instead of pulling data into a pipeline and forwarding it to a destination, the query travels to the data. Only the results come back.
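
As a rough illustration of the "query travels to the data" idea, the sketch below uses Python's built-in sqlite3 module as a stand-in for a storage target. The table `firewall_logs`, its columns, the sample rows, and the helper `query_in_place` are all hypothetical and are not part of VirtualMetric. The only point is that filtering and aggregation run next to the data, and just the result rows come back.

```python
import sqlite3


def make_storage_target() -> sqlite3.Connection:
    """Build a stand-in 'storage target' holding a few raw log rows (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE firewall_logs (src_ip TEXT, action TEXT, bytes INTEGER)"
    )
    conn.executemany(
        "INSERT INTO firewall_logs VALUES (?, ?, ?)",
        [
            ("10.0.0.5", "deny", 120),
            ("10.0.0.5", "deny", 80),
            ("10.0.0.9", "allow", 4000),
            ("10.0.0.7", "deny", 60),
        ],
    )
    return conn


def query_in_place(conn: sqlite3.Connection, action: str) -> list:
    """Filter and aggregate where the data lives; return only the result rows."""
    sql = (
        "SELECT src_ip, COUNT(*) AS hits FROM firewall_logs "
        "WHERE action = ? GROUP BY src_ip ORDER BY hits DESC, src_ip"
    )
    return conn.execute(sql, (action,)).fetchall()


target = make_storage_target()
denied = query_in_place(target, "deny")
print(denied)
```

Four raw rows stay in the target; the caller receives two summary rows.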

This works in two contexts within the VirtualMetric platform.

Pipeline Enrichment — The enrich processor can execute KQL or SQL queries inline during pipeline processing, joining live event data against lookup tables, CSV files, or datasets sitting in local storage. No separate ingestion step is needed.

Federated Query — Queries can be issued directly against storage targets outside of pipeline processing, letting you explore, audit, or investigate data that has never been forwarded anywhere.

Both modes use the same KQL query interface, so there is no new language to learn if you are already working with Microsoft Sentinel or Azure Data Explorer.
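
The enrichment pattern described above can be sketched in a few lines of Python. This is not the enrich processor's configuration or API, which this page does not show. It is a minimal sketch, assuming a CSV lookup with hypothetical columns `ip`, `owner`, and `criticality`, loaded into an in-memory SQLite table and joined to each event by a SQL lookup.

```python
import csv
import io
import sqlite3

# Hypothetical lookup data, standing in for a CSV file in local storage.
ASSET_CSV = """ip,owner,criticality
10.0.0.5,payments-team,high
10.0.0.9,build-farm,low
"""


def load_lookup(csv_text: str) -> sqlite3.Connection:
    """Load the CSV lookup into an in-memory table that can be queried with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assets (ip TEXT PRIMARY KEY, owner TEXT, criticality TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["ip"], r["owner"], r["criticality"]) for r in reader]
    conn.executemany("INSERT INTO assets VALUES (?, ?, ?)", rows)
    return conn


def enrich(event: dict, conn: sqlite3.Connection) -> dict:
    """Return a copy of the event with owner/criticality added when the IP is known."""
    row = conn.execute(
        "SELECT owner, criticality FROM assets WHERE ip = ?", (event["src_ip"],)
    ).fetchone()
    enriched = dict(event)
    if row is not None:
        enriched["owner"], enriched["criticality"] = row
    return enriched


lookup = load_lookup(ASSET_CSV)
events = [
    {"src_ip": "10.0.0.5", "action": "deny"},
    {"src_ip": "192.168.1.1", "action": "allow"},  # not in the lookup
]
enriched_events = [enrich(e, lookup) for e in events]
print(enriched_events)
```

Events with no match in the lookup pass through unchanged, which is one reasonable design choice; a real pipeline might instead tag them as unknown.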


Why It Matters

Stop Paying to Move Data You May Never Need

Not every log line is worth the cost of ingestion and long-term SIEM retention. With Federated Search, you can leave cold or low-priority data in cheap local or object storage and query it on demand. If an investigation requires it, you query it. If it never becomes relevant, you never paid to move it.

Faster Investigations Without Centralization Lag

When an incident occurs, waiting for a pipeline to backfill historical data into a SIEM can cost hours. Federated Search queries the source directly, so there is no lag between "data exists" and "data is queryable."

Use the Query Language You Already Know

VirtualMetric's federated query engine is built around Kusto Query Language (KQL) — the same language used in Microsoft Sentinel and Azure Data Explorer. Security teams already writing Sentinel detection rules, hunting queries, and workbooks can apply that knowledge directly to local storage without translation or retraining.

Keep Data Where Compliance Requires It

Some data cannot leave a region, a network segment, or an on-premises environment. Federated Search respects those boundaries. The data does not move; the query does.

Reduce SIEM Ingestion Costs

SIEM pricing is typically tied to ingestion volume. By keeping lower-priority or high-volume telemetry in local storage and querying it through Federated Search only when needed, you can significantly reduce the volume of data flowing into your SIEM without losing the ability to investigate it.


Normalized Schema Views

One of the most significant challenges in multi-source querying is that raw data arrives in vendor-specific, native formats. A firewall log from one vendor looks nothing like one from another, even when both are describing the same network event. Writing detection queries against raw formats means writing and maintaining a different query for every source.

VirtualMetric solves this by exposing your data — regardless of its original format — through normalized schema views. Even though the underlying source data remains in its native structure, VirtualMetric presents it through familiar, standardized table schemas at query time. You write your query once, against a well-known schema, and it works across all sources mapped to that schema.

This means a detection query written for Microsoft Sentinel against SecurityEvent or ASIMNetworkSessionLogs can run as-is inside VirtualMetric — against live streaming data or data sitting in local storage — without any modification to the query itself.
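
The general idea of a normalized view can be illustrated with plain SQL views. The sketch below is an assumption-laden stand-in, not VirtualMetric's implementation: the two vendor tables, their column names, and the view name `NetworkSession` are hypothetical, and the normalized column names are only loosely modeled on ASIM-style fields. The raw tables keep their native structure, and a single query against the view covers both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Two hypothetical vendors logging the same kind of event differently.
    CREATE TABLE vendor_a_fw (srcaddr TEXT, dstaddr TEXT, act TEXT);
    CREATE TABLE vendor_b_fw (source_ip TEXT, dest_ip TEXT, disposition TEXT);

    INSERT INTO vendor_a_fw VALUES ('10.0.0.5', '8.8.8.8', 'drop');
    INSERT INTO vendor_b_fw VALUES ('10.0.0.7', '1.1.1.1', 'BLOCKED');
    INSERT INTO vendor_b_fw VALUES ('10.0.0.9', '1.1.1.1', 'PERMITTED');

    -- The view maps both native formats onto one schema at query time.
    -- The underlying tables are left untouched.
    CREATE VIEW NetworkSession AS
        SELECT srcaddr AS SrcIpAddr,
               dstaddr AS DstIpAddr,
               CASE act WHEN 'drop' THEN 'Deny' ELSE 'Allow' END AS DvcAction
        FROM vendor_a_fw
        UNION ALL
        SELECT source_ip,
               dest_ip,
               CASE disposition WHEN 'BLOCKED' THEN 'Deny' ELSE 'Allow' END
        FROM vendor_b_fw;
    """
)

# One query, written once against the normalized schema.
denied_sources = conn.execute(
    "SELECT SrcIpAddr FROM NetworkSession "
    "WHERE DvcAction = 'Deny' ORDER BY SrcIpAddr"
).fetchall()
print(denied_sources)
```

Adding a third vendor would mean adding one more branch to the view, while the detection query stays the same.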


KQL Support

VirtualMetric implements a full KQL-to-SQL compilation layer that translates KQL queries into the appropriate SQL dialect for the target storage backend. Supported dialects include SQLite, MySQL, ClickHouse, and PostgreSQL.

This means you can write a single KQL query and run it against any of these backends without changing the query itself. The compiler handles dialect-specific syntax differences, function mappings, and rewrite rules automatically.
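
To make the compilation idea concrete, here is a deliberately tiny toy translator. It is not VirtualMetric's compiler and covers only a hypothetical subset: `where` with `==` against single-quoted literals, `project`, and `take`, assumed to appear in that order. The only dialect difference it models is identifier quoting; real differences such as case sensitivity of string comparison or function mappings are ignored.

```python
import sqlite3

# Identifier quote character per dialect. ClickHouse also accepts double quotes.
QUOTE = {"sqlite": '"', "postgresql": '"', "mysql": "`", "clickhouse": "`"}


def compile_kql(kql: str, dialect: str) -> str:
    """Translate a tiny KQL subset (where ==, project, take) into one SQL string."""
    q = QUOTE[dialect]

    def ident(name: str) -> str:
        return f"{q}{name.strip()}{q}"

    # Naive split: a '|' inside a string literal would break this toy parser.
    table, *stages = [part.strip() for part in kql.split("|")]
    columns, conditions, limit = "*", [], None
    for stage in stages:
        op, _, arg = stage.partition(" ")
        if op == "where":
            col, _, value = arg.partition("==")
            conditions.append(f"{ident(col)} = {value.strip()}")
        elif op == "project":
            columns = ", ".join(ident(c) for c in arg.split(","))
        elif op == "take":
            limit = int(arg)
        else:
            raise ValueError(f"unsupported operator in this sketch: {op}")

    sql = f"SELECT {columns} FROM {ident(table)}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    if limit is not None:
        sql += f" LIMIT {limit}"
    return sql


kql = "SecurityEvent | where Account == 'admin' | project Computer, EventID | take 5"
sqlite_sql = compile_kql(kql, "sqlite")
mysql_sql = compile_kql(kql, "mysql")

# Run the SQLite output against a throwaway table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SecurityEvent (Account TEXT, Computer TEXT, EventID INTEGER)")
conn.executemany(
    "INSERT INTO SecurityEvent VALUES (?, ?, ?)",
    [("admin", "dc01", 4625), ("alice", "ws07", 4624)],
)
rows = conn.execute(sqlite_sql).fetchall()
print(sqlite_sql)
print(mysql_sql)
print(rows)
```

The same KQL text yields double-quoted identifiers for SQLite and backtick-quoted identifiers for MySQL, which is the kind of per-dialect rewrite a compilation layer has to own so that the query author does not.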

For a full breakdown of which KQL operators, functions, and statements are supported per dialect, see the KQL Support Matrix.


Early Threat Detection Roadmap — 2026 Q3

The long-term vision for Federated Search goes beyond ad-hoc querying. Early Threat Detection will allow security teams to run detection queries — including existing Microsoft Sentinel analytic rules — continuously against both live streaming data and local storage, before data ever reaches a SIEM.

This closes two critical gaps in most security architectures.

The first is the storage gap: logs sitting in a local collector or an on-premises storage tier waiting to be forwarded are currently blind spots. Early Threat Detection will make them queryable with the same detection logic you use in Sentinel, so threats hiding in cold or queued data do not go undetected.

The second is the stream gap: even data that is actively flowing through a pipeline is not visible to your SIEM until it arrives. Early Threat Detection will run detection queries directly on the stream, against normalized schema views, so that a match can be flagged the moment it passes through VirtualMetric — not minutes or hours later when it lands in Sentinel.

Because VirtualMetric normalizes source data into standard schema views at the pipeline layer, detection queries do not need to account for vendor-specific field names or formats. A Sentinel analytic rule written against ASIMNetworkSessionLogs will work against traffic from any source that maps to that schema, whether the data is streaming live or sitting in an S3 bucket.
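
Since Early Threat Detection is a roadmap item, the following is only a conceptual sketch of the stream case, not a preview of the actual feature. It assumes events have already been normalized into dictionaries with ASIM-style field names; the function names, the rule, and the sample events are hypothetical. Each event is checked as it passes and is forwarded unchanged, so a match is recorded before the event reaches its destination.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, object]


def detect_on_stream(
    events: Iterable[Event],
    rule: Callable[[Event], bool],
    alerts: List[Event],
) -> Iterator[Event]:
    """Forward every event unchanged, recording a match the moment it is seen."""
    for event in events:
        if rule(event):
            alerts.append(event)
        yield event  # the event still continues to its destination


def denied_dns(event: Event) -> bool:
    """Hypothetical rule: a denied connection to destination port 53."""
    return event.get("DvcAction") == "Deny" and event.get("DstPortNumber") == 53


stream = [
    {"SrcIpAddr": "10.0.0.5", "DstPortNumber": 443, "DvcAction": "Allow"},
    {"SrcIpAddr": "10.0.0.7", "DstPortNumber": 53, "DvcAction": "Deny"},
    {"SrcIpAddr": "10.0.0.9", "DstPortNumber": 53, "DvcAction": "Allow"},
]

alerts: List[Event] = []
forwarded = list(detect_on_stream(stream, denied_dns, alerts))
print(len(forwarded), alerts)
```

Because the rule only reads normalized field names, it does not need to know which vendor produced each event.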

Early Threat Detection is planned for 2026 Q3.

As the first step toward that goal, full KQL support is available now. Microsoft Sentinel customers can bring their detection queries into VirtualMetric pipelines today, through the enrich processor and the federated query interface, and run them against local data sources. When Early Threat Detection ships, those same queries will run automatically and continuously, without manual intervention.

The path forward looks like this:

| Stage | Status | Description |
| --- | --- | --- |
| Full KQL support | ✅ Available now | Write and run KQL queries against local storage via pipelines and federated query |
| Normalized schema views | ✅ Available now | Query any source data through standard schemas like CommonSecurityLog, SecurityEvent, and ASIM tables |
| Microsoft Sentinel query compatibility | ✅ Available now | Sentinel detection rules and hunting queries work without modification |
| Early Threat Detection: Storage | 🗓 Roadmap, 2026 Q3 | Continuous, automated detection against local storage before SIEM ingestion |
| Early Threat Detection: Stream | 🗓 Roadmap, 2026 Q3 | Real-time detection on live streaming data, before it reaches the SIEM |

Next Steps