Skip to main content

Azure Fabric and SQL Database Mirroring

Microsoft Fabric's Mirroring capability allows you to replicate data from Azure SQL Database to Fabric OneLake in near real-time, making it immediately available for analytics tools like Power BI.

This article explains how mirroring works, its impact on production databases, and how it differs from traditional Power BI connection modes.

Overview of Microsoft Fabric and Mirroring

Microsoft Fabric is an all-in-one SaaS platform that unifies data engineering, integration, science, analytics, and BI.

Mirroring is a feature that continuously and automatically replicates data from an operational database, such as Azure SQL Database, to Fabric's OneLake (unified data lake). Data is stored in the open Delta Parquet format, making it directly accessible to all Fabric workloads (SQL Analytics Endpoint, Spark, Power BI).

Key Features

  • No Code / Zero ETL: No need to build or manage complex ETL pipelines (like Data Factory).
  • Near Real-Time: Data changes are reflected in Fabric within seconds to minutes.
  • Direct Lake: Power BI reads Delta Parquet files directly from OneLake without importing data, providing performance comparable to Import mode.

Architecture and Mechanism

Azure SQL Database Mirroring relies on transaction log change detection rather than traditional SQL triggers or scheduled queries.

  1. Change Detection: An internal Change Feed feature within the Azure SQL Database engine scans the Transaction Log to identify data changes (INSERT, UPDATE, DELETE).
  2. Transfer to Landing Zone: Changed data is pushed to a Fabric-managed Landing Zone.
  3. Apply to Delta Tables: A replicator on the Fabric side processes the changes and merges them into Delta Parquet tables in OneLake.

This process is provided as a fully managed service, so users do not need to worry about replication details.

Impact on Production Database

The most critical concern when implementing mirroring is: "Does it impact production transaction processing (OLTP)?"

Overall Assessment

Compared to traditional "ETL running full or incremental SELECT queries," mirroring is designed to have a significantly lower load on the production database. However, the impact is not zero.

Resource Consumption (CPU/IO)

  • Log Scanning: Scans the transaction log, consuming CPU and IO resources on the source database.
  • Initial Snapshot: When mirroring starts (or is re-seeded), all table data is read, potentially causing a temporary high load.
  • Backoff Mechanism: If the system detects overload, the mirroring process has a mechanism to automatically back off (throttle speed) to prioritize the production workload.

Impact on Log Space (Critical)

  • Log Truncation Delay: If mirroring stops or falls significantly behind, transaction log truncation may be held to preserve untransferred log records. This creates a risk of log file growth, potentially leading to disk space exhaustion and preventing updates.
  • This behavior is similar to Replication or CDC (Change Data Capture).

Conclusion

  • It is safe to use for general workloads.
  • For systems with extremely heavy write traffic or little available resource margin (consistently high DTU/vCore utilization), a Proof of Concept (PoC) is recommended.

Comparison with Power BI Connection Modes

Traditionally, "Import" and "DirectQuery" were the primary ways to connect Power BI to Azure SQL Database. Mirroring (Direct Lake) is a third, and often optimal, option.

FeatureMirroring (Direct Lake)Import ModeDirectQuery
Source DB LoadLow - Medium
Efficient log-based replication.
Analytical queries do not hit the source.
Temporarily High
Heavy SELECTs run during refresh.
Analytical queries do not hit the source.
High - Very High
Queries hit source for every report action.
Dangerous with high concurrency.
Data FreshnessNear Real-Time
(Seconds to minutes latency)
Low
Dedepends on scheduled refresh (e.g., 8x/48x daily).
Real-Time
Always shows the latest state.
Report SpeedVery Fast
Reads Parquet via Fabric compute.
Very Fast
Reads from Power BI in-memory.
Slow
Depends on source DB and network.
ETL RequirementNone (Auto-sync)Required (Power Query)None

Recommended Use Cases:

  • Mirroring: When near real-time analytics are needed, and you want zero analytical query load on the source DB, or are handling large datasets.
  • Import: For standalone Power BI reports with small data volumes not using Fabric.
  • DirectQuery: Only when strict real-time (sub-second) is mandatory. Performance tuning is often difficult.

Billing and Cost Implications

Each connection mode influences billing differently.

  1. Mirroring (Fabric)

    • OneLake Storage: You are billed for the storage of the replicated data copy (comparable to Azure Data Lake Storage Gen2 rates).
    • Fabric Capacity (Compute): Querying the data (e.g., via Power BI Direct Lake) consumes Capacity Units (CUs).
    • Replication Process: The compute used for the replication process itself (moving data from SQL DB to OneLake) is currently free (does not consume CUs).
  2. Import Mode

    • Power BI License: Requires Pro, Premium Per User, or Premium Capacity depending on dataset size.
    • Hidden Costs: You may need to provision a higher tier (more vCores) for the source database just to handle the intense resource spikes during scheduled refresh windows. Data egress fees apply if regions differ.
  3. DirectQuery

    • Source DB Resources: Since every report interaction triggers a query, you often need to significantly over-provision the source database to maintain acceptable dashboard performance, making this potentially the most expensive option for the database side.

Limitations and Prerequisites

Check the following limitations before implementation (based on 2024 information):

  1. Tier Limits:
    • vCore model: Supported on all tiers.
    • DTU model: Less than 100 DTUs (Basic, Standard S0-S2) are not supported. S3 or higher is required.
  2. Exclusivity:
    • Cannot be enabled if CDC (Change Data Capture) or Azure Synapse Link for SQL is already enabled on the database.
  3. Network:
    • Special attention is needed if the logical server is in a VNet or has public access disabled.
  4. Table Requirements:
    • Tables with Primary Keys are recommended (support for heap tables is rolling out, but PK is best practice).
    • Constraints may apply to specific data types (JSON, Geometry, etc.) or features (Temporal Tables).

Security and Authentication (Best Practice)

Microsoft recommends using Workspace Identity as the best practice for connecting to SQL Database.

Using Workspace Identity eliminates the need for password management and minimizes security risks by synchronizing permissions with the workspace lifecycle (deleting the workspace removes access).

Prerequisites:

  • The "Use workspace identity" setting must be enabled in the Fabric Admin Portal.
  • A Microsoft Entra Admin must be configured on the active Azure SQL Database logical server.

User Creation and Permission Assignment (Least Privilege Approach)

Connect to the SQL Database as the Entra Admin and execute the following queries. While db_owner is an option, the following role configuration allows for more granular permission control.

  1. Operations in Master Database

    Grant server-level read permissions.

    -- Execute in Master database as Admin
    -- * Replace [MyFabricWorkspace] with your actual Fabric Workspace name
    CREATE LOGIN [MyFabricWorkspace] FROM EXTERNAL PROVIDER;
    ALTER SERVER ROLE [##MS_ServerStateReader##] ADD MEMBER [MyFabricWorkspace];
  2. Operations in Target Database

    Grant permissions on the database to be mirrored.

    -- Execute in the target user database
    CREATE USER [MyFabricWorkspace] FOR LOGIN [MyFabricWorkspace];

    -- Grant permissions required for mirroring configuration and data reading
    GRANT SELECT, ALTER ANY EXTERNAL MIRROR TO [MyFabricWorkspace];

Authentication Method Comparison

MethodRecommendationPros & Cons
Workspace IdentityRecommendedNo passwords. Security lifecycle is automated.
SQL AuthenticationNot RecommendedRequires regular password rotation and leak risk management.
Service PrincipalAcceptableGood for automation, but requires managing App Registrations and secrets separately.

Summary

Azure Fabric Mirroring is a powerful feature that breaks down the walls between operational databases and analytics platforms.

  • Load Isolation: Offload analytical query load from the production DB to OneLake in Fabric.
  • Freshness & Speed: Combines the real-time nature of DirectQuery with the performance of Import Mode (Direct Lake).
  • Simplicity: Frees you from managing complex ETL pipelines.

If you are facing refresh delays or source DB load issues in your existing Power BI environment, migrating to Fabric Mirroring is worth considering.