Hybrid SQL–Synapse Data Mesh Blueprint

Hybrid SQL–Synapse Data Mesh

This blueprint enables a domain-driven data mesh spanning on-prem SQL Server, Azure SQL, and Azure Synapse Analytics, with governed data products that publish read-optimized interfaces to consumers. It balances regulatory controls, performance, and cost using serverless/elastic compute, caching, and layered storage.

When to Use

Logical Architecture

Core Components

ComponentPurposeNotes
Azure Data Factory / Synapse PipelinesBatch ingestion & orchestrationIR for hybrid connectivity; trigger on CDC windows; parameterized pipelines.
SQL MI Link / DMS + CDCLow-latency incremental replicationUse for near-real time ingestion from on-prem SQL to ADLS/Synapse.
ADLS Gen2 (Raw→Curated→Product)Lake zoning & lifecycleImmutable raw, curated conformed, product for domain contracts & SLAs.
Synapse (Serverless + Dedicated)Virtualization + MPP warehousingServerless for cheap, elastic access; dedicated for heavy, repeatable workloads.
Azure SQL (read replicas)Operational serving & APIsPublish productized tables to consumer apps; optionally read replicas for scale.
Microsoft PurviewCatalog, lineage, DQ policiesRegister domains, enforce PII tags, publish product contracts.
Entra ID + Private EndpointsZero-trust access controlMSI, RBAC, row-level & column-level security; VNet injection.

Domain Data Products

Each domain publishes a contract (schema + semantics + SLOs) and exposes governed interfaces:Synapse views/external tables, Azure SQL tables, or Power BI semantic models.

Security & Compliance

Resilience & DR

Performance & Cost (FinOps)

Reference Diagram

Hybrid SQL–Synapse Data Mesh Diagram

Example: Serverless External Table on ADLS

-- In Synapse Serverless
CREATE EXTERNAL DATA SOURCE ds_adls
WITH ( TYPE = HADOOP, LOCATION = 'abfss://curated@<storageaccount>.dfs.core.windows.net' );

CREATE EXTERNAL FILE FORMAT parquet_format
WITH ( FORMAT_TYPE = PARQUET );

CREATE EXTERNAL TABLE product_payments
WITH (
  LOCATION = '/payments/v1/',
  DATA_SOURCE = ds_adls,
  FILE_FORMAT = parquet_format
)
AS SELECT * FROM OPENROWSET(
  BULK 'abfss://curated@<storageaccount>.dfs.core.windows.net/payments/v1/*.parquet',
  FORMAT='PARQUET'
) AS src;

Example: Contracted Consumer View

CREATE VIEW product_payments_v1 AS
SELECT
  transaction_id,
  merchant_id,
  amount,
  currency,
  txn_ts AT TIME ZONE 'UTC' AS txn_utc,
  customer_id  -- masked / RLS applied downstream
FROM product_payments
WHERE txn_ts &gt;= DATEADD(day, -90, SYSUTCDATETIME());

Operational Runbook (Brief)

  1. Provision Purview, workspaces, storage (raw/curated/product), Key Vault, private endpoints.
  2. Stand up ingestion (CDC + batch) with parameterized pipelines per domain.
  3. Implement DQ checks → promote to Product only on pass.
  4. Expose product contracts (views/tables/models) and register in Purview.
  5. Set SLOs, alerts (latency, freshness, failure), and weekly cost/usage reviews.

Summary

This hybrid SQL–Synapse mesh lets domains publish governed, reliable data products with predictable SLOs while controlling cost via serverless access and tiered storage. It’s a pragmatic path for regulated institutions to modernize without a risky big-bang migration.