Tag Archives: ADLS

SQL Managed Instance Push to Databricks Delta Live Tables via CETAS and APIs

Let’s face it, a lot of a data engineer’s time is spent waiting to see if things executed as expected or for data to be refreshed. We write pipelines, buy expensive replication software, or sometimes manually move files (I hope we aren’t still doing that in this day and age), and in the end all of this has a cost associated with it when working in a cloud environment. In the case of Databricks jobs, we often find ourselves creating clusters just to move data, where the cluster lies dormant for most of the extraction. In my eyes, that’s wasteful and could probably be improved upon.


Querying Delta Lake with T-SQL via Synapse Serverless and Managed Instance

In this blog post I try to demystify how to set up an environment that uses Azure Synapse Serverless, Delta Lake on ADLS Gen2, and SQL Managed Instance so that you can query your Delta Lake with T-SQL as if it were any other SQL source, achieving something similar to PolyBase.
