One of the most common and useful pipelines you’ll build in Microsoft Fabric is moving data from Azure Blob Storage into a Lakehouse. This is often the first step in preparing data for analytics and reporting. In this post, we’ll walk through how to set up and run this pipeline.
Steps Overview
Here’s the high-level process we’ll follow:
- Create a Connection (similar to Linked Services in Azure Data Factory)
- Build a Pipeline and add a Copy activity
- Configure the Source (Azure Blob Storage)
- Configure the Destination (Fabric Lakehouse)
- Run and Validate the Pipeline
Step 1: Create the Connection to Azure Blob Storage
First, we need to create a connection between Fabric and Azure Blob Storage so that the pipeline can access the source files.
- In your Fabric workspace, go to Manage connections and gateways. This page lists all existing connections that you have access to.
- Click New under the Connections tab.
- In the panel that opens on the right, select Cloud, choose Azure Blob Storage as the connection type, and fill in the remaining properties:
- Connection Name: Give it a descriptive name
- Account: Your Azure Storage Account name
- Domain: blob.core.windows.net for a standard Blob Storage account, or dfs.core.windows.net for an ADLS Gen2 (hierarchical namespace) account
- Authentication Method: Choose based on your organization’s standards (e.g., Service Principal or Managed Identity)
- Privacy Level: Set to Organizational or Private
Once saved, the connection is ready and can be used in any pipeline.
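As an optional sanity check, you can verify the storage account and credentials outside of Fabric before creating the connection. The sketch below uses the Azure SDK for Python with a service principal; the tenant, client, account, and container values are placeholders for your own.

```python
# Minimal connectivity check with the Azure SDK for Python
# (pip install azure-identity azure-storage-blob).
# All names below are placeholders -- substitute your own values.
from azure.identity import ClientSecretCredential
from azure.storage.blob import BlobServiceClient

credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)

# The blob endpoint works for this check even if the account is ADLS Gen2.
service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    credential=credential,
)

# List a few blobs to confirm the service principal can read the container.
container = service.get_container_client("<container-name>")
for blob in list(container.list_blobs())[:5]:
    print(blob.name, blob.size)
```

If this lists your files, the same account name and service principal details should work in the Fabric connection.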
Step 2: Create the Pipeline
- Go back to your workspace and create a new Data Pipeline (e.g., name it CopyBlobtoLH).
- From the Home screen, add a Copy activity.
Step 3: Configure the Source
Now, we’ll configure the source settings to point to the files stored in Azure Blob Storage.
- In the Copy activity, choose the connection you just created.
- Under File Path, browse to and select a sample file in Blob Storage (for this demo, we’ll use CSV files from Azure sample datasets).
- Set the file format to Delimited Text (CSV).
- Configure additional file properties if required (similar to dataset settings in ADF) using the Settings option.
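If you're unsure which Delimited Text settings to pick (delimiter, header row, encoding), a quick local look at the file can confirm them. This is an optional helper outside the pipeline itself; sample.csv stands in for a file downloaded from your container.

```python
# Inspect a sample CSV locally to confirm the Delimited Text settings
# before configuring the source. "sample.csv" is a placeholder filename.
import csv

with open("sample.csv", newline="", encoding="utf-8") as f:
    sample = f.read(4096)
    dialect = csv.Sniffer().sniff(sample)          # infer the delimiter
    has_header = csv.Sniffer().has_header(sample)  # check for a header row

print("Delimiter:", repr(dialect.delimiter))
print("First row is a header:", has_header)
```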
Step 4: Configure the Destination (Lakehouse)
Once the source is set, the next step is to configure the destination, which is our Lakehouse.
- From the Connections dropdown, select your Lakehouse.
- Under Destination, choose Tables. If the target table already exists, select it; otherwise click New and provide a table name.
- (Optional) If you want to partition the data, enable Partitioning and specify the partition column. For simplicity, we’ll skip partitioning in this demo.
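If you prefer to load into an existing table rather than letting the Copy activity create one, you can pre-create it from a Fabric notebook attached to the Lakehouse. The table name and schema below are purely illustrative; skip this if you use New in the Copy activity.

```python
# Run in a Fabric notebook attached to the Lakehouse ("spark" is the
# session provided by the notebook). Table and column names are
# illustrative only -- match them to the columns in your CSV.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sample_blob_data (
        id INT,
        name STRING,
        load_date DATE
    )
    USING DELTA
""")
```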
Step 5: Run and Validate
- Save the pipeline and click Run.
- Once execution is complete, go to your Lakehouse, refresh the Tables section, and verify that the data has been loaded into the selected or newly created table.
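Beyond refreshing the Tables pane, a quick notebook check confirms the row count and a few sample rows. The table name below is a placeholder for whatever name you chose in Step 4.

```python
# Run in a Fabric notebook attached to the Lakehouse.
# Replace "sample_blob_data" with the table name you chose in Step 4.
df = spark.read.table("sample_blob_data")

print("Row count:", df.count())  # should match the source CSV row count
df.show(5)                       # spot-check the first few rows
```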
Conclusion
That’s it! With just a few steps, you can copy data from Azure Blob Storage into a Fabric Lakehouse table using pipelines. This is a straightforward way to start working with external data in Fabric. If you run into issues or have questions, feel free to drop a comment below—I’d be happy to help.