Configure ADLS delimited pipelines
To support scalable and efficient schema discovery in Azure Data Lake Storage (ADLS) delimited pipelines, Delphix Compliance Services relies on direct integration between Azure Data Factory (ADF) and the Azure Blob Storage REST API. To allow ADF pipelines to access the Blob Storage API, grant the system-assigned managed identity of your ADF instance a role on the storage account.
Adding a system-assigned managed identity enables ADF pipelines to:
- Connect to the Blob Storage REST API.
- Perform large-scale and recursive directory and file discovery.
- Support schema identification at scale, even in environments with tens of thousands of files and folders.
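To make the discovery idea concrete, here is a minimal Python sketch of how one directory level could be enumerated with the Blob Storage List Blobs operation. It is an illustration only and not necessarily how the ADF pipelines issue their requests. The helper names `list_blobs_url` and `parse_listing` and the account and container names are hypothetical. A real request would also need an `Authorization: Bearer <token>` header obtained through the managed identity and an `x-ms-version` header; neither is shown here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def list_blobs_url(account, container, prefix="", marker=None):
    """Build a List Blobs URL for one directory level (hypothetical helper)."""
    params = {
        "restype": "container",
        "comp": "list",
        "delimiter": "/",  # group deeper paths into BlobPrefix entries
    }
    if prefix:
        params["prefix"] = prefix
    if marker:
        params["marker"] = marker  # continuation token from the previous page
    return f"https://{account}.blob.core.windows.net/{container}?{urlencode(params)}"


def parse_listing(xml_text):
    """Return (files, subdirectories, next_marker) from a List Blobs response body."""
    root = ET.fromstring(xml_text)
    files = [e.findtext("Name") for e in root.iter("Blob")]
    subdirs = [e.findtext("Name") for e in root.iter("BlobPrefix")]
    # An empty NextMarker element means there are no more pages.
    next_marker = root.findtext("NextMarker") or None
    return files, subdirs, next_marker
```

Recursive discovery would repeat this for each returned subdirectory prefix and follow `next_marker` until it is empty, which is why the identity needs read access across the whole storage account.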
Steps
- In the Azure portal, go to your storage account, and click Access Control (IAM).
- In the Access Control (IAM) pane, click Add role assignment.
- In Role, search for and select Storage Blob Data Contributor, then click Next.
- In Assign access to, select Managed identity.
- Click Select members and add a managed identity for the Data Factory (v2).
- Select the appropriate identity and click Select.
- Review the role assignment details, then click Review + assign to complete the process.
Once these steps are completed, your ADF instance has the necessary permissions to securely access the Azure Blob Storage API.
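For teams that script their setup, the portal steps above can be approximated with the Azure CLI command `az role assignment create`. The following Python sketch only builds that command; it is an assumption-laden illustration, not a documented Delphix procedure. The helper names `storage_scope` and `role_assignment_command` are hypothetical, and the subscription, resource group, storage account, and principal ID values are placeholders you would replace with your own.

```python
import shlex


def storage_scope(subscription_id, resource_group, storage_account):
    """Build the Azure resource ID of the storage account, used as the role scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )


def role_assignment_command(principal_id, scope):
    """Return the az CLI arguments that assign Storage Blob Data Contributor."""
    return [
        "az", "role", "assignment", "create",
        # Object (principal) ID of the Data Factory's managed identity.
        "--assignee-object-id", principal_id,
        # Managed identities are service principals in Microsoft Entra ID.
        "--assignee-principal-type", "ServicePrincipal",
        "--role", "Storage Blob Data Contributor",
        "--scope", scope,
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own before running the command.
    scope = storage_scope("<subscription-id>", "<resource-group>", "<storage-account>")
    command = role_assignment_command("<adf-principal-id>", scope)
    # Print the command for review instead of executing it.
    print(shlex.join(command))
```

Printing the command rather than running it keeps the sketch safe to try; once the values are verified, the same argument list could be passed to `subprocess.run`.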