Azure Data Share

  • Author: Ronald Fung

  • Creation Date: 15 May 2023

  • Next Modified Date: 15 May 2024


A. Introduction

Azure Data Share enables organizations to securely share data with multiple customers and partners. Data providers are always in control of the data that they’ve shared and Azure Data Share makes it simple to manage and monitor what data was shared, when and by whom.

In today’s world, data is viewed as a key strategic asset that many organizations need to simply and securely share with their customers and partners. There are many ways that customers do this today, including FTP, e-mail, and APIs, to name a few. Organizations can easily lose track of who they’ve shared their data with. Sharing data through FTP or through standing up their own API infrastructure is often expensive to provision and administer, and these methods carry management overhead when used on a large scale. In addition to accountability, many organizations would like to be able to control, manage, and monitor all of their data sharing in a simple way that stays up to date, so they can derive timely insights.

Using Data Share, a data provider can share data and manage their shares all in one place. They can stay in control of how their data is handled by specifying terms of use for their data share. The data consumer must accept these terms before being able to receive the data. Data providers can specify the frequency at which their data consumers receive updates. Access to new updates can be revoked at any time by the data provider.

Azure Data Share helps enhance insights by making it easy to combine data from third parties to enrich analytics and AI scenarios. Easily use the power of Azure analytics tools to prepare, process, and analyze data shared with Azure Data Share.

Both the data provider and data consumer must have an Azure subscription to share and receive data. If you don’t have an Azure subscription, create a free account.


B. How is it used at Seagen

As a biopharma research company using Microsoft Azure, you can use Azure Data Share to securely and easily share data with other organizations or teams. Here are some ways you can use Azure Data Share:

  1. Share data with collaborators: Azure Data Share can be used to securely share data with other organizations or teams. You can use Azure Data Share to share data with research partners, universities, or other organizations that are involved in your research process.

  2. Control access to data: Azure Data Share allows you to control access to your data and specify who can access and use the data. You can use Azure Active Directory to manage access to your data and set up granular permissions to control what data can be accessed and by whom.

  3. Automate data sharing: Azure Data Share allows you to automate the data sharing process and schedule data transfers to occur at regular intervals. You can use Azure Logic Apps or Azure Functions to trigger data transfers based on specific events or schedules.

  4. Monitor data sharing: Azure Data Share provides monitoring and auditing capabilities to help you track data sharing activities and ensure compliance with data sharing policies. You can use the Azure portal or APIs to view detailed reports and analysis of data sharing activities.

  5. Secure data sharing: Azure Data Share provides security features such as encryption and secure transfer protocols to ensure that your data is protected during the data sharing process. You can use Azure Key Vault to store your encryption keys and ensure that your data is protected.


C. Features

Azure Data Share currently offers snapshot-based sharing and in-place sharing.


Snapshot-based sharing

In snapshot-based sharing, data moves from the data provider’s Azure subscription and lands in the data consumer’s Azure subscription. As a data provider, you provision a data share and invite recipients to the data share. Data consumers receive an invitation to your data share via e-mail. Once a data consumer accepts the invitation, they can trigger a full snapshot of the data shared with them. This data is received into the data consumer’s storage account. Data consumers can receive regular, incremental updates to the data shared with them so that they always have the latest version of the data.

Data providers can offer their data consumers incremental updates to the data shared with them through a snapshot schedule. Snapshot schedules are offered on an hourly or a daily basis. When a data consumer accepts and configures their data share, they can subscribe to a snapshot schedule. This is beneficial in scenarios where the shared data is updated regularly, and the data consumer needs the most up-to-date data.
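
The effect of a snapshot schedule can be sketched as follows. This is a minimal illustration and not the service's implementation: `Recurrence`, `SnapshotSchedule`, and `next_runs` are hypothetical names, and the sketch assumes a schedule is simply a start time plus an hourly or daily interval.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import List


class Recurrence(Enum):
    """The two intervals on which snapshot schedules are offered."""
    HOURLY = timedelta(hours=1)
    DAILY = timedelta(days=1)


@dataclass(frozen=True)
class SnapshotSchedule:
    """Hypothetical model: a start time plus a fixed recurrence interval."""
    start: datetime
    recurrence: Recurrence

    def next_runs(self, after: datetime, count: int) -> List[datetime]:
        """Return the first `count` scheduled times strictly later than `after`."""
        step = self.recurrence.value
        if after < self.start:
            first = self.start
        else:
            # Whole intervals elapsed since the start, then move one past that.
            elapsed = (after - self.start) // step
            first = self.start + (elapsed + 1) * step
        return [first + i * step for i in range(count)]


schedule = SnapshotSchedule(
    start=datetime(2023, 5, 15, tzinfo=timezone.utc),
    recurrence=Recurrence.DAILY,
)
upcoming = schedule.next_runs(datetime(2023, 5, 16, 12, tzinfo=timezone.utc), 2)
```

With a daily schedule, a consumer's copy can be up to a day behind the source between runs, which is the trade-off to weigh when choosing between the hourly and daily options.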

When a data consumer accepts a data share, they’re able to receive the data in a data store of their choice. For example, if the data provider shares data using Azure Blob Storage, the data consumer can receive this data in Azure Data Lake Store. Similarly, if the data provider shares data from Azure Synapse Analytics, the data consumer can choose whether to receive the data into Azure Data Lake Store, Azure SQL Database, or Azure Synapse Analytics. If sharing from SQL-based sources, the data consumer can also choose whether they receive data in Parquet or CSV format.

In-place sharing

With in-place sharing, data providers can share data where it resides without copying the data. After a sharing relationship is established through the invitation flow, a symbolic link is created between the data provider’s source data store and the data consumer’s target data store. The data consumer can read and query the data in real time using their own data store. Changes to the source data store are available to the data consumer immediately. In-place sharing is currently available for Azure Data Explorer.
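
The difference between the two sharing modes can be shown with a toy model. This is only a conceptual sketch in plain Python, not how the service is built: `SourceStore`, `SnapshotConsumer`, and `InPlaceConsumer` are hypothetical classes, and an in-memory list stands in for a real data store.

```python
class SourceStore:
    """Hypothetical stand-in for the data provider's data store."""

    def __init__(self, rows):
        self.rows = list(rows)


class SnapshotConsumer:
    """Keeps its own copy, refreshed only when a snapshot is triggered."""

    def __init__(self, source):
        self._source = source
        self.rows = []

    def trigger_snapshot(self):
        # Snapshot-based sharing: the data is copied to the consumer's side.
        self.rows = list(self._source.rows)


class InPlaceConsumer:
    """Keeps only a link to the provider's data; nothing is copied."""

    def __init__(self, source):
        self._source = source

    @property
    def rows(self):
        # In-place sharing: every read goes to the source (read-only view).
        return tuple(self._source.rows)


source = SourceStore(["batch-1"])
snapshot_consumer = SnapshotConsumer(source)
snapshot_consumer.trigger_snapshot()
in_place_consumer = InPlaceConsumer(source)

source.rows.append("batch-2")  # the provider updates the source

print(snapshot_consumer.rows)  # still the copy taken before the update
print(in_place_consumer.rows)  # reflects the update immediately
```

The snapshot consumer sees `batch-2` only after triggering another snapshot, whereas the in-place consumer sees it on the next read.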

Key capabilities

Azure Data Share enables data providers to:

  • Share data from the list of supported data stores with customers and partners outside of your organization

  • Keep track of who you have shared your data with

  • Choose between snapshot and in-place sharing

  • Specify how frequently your data consumers receive updates to your data

  • Allow your customers to pull the latest version of your data as needed, or allow them to automatically receive incremental changes to your data at an interval defined by you

Azure Data Share enables data consumers to:

  • View a description of the type of data being shared

  • View terms of use for the data

  • Accept or reject an Azure Data Share invitation

  • Accept data shared with you into a supported data store

  • Access data in place or trigger a full or incremental snapshot of shared data

All key capabilities listed above are supported through the Azure portal or via REST APIs. For more details on using Azure Data Share through REST APIs, see the Azure Data Share REST API reference documentation.
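
As a rough idea of what a REST call could look like, the sketch below only builds (and does not send) a request that asks a share subscription to synchronize, that is, to trigger a snapshot. Everything specific here is an assumption to verify against the reference documentation before use: the Azure Resource Manager path shape, the `api-version` value, and the `synchronizationMode` values. The function name `build_synchronize_request` and all resource names are hypothetical. A real call would also need an Azure AD bearer token in the `Authorization` header.

```python
import json
from urllib.parse import quote, urlencode

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2021-08-01"  # assumption: confirm in the reference documentation
SYNC_MODES = ("Incremental", "FullSync")  # assumption: confirm in the reference


def build_synchronize_request(subscription_id, resource_group, account_name,
                              share_subscription_name, mode="Incremental"):
    """Return (method, url, body) for a snapshot trigger; nothing is sent."""
    if mode not in SYNC_MODES:
        raise ValueError(f"mode must be one of {SYNC_MODES}, got {mode!r}")
    segments = [
        "subscriptions", subscription_id,
        "resourceGroups", resource_group,
        "providers", "Microsoft.DataShare",
        "accounts", account_name,
        "shareSubscriptions", share_subscription_name,
        "synchronize",
    ]
    # Percent-encode each segment so unusual names cannot break the path.
    path = "/".join(quote(segment, safe="") for segment in segments)
    url = f"{ARM_ENDPOINT}/{path}?{urlencode({'api-version': API_VERSION})}"
    body = json.dumps({"synchronizationMode": mode})
    return "POST", url, body


method, url, body = build_synchronize_request(
    subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder
    resource_group="rg-example",                             # placeholder
    account_name="datashare-account-example",                # placeholder
    share_subscription_name="share-subscription-example",    # placeholder
)
```

Keeping request construction separate from sending makes the URL and payload easy to inspect and unit test without network access or credentials.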


D. Where Implemented

LeanIX


E. How it is tested

Testing Azure Data Share involves ensuring that the data sharing service is functioning correctly, securely, and meeting the needs of all stakeholders involved in the project. Here are some steps to follow to test Azure Data Share:

  1. Define the scope and requirements: Define the scope of the project and the requirements of all stakeholders involved in the project. This will help ensure that Azure Data Share is designed to meet the needs of all stakeholders.

  2. Develop test cases: Develop test cases that cover all aspects of Azure Data Share functionality, including data sharing, data access, and data management. The test cases should be designed to meet the needs of the organization, including scalability and resilience.

  3. Conduct unit testing: Test the individual components of Azure Data Share to ensure that they are functioning correctly. This may involve using tools like PowerShell or Azure CLI for automated testing.

  4. Conduct integration testing: Test Azure Data Share in an integrated environment to ensure that it works correctly with other systems and applications. This may involve testing Azure Data Share with different operating systems, browsers, and devices.

  5. Conduct user acceptance testing: Test Azure Data Share with end-users to ensure that it meets their needs and is easy to use. This may involve conducting surveys, interviews, or focus groups to gather feedback from users.

  6. Automate testing: Automate testing of Azure Data Share to ensure that it is functioning correctly and meeting the needs of all stakeholders. This may involve using tools like Azure DevOps to set up automated testing pipelines.

  7. Monitor performance: Monitor the performance of Azure Data Share in production to ensure that it is meeting the needs of all stakeholders. This may involve setting up monitoring tools, such as Azure Monitor, to track usage and identify performance issues.

  8. Address issues: Address any issues that are identified during testing and make necessary changes to ensure that Azure Data Share is functioning correctly and meeting the needs of all stakeholders.

By following these steps, you can ensure that Azure Data Share is tested thoroughly and meets the needs of all stakeholders involved in the project. This can help improve the quality of Azure Data Share and ensure that it functions correctly in a production environment.
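
One concrete test case for the data sharing step is to check that what the data consumer received matches what the data provider shared. The sketch below is an assumption about how such a check might be written, not an established procedure: it compares SHA-256 manifests of two local directories that stand in for the provider's and consumer's storage, and the names `manifest`, `diff_manifests`, and `SnapshotIntegrityTest` are hypothetical.

```python
import hashlib
import tempfile
import unittest
from pathlib import Path


def manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifests(provider, consumer):
    """Report files that are missing, unexpected, or different on the consumer side."""
    shared = set(provider) & set(consumer)
    return {
        "missing": sorted(set(provider) - set(consumer)),
        "unexpected": sorted(set(consumer) - set(provider)),
        "changed": sorted(name for name in shared if provider[name] != consumer[name]),
    }


class SnapshotIntegrityTest(unittest.TestCase):
    def test_consumer_copy_matches_provider(self):
        with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
            # Identical sample files stand in for a completed snapshot.
            for root in (Path(src), Path(dst)):
                (root / "assay.csv").write_text("id,value\n1,0.5\n")
            result = diff_manifests(manifest(src), manifest(dst))
            self.assertEqual(result, {"missing": [], "unexpected": [], "changed": []})
```

The test can be run with `python -m unittest` and added to an automated pipeline. In a real environment, the two directories would be replaced by listings of the provider's source and the consumer's target data stores.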


F. 2023 Roadmap

????


G. 2024 Roadmap

????


H. Known Issues

There are several known issues that can impact Azure Data Share. Here are some of the most common issues to be aware of:

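  1. Data accuracy issues: With snapshot-based sharing, the data consumer’s copy is only as current as the most recent snapshot, so it can lag behind the source between scheduled updates. Choose a snapshot schedule (hourly or daily) that matches how fresh the data needs to be, or use in-place sharing where it is available.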

  2. Security issues: Security is a critical concern when it comes to Azure Data Share. It is important to ensure that Azure Data Share is secured and that access to the solution is restricted to authorized personnel.

  3. Integration issues: Integration issues can arise when integrating Azure Data Share with other systems and applications. It is important to ensure that Azure Data Share is designed to work seamlessly with other systems and applications to avoid integration issues.

  4. Compatibility issues: Azure Data Share may not be compatible with all platforms, devices, or languages. It is important to ensure that Azure Data Share is compatible with the organization’s existing infrastructure before implementation.

  5. Performance issues: If the data sharing service is not properly sized, it can impact performance and availability, causing issues with the speed and reliability of Azure Data Share.

  6. Metadata issues: Metadata issues can arise when metadata is not updated in real time. It is important to keep metadata current to avoid metadata accuracy issues.

  7. Testing issues: Gaps in test coverage can let defects reach production. It is important to ensure that testing is carried out thoroughly and that all aspects of Azure Data Share functionality are tested.

Overall, Azure Data Share requires careful planning and management to ensure that it is functioning correctly and meeting the needs of all stakeholders involved in the project. By being aware of these known issues and taking steps to address them, you can improve the quality of Azure Data Share and ensure the success of your project.
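
The data accuracy concern above can be turned into a simple, measurable check. The sketch below flags a share as stale when its last successful snapshot is older than the schedule interval plus a tolerance. It is illustrative only: `is_stale` is a hypothetical helper, the 15-minute tolerance is an arbitrary assumption, and where the last-success timestamp comes from (for example, the portal or an API) is left open.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_success, interval, now, tolerance=timedelta(minutes=15)):
    """Return True if the last successful snapshot is older than allowed.

    `interval` is the snapshot schedule interval (for example, one hour or
    one day); `tolerance` is an assumed grace period for the run itself.
    """
    return now - last_success > interval + tolerance


now = datetime(2023, 5, 15, 12, 0, tzinfo=timezone.utc)
last_success = datetime(2023, 5, 15, 10, 30, tzinfo=timezone.utc)

# Hourly schedule: 90 minutes since the last success exceeds 60 + 15 minutes.
hourly_stale = is_stale(last_success, timedelta(hours=1), now)
# Daily schedule: the same timestamp is well within the allowed window.
daily_stale = is_stale(last_success, timedelta(days=1), now)
```

A check like this could feed an alert so that a missed snapshot is noticed before consumers rely on outdated data.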


[x] Reviewed by Enterprise Architecture

[x] Reviewed by Application Development

[x] Reviewed by Data Architecture