Azure Cosmos DB
Author: Ronald Fung
Creation Date: May 12, 2023
Next Modified Date: May 12, 2024
A. Introduction
Today’s applications are required to be highly responsive and always online. To achieve low latency and high availability, instances of these applications need to be deployed in datacenters that are close to their users. Applications need to respond in real time to large changes in usage at peak hours, store ever increasing volumes of data, and make this data available to users in milliseconds.
Azure Cosmos DB is a fully managed NoSQL and relational database for modern app development. It offers single-digit millisecond response times, automatic and instant scalability, and guaranteed speed at any scale. Business continuity is assured with SLA-backed availability and enterprise-grade security.
App development is faster and more productive thanks to:
Turnkey multi-region data distribution anywhere in the world.
Open source APIs.
SDKs for popular languages.
As a fully managed service, Azure Cosmos DB takes database administration off your hands with automatic management, updates and patching. It also handles capacity management with cost-effective serverless and automatic scaling options that respond to application needs to match capacity with demand.
You can try Azure Cosmos DB for free without an Azure subscription and with no commitment, or use the Azure Cosmos DB free tier to get an account with the first 1000 RU/s of throughput and 25 GB of storage free.
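To put the free-tier throughput figure in perspective, the sketch below estimates how many operations per second fit in a request unit (RU) budget. It relies on the documented baseline that a point read of a 1 KB item costs about 1 RU. The 5 RU write cost is an assumed, illustrative figure; actual costs depend on item size, indexing policy, and consistency level.

```python
# Rough capacity estimate for a provisioned-throughput budget.
# The per-operation RU costs are illustrative assumptions, not measurements.

FREE_TIER_RUS = 1000   # RU/s included in the free tier
POINT_READ_RU = 1.0    # baseline: point read of a 1 KB item costs about 1 RU
WRITE_RU = 5.0         # ASSUMED cost of a 1 KB write; varies with indexing


def max_ops_per_second(budget_rus: float, read_fraction: float) -> float:
    """Return the sustainable operations per second for a read/write mix."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    # Average RU cost of a single operation in the mix.
    avg_cost = read_fraction * POINT_READ_RU + (1.0 - read_fraction) * WRITE_RU
    return budget_rus / avg_cost


if __name__ == "__main__":
    for mix in (1.0, 0.8, 0.5):
        ops = max_ops_per_second(FREE_TIER_RUS, mix)
        print(f"{mix:.0%} reads: about {ops:.0f} ops/s")
```

Measuring the real request charge of representative operations and substituting those values gives a far better estimate than the placeholders used here.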
B. How is it used at Seagen
As a biopharma research company using Microsoft Azure, you can use Azure Cosmos DB to store and process large amounts of data, including structured, semi-structured, and unstructured data. Here are some ways you can use Azure Cosmos DB:
Fast and scalable data storage: Azure Cosmos DB can help you to store and process large amounts of data at scale, with low latency and high throughput. It supports multiple data models, including document, key-value, graph, and column-family.
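Multi-cloud and multi-model support: Azure Cosmos DB is a multi-model database service. The service itself runs on Azure, but its APIs for MongoDB, Apache Cassandra, Apache Gremlin, and PostgreSQL are compatible with open source wire protocols, which keeps application code portable across platforms.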
Global distribution: Azure Cosmos DB can replicate your data across multiple regions and provide low-latency access to your data from anywhere in the world.
Security and compliance: Azure Cosmos DB provides security features such as encryption, access control, and network isolation. It also complies with regulatory requirements such as GDPR, HIPAA, and ISO.
Real-time analytics: Azure Cosmos DB can help you to perform real-time analytics on your data using features such as change feed, the aggregation pipeline (API for MongoDB), and graph traversal (API for Apache Gremlin).
Integration with other Azure services: Azure Cosmos DB can integrate with other Azure services such as Azure Functions, Azure Stream Analytics, and Azure AI Search (formerly Azure Cognitive Search). This can help you to extend the functionality of your database and create more complex workflows.
Overall, Azure Cosmos DB can help your biopharma research company to store and process large amounts of data at scale, with low latency and high throughput. With fast and scalable data storage, multi-cloud and multi-model support, global distribution, security and compliance, real-time analytics, and integration with other Azure services, Azure Cosmos DB can help you to create more powerful and efficient applications that meet the needs of your customers and employees.
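As a concrete illustration of the document data model, the sketch below shapes a research record as a JSON document and writes it through a container client. The calls `upsert_item` and `read_item` are methods of the container client in the azure-cosmos Python SDK. The field names, the `/studyId` partition key, and the `InMemoryContainer` class (a stand-in so the sketch runs without an Azure account) are hypothetical.

```python
# Sketch: shaping a record as a JSON document and storing it by partition key.
# With the real SDK, `container` would come from something like
#   CosmosClient(url, credential=key).get_database_client(db).get_container_client(name)
# Field names and the /studyId partition key are assumptions for illustration.
from typing import Any, Dict, Tuple


class InMemoryContainer:
    """Hypothetical stand-in that mimics upsert_item/read_item for local runs."""

    def __init__(self) -> None:
        self._items: Dict[Tuple[str, str], Dict[str, Any]] = {}

    def upsert_item(self, body: Dict[str, Any]) -> Dict[str, Any]:
        # Items are addressed by (partition key value, id).
        self._items[(body["studyId"], body["id"])] = dict(body)
        return body

    def read_item(self, item: str, partition_key: str) -> Dict[str, Any]:
        return self._items[(partition_key, item)]


def build_assay_result(study_id: str, sample_id: str, value: float) -> Dict[str, Any]:
    """Build a schemaless document; 'id' must be unique within a partition."""
    return {
        "id": f"{study_id}-{sample_id}",
        "studyId": study_id,  # partition key value
        "sampleId": sample_id,
        "measurement": {"value": value, "unit": "ng/mL"},
    }


def save_and_reload(container: Any, doc: Dict[str, Any]) -> Dict[str, Any]:
    """Write the document, then fetch it back with a point read."""
    container.upsert_item(doc)
    return container.read_item(item=doc["id"], partition_key=doc["studyId"])


if __name__ == "__main__":
    doc = build_assay_result("S001", "A17", 42.5)
    stored = save_and_reload(InMemoryContainer(), doc)
    print(stored["measurement"])
```

Because the functions only depend on the two method names, the same code can be pointed at a real container client or at a test double.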
C. Features
Guaranteed speed at any scale
Gain unparalleled SLA-backed speed and throughput, fast global access, and instant elasticity.
Real-time access with fast read and write latencies globally, with throughput and consistency all backed by SLAs.
Multi-region writes and data distribution to any Azure region with the click of a button.
Independently and elastically scale storage and throughput across any Azure region – even during unpredictable traffic bursts – for unlimited scale worldwide.
Simplified application development
Build fast with open source APIs, multiple SDKs, schemaless data and no-ETL analytics over operational data.
Deeply integrated with key Azure services used in modern (cloud-native) app development including Azure Functions, IoT Hub, AKS (Azure Kubernetes Service), App Service, and more.
Choose from multiple database APIs including the native API for NoSQL, MongoDB, PostgreSQL, Apache Cassandra, Apache Gremlin, and Table.
Build apps on API for NoSQL in the language of your choice with SDKs for .NET, Java, Node.js, and Python, or use your choice of drivers for any of the other database APIs.
Change feed makes it easy to track and manage changes to database containers and create triggered events with Azure Functions.
Azure Cosmos DB’s schema-less service automatically indexes all your data, regardless of the data model, to deliver blazing fast queries.
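A change feed consumer receives batches of changed documents and should tolerate seeing the same document more than once. The following is a minimal sketch of an idempotent handler, not a definitive implementation. It assumes each document carries the system properties `_etag` and `_ts` that Cosmos DB stamps on items. The function name and the in-memory view are hypothetical; in practice the batch would arrive from an Azure Functions Cosmos DB trigger or from the Python SDK's `query_items_change_feed` method.

```python
# Sketch: idempotent processing of a change feed batch.
# Assumes document ids are unique across the container; otherwise key the
# view by (partition key value, id) instead of id alone.
from typing import Any, Dict, Iterable


def apply_changes(view: Dict[str, Dict[str, Any]],
                  changes: Iterable[Dict[str, Any]]) -> int:
    """Merge changed documents into `view` and return how many were applied.

    A document is skipped when the view already holds the same version
    (equal _etag) or a newer one (larger _ts), so replaying a batch is safe.
    """
    applied = 0
    for doc in changes:
        current = view.get(doc["id"])
        if current is not None:
            if current["_etag"] == doc["_etag"]:
                continue  # duplicate delivery of the same version
            if current["_ts"] > doc["_ts"]:
                continue  # stale version arriving late
        view[doc["id"]] = doc
        applied += 1
    return applied


if __name__ == "__main__":
    view: Dict[str, Dict[str, Any]] = {}
    batch = [
        {"id": "a", "_etag": "e1", "_ts": 100, "status": "new"},
        {"id": "a", "_etag": "e2", "_ts": 101, "status": "done"},
    ]
    print(apply_changes(view, batch))  # first delivery
    print(apply_changes(view, batch))  # replay of the same batch
    print(view["a"]["status"])
```

Keeping the handler a pure function over a batch makes it easy to unit test without a live change feed.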
Mission-critical ready
Guarantee business continuity, 99.999% availability, and enterprise-level security for every application.
Azure Cosmos DB offers a comprehensive suite of SLAs including industry-leading availability worldwide.
Easily distribute data to any Azure region with automatic data replication. Enjoy zero downtime with multi-region writes or RPO 0 when using Strong consistency.
Enjoy enterprise-grade encryption-at-rest with self-managed keys.
Azure role-based access control keeps your data safe and offers fine-tuned control.
Fully managed and cost-effective
End-to-end database management, with serverless and automatic scaling matching your application and TCO needs.
Fully managed database service. Automatic, no-touch maintenance, patching, and updates save developers time and money.
Cost-effective options for unpredictable or sporadic workloads of any size or scale, enabling developers to get started easily without having to plan or manage capacity.
The serverless model gives spiky workloads an automatic, responsive service that handles traffic bursts on demand.
Autoscale provisioned throughput automatically and instantly scales capacity for unpredictable workloads, while maintaining SLAs.
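The cost behavior of autoscale can be sketched with simple arithmetic. The example assumes the commonly documented model in which autoscale scales between 10% of the configured maximum RU/s and the maximum, and each hour is billed at the highest RU/s the system scaled to in that hour. The function name and the sample figures are hypothetical, and no prices are included; check current Azure pricing before relying on this.

```python
# Sketch: RU/s billed per hour under autoscale, under the stated assumptions.
from typing import Iterable


def billed_rus_for_hour(max_rus: int, peak_consumed_rus: float) -> float:
    """Return the RU/s billed for one hour given the hour's peak consumption."""
    if max_rus <= 0:
        raise ValueError("max_rus must be positive")
    floor = max_rus / 10  # ASSUMED minimum: 10% of the configured maximum
    # Billing never drops below the floor and never exceeds the maximum.
    return min(float(max_rus), max(floor, float(peak_consumed_rus)))


def billed_ru_hours(max_rus: int, hourly_peaks: Iterable[float]) -> float:
    """Sum the billed RU/s over a sequence of hourly peaks."""
    return sum(billed_rus_for_hour(max_rus, peak) for peak in hourly_peaks)


if __name__ == "__main__":
    peaks = [50, 120, 2500, 3900, 800, 0]  # hypothetical hourly peaks in RU/s
    print(billed_ru_hours(4000, peaks))
```

Comparing this total against the same hours at a fixed provisioned rate shows when autoscale is the cheaper option for a given traffic pattern.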
Azure Synapse Link for Azure Cosmos DB
Azure Synapse Link for Azure Cosmos DB is a cloud-native hybrid transactional and analytical processing (HTAP) capability that enables near real-time analytics over operational data in Azure Cosmos DB. Azure Synapse Link creates a tight, seamless integration between Azure Cosmos DB and Azure Synapse Analytics.
Reduced analytics complexity with no ETL jobs to manage.
Near real-time insights into your operational data.
No effect on operational workloads.
Optimized for large-scale analytics workloads.
Cost effective.
Analytics for locally available, globally distributed, multi-region writes.
Native integration with Azure Synapse Analytics.
Solutions that benefit from Azure Cosmos DB
Web, mobile, gaming, and IoT applications that handle massive amounts of data, reads, and writes at a global scale with near-real-time response times will benefit from Azure Cosmos DB. Its guaranteed high availability, high throughput, low latency, and tunable consistency are major advantages when building these types of applications. Learn how Azure Cosmos DB can be used to build IoT and telematics, retail and marketing, gaming, and web and mobile applications.
D. Where implemented
E. How it is tested
Testing Azure Cosmos DB involves ensuring that the database service is functioning correctly, securely, and meeting the needs of all stakeholders involved in the project. Here are some steps to follow to test Azure Cosmos DB:
Define the scope and requirements: Define the scope of the project and the requirements of all stakeholders involved in the project. This will help ensure that Azure Cosmos DB is designed to meet the needs of all stakeholders.
Develop test cases: Develop test cases that cover all aspects of Azure Cosmos DB functionality, including database creation, data insertion, retrieval, and management. The test cases should be designed to meet the needs of the organization, including scalability and resilience.
Conduct unit testing: Test the individual components that read from and write to Azure Cosmos DB to ensure that they are functioning correctly. This may involve running automated tests against the Azure Cosmos DB Emulator, or using a tool like Postman to exercise the REST API.
Conduct integration testing: Test Azure Cosmos DB in an integrated environment to ensure that it works correctly with other systems and applications. This may involve testing the client applications that use Azure Cosmos DB across different operating systems, browsers, and devices.
Conduct user acceptance testing: Test Azure Cosmos DB with end-users to ensure that it meets their needs and is easy to use. This may involve conducting surveys, interviews, or focus groups to gather feedback from users.
Automate testing: Automate testing of Azure Cosmos DB to ensure that it is functioning correctly and meeting the needs of all stakeholders. This may involve using tools like Azure DevOps to set up automated testing pipelines.
Monitor performance: Monitor the performance of Azure Cosmos DB in production to ensure that it is meeting the needs of all stakeholders. This may involve setting up monitoring tools, such as Azure Monitor, to track usage and identify performance issues.
Address issues: Address any issues that are identified during testing and make necessary changes to ensure that Azure Cosmos DB is functioning correctly and meeting the needs of all stakeholders.
By following these steps, you can ensure that Azure Cosmos DB is tested thoroughly and meets the needs of all stakeholders involved in the project. This can help improve the quality of Azure Cosmos DB and ensure that it functions correctly in a production environment.
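The unit testing step above can be sketched with Python's built-in `unittest` module and a mock in place of the container client, so the test runs without any database. The function `record_sample` and its document fields are hypothetical; `upsert_item` is the real method name on the azure-cosmos container client.

```python
# Sketch: unit testing data access code with a mocked container client.
import unittest
from unittest.mock import MagicMock


def record_sample(container, study_id: str, sample_id: str) -> dict:
    """Hypothetical application function under test: writes one document."""
    doc = {"id": sample_id, "studyId": study_id, "status": "received"}
    container.upsert_item(doc)
    return doc


class RecordSampleTest(unittest.TestCase):
    def test_writes_document_once(self):
        container = MagicMock()  # stands in for the SDK container client
        doc = record_sample(container, "S001", "A17")
        container.upsert_item.assert_called_once_with(doc)
        self.assertEqual(doc["studyId"], "S001")

    def test_propagates_service_errors(self):
        container = MagicMock()
        container.upsert_item.side_effect = RuntimeError("simulated failure")
        with self.assertRaises(RuntimeError):
            record_sample(container, "S001", "A17")


def run_suite() -> unittest.TestResult:
    """Run the tests programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RecordSampleTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

The same test class can be pointed at the emulator for integration testing by replacing the mock with a real container client, and it can run unchanged in an Azure DevOps pipeline.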
F. 2023 Roadmap
May 2023
Support for up to 30 shards for clustered Azure Cache for Redis instances
Azure Cache for Redis now supports clustered caches with up to 30 shards, which means your applications can store more data and scale better with your workloads.
For more information, see Configure clustering for Azure Cache for Redis instance.
March 2023
In-place scale up and scale out for the Enterprise tiers (preview)
The Enterprise and Enterprise Flash tiers now support the ability to scale cache instances up and out without requiring downtime or data loss. Scale up and scale out actions can both occur in the same operation.
For more information, see Scale an Azure Cache for Redis instance
Support for RedisJSON in active geo-replicated caches (preview)
Cache instances using active geo-replication now support the RedisJSON module.
For more information, see Configure active geo-replication.
Flush operation for active geo-replicated caches (preview)
Caches using active geo-replication now include a built-in flush operation that can be initiated at the control plane level. Use the flush operation with your cache instead of the FLUSHALL and FLUSHDB commands, which are blocked by design for active geo-replicated caches.
For more information, see Flush operation
Customer managed key (CMK) disk encryption (preview)
Redis data that is saved on disk can now be encrypted using customer managed keys (CMK) in the Enterprise and Enterprise Flash tiers. Using CMK adds another layer of control to the default disk encryption.
For more information, see Enable disk encryption
Connection event audit logs (preview)
Enterprise and Enterprise Flash tier caches can now log all connection, disconnection, and authentication events through diagnostic settings. Logging this information helps in security audits. You can also monitor who has access to your cache resource.
For more information, see Enabling connection audit logs
G. 2024 Roadmap
????
H. Known Issues
There are several known issues that can impact Azure Cosmos DB. Here are some of the most common issues to be aware of:
Partitioning issues: Partitioning issues can arise when partition keys are not chosen carefully. It is important to choose a partition key that spreads both data and requests evenly across partitions, so that no single partition becomes a hot spot under the anticipated workload.
Performance issues: If the database service is not properly sized, it can impact performance and availability, causing issues with the speed and reliability of Azure Cosmos DB.
Query issues: Query issues can arise when queries are not optimized. It is important to ensure that queries are optimized to avoid slow query times and improve performance.
Integration issues: Integration issues can arise when integrating Azure Cosmos DB with other systems and applications. It is important to ensure that Azure Cosmos DB is designed to work seamlessly with other systems and applications to avoid integration issues.
Security issues: Security is a critical concern when it comes to Azure Cosmos DB. It is important to ensure that Azure Cosmos DB is secured and that access to the solution is restricted to authorized personnel.
Compatibility issues: Azure Cosmos DB may not be compatible with all platforms, devices, or languages. It is important to ensure that Azure Cosmos DB is compatible with the organization’s existing infrastructure before implementation.
Testing issues: Testing issues can arise when testing Azure Cosmos DB. It is important to ensure that testing is carried out thoroughly and that all aspects of Azure Cosmos DB functionality are tested.
Overall, Azure Cosmos DB requires careful planning and management to ensure that it is functioning correctly and meeting the needs of all stakeholders involved in the project. By being aware of these known issues and taking steps to address them, you can improve the quality of Azure Cosmos DB and ensure the success of your project.
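One way to check the partitioning concern above before committing to a key is to measure how a candidate key distributes a sample of existing items. The sketch below reports the share held by the most common key value. The function name, the sample records, and the candidate field names are hypothetical, and a low share on a sample is only an indication, not a guarantee of even request distribution.

```python
# Sketch: estimating partition key skew from a sample of items.
from collections import Counter
from typing import Iterable, Tuple


def hottest_partition_share(key_values: Iterable[str]) -> Tuple[str, float]:
    """Return the most frequent key value and its share of all sampled items."""
    counts = Counter(key_values)
    if not counts:
        raise ValueError("no key values supplied")
    key, count = counts.most_common(1)[0]
    return key, count / sum(counts.values())


if __name__ == "__main__":
    # Hypothetical sample: compare two candidate partition keys.
    sample = [
        {"studyId": "S001", "sampleId": "A1"},
        {"studyId": "S001", "sampleId": "A2"},
        {"studyId": "S001", "sampleId": "A3"},
        {"studyId": "S002", "sampleId": "B1"},
    ]
    for field in ("studyId", "sampleId"):
        key, share = hottest_partition_share(item[field] for item in sample)
        print(f"{field}: hottest value {key!r} holds {share:.0%} of items")
```

A key whose hottest value holds a large share of the items is a likely source of hot partitions, while a high-cardinality key tends to spread load more evenly at the cost of more cross-partition queries.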
[x] Reviewed by Enterprise Architecture
[x] Reviewed by Application Development
[x] Reviewed by Data Architecture