Google DocumentDB

  • Author: Ronald Fung

  • Creation Date: 8 June 2023

  • Next Modified Date: 8 June 2024


A. Introduction

Google DocumentDB is a cloud-based NoSQL document database service provided by Google Cloud. It stores and manages semi-structured data as JSON documents. The service scales horizontally and is built for high availability: it shards, replicates, and load-balances data automatically. It also supports ACID transactions and provides a query language for searching and filtering documents. These properties make it suited to web-scale applications that need low-latency, high-throughput data access.

Note on naming: Google Cloud's product catalog lists its managed document database as Firestore rather than DocumentDB, so the capabilities described on this page should be confirmed against the Firestore documentation before they are relied on.


B. How is it used at Seagen

Seagen can use Google DocumentDB to store and manage semi-structured data as JSON documents. Where workloads already run on Microsoft Azure, DocumentDB can be integrated with that infrastructure through Google Cloud services such as Google Cloud Dataflow, Google Cloud Pub/Sub, or Google Cloud Functions.

To migrate data from Microsoft Azure, Google Cloud's Storage Transfer Service can copy exported files from Azure Blob Storage into Google Cloud Storage; loading those files into DocumentDB is then a separate import step. After migration, the Google Cloud Console can be used to create, manage, and monitor DocumentDB instances, configure replication and sharding, and create indexes for faster data access.
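
Once exported data has been transferred, loading it usually means reading documents in batches rather than one at a time. The following is a minimal sketch of that step, assuming the export is a JSON Lines file (one JSON document per line). The function name `read_batches` and the batch size are hypothetical and not part of any Google Cloud API; the actual write call depends on the client library in use and is left out.

```python
import io
import json
from typing import Iterator, List, TextIO


def read_batches(lines: TextIO, batch_size: int = 500) -> Iterator[List[dict]]:
    """Yield lists of parsed JSON documents, at most batch_size per list."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[dict] = []
    for line in lines:
        line = line.strip()
        if not line:              # skip blank lines in the export
            continue
        batch.append(json.loads(line))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:                     # final, possibly smaller batch
        yield batch


# Example with an in-memory export; a real run would open the exported file.
export = io.StringIO('{"id": 1}\n{"id": 2}\n\n{"id": 3}\n')
batches = list(read_batches(export, batch_size=2))
```

Each yielded batch would then be handed to a single bulk write, which keeps the number of round trips to the database low.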

The DocumentDB query language can also be used from other Google Cloud services, such as Google Cloud Functions or Google Cloud Run, to build serverless applications that scale automatically and handle high-throughput data processing.
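
The query syntax is not covered here. As an illustration of what searching and filtering semi-structured documents involves, the sketch below filters plain Python dictionaries rather than calling the service. The helper `find` and the sample fields (`study`, `phase`, `site.country`) are hypothetical.

```python
from typing import Any, Dict, List

MISSING = object()  # sentinel for "field not present"


def get_path(doc: Dict[str, Any], path: str) -> Any:
    """Resolve a dotted path such as 'site.country' inside a nested document."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return MISSING
        current = current[key]
    return current


def find(docs: List[Dict[str, Any]], **equals: Any) -> List[Dict[str, Any]]:
    """Return documents whose fields equal all given values.

    Keyword names use '__' in place of '.' for nested fields.
    """
    criteria = {name.replace("__", "."): value for name, value in equals.items()}
    return [
        doc for doc in docs
        if all(get_path(doc, path) == value for path, value in criteria.items())
    ]


docs = [
    {"study": "A-101", "phase": 2, "site": {"country": "US"}},
    {"study": "A-102", "phase": 3, "site": {"country": "DE"}},
    {"study": "A-103", "phase": 2},  # no 'site' field: documents need not share a schema
]

phase_two_us = find(docs, phase=2, site__country="US")
```

The third document has no `site` field and is simply skipped by the nested filter, which is the behaviour that distinguishes document queries from queries over a fixed relational schema.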

Overall, Google DocumentDB is a candidate for storing and managing Seagen's semi-structured data in the cloud, offering the scalability, reliability, and performance needed for its biopharma research projects.


C. Features

The key features of Google DocumentDB are:

  1. Document-based storage: DocumentDB is designed to store semi-structured data as JSON documents, which can be easily indexed and queried using its flexible query language.

  2. Fully managed service: Google DocumentDB is a fully managed service, which means that it takes care of database administration tasks such as scaling, replication, and backups, allowing developers to focus on their applications.

  3. High scalability: DocumentDB offers automatic sharding, which allows it to scale horizontally to handle large volumes of data and high traffic workloads.

  4. ACID transactions: DocumentDB supports ACID transactions, which ensure data consistency and reliability in multi-step data operations.

  5. High availability: DocumentDB replicates data across multiple availability zones to ensure high availability and durability.

  6. Flexible indexing: DocumentDB supports various types of indexes such as single-field, compound, and geo-spatial indexes, which can be used to optimize query performance.

  7. Integration with other Google Cloud services: DocumentDB can be easily integrated with other Google Cloud services such as Google Cloud Functions, Google Cloud Pub/Sub, and Google Cloud Storage, allowing developers to build modern, web-scale applications.
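
Item 3 mentions automatic sharding without describing the scheme. As a rough sketch of the general idea only, hash-based sharding maps each document key to one of N shards. How the service actually partitions data is not stated here, and `shard_for` and the shard count are assumptions made for illustration.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a document key to a shard index in [0, num_shards)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer; a stable hash keeps the
    # mapping identical across processes, unlike Python's built-in hash().
    return int.from_bytes(digest[:8], "big") % num_shards


keys = [f"doc-{i}" for i in range(1000)]
distribution = Counter(shard_for(k, 4) for k in keys)
```

Plain modulo hashing reassigns most keys when the shard count changes, so systems that add shards on the fly generally use range partitioning or consistent hashing instead.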

Overall, Google DocumentDB is a powerful and flexible document database service that offers high scalability, availability, and reliability for modern cloud-based applications.


D. Where Implemented

LeanIX


E. How it is tested

To test Google DocumentDB, you can follow these steps:

  1. Create a DocumentDB instance: First, create a DocumentDB instance using the Google Cloud Console or the gcloud command-line tool.

  2. Insert test data: Insert some test data into the DocumentDB instance using the Google Cloud Console, the gcloud command-line tool, or a client library such as the Node.js client library.

  3. Query the data: Use the Google Cloud Console or a client library to query the data and verify that it returns the expected results.

  4. Test scalability: To test the scalability of DocumentDB, you can insert a large amount of test data and measure the response time of queries. You can also test the automatic sharding feature by adding more nodes to the cluster and measuring the performance improvement.

  5. Test durability: To test the durability of DocumentDB, you can simulate a node failure and verify that the data is still available and consistent.

  6. Test backups and restores: To test the backup and restore feature of DocumentDB, you can create a backup of the instance and restore it to a new instance, and then verify that the data is the same.

  7. Test security: To test the security of DocumentDB, you can configure access controls and authentication mechanisms such as IAM roles and service accounts, and verify that they are working correctly.
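
For step 6, "verify that the data is the same" can be done by comparing an order-independent fingerprint of both instances. The following is a minimal sketch, assuming the documents from each instance have already been read into Python lists through whatever client library is in use; `fingerprint` is a hypothetical helper.

```python
import hashlib
import json
from typing import Any, Dict, Iterable


def fingerprint(docs: Iterable[Dict[str, Any]]) -> str:
    """Return a digest that ignores document order and key order."""
    # Canonical form: sorted keys, no insignificant whitespace.
    canonical = sorted(
        json.dumps(doc, sort_keys=True, separators=(",", ":")) for doc in docs
    )
    hasher = hashlib.sha256()
    for item in canonical:
        hasher.update(item.encode("utf-8"))
        hasher.update(b"\n")      # separator so adjacent documents cannot merge
    return hasher.hexdigest()


original = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
restored = [{"name": "b", "id": 2}, {"name": "a", "id": 1}]   # same data, different order
corrupted = [{"id": 1, "name": "a"}, {"id": 2, "name": "B"}]

same = fingerprint(original) == fingerprint(restored)
differs = fingerprint(original) != fingerprint(corrupted)
```

Comparing digests avoids holding both data sets side by side, and the same check can be reused in step 5 to confirm that data is unchanged after a simulated node failure.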

In summary, testing Google DocumentDB covers instance creation, data insertion and querying, scalability, durability, backup and restore, and security. Each test can be run through the Google Cloud Console, the gcloud command-line tool, or a client library.
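
For the scalability test in step 4, query response times are easier to compare across runs as percentiles than as averages. The sketch below times an arbitrary callable and reports nearest-rank percentiles; `run_query` stands in for a real query and is hypothetical.

```python
import math
import time
from typing import Callable, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)   # 1-based rank
    return ordered[rank - 1]


def measure(operation: Callable[[], object], repetitions: int) -> List[float]:
    """Run `operation` repeatedly and return the elapsed seconds for each call."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return timings


def run_query() -> int:
    # Placeholder workload; replace with a real query through a client library.
    return sum(range(1000))


timings = measure(run_query, repetitions=50)
p50, p95 = percentile(timings, 50), percentile(timings, 95)
```

Recording p50 and p95 before and after loading a large data set, or before and after adding nodes, gives a direct measure of how response time changes.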


F. 2023 Roadmap

????


G. 2024 Roadmap

????


H. Known Issues

Google DocumentDB is generally reliable and stable, but users may encounter the following known issues:

  1. Limited query support: While DocumentDB supports a flexible query language, it has some limitations compared to traditional relational databases. For example, it does not support joins across collections, and some types of queries may require complex indexes to optimize performance.

  2. Limited transaction support: While DocumentDB supports ACID transactions, it has some limitations compared to traditional relational databases. For example, transactions are limited to a single partition, and there are some restrictions on the types of operations that can be performed within a transaction.

  3. Slow performance for large documents: DocumentDB may experience slow performance when working with very large documents, especially when performing complex queries or updates.

  4. Limited backup and restore options: DocumentDB only supports full instance backups, which can be time-consuming and costly for large databases. Additionally, there are some limitations on the frequency and retention of backups.

  5. Limited integration with third-party tools: DocumentDB has limited integration with third-party tools and services compared to other databases, which can make it challenging to use with certain development workflows.
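
For issue 1, a common workaround when a database cannot join across collections is to join in application code. The following is a minimal sketch, assuming both collections have already been fetched into lists of dictionaries; the function `join` and the collection and field names are hypothetical.

```python
from typing import Any, Dict, List


def join(
    left: List[Dict[str, Any]],
    right: List[Dict[str, Any]],
    left_key: str,
    right_key: str,
    prefix: str,
) -> List[Dict[str, Any]]:
    """Inner-join two collections in memory on left[left_key] == right[right_key]."""
    # Build a lookup once so the join is O(len(left) + len(right)).
    lookup: Dict[Any, List[Dict[str, Any]]] = {}
    for doc in right:
        if right_key in doc:
            lookup.setdefault(doc[right_key], []).append(doc)

    joined = []
    for doc in left:
        if left_key not in doc:
            continue
        for match in lookup.get(doc[left_key], []):
            merged = dict(doc)
            # Prefix right-hand fields to avoid overwriting left-hand ones.
            merged.update({f"{prefix}{k}": v for k, v in match.items()})
            joined.append(merged)
    return joined


samples = [
    {"sample_id": "s1", "study_id": "A"},
    {"sample_id": "s2", "study_id": "B"},
    {"sample_id": "s3", "study_id": "Z"},   # no matching study
]
studies = [
    {"study_id": "A", "title": "Alpha"},
    {"study_id": "B", "title": "Beta"},
]

rows = join(samples, studies, "study_id", "study_id", prefix="study_")
```

This approach reads both collections in full, so it suits small lookup collections; the other common option is to denormalize the needed fields into each document at write time.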

Overall, while these issues may impact some users, Google DocumentDB remains a highly scalable, reliable, and flexible document database service that is well-suited for modern cloud-based applications.


[x] Reviewed by Enterprise Architecture

[x] Reviewed by Application Development

[x] Reviewed by Data Architecture