Google Cloud Storage

  • Author: Ronald Fung

  • Creation Date: 9 June 2023

  • Next Modified Date: 9 June 2024


A. Introduction

Cloud Storage is a managed service for storing unstructured data. Store any amount of data and retrieve it as often as you like.


B. How is it used at Seagen

Seagen can use Google Cloud Storage to store and manage their research data in a secure, scalable, and cost-effective way. Here are some steps to get started with Google Cloud Storage:

  1. Create a Google Cloud account: Seagen can create a Google Cloud account in the Google Cloud Console. This will give them access to Google Cloud Storage and other Google Cloud services.

  2. Create a storage bucket: Seagen can create a storage bucket in Google Cloud Storage, which is a container for their data objects. They can specify the storage class, access control settings, and other configurations for the bucket.

  3. Upload data objects: Seagen can upload their research data objects to the storage bucket, using the Google Cloud Console, the gsutil command-line tool, or the Google Cloud Storage API. Cloud Storage has a flat namespace, so folders and subfolders are simulated with slash-delimited prefixes in object names; these prefixes can still be used to organize the data objects for easy management.

  4. Manage data objects: Seagen can manage their data objects in Google Cloud Storage, including copying, moving, renaming, and deleting objects. They can also set up lifecycle policies to automatically delete objects, or transition them to a colder storage class, based on their age or other criteria.

  5. Access data objects: Seagen can access their data objects in Google Cloud Storage, using the Google Cloud Console, the gsutil command-line tool, or the Google Cloud Storage API. They can also set up access control policies to control who can access the data objects and how they can access them.
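
The steps above could be sketched with the google-cloud-storage Python client library. This is a minimal sketch, not Seagen's actual setup: it assumes the library is installed and Application Default Credentials are configured, and the bucket name, the "US" location, and the studies/ prefix layout are hypothetical.

```python
import os


def object_name(study_id: str, filename: str) -> str:
    """Build a slash-delimited object name; the 'studies/' layout is hypothetical."""
    return f"studies/{study_id}/{filename}"


def upload_and_list(bucket_name: str, local_path: str, study_id: str) -> list:
    """Create the bucket if missing, upload one file, and list the study's objects."""
    # Imported here so the naming helper above works without the client library.
    from google.cloud import storage

    client = storage.Client()  # uses Application Default Credentials
    bucket = client.lookup_bucket(bucket_name)  # None if the bucket does not exist
    if bucket is None:
        # The location is an assumption; pick one that suits the workload.
        bucket = client.create_bucket(bucket_name, location="US")

    name = object_name(study_id, os.path.basename(local_path))
    bucket.blob(name).upload_from_filename(local_path)

    # "Folders" are just name prefixes, so listing a folder means filtering by prefix.
    prefix = object_name(study_id, "")
    return [blob.name for blob in client.list_blobs(bucket_name, prefix=prefix)]
```

Only object_name runs without cloud access; upload_and_list needs a real project and credentials.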

Overall, by using Google Cloud Storage, Seagen can store and manage their research data in a scalable and cost-effective way, with high availability and durability. Its console, command-line, and API interfaces make it suitable for storing and managing large amounts of data in the cloud.


C. Features

Automatic storage class transitions

Object Lifecycle Management (OLM) and Autoclass both help optimize costs by placing objects in appropriate storage classes. OLM applies rules that you define, based on conditions such as object age. Autoclass, enabled at the bucket level, automatically moves objects to colder storage classes based on when they were last accessed. In Autoclass-enabled buckets there are no early deletion fees, retrieval fees, or class transition charges when objects in colder storage classes are accessed.

Continental-scale and SLA-backed replication

Dual-region buckets can be configured from a wide choice of regions. A single, continental-scale bucket offers nine regions across three continents, providing a Recovery Time Objective (RTO) of zero. In the event of an outage, applications seamlessly access the data in the alternate region; there is no failover or failback process. For organizations requiring very high availability, turbo replication with dual-region buckets offers a 15-minute Recovery Point Objective (RPO) SLA.

Fast and flexible transfer services

Storage Transfer Service offers a high-performance online pathway to Cloud Storage, with the scalability and speed needed to simplify the data transfer process. For offline data transfer, Transfer Appliance is a shippable storage server that sits in your datacenter and then ships to an ingest location where the data is uploaded to Cloud Storage.

Default and configurable data security

Cloud Storage offers secure-by-design features to protect your data and advanced controls and capabilities to keep your data private and secure against leaks or compromises. Security features include access control policies, data encryption, retention policies, retention policy locks, and signed URLs.

Leading analytics and ML/AI tools

Once your data is stored in Cloud Storage, easily plug into Google Cloud’s powerful tools to create your data warehouse with BigQuery, run open-source analytics with Dataproc, or build and deploy machine learning (ML) models with Vertex AI.

Object lifecycle management

Define conditions that trigger data deletion or transition to a cheaper storage class.
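
As an illustration, a lifecycle configuration is a JSON document listing rules, each with an action and a condition. The sketch below builds one with two rules; the 90-day and 365-day thresholds and the COLDLINE target are illustrative assumptions, not recommended values.

```python
import json


def lifecycle_config(archive_after_days: int, delete_after_days: int) -> dict:
    """Return a lifecycle configuration with one transition rule and one delete rule."""
    if delete_after_days <= archive_after_days:
        raise ValueError("delete_after_days must exceed archive_after_days")
    return {
        "rule": [
            {
                # Move objects to a colder class once they reach the given age in days.
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {"age": archive_after_days},
            },
            {
                # Delete objects once they reach the second, larger age.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


if __name__ == "__main__":
    print(json.dumps(lifecycle_config(90, 365), indent=2))
```

Saved to a file, the output could be applied with gsutil lifecycle set followed by the file name and the gs:// URL of the bucket.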

Object Versioning

Continue to store old copies of objects when they are deleted or overwritten.

Retention policies

Define minimum retention periods that objects must be stored for before they’re deletable.
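
A retention period is expressed in seconds and measured from the time an object was created. The arithmetic can be sketched as follows; the function names are hypothetical, and the strict "older than" comparison is an assumption about the boundary case.

```python
from datetime import datetime, timedelta, timezone


def earliest_deletion(created: datetime, retention_seconds: int) -> datetime:
    """Return the moment at which the retention period has fully elapsed."""
    return created + timedelta(seconds=retention_seconds)


def is_deletable(created: datetime, retention_seconds: int, now: datetime) -> bool:
    """True once the object's age exceeds the retention period."""
    return now > earliest_deletion(created, retention_seconds)


if __name__ == "__main__":
    created = datetime(2023, 6, 9, tzinfo=timezone.utc)
    thirty_days = 30 * 24 * 60 * 60  # 2,592,000 seconds
    print(earliest_deletion(created, thirty_days).isoformat())
```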

Object holds

Place a hold on an object to prevent its deletion.

Customer-managed encryption keys

Encrypt object data with encryption keys stored by the Cloud Key Management Service and managed by you.

Customer-supplied encryption keys

Encrypt object data with encryption keys created and managed by you.
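
A customer-supplied key is an AES-256 key that accompanies each request as base64-encoded headers, together with a SHA-256 hash of the key. The sketch below generates such a key and the matching headers; the function name is hypothetical, and key storage is deliberately left out. A lost key makes the affected objects unreadable, so the key needs to be kept somewhere safe.

```python
import base64
import hashlib
import secrets


def new_csek_headers() -> dict:
    """Generate a random AES-256 key and the request headers that carry it."""
    key = secrets.token_bytes(32)  # 32 bytes = 256 bits
    key_hash = hashlib.sha256(key).digest()
    return {
        "x-goog-encryption-algorithm": "AES256",
        "x-goog-encryption-key": base64.b64encode(key).decode("ascii"),
        "x-goog-encryption-key-sha256": base64.b64encode(key_hash).decode("ascii"),
    }
```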

Uniform bucket-level access

Uniformly control access to your Cloud Storage resources by disabling object ACLs.

Requester pays

Require accessors of your data to include a project ID to bill for network charges, operation charges, and retrieval fees.

Bucket Lock

Bucket Lock allows you to configure a data retention policy for a Cloud Storage bucket that governs how long objects in the bucket must be retained.

Pub/Sub notifications for Cloud Storage

Send notifications to Pub/Sub when objects are created, updated, or deleted.
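
Each notification carries message attributes such as eventType, bucketId, and objectId, and, when the payload format is JSON_API_V1, the object's metadata as a JSON body. A subscriber might summarize a message as sketched below; the function name and the shape of the returned summary are hypothetical.

```python
import json


def summarize_notification(attributes: dict, data: bytes) -> dict:
    """Extract the event type and object identity from one notification message."""
    metadata = {}
    if attributes.get("payloadFormat") == "JSON_API_V1" and data:
        metadata = json.loads(data.decode("utf-8"))
    return {
        "event": attributes["eventType"],  # e.g. OBJECT_FINALIZE or OBJECT_DELETE
        "bucket": attributes["bucketId"],
        "object": attributes["objectId"],
        # The JSON API reports size as a string, so convert it when present.
        "size_bytes": int(metadata["size"]) if "size" in metadata else None,
    }
```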

Cloud Audit Logs with Cloud Storage

Maintain admin activity logs and data access logs for your Cloud Storage resources.

Object- and bucket-level permissions

Cloud Identity and Access Management (IAM) allows you to control who has access to your buckets and objects.
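
An IAM policy is a list of bindings, each pairing a role with the members that hold it. The sketch below adds a member to a role on a policy represented as a plain dictionary; the helper name and the example group address are hypothetical, and fetching and writing the policy for a real bucket is left out.

```python
import copy


def grant_role(policy: dict, role: str, member: str) -> dict:
    """Return a copy of the policy in which the member holds the role."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        # Reuse an existing unconditional binding for the role if there is one.
        if binding["role"] == role and "condition" not in binding:
            if member not in binding["members"]:
                binding["members"].append(member)
            return updated
    bindings.append({"role": role, "members": [member]})
    return updated


if __name__ == "__main__":
    policy = grant_role({}, "roles/storage.objectViewer", "group:researchers@example.com")
    print(policy)
```

Granting read-only roles such as roles/storage.objectViewer to groups rather than individuals keeps the policy short and easier to audit.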


D. Where Implemented

LeanIX


E. How it is tested

Testing Google Cloud Storage involves ensuring that the data objects are being stored and managed correctly, and that the data is accessible and secure. Here are some steps to test Google Cloud Storage:

  1. Create a test storage bucket: Create a test storage bucket in Google Cloud Storage that mimics the production storage bucket as closely as possible, including the access control settings, storage class, and lifecycle policies.

  2. Upload test data objects: Upload test data objects to the test storage bucket, using the Google Cloud Console, the gsutil command-line tool, or the Google Cloud Storage API. Ensure that the data objects are being stored and managed correctly, and that the access control policies are working as expected.

  3. Manage test data objects: Manage the test data objects in Google Cloud Storage, including copying, moving, renaming, and deleting objects. Ensure that the management operations are working correctly, and that the data objects are being updated and deleted as expected.

  4. Access test data objects: Access the test data objects in Google Cloud Storage, using the Google Cloud Console, the gsutil command-line tool, or the Google Cloud Storage API. Ensure that the data objects are accessible and that the access control policies are working as expected.

  5. Test scalability: Test the scalability of Google Cloud Storage by uploading and managing large amounts of test data objects simultaneously, and monitoring throughput, latency, and error rates. Cloud Storage is a managed service that scales automatically; it exposes no user-managed auto-scaling or load-balancing settings, so the test should ramp up request rates gradually and confirm that client applications handle throttling and retries correctly.
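
One way to automate the upload and access checks above is a round-trip integrity test: upload a payload, compare the MD5 hash that Cloud Storage reports with a locally computed one, download the object, and clean up. This is a minimal sketch assuming the google-cloud-storage Python library and a dedicated test bucket; the function names are hypothetical.

```python
import base64
import hashlib


def md5_base64(data: bytes) -> str:
    """Return the base64-encoded MD5 digest, the form used in object metadata."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def verify_round_trip(bucket_name: str, object_name: str, payload: bytes) -> bool:
    """Upload, re-read, and delete a test object, checking hash and content."""
    # Imported here so the hashing helper above works without the client library.
    from google.cloud import storage

    client = storage.Client()  # uses Application Default Credentials
    blob = client.bucket(bucket_name).blob(object_name)
    blob.upload_from_string(payload)
    blob.reload()  # refresh server-side metadata, including the MD5 hash

    same_hash = blob.md5_hash == md5_base64(payload)
    same_bytes = blob.download_as_bytes() == payload
    blob.delete()  # leave the test bucket clean
    return same_hash and same_bytes
```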

Overall, testing Google Cloud Storage involves creating a test storage bucket, uploading test data objects, managing test data objects, accessing test data objects, and testing scalability. Thorough testing confirms that data is stored and managed correctly and that access control policies behave as intended. Additionally, users can reach out to Google Cloud support for help with any technical challenges they may encounter.


F. 2023 Roadmap

????


G. 2024 Roadmap

????


H. Known Issues

While Google Cloud Storage is a robust and reliable object storage service, there are some known issues that users may encounter. Here are some of the known issues for Google Cloud Storage:

  1. Durability issues: Google Cloud Storage is designed for 99.999999999% (11 nines) annual durability. This figure is a design target rather than a contractual guarantee, so data loss caused by software bugs or hardware failures, while very rare, remains possible. Object Versioning and retention policies add protection against accidental deletion or overwriting.

  2. Access control issues: Users may encounter issues with access control policies, such as policies not being applied correctly or users not being able to access data objects due to incorrect permissions. These issues can typically be resolved by reviewing and adjusting access control settings in the Google Cloud Console or using the gsutil command-line tool.

  3. Performance issues: Users may encounter performance issues with Google Cloud Storage, such as slow upload or download speeds, or high latency. Storage class has little effect on access latency, so these issues are more often resolved by choosing a bucket location (region, dual-region, or multi-region) close to the applications that read and write the data.

  4. Billing and cost issues: Users may encounter billing and cost issues with Google Cloud Storage, such as unexpected charges or incorrect usage reports. These issues can often be resolved by reviewing usage reports and monitoring billing statements in the Google Cloud Console.

  5. Data transfer issues: Users may encounter data transfer issues with Google Cloud Storage, such as issues with transferring data between storage buckets or transferring data over the internet. These issues can often be resolved by reviewing network settings and adjusting transfer options.
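
Transient transfer failures are commonly handled by retrying with truncated exponential backoff. The sketch below illustrates the idea in plain Python; the function names, the 1-second base, and the 32-second cap are assumptions, and client libraries typically ship their own retry handling, which should be preferred where it applies.

```python
import random
import time


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 32.0, jitter: bool = True):
    """Yield one wait time per retry: base * 2^attempt, capped, plus optional jitter."""
    for attempt in range(max_retries):
        delay = min(cap, base * 2 ** attempt)
        yield delay + (random.uniform(0, 1) if jitter else 0.0)


def call_with_retries(operation, is_transient, max_retries: int = 5, sleep=time.sleep):
    """Run operation(), retrying transient failures up to max_retries times."""
    delays = backoff_delays(max_retries)
    while True:
        try:
            return operation()
        except Exception as exc:
            delay = next(delays, None)
            # Give up when retries are exhausted or the error is not transient.
            if delay is None or not is_transient(exc):
                raise
            sleep(delay)
```

The is_transient callback decides which errors are worth retrying; permission errors, for example, will not succeed on a second attempt.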

Overall, while these issues may impact some users, Google Cloud Storage remains a reliable and scalable object storage service that is widely used by businesses and individuals. By monitoring their Google Cloud Storage usage and reviewing their usage reports and logs, users can ensure that their Google Cloud Storage resources are secure and accessible, and that they are only paying for the resources they use. Additionally, users can reach out to Google Cloud support for help with any known issues or other technical challenges they may encounter.


[x] Reviewed by Enterprise Architecture

[x] Reviewed by Application Development

[x] Reviewed by Data Architecture