Google Cloud Monitoring

  • Author: Ronald Fung

  • Creation Date: 13 June 2023

  • Next Modified Date: 13 June 2024


A. Introduction

Cloud Monitoring collects metrics, events, and metadata from Google Cloud, Amazon Web Services (AWS), hosted uptime probes, and application instrumentation. Using the BindPlane service, you can also collect this data from over 150 common application components, on-premises systems, and hybrid cloud systems. Google Cloud’s operations suite ingests that data and generates insights via dashboards, charts, and alerts. BindPlane is included with your Google Cloud project at no additional cost.

To collect metrics data from your Compute Engine instances, create an Agent Policy that automatically installs and maintains the operations suite agents across your fleet of VMs.



B. How is it used at Seagen

As a biopharma research company, Seagen can use Google Cloud Monitoring for its monitoring needs in the following ways:

  1. Migrating from Microsoft Azure: Seagen can move its existing monitoring configurations from Microsoft Azure to Google Cloud Monitoring. Because the two platforms use different metric names and policy formats, configurations exported from Azure (such as alert rules and dashboards) generally need to be translated before they are recreated in Google Cloud Monitoring. This can be done using the Google Cloud Monitoring API or other migration tools.

  2. Collecting and analyzing metrics: Seagen can collect and analyze metrics from its applications, virtual machines, and other resources running on Google Cloud Platform. It can use Cloud Monitoring (formerly Stackdriver Monitoring) to aggregate and query metrics, and Cloud Trace (formerly Stackdriver Trace) to analyze latency and performance traces.

  3. Monitoring and alerting: Seagen can monitor its metrics and receive alerts in near real time using Cloud Monitoring. It can set up custom alerting policies based on specific metrics or resource types, and can receive notifications via email, SMS, or other channels.

  4. Managing access and permissions: Seagen can manage access and permissions to their monitoring data using Google Cloud IAM. They can define roles and permissions for different users and groups, and can control access to monitoring data based on resource type, location, or other attributes.

  5. Integrating with other Google Cloud services: Seagen can integrate their monitoring data with other Google Cloud services, such as Google Cloud Pub/Sub, Google Cloud Storage, and Google BigQuery. They can build real-time data pipelines and analytics workflows using these services, and can store and analyze monitoring data at scale.

Overall, Google Cloud Monitoring gives Seagen a scalable solution for its monitoring needs. It covers migrating existing monitoring configurations, collecting and analyzing metrics, alerting, managing access and permissions, and integrating with other Google Cloud services, which makes it suitable for a variety of workloads and use cases.
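
As an illustration of the "Monitoring and alerting" item above, the following is a minimal sketch of a request body for creating an alerting policy through the Cloud Monitoring API v3. It only builds and prints the JSON; it does not call the API. The helper name build_cpu_alert_policy, the display names, the 0.8 threshold, and the 300-second duration are assumptions chosen for the example, not Seagen settings.

```python
import json


def build_cpu_alert_policy(threshold=0.8, duration_s=300, channels=()):
    """Return a dict shaped like an AlertPolicy resource (Monitoring API v3).

    All values here are illustrative defaults, not recommended settings.
    """
    return {
        "displayName": "High CPU utilization (example)",
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "VM CPU above threshold",
                "conditionThreshold": {
                    # Monitoring filter selecting Compute Engine CPU utilization.
                    "filter": (
                        'metric.type="compute.googleapis.com/instance/cpu/utilization" '
                        'AND resource.type="gce_instance"'
                    ),
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": threshold,
                    # The condition must hold for this long before it fires.
                    "duration": f"{duration_s}s",
                    "aggregations": [
                        {"alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_MEAN"}
                    ],
                },
            }
        ],
        # Resource names of existing notification channels (empty in this sketch).
        "notificationChannels": list(channels),
    }


policy = build_cpu_alert_policy()
print(json.dumps(policy, indent=2))
```

Keeping the policy as plain data makes it easy to review and store in version control before it is submitted with whichever client or tool the team prefers.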


C. Features

SLO monitoring

Automatically infer service-level objectives (SLOs) for applications, or define custom ones, and get alerted when SLO violations occur. Google publishes a step-by-step guide to setting SLOs that follows SRE best practices.
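
To make the idea of an SLO violation concrete, here is a minimal sketch of the error-budget arithmetic behind a request-based SLO. It is a generic SRE-style calculation, not Cloud Monitoring's internal implementation; the function name error_budget and the sample numbers are assumptions for illustration.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Compute the error budget for a request-based SLO.

    slo_target is the fraction of requests that must succeed, e.g. 0.99.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    # Number of failures the SLO tolerates over the measured window.
    allowed_failures = (1 - slo_target) * total_requests
    return {
        "allowed_failures": allowed_failures,
        "remaining": allowed_failures - failed_requests,
        "violated": failed_requests > allowed_failures,
    }


# A 99% target over 10,000 requests tolerates roughly 100 failures.
healthy = error_budget(0.99, 10_000, 50)
breached = error_budget(0.99, 10_000, 150)
print(healthy["violated"], breached["violated"])
```

An alert on SLO burn is essentially a check on how quickly "remaining" is shrinking over the compliance period.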

Managed metrics collection for Kubernetes and virtual machines

Google Cloud’s operations suite offers Managed Service for Prometheus for use with Kubernetes, which features self-deployed and managed collection options to simplify metrics collection, storage, and querying. For VMs, you can use the Ops Agent, which combines logging and metrics collection into a single agent that can be deployed at scale using popular configuration and management tools.

Google Cloud integration

Discover and monitor all Google Cloud resources and services, with no additional instrumentation, integrated right into the Google Cloud console.


D. Where Implemented

LeanIX


E. How it is tested

Testing Google Cloud Monitoring involves ensuring that the monitoring infrastructure is properly configured and optimized for performance, reliability, and security. Here are some steps to test Google Cloud Monitoring:

  1. Create a test environment: Create a test environment that mimics the production environment as closely as possible, including the applications, virtual machines, and other resources that generate metrics. Ensure that the monitoring infrastructure is properly configured and that the security policies are in place.

  2. Deploy the monitoring infrastructure: Deploy the monitoring infrastructure on Google Cloud Platform. Ensure that the infrastructure is properly configured and that it can communicate with other resources, such as applications or APIs.

  3. Test metric collection: Generate test metric data, for example by writing custom metrics through the Cloud Monitoring API (formerly the Stackdriver Monitoring API) or with another metric generation tool. Ensure that the metric data is being collected properly and that there are no errors or missing data.

  4. Test metric search and analysis: Query the collected metrics using Metrics Explorer in the Google Cloud console or another metric analysis tool. Ensure that the metrics can be found and charted and that there are no errors or timeouts.

  5. Test metric monitoring and alerting: Test the metric monitoring and alerting by setting up custom alerts based on specific metric values or resource types, and by receiving notifications via email, SMS, or other channels. Ensure that the alerts are triggered properly and that there are no false positives or false negatives.

  6. Test metric access and permissions: Test the metric access and permissions by using Google Cloud IAM to define roles and permissions for different users and groups, and by controlling access to metrics based on resource type, location, or other attributes. Ensure that the access control policies are working as expected and that there is no unauthorized access or data exposure.

  7. Test metric integration: Test the integration with other Google Cloud services or third-party tools by routing monitoring data or alert notifications through the services they depend on, such as Google Cloud Pub/Sub or Microsoft Azure Event Grid, and confirming that the data arrives at its destination. Ensure that the metrics are integrated properly and that there are no integration issues or errors.
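
For step 3 above, the following is a minimal sketch of a test payload for the Cloud Monitoring API v3 timeSeries.create method. It only builds the JSON body and does not send it. The helper name build_test_time_series, the metric type custom.googleapis.com/test/heartbeat, the "environment" label, and the project ID are hypothetical values used for illustration.

```python
import json
from datetime import datetime, timezone


def build_test_time_series(project_id, value,
                           metric_type="custom.googleapis.com/test/heartbeat",
                           now=None):
    """Build a request body with one gauge point for a custom test metric.

    `now` should be a timezone-aware datetime; it defaults to the current UTC time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    end_time = now.strftime("%Y-%m-%dT%H:%M:%SZ")  # RFC 3339 timestamp in UTC
    return {
        "timeSeries": [
            {
                "metric": {"type": metric_type, "labels": {"environment": "test"}},
                "resource": {"type": "global", "labels": {"project_id": project_id}},
                "points": [
                    {
                        "interval": {"endTime": end_time},
                        "value": {"doubleValue": float(value)},
                    }
                ],
            }
        ]
    }


body = build_test_time_series(
    "example-project", 1,
    now=datetime(2023, 6, 13, 12, 0, 0, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```

After a payload like this is written to a test project, the same metric type can be queried back to confirm that the point was ingested without loss or delay.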

Overall, by thoroughly testing Google Cloud Monitoring, users can ensure that their monitoring infrastructure is properly configured and optimized for performance, reliability, and security. Additionally, users can reach out to Google Cloud support for help with any technical challenges they may encounter.
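
When checking alerts for false positives and false negatives, it helps to know in advance whether a given set of test samples should fire. The sketch below is a simplified model of a "greater than threshold for a duration" condition. It is an assumption-laden approximation for test planning, not Cloud Monitoring's exact evaluation logic, and the function name threshold_condition_met is hypothetical.

```python
def threshold_condition_met(samples, threshold, duration_s):
    """Return True if values stay above `threshold` for at least `duration_s`.

    `samples` is a list of (timestamp_seconds, value) pairs sorted by time.
    Any sample at or below the threshold resets the run.
    """
    run_start = None
    for ts, value in samples:
        if value > threshold:
            if run_start is None:
                run_start = ts  # first sample of a new above-threshold run
            if ts - run_start >= duration_s:
                return True
        else:
            run_start = None
    return False


# One sample per minute for five minutes, all above 0.8: expected to fire.
sustained = [(t, 0.9) for t in range(0, 301, 60)]
# Same series with a dip at t=180: the run resets, so no alert is expected.
with_dip = [(t, 0.5 if t == 180 else 0.9) for t in range(0, 301, 60)]

print(threshold_condition_met(sustained, 0.8, 300))
print(threshold_condition_met(with_dip, 0.8, 300))
```

Comparing the expected result with what the real alerting policy does in the test environment highlights thresholds or durations that are too sensitive or too lax.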


F. 2023 Roadmap

????


G. 2024 Roadmap

????


H. Known Issues

While Google Cloud Monitoring is a reliable and widely used solution for monitoring and alerting needs, there are some known issues that users may encounter. Here are some of the known issues for Google Cloud Monitoring:

  1. Metric ingestion issues: Users may encounter metric ingestion issues, such as dropped metric data or delayed metric data, especially for workloads that generate a high volume of metrics. This issue can often be resolved by using the appropriate metric ingestion settings, such as batch sizes or batch intervals, and by monitoring the metric ingestion rate.

  2. Metric search issues: Users may encounter metric search issues, such as slow queries or missing metrics, especially for workloads that require complex queries or real-time metric analysis. This issue can often be resolved by narrowing the query with metric filters, adjusting the time range and aggregation, and otherwise optimizing the metric queries.

  3. Alerting issues: Users may encounter alerting issues, such as false positives or false negatives, especially for workloads that require high accuracy or low latency alerts. This issue can often be resolved by using the appropriate alerting policies, such as threshold values or alert conditions, and by testing the alerting policies in a test environment.

  4. Security issues: Users may encounter security issues, such as unauthorized access or data breaches, especially for workloads that require high security. This issue can often be resolved by using the appropriate security policies and access controls, such as firewall rules and IAM roles.

  5. Integration issues: Users may encounter integration issues with other cloud services or third-party tools, such as data pipelines or analytics platforms. This issue can often be resolved by using industry-standard protocols and APIs to enable interoperability between different cloud services and tools.
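
For the metric ingestion issue in item 1, one common mitigation is to split large writes into batches. The sketch below assumes a limit of 200 time series per timeSeries.create request, which matches the documented limit at the time of writing but should be verified against current quotas. The function name chunk_time_series is hypothetical.

```python
def chunk_time_series(series, max_per_request=200):
    """Split a list of time series into batches of at most `max_per_request`."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    return [
        series[i:i + max_per_request]
        for i in range(0, len(series), max_per_request)
    ]


# 450 placeholder series are split into batches of 200, 200, and 50.
placeholder_series = [{"id": n} for n in range(450)]
batches = chunk_time_series(placeholder_series)
print([len(b) for b in batches])
```

Each batch can then be sent as its own request, with the interval between requests tuned to stay within ingestion quotas.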

Overall, while these issues may affect some users, Google Cloud Monitoring remains a reliable solution that is widely used by businesses and organizations around the world. By watching their performance and security alerts and metrics, reviewing their monitoring configuration and policies, and following best practices and industry standards, users can keep their monitoring setup optimized for performance, reliability, and security. Users can also reach out to Google Cloud support for help with any known issues or other technical challenges they may encounter.


[x] Reviewed by Enterprise Architecture

[x] Reviewed by Application Development

[x] Reviewed by Data Architecture