
[deprecated] Google Cloud Composer

Setup instructions for Google Cloud Composer integration with Databand.

Installation Guide

Databand integrates with Google Cloud Composer to give you observability into your Composer DAGs. This guide covers the platform-specific steps for tracking Composer in Databand.

Collect Your Cloud Composer Details

Before integrating Cloud Composer with Databand, you need the following two pieces of Google Cloud Composer metadata:

  • Cloud Composer URL
  • Cloud Storage location

Both can be found in the GCloud Console. If you need to create or configure a Google Cloud Composer environment, we recommend reading the Cloud Composer Getting Started documentation.

Cloud Composer URL (Airflow web UI): GCloud Console > Composer > {composer_env_name} > Environment Configuration > Airflow web UI.
Format: https://<guid>.appspot.com

Cloud Storage location: GCloud Console > Composer > {composer_env_name} > Environment Configuration > DAGs folder.
Format: gs://<bucket_name>-bucket/dags
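As a quick sanity check before you continue, the two values can be tested against the formats listed above. The following is a minimal sketch, not part of Databand: the function name check_composer_details and the sample values are hypothetical, and the check only mirrors the formats shown in this guide, so an environment that uses a different URL scheme would need a different rule.

```python
import re
from urllib.parse import urlparse


def check_composer_details(airflow_url: str, dags_folder: str) -> dict:
    """Check both values against the formats in this guide and split them up.

    Hypothetical helper for illustration only.
    """
    url = urlparse(airflow_url)
    # Expected format from this guide: https://<guid>.appspot.com
    if url.scheme != "https" or not url.netloc.endswith(".appspot.com"):
        raise ValueError(f"Unexpected Airflow web UI URL: {airflow_url!r}")

    # Expected format from this guide: gs://<bucket_name>-bucket/dags
    match = re.fullmatch(r"gs://([^/]+)/(.+)", dags_folder)
    if match is None:
        raise ValueError(f"Unexpected DAGs folder location: {dags_folder!r}")

    return {
        "airflow_host": url.netloc,
        "bucket": match.group(1),
        "dags_prefix": match.group(2),
    }


# Sample values are placeholders, not real resources.
details = check_composer_details(
    "https://abc123.appspot.com", "gs://my-env-bucket/dags"
)
print(details)
```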


Install DBND on your Google Composer Cluster

Update your Cloud Composer environment's PyPI packages with the following entry. Replace REPLACE_WITH_DBND_VERSION with the most recent Databand version (for example, if you're running v0.61.1, use 0.61.1). See Installing DBND for more details:

dbnd-airflow-auto-tracking==REPLACE_WITH_DBND_VERSION

Note: saving this change to your Cloud Composer environment configuration triggers a restart of your Airflow scheduler.


Your settings should look similar to this screenshot.

For more information on installing packages in Google Cloud Composer, see Installing Python dependencies | Cloud Composer | Google Cloud.
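If you prefer the command line to the console, the same package entry can likely be applied with the gcloud CLI. The sketch below only builds and prints the command. It assumes gcloud is installed and authenticated, and the environment name and location are placeholders that do not come from this guide. The helper name pypi_update_command is hypothetical.

```python
import shlex


def pypi_update_command(environment: str, location: str, dbnd_version: str) -> list:
    """Build a gcloud command that pins dbnd-airflow-auto-tracking.

    Hypothetical helper; environment and location are placeholders.
    """
    return [
        "gcloud", "composer", "environments", "update", environment,
        "--location", location,
        "--update-pypi-package",
        f"dbnd-airflow-auto-tracking=={dbnd_version}",
    ]


command = pypi_update_command(
    "my-composer-env", "us-central1", "REPLACE_WITH_DBND_VERSION"
)
# Print the command for review instead of running it, because applying it
# restarts the Airflow scheduler.
print(shlex.join(command))
```

To run the command from Python, pass the list to subprocess.run(command, check=True) once the placeholders are replaced.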

For Databand tracking to work properly with Airflow 2.0+, you need to disable lazy loading of plugins. To do so, set the following configuration property: core.lazy_load_plugins=False.

The screenshot below provides an example of setting this property in your Composer.

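One way to confirm that the override took effect is to inspect the environment's airflow.cfg. The sketch below assumes you already have the contents of that file as text (how you obtain it is up to you), and the function name lazy_load_plugins_disabled is hypothetical. It relies only on airflow.cfg being an INI-style file with a [core] section.

```python
import configparser


def lazy_load_plugins_disabled(airflow_cfg_text: str) -> bool:
    """Return True when [core] lazy_load_plugins is explicitly set to False.

    Hypothetical helper for illustration only.
    """
    # Disable interpolation so '%' characters in airflow.cfg are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(airflow_cfg_text)
    # Airflow 2.0+ loads plugins lazily by default, so a missing option
    # is treated as True here.
    lazy = parser.getboolean("core", "lazy_load_plugins", fallback=True)
    return not lazy


# Sample configuration text, standing in for a real airflow.cfg.
sample_cfg = "[core]\nlazy_load_plugins = False\n"
print(lazy_load_plugins_disabled(sample_cfg))
```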

Create Monitor DAG

To report DAG execution and dbnd metrics to Databand, you will need the databand_airflow_monitor DAG running in your Cloud Composer environment.

  1. Create the databand_airflow_monitor DAG in Airflow. Create a new file, databand_airflow_monitor.py, with the following DAG definition and add it to your project DAGs:

     from airflow_monitor.monitor_as_dag import get_monitor_dag

     # This DAG is used by Databand to monitor your Airflow installation.
     dag = get_monitor_dag()

  2. Deploy your new DAG and enable it in the Airflow UI.
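For the deploy step, one option is to copy the file into the environment's DAGs folder with the gcloud CLI. This is a sketch under assumptions rather than the documented Databand procedure: it assumes gcloud is installed and authenticated, the environment name and location are placeholders, and the helper name dag_import_command is hypothetical. It only builds and prints the command.

```python
import shlex


def dag_import_command(
    environment: str,
    location: str,
    source: str = "databand_airflow_monitor.py",
) -> list:
    """Build a gcloud command that uploads a DAG file to the DAGs folder.

    Hypothetical helper; environment and location are placeholders.
    """
    return [
        "gcloud", "composer", "environments", "storage", "dags", "import",
        "--environment", environment,
        "--location", location,
        "--source", source,
    ]


command = dag_import_command("my-composer-env", "us-central1")
# Print the command so it can be reviewed before it is run.
print(shlex.join(command))
```

After the upload, the DAG still has to be enabled in the Airflow UI as described in step 2.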

Airflow Syncer

To complete the configuration, define an Airflow Syncer in Databand and create an Airflow Connection with the Databand URL and configuration parameters. See Apache Airflow Syncer for detailed instructions.

