[deprecated] Amazon Managed Workflows Airflow
Follow the instructions on this page to configure the standard Airflow integration.
Amazon Managed Workflows is a managed Apache Airflow service that makes it easier to set up and operate end-to-end data pipelines in the AWS cloud at scale.
Installation Guide
Databand integrates with Amazon Managed Workflows (MWAA) to provide you with observability over your MWAA Airflow DAGs. This guide will cover platform-specific steps for tracking MWAA in Databand. If you need to create or configure an Amazon Managed Workflows environment, we recommend reading the Getting Started for Amazon Managed Workflows documentation.
Collect Your MWAA Details
Before integrating MWAA with Databand, collect the following two pieces of MWAA metadata:
- MWAA Airflow UI URL
- MWAA S3 Storage location
MWAA URL
The Airflow URL can be found in the AWS Console:
Go to AWS MWAA>Environments>{mwaa_env_name}>Details>Airflow web UI.
Format to use: https://<guid>.<aws_region>.airflow.amazonaws.com
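As a quick sanity check before pasting the URL into Databand, the format above can be validated with a short script. This is a minimal sketch: the regular expression and the helper name parse_mwaa_url are assumptions built from the documented format, not part of Databand or AWS tooling, and an environment whose URL carries extra labels would need a looser pattern.

```python
import re

# Assumed pattern for the documented format:
#   https://<guid>.<aws_region>.airflow.amazonaws.com
# The character classes for <guid> and <aws_region> are illustrative guesses.
MWAA_URL_PATTERN = re.compile(
    r"^https://(?P<guid>[A-Za-z0-9-]+)"
    r"\.(?P<region>[a-z]{2}(?:-[a-z]+)+-\d+)"
    r"\.airflow\.amazonaws\.com/?$"
)


def parse_mwaa_url(url):
    """Return (guid, region) if the URL matches the expected format, else None."""
    match = MWAA_URL_PATTERN.match(url.strip())
    if match is None:
        return None
    return match.group("guid"), match.group("region")
```

A result of None usually means the scheme is missing or the value was copied from a different field in the console.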


MWAA S3 Bucket Location
Go to AWS MWAA>Environments>{mwaa_env_name}>DAG code in Amazon S3>S3 Bucket
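Both values can also be read programmatically instead of through the console. The sketch below assumes boto3 is installed and AWS credentials are configured; it uses the MWAA get_environment call and reads the WebserverUrl and SourceBucketArn fields of its response. The helper names and the returned dictionary keys are hypothetical and chosen only for illustration.

```python
def extract_mwaa_details(response):
    """Pull the Airflow UI URL and S3 bucket out of a get_environment response."""
    env = response["Environment"]
    url = env["WebserverUrl"]
    # Add the scheme if the service returns a bare hostname.
    if not url.startswith("https://"):
        url = "https://" + url
    # A bucket ARN looks like arn:aws:s3:::<bucket-name>.
    bucket = env["SourceBucketArn"].rsplit(":", 1)[-1]
    return {"airflow_url": url, "s3_bucket": "s3://" + bucket}


def fetch_mwaa_details(env_name):
    """Call AWS for the named environment (needs boto3 and credentials)."""
    import boto3  # imported lazily so the parsing helper works without it

    client = boto3.client("mwaa")
    return extract_mwaa_details(client.get_environment(Name=env_name))
```

Calling fetch_mwaa_details with your {mwaa_env_name} should return the same two values shown in the console.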


Installing DBND in MWAA
In the MWAA S3 bucket, update your requirements.txt file with the following line:
dbnd-airflow-auto-tracking==REPLACE_WITH_DATABAND_VERSION
Update the requirements.txt version in the MWAA environment configuration. Please note that saving this change to your MWAA environment configuration will trigger a restart of your Airflow Scheduler.
For more information on installing 'extras' in MWAA, see Installing Python dependencies - Amazon Managed Workflows for Apache Airflow. For Databand installation details, see Installing DBND.
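If requirements.txt is maintained by a script, the pin can be added or refreshed without creating duplicate entries. This is a minimal sketch that works on the file contents as a string; the function name pin_databand_requirement is hypothetical, and the version must be supplied by you in place of REPLACE_WITH_DATABAND_VERSION.

```python
import re

PACKAGE = "dbnd-airflow-auto-tracking"


def pin_databand_requirement(requirements_text, version):
    """Return requirements text with exactly one pinned Databand entry."""
    kept = []
    for line in requirements_text.splitlines():
        # Take the package name that precedes any specifier, extra, or marker.
        name = re.split(r"[=<>!~\[; ]", line.strip(), maxsplit=1)[0]
        if name != PACKAGE:
            kept.append(line)
    kept.append(f"{PACKAGE}=={version}")
    return "\n".join(kept) + "\n"
```

The returned text can then be uploaded to the MWAA S3 bucket as the new requirements.txt.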
Installing Monitor DAG
To report DAG execution and DBND metrics to Databand, you need the databand_airflow_monitor DAG running in your MWAA environment.
- Create the databand_airflow_monitor DAG in Airflow. Create a new file, databand_airflow_monitor.py, with the following DAG definition and add it to your project DAGs:
from airflow_monitor.monitor_as_dag import get_monitor_dag
# This DAG is used by Databand to monitor your Airflow installation.
dag = get_monitor_dag()
- Deploy your new DAG and enable it in the Airflow UI.
Airflow Syncer
To complete the configuration, define an Airflow Syncer in Databand and create an Airflow Connection with the Databand URL and configuration parameters. See Apache Airflow Syncer for detailed instructions.