GCP Environment
How to configure an environment for GCP.
Before You Begin
You must have the dbnd-gcp plugin installed.
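If the plugin is not yet installed, it is typically available from PyPI (the package name is assumed to match the plugin name):

$ pip install dbnd-gcp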
To Set up an Environment for Google Cloud Platform
- Open the project.cfg file, and add gcp to the list of environments.
[core]
environments = ['local', 'gcp']
- In the [gcp] section, set your Airflow connection ID, and optionally provide the root bucket/folder for your metadata store.
[gcp]
root = gs://databand_examples
conn_id = google_cloud_default
- To configure the default connection with your Google Cloud project ID, run the following command in the command line:
$ dbnd airflow connections --add \
--conn_id google_cloud_default \
--conn_type google_cloud \
--project_id <your project ID>
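To confirm that the connection was registered, the Airflow connections CLI also supports listing; assuming dbnd proxies it in the same way as the add command above, a check along these lines should work:

$ dbnd airflow connections --list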
- To use the default Google Cloud credentials, run the following command:
gcloud auth application-default login
[gcp] Configuration Section Parameter Reference
env_label
- Set the environment type to be used, e.g. dev, int, prod.
production
- Indicates that the environment is production.
conn_id
- Set the cloud connection settings.
root
- Determine the main data output location.
local_engine
- Set which engine will be used for local execution.
remote_engine
- Set the remote engine for the execution of driver/tasks.
submit_driver
- Enable submitting the driver to the remote_engine.
submit_tasks
- Enable submitting tasks to the remote engine one by one.
spark_config
- Determine the Spark configuration settings.
spark_engine
- Set the cluster engine to be used, e.g. local, emr (aws), dataproc (gcp), etc.
hdfs
- Set the HDFS cluster configuration settings.
beam_config
- Set the Apache Beam configuration settings.
beam_engine
- Set the Apache Beam cluster engine, e.g. local or dataflow.
docker_engine
- Set the Docker job engine, e.g. docker or aws_batch.
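As an illustration only, a [gcp] section that combines several of the parameters above might look like the following. The values shown (the dev label and the dataproc engine) are placeholders drawn from the examples in the reference list, not defaults:

[gcp]
env_label = dev
conn_id = google_cloud_default
root = gs://databand_examples
spark_engine = dataproc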