
Collected Metadata

This cheat sheet will help you navigate the data collection process.

This table lists pipeline and dataset metadata collected by Databand and explains how users can control what is collected.
Tracking metadata can be configured through the SDK Configuration, the dbnd_config Airflow Connection, or the Databand UI.

| Metadata | Default | DBND Config | Airflow Syncer via UI |
| --- | --- | --- | --- |
| Source Code | disabled | [tracking] track_source_code = true | Select/Unselect the 'Include source code' option at Settings -> Airflow Syncers -> Add/Edit Syncer page |
| Logs | disabled | [log] preview_head_bytes = 8192, preview_tail_bytes = 8192 | Select/Unselect the collect logs option at Settings -> Airflow Syncers -> Add/Edit Syncer page; if logs are enabled, provide the number of KB to collect from the head and tail (maximum 8096 KB each) |
| Errors | enabled | ask the Databand team to switch on/off | ask the Databand team to switch on/off |
| Airflow XCOM values | disabled | [airflow_tracking] track_xcom_values = true | not supported in UI; set via dbnd_config: "airflow_tracking": { "track_xcom_values": true } |
| Return value of Airflow Python Task | disabled | [airflow_tracking] track_airflow_execute_result = true | not supported in UI; set via dbnd_config: "airflow_tracking": { "track_airflow_execute_result": true } |
| Data Operations | explicit reporting by a user using log_metric, log_dataframe, and log_dataset_op; check Dataset Logging for more info | | |
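
The two settings that the UI does not support (XCom values and the Airflow Python task return value) are set through the dbnd_config connection as JSON. The sketch below builds that JSON with Python's standard library. The helper name `build_dbnd_config_extra` is hypothetical, and it assumes the JSON is pasted into the Extra field of the dbnd_config Airflow connection; only the `airflow_tracking` keys come from the table above.

```python
import json


def build_dbnd_config_extra(track_xcom_values=False, track_airflow_execute_result=False):
    """Hypothetical helper: return JSON text for the dbnd_config connection.

    Assumption: the result is pasted into the connection's Extra field.
    The key names are the ones listed in the table above.
    """
    extra = {
        "airflow_tracking": {
            "track_xcom_values": track_xcom_values,
            "track_airflow_execute_result": track_airflow_execute_result,
        }
    }
    return json.dumps(extra, indent=2)


# Enable both options, which are disabled by default.
extra_json = build_dbnd_config_extra(
    track_xcom_values=True,
    track_airflow_execute_result=True,
)
print(extra_json)
```

Generating the JSON from a dictionary avoids hand-written quoting mistakes, such as Python's `True` in place of JSON's `true`.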

Example: Enabling Code and Logs tracking

You can enable source code tracking and log tracking in the Databand service by providing the following configuration:

[tracking]
track_source_code=True

[log]
preview_head_bytes=15360
preview_tail_bytes=15360
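
As a quick sanity check, the snippet above can be parsed to confirm that the values read as intended and to see the preview sizes in KB. This is a minimal sketch that uses Python's standard `configparser`, not Databand's own configuration loader, and it assumes the file is plain INI syntax as shown.

```python
import configparser

# The same configuration as in the example above.
CONFIG_TEXT = """
[tracking]
track_source_code=True

[log]
preview_head_bytes=15360
preview_tail_bytes=15360
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# getboolean accepts true/false in any letter case.
track_source_code = parser.getboolean("tracking", "track_source_code")

# The preview sizes are given in bytes; convert to KB for readability.
head_kb = parser.getint("log", "preview_head_bytes") / 1024
tail_kb = parser.getint("log", "preview_tail_bytes") / 1024

print(f"source code tracking: {track_source_code}")
print(f"log preview: first {head_kb:g} KB, last {tail_kb:g} KB")
# prints:
# source code tracking: True
# log preview: first 15 KB, last 15 KB
```

With these values, Databand would collect the first 15 KB and the last 15 KB of each log, compared with the 8192-byte (8 KB) values listed in the table.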

For the Airflow Tracker, you can simply edit the Airflow Tracking Configuration.

