
Job Monitoring

In-depth data quality monitoring with Databand.

Databand helps data engineers meet their SLAs. You can use it to monitor runs, alert on failures, and perform root cause analysis to find where errors originate.

You can use Databand to monitor pipelines of any size or complexity, across multiple orchestration systems. For example, Databand can centrally monitor DAGs across any number of orchestrator (e.g., Airflow) instances, giving you a single pane of glass for all pipeline operations.

You can click into a pipeline to see metric information and trends for that particular DAG. You can also click into an individual run of a pipeline to drill into the details of that specific execution, so you can see exactly where failures are coming from and how metrics are being reported from specific tasks, functions, or jobs in your pipeline.


The Run Details page in the Databand UI, showing an error log from a Spark task within a DAG.

You can collect application logs and metadata from your pipelines, including:

  • Runtime information
  • Task dependencies and lineage
  • Resource consumption
  • Code versioning information
  • Datasets that were read or written by this run

Orchestration Level Information

If you are using an orchestrator such as Apache Airflow, Databand can sync metadata from the Airflow database and give you deeper insights and utilities for monitoring the health of your DAGs, such as alerts on anomalous runtimes.
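To make the idea of an anomalous runtime concrete, here is a minimal sketch of one common way to flag such a run: compare the latest runtime against the mean and standard deviation of previous runs. This is an illustration only, not Databand's detection logic; the function name `is_anomalous_runtime` and the `z_threshold` parameter are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous_runtime(history_seconds, latest_seconds, z_threshold=3.0):
    """Flag a run whose duration is far from the historical mean.

    Hypothetical helper for illustration; not part of any Databand API.
    """
    # With fewer than two past runs there is no spread to compare against.
    if len(history_seconds) < 2:
        return False

    mu = mean(history_seconds)
    sigma = stdev(history_seconds)

    # If every past run took exactly the same time, any deviation is unusual.
    if sigma == 0:
        return latest_seconds != mu

    # Z-score: how many standard deviations the latest run is from the mean.
    return abs(latest_seconds - mu) / sigma > z_threshold
```

For a DAG whose recent runs took roughly 100 seconds each, a 180-second run would be flagged, while a 101-second run would not. A simple z-score like this assumes runtimes are fairly stable; pipelines with strong daily or weekly patterns usually need a more careful baseline.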

Task Metadata

If you execute jobs in remote or distributed systems such as Spark, SQL databases, or Docker containers, it can be a challenge to gather the right information from your execution environment and understand how it aligns with your DAGs. This can lead to information silos that slow down debugging and root cause analysis, or even create inconsistencies between systems. Databand can track metadata and logs from task executors, so you can access log and error information in one place.

Function Metadata

Databand can provide granular tracking of workflows down to the function level, helping you drill into pipelines and quickly discover where errors arise.
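As a rough illustration of what function-level tracking means, the sketch below wraps each pipeline function in a decorator that records its name, status, duration, and any error. It uses only the Python standard library and is not Databand's SDK; `track`, `RUN_LOG`, and the example functions are hypothetical names chosen for this example.

```python
import functools
import time

# Hypothetical in-memory store; a real tracker would send records to a service.
RUN_LOG = []


def track(func):
    """Record status, duration, and error details for each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        record = {"function": func.__name__, "status": "running", "error": None}
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            record["status"] = "success"
            return result
        except Exception as exc:
            # Keep the error with the function that raised it, then re-raise
            # so the pipeline still fails as it normally would.
            record["status"] = "failed"
            record["error"] = f"{type(exc).__name__}: {exc}"
            raise
        finally:
            record["duration_seconds"] = time.perf_counter() - start
            RUN_LOG.append(record)
    return wrapper


@track
def extract():
    # Example data containing a bad value that breaks the next step.
    return [1, 2, None, 4]


@track
def transform(rows):
    return [r * 2 for r in rows]


def pipeline():
    return transform(extract())
```

Calling `pipeline()` raises a `TypeError`, and `RUN_LOG` then shows `extract` as successful and `transform` as failed, which points directly at the function where the error arose rather than at the pipeline as a whole.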

