[deprecated] Tracking with Deequ

This page is deprecated. Please see Tracking Spark/JVM Applications.

Updated 4 months ago

