dbnd.log_dataframe

dbnd.log_dataframe(key: str, value: Union[pd.DataFrame, spark.DataFrame, PostgresTable, SnowflakeTable] = None, path: Optional[str] = None, operation_type: DbndTargetOperationType = DbndTargetOperationType.read, with_preview: Optional[bool] = None, with_size: Optional[bool] = None, with_schema: Optional[bool] = None, with_stats: Optional[Union[bool, str, List[str], LogDataRequest]] = None, with_histograms: Optional[Union[bool, str, List[str], LogDataRequest]] = None, raise_on_error: bool = False) -> None

Logs a dataframe to dbnd.

Parameters
  • key – Name of the dataframe.

  • value – The dataframe itself.

  • path – Optional target or path representing a target to connect the dataframe to.

  • operation_type – Type of operation performed on the target: reading or writing the dataframe. Defaults to read.

  • with_preview – If True, log a preview of the dataframe.

  • with_size – If True, log the size of the dataframe.

  • with_schema – If True, log the schema of the dataframe.

  • with_stats – If True, calculate and log stats of the dataframe.

  • with_histograms – If True, calculate and log histograms of the dataframe.

  • raise_on_error – If True, raise an exception when an error occurs. Defaults to False.

Example:

import pandas as pd
from dbnd import log_dataframe, task
@task
def process_customers_data(data) -> pd.DataFrame:
    log_dataframe("customers_data", data)
    return data
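
The optional flags listed under Parameters can be combined in one call. The following is a minimal sketch rather than an official recipe: the function name log_customers and the sample data are hypothetical, and the try/except fallback is an illustrative stand-in that only exists so the snippet also runs where dbnd is not installed. The keyword arguments themselves are taken from the signature above.

```python
import pandas as pd

try:
    from dbnd import log_dataframe
except ImportError:
    # Hypothetical stand-in, used only when dbnd is not installed.
    def log_dataframe(key, value=None, **options):
        print(f"(dbnd not installed) would log {key!r} with options {options}")


def log_customers(customers: pd.DataFrame) -> pd.DataFrame:
    # Ask for a preview, the schema, and the size, and skip the stats and
    # histograms, which the parameter descriptions say must be calculated.
    log_dataframe(
        "customers_data",
        customers,
        with_preview=True,
        with_schema=True,
        with_size=True,
        with_stats=False,
        with_histograms=False,
        raise_on_error=False,  # the default, per the signature
    )
    return customers


# Hypothetical sample data for illustration.
customers = pd.DataFrame({"id": [1, 2, 3], "country": ["US", "DE", "FR"]})
result = log_customers(customers)
```

Passing raise_on_error=True instead may be preferable in tests, where a failed logging call should surface as an exception rather than pass silently.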