By using DBND, you can track input/output, data lineage, and schema metadata from your pipelines.
All you need to implement tracking is to annotate your function with a decorator.
Below is an example for a Python function; decorators for Java and Scala functions are supported as well.
    # module1.py
    import pandas as pd
    from dbnd import task

    # define a function with a decorator
    @task
    def user_function(pandas_df: pd.DataFrame, counter: int, random: int):
        return "OK"
For certain objects passed to your functions such as Pandas DataFrames and Spark DataFrames, DBND automatically collects data set previews and schema info. This makes it easier to track data lineage and report on data quality issues.
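To make the idea of decorator-based tracking more concrete, here is a minimal sketch of what such a decorator could record. It does not use DBND and is not DBND's implementation: `simple_task` and `captured` are hypothetical names, and a plain list stands in for the DataFrame argument so the example runs with the standard library only.

```python
import functools
import inspect

# Hypothetical in-memory store standing in for a tracking backend.
captured = []


def simple_task(func):
    """Hypothetical stand-in for a tracking decorator (not the DBND API)."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Map positional and keyword arguments to their parameter names.
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        result = func(*args, **kwargs)
        # Record the input names with their runtime types, plus the output.
        captured.append(
            {
                "function": func.__name__,
                "inputs": {
                    name: type(value).__name__
                    for name, value in bound.arguments.items()
                },
                "output": result,
            }
        )
        return result

    return wrapper


@simple_task
def user_function(rows: list, counter: int, random: int):
    return "OK"


user_function([{"a": 1}], counter=2, random=3)
```

After the call, `captured` holds one record describing the inputs and the output of `user_function`. A real tracker would send richer metadata, such as previews and schemas, to a backend instead of a list.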
Suppose you want to track one or more functions from a module. Instead of decorating each function with @task, you can use the track_functions function.
Review the following example, where module1 contains the following functions:
    # module1.py
    def f1():
        pass

    def f2():
        pass

    def f3():
        pass
In module2, we have the following functions:
    # module2.py
    from dbnd import track_functions
    from module1 import f1, f2, f3

    track_functions(f1, f2, f3)

    def f4():
        f1()
        f2()
        f3()
The track_functions function takes functions as arguments and automatically decorates them, so you can track any function without changing its existing code or manually adding decorators.
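One way to decorate functions "from the outside" is to rebind their names to wrapped versions in the module where they are looked up. The sketch below illustrates that general technique only; it is an assumption about how such a helper could work, not a description of DBND internals. `simple_track`, `simple_track_functions`, `calls`, and the throwaway `demo` module are all hypothetical.

```python
import functools
import types

# Hypothetical call log standing in for a tracking backend.
calls = []


def simple_track(func):
    """Wrap func so that each call is recorded (hypothetical helper)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)

    return wrapper


def simple_track_functions(module, *funcs):
    """Rebind every attribute of module that refers to one of funcs."""
    for name, value in list(vars(module).items()):
        if any(value is func for func in funcs):
            setattr(module, name, simple_track(value))


# Build a throwaway module so the example is self-contained.
demo = types.ModuleType("demo")
exec(
    "def f1():\n"
    "    return 1\n"
    "def f2():\n"
    "    return 2\n"
    "def f4():\n"
    "    return f1() + f2()\n",
    demo.__dict__,
)

# The source of f1, f2, and f4 is untouched; only the names are rebound.
simple_track_functions(demo, demo.f1, demo.f2)
result = demo.f4()
```

Because `f4` looks up `f1` and `f2` in its module namespace at call time, it picks up the wrapped versions, and both calls end up in `calls`.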
For an easier and faster approach, you can use the track_module_functions function to track all functions inside a given module. With it, module2.py from the above example would look like this:
    # module2.py
    from dbnd import track_module_functions
    from module1 import f1, f2, f3
    import module1

    track_module_functions(module1)

    def f4():
        f1()
        f2()
        f3()
To track all functions from multiple modules, there is also track_modules, which takes modules as arguments and tracks all functions contained within those modules:
    # module3.py
    from dbnd import track_modules
    import module1
    import module2

    track_modules(module1, module2)

    def f5():
        module2.f4()
In this example, f1, f2, and f3 are going to be tracked even though they are not called directly in this module.
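Module-level tracking can be pictured as a loop over every function a module defines. The following sketch shows that idea with the standard library only. It is a hypothetical illustration, not DBND's code: `simple_track_modules`, `make_module`, and `tracked_calls` are invented for this example, and the two modules are built in memory to keep it self-contained.

```python
import functools
import inspect
import types

# Hypothetical call log standing in for a tracking backend.
tracked_calls = []


def simple_track(func):
    """Wrap func so that each call is recorded (hypothetical helper)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        tracked_calls.append(f"{func.__module__}.{func.__name__}")
        return func(*args, **kwargs)

    return wrapper


def simple_track_modules(*modules):
    """Wrap every function defined in each of the given modules."""
    for module in modules:
        for name, func in inspect.getmembers(module, inspect.isfunction):
            # Skip functions that were only imported into this module.
            if func.__module__ == module.__name__:
                setattr(module, name, simple_track(func))


def make_module(name, source, **extra):
    """Create an in-memory module from source (for this demo only)."""
    module = types.ModuleType(name)
    module.__dict__.update(extra)
    exec(source, module.__dict__)
    return module


module1 = make_module("module1", "def f1():\n    return 'f1'\n")
module2 = make_module(
    "module2",
    "def f4():\n    return module1.f1()\n",
    module1=module1,
)

simple_track_modules(module1, module2)
result = module2.f4()
```

Calling `module2.f4()` records both `f4` and the nested call to `module1.f1`, which mirrors how functions can be tracked even when the calling code never references them directly.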