Tracking MLFlow

How to get started with tracking MLFlow in Databand.

If you use MLFlow, you can duplicate Databand metrics to the MLFlow store and maintain data in the MLFlow system as well.

To Integrate MLFlow with Databand

  1. Run the following command to install the integration plugin:
pip install dbnd-mlflow

or

pip install databand[mlflow]
  2. Apply the following configuration:
[core]
# Databand store URL should be defined
databand_url=http://localhost:8081

[mlflow_tracking]
# Enable tracking to Databand store
databand_tracking=True

# Optionally, define a URI for mlflow store; mlflow.get_tracking_uri() is used by default
duplicate_tracking_to=http://mlflow-store/
  3. Run pip install dbnd-examples to install the examples, and then run the examples:
dbnd run dbnd_examples.mlflow.run_mlflow_in_dbnd_task.mlflow_tracking_in_task_example

Alternatively, set the configuration manually:

dbnd run dbnd_examples.mlflow.run_mlflow_in_dbnd_task.mlflow_tracking_in_task_example --set-config mlflow_tracking.databand_tracking=True
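Before a run, it can help to check that the section and key names in the configuration are spelled as intended. The following is a minimal sketch that parses the fragment above with Python's standard configparser and builds the matching --set-config argument. It assumes the file is plain INI; Databand's own configuration loader may behave differently, and load_tracking_settings and set_config_args are hypothetical helper names, not part of dbnd.

```python
import configparser

# The configuration fragment from step 2, as plain INI text.
CONFIG_TEXT = """
[core]
databand_url=http://localhost:8081

[mlflow_tracking]
databand_tracking=True
duplicate_tracking_to=http://mlflow-store/
"""


def load_tracking_settings(text):
    """Parse the INI text and return the MLflow-related settings as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["mlflow_tracking"]
    return {
        "databand_url": parser["core"]["databand_url"],
        "databand_tracking": section.getboolean("databand_tracking"),
        # Optional key: None when it is not set.
        "duplicate_tracking_to": section.get("duplicate_tracking_to"),
    }


def set_config_args(section, key, value):
    """Build the command-line form of one setting: --set-config section.key=value."""
    return ["--set-config", f"{section}.{key}={value}"]


settings = load_tracking_settings(CONFIG_TEXT)
args = set_config_args("mlflow_tracking", "databand_tracking", settings["databand_tracking"])
print(settings)
print(" ".join(args))
```

The second helper only formats a string; it mirrors the --set-config form shown in the command above and does not invoke dbnd.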

Task Example

The following example code shows how the logging works.

from random import randint, random

from dbnd import task
from mlflow import start_run, end_run
from mlflow import log_metric, log_param

@task
def mlflow_example():
    # Open an mlflow run; everything logged below belongs to it
    start_run()
    # params
    log_param("param1", randint(0, 100))
    log_param("param2", randint(0, 100))
    # metrics
    log_metric("foo1", random())
    log_metric("foo2", random())
    # Close the run
    end_run()
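When duplication is enabled, each log_param() and log_metric() call in the task above reaches two stores: Databand and the MLflow store. The following is a rough sketch of that fan-out idea in plain Python. RecordingStore and FanOutStore are hypothetical names used only for illustration; they are not the classes in dbnd_mlflow, and the real store handles more than these two calls.

```python
class RecordingStore:
    """Hypothetical in-memory backend that remembers what it was sent."""

    def __init__(self):
        self.params = {}
        self.metrics = {}

    def log_param(self, key, value):
        self.params[key] = value

    def log_metric(self, key, value):
        self.metrics[key] = value


class FanOutStore:
    """Forwards every log call to a primary store and an optional duplicate."""

    def __init__(self, primary, duplicate=None):
        self.targets = [primary] if duplicate is None else [primary, duplicate]

    def log_param(self, key, value):
        for target in self.targets:
            target.log_param(key, value)

    def log_metric(self, key, value):
        for target in self.targets:
            target.log_metric(key, value)


# Stand-ins for the Databand side and the MLflow side.
databand_side = RecordingStore()
mlflow_side = RecordingStore()

store = FanOutStore(databand_side, duplicate=mlflow_side)
store.log_param("param1", 42)
store.log_metric("foo1", 0.5)

print(databand_side.params, mlflow_side.params)
print(databand_side.metrics, mlflow_side.metrics)
```

Leaving out the duplicate argument gives a store that writes to the primary target only, which corresponds to running without duplicate_tracking_to.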

Execution Flow

When you run dbnd run mlflow_example --set-config mlflow_tracking.databand_tracking=True, the following happens in the backend:

  1. Databand creates a new DBND context
  2. dbnd_on_pre_init_context hook from dbnd_mlflow is triggered
  3. A new URI is computed to be used by mlflow
    For example: dbnd://localhost:8081?duplicate_tracking_to=http%253A%252F%252Fmlflow-store%253A80%252F
  4. The new URI is set to be used with mlflow.set_tracking_uri()
  5. mlflow_example task starts:
  6. mlflow.start_run()
  7. mlflow reads entry_points for each installed package and finds:
"dbnd = dbnd_mlflow.tracking_store:get_dbnd_store",
"dbnd+s = dbnd_mlflow.tracking_store:get_dbnd_store",
"databand = dbnd_mlflow.tracking_store:get_dbnd_store",
"databand+s = dbnd_mlflow.tracking_store:get_dbnd_store",
  8. mlflow creates TrackingStoreClient by using the new URI
  9. The URI scheme instructs mlflow to use dbnd_mlflow.tracking_store:get_dbnd_store
    • get_dbnd_store creates dbnd TrackingAPIClient
    • get_dbnd_store creates mlflow tracking store to duplicate tracking to
    • get_dbnd_store returns DatabandStore instance
  10. log_param()/log_metric()
    • calls to DatabandStore
    • calls to TrackingAPIClient
    • calls to mlflow tracking store to duplicate tracking to
  11. mlflow.end_run()
  12. mlflow_example ends
  13. dbnd_on_exit_context hook from dbnd_mlflow is triggered
  14. The original mlflow tracking URI is restored.
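The URI handling in the flow above can be sketched with Python's urllib.parse. The example URI in step 3 carries the MLflow store URL percent-encoded twice (%253A rather than %3A), so the sketch below encodes twice to reproduce it. This is an assumption about the format inferred from that example, not a copy of the dbnd_mlflow code. The mapping of https to the dbnd+s scheme is likewise assumed, and build_tracking_uri and read_duplicate_target are hypothetical helper names.

```python
from urllib.parse import parse_qs, quote, unquote, urlsplit

# Scheme names taken from the entry points listed in the flow above.
DBND_SCHEMES = {"dbnd", "dbnd+s", "databand", "databand+s"}


def build_tracking_uri(databand_url, duplicate_to=None):
    """Compose a dbnd:// tracking URI from the Databand URL (assumed format)."""
    parts = urlsplit(databand_url)
    # Assumption: https maps to "dbnd+s" and http maps to "dbnd".
    scheme = "dbnd+s" if parts.scheme == "https" else "dbnd"
    uri = f"{scheme}://{parts.netloc}"
    if duplicate_to:
        # Encode twice to match the "%253A" / "%252F" form in the example URI.
        encoded = quote(quote(duplicate_to, safe=""), safe="")
        uri += f"?duplicate_tracking_to={encoded}"
    return uri


def read_duplicate_target(tracking_uri):
    """Recover the MLflow store URL from a tracking URI, or None if absent."""
    parts = urlsplit(tracking_uri)
    if parts.scheme not in DBND_SCHEMES:
        raise ValueError(f"not a Databand tracking URI: {tracking_uri}")
    values = parse_qs(parts.query).get("duplicate_tracking_to")
    # parse_qs undoes one level of encoding; unquote undoes the second.
    return unquote(values[0]) if values else None


uri = build_tracking_uri("http://localhost:8081", "http://mlflow-store:80/")
print(uri)
print(read_duplicate_target(uri))
```

Looking up the scheme in a fixed set stands in for the entry-point lookup that mlflow performs; in the real flow, the scheme selects get_dbnd_store, which then builds the stores.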
