MLflow

MLflow is an open-source library from the Databricks team for managing the machine learning lifecycle. It supports packaging projects, tracking parameters and metrics, and versioning models.

Install MLflow using pip:

pip install mlflow

MLflow can be used in any Spark environment, but the automated tracking and the integrated MLflow UI are Databricks-specific functionality.

Track metrics and parameters

import mlflow

# Log parameters and metrics from your normal MLlib run
with mlflow.start_run():
    # Log a parameter (key-value pair)
    mlflow.log_param("alpha", 0.1)
    # Log a metric; metrics can be updated throughout the run
    mlflow.log_metric("AUC", 0.871827)
    mlflow.log_metric("F1", 0.726153)
    mlflow.log_metric("Precision", 0.213873)

MLflow GitHub: https://github.com/mlflow/mlflow/