Model Saving and Loading

Model Saving

Save model(s) to mounted storage

lrcvModel.save("/mnt/trainedmodels/lr")
rfcvModel.save("/mnt/trainedmodels/rf")
dtcvModel.save("/mnt/trainedmodels/dt")
display(dbutils.fs.ls("/mnt/trainedmodels/"))
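
Calling save() on a path that already holds a model raises an error, so re-running the cell above fails once the directories exist. One way around this, assuming it is acceptable to replace the earlier copy, is the writer's overwrite() option. The sketch below is illustrative only: the helper name save_models and the name-to-model dictionary are assumptions, not part of the original notebook.

```python
def save_models(models, base_dir):
    """Save each model under base_dir/<name>, replacing any earlier copy.

    `models` maps a short name (hypothetical, e.g. "lr") to a fitted
    Spark ML model that supports write().
    """
    saved_paths = {}
    for name, model in models.items():
        path = f"{base_dir.rstrip('/')}/{name}"
        # overwrite() replaces an existing model directory;
        # a plain save(path) would raise if the path already exists.
        model.write().overwrite().save(path)
        saved_paths[name] = path
    return saved_paths

# In the notebook this could be called as, for example:
# save_models({"lr": lrcvModel, "rf": rfcvModel, "dt": dtcvModel}, "/mnt/trainedmodels")
```

The returned dictionary maps each name to the path it was written to, which can be handy when loading the models again later.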

Remove a model

A saved Spark MLlib model is a directory containing a series of files, not a single file. To remove a model, delete its directory recursively: passing True as the second argument to dbutils.fs.rm removes the directory along with everything inside it.

dbutils.fs.rm("/mnt/trainedmodels/dt", True)
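
Because the delete is recursive, a mistyped path can remove far more than one model. The sketch below shows one possible guard, assuming all models live under /mnt/trainedmodels as in the examples above. The helper remove_model is hypothetical, and fs stands for dbutils.fs (or any object with a compatible rm method).

```python
import posixpath

def remove_model(fs, model_dir, root="/mnt/trainedmodels"):
    """Recursively delete model_dir, but only if it sits inside root."""
    root = posixpath.normpath(root)
    target = posixpath.normpath(model_dir)  # also resolves ".." segments
    # Reject the root itself and anything outside it.
    if not target.startswith(root + "/"):
        raise ValueError(f"Refusing to delete {model_dir!r}: not inside {root!r}")
    # Second argument True requests a recursive delete.
    return fs.rm(target, True)

# In a Databricks notebook, for example:
# remove_model(dbutils.fs, "/mnt/trainedmodels/dt")
```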

Score new data using a trained model

Load in required libraries

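# Loaders for the saved cross-validated model and the fitted transformation pipeline
from pyspark.ml.tuning import CrossValidatorModel
from pyspark.ml import PipelineModel
# Note: importing round here shadows Python's built-in round() in this notebook
from pyspark.sql.functions import col, round
from pyspark.sql.types import IntegerType, FloatType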

Load in the transformation pipeline
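
A minimal sketch of this step, assuming the fitted pipeline was saved as a PipelineModel. The path passed in and the helper name prepare_features are hypothetical; substitute wherever your pipeline was actually saved.

```python
def prepare_features(new_df, pipeline_path):
    """Load a saved PipelineModel and apply it to new, unscored data."""
    # Imported here so this step's dependency on pyspark is explicit.
    from pyspark.ml import PipelineModel

    pipeline_model = PipelineModel.load(pipeline_path)
    # transform() runs every fitted stage and returns a new DataFrame.
    return pipeline_model.transform(new_df)
```

Applying the same fitted pipeline that was used during training keeps the feature encoding of the new data consistent with what the model expects.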

Load in the trained model
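
A minimal sketch of this step, assuming the model is one of the CrossValidatorModel directories saved earlier, such as /mnt/trainedmodels/lr, and that the input DataFrame has already been through the transformation pipeline. The helper name score_with_saved_model is hypothetical.

```python
def score_with_saved_model(features_df, model_path):
    """Load a saved CrossValidatorModel and score a prepared DataFrame."""
    # Imported here so this step's dependency on pyspark is explicit.
    from pyspark.ml.tuning import CrossValidatorModel

    cv_model = CrossValidatorModel.load(model_path)
    # transform() scores the data with the best model found during cross-validation.
    return cv_model.transform(features_df)

# For example:
# scoredDF = score_with_saved_model(featuresDF, "/mnt/trainedmodels/lr")
```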

Remove unnecessary columns from the scored data
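
A minimal sketch of this step. The column names below are assumptions: they are the default feature and output columns of Spark ML classifiers, and your pipeline may use different names or add other intermediate columns. The helper name drop_scoring_columns is hypothetical.

```python
def drop_scoring_columns(scored_df,
                         columns=("features", "rawPrediction", "probability")):
    """Return scored_df without the intermediate columns added during scoring."""
    # DataFrame.drop accepts several column names and ignores any that are absent.
    return scored_df.drop(*columns)

# For example:
# resultDF = drop_scoring_columns(scoredDF)
```

An alternative is to select only the columns you want to keep, which is safer when the set of intermediate columns is not known in advance.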
