Keeping Your Machine Learning Models on the Right Track: Getting Started with MLflow, Part 2


Credit: Alfons Morales (Unsplash)

TLDR; MLflow Model Registry allows you to keep track of different Machine Learning models and their versions, as well as tracking their changes, stages and artifacts. Companion Github Repo for this post

In our last post, we discussed the importance of tracking Machine Learning experiments, metrics and parameters. We also showed how easy it is to get started in these topics by leveraging the power of MLflow (for those who are not aware, MLflow is currently the de-facto standard platform for machine learning experiment and model management).

Keeping Your Machine Learning Models on the Right Track: Getting Started with MLflow, Part 1

In particular, Databricks makes it even easier to leverage MLflow, since it provides you with a completely managed version of the platform.

This means you don’t need to worry about the underlying infrastructure to run MLflow, and it is completely integrated with other Machine Learning features from Databricks Workspaces, such as Feature Store, AutoML and many others.

Coming back to our experiment and model management discussion, although we covered the experiment part in the last post, we still haven’t discussed how to manage the models that we obtain as part of running our experiments. This is where MLflow Model Registry comes in.

The Case for Model Registry

As the processes to create, manage and deploy machine learning models evolve, organizations need to have a central platform that allows different personas such as data scientists and machine learning engineers to collaborate, share code, artifacts and control the stages of machine learning models. Breaking this down in terms of functional requirements, we are talking about the following desired capabilities:

Now to the practical part. We will run some code to train a model and showcase MLflow Model Registry capabilities. Hereby we present two possible options for running the notebooks from this quickstarter: you can choose to run them on Jupyter Notebooks with a local MLflow instance, or in a Databricks workspace.

Jupyter Notebooks

If you want to run these examples using Jupyter Notebooks, please follow these steps:

export SYSTEM\_VERSION\_COMPAT=1 && \  
source .venv/bin/activate && \  
pip install --upgrade pip && \  
pip install wheel && \  
pip install -r requirements.txt && \  
pre-commit install

Databricks

Looking at the screenshot above, you might notice that on the first row of our table, in the models column, we have an icon which differs from the other rows. This is due to the fact that the model artifact for that specific run was registered as a model, and a new model version was created (version 1). If we click on its link, we get redirected to the following window.

In the window above, we have an overview of the model version that was registered. We can see that it has the tag prediction_works = true. We can also see that it is in Staging. Depending on which persona is accessing this data, it might be possible to manually change the stage (to promote the model to Production, for instance), or reverting it back to None.

Moreover, with Workspace Object Access Control Lists, you could limit the permissions for each type of user. Let’s say that you wish to block data scientists from transitioning model stages, while you want to allow team manager to do so. In such scenario, Data Scientists would have to request transitions to a given stage.

These transitions would then need to be approved by someone with the right permissions. Finally, all of the requests and model stage transitions are tracked in the same window (and of course, they are also available programatically).

Once a model has been transitioned to Production, it is quite simple to deploy it either as an automated job or as a Real time REST API Endpoint. But that is the topic for another post.

All the code used in this post is available on the Github repo below:

GitHub - rafaelvp-db/mlflow-getting-started: A simple repo to get started with MLflow

References