Machine Learning Model Management with MLflow
Data is the new oil, and its volume is growing exponentially. Most companies use data science extensively to inform business decisions, audit ML patterns, diagnose faults in business logic, and more. They run large numbers of machine learning models to produce results.
Managing ML models in production is non-trivial. The training, maintenance, deployment, monitoring, organization and documentation of machine learning (ML) models (in short, model management) is a critical task in virtually all production ML use cases. Poor model management decisions can degrade the performance of an ML system, raise maintenance costs, and lead to less effective utilization. Below are the key concerns for model management:
- Computational challenges: machine learning model definition and validation, decisions on model retraining, adversarial settings.
- Data management challenges: lack of a declarative abstraction for the whole ML pipeline, querying model metadata, model interpretation.
- Engineering challenges: multiple tools and frameworks make integration complex, heterogeneous skill levels among users, backwards compatibility of trained models, and training results that are hard to reproduce.
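Two of the concerns above, querying model metadata and reproducing a training result, can be illustrated with a minimal sketch. It uses only the Python standard library. `RunRecord`, `RunStore`, `log_run` and `best_run` are hypothetical names invented for this illustration and are not part of MLflow or any other platform; the parameters and metric values are made-up sample inputs.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    """Hypothetical record of one training run (not an MLflow class)."""
    run_id: str
    params: dict
    metrics: dict
    data_fingerprint: str


@dataclass
class RunStore:
    """Hypothetical in-memory store; a real system would persist this."""
    runs: list = field(default_factory=list)

    def log_run(self, params, metrics, training_data: bytes) -> RunRecord:
        # Hash the training data so a later run can check it used the same input.
        fingerprint = hashlib.sha256(training_data).hexdigest()
        # Derive a deterministic run id from the params and the data fingerprint.
        payload = json.dumps(params, sort_keys=True) + fingerprint
        run_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
        record = RunRecord(run_id, dict(params), dict(metrics), fingerprint)
        self.runs.append(record)
        return record

    def best_run(self, metric: str) -> RunRecord:
        # Query the metadata: the run with the highest value of the given metric.
        candidates = [r for r in self.runs if metric in r.metrics]
        return max(candidates, key=lambda r: r.metrics[metric])


# Sample values below are invented for the illustration.
store = RunStore()
data = b"feature,label\n1.0,0\n2.0,1\n"
store.log_run({"model": "logreg", "C": 1.0}, {"accuracy": 0.81}, data)
store.log_run({"model": "logreg", "C": 0.1}, {"accuracy": 0.86}, data)
best = store.best_run("accuracy")
print(best.params, best.metrics)
```

Recording the parameters and a fingerprint of the training data alongside each result is what makes a run comparable and repeatable later; a model management platform does this bookkeeping consistently across tools and teams.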
There are custom ML platforms that address the above concerns, such as FBLearner by Facebook and Michelangelo by Uber, but they have their own limitations:
- They standardize the data preparation, training and deployment loop in ways specific to a particular platform and its business needs.
- They are limited to a few algorithms and frameworks.
- They are tied to one company's infrastructure and are hard to open source.
The Databricks team took the above concerns as motivation to develop MLflow, an open source, cloud-agnostic machine learning model management platform. Benefits of MLflow for machine learning model management:
- Works with any ML library and language.
- It is platform independent: ML models run the same way anywhere, for example on a local system or on any cloud platform.
- Designed to be useful for an organisation of 1 or 10,000 people.
Outline/Structure of the Talk
Key focus areas for Machine Learning Model Management with MLflow:
- Managing ML models in production is non-trivial. What are the challenges and concerns across the machine learning model management lifecycle?
- What is machine learning model management?
- Motivation and concepts behind the introduction of MLflow
- How to solve the problem of model management using MLflow?
- MLflow components
- Realtime problem and use case
Attendees will learn the following key concepts and technologies:
- What is model management?
- Complexities of managing machine learning models in a production environment.
- Existing solutions for model management.
- Limitations of existing solutions.
- How to solve it using MLflow?
- Basic understanding of MLflow.
- Insight on realtime problem and use-case
Data Scientist and Machine Learning Engineer
Prerequisites for Attendees
- Basic understanding of machine learning and its workflow
- Basic understanding of model management