Machine Learning Model Management with MLflow

Background

Data is the new oil, and its volume is growing exponentially. Most companies now rely heavily on data science to inform business decisions, audit ML patterns, find faults in business logic, and more. They run large numbers of machine learning models to produce these results.

Problem Statement

Managing ML models in production is non-trivial. The training, maintenance, deployment, monitoring, organization and documentation of machine learning (ML) models – in short, model management – is a critical task in virtually all production ML use cases. Poor model management decisions can degrade the performance of an ML system, raise maintenance costs and lead to less effective utilization. Below are the key concerns for model management:

  1. Computational challenges: machine learning model definition and validation, decisions on model retraining, adversarial settings.
  2. Data management challenges: lack of a declarative abstraction for the whole ML pipeline, querying model metadata, model interpretation.
  3. Engineering challenges: complex integration across multiple tools and frameworks, heterogeneous skill levels among users, backwards compatibility of trained models, and difficulty reproducing training results.
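
To make "querying model metadata" and "reproducing training results" concrete, here is a minimal sketch of the bookkeeping that model management implies: each training run is stored with its parameters and metrics so the best run can be looked up later. This is an illustration only and is not MLflow's API; `RunRecord`, `RunRegistry`, the model names and the metric values are all hypothetical.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class RunRecord:
    """One training run: which model, which settings, which scores."""
    model_name: str
    params: dict
    metrics: dict
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class RunRegistry:
    """Hypothetical in-memory store of run records (illustration only)."""

    def __init__(self):
        self._runs = []

    def log_run(self, model_name, params, metrics):
        # Copy the dicts so later changes by the caller do not alter history.
        record = RunRecord(model_name, dict(params), dict(metrics))
        self._runs.append(record)
        return record.run_id

    def best_run(self, model_name, metric, higher_is_better=True):
        # Only consider runs of this model that actually recorded the metric.
        candidates = [
            r for r in self._runs
            if r.model_name == model_name and metric in r.metrics
        ]
        if not candidates:
            return None
        chooser = max if higher_is_better else min
        return chooser(candidates, key=lambda r: r.metrics[metric])

    def to_json(self):
        # Serialise every run so the history can be stored or shared.
        return json.dumps([asdict(r) for r in self._runs], indent=2)


# Made-up runs and scores, purely for illustration.
registry = RunRegistry()
registry.log_run("churn", {"max_depth": 3}, {"auc": 0.81})
registry.log_run("churn", {"max_depth": 6}, {"auc": 0.86})
registry.log_run("fraud", {"max_depth": 4}, {"auc": 0.92})

best = registry.best_run("churn", "auc")
print(best.params)
```

Keeping the parameters next to the metrics is what lets a later reader retrain with the same settings, which is the reproducibility concern listed above.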

Existing Solution

Custom ML platforms such as FBLearner by Facebook and Michelangelo by Uber address the above concerns, but they have their own limitations:

  1. They standardize the data preparation, training and deployment loop around one particular platform and its business needs.
  2. They are limited to a few algorithms and frameworks.
  3. They are tied to one company's infrastructure and are hard to open source.

Why MLflow?

These concerns motivated the Databricks team to develop MLflow as an open source, cloud-agnostic machine learning model management platform. Benefits of MLflow for machine learning model management:

  1. Works with any ML library and language.
  2. It is platform independent, i.e. ML models run the same way anywhere, for example on a local system or on any cloud platform.
  3. Designed to be useful for organisations of 1 or 10,000 people.
 
 

Outline/Structure of the Talk

Key focus areas for Machine Learning Model Management with MLflow:

  1. Managing ML models in production is non-trivial. What are the challenges and concerns of the machine learning model management lifecycle?
  2. What is machine learning model management?
  3. Motivation and concepts behind the introduction of MLflow
  4. How to solve the problem of model management using MLflow?
  5. MLflow components
  6. Realtime problem and use case

Learning Outcome

Attendees will learn the following key concepts and technologies:

  1. What is model management?
  2. Complexities of managing machine learning models in a production environment.
  3. Existing solutions for model management.
  4. Limitations of existing solutions.
  5. How to solve it using MLflow?
  6. Basic understanding of MLflow.
  7. Insight into a realtime problem and use case

Target Audience

Data Scientists and Machine Learning Engineers

Prerequisites for Attendees

  1. Basic understanding of machine learning and its workflow
  2. Basic understanding of model management
Submitted 1 year ago

  • Akash Tandon

    Akash Tandon - Traversing the graph computing and database ecosystem

    Data Engineer
    SocialCops
    1 year ago
    45 Mins
    Talk
    Intermediate

    Graphs have long held a special place in computer science’s history (and codebases). We're seeing the advent of a new wave of the information age; an age that is characterized by great emphasis on linked data. Hence, graph computing and databases have risen to prominence rapidly over the last few years. Be it enterprise knowledge graphs, fraud detection or graph-based social media analytics, there are a great number of potential applications.

    To reap the benefits of graph databases and computing, one needs to understand the basics as well as the current technical landscape and offerings. Equally important is understanding whether a graph-based approach suits your problem.
    These realizations are a result of my involvement in an effort to build an enterprise knowledge graph platform. I also believe that graph computing is more than a niche technology and has potential for organizations of varying scale.
    Now, I want to share my learning with you.

    This talk will touch upon the above points with the general premise being that data structured as graph(s) can lead to improved data workflows.
    During our journey, you will learn the fundamentals of graph technology and witness a live demo using Neo4j, a popular property graph database. We will walk through a day in the life of data workers (engineers, scientists, analysts), the challenges they face and how graph-based approaches result in elegant solutions.
    We'll end our journey with a peek into the current graph ecosystem and high-level concepts that need to be kept in mind while adopting an offering.