With the rapid adoption of AI applications, there is a growing need to integrate the creation, updating, and maintenance of ML models into a standard Continuous Integration (CI) and Continuous Deployment (CD) pipeline. In software development, a CI/CD pipeline provides control over releasing the right version to the right environment, the ability to roll back in case of an error, and the ability to manage the overall process. CI/CD for software development is well defined and mature, but in AI/ML projects, data scientists and engineers often struggle to apply these best practices, which could make their work more effective.

In this talk, our goal is to share the CI/CD best practices that helped us deliver AI/ML projects successfully to top enterprise customers across various domains. This approach is soon to be published as one of the AI reference architectures for Azure. By the end of the session, data scientists and engineers will have a good understanding of the DevOps process for AI/ML projects and a head start on applying it.

Using our published code on GitHub, we will demonstrate how to automate the end-to-end flow of an ML/AI solution, covering data sanity tests, unit tests, scalable model training, model version management, model evaluation and selection, model deployment as a scalable real-time web service, staged deployment to QA and prod, and integration testing.
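
As an illustration of the data sanity test step, here is a minimal sketch in plain Python. It is not taken from the speakers' GitHub repository; the schema, the column names (`age`, `income`), and the `check_rows` helper are all hypothetical. It only shows the general idea of failing a build early when incoming training data violates basic expectations.

```python
from typing import Dict, List, Tuple

# Hypothetical schema for illustration: column -> (expected type, min, max).
SCHEMA: Dict[str, Tuple[type, float, float]] = {
    "age": (int, 0, 120),
    "income": (float, 0.0, 1e7),
}


def check_rows(rows: List[dict],
               schema: Dict[str, Tuple[type, float, float]] = SCHEMA) -> List[str]:
    """Return a list of readable problems; an empty list means the data passed."""
    problems = []
    for i, row in enumerate(rows):
        for col, (typ, lo, hi) in schema.items():
            value = row.get(col)
            if value is None:
                problems.append(f"row {i}: missing {col}")
            elif not isinstance(value, typ):
                problems.append(f"row {i}: {col} has type {type(value).__name__}")
            elif not lo <= value <= hi:
                problems.append(f"row {i}: {col}={value} outside [{lo}, {hi}]")
    return problems


if __name__ == "__main__":
    sample = [{"age": 30, "income": 50000.0}, {"age": -1, "income": 1200.0}]
    for problem in check_rows(sample):
        print(problem)
```

In a CI build, a script like this would typically exit with a non-zero status when the list of problems is non-empty, so that the pipeline stops before any training compute is used.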

Outline/Structure of the Talk

  • Overview of DevOps and Key concepts (~ 3 mins)
  • What is it? Who is it designed for? (~ 3 mins)
  • Why DevOps for AI and challenges in an AI project (~ 3 mins)
  • Why should data scientists care about DevOps? (~ 3 mins)
  • Comparison of a traditional system with an ML system, and why there is a need for specialized DevOps geared towards AI (~ 3 mins)
  • Overview of the DevOps for AI reference architecture, along with our GitHub repository (~ 20 mins)
    • How the reference architecture builds an AI project using Azure DevOps Pipelines, including unit tests, data tests, and model evaluation tests.
    • How to call Azure ML pipelines from an Azure DevOps pipeline and retrain asynchronously on an AML compute cluster.
    • How to do conditional release across QA and Prod environments.
  • Walkthrough of DevOps for AI in customer scenarios, and learnings (~ 8 mins)
  • Links to other collateral and references (~ 2 mins)
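
The conditional release across QA and Prod mentioned in the outline can be pictured with a small sketch. This is plain Python written for illustration, not Azure DevOps configuration and not the speakers' implementation; `staged_release`, `deploy`, and `integration_test` are hypothetical names. The assumption is that each stage is deployed in order and that a failed integration test stops promotion to later stages.

```python
from typing import Callable, List, Sequence


def staged_release(deploy: Callable[[str], None],
                   integration_test: Callable[[str], bool],
                   stages: Sequence[str] = ("qa", "prod")) -> List[str]:
    """Deploy stage by stage; stop at the first stage whose tests fail.

    Returns the stages that were deployed and passed their tests.
    """
    released = []
    for stage in stages:
        deploy(stage)                    # hypothetical deployment hook
        if not integration_test(stage):  # hypothetical test hook
            break                        # later stages are never deployed
        released.append(stage)
    return released


if __name__ == "__main__":
    deployed = []
    # QA tests fail here, so prod is never touched.
    result = staged_release(deployed.append, lambda stage: stage != "qa")
    print("deployed:", deployed, "released:", result)
```

Passing the deployment and test steps in as callables keeps the gating logic independent of any particular cloud or build tool.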

Learning Outcome

By attending the tutorial, the audience will:

  1. Understand the different challenges when working on an AI project
  2. Learn the scope of DevOps for AI and how we can benefit from it
  3. Learn about the Build and Release CI/CD pipeline in an AI/ML project
  4. Understand how to perform model training, model evaluation, model selection, and conditional deployment using the new Azure ML Python SDK
  5. Know how to automate the retraining and operationalization of ML models and their deployment across different compute targets, including Azure Kubernetes Service (AKS), Azure Container Instances (ACI), IoT Edge devices, etc.
  6. Get access to the source code of the examples showing how to implement DevOps for an AI solution on Azure using Azure DevOps and the AML SDK
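
To make the model evaluation and selection outcome concrete, here is a minimal sketch of a promotion gate. It deliberately avoids any Azure ML SDK calls; `should_promote` and `min_gain` are hypothetical names, and the sketch assumes a single metric where higher is better (for example, accuracy on a held-out set).

```python
from typing import Optional


def should_promote(candidate_score: float,
                   production_score: Optional[float],
                   min_gain: float = 0.0) -> bool:
    """Decide whether a newly trained model should replace the current one.

    Assumes higher scores are better. If there is no production model yet,
    the candidate is promoted unconditionally.
    """
    if production_score is None:
        return True
    return candidate_score > production_score + min_gain


if __name__ == "__main__":
    print(should_promote(0.91, 0.90))                 # candidate is better
    print(should_promote(0.905, 0.90, min_gain=0.01)) # gain too small
```

In a pipeline, a negative result would usually skip the registration and deployment steps rather than fail the build, since training a model that does not beat the current one is not an error.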

Target Audience

Data Scientist, ML Engineer, AI Engineer, DevOps Engineer

Prerequisites for Attendees

Basic data science knowledge

Basic software engineering knowledge

Submitted 7 months ago

Public Feedback

  • Dr. Vikas Agrawal  ~  6 months ago

    Dear Praneet and Richin: Thanks for the proposal! Could you please share with us how much of the talk is planned to be Azure-specific, and how much would show open-source or general-purpose tools and techniques that the ODSC audience can go back and use in their workplace even if they do not have the luxury of using Azure?

    Warm Regards


    • Richin Jain  ~  6 months ago

      Hi Vikas,

      The value prop of our talk is mainly in the approach we highlight and the architecture. We used the Azure ML SDK and Azure DevOps to show a concrete end-to-end example of our approach. However, the concepts can be taken as is and applied to other machine learning platforms such as Amazon SageMaker or DataRobot, using orchestration tools like MLflow or Kubeflow, along with build tools like Jenkins, Travis CI, etc.

      Please let us know if you have any follow up questions.

      Thanks!

  • Deepti Tomar  ~  6 months ago

    Hello Praneet & Richin,

    Thanks for your submission. This is an important topic.

    Could you please update the Outline/Structure with a time break-up for the subsections?

    Would you be sharing a specific use case from your work to explain the subsections and achieve the learning outcome? Sharing that would be really helpful to the attendees.

    Also, please share link(s) to videos of your past presentations/conferences or a short trailer video of your session.



    • Praneet Solanki  ~  6 months ago

      Hi Deepti,

      Thanks for the constructive feedback. We have updated the proposal with the following info:

      • Added time break-up to the Outline/Structure
      • Added a link to the updated slide deck (a subset of the final deck)
      • Added a link to a short trailer video for this session
      • Added a section in the talk where we go through a customer ML/AI DevOps scenario, its implementation details, and our learnings

      Please share if you have more feedback for us.


      Praneet Solanki, SWE II Microsoft