Building End-to-End Deep Reinforcement Learning based RecSys with TensorFlow and Kubeflow

Recommendation systems (RecSys) are the core engine of personalized experiences on e-commerce and online media websites. Most companies use RecSys to increase user interaction, enrich shopping potential, and generate upsell and cross-sell opportunities. Amazon uses recommendations as a targeted marketing tool throughout its website, and they contribute 35% of its total revenue [1]. Netflix users watch ~75% of the recommended content and artwork [2]. Spotify employs a recommendation system to update personal playlists every week so that users won’t miss newly released music by artists they like. This helped Spotify increase its number of monthly users from 75 million to 100 million [3]. YouTube's personalized recommendations help users find relevant videos quickly and easily, and account for around 60% of video clicks from the homepage [4].

In general, RecSys generates recommendations based on user browsing history and preferences, past purchases, and item metadata. Most existing recommendation systems follow one of three paradigms: collaborative filtering (CF) and its variants, content-based recommendation engines, and hybrid recommendation engines that combine content-based and CF approaches or exploit more information about users in content-based recommendation. However, they struggle with rapidly changing user data and preferences, static recommendations, grey sheep (users whose tastes match no clear group), cold start, and malicious users.

Classical RecSys algorithms such as content-based recommendation perform well on item-to-item similarity, but they tend to recommend only items from categories the user has already viewed and may recommend nothing from other categories. Collaborative filtering addresses this problem by exploiting users' behavior and preferences over items when recommending items to other users. However, collaborative filtering has drawbacks of its own, such as cold start, popularity bias, and sparsity. Classical recommendation models also treat recommendation as a static process. Reinforcement learning (RL) can address static recommendation on rapidly changing user data: an RL-based RecSys captures the user’s temporal intentions and responds promptly. However, as the matrix of user actions and items grows, it becomes difficult to provide recommendations using RL alone. Deep RL based solutions such as actor-critic and deep Q-networks are designed to overcome the aforementioned drawbacks.
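The category lock-in of content-based recommendation can be illustrated with a minimal sketch. The catalogue, item names, and feature vectors below are hypothetical toy data, not part of the workshop material; the sketch ranks items by cosine similarity to the item a user just viewed.

```python
import math

# Hypothetical toy catalogue: item -> feature vector (action, comedy, romance).
ITEM_FEATURES = {
    "action_movie_1": [1.0, 0.0, 0.0],
    "action_movie_2": [0.9, 0.1, 0.0],
    "comedy_movie_1": [0.0, 1.0, 0.0],
    "romance_movie_1": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend_similar(viewed_item, k=2):
    """Return the k items whose features are closest to the viewed item."""
    query = ITEM_FEATURES[viewed_item]
    scored = [
        (cosine(query, features), name)
        for name, features in ITEM_FEATURES.items()
        if name != viewed_item
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]
```

A user who has only viewed `action_movie_1` gets the other action title ranked first, while the comedy and romance titles score zero similarity. That is the behavior the paragraph above describes as recommending within one category only.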

However, there are two major challenges when deep RL is applied to RecSys: (a) the large and dynamic action (item) space, and (b) the computational cost of selecting an optimal recommendation. The conventional deep Q-learning architecture takes only the state as input and outputs Q-values for all actions. This architecture suits scenarios with a large state space and a small, fixed action space, but not scenarios with a dynamic action space such as recommender systems. The actor-critic architecture is preferred because it suits large and dynamic action spaces and also reduces redundant computation compared to alternative architectures. It can provide recommendations that account for dynamic changes in user preference, incorporate users' return patterns, and increase diversity on a large dataset, which makes it one of the most effective recommendation models.
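The architectural difference can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the workshop's TensorFlow model: the single-layer "networks", the random weights, `EMBED_DIM`, and the inner-product ranking of candidate items are all hypothetical choices made for brevity. The point it shows is that the actor outputs a vector in item-embedding space, so the candidate set can grow or shrink without changing the network's output shape, while the critic scores one (state, action) pair at a time.

```python
import math
import random

EMBED_DIM = 4  # assumed size of the item-embedding space


def make_matrix(rows, cols, rng):
    """Random weight matrix standing in for a trained network layer."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class Actor:
    """Maps a state vector to a proto-action vector in item-embedding space."""

    def __init__(self, state_dim, rng):
        self.weights = make_matrix(EMBED_DIM, state_dim, rng)

    def act(self, state):
        return [math.tanh(x) for x in matvec(self.weights, state)]


class Critic:
    """Scores one (state, action) pair with a single scalar Q-value."""

    def __init__(self, state_dim, rng):
        self.weights = [rng.uniform(-0.5, 0.5) for _ in range(state_dim + EMBED_DIM)]

    def q_value(self, state, action):
        return sum(w * x for w, x in zip(self.weights, state + action))


def recommend(actor, state, candidates, k):
    """Rank any candidate set by inner product with the actor's output."""
    proto_action = actor.act(state)
    scores = {
        item: sum(p * e for p, e in zip(proto_action, embedding))
        for item, embedding in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

A deep Q-network with one output per item would need a new output layer whenever the catalogue changes. Here, adding an item only adds one entry to `candidates`, and the critic evaluates the chosen action with a single forward pass instead of one Q-value per item.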

Model building is just one component of end-to-end machine learning. We will also take a holistic view of productionizing RecSys models using Kubeflow. A healthy learning pipeline includes components such as data ingestion, transformation, feature engineering, validation, hyperparameter tuning, A/B testing, and deployment. Typical challenges when creating such a system are low latency, high performance, real-time processing, scalability, model management, and governance. Kubeflow on the Kubernetes engine plays a crucial role in stitching together training, serving, monitoring, and logging components. Kubeflow Pipelines (a core component of Kubeflow) makes the implementation of training pipelines simple and concise, without having to manage the low-level details of a cluster. The Kubernetes cluster, in turn, takes care of system-level challenges such as scalability and latency by providing features such as auto-scaling, load balancing, and more. By the end of this workshop, you will have built an effective learning system that leverages Kubeflow on the Kubernetes engine to deploy a high-performing, scalable recommendation system using deep reinforcement learning.
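The dependency structure of such a pipeline can be sketched with the Python standard library alone. This is deliberately not the Kubeflow Pipelines SDK: the stage names and the strictly linear ordering (including placing A/B testing before deployment, as in the list above) are assumptions made for illustration, and a real pipeline would define each stage as a containerized component.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stages; each maps to the set of stages it depends on.
PIPELINE = {
    "data_ingestion": set(),
    "transformation": {"data_ingestion"},
    "feature_engineering": {"transformation"},
    "validation": {"feature_engineering"},
    "hyperparameter_tuning": {"validation"},
    "ab_testing": {"hyperparameter_tuning"},
    "deployment": {"ab_testing"},
}


def execution_order(pipeline):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())
```

Calling `execution_order(PIPELINE)` yields the stages from ingestion through deployment. An orchestrator applies the same idea at scale, running each stage only after the stages it depends on have finished.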


  1. "":
  2. "":
  3. "":
  4. "":
  5. "Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modelling":
  6. "Deep Reinforcement Learning for Page-wise Recommendations":
  7. "Deep Reinforcement Learning for List-wise Recommendations":

Outline/Structure of the Workshop

  1. Introduction to recommendation systems.
  2. Classical approaches and algorithms for building a recommendation system.
  3. Limitations of classical approaches.
  4. Introduction to reinforcement learning.
  5. Deep reinforcement learning based algorithms for building a recommendation system.
  6. Benefits over classical approaches.
  7. Hands-on experience of building a deep RL based RecSys using TensorFlow.
  8. Hands-on experience of building and deploying production-grade machine learning workflows using Kubeflow.

Learning Outcome

  1. Gain an understanding of deep reinforcement learning based recommendation systems.
  2. Hands-on experience of building a deep RL based RecSys using TensorFlow.
  3. Hands-on experience of building and deploying production-grade machine learning workflows using Kubeflow.
  4. Reference architecture of recommendation engines.

Target Audience

Data Scientists, Machine Learning Engineers, Data Engineers, Data Architects

Prerequisites for Attendees

  1. A basic understanding of machine learning and programming.
  2. Exposure to a cloud platform.