Deep Reinforcement Learning Based RecSys Using Distributed Q Table
Recommendation systems (RecSys) are the core engine behind personalized experiences on eCommerce and online media websites. Most companies use RecSys to increase user interaction, enrich shopping potential, and generate upsell and cross-sell opportunities. Amazon uses recommendations as a targeted marketing tool throughout its website, and they contribute 35% of its total revenue [1]. On Netflix, about 75% of what users watch comes from recommendations [2]. Spotify employs a recommendation system to update personal playlists every week so that users won’t miss newly released music by artists they like. This has helped Spotify grow its monthly users from 75 million to 100 million [3]. YouTube's personalized recommendations, which account for around 60% of video clicks from the homepage, help users find relevant videos quickly and easily [4].
In general, a RecSys generates recommendations from a user's browsing history and preferences, past purchases, and item metadata. Most existing recommendation systems follow one of three paradigms: collaborative filtering (CF) and its variants, content-based recommendation engines, and hybrid recommendation engines that combine content-based methods with CF or exploit more information about users in content-based recommendation. However, they struggle with rapidly changing user data and preferences, static recommendations, grey sheep users, cold start, and malicious users.
Classical RecSys algorithms such as content-based recommendation perform well on item-to-item similarity, but they recommend only items related to one category and may not recommend anything from other categories, because the user has never viewed those items. Collaborative filtering addresses this problem by exploiting users' behavior and preferences over items when recommending items to new users. However, collaborative filtering has drawbacks of its own, such as cold start, popularity bias, and sparsity. Classical recommendation models also treat recommendation as a static process. RL can address static recommendation on rapidly changing user data: an RL-based RecSys captures the user’s temporal intentions and responds promptly. However, as the matrix of user actions and items grows, it becomes difficult to provide recommendations using RL. Deep RL based solutions such as actor-critic methods and deep Q-networks overcome the aforementioned drawbacks.
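The category lock-in of content-based recommendation described above can be seen in a small sketch. The catalogue, the three hand-made features, and the `recommend_similar` helper are hypothetical and exist only for illustration; a real system would derive features from item metadata.

```python
import math

# Hypothetical catalogue: each item is described by three made-up features
# (is_electronics, is_book, price_band).
ITEMS = {
    "phone":      [1.0, 0.0, 0.9],
    "laptop":     [1.0, 0.0, 1.0],
    "headphones": [1.0, 0.0, 0.3],
    "novel":      [0.0, 1.0, 0.1],
    "cookbook":   [0.0, 1.0, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend_similar(viewed, k=2):
    """Return the k items most similar to the item the user just viewed."""
    scores = {
        name: cosine(ITEMS[viewed], vec)
        for name, vec in ITEMS.items()
        if name != viewed
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


print(recommend_similar("phone"))
```

A user who has only viewed the phone is offered other electronics, and the books never surface because nothing in the viewing history points toward them.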
Present systems suffer from two limitations. First, they treat recommendation as a static procedure and ignore the dynamic, interactive nature of the relationship between users and the recommender system. Second, most work focuses on the immediate feedback on recommended items and neglects the long-term rewards that reinforcement learning can capture. We propose a recommendation system based on Q-learning, a powerful reinforcement learning method, combined with an ε-greedy policy. This combination handles both issues and gives the customer more chances to explore new pages or products that are not yet popular. When reinforcement learning (RL) is applied to real-world problems, both the state space and the action space are usually very large. To address this challenge, we propose a multiple/distributed Q-table approach that can deal with a large state-action space and helps make Q-learning practical for recommendation.
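The ideas above can be sketched in a few lines, assuming tabular Q-learning with an ε-greedy policy and a Q-table split into shards by hashing the state. The sharding rule, the names (`DistributedQTable`, `choose_action`, `q_update`, `train_demo`), the hyperparameters, and the toy simulated user are all illustrative assumptions, not the method from the paper.

```python
import random
from collections import defaultdict


class DistributedQTable:
    """Q-values spread over several smaller tables instead of one large one."""

    def __init__(self, n_shards, n_actions):
        self.n_actions = n_actions
        # Each shard maps state -> list of Q-values, one per action.
        self.shards = [
            defaultdict(lambda: [0.0] * n_actions) for _ in range(n_shards)
        ]

    def row(self, state):
        # Assumed routing rule: the state's hash selects the shard.
        return self.shards[hash(state) % len(self.shards)][state]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(q.n_actions)
    values = q.row(state)
    return max(range(q.n_actions), key=values.__getitem__)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.5):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    target = reward + gamma * max(q.row(next_state))
    values = q.row(state)
    values[action] += alpha * (target - values[action])


def train_demo(n_items=6, n_shards=3, steps=20000, epsilon=0.2, seed=0):
    """Toy simulation: the state is the last clicked item, the action is the
    item to recommend, and the simulated user clicks only item (state + 1)."""
    rng = random.Random(seed)
    q = DistributedQTable(n_shards, n_items)
    state = 0
    for _ in range(steps):
        action = choose_action(q, state, epsilon, rng)
        clicked = action == (state + 1) % n_items
        next_state = action if clicked else state
        q_update(q, state, action, 1.0 if clicked else 0.0, next_state)
        state = next_state
    return q


q = train_demo()
# Greedy recommendation learned for each state (last clicked item).
print({s: max(range(q.n_actions), key=q.row(s).__getitem__) for s in range(6)})
```

Hashing keeps each shard small and lets shards live on different workers in principle; a production system would need a routing rule that is stable across processes, which Python's built-in `hash` is not for strings.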
References:
- [1] https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers
- [2] https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429
- [3] https://www.bloomberg.com/news/articles/2016-09-21/spotify-is-perfecting-the-art-of-the-playlist
- [4] https://dl.acm.org/citation.cfm?id=1864770
- [5] "Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modelling": https://arxiv.org/pdf/1810.12027.pdf
- [6] "Deep Reinforcement Learning for Page-wise Recommendations": https://arxiv.org/pdf/1805.02343.pdf
- [7] "Deep Reinforcement Learning for List-wise Recommendations": https://arxiv.org/pdf/1801.00209.pdf
- [8] "Deep Reinforcement Learning Based RecSys Using Distributed Q Table": http://www.ieomsociety.org/ieom2020/papers/274.pdf
Outline/Structure of the Talk
- Introduction to recommendation systems. [2 mins]
- Classical approaches and algorithms for building a recommendation system. [3 mins]
- Limitations of classical approaches. [2 mins]
- Introduction to reinforcement learning. [2 mins]
- Deep reinforcement learning-based algorithm for building a recommendation system. [5 mins]
- Training methodology, use case, and discussion of results. [5 mins]
- Closure / Q&A [1 min]
Learning Outcome
- Gain an understanding of deep reinforcement learning-based recommendation systems.
- Learn how to train and evaluate an RL model with a distributed Q-table.
- See a reference architecture for recommendation engines.
Target Audience
Data Scientists, Machine Learning Engineers, Data Engineers, Data Architects
Prerequisites for Attendees
This is a somewhat advanced talk, and because it is a 20-minute session, attendees will need:
- A basic understanding of machine learning and programming.
- A basic understanding of reinforcement learning and recommendation systems.
Links
- We presented the research paper "Deep Reinforcement Learning Based RecSys Using Distributed Q Table" at the IEOM international conference in Dubai on 12 March 2020. The paper has been submitted for Scopus indexing and will be publicly available soon.
- We conducted a similar session, "The Hitchhiker's Guide to Deep Learning-Based Recommenders in Production": https://learning.oreilly.com/videos/strata-data-conference/9781492050520/9781492050520-video324880, at Strata Data Conference in San Francisco in 2019. In this tutorial, we will use deep reinforcement learning and also cover some of the latest Kubeflow features, such as notebook servers, distributed training, the artifact store, Fairing, hyperparameter tuning with Katib, and Kubeflow Pipelines.
- We presented a session on "Industrialised Capsule Network for Text Analytics" at ODSC India 2019.
- We presented a session on "Machine Learning Model Management" at Fifth Elephant India 2019. [Reference Link: https://hasgeek.com/fifthelephant/2019/proposals/bof-ml-model-management-k7kzLYtoc74YAtHqxbxWh8]
Submitted 3 years ago
People who liked this proposal, also liked:
- Sharmistha Chatterjee - Machine Learning and Data Governance in Telecom Industry
20 Mins
Demonstration
Intermediate
The key to solving ML problems in the telecom industry lies in continuous data collection and evaluation across different categories of customers and networks, so as to track and dive into varying performance metrics. The KPIs form the basis of network monitoring, helping network/telecom operators to automatically add and scale network resources. Such smart automated systems are built with the objective of increasing customer engagement through enhanced customer experience and of tracking anomalies in customer behavior with timely detection and correction. Further, the system is designed to scale and serve current LTE, 4G, and upcoming 5G networks with minimal non-effective cell site visits and quick Root Cause Analysis (RCA).
Network congestion has remained an ever-increasing problem. Operators have attempted a variety of strategies to match network demand with the capacity of existing infrastructure, as deploying additional network capacity is expensive. To keep costs under control, operators apply control measures that attempt to allocate bandwidth fairly among users and throttle the bandwidth of users who consume excessive bandwidth. This approach has had limited success. Alternatively, techniques that use extra bandwidth for quality of experience (QoE) by over-provisioning the network have proved ineffective and inefficient due to the lack of proper estimation.
The evolution of 5G networks would lead manufacturers and telecom operators to use high data-transfer rates, wide network coverage, and low latency to build smart factories using automation, artificial intelligence, and the Internet of Things (IoT). The application of advanced data science and AI can provide better predictive insights to improve network capacity-planning accuracy. Better network provisioning would yield better network utilization for both next-generation networks based on 5G technology and current LTE and 4G networks. Further, AI models can be designed to link application throughput with network performance, prompting users to plan their daily usage based on their current location and total monthly budget.
In this talk, we will understand the current challenges in the telecom industry, the need for an AIOps platform, and the mission held by telecom operators and communication service providers across the world for designing such AI frameworks, platforms, and best practices. We will see how increasing operator collaborations are helping to create, deploy, and productionize AI platforms for different AI use cases. We will study one industrial use case (with code), based on real-time field research, to predict network capacity. In this respect, we will investigate how deep learning networks can be trained on large volumes of data at scale (millions of network cells), and how their use can help the upcoming 5G networks. We will also examine an end-to-end pipeline for hosting the scalable framework on Google Cloud. As the data volume is huge and the data needs to be stored in highly secure systems, we build our high-performing system with extra security features so that it can process millions of requests within a few milliseconds. While the session highlights the parameters and metrics involved in creating an LSTM-based neural network, it also discusses the challenges and some of the key aspects involved in designing and scaling the system.
- Gunjan Dewan - Developing a match-making algorithm between customers and Go-Jek products!
20 Mins
Talk
Beginner
20+ products. Millions of active customers. Insane amount of data and complex domain. Come join me in this talk to know the journey we at Gojek took to predict which of our products a user is most likely to use next.
A major problem we faced, as a company, was targeting our customers with promos and vouchers that were relevant to them. We developed a generalized model that takes into account the transaction history of users and gives a ranked list of our services that they are most likely to use next. From here on, we are able to determine the vouchers that we can target these customers with.
In this talk, I will discuss how we used recommendation engines to solve this problem, the challenges we faced along the way, and the impact it had on our conversion rates. I will also cover the different iterations we went through and how our problem statement evolved as we were solving the problem.
- Kriti Doneria - Trust Building in AI systems: A critical thinking perspective
20 Mins
Talk
Beginner
How do I know when to trust AI, and when not to?
Who goes to jail if a self-driving car kills someone tomorrow?
Did you know scientists say people will believe anything if it is repeated enough?
Designing AI systems is also an exercise in critical thinking, because an AI is only as good as its creator. This talk is for discussions like these, and more.
With the exponential increase in computing power available, several AI algorithms that were mere papers written decades ago have become implementable. For a data scientist, it is very tempting to use the most sophisticated algorithm available. But given that its applicability has moved beyond academia and out into the business world, are numbers alone sufficient? Putting context to AI, or XAI (explainable AI), takes the black box out of AI to enhance human-computer interaction. This talk shall revolve around the interpretability-complexity trade-off; the challenges, drivers, and caveats of the XAI paradigm; and an intuitive demo of translating the inner workings of an ML algorithm into human-understandable formats to achieve more business buy-in.
Prepare to be amused and enthralled at the same time.
- Dr. Sri Vallabha Deevi - How to train your dragon - Reinforcement learning from scratch
20 Mins
Talk
Beginner
Reinforcement learning helped Google's "AlphaGo" beat the world's best Go player. Have you wondered if you too can train a program to play a simple game?
Reinforcement learning is a simple yet powerful technique that is driving many applications, from recommender systems to autonomous vehicles. It is best suited to handle situations where the behavior of the system cannot be described in simple rules. For example, a trained reinforcement learning agent can understand the scene on the road and drive the car like a human. In supply chain management, RL agents can make decisions on inventory ordering.
In this talk, I will demonstrate how to train an RL agent to (a) cross a maze, (b) play a game of Tic-Tac-Toe against an intelligent opponent, and (c) act as a warehouse manager and learn inventory ordering, all with the help of plain Python code. As you participate in this talk, you will master the basics of reinforcement learning and acquire the skills to train your own dragon.
- Srikanth K S - Actionable Rules from Machine Learning Models
20 Mins
Demonstration
Intermediate
Beyond predictions, some ML models provide rules that identify actionable sub-populations in the support-confidence-lift paradigm. Besides making the models interpretable, rules make it easy for stakeholders to decide on a plan of action. We discuss rule-based models in production, rule-based ensembles, and anchors, using the R package tidyrules.