Developing a match-making algorithm between customers and Go-Jek products!

20+ products. Millions of active customers. An enormous amount of data and a complex domain. Join me in this talk to learn about the journey we at Gojek took to predict which of our products a user is most likely to use next.

A major problem we faced as a company was targeting our customers with promos and vouchers that were relevant to them. We developed a generalized model that takes into account the transaction history of users and gives a ranked list of our services that they are most likely to use next. From this ranking, we can determine which vouchers to target these customers with.

In this talk, I will cover how we used recommendation engines to solve this problem, the challenges we faced along the way, and the impact it had on our conversion rates. I will also cover the different iterations we went through and how our problem statement evolved as we were solving the problem.

 
 

Outline/Structure of the Talk

  1. Introduction to me and GoJek: 1 Min
  2. What is customer targeting?: 1 Min
  3. Defining the problem statement: 3 Min
  4. Iterations to solve the problem:
    1. Iteration 1: Classification: 2 Min
    2. Iteration 2: Recommendation Systems: 4 Min
      1. What is a recommendation system?
      2. How does it fit into our problem?
      3. Brief detail about different ways to build a recommendation engine
  5. Challenges Faced: 5 Min
    1. Choosing between algorithms: KNN vs Matrix Factorisation
    2. Choosing the optimisation technique for Matrix Factorisation: SVD vs ALS
    3. Dealing with the huge size of the utility matrix and reducing training time
    4. Dealing with implicit data: converting implicit data into explicit data
  6. Workflow: 1 Min
  7. Impact and Results: 1 Min
  8. Q&A: 2 Min
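The outline items on implicit data and KNN can be illustrated with a minimal sketch. It is not the model described in the talk: the transaction log, the user and service names, the 0-to-5 scaling rule, and the choice to rank only services a user has not used yet are all assumptions made for illustration. It counts transactions per user and service (the implicit signal), rescales each user's counts into pseudo-explicit ratings, and ranks services with a simple item-based cosine-similarity score.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical transaction log: one (user, service) pair per completed order.
transactions = [
    ("u1", "ride"), ("u1", "ride"), ("u1", "ride"), ("u1", "food"),
    ("u2", "ride"), ("u2", "food"), ("u2", "food"), ("u2", "pay"),
    ("u3", "food"), ("u3", "pay"), ("u3", "pay"),
    ("u4", "ride"), ("u4", "ride"),
]

def build_utility_matrix(log):
    """Count transactions per (user, service): the implicit signal."""
    counts = defaultdict(lambda: defaultdict(int))
    for user, service in log:
        counts[user][service] += 1
    return counts

def to_explicit(counts, scale=5.0):
    """Assumed rule: rate each service relative to the user's most-used one."""
    ratings = {}
    for user, row in counts.items():
        top = max(row.values())
        ratings[user] = {s: scale * c / top for s, c in row.items()}
    return ratings

def cosine(col_a, col_b):
    """Cosine similarity between two sparse columns keyed by user."""
    dot = sum(r * col_b[u] for u, r in col_a.items() if u in col_b)
    norm_a = sqrt(sum(r * r for r in col_a.values()))
    norm_b = sqrt(sum(r * r for r in col_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def item_similarity(ratings):
    """Pairwise similarity between services (item-based neighbourhood)."""
    columns = defaultdict(dict)
    for user, row in ratings.items():
        for service, r in row.items():
            columns[service][user] = r
    return {(a, b): cosine(columns[a], columns[b])
            for a in columns for b in columns if a != b}

def rank_services(user, ratings, sims):
    """Rank services the user has not used by similarity-weighted rating."""
    seen = ratings.get(user, {})
    all_services = {s for row in ratings.values() for s in row}
    scores = {cand: sum(sims.get((s, cand), 0.0) * r for s, r in seen.items())
              for cand in all_services - set(seen)}
    # Sort by descending score; break ties by name for a stable order.
    return sorted(scores, key=lambda s: (-scores[s], s))

ratings = to_explicit(build_utility_matrix(transactions))
sims = item_similarity(ratings)
print(rank_services("u4", ratings, sims))
```

At production scale this dense pairwise loop would not be practical, which is where the outline's points about matrix factorisation and reducing training time come in.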

Learning Outcome

1. Learn how to apply recommendation systems to customer targeting problems.

2. Experience the vastness of data we have at Go-Jek and how we deal with the scale.

Target Audience

Data Scientists, Product Managers

Prerequisites for Attendees

Basic knowledge about Data Science and Machine Learning algorithms is required.

Submitted 6 months ago

Public Feedback

  • Ashay Tamhane
    By Ashay Tamhane  ~  5 months ago

    Hi Gunjan, thanks for the proposal. One of the important learning outcomes you mention is "Experience the vastness of data we have at Go-Jek and how we deal with the scale". Could you kindly let us know which points in the outline cover this particular aspect? Thanks.

    • Gunjan Dewan
      By Gunjan Dewan  ~  5 months ago

      Hi Ashay,

      Thanks for your comment!

      This will be covered in the 'Challenges Faced' point from the outline.

      I'll be talking about how we had to choose PySpark over plain Python in order to deal with the size of the data, and how that helped in reducing the training time.

       

      Thanks,

      Gunjan

      • Ashay Tamhane
        By Ashay Tamhane  ~  5 months ago

        Thanks Gunjan for the clarification. Currently I see 2 mins allocated for this. I feel since most of the audience are practicing data scientists, the proposal will be stronger if you plan to spend more time on practical challenges faced and solutions. Also, it would be great if you can add some of the challenges you faced in the proposal outline. Thanks.

        • Gunjan Dewan
          By Gunjan Dewan  ~  5 months ago

          Hi Ashay,

          Apologies for the delay in response. The email for this comment got lost in my inbox somehow.

          I have updated the outline with some of the challenges I plan to talk about. I have also increased the time allocated to the challenges.

          Thanks

  • Santonu Goswami
    By Santonu Goswami  ~  5 months ago

    Hi Gunjan, 

    Thank you for the proposal. 

    Your proposal is well written on the background aspect. The outline/structure of the talk and the learning objectives are also clearly defined. 

    But the proposal does not provide much detail on the technical aspect such as what models you used etc. This should be updated in the proposal. 

    Thanks for uploading a very well recorded video of the talk. 

    Thanks, 

    Santonu

     

    • Gunjan Dewan
      By Gunjan Dewan  ~  5 months ago

      Hi Santonu,

      Thanks for your comment!

      I have updated the proposal with the technical aspect of my talk. 

      If you feel there is anything else missing in the proposal, please let me know. I'd be happy to update it further.

      Thanks

      Gunjan

  • Natasha Rodrigues
    By Natasha Rodrigues  ~  6 months ago

    Hi Gunjan,

    Thanks for your proposal! Requesting you to update the Outline/Structure section of your proposal with a time-wise breakup of how you plan to use 20 mins for the topics you've highlighted?

    Also, in order to ensure the completeness of your proposal, we suggest you go through the review process requirements

    Thanks,

    Natasha

    • Gunjan Dewan
      By Gunjan Dewan  ~  6 months ago

      Hi Natasha,

      Thanks for the comment!

      I have updated the proposal with the time-wise breakup in the Outline/Structure section. Please note that these are tentative times and they might vary a bit as I work more on the presentation material.

      I have also added a link to a blog post that I wrote on "GoJek Engineering Blog" about the same topic as this talk. 

      If you feel there is anything else missing in the proposal, please let me know. I'd be happy to update it further.

      Thanks,

      Gunjan

      • Natasha Rodrigues
        By Natasha Rodrigues  ~  6 months ago

        Hi Gunjan,

        Thanks a ton, will let you know if we need any more details.

        Regards,

        Natasha

  • Sujoy Roychowdhury
    By Sujoy Roychowdhury  ~  6 months ago

    Any details on the technical aspects of the solution which you would discuss?

    • Gunjan Dewan
      By Gunjan Dewan  ~  6 months ago

      Hi,

      I would be discussing recommendation systems. I will be going into details of how a recommendation engine is built and how we applied it to our problem.

      Hope this answers your question. 

      • Sujoy Roychowdhury
        By Sujoy Roychowdhury  ~  6 months ago

        What sort of algorithms would you be discussing within recommendation systems?

        • Gunjan Dewan
          By Gunjan Dewan  ~  6 months ago

          Hi,

          I'll be discussing KNN and Matrix Factorization in brief. 


  • Ravi Ranjan

    Ravi Ranjan - Deep Reinforcement Learning Based RecSys Using Distributed Q Table

    Ravi Ranjan
    Senior Data Scientist
    Publicis Sapient
    11 months ago
    20 Mins
    Talk
    Intermediate

    Recommendation systems (RecSys) are the core engine for any personalized experience on eCommerce and online media websites. Most of the companies leverage RecSys to increase user interaction, to enrich shopping potential and to generate upsell & cross-sell opportunities. Amazon uses recommendations as a targeted marketing tool throughout its website that contributes 35% of its total revenue generation [1]. Netflix users watch ~75% of the recommended content and artwork [2]. Spotify employs a recommendation system to update personal playlists every week so that users won’t miss newly released music by artists they like. This has helped Spotify to increase its number of monthly users from 75 million to 100 million at a time [3]. YouTube's personalized recommendation helps users to find relevant videos quickly and easily which account for around 60% of video clicks from the homepage [4].

    In general, RecSys generates recommendations based on user browsing history and preferences, past purchases and item metadata. Most existing recommendation systems are based on three paradigms: collaborative filtering (CF) and its variants, content-based recommendation engines, and hybrid recommendation engines that combine content-based and CF or exploit more information about users in content-based recommendation. However, they suffer from limitations like rapidly changing user data, user preferences, static recommendations, grey sheep, cold start and malicious users.

    Classical RecSys algorithms such as content-based recommendation perform well on item-to-item similarity, but they only recommend items related to one category and may not recommend anything in other categories because the user has never viewed those items. Collaborative filtering solves this problem by exploiting users' behavior and preferences over items when recommending items to new users. However, collaborative filtering suffers from a few drawbacks like cold start, popularity bias, and sparsity. The classical recommendation models treat recommendation as a static process. RL can address static recommendation on rapidly changing user data: an RL-based RecSys captures the user's temporal intentions and responds promptly. However, as the user action and item matrix grows, it becomes difficult to provide recommendations using RL. Deep RL based solutions like actor-critic and deep Q-networks overcome all the aforementioned drawbacks.

    Present systems suffer from two limitations. First, they treat recommendation as a static procedure and ignore the dynamic, interactive nature of the relationship between users and the recommender system. Second, most work focuses on the immediate feedback on recommended items and neglects the long-term rewards that reinforcement learning can capture. We propose a recommendation system that uses Q-learning, a powerful reinforcement learning method, combined with an ε-greedy policy. This handles those issues and gives the customer more chance to explore new pages or new products that are not so popular. When Reinforcement Learning (RL) is applied to real-world problems, both the state space and the action space are usually very large. To address this, we propose a multiple/distributed Q table approach that can deal with a large state-action space and helps make Q-learning practical for recommendation.
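As a rough illustration of Q-learning with an ε-greedy policy, here is a minimal single-table sketch. It is not the distributed Q table approach from the proposal: the three-item catalogue, the simulated user in `step`, and all hyperparameter values are assumptions invented for this example. The state is the item the user last clicked and the action is the item recommended next.

```python
import random

# Hypothetical toy catalogue: state = last clicked item, action = next recommendation.
ITEMS = ["phone", "case", "charger"]
# Assumed user behaviour (made up for this sketch): a recommendation is
# clicked only if it matches the entry below for the current state.
PREFERRED_NEXT = {"phone": "case", "case": "charger", "charger": "phone"}

def step(state, action):
    """Simulated user: reward 1 and move to the clicked item, otherwise stay put."""
    if action == PREFERRED_NEXT[state]:
        return 1.0, action
    return 0.0, state

def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ITEMS} for s in ITEMS}
    state = rng.choice(ITEMS)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ITEMS)                        # explore
        else:
            action = max(q[state], key=q[state].get)          # exploit
        reward, next_state = step(state, action)
        # Q-learning update: bootstrap from the best action in the next state.
        target = reward + gamma * max(q[next_state].values())
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q

def greedy_policy(q):
    """Best-known recommendation for each state."""
    return {s: max(actions, key=actions.get) for s, actions in q.items()}

print(greedy_policy(train()))
```

The exploration term is what lets less popular items get recommended occasionally. With a realistic catalogue the single dictionary `q` becomes too large, which is the problem the multiple-table idea in the abstract is aimed at.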

    References:

    1. https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers
    2. https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429
    3. https://www.bloomberg.com/news/articles/2016-09-21/spotify-is-perfecting-the-art-of-the-playlist
    4. https://dl.acm.org/citation.cfm?id=1864770
    5. "Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modelling": https://arxiv.org/pdf/1810.12027.pdf
    6. "Deep Reinforcement Learning for Page-wise Recommendations": https://arxiv.org/pdf/1805.02343.pdf
    7. "Deep Reinforcement Learning for List-wise Recommendations": https://arxiv.org/pdf/1801.00209.pdf
    8. "Deep Reinforcement Learning Based RecSys Using Distributed Q Table": http://www.ieomsociety.org/ieom2020/papers/274.pdf
  • Kriti Doneria

    Kriti Doneria - Trust Building in AI systems: A critical thinking perspective

    Kriti Doneria
    Data Science
    Practitioner
    1 year ago
    20 Mins
    Talk
    Beginner

    How do I know when to trust AI, and when not to?

    Who goes to jail if a self-driving car kills someone tomorrow?

    Did you know scientists say people will believe anything if it is repeated enough?

    Designing AI systems is also an exercise in critical thinking, because an AI is only as good as its creator. This talk is for discussions like these, and more.

    With the exponential increase in computing power available, several AI algorithms that were mere papers written decades ago have become implementable. For a data scientist, it is very tempting to use the most sophisticated algorithm available. But given that its applicability has moved beyond academia and out into the business world, are numbers alone sufficient? Putting context to AI, or XAI (explainable AI), takes the black box out of AI to enhance human-computer interaction. This talk will revolve around the interpretability-complexity trade-off, the challenges, drivers and caveats of the XAI paradigm, and an intuitive demo of translating the inner workings of an ML algorithm into human-understandable formats to achieve more business buy-in.

    Prepare to be amused and enthralled at the same time.

  • Dr. Sri Vallabha Deevi

    Dr. Sri Vallabha Deevi - How to train your dragon - Reinforcement learning from scratch

    20 Mins
    Talk
    Beginner

    Reinforcement learning helped Google's "AlphaGo" beat the world's best Go player. Have you wondered if you too can train a program to play a simple game?

    Reinforcement learning is a simple yet powerful technique that is driving many applications, from recommender systems to autonomous vehicles. It is best suited to handle situations where the behavior of the system cannot be described in simple rules. For example, a trained reinforcement learning agent can understand the scene on the road and drive the car like a human. In supply chain management, RL agents can make decisions on inventory ordering.

    In this talk, I will demonstrate how to train an RL agent to a) cross a maze, b) play a game of Tic-Tac-Toe against an intelligent opponent, and c) act as a warehouse manager and learn inventory ordering, all with the help of plain Python code. As you participate in this talk, you will master the basics of reinforcement learning and acquire the skills to train your own dragon.

  • Srikanth K S

    Srikanth K S - Actionable Rules from Machine Learning Models

    20 Mins
    Demonstration
    Intermediate

    Beyond predictions, some ML models provide rules to identify actionable sub-populations in the support-confidence-lift paradigm. Along with making the models interpretable, rules make it easy for stakeholders to decide on the plan of action. We discuss rule-based models in production, rule-based ensembles, and anchors, using the R package tidyrules.
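To make the support-confidence-lift terms concrete, here is a small sketch that scores one rule against a dataset. It uses the common association-rule definitions, which may differ in detail from what tidyrules reports, and it is written in Python rather than R. The customer rows and the rule "age > 40 implies churned" are made up for illustration.

```python
def rule_metrics(rows, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    support    = P(antecedent and consequent)
    confidence = P(consequent | antecedent)
    lift       = confidence / P(consequent)
    """
    n = len(rows)
    n_a = sum(1 for r in rows if antecedent(r))
    n_c = sum(1 for r in rows if consequent(r))
    n_ac = sum(1 for r in rows if antecedent(r) and consequent(r))
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

# Hypothetical customer table, invented for this example.
customers = [
    {"age": 45, "churned": True},
    {"age": 52, "churned": True},
    {"age": 61, "churned": True},
    {"age": 48, "churned": False},
    {"age": 23, "churned": True},
    {"age": 31, "churned": False},
    {"age": 35, "churned": False},
    {"age": 29, "churned": False},
]

support, confidence, lift = rule_metrics(
    customers,
    antecedent=lambda r: r["age"] > 40,
    consequent=lambda r: r["churned"],
)
print(support, confidence, lift)
```

A lift above 1 means the sub-population selected by the rule shows the outcome more often than the population as a whole, which is what makes the rule a candidate for action.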