Building a Feature Platform to Scale Machine Learning at GO-JEK

Sep 1st, 10:15 AM - 11:00 AM | Grand Ball Room 2

Go-Jek, Indonesia’s first billion-dollar startup, has seen an incredible amount of growth in both users and data over the past two years. Many of the ride-hailing company's services are backed by machine learning models. These models, which range from driver allocation to dynamic surge pricing to food recommendation, process millions of bookings every day, leading to substantial increases in revenue and customer retention.

Building a feature platform has allowed Go-Jek to rapidly iterate and launch machine learning models into production. The platform supports the creation, storage, access, and discovery of features, offering both low-latency, high-throughput access during serving and high-volume queries of historical feature data during training. This allows Go-Jek to react immediately to real-world events.
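To make the serving/training split concrete, here is a minimal, purely illustrative sketch of the two retrieval paths a feature platform typically exposes. The class and method names are hypothetical, not Go-Jek's actual API:

```python
import bisect
from collections import defaultdict

class FeatureStore:
    """Toy feature store: latest values for low-latency serving,
    full history for building training sets.

    A real platform would back these with a key-value store (online)
    and a data warehouse (offline); writes are assumed to arrive in
    timestamp order here.
    """

    def __init__(self):
        self._online = {}                  # (entity, feature) -> latest value
        self._offline = defaultdict(list)  # (entity, feature) -> sorted (ts, value)

    def ingest(self, entity, feature, value, ts):
        """Write a feature value to both the online and offline stores."""
        self._online[(entity, feature)] = value
        bisect.insort(self._offline[(entity, feature)], (ts, value))

    def get_online_features(self, entity, features):
        """Low-latency lookup of latest values, used at serving time."""
        return {f: self._online.get((entity, f)) for f in features}

    def get_historical_features(self, entity, feature, start_ts, end_ts):
        """Range query over historical values, used to build training sets."""
        rows = self._offline[(entity, feature)]
        return [(ts, v) for ts, v in rows if start_ts <= ts <= end_ts]

store = FeatureStore()
store.ingest("driver_42", "completed_trips", 10, ts=100)
store.ingest("driver_42", "completed_trips", 12, ts=200)
print(store.get_online_features("driver_42", ["completed_trips"]))
# {'completed_trips': 12}
print(store.get_historical_features("driver_42", "completed_trips", 0, 150))
# [(100, 10)]
```

The same ingestion path feeds both stores, which is what keeps serving and training features consistent.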

Find out how Go-Jek implemented its feature platform, along with other lessons learned while scaling machine learning.

 

Outline/structure of the Session

  • Introduction to GO-JEK
  • Problem statement (What we set out to achieve)
  • Feature Platform
  • Machine Learning
  • Wrap up

Learning Outcome

  • You can't scale Data Science at a unicorn without scaling the tech.
  • Free yourself up from managing infrastructure by leveraging the cloud.
  • In a data rich organization, feature engineering is more important than complex models.
  • A feature platform is a necessity at scale, to avoid reimplementing functionality between projects.

Target Audience

CTO, CDO, CIO, Data Engineers, ML Engineers, Data Scientists

Prerequisite

Familiarity with basic data science terminology.

Some familiarity with typical cloud providers (GCP, AWS), as well as with common systems found in production environments.

Submitted 5 months ago

Comments

  • Naresh Jain  ~  5 months ago

    Thanks for the proposal, Willem. I'm very intrigued by the feature platform idea. Where can I learn more about Go-Jek's feature platform? 

    For the program committee to understand your presentation style, can you please share links to past video presentations?

    • Willem Pienaar  ~  5 months ago
      • There is no public information available at the moment about the feature platform.
      • I don't have any past presentations to share, unfortunately.
      • Naresh Jain  ~  5 months ago

        Thanks for the prompt response. Can I please request you to share a short 1 min trailer of your talk explaining the feature platform? You can record it via your phone or any screen sharing software.

        • Willem Pienaar  ~  5 months ago

          Well, the talk itself needs to be created first; it's still just an idea. However, I am willing to have a quick call with you to describe the feature platform in more detail. How does that sound?


  • Liked Dr. Ananth Sankar

    Dr. Ananth Sankar - The Deep Learning Revolution in Automatic Speech Recognition

    Dr. Ananth Sankar
    Principal Researcher
    LinkedIn
    3 months ago
    Sold Out!
    45 Mins
    Keynote
    Beginner

    In the last decade, deep neural networks have created a major paradigm shift in speech recognition. This has resulted in dramatic and previously unseen reductions in word error rate across a range of tasks. These improvements have fueled products such as voice search and voice assistants like Amazon Alexa and Google Home.

    The main components of a speech recognition system are the acoustic model, lexicon, and language model. In recent years, the acoustic model has evolved from using Gaussian mixture models to deep neural networks, resulting in significant reductions in word error rate. Recurrent neural network language models have also given improvements over traditional statistical n-gram language models. More recently, sequence-to-sequence recurrent neural network models have subsumed the acoustic model, lexicon, and language model into one system, resulting in a far simpler model that gives comparable accuracy to traditional systems. This talk will outline this evolution of speech recognition technology and close with some key challenges and interesting new areas in which to apply this technology.

  • Liked Dr. Ravi Mehrotra

    Dr. Ravi Mehrotra - Seeking Order amidst Chaos and Uncertainty

    45 Mins
    Keynote
    Beginner

    Applying analytics to determine an optimal answer to business decision problems is relatively easy when the future can be predicted accurately. When the business environment is very complex and the future cannot be predicted, the business problem can become intractable using traditional modeling and problem-solving techniques. How do we solve such complex and intractable business problems to find globally optimal answers in highly uncertain business environments? The talk will discuss modeling and solution techniques that allow us to find optimal solutions in highly uncertain business environments without ignoring or underestimating uncertainty for revenue management and dynamic price optimization problems that arise in the airline and hospitality industry.

  • Liked Naresh Jain

    Naresh Jain / Dr. Arun Verma / Dr. Denis Bauer / Favio Vázquez / Sheamus McGovern / Drs. Tarry Singh / Dr. Tom Starke / Dr. Veena Mendiratta - Unanswered Questions - Ask the Experts!

    45 Mins
    Keynote
    Beginner

    Throughout the conference, we will have heard different speakers' perspectives on and experiences with Data Science and AI. In this closing panel, we want to step back and look at any unanswered questions the audience may have.

  • Liked Sheamus McGovern

    Sheamus McGovern / Naresh Jain - Welcome Address

    20 Mins
    Keynote
    Beginner

    This talk will help you understand the vision behind ODSC Conference and how it has grown over the years.

  • Liked Sohan Maheshwar

    Sohan Maheshwar - It's All in the Data: The Machine Learning Behind Alexa's AI Systems

    Sohan Maheshwar
    Alexa Evangelist
    Amazon
    5 months ago
    Sold Out!
    45 Mins
    Talk
    Intermediate

    Amazon Alexa, the cloud-based voice service that powers Amazon Echo, provides access to thousands of skills that enable customers to voice control their world - whether it’s listening to music, controlling smart home devices, listening to the news or even ordering a pizza. Alexa developers use advanced natural language understanding capabilities like built-in slot and intent training, entity resolution, and dialog management. This natural language understanding is powered by advanced machine learning algorithms, which will be the focus of this talk.

    This session will tell you about the rise of voice user interfaces and give an in-depth look into how Alexa works. The talk will delve into natural language understanding, how utterance data is processed by our systems, and what a developer can do to improve the accuracy of their skill. It will also discuss how Alexa hears and understands you, and how error handling works.

  • Liked Bargava Subramanian

    Bargava Subramanian / Amit Kapoor - Deep Learning in the Browser: Explorable Explanations, Model Inference, and Rapid Prototyping

    45 Mins
    Demonstration
    Beginner

    The browser is the most common end-point for consumption of deep learning models. It is also the most ubiquitous programming platform available. The maturity of the client-side JavaScript ecosystem across the deep learning process—Data Frame support (Arrow), WebGL-accelerated learning frameworks (deeplearn.js), declarative interactive visualization (Vega-Lite), etc.—has made it easy to start playing with deep learning in the browser.

    Amit Kapoor and Bargava Subramanian lead three live demos of deep learning (DL) for explanations, inference, and training done in the browser, using the emerging client-side JavaScript libraries for DL with three different types of data: tabular, text, and image. They also explain how the ecosystem of tools for DL in the browser might emerge and evolve.

    Demonstrations include:

    1. Explorable explanations: Explaining the DL model and allowing the users to build intuition on the model helps generate insight. The explorable explanation for a loan default DL model allows the user to explore the feature space and threshold boundaries using interactive visualizations to drive decision making.
    2. Model inference: Inference is the most common use case. The browser allows you to bring your DL model to the data and also lets you test how the model works when executed on the edge. The demonstrated comment sentiment application can identify and warn users about the toxicity of their comments as they type in a text box.
    3. Rapid prototyping: Training DL models is now possible in the browser itself, if done smartly. The rapid prototyping image classification example allows the user to play with transfer learning to build a model specific for a user-generated image input.

    The demos leverage the following libraries in JavaScript:

    • Arrow for data loading and type inference
    • Facets for exploratory data analysis
    • ml.js for traditional machine learning model training and inference
    • deeplearn.js for deep learning model training and inference
    • Vega and Vega-Lite for interactive dashboards

    The working demos will be available on the web and as open source code on GitHub.

  • Liked Amit Kapoor

    Amit Kapoor / Bargava Subramanian - Architectural Decisions for Interactive Viz

    45 Mins
    Talk
    Beginner

    Visualization is an integral part of the data science process and includes exploratory data analysis to understand the shape of the data, model visualization to unbox the model algorithm, and dashboard visualization to communicate the insight. This task of visualization is increasingly shifting from a static and narrative setup to an interactive and reactive setup, which presents a new set of challenges for those designing interactive visualization applications.

    Creating visualizations for data science requires an interactive setup that works at scale. Bargava Subramanian and Amit Kapoor explore the key architectural design considerations for such a system and discuss the four key trade-offs in this design space: rendering for data scale, computation for interaction speed, adapting to data complexity, and being responsive to data velocity.

    • Rendering for data scale: Envisioning how the visualization can be displayed when data size is small is not hard. But how do you render interactive visualization when you have millions or billions of data points? Technologies and techniques include bin-summarise-smooth (e.g., Datashader and bigvis) and WebGL-based rendering (e.g., deck.gl).
    • Computation for interaction speed: Making the visualization reactive requires the user to have the ability to interact, drill down, brush, and link multiple visual views to gain insight. But how do you reduce the latency of the query at the interaction layer so that the user can interact with the visualization? Technologies and techniques include aggregation and in-memory cubes (e.g., hashcubes, InMEMS, and nanocubes), approximate query processing and sampling (e.g., VerdictDB), and GPU-based databases (e.g., MapD).
    • Adapting to data complexity: Choosing a good visualization design for a singular dataset is possible after a few experiments and iterations, but how do you ensure that the visualization will adapt to the variety, volume, and edge cases in the real data? Technologies and techniques include responsive visualization to space and data, handling high cardinality (e.g., Facet Dive), and multidimensional reduction (e.g., Embedding Projector).
    • Being responsive to data velocity: Designing for periodic query-based visualization refreshes is one thing, but streaming data adds a whole new level of challenge to interactive visualization. So how do you decide between the trade-offs of real-time and near real-time data and their impact on refreshing visualizations? Technologies and techniques include optimizing for near real-time visual refreshes and handling event- and time-based streams.
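
To make the "computation for interaction speed" trade-off concrete, here is a tiny, hypothetical sketch of the pre-aggregation idea behind in-memory cubes: group-by counts are computed once up front, so every subsequent drill-down or brush is a dictionary lookup instead of a scan over the raw rows.

```python
from collections import Counter
from itertools import combinations

def build_cube(rows, dims):
    """Pre-aggregate counts for every subset of dimensions (a toy data cube)."""
    cube = {}
    for r in range(1, len(dims) + 1):
        for group in combinations(dims, r):
            cube[group] = Counter(tuple(row[d] for d in group) for row in rows)
    return cube

rows = [
    {"city": "Jakarta", "service": "ride"},
    {"city": "Jakarta", "service": "food"},
    {"city": "Bandung", "service": "ride"},
]
cube = build_cube(rows, ["city", "service"])

# An interactive chart can now answer drill-downs with O(1) lookups:
print(cube[("city",)][("Jakarta",)])                   # 2
print(cube[("city", "service")][("Jakarta", "ride")])  # 1
```

Real systems such as nanocubes use far more compact shared structures, but the latency win comes from the same idea: pay the aggregation cost before the user starts interacting.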
  • Liked Kavita Dwivedi

    Kavita Dwivedi - Social Network Analytics to enhance Marketing Outcomes in Telecom Sector

    20 Mins
    Experience Report
    Beginner

    This talk will focus on how SNA can help enhance the outcomes of marketing campaigns by using social network graphs.

    Social network analytics (SNA) is the process of investigating social structures through the use of network and graph theories. It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties or edges (relationships or interactions) that connect them. It is emerging as an important tool for understanding and influencing customer behavior. The talk will focus on the mathematics behind SNA and on how SNA can help telecom operators make marketing decisions.

    The SNA use case will use telecom consumer data to establish networks based on calling behavior (frequency, duration of calls, types of connections) and thus identify major communities and influencers. By identifying key influencers and active communities, marketing campaigns can be made more effective and viral. SNA helps improve adoption rates by targeting influencers with a large number of followers. The talk will also touch upon how SNA helps retention and spreads the impact of marketing campaigns. The tools used for the use case are SAS SNA and NodeXL, for demonstration purposes, showing how SNA lifts the impact of campaigns.
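
As a small stdlib-only illustration of the node/edge framing (independent of the SAS SNA and NodeXL tools used in the talk, and with made-up call records), ranking subscribers by degree centrality over a call graph already surfaces influencer candidates:

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee, number_of_calls)
calls = [
    ("A", "B", 5), ("A", "C", 3), ("A", "D", 4),
    ("B", "C", 1), ("E", "A", 2),
]

degree = defaultdict(int)    # number of distinct contacts per subscriber
strength = defaultdict(int)  # total call volume per subscriber
for caller, callee, n in calls:
    degree[caller] += 1
    degree[callee] += 1
    strength[caller] += n
    strength[callee] += n

# Rank by degree, breaking ties on call volume: candidate influencers first.
ranked = sorted(degree, key=lambda u: (degree[u], strength[u]), reverse=True)
print(ranked[0])  # 'A' talks to the most distinct people, so tops the list
```

Production SNA adds community detection and richer centrality measures, but the graph construction step looks much like this.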

    This use case will illustrate a project focused on building an SNA model using a combination of demographic/firmographic company variables and call-frequency details. Dimensions like the company you work for, the place you stay, your professional experience and position, and industry type add considerable value to the social network graph. With the right combination of dimensions for the problem at hand (in our case, marketing analytics), we can identify the right influencers within a network. The more dimensions we add, the stronger the network becomes and the more effective it is for running campaigns.

    Looking forward to discussing the outcomes of this project with the audience and fellow speakers.

  • Liked Joy Mustafi

    Joy Mustafi - The Artificial Intelligence Ecosystem driven by Data Science Community

    45 Mins
    Talk
    Intermediate

    Cognitive computing makes a new class of problems computable. To respond to the fluid nature of users' understanding of their problems, the cognitive computing system offers a synthesis not just of information sources but of influences, contexts, and insights. These systems differ from current computing applications in that they move beyond tabulating and calculating based on pre-configured rules and programs. They can infer and even reason based on broad objectives. In this sense, cognitive computing is a new type of computing with the goal of more accurate models of how the human brain or mind senses, reasons, and responds to stimulus. It is an interdisciplinary field in which a number of sciences and professions converge, including computer science, electronics, mathematics, statistics, psychology, linguistics, philosophy, neuroscience and biology.

    Project features:

    • Adaptive: They MUST learn as information changes, and as goals and requirements evolve. They MUST resolve ambiguity and tolerate unpredictability. They MUST be engineered to feed on dynamic data in real time.
    • Interactive: They MUST interact easily with users so that those users can define their needs comfortably. They MUST interact with other processors, devices, and services, as well as with people.
    • Iterative and stateful: They MUST aid in defining a problem by asking questions or finding additional source input if a problem statement is ambiguous or incomplete. They MUST remember previous interactions in a process and return information that is suitable for the specific application at that point in time.
    • Contextual: They MUST understand, identify, and extract contextual elements such as meaning, syntax, time, location, appropriate domain, regulation, user profile, process, task and goal. They may draw on multiple sources of information, including both structured and unstructured digital information, as well as sensory inputs (visual, gestural, auditory, or sensor-provided).

    A set of cognitive systems is implemented and demonstrated as the project J+O=Y.

  • Liked Dr. Rohit M. Lotlikar

    Dr. Rohit M. Lotlikar - The Impact of Behavioral Biases to Real-World Data Science Projects: Pitfalls and Guidance

    45 Mins
    Talk
    Intermediate

    Data science projects, unlike their software counterparts, tend to be uncertain and rarely fit into a standardized approach. Each organization has its unique processes, tools, culture, data and inefficiencies, and a templatized approach, more common for software implementation projects, rarely fits.

    In a typical data science project, a data science team is attempting to build a decision support system that will either automate human decision making or assist a human in decision making. The dramatic rise in interest in data sciences means the typical data science project has a large proportion of relatively inexperienced members whose learnings draw heavily from academics, data science competitions and general IT/software projects.

    These data scientists learn over time that the real world is very different from the world of data science competitions. In the real world, problems are ill-defined, data may not exist to start with, and it’s not just model accuracy, complexity and performance that matter, but also the ease of infusing domain knowledge, interpretability and the ability to provide explanations, the level of skill needed to build and maintain the model, the stability and robustness of the learning, ease of integration with enterprise systems, and ROI.

    Human factors play a key role in the success of such projects. Managers making the transition from IT/software delivery to data science frequently do not allow for sufficient uncertainty in outcomes when planning projects. Senior leaders and sponsors are under pressure to deliver outcomes but are unable to make a realistic assessment of payoffs and risks and to set investment and expectations accordingly. This makes the journey and outcome sensitive to various behavioural biases of project stakeholders. Knowing the typical behavioural biases and pitfalls makes it easier to identify them upfront and take corrective actions.

    The speaker brings his nearly two decades of experience working at startups, in R&D and in consulting to lay forth these recurring behavioural biases and pitfalls.

    Many of the biases covered are grounded in the speaker's first-hand experience. The talk will provide examples of these biases and suggestions on how to identify and overcome or correct for them.

  • 20 Mins
    Experience Report
    Intermediate

    Generative Models are important techniques used in computer vision. Unlike other neural networks that are used for predictions from images, generative models can generate new images for specific objectives. This session will review several applications of generative modeling such as artistic style transfer, image generation and image translation using CNNs and GANs.

  • Liked Jyotsna Khemka

    Jyotsna Khemka / Dr. Amarpal S Kapoor - Distributed Deep Learning on Spark CPU Clusters with Intel BigDL

    Jyotsna Khemka
    Sr. Technical Consulting Engineer
    Intel
    Dr. Amarpal S Kapoor
    Software Technical Consulting Engineer
    Intel
    2 months ago
    Sold Out!
    480 Mins
    Workshop
    Beginner

    Are you or your customers running Apache Hadoop and Spark clusters? Are you curious about how you can run deep learning inference and training on your existing Hadoop or Spark CPU clusters without having to migrate your data from one setup to another? Experience how easy it is to run DL on general-purpose CPUs. This workshop will help you get started on your journey with Intel's big data deep learning library, BigDL.

    Join AI experts at Intel for a deep-dive workshop to learn about computer vision and deep learning.

    • Get a technical overview of the BigDL architecture and learn:
      • How it fits in the Apache Spark stack
      • Key features
      • Resources, code samples, and tips
    • Build with BigDL
      • Learn how to deploy BigDL on-premise or in the cloud to build end-to-end solutions.
      • Review case studies
      • Experience real-life demos
  • Liked Dr. Atul Singh

    Dr. Atul Singh - Relationships Matter: Mining relationships using Deep Learning

    20 Mins
    Experience Report
    Intermediate

    The desire to reduce the cognitive load on human agents processing swathes of data in natural languages is driving the adoption of machine learning based software solutions for extracting structured information from unstructured text. Use case scenarios range from monitoring Internet sites for potential terror threats to analyzing documents from disparate sources to identify potentially illegal transactions. These solutions rely on the ability to identify entities and the relationships between them using Natural Language Processing, which has benefited immensely from progress in deep learning.

    The goal of this talk is to introduce relationship extraction, a key building block of natural language understanding, and its use for building knowledge graphs that represent structured information extracted from unstructured text. The talk demonstrates how deep learning lends itself well to the problem of relationship extraction and provides an elegant and simple solution.
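
The talk's approach is deep learning based; purely to show the shape of the output, (subject, relation, object) triples and the knowledge graph they populate, here is a toy pattern-based sketch with invented sentences and a tiny fixed relation vocabulary:

```python
import re

# Matches "<entity> <relation> <entity>" for a small fixed set of relations.
PATTERN = re.compile(r"(\w+) (founded|acquired|advises) (\w+)")

def extract_triples(text):
    """Return (head, relation, tail) triples found in the text."""
    return PATTERN.findall(text)

def build_graph(triples):
    """Group triples into a head -> [(relation, tail)] adjacency map."""
    graph = {}
    for head, rel, tail in triples:
        graph.setdefault(head, []).append((rel, tail))
    return graph

text = "Alice founded Acme. Bob advises Acme. Acme acquired Widgets."
print(build_graph(extract_triples(text)))
# {'Alice': [('founded', 'Acme')], 'Bob': [('advises', 'Acme')], 'Acme': [('acquired', 'Widgets')]}
```

A deep learning extractor replaces the regex with a learned classifier over entity pairs, but it emits the same triple structure into the graph.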

  • Liked Dr. Vikas Agrawal

    Dr. Vikas Agrawal - Bring in the Lawyers: Explainable AI Driven Decision-making for the Enterprise

    45 Mins
    Case Study
    Intermediate

    Daniel Dennett (Tufts University) says, “If it can’t do better than us at explaining what it’s doing, then don’t trust it.” Will I believe the machine's recommendation enough to make a serious decision? What if I need to explain my decision in court, to my shareholders, or to individual customers? Are high precision and recall enough? We will see some examples where integrative AI models get better and better at providing actionable intelligence, such that ignoring the advice could be considered irresponsible, reckless or discriminatory. Who would be to blame if the advice given by the AI system is found erroneous or is disregarded? The advice given by the AI system may itself become confidential, attorney-client privileged communication, and there are real debates around giving the privilege of plausible deniability to the senior leadership of corporations.

    Wouldn't it be better to provide an explanation for the recommendations and let the humans decide whether the advice makes sense? Moreover, in some geographies, such as Europe (GDPR), and in industries like banking, credit cards and pharmaceuticals, explanations for predictions (or decision rules derived from them) are required by regulatory agencies. Therefore, many of these industries limit their models to easily explainable white-box algorithms like logistic regression or decision trees. What kind of explanations would it take for regulatory agencies to be willing to accept black-box algorithms such as various types of NNs for detecting fraud or money laundering? How do we demonstrate to the end user what the underlying relationships between the inputs and outputs are for traditionally black-box systems? How could we influence decision-makers enough to place trust in predictions made by a model? We could begin by giving reasons, explanations, and substantial insights into why a pump is about to fail in the next three days, how a sales opportunity is likely to be a win, or why an employee is leaving. Yet if we don't make these relevant to your role, your work context, your interests, what is valuable to you, and what you might lose if you make an incorrect decision, then we have not done our job as data scientists.

    Explanations are the core of the evolving relationship between humans and intelligent machines; this fosters trust. We need to be just as cautious of AI explanations as we are of each other’s, no matter how clever a machine seems. This means that as a community we need to find ways of reliably explaining black-box models. David Gunning (DARPA) says, “It’s the nature of these machine-learning systems that they produce a lot of false alarms, so an intelligence analyst really needs extra help to understand why a recommendation was made.”

    In this talk, we will examine what is required to explain predictions, the latest research in the area, and our own findings showing how this is currently being accomplished in practice for multiple real-world use cases in the enterprise.

  • 90 Mins
    Tutorial
    Advanced
    Advancements in Deep Learning seem almost unstoppable, and research is the only way to make true improvements. Tarry and his team at deepkapha.ai are working relentlessly on papers pertaining to Capsule Networks, automated swiping functions, and adaptations in optimizers and learning rates. In this lecture, we will briefly touch on how research is transforming the field of AI and finally reveal two papers: Neuroscience and the Impact of Deep Learning, and ARiA, a novel NN activation function that has already proven its dominance over ReLU and Sigmoid.
  • Jared Lander
    Chief Data Scientist
    Lander Analytics
    3 months ago
    Sold Out!
    480 Mins
    Workshop
    Beginner

    Modern statistics has become almost synonymous with machine learning - a collection of techniques that utilize today's incredible computing power. Jared Lander walks you through the available methods for implementing machine learning algorithms in R and explores underlying theories such as the elastic net and boosted trees.

    • Building the design matrix
    • Penalized regression with the lasso and ridge methods
    • Fitting models with glmnet
    • Interactive visualization of the coefficient path
    • Use cross-validation to choose the optimal lambda
    • Visualize coefficients with coefplot
    • Perform binary classification with a single tree with xgboost
    • Train a boosted tree
    • Tune xgboost hyperparameters
    • Use validation data to understand performance
    • Visualize variable importance
    • Train a boosted random forest with xgboost
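
The workshop itself uses R (glmnet, xgboost), but the core idea of penalized regression, shrinking coefficients and choosing the penalty on held-out data, fits in a few dependency-free lines. A one-dimensional ridge sketch, illustrative only:

```python
import random

# In one dimension, ridge regression has the closed form
#   w(lam) = sum(x*y) / (sum(x*x) + lam)
# Larger lam shrinks w toward zero; we pick lam on a validation split,
# mirroring what cross-validation does for glmnet's lambda path.

random.seed(0)
true_w = 2.0
xs = [random.uniform(-1, 1) for _ in range(200)]
data = [(x, true_w * x + random.gauss(0, 0.5)) for x in xs]
train, valid = data[:150], data[150:]

def ridge_fit(points, lam):
    """Closed-form 1-D ridge coefficient for penalty lam."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, y in points)
    return sxy / (sxx + lam)

def mse(points, w):
    """Mean squared error of the linear fit y = w * x."""
    return sum((y - w * x) ** 2 for x, y in points) / len(points)

lambdas = [0.0, 0.1, 1.0, 10.0, 100.0]
best_lam = min(lambdas, key=lambda lam: mse(valid, ridge_fit(train, lam)))
w = ridge_fit(train, best_lam)
print(best_lam, round(w, 2))  # the selected fit stays near the true slope
```

glmnet generalizes this to many features with an efficient path over the whole lambda grid, and xgboost applies the same regularization idea to trees.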
  • 45 Mins
    Demonstration
    Beginner

    There is a lot of buzz around computing on GPUs, particularly for training deep learning models though information about implementation is often sparse. In this talk, we provide an overview of some of the best models, such as penalized regression, boosted trees and deep neural networks. After covering some of the math behind the models we demonstrate code, mostly in R, the language of choice for data scientists. Our goal is to show how simple it is to fit these models and that training them on GPUs takes marginally more effort than training on CPUs. By the end of the talk, you should have a strong sense of how to fit models with GPUs in R.

  • Liked Drs. Tarry Singh

    Drs. Tarry Singh / Aishwary Patil - Neural Networks Deep Dive

    Drs. Tarry Singh
    CEO, Founder & AI Neuroscience Researcher
    deepkapha.ai
    Aishwary Patil
    Robotics and AI Researcher
    deepkapha.ai
    4 months ago
    Sold Out!
    480 Mins
    Workshop
    Beginner

    In the first half of the day, we will conduct a comprehensive CNN theory lecture and discuss at length the most commonly used neural network frameworks, such as TensorFlow and PyTorch. In the second half, we will build our own neural network from scratch (in PyTorch or TensorFlow) and, if time permits, let learners play with ARiA, the novel activation function that our researcher wrote a few weeks ago. While deepkapha.ai is very busy writing some cool new algorithms, it is very likely that we will reveal deeper insights into the new paper we are currently writing. Finally, the one-day workshop will end with a full Capsule Network lecture, covering the new neural network that is outperforming the CNN (Convolutional Neural Network).

    Student Discount: Students are eligible for a flat 75% discount on this workshop and would also get a participation certificate from deepkapha.ai. To get the discount code, please email indianteam@odsc.com with a copy of your valid student ID card.

  • Liked Ujjyaini Mitra

    Ujjyaini Mitra - How to build a churn propensity model where churn is single-digit, in a non-committal market

    45 Mins
    Case Study
    Intermediate

    When most known classification models fail to predict month-on-month telecom churn for a leading telecom operator, what can we do? Could there be an alternative?

  • Liked Dr. Ravi Vijayaraghavan

    Dr. Ravi Vijayaraghavan / Dr. Sidharth Kumar - Analytics and Science for Customer Experience and Growth in E-commerce

    20 Mins
    Experience Report
    Advanced

    In our talk, we will cover the broad areas where Flipkart leverages Analytics and Sciences to drive both human and machine-driven decisions. We will go deeper into one use case related to pricing in e-commerce.