AR-MDN - Associative and Recurrent Mixture Density Network for e-Retail Demand Forecasting

Aug 31st 11:00 AM - 11:45 AM, Jupiter, 45 Interested

Accurate demand forecasts can help on-line retail organizations better plan their supply-chain processes. The challenge, however, is the large number of associative factors that result in large, non-stationary shifts in demand, which traditional time series and regression approaches fail to model. In this paper, we propose a neural network architecture called AR-MDN that simultaneously models associative factors, time-series trends and the variance in the demand. We first identify several causal features and use a combination of feature embeddings, MLP and LSTM to represent them. We then model the output density as a learned mixture of Gaussian distributions. The AR-MDN can be trained end-to-end without the need for additional supervision. We experiment on a dataset of a year’s worth of data over tens of thousands of products from Flipkart. The proposed architecture yields a significant improvement in forecasting accuracy when compared with existing alternatives.
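
To make the idea of an output density modelled as a mixture of Gaussians concrete, here is a minimal sketch of the negative log-likelihood that a mixture density network would typically minimise for one observation. The function names, the softmax parameterisation of the mixture weights, and the example numbers are assumptions made for illustration; the abstract does not specify how AR-MDN parameterises its output layer.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def gaussian_log_pdf(y, mean, sigma):
    """Log density of a univariate Gaussian at y."""
    z = (y - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def mixture_nll(y, logits, means, sigmas):
    """Negative log-likelihood of one observation y under a Gaussian mixture.

    logits: unnormalised mixture weights (a softmax is applied here; assumption)
    means, sigmas: per-component mean and standard deviation
    """
    log_norm = log_sum_exp(logits)
    terms = [
        (logit - log_norm) + gaussian_log_pdf(y, mu, s)
        for logit, mu, s in zip(logits, means, sigmas)
    ]
    return -log_sum_exp(terms)

# Hypothetical network outputs for one product and week: two demand regimes.
loss = mixture_nll(y=12.0, logits=[0.3, -0.3], means=[10.0, 40.0], sigmas=[3.0, 8.0])
```

In a trained model the logits, means and standard deviations would come from the network's final layer, and this loss would be averaged over all products and time steps. Because the loss depends only on the observed demand, this matches the abstract's remark that no additional supervision is needed.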

 
 

Outline/structure of the Session

  1. Introduction to the problem
  2. Data Sources
  3. Time Series and Tree-based models
  4. Proposed Model - Deep AR-MDN
  5. Experimental Evaluation

Learning Outcome

  1. Understanding of how deep learning can be used for business forecasting.
  2. Understanding of how a recurrent neural network works. This is NOT a complete introduction to RNNs, but rather an application of them.
  3. Some idea of how a Bayesian generative model works.

Target Audience

Machine learning or applied deep learning researchers, and ML/DL practitioners who want to apply deep learning to business forecasting problems.

Prerequisite

  1. Basic knowledge of how a recurrent neural network/LSTM works.
  2. Understanding of maximum likelihood estimation.
  3. Understanding of Gaussian mixture density and generative models.

Submitted 4 months ago


    • Dr. Tom Starke - Intelligent Autonomous Trading Systems - Are We There Yet?

      Dr. Tom Starke
      CEO
      AAAQuants
      4 months ago
      45 Mins
      Talk
      Intermediate

      Over the last two decades, trading has seen a remarkable evolution from open outcry in the Wall Street pits to screen trading, all the way to current automation and high-frequency trading (HFT). The success of machine learning and artificial intelligence (AI) seems like a natural progression in the evolution of trading. However, unlike other fields of AI, trading has some domain-specific problems that push the dream of set-it-and-forget-it money-making machines still some way into the future. This talk will describe the current challenges for intelligent autonomous trading systems and provide some practical examples where machine learning is already being used in financial applications.

    • Dr. Dakshinamurthy V Kolluru - ML and DL in Production: Differences and Similarities

      45 Mins
      Talk
      Beginner

      While architecting a data-based solution, one needs to approach the problem differently depending on the specific strategy being adopted. In traditional machine learning, the focus is mostly on feature engineering. In DL, the emphasis is shifting to tagging larger volumes of data, with less focus on feature development. Similarly, synthetic data is a lot more useful in DL than in ML. So, the data strategies can be significantly different. Both require very similar approaches to the analysis of errors, but in most development processes those approaches are not followed, leading to substantial delays in reaching production. Hyperparameter tuning for performance improvement requires different strategies for ML and DL solutions because of the longer training times of DL systems. Transfer learning is a very important aspect to evaluate when building any state-of-the-art system, whether ML or DL. Last but not least is understanding the biases that the system is learning. Deeply non-linear models require special attention in this respect, as they can learn highly undesirable features.

      In our presentation, we will focus on all the above aspects with suitable examples and provide a framework for practitioners for building ML/DL applications.

    • Vincenzo Tursi - Puzzling Together a Teacher-Bot: Machine Learning, NLP, Active Learning, and Microservices

      Vincenzo Tursi
      Data Scientist
      KNIME
      4 months ago
      45 Mins
      Demonstration
      Beginner

      Hi! My name is Emil and I am a Teacher Bot. I was built to answer your initial questions about using KNIME Analytics Platform. Well, actually, I was built to point you to the right training materials to answer your questions about KNIME.

      Puzzling together all the pieces to implement me wasn't that difficult. All you need are:

      • A user interface - web or speech based - for you to ask questions
      • A text parser for me to understand
      • A brain to find the right training materials to answer your question
      • A user interface to send you the answer
      • A feedback option - nice to have but not a must - on whether my answer was of any help

      The most complex part was, of course, my brain. Building my brain required: a clear definition of the problem, a labeled data set, a class ontology, and finally the training of a machine learning model. The labeled data set in particular was lacking. So, we relied on active learning to incrementally make my brain smarter over time. Some parts of the final architecture, such as understanding and resource searching, were deployed as microservices.

    • Anand Chitipothu - DevOps for Data Science: Experiences from building a cloud-based data science platform

      Anand Chitipothu
      Co-Founder
      Rorodata
      4 months ago
      45 Mins
      Case Study
      Beginner

      Productionizing data science applications is non-trivial. Non-optimal practices, the people-heavy nature of traditional approaches, and developers’ love for complex solutions for the sake of using cool technologies make the situation even worse.

      There are two key ingredients required to streamline this: “the cloud” and “the right level of devops abstractions”.

      In this talk, I’ll share the experiences of building a cloud-based platform for streamlining data science and how such solutions can greatly simplify building and deploying data science and machine learning applications.

    • Favio Vázquez - Agile Data Science Workflows with Python, Spark and Optimus

      Favio Vázquez
      Sr. Data Scientist
      Raken Data Group
      4 months ago
      480 Mins
      Workshop
      Intermediate

      Cleaning, preparing, transforming and exploring data is the most time-consuming and least enjoyable data science task, but one of the most important. With Optimus we’ve solved this problem for small and huge datasets alike, while also improving the whole data science workflow and making it easier for everyone. You will learn how the combination of Apache Spark and Optimus with the Python ecosystem can form a complete framework for Agile Data Science, allowing people and companies to go beyond their common sense and intuition to solve complex business problems.

    • Dr. Manish Gupta / Radhakrishnan G - Driving Intelligence from Credit Card Spend Data using Deep Learning

      45 Mins
      Talk
      Beginner

      Recently, we have heard success stories of how deep learning technologies are revolutionizing many industries. Deep learning has proven hugely successful on some problems in unstructured data domains such as image recognition, speech recognition and natural language processing. However, only limited gains have been shown in traditional structured data domains like BFSI. This talk will cover American Express’ exciting journey exploring deep learning techniques to generate the next set of data innovations by deriving intelligence from the data within its global, integrated network. Learn how using credit card spend data has helped improve credit and fraud decisions and elevate the payment experience of millions of Card Members across the globe.

    • Joy Mustafi - The Artificial Intelligence Ecosystem driven by Data Science Community

      45 Mins
      Talk
      Intermediate

      Cognitive computing makes a new class of problems computable. To respond to the fluid nature of users’ understanding of their problems, a cognitive computing system offers a synthesis not just of information sources but of influences, contexts, and insights. These systems differ from current computing applications in that they move beyond tabulating and calculating based on pre-configured rules and programs. They can infer and even reason based on broad objectives. In this sense, cognitive computing is a new type of computing with the goal of building more accurate models of how the human brain or mind senses, reasons, and responds to stimulus. It is a field of study concerned with how to create computers and computer software that are capable of intelligent behavior. The field is interdisciplinary: a number of sciences and professions converge in it, including computer science, electronics, mathematics, statistics, psychology, linguistics, philosophy, neuroscience and biology.

      Project features:

      • Adaptive: They MUST learn as information changes, and as goals and requirements evolve. They MUST resolve ambiguity and tolerate unpredictability. They MUST be engineered to feed on dynamic data in real time.
      • Interactive: They MUST interact easily with users so that those users can define their needs comfortably. They MUST interact with other processors, devices, services, as well as with people.
      • Iterative and Stateful: They MUST aid in defining a problem by asking questions or finding additional source input if a problem statement is ambiguous or incomplete. They MUST remember previous interactions in a process and return information that is suitable for the specific application at that point in time.
      • Contextual: They MUST understand, identify, and extract contextual elements such as meaning, syntax, time, location, appropriate domain, regulation, user profile, process, task and goal. They may draw on multiple sources of information, including both structured and unstructured digital information, as well as sensory inputs (visual, gestural, auditory, or sensor-provided).

      A set of cognitive systems is implemented and demonstrated as the project J+O=Y.

    • Dr. Rohit M. Lotlikar - The Impact of Behavioral Biases to Real-World Data Science Projects: Pitfalls and Guidance

      45 Mins
      Talk
      Intermediate

      Data science projects, unlike their software counterparts, tend to be uncertain and rarely fit into a standardized approach. Each organization has its unique processes, tools, culture, data and inefficiencies, and a templatized approach, more common for software implementation projects, rarely fits.

      In a typical data science project, a data science team is attempting to build a decision support system that will either automate human decision making or assist a human in decision making. The dramatic rise in interest in data sciences means the typical data science project has a large proportion of relatively inexperienced members whose learnings draw heavily from academics, data science competitions and general IT/software projects.

      These data scientists learn over time that the real world, however, is very different from the world of data science competitions. In the real world, problems are ill-defined, data may not exist to start with, and it is not just model accuracy, complexity and performance that matter but also the ease of infusing domain knowledge, interpretability (the ability to provide explanations), the level of skill needed to build and maintain the model, the stability and robustness of the learning, ease of integration with enterprise systems, and ROI.

      Human factors play a key role in the success of such projects. Managers making the transition from IT/software delivery to data science frequently do not allow for sufficient uncertainty in outcomes when planning projects. Senior leaders and sponsors are under pressure to deliver outcomes but are unable to make a realistic assessment of payoffs and risks and to set investment and expectations accordingly. This makes the journey and outcome sensitive to various behavioural biases of project stakeholders. Knowing the typical behavioural biases and pitfalls makes it easier to identify them upfront and take corrective action.

      The speaker brings his nearly two decades of experience working at startups, in R&D and in consulting to lay forth these recurring behavioural biases and pitfalls.

      Many of the biases covered are grounded in the speaker’s first-hand experience. The talk will provide examples of these biases and suggestions on how to identify and overcome or correct for them.

    • Akshay Bahadur - Recognizing Human features using Deep Networks.

      Akshay Bahadur
      SDE-I
      Symantec Softwares
      5 months ago
      20 Mins
      Demonstration
      Beginner

      This demo covers some of the work I have done since starting my journey in Machine Learning. There are a lot of MOOCs out there for ML and data science, but the most important thing is to apply the concepts learned during a course to solve simple real-world use cases.

      • One of my projects involved building a state-of-the-art facial recognition system [VIDEO]. For that, I referred to several research papers; the foundation was given to me in one of the courses, but it took a lot of effort to connect the dots, and that's the fun part.
      • In another project, I made an Emoji Classifier for humans [VIDEO] based on hand gestures. For that, I used a deep learning CNN model to achieve great accuracy. I took reference from several online resources, which made me realize that the data science community is very helpful and that we must make efforts to contribute back.
      • The other projects that I have done using machine learning:
        1. Handwritten digit recognition [VIDEO],
        2. Alphabet recognition [VIDEO],
        3. Apparel classification [VIDEO],
        4. Devanagari recognition [VIDEO].

      With each project, I have tried to apply one new technique or another to make my model a bit more efficient, such as hyperparameter tuning or simply cleaning the data.

      In this demonstration, I would just like to point out that knowledge never goes to waste. The small computer vision applications that I built in college have helped me take on deep learning computer vision tasks. It's always enlightening and empowering to learn new technologies.

      I was recently part of a session on ‘Solving real world applications from Machine learning’, presented to the Microsoft Advanced Analytics User Group of Belgium and broadcast across the globe (Meetup Link) [Session Recording].

    • Nirav Shah - Advanced Data Analysis, Dashboards And Visualization

      Nirav Shah
      Founder
      OnPoint Insights
      4 months ago
      480 Mins
      Workshop
      Intermediate

      In these two training sessions (4 hours each, 8 hours total), you will learn to use the data visualization and analytics software Tableau Public (free to use) and turn your data into interactive dashboards. You will get hands-on training on how to create stories with dashboards and share these dashboards with your audience. The first session will begin with a quick refresher on the basics of design and information literacy, and a discussion of best practices for creating charts as well as a decision-making framework. Whether your goal is to explain an insight or let your audience explore data insights, Tableau's simple drag-and-drop user interface makes the task easy and enjoyable. You will learn what's new in Tableau, and the session will cover the latest and most advanced data preparation features.

      In the follow-up second session, you will learn to create Table Calculations, Level of Detail Calculations and Animations, and to understand Clustering. You will learn to integrate R and Tableau and how to use R within Tableau. You will also learn mapping, using filters / parameters and other visual functionalities.

    • Ujjyaini Mitra - How to build churn propensity model where churn is single digit, in a non commital market

      45 Mins
      Case Study
      Intermediate

      When most known classification models fail to predict month on month telecom churn for a leading telecom operator, what can we do? Could there be an alternative?

    • Anuj Gupta - Sarcasm Detection : Achilles Heel of sentiment analysis

      Anuj Gupta
      Independent Researcher
      6 months ago
      45 Mins
      Talk
      Intermediate

      Sentiment analysis has long been the poster-boy problem of NLP and has attracted a lot of research. However, despite so much work in this sub-area, most sentiment analysis models fail miserably at handling sarcasm. The rise in the use of sentiment models for analysing social data has only exposed this gap further. Owing to the subtlety of the language involved, sarcasm detection is not easy and has fascinated the NLP community.

      Most attempts at sarcasm detection still depend on hand-crafted features that are dataset-specific. In this talk we look at some very recent attempts to leverage recent advances in NLP to build generic models for sarcasm detection.

      Key takeaways:
      + Challenges in sarcasm detection
      + Deep dive into an end-to-end solution using DL to build generic models for sarcasm detection
      + Shortcomings and the road forward

    • Asha Saini - Using Open Data to Predict Market Movements

      20 Mins
      Talk
      Intermediate

      As companies progress on their digital transformation journeys, technology becomes a strategic business decision. In this realm, consulting firms such as Gartner exert tremendous influence on technology purchasing decisions. The ability of these firms to predict the movement of market players will provide vendors with competitive benefits.

      We will explore how, with the use of publicly available data sources, IT industry trends can be mimicked and predicted.

      Big Data enthusiasts learned quickly that there are caveats to making Big Data useful:

      • Data source availability
      • Producing meaningful insights from publicly available sources

      Working with large data sets that are frequently changing can become expensive and frustrating. The learning curve is steep and the discovery process long. Challenges range from selecting efficient tools for parsing unstructured data to developing a vision for interpreting and utilizing the data for competitive advantage.

      We will describe how the archive of billions of web pages, captured monthly since 2008 and available for free analysis on AWS, can be used to mimic and predict trends reflected in industry-standard consulting reports.

      There could be an opportunity in this process to apply machine learning to tune the models and have them self-learn so that they optimize automatically. Gartner publishes reports on over 70 topic areas. An automated tool that can analyze across all of those topic areas, helping us quickly understand major trends across today’s landscape and plan for those to come, would be invaluable to many organizations.

    • Dr. Ravi Vijayaraghavan / Dr. Sidharth Kumar - Analytics and Science for Customer Experience and Growth in E-commerce

      20 Mins
      Experience Report
      Advanced

      In our talk, we will cover the broad areas where Flipkart leverages Analytics and Sciences to drive both human and machine-driven decisions. We will go deeper into one use case related to pricing in e-commerce.

    • Dr. Jennifer Prendki - Recognition and Localization of Parking Signs using Deep Learning

      45 Mins
      Case Study
      Intermediate

      Drivers in large cities such as San Francisco are often the cause of traffic jams when they slow down and circle the streets trying to decipher the meaning of parking signs and avoid tickets. This endangers the safety of pedestrians and harms the overall transportation environment.

      In this talk, I will present an automated model developed by the Machine Learning team at Figure Eight which exploits multiple Deep Learning techniques to predict the presence of parking signs from street-level imagery and find their actual location on a map. Multiple APIs are then applied to read and extract the rules from the signs. The obtained map of the digitized parking rules along with the GPS information of a driver can be ultimately used to build functional products to help drive and park more safely.

    • Dr. Jennifer Prendki / Kiran Vajapey - Introduction to Active Learning

      Dr. Jennifer Prendki
      VP of Machine Learning
      Figure Eight
      Kiran Vajapey
      HCI Developer
      Figure Eight
      4 months ago
      480 Mins
      Workshop
      Intermediate

      The greatest challenge when building a high-performance model isn't choosing the right algorithm or doing hyperparameter tuning: it is getting high-quality labeled data. Without good data, no algorithm, even the most sophisticated one, will deliver the results needed for real-life applications. And with most modern algorithms (such as Deep Learning models) requiring huge amounts of data to train, things aren't going to get better any time soon.

      Active Learning is one of the possible solutions to this dilemma, but is, quite surprisingly, left out of most Data Science conferences and Computer Science curricula. This workshop hopes to address the Machine Learning community's lack of awareness of the important topic of Active Learning.

      Link to data used in this course: https://s3-us-west-1.amazonaws.com/figure-eight-dataset/active_learning_odsc_india/Active_Learning_Workshop_data.zip
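
As a rough illustration of how active learning reduces labeling effort, the sketch below implements entropy-based uncertainty sampling, one common query strategy: the model's least confident predictions are sent to annotators first. The workshop abstract does not say which strategy it teaches, so the strategy, the function names and the example probabilities are all assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_for_labeling(predictions, batch_size):
    """Return indices of the unlabeled items the model is least sure about.

    predictions: list of per-item class-probability lists
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:batch_size]

# Hypothetical model outputs for four unlabeled items (three classes each).
pool_predictions = [
    [0.98, 0.01, 0.01],
    [0.34, 0.33, 0.33],
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]
to_label = select_for_labeling(pool_predictions, batch_size=2)
```

In a full loop, the selected items would be labeled, added to the training set, and the model retrained before the next round of selection.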

    • Ujjyaini Mitra - When the Art of Entertainment ties the knot with Science

      20 Mins
      Talk
      Advanced

      Entertainment may appear to be a pure art form, but there is a great deal that science can do to back the art. AI can drive many human-intensive tasks in the Media Industry, moving gut-based decisions to data-driven decisions. Can we create a promo of a movie through AI? How about knowing which part of a video causes disengagement among our audiences? Could AI help content editors? How about assisting script writers through AI?

      I will talk about a few specific experiments done on Voot Original content: on binging, hooking, content editing, audience disengagement, etc.

    • Atin Ghosh - Multi-task learning of glaucoma diagnosis and medical image segmentation using deep learning

      20 Mins
      Talk
      Intermediate

      Glaucoma is a type of eye disease which is one of the leading causes of complete blindness. At the moment, successful glaucoma diagnosis is a very expensive, time-consuming process and requires a battery of tests to confirm the disease. It is also important to detect the disease as early as possible so that treatment can be started immediately to slow down its progression, since right now there is no cure for it. Due to this, there has been a lot of effort among ophthalmologists to find better ways to detect glaucoma. At the same time, deep learning and convolutional neural networks (CNNs) have shown tremendous promise in difficult computer vision tasks such as object detection, image segmentation, etc.

      Motivated by this, there has been a lot of effort to apply deep learning in medical image diagnosis, particularly in the detection of glaucoma from 3D OCT images of the optic nerve head. Another important task is to assist medical practitioners in detecting glaucoma by segmenting the tissues in the OCT image so that they have more confidence in their diagnosis.

      We have combined these two tasks, i.e. image classification and segmentation, in a single objective loss which we minimise to train our deep network. We also use a visual attention mechanism to focus on a particular region of the image, namely the RNFL thickness, which is important for detecting glaucoma. We achieve an AUC of 90% for the glaucoma diagnosis task, which is as good as human diagnosis, while bringing the diagnosis duration down from a few months to a few seconds.
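
A minimal sketch of what combining classification and segmentation in a single objective can look like: the diagnosis loss and the mean per-pixel segmentation loss are added, with a weight on the segmentation term. The binary cross-entropy losses, the `seg_weight` parameter and the function names are assumptions for illustration; the abstract does not state the exact form of the loss.

```python
import math

def binary_cross_entropy(p, label):
    """Cross-entropy for one predicted probability p and a 0/1 label."""
    eps = 1e-12  # guards against log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def multitask_loss(diag_prob, diag_label, seg_probs, seg_labels, seg_weight=1.0):
    """Classification loss plus weighted mean per-pixel segmentation loss.

    diag_prob:  predicted probability of glaucoma for the whole image
    seg_probs:  predicted foreground probability for each pixel (flattened)
    seg_weight: hypothetical knob balancing the two tasks
    """
    cls_loss = binary_cross_entropy(diag_prob, diag_label)
    seg_loss = sum(
        binary_cross_entropy(p, t) for p, t in zip(seg_probs, seg_labels)
    ) / len(seg_probs)
    return cls_loss + seg_weight * seg_loss

# Hypothetical predictions for one image with a four-pixel segmentation mask.
loss = multitask_loss(0.8, 1, [0.9, 0.2, 0.7, 0.1], [1, 0, 1, 0])
```

Minimising one summed loss lets both tasks update the shared layers of the network in the same training step.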

    • murughan palaniachari - AIOps - DevOps in Artificial Intelligence & Data Science

      murughan palaniachari
      DevOps Coach
      euromonitor
      6 months ago
      20 Mins
      Talk
      Beginner

      In this session you will learn how to adopt DevOps values, principles and practices in the AI world. DevOps culture increases collaboration among Data engineering, Data science/AI engineering, and Operations teams. DevOps enables faster delivery of high-quality products through process improvement and technology adoption such as Cloud, Automation, feedback loops, Self-service, and shift-left security.

    • Janakiram MSV - Accelerate Machine Learning Adoption with AutoML

      20 Mins
      Demonstration
      Beginner

      One emerging trend that's going to fundamentally change the face of ML is AutoML. It enables business analysts and developers to evolve machine learning models that can address complex scenarios. From platform companies such as Google and Microsoft to early-stage startups, AutoML is fast gaining traction. This session demonstrates how AutoML accelerates building machine learning models.