Using Open Data to Predict Market Movements

Bengaluru | Aug 31st, 02:55 - 03:15 PM | Jupiter | 60 Interested

As companies progress on their digital transformation journeys, technology choices become strategic business decisions. In this realm, consulting firms such as Gartner exert tremendous influence on technology purchasing decisions. The ability to predict how these firms will position market players can give vendors a competitive advantage.

We will explore how, with the use of publicly available data sources, IT industry trends can be mimicked and predicted.

Big Data enthusiasts learned quickly that there are caveats to making Big Data useful:

  • Data source availability
  • Producing meaningful insights from publicly available sources

Working with large, frequently changing data sets can become expensive and frustrating. The learning curve is steep and the discovery process is long. Challenges range from selecting efficient tools for parsing unstructured data to developing a vision for interpreting and using the data for competitive advantage.

We will describe how the archive of billions of web pages, captured monthly since 2008 and available for free analysis on AWS, can be used to mimic and predict trends reflected in industry-standard consulting reports.
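As a rough illustration of the kind of measurement this involves, the sketch below computes, for each monthly snapshot, the fraction of pages that mention a given term. It is a minimal single-machine example and not the implementation used in the project: the records in `PAGES`, the vendor names in `TERMS`, and the function `mention_share` are all hypothetical, and a real run would read extracted page text from the archive on a distributed cluster.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy records: (crawl month, extracted page text).
PAGES = [
    ("2017-01", "Our team migrated to VendorA for analytics."),
    ("2017-01", "VendorB and VendorA both offer dashboards."),
    ("2017-02", "VendorB released a new analytics platform."),
    ("2017-02", "Why we chose VendorB over the alternatives."),
]

TERMS = ["VendorA", "VendorB"]  # hypothetical vendor names


def mention_share(pages, terms):
    """Return {month: {term: fraction of that month's pages mentioning term}}."""
    # Whole-word, case-insensitive match for each term.
    patterns = {
        t: re.compile(r"\b" + re.escape(t) + r"\b", re.IGNORECASE) for t in terms
    }
    page_counts = Counter()       # pages seen per month
    hits = defaultdict(Counter)   # pages mentioning each term, per month
    for month, text in pages:
        page_counts[month] += 1
        for term, pattern in patterns.items():
            if pattern.search(text):
                hits[month][term] += 1
    return {
        month: {t: hits[month][t] / total for t in terms}
        for month, total in sorted(page_counts.items())
    }


shares = mention_share(PAGES, TERMS)
for month, by_term in shares.items():
    print(month, by_term)
```

Normalizing by the number of pages per month matters because the size of each snapshot varies, so raw mention counts alone would mostly track archive growth.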

This process may also offer an opportunity to apply machine learning to tune the models so that they self-learn and optimize automatically. Gartner publishes reports in over 70 topic areas. An automated tool that analyzes all of those topic areas, helping us quickly understand the major trends in today’s landscape and plan for those to come, would be invaluable to many organizations.

 
 

Outline/Structure of the Talk

The session will focus on a case study showing how open data (i.e. publicly available data) can be used to gain insight into market movements, to see which technologies are gaining traction and which are not, and to predict the movement of vendors in Gartner's Magic Quadrant reports. We will also describe how the project was implemented, from downloading indexes of web crawl data and performing text analysis on distributed computing clusters to building visualizations. We will cover 3 use cases and demonstrate our findings.
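One simple way to decide whether a technology is "gaining traction" is to fit a straight line to its monthly mention share and look at the sign of the slope. The sketch below does this with an ordinary least-squares slope in plain Python. It is an assumption about how such a signal could be computed, not the method presented in the talk; the series in `SERIES`, the names `TechA` and `TechB`, and the helper functions are hypothetical.

```python
def trend_slope(values):
    """Ordinary least-squares slope of values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


def classify(series_by_term):
    """Label each term by the sign of its trend slope."""
    labels = {}
    for term, values in series_by_term.items():
        slope = trend_slope(values)
        if slope > 0:
            labels[term] = "gaining"
        elif slope < 0:
            labels[term] = "losing"
        else:
            labels[term] = "flat"
    return labels


# Hypothetical monthly mention shares for two made-up technologies.
SERIES = {
    "TechA": [0.10, 0.12, 0.15, 0.19],
    "TechB": [0.30, 0.27, 0.25, 0.20],
}

labels = classify(SERIES)
print(labels)
```

A slope alone ignores noise and seasonality, so in practice it would be a starting point to compare against the published reports rather than a prediction in itself.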

Learning Outcome

The audience will learn how simple text analysis, combined with open data, can uncover hidden trends and correlations and provide a competitive advantage. The audience will also learn about the tools, technologies and techniques used in this project.

Target Audience

Data Scientists, Data Engineers, Software Engineers, Data Enthusiasts

Prerequisites for Attendees

General knowledge of web crawlers and indexes, core AWS services such as S3 and EMR, text analytics, Python/Scala, and Zeppelin notebooks

Submitted 2 years ago

Public Feedback


    • Liked Dr. Tom Starke
      Dr. Tom Starke - Intelligent Autonomous Trading Systems - Are We There Yet?

      CEO
      AAAQuants
      2 years ago
      45 Mins
      Talk
      Intermediate

      Over the last two decades, trading has seen a remarkable evolution from open-outcry in the Wall Street pits to screen trading all the way to current automation and high-frequency trading (HFT). The success of machine learning and artificial intelligence (AI) seems like a natural progression for the evolution of trading. However, unlike other fields of AI, trading has some domain-specific problems that keep the dream of set-it-and-forget-it money-making machines some way in the future. This talk will describe the current challenges for intelligent autonomous trading systems and provide some practical examples where machine learning is already being used in financial applications.

    • Liked Vincenzo Tursi
      Vincenzo Tursi - Puzzling Together a Teacher-Bot: Machine Learning, NLP, Active Learning, and Microservices

      Data Scientist
      KNIME
      2 years ago
      45 Mins
      Demonstration
      Beginner

      Hi! My name is Emil and I am a Teacher Bot. I was built to answer your initial questions about using KNIME Analytics Platform. Well, actually, I was built to point you to the right training materials to answer your questions about KNIME.

      Puzzling together all the pieces to implement me wasn't that difficult. All you need are:

      • A user interface - web or speech based - for you to ask questions
      • A text parser for me to understand
      • A brain to find the right training materials to answer your question
      • A user interface to send you the answer
      • A feedback option - nice to have but not a must - on whether my answer was of any help

      The most complex part was, of course, my brain. Building my brain required: a clear definition of the problem, a labeled data set, a class ontology, and finally the training of a machine learning model. The labeled data set in particular was lacking. So, we relied on active learning to incrementally make my brain smarter over time. Some parts of the final architecture, such as understanding and resource searching, were deployed as microservices.

    • Liked Atin Ghosh
      Atin Ghosh - AR-MDN - Associative and Recurrent Mixture Density Network for e-Retail Demand Forecasting

      45 Mins
      Case Study
      Intermediate

      Accurate demand forecasts can help on-line retail organizations better plan their supply-chain processes. The challenge, however, is the large number of associative factors that result in large, non-stationary shifts in demand, which traditional time series and regression approaches fail to model. In this paper, we propose a Neural Network architecture called AR-MDN that simultaneously models associative factors, time-series trends and the variance in the demand. We first identify several causal features and use a combination of feature embeddings, MLP and LSTM to represent them. We then model the output density as a learned mixture of Gaussian distributions. The AR-MDN can be trained end-to-end without the need for additional supervision. We experiment on a dataset of a year’s worth of data over tens of thousands of products from Flipkart. The proposed architecture yields a significant improvement in forecasting accuracy when compared with existing alternatives.

    • Liked Anand Chitipothu
      Anand Chitipothu - DevOps for Data Science: Experiences from building a cloud-based data science platform

      Co-Founder
      Pipal Academy
      2 years ago
      45 Mins
      Case Study
      Beginner

      Productionizing data science applications is non-trivial. Non-optimal practices, the people-heavy nature of traditional approaches, and developers' love of complex solutions for the sake of using cool technologies make the situation even worse.

      There are two key ingredients required to streamline this: “the cloud” and “the right level of devops abstractions”.

      In this talk, I’ll share the experiences of building a cloud-based platform for streamlining data science and how such solutions can greatly simplify building and deploying data science and machine learning applications.

    • Liked Dr. Ravi Vijayaraghavan
      Dr. Ravi Vijayaraghavan / Dr. Sidharth Kumar - Analytics and Science for Customer Experience and Growth in E-commerce

      20 Mins
      Experience Report
      Advanced

      In our talk, we will cover the broad areas where Flipkart leverages Analytics and Sciences to drive both human and machine-driven decisions. We will go deeper into one use case related to pricing in e-commerce.

    • Liked Janakiram MSV
      Janakiram MSV - Accelerate Machine Learning Adoption with AutoML

      20 Mins
      Demonstration
      Beginner

      One emerging trend that's going to fundamentally change the face of ML is AutoML. It enables business analysts and developers to evolve machine learning models that can address complex scenarios. From platform companies such as Google and Microsoft to early-stage startups, AutoML is fast gaining traction. This session demonstrates how AutoML accelerates building machine learning models.