Machine data: how to handle it better?

The rise of IoT and smart infrastructure has led to the generation of massive amounts of complex data. Traditional solutions struggle to cope with this shift, leading to a decrease in performance and an increase in cost. In this session, I will talk about time-series data and machine data, the challenges of working with this kind of data, ingesting it using a dataset from NYC cabs, and running real-time queries to visualise the data and gather insights. By the end of this session, you will be able to set up a highly scalable data pipeline for complex time-series data with real-time query performance.

 

Outline/Structure of the Talk

High-level outline of the topics that will be covered in this presentation:

1. Growth of IoT and Sensor Data

2. Time-series data

3. Challenges posed by large volumes of time-series data

4. Showcasing and overcoming the problem: A case-study

5. Demo time: Geospatial queries on machine data, 2017 NYC cab data and visualisation on Grafana

Learning Outcome

By the end of this session, you will be able to set up a highly scalable data pipeline for complex time-series data with real-time query performance.

Target Audience

Developers, Managers, IoT Enthusiasts

Prerequisites for Attendees

Some knowledge of databases, data pipelines and containers will help attendees follow along and make the most of this talk.

Submitted 4 months ago

Public Feedback

  • Kuldeep Jiwani  ~  2 months ago

    Hi Tanay,

    IoT is an interesting space for ML enthusiasts, good that you are focusing on it.

    Just to understand more about your talk: will it focus more on the ETL and data query/visualisation parts of IoT data, or will it also cover sensor event stream / time-series analysis of the data and showcase the use of ML techniques on it?

    • Tanay Pant  ~  2 months ago

      Hi Kuldeep,

      While the application of ML techniques is out of scope for this talk, it will definitely cover sensor stream series and time-series analysis of a massive amount of data while remaining highly available. It will also cover some aspects of visualising the data.

      • Kuldeep Jiwani  ~  2 months ago

        Thanks for the info, sounds good.


  • Liked Dipanjan Sarkar

    Dipanjan Sarkar - Explainable Artificial Intelligence - Demystifying the Hype

    Dipanjan Sarkar
    Data Scientist
    Red Hat
    5 months ago
    Sold Out!
    45 Mins
    Tutorial
    Intermediate

    The field of Artificial Intelligence powered by Machine Learning and Deep Learning has gone through some phenomenal changes over the last decade. Starting off as a purely academic and research-oriented domain, it has seen widespread industry adoption across diverse domains including retail, technology, healthcare, science and many more. More often than not, the standard toolbox of machine learning, statistical or deep learning models remains the same. New models like Capsule Networks do come into existence, but industry adoption of them usually takes several years. Hence, in the industry, the main focus of data science or machine learning is more ‘applied’ than theoretical, and effective application of these models on the right data to solve complex real-world problems is of paramount importance.

    A machine learning or deep learning model by itself consists of an algorithm which tries to learn latent patterns and relationships from data without hard-coding fixed rules. Hence, explaining how a model works to the business always poses its own set of challenges. In some domains in the industry, especially in the world of finance such as insurance or banking, data scientists often end up having to use more traditional machine learning models (linear or tree-based), because model interpretability is very important for the business to explain each and every decision taken by the model. However, this often leads to a sacrifice in performance. Complex models like ensembles and neural networks typically give us better and more accurate performance (since true relationships are rarely linear in nature), but we end up being unable to properly interpret model decisions.

    To address these gaps, I will take a conceptual yet hands-on approach in which we will explore in depth some of the challenges of explainable artificial intelligence (XAI) and human-interpretable machine learning, and showcase some examples using state-of-the-art model interpretation frameworks in Python!

  • Liked Anant Jain

    Anant Jain - Adversarial Attacks on Neural Networks

    Anant Jain
    Co-Founder
    Compose Labs, Inc.
    4 months ago
    Sold Out!
    45 Mins
    Talk
    Intermediate

    Since 2014, adversarial examples in Deep Neural Networks have come a long way. This talk aims to be a comprehensive introduction to adversarial attacks, covering various threat models (black box/white box) and approaches to creating adversarial examples, and will include demos. The talk will dive deep into the intuition behind why adversarial examples exhibit the properties they do, in particular transferability across models and training data, as well as high confidence of incorrect labels. Finally, we will go over various approaches to mitigate these attacks (Adversarial Training, Defensive Distillation, Gradient Masking, etc.) and discuss what seems to have worked best over the past year.

  • Liked Dat Tran

    Dat Tran - Image ATM - Image Classification for Everyone

    Dat Tran
    Head of AI
    Axel Springer AI
    4 months ago
    Sold Out!
    45 Mins
    Talk
    Intermediate

    At idealo.de we store and display millions of images. Our gallery contains pictures of all sorts: there you'll find vacuum cleaners and bike helmets as well as hotel rooms. Working with a huge volume of images brings some challenges: How to organize the galleries? What exactly is in there? Do we actually need all of it?

    To tackle these problems you first need to label all the pictures. In 2018 our Data Science team completed four projects in the area of image classification. In 2019 there were many more to come. Therefore, we decided to automate this process by creating software we called Image ATM (Automated Tagging Machine). With the help of transfer learning, Image ATM enables the user to train a Deep Learning model without knowledge of or experience in Machine Learning. All you need is data and a spare couple of minutes!

    In this talk we will discuss the state-of-the-art technologies available for image classification and present Image ATM in the context of these technologies. We will then give a crash course on our product, guiding you through different ways of using it: in the shell, in a Jupyter Notebook and in the cloud. We will also talk about our roadmap for Image ATM.

  • Liked Rahee Walambe

    Rahee Walambe / Vishal Gokhale - Processing Sequential Data using RNNs

    Rahee Walambe
    Research and Teaching Faculty
    Symbiosis Institute of Technology
    Vishal Gokhale
    Sr. Consultant
    XNSIO
    2 months ago
    Sold Out!
    480 Mins
    Workshop
    Beginner

    Data that forms the basis of many of our daily activities, such as speech, text and video, has sequential/temporal dependencies. Traditional deep learning models, being inadequate to model this connectivity, needed to be made recurrent before they could bring technologies such as voice assistants (Alexa, Siri) or video-based speech translation (Google Translate) to a practically usable form by significantly reducing the Word Error Rate (WER). RNNs solve this problem by adding internal memory. This addition bolsters the capacities of traditional neural networks, and the results outperform conventional ML techniques wherever the temporal dynamics are more important.
    In this full-day immersive workshop, participants will develop an intuition for sequence models through hands-on learning along with the mathematical premise of RNNs.

  • Liked Rahee Walambe

    Rahee Walambe / Aditya Sonavane - Can AI replace Traditional Control Algorithms?

    45 Mins
    Case Study
    Beginner

    As technology progresses, control tasks are getting increasingly complex. Employing targeted algorithms for such control tasks and manually tuning them by trial and error (as in the case of PID) is a cumbersome and lengthy process. Additionally, methods such as PID are designed for linear systems, whereas all real-world control tasks are inherently non-linear in nature. With such complex tasks, using conventional linear control methods approximates the nonlinear system with a linear model, and as a result the required performance is difficult to achieve.

    New advances in the field of AI have presented us with techniques which may help replace the traditional control algorithms. Use of AI may allow us to achieve a higher quality of control on the nonlinear process with minimal human interaction, thus eliminating the need for a skilled person to perform the menial task of tuning control algorithms by trial and error.

    Here we consider a simple case study of a beam balancer, where the controller balances a beam on a pivot to stabilize a ball at the center of the beam. We aim to implement a Reinforcement Learning based controller as an alternative to PID. We analyze the quality and compare the performance of a PID-based controller vs. an RL-based controller to better understand their suitability for real-world control tasks.