Tick Tock: What the heck is time-series data?
The rise of IoT and smart infrastructure has led to the generation of massive amounts of complex data. In this session, we will talk about time-series data and the challenges of working with it, ingest a sample dataset of NYC cab trips, and run real-time queries to gather insights. By the end of the session, we will understand what time-series data is, how to build streaming data pipelines for massive time-series data using Flink, Kafka and CrateDB, and how to visualise all this data with the help of a dashboard.
Outline/Structure of the Talk
High-level outline of the topics that will be covered in this presentation:
1. Growth of IoT and Sensor Data
2. Time-series data
3. Challenges that are posed by large volumes of time-series data
4. Showcasing and overcoming the problem: A case-study
5. Demo time: Creating a highly available data pipeline with Kafka, Flink and CrateDB, and visualising the results with Grafana. We will be ingesting ~4 million records of the NYC cab data.
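To give a flavour of the kind of question a time-series pipeline answers, here is a minimal Python sketch that groups a few cab trips into hourly buckets. It is only an illustration under stated assumptions: the sample records, the field names (`pickup_time`, `passengers`, `fare`) and the `hourly_summary` helper are hypothetical, are not the real NYC cab schema, and are not the Kafka, Flink and CrateDB setup used in the demo.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample records; the field names are illustrative only
# and do not reflect the real NYC cab dataset schema.
TRIPS = [
    {"pickup_time": "2019-01-01T00:15:00", "passengers": 1, "fare": 7.5},
    {"pickup_time": "2019-01-01T00:45:00", "passengers": 2, "fare": 12.0},
    {"pickup_time": "2019-01-01T01:05:00", "passengers": 1, "fare": 5.5},
]


def hourly_summary(trips):
    """Group trips by pickup hour; return trip count and total fare per hour."""
    buckets = defaultdict(lambda: {"trips": 0, "total_fare": 0.0})
    for trip in trips:
        ts = datetime.fromisoformat(trip["pickup_time"])
        # Truncate the timestamp to the start of its hour to form the bucket key.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour]["trips"] += 1
        buckets[hour]["total_fare"] += trip["fare"]
    # Return the buckets in chronological order.
    return dict(sorted(buckets.items()))


if __name__ == "__main__":
    for hour, stats in hourly_summary(TRIPS).items():
        print(hour.isoformat(), stats["trips"], stats["total_fare"])
```

Holding every record in memory like this works for a handful of rows. With millions of records arriving continuously, the same bucketing would need to happen in a streaming layer and a database built for time-series workloads, which is the problem the demo addresses.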
Learning Outcome
By the end of this session, we will be able to set up a highly scalable data pipeline for complex time-series data with real-time query performance.
Target Audience
Developers, Managers, IoT Specialists
Prerequisites for Attendees
Some knowledge of databases, data pipelines and containers will help attendees follow along and make the most of this talk.
Links
Here are some highlights of my previous work:
- Talk on 'Multiplayer Games with WebXR': https://www.youtube.com/watch?v=-UCfNmB8618
- Talk on 'Learning WebVR': https://www.youtube.com/watch?v=vTntqtIh3mM
- My interview at DevFestNN: https://www.linkedin.com/feed/update/urn:li:activity:6491638347432230913/
- My books: https://www.amazon.com/Tanay-Pant/e/B015R1B9RE
- Some other misc stuff: https://devdiner.com/tag/tanay-pant
Submitted 10 months ago
Public Feedback
Hello Tanay, thanks for the submission. Do you also plan to compare and contrast different architectures and explain why the recommended combination of Kafka, Flink and CrateDB works as opposed to other options that could be possible?
Hi Akshay, yes, I plan to compare different databases and architectures in the talk and present a case study covering benchmarks and related use cases.
Great, thanks for the clarification.