Stream All the Things!!

May 14th, 09:00 - 09:50 AM, Wesley Theatre

Streaming data architectures aren't just "faster" Big Data architectures. They must be reliable and scalable as never before, more like microservice architectures.

This talk has three goals:

  1. Justify the transition from batch-oriented big data to stream-oriented fast data.
  2. Explain the requirements that streaming architectures must meet and the tools and techniques used to meet them.
  3. Discuss the ways that fast data and microservice architectures are converging.

Big data started with an emphasis on batch-oriented architectures, where data is captured in large, scalable stores, then processed using batch jobs. To reduce the gap between data arrival and information extraction, these architectures are now evolving to be stream oriented, where data is processed as it arrives. Fast data is the new buzzword.
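To make the batch versus stream distinction concrete, here is a minimal sketch in plain Python. It is not taken from the talk, and the record shape and function names are assumptions chosen for illustration. The same per-key total is computed once over stored data (batch) and incrementally as each event arrives (stream).

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Event = Tuple[str, int]  # (key, value); a hypothetical record shape


def batch_totals(stored: List[Event]) -> Dict[str, int]:
    """Batch style: all events are already stored; answer once at the end."""
    totals: Dict[str, int] = {}
    for key, value in stored:
        totals[key] = totals.get(key, 0) + value
    return totals


def stream_totals(events: Iterable[Event]) -> Iterator[Dict[str, int]]:
    """Stream style: update state per event and emit a fresh answer each time."""
    totals: Dict[str, int] = {}
    for key, value in events:
        totals[key] = totals.get(key, 0) + value
        yield dict(totals)  # snapshot available as soon as the event arrives


events = [("a", 1), ("b", 2), ("a", 3)]
batch_result = batch_totals(events)
stream_results = list(stream_totals(iter(events)))
```

Both versions end with the same totals. The streaming version also exposes an up-to-date answer after every event, which is the reduced gap between arrival and extraction described above.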

These architectures introduce new challenges for developers. Whereas a batch job might run for hours, a stream processing system typically runs for weeks or months, which raises the bar for reliability and scalability: over that span, the system must handle any contingency.

The microservice world has faced this challenge for a while. Microservices are inherently message driven, responding to requests for service and sending messages to other microservices, in turn. Hence, they are also stream oriented, in the sense that they must respond reliably to never-ending input. So, they offer guidance for how to build reliable streaming data systems. I'll discuss how these architectures are merging in other ways, too.
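As one illustration of responding reliably to never-ending input, the sketch below shows a consumer loop that retries a failing message a few times and then sets it aside, so a single bad message cannot stop the stream. This retry-then-park approach is an assumption picked for illustration, not necessarily a technique covered in the talk. The names `run_consumer`, `parse_upper`, and `max_attempts` are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def run_consumer(
    messages: Iterable[str],
    handle: Callable[[str], str],
    max_attempts: int = 3,
) -> Tuple[List[str], List[str]]:
    """Process each message; retry failures, then park the message and continue."""
    results: List[str] = []
    dead_letters: List[str] = []
    for message in messages:  # in production this iterable would never end
        for attempt in range(1, max_attempts + 1):
            try:
                results.append(handle(message))
                break
            except Exception:
                if attempt == max_attempts:
                    dead_letters.append(message)  # give up on this message only
    return results, dead_letters


def parse_upper(message: str) -> str:
    """Hypothetical handler that rejects empty messages."""
    if not message:
        raise ValueError("empty message")
    return message.upper()


results, dead_letters = run_consumer(["ok", "", "fine"], parse_upper)
```

The point of the design is that a failure is scoped to one message: the loop keeps serving later input, and the parked messages can be inspected separately.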

We'll also discuss how to pick streaming technologies based on four axes of concern:

  • Low latency: What's my time budget for handling this data?
  • High volume: How much data per unit time must I handle?
  • Data processing: Do I need machine learning, SQL queries, conventional ETL processing, etc.?
  • Integration with other tools: Which ones and how is data exchanged between them?

We'll consider specific examples of streaming tools and how they fit on these axes, including Spark, Flink, Akka Streams, and Kafka.
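The four axes can be written down as an explicit requirements record and checked against candidate profiles. The sketch below is only one way to frame that comparison. The field names and the `shortlist` function are assumptions, and the candidates are placeholders with made-up numbers, not measurements of Spark, Flink, Akka Streams, or Kafka.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Requirements:
    max_latency_ms: float          # low latency: time budget per record
    min_events_per_sec: int        # high volume: required throughput
    processing: FrozenSet[str]     # data processing: e.g. {"sql", "ml", "etl"}
    integrations: FrozenSet[str]   # integration: systems to exchange data with


@dataclass(frozen=True)
class Candidate:
    name: str
    typical_latency_ms: float
    events_per_sec: int
    processing: FrozenSet[str]
    integrations: FrozenSet[str]


def shortlist(req: Requirements, candidates: List[Candidate]) -> List[str]:
    """Keep candidates that satisfy all four axes."""
    return [
        c.name
        for c in candidates
        if c.typical_latency_ms <= req.max_latency_ms
        and c.events_per_sec >= req.min_events_per_sec
        and req.processing <= c.processing        # subset check
        and req.integrations <= c.integrations    # subset check
    ]


# Placeholder profiles with made-up numbers; not measurements of any real tool.
candidates = [
    Candidate("engine_a", 500.0, 1_000_000,
              frozenset({"sql", "ml", "etl"}), frozenset({"queue"})),
    Candidate("engine_b", 5.0, 50_000,
              frozenset({"etl"}), frozenset({"queue", "http"})),
]
req = Requirements(10.0, 10_000, frozenset({"etl"}), frozenset({"queue"}))
chosen = shortlist(req, candidates)
```

Here the tight latency budget rules out the high-throughput placeholder, which shows how the axes trade off against one another.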


Target Audience

All attendees

Submitted 1 year ago
