Building & Operating Autonomous Data Streams

The world we live in today is fed by data. From self-driving cars and route planning to fraud prevention, content and network recommendations, and ranking and bidding, today's systems not only consume low-latency data streams but also adapt to the changing conditions modeled by that data.


While software engineering has settled on best practices for developing and managing both stateless service architectures and database systems, the larger world of data infrastructure still presents a greenfield opportunity. To thrive, this field borrows from several disciplines: distributed systems, database systems, operating systems, control systems, and software engineering, to name a few.


Of particular interest to me is the subfield of data streams, specifically how to build high-fidelity nearline data streams as a service within a lean team. For such systems, relying on human operators is a non-starter: all aspects of operating streaming data pipelines must be automated. Come to this talk to learn how to build such a system, soup to nuts.


Outline/Structure of the Keynote

  • Architectural Best Practices
    • The 3 planes
      • Management
      • Control
      • Data   
  • Management Plane
    • Intent Capture
    • Customer Insights
  • Control Plane Concerns
    • Building an Intelligent Control Plane
      • Self-Healer
        • Autoscaling (1-level, 2-level)
        • Self-Organization
      • Data Pipe Provisioner
      • Deployer with Automated Rollback 
      • Observability
        • Lag
        • Loss
  • Data Plane Concerns
    • Performance tuning
    • Serializability -- Transaction Boundaries
    • Design Concerns
  • Development Practices & Culture
    • Team Composition
    • Team Culture - Ethos