Taming the Beast: Automated Testing for Complex Data Pipelines

May 6th, 04:15 - 04:45 PM | Green Room | 61 Interested

Massive datasets. Complex data pipelines. Machine learning. When faced with such a beast, how do you test it effectively? When your test results are less "pass" and "fail", and more "sort of" and "not really", how do you automate testing?

Trish Khoo draws upon her experience in testing complex data systems to demonstrate proven strategies for testing in this field. Her experience working on ultra-large-scale systems at Google in Mountain View, California, shaped her technical approach to testing, which she applies in her work as a consultant today.

 
 

Outline/Structure of the Talk

Trish will first describe examples of the types of systems she has worked on and how the solution was applied to them. Then she will go through a detailed technical example of the solution in action, as applied to a fictional system (as all the real examples are restricted by NDA). Lastly, she will summarise and propose future solutions.

Learning Outcome

The audience will learn a proven approach to automated testing for systems with complex data pipelines, large datasets and machine learning.

Target Audience

Technical folks working on complex data systems

Prerequisites for Attendees

n/a

Submitted 4 months ago