Lake, Swamp or Puddle: Data Quality at Scale

Sep 7th, 03:20 - 04:05 PM, The Wave 9/F

Data is a powerful tool. Data-driven systems that leverage modern analytical and predictive techniques can offer significant improvements over static or heuristic-driven systems.

The questions are:

  • How much can you trust your data? Data collection, processing and aggregation are challenging tasks.
  • How do we build confidence in our data? Where did the data come from?
  • How was it generated? What checks have been, or should be, applied?
  • What is affected when it all goes wrong?

This talk looks at the mechanics of maintaining data quality at scale. It first looks at bad data: what it is and where it comes from. It then dives into the techniques required to detect, avoid and ultimately deal with bad data. By the end of the talk, the audience should come away with an idea of how to design quality data-driven systems that build confidence and trust rather than inflate expectations.

Target Audience

Data analysts, software engineers, architects, technical leaders and anyone with an interest in data quality.

Submitted 2 years ago
