The Best Data Isn’t Data: Why Experiments Are The Future of Data Science
Data is technically the plural of datum, which in Latin is the neuter past participle of dare, which means “to give”. Thus data means “givens”. Indeed, the overwhelming majority of data being analysed out there is given, i.e., the analyst can’t change it. It’s “just there” for you to analyse. You can slice and dice it, model it, act based on it, but you very likely didn’t control even partially the process that gave rise to it. In this talk I’ll try to convince you that, although this given, passive data is important, the real game changer for the future of data science is to combine it with the best possible data, which is not given at all: active data that arises as the outcome of carefully designed experiments. Just as experiments have propelled science to realms unattainable had it focused exclusively on passive observations, I expect the same to happen with data science. I’ll illustrate my arguments with real-world case studies from Ambiata’s experience in serving our clients.