Data is a Soft Science
The perception of data science, and often the way it is taught, is like this: You have some nice, tidy data, you use the latest, coolest algorithm, and you get some super clever results. You know it’s good ’cause your R-squared value is through the roof, and you could play checkers on your confusion matrix.
But the reality is different. That nice, tidy dataset has to be wrangled out of a big, nasty production system that was built by a coffee-fueled maniac. Those cool results have to somehow be translated into a user interface in which they’re the 12th most important thing on the page, and you have to fight for every pixel. And in front of that production system, entering that data, clicking on that user interface, is a data scientist’s worst nightmare: People.
As much as we might want to believe that data science is a pure “hard” science, about writing Greek letters on chalkboards and stroking our chins, the truth is that what we do is more usefully thought of as a social science. Data science is a lens for understanding human behaviour. It is a tool for communicating with people. Data is a soft science.
This talk is about how my background in Social Anthropology gave me a unique approach to doing data science. I’ll show how taking this view of data science led to some cool discoveries in some interesting projects. And I’ll talk about how, building accounting software at Xero, we’ve started on the journey towards building a “smarter” application. As we’ve done this, the hardest problems have not been about technical implementation; they’ve been about understanding the interface between these technologies and our users. Our data science problems at Xero, it turns out, are mostly about how to understand humans.
Outline/Structure of the Talk
Section One: What can Social Anthropology teach us about data analysis?
- A story from one of anthropology's founding myths
- From summary stats to an ethnography of data
Section Two: What does it look like to use data as a lens on human behaviour?
- A simple example: Dog breeds and their associated names
- A practical example: Business contacts
Section Three: Researching user relationships with Machine Learning products at Xero
- How much do users trust us? (And how much should they trust us?)
- What is the mental model users employ to understand these products?
- How good is good enough? (And what does "good" even mean?)
Attendees will come away with a richer understanding of the ways they can use data to build products and support decision-making. They will gain a greater appreciation of data’s potential to give insight into human behaviour, and of the risks of facile data gathering compared with human-centred research.
Data Scientists, Product Owners, Designers, and User Experience Researchers