From Zero to TensorFlow: Building an Analytics Dept.
Day 1: one engineer vs. a heap of time-series data on a 1990s-era database.
Four years on, there are eight of us, and we run TensorFlow analytics on a Hadoop cluster to detect subtle signs of potential breakdowns in earthmoving equipment. We have prevented million-dollar component failures and eliminated many "parasite" stoppages.
This talk details the strategy and lessons learned from building an analytics department from scratch, in particular:
- Many analytics depts. were created as a "flavour of the month". How do you deal with this perception, survive, and move beyond it?
- Choosing the right projects to create a credible and sellable offering as quickly as possible to build your reputation.
- Expectation management and choosing projects: dealing with those who think "it won't work" and those who think you can solve every problem.
- Growing from a "start-up in a large company" to a more mature group. Change management, scaling, velocity, etc.
- Approach to R&D and launching new projects, and dealing with the "shiny toys".
Outline/Structure of the Case Study
Background & context: The application differs from the more typical uses of data analytics, such as marketing or finance. The first part gives an overview of earthmoving equipment, the type of data the machinery generates, and the applications of data analytics: RHM (Remote Health Monitoring) and productivity improvement. We also focus on the challenges that are specific to this field.
At the beginning: An overview of what existed when the department started, along with the expectations and requirements. There was more data than we needed, so choosing the right project (from many) was important. Strategic decisions and vision in the early days. Choosing between supervised (look for specific fault patterns) and unsupervised (anomaly detection) methods: the pros and cons of each in terms of deployment speed and quality of results.
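To make the supervised vs. unsupervised distinction concrete, here is a minimal sketch in plain Python. It is not the team's TensorFlow pipeline: the nearest-centroid classifier, the z-score detector, the function names, and every sensor value are assumptions invented for illustration.

```python
from statistics import mean, stdev


def fit_centroids(samples, labels):
    """Supervised: average the feature value of each labelled class."""
    groups = {}
    for x, label in zip(samples, labels):
        groups.setdefault(label, []).append(x)
    return {label: mean(values) for label, values in groups.items()}


def classify(x, centroids):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def fit_baseline(history):
    """Unsupervised: summarise unlabelled history by mean and standard deviation."""
    return mean(history), stdev(history)


def is_anomaly(x, baseline, threshold=3.0):
    """Flag x if it lies more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    return abs(x - mu) > threshold * sigma


# Hypothetical hydraulic pressure readings (arbitrary units), invented for this sketch.
labelled_values = [100.0, 102.0, 98.0, 140.0, 145.0, 138.0]
labelled_tags = ["healthy", "healthy", "healthy", "fault", "fault", "fault"]
centroids = fit_centroids(labelled_values, labelled_tags)

# Unlabelled history from normal operation; no fault examples are needed.
history = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0]
baseline = fit_baseline(history)

for reading in (103.0, 139.0, 60.0):
    print(reading, classify(reading, centroids), is_anomaly(reading, baseline))
```

In this toy setup the classifier recognises the over-pressure fault it was trained on, but it assigns a never-seen low-pressure reading to the nearest known class. The baseline check flags that reading as unusual without labelled faults, although it cannot say what kind of fault it is.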
Growth: from a small and lean "skunk-works" to a more mature group: structuring the workflow and task management. Increasing reliability, making work more efficient, putting safety nets in place, planning... but losing that "start-up" high velocity. As we mature, we plan more, and the more we plan, the more we realise things will take time... and stakeholders get cold feet! Dealing with the deluge of information: the more time you spend building analytics, the more things you find. Who reviews these numerous insights, and who acts on them? Working out which pieces of information generated by your analytics suite are the most important or urgent.
R&D and technical work: As time goes on, how do you keep innovating when ongoing maintenance takes up a sizeable share of the workload? You have processes to keep things in check, but these can also be barriers to innovation.
Choosing the right level of technicality for each project: case studies on how to avoid the "over-engineered analytic" trap. The difference between aiming for a solution and aiming for a better solution.
Stakeholder management: the "pie in the sky" people, the "naysayers", those who wanted it yesterday, those who want you to write an Excel macro for their admin task, those who have no idea what you do but somehow ended up in charge of your team, etc. Most of all, the challenge of showing upper management what you do and the value you bring... in a non-technical manner.
Barriers and communication issues: what happens when a fitter explains to a statistician how a hydraulic pilot circuit check valve fails. How a problem that is always simple in their minds invariably turns out to be more complex.
Picking your projects and solutions: Do you go chasing the "big hitters", the million-dollar failures that only happen once in a blue moon, or do you attack the flurry of small-impact but high-frequency issues?
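One rough way to frame this choice is to compare expected annual cost, that is, frequency multiplied by cost per event. The sketch below assumes that simple model, and all figures and names are hypothetical placeholders rather than numbers from our fleet.

```python
def expected_annual_cost(events_per_year, cost_per_event):
    """Expected yearly cost of a failure mode: frequency times cost per event."""
    return events_per_year * cost_per_event


# Hypothetical figures, invented purely for illustration.
failure_modes = {
    # A rare "big hitter": one event every five years.
    "major component failure": expected_annual_cost(0.2, 1_000_000),
    # A frequent nuisance stoppage: cheap per event, but it happens constantly.
    "nuisance stoppage": expected_annual_cost(150, 2_000),
}

# Rank failure modes from most to least costly per year.
ranking = sorted(failure_modes, key=failure_modes.get, reverse=True)
for name in ranking:
    print(f"{name}: {failure_modes[name]:,.0f} per year")
```

With these made-up numbers the frequent small issue costs more per year than the rare large one. Real decisions also depend on how detectable each failure is and how much effort the analytic takes to build, which this model ignores.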
Identifying and working out the tradeoffs between the "high value but narrow focus" projects and the "wide-reaching standardised solutions" that the whole business can benefit from. Finding economies of scale in your stakeholders' requests: exchanging their request for a "simple and narrow-scope" analytic for one that offers economies of scale... which also turns it into a larger and longer project!
Change management: IT moving to a big data platform. Transitioning from R to Python (and my two cents on the controversial "language wars" debate).
What next? Technologies and directions in the works: edge and IoT analytics onboard the earthmoving machinery, moving to streaming analytics, and use of the insights.
Lessons learned in initiating and growing an analytics team in an unconventional application.
A strategy to demonstrate value to the business quickly when data is ready to use.
An R&D strategy that worked for us
Lessons in stakeholder management
The talk will include some case studies where knowledge of data analysis techniques would be useful, but it is largely aimed at analytics managers and strategists.