How Much Data Do You _Really_ Need for Deep Learning?
A common assumption is that we need significant amounts of data in order to do deep learning. Many companies wanting to adopt AI find themselves stuck in the “data gathering” phase and, as a result, delay using AI to gain a competitive advantage in their business. But how much data is enough? Can we get by with less?
In this talk we will explore how our results change when we use different amounts of data to train a classification model. It is actually possible to get by with much less data than we might expect. We will discuss why this might be so, in which particular areas this applies, and how we can use these ideas to improve how we train and deploy our models and engage end users with them.
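As a toy illustration of the kind of experiment described above (training the same classifier on different amounts of data and comparing the results), here is a minimal Python sketch. It is not the speaker's setup: the synthetic two-class data, the nearest-centroid classifier, and all function names (`make_points`, `fit_centroids`, `learning_curve`) are assumptions chosen so the example runs with the standard library alone, and no transfer learning is involved.

```python
import random


def make_points(n_per_class, rng):
    """Draw labelled 2-D points from two synthetic Gaussian blobs (toy data, assumed)."""
    centres = {0: (-1.0, -1.0), 1: (1.0, 1.0)}
    data = []
    for label, (cx, cy) in centres.items():
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    return data


def fit_centroids(train):
    """Nearest-centroid 'model': the mean point of each class."""
    sums = {}
    for (x, y), label in train:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def predict(centroids, point):
    """Return the label of the closest class centroid."""
    x, y = point
    return min(
        centroids,
        key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
    )


def accuracy(centroids, test):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(predict(centroids, p) == label for p, label in test)
    return hits / len(test)


def learning_curve(sizes_per_class, seed=0):
    """Train on increasing amounts of data and score each model on one fixed test set."""
    rng = random.Random(seed)
    test = make_points(500, rng)
    curve = []
    for n in sizes_per_class:
        train = make_points(n, rng)
        curve.append((2 * n, accuracy(fit_centroids(train), test)))
    return curve


if __name__ == "__main__":
    for n_train, acc in learning_curve([2, 10, 50, 250]):
        print(f"{n_train:4d} training examples -> test accuracy {acc:.3f}")
```

On a real problem, the same loop would swap in the actual dataset and model; the point of the sketch is only the shape of the experiment, with one fixed test set and a sweep over training set sizes.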
Outline/Structure of the Talk
- Show of hands: How much data is needed to obtain a certain result?
- Provide surprising answer!
- Transfer learning
- What is it?
- When can we use it?
- Does this work only for transfer learning?
- How is it related to the loss function?
- What does it tell us about how to think of the “confidence” of ML predictions?
- Common misconceptions of “confidence”
- Wrap up with possible solutions to the questions raised, and open areas of investigation.
- By using transfer learning, we can specialise models with surprisingly small amounts of data. Implications:
- Great news for adopting AI
- Building “always-learning” workflows is wise
- Perils of thinking about “confidence” in ML predictions
- State-of-the-art techniques for ML deployment pipelines
Leaders, Managers, Data Scientists, Programmers.
Prerequisites for Attendees
Submitted 1 month ago