Introduction to Anomaly Detection in Data

In any classroom, a few students either far outperform their peers or fail to secure even the minimum passing marks. Apart from these few, the marks of the students are usually approximately normally distributed. The exceptional marks can be termed extreme highs and extreme lows respectively. In statistics and related areas such as machine learning, such values are referred to as anomalies or outliers.
The basic idea of anomalies centers on two kinds of values: extremely high ones and extremely low ones. Why, then, are they given so much importance? In this session, we will investigate questions like this. We will see how anomalies are generated, why they are important to consider while developing machine learning models, and how they can be detected. We will also do a small case study in Python to solidify our understanding of anomalies.
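
To make the classroom example concrete, here is a minimal sketch of flagging extreme marks. It assumes the interquartile range (IQR) rule, a common heuristic that this proposal does not itself prescribe. The marks, the function name flag_extreme_marks, and the multiplier k=1.5 are all made up for illustration.

```python
import statistics


def flag_extreme_marks(marks, k=1.5):
    """Return the marks lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; k=1.5 is the conventional
    choice for the IQR rule, not a value taken from the talk.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(marks, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Keep the input order so flagged marks can be traced back to students.
    return [m for m in marks if m < lower or m > upper]


# Made-up marks: most cluster in the 60s, with one extreme high and one extreme low.
marks = [62, 65, 58, 70, 66, 61, 64, 68, 63, 67, 99, 12]
print(flag_extreme_marks(marks))
```

On this toy data the rule flags 99 and 12 and leaves the rest alone. A quartile-based rule is used here because a single very low mark can inflate the standard deviation enough to hide a very high one from a plain mean-and-standard-deviation check.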

Outline/Structure of the Talk

  • A dive into the wild: Anomalies in the real world
  • Find the odd ones out: Anomalies in data
  • Generation of anomalies in data
  • Different types of anomalies
  • How anomalies affect the performance of an ML model
  • Utilizing anomalies in ML models
  • A case study of anomaly detection in Python

Learning Outcome

By the end of this session, attendees will have a good background on anomalies and an idea of the basic techniques for tackling them (along with an introduction to PyOD).

Target Audience

Data Science Enthusiasts, Data Science Practitioners, Machine Learning Beginners

Prerequisites for Attendees

Basic familiarity with Machine Learning would be ideal

Submitted 9 months ago

Public Feedback

  • Anoop Kulkarni
    By Anoop Kulkarni  ~  8 months ago

    Thanks for your proposal. Is this more of a theoretical take on anomalies, or would different real-life datasets be used to showcase these anomalies?

    The latter can be added as part of the proposal, if not already. In either case, an update may be required.


    • Sayak Paul
      By Sayak Paul  ~  8 months ago

      Thanks for your comment, Anoop. The very first section of the proposal discusses real-life examples, hence its name: "A dive into the wild: Anomalies in the real world".

      The talk is going to cover both the theoretical aspects and how to approach anomalies programmatically, the latter in the section "A case study of anomaly detection in Python". I would request you to take a quick look at the accompanying blog post that I wrote for FloydHub: