Five Key Pitfalls in Data Analysis

Data Science is about deriving actionable insights from data analysis, and such insights have tremendous business value.
But what if:
  • Some crucial data has been left out of consideration?
  • Wrong inferences have been drawn during analysis?
  • Results have been graphically misrepresented?
Imagine the adverse impact on your business if you make wrong decisions in such cases.

In this talk we will discuss the following five key pitfalls to look out for in data analysis results before you make any decisions based on them:
1. Selection Bias
2. Survivor Bias
3. Confounding Effects
4. Equating Correlation to Causation
5. Misleading Visualizations

These are some of the points most commonly overlooked by beginners in Data Science.

The talk will draw upon many examples from real-life situations to illustrate these points.
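
As a rough illustration of two of the listed pitfalls (confounding effects and equating correlation to causation), the following minimal sketch simulates a hidden variable that drives two otherwise unrelated quantities. The data is synthetic, and the variable names `z`, `x`, `y` and the helper functions are assumptions made for this example; they are not taken from the talk.

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

def residuals(target, predictor):
    """Residuals of a simple least-squares fit of target on predictor."""
    n = len(target)
    mean_t, mean_p = sum(target) / n, sum(predictor) / n
    slope = (sum((p - mean_p) * (t - mean_t) for p, t in zip(predictor, target))
             / sum((p - mean_p) ** 2 for p in predictor))
    return [(t - mean_t) - slope * (p - mean_p) for t, p in zip(target, predictor)]

# Synthetic data (hypothetical): z is the confounder, x and y both depend
# on z but have no direct effect on each other.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

# Raw correlation looks substantial even though x does not cause y.
raw_r = pearson(x, y)

# Partial correlation: correlate what is left of x and y after removing
# the linear effect of z. It should be close to zero.
partial_r = pearson(residuals(x, z), residuals(y, z))

print(f"raw correlation:     {raw_r:.3f}")
print(f"partial correlation: {partial_r:.3f}")
```

Reading the raw correlation alone would suggest a link between `x` and `y`; once the confounder is accounted for, the association largely disappears. In real data the confounder is often unmeasured, which is what makes this pitfall easy to miss.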

 
 

Outline/Structure of the Talk

  • Goal of Data Science (1 min.)
  • Data Wrangling - Absolutely Necessary but Not Sufficient (2 mins.)
  • Issues Not Addressed by Data Wrangling (2 mins.)
  • Selection Bias (5 mins.)
  • Survivor Bias (5 mins.)
  • Confounding Effects (5 mins.)
  • Equating Correlation to Causation (5 mins.)
  • Misleading Visualizations (5 mins.)
  • Conclusions (5 mins.)
  • Q & A (5 mins.)

[Please Note: The slides for this talk are under preparation and will be shared with reviewers as soon as possible. Meanwhile, I have populated the Slide section with a link to my presentation at the Agile India conference to give reviewers an idea of how I normally organize my slides. The Video section has a link to a free preview video of my course "What is Data Science?" hosted on Udemy. I don't have a video for the topic I am proposing here.]

Learning Outcome

  1. A beginner will be able to avoid the pitfalls discussed in this talk and produce more reliable data analysis results.
  2. By understanding these pitfalls, a decision maker will be able to constructively question the analyst's findings and avoid making wrong and costly decisions.

Target Audience

Data Science beginners; people who make data-driven decisions.

Prerequisites for Attendees

No Prerequisites.

Submitted 1 week ago


  • 20 Mins
    Demonstration
    Advanced

    In this digital era, when customers' attention spans are shrinking drastically, it is imperative for marketers to understand the following four aspects, popularly known as "The 4R's of Marketing", if they want to increase their ROI:

    - Right Person

    - Right Time

    - Right Content

    - Right Channel

    Only when we design and send our campaigns so that they reach the right customers at the right time through the right channel, telling them about things they like or are interested in, can we expect higher conversions with lower investment. This is a problem that most organizations need to solve to stay relevant in this age of high market competition.

    Among all these, we will put special focus on generating appropriate content for a targeted user base using Markov-based models, and do a quick hack session.

    The time breakdown can be:

    5 mins: Difference between Martech and traditional marketing. The 4R's of marketing and why solving for them is crucial

    5 mins: What Smart Segments is and how to solve for it, with a short demo

    5 mins: How marketers use output from Smart Segments to execute targeted campaigns

    5 mins: What STO is, how it can be solved, and the performance uplift seen by clients when they use it

    5 mins: What Channel Optimization is, how it can be solved, and the performance uplift seen by clients when they use it

    5 mins: Why sending the right message to customers is crucial, and an introduction to appropriate content creation

    15 mins: Covering different text generation nuances, and a live demo with a walkthrough of a toy code implementation
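
To give a feel for what a toy Markov-based text generator can look like, here is a minimal sketch of a word-level Markov chain. It is not the speaker's implementation: the function names `build_chain` and `generate`, the `order` parameter, and the tiny corpus of campaign-style phrases are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)

def generate(chain, length=10, seed=None):
    """Random-walk the chain to produce up to `length` words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted for reproducibility with a seed
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was never followed by anything
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)

# Hypothetical mini-corpus of marketing copy, invented for this example.
corpus = (
    "free shipping on all orders today "
    "free returns on all orders this week "
    "new arrivals on sale today"
)

chain = build_chain(corpus, order=1)
sample = generate(chain, length=8, seed=42)
print(sample)
```

Repeated followers are kept in the lists on purpose, so that more frequent transitions are sampled more often. A larger `order` tends to produce more fluent but less varied text; in a targeting setting one might, as an assumption, train a separate chain on the copy that performed well for each customer segment.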