The Art of Effective Visualization of Multi-dimensional Data - A hands-on Approach

Schedule: Sep 1st, 02:00 - 02:45 PM | Place: Jupiter | 107 Interested

Descriptive analytics is one of the core components of any analysis life-cycle, whether for a data science project or for specific research. Data aggregation, summarization and visualization are some of the main pillars supporting this area of data analysis. However, dealing with multi-dimensional datasets, which typically have more than two attributes, starts causing problems, since our medium of data analysis and communication is typically restricted to two dimensions. We will explore some effective strategies for visualizing data in multiple dimensions (ranging from 1-D up to 6-D) using a hands-on approach with Python and popular open-source visualization libraries like matplotlib and seaborn. If time permits, we will also briefly cover excellent R visualization libraries like ggplot.
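One common way to go beyond two dimensions is to map each extra attribute to its own visual channel. The following is a minimal sketch and not the talk's own code: the channel order (x, y, z, color, size, marker), the `encode_rows` and `plot_encoded` helpers, and the wine-like sample records are all assumptions made for illustration. The plotting function imports matplotlib lazily so the mapping logic can be read and run on its own.

```python
from typing import Dict, List, Sequence

# Visual channels in the order attributes are assigned to them (an assumed
# convention for this sketch): three positions, then color, size, and marker.
CHANNELS = ("x", "y", "z", "color", "size", "marker")


def encode_rows(rows: Sequence[Dict[str, object]],
                attributes: Sequence[str]) -> Dict[str, List[object]]:
    """Assign up to six attributes to visual channels, one list per channel."""
    if not 1 <= len(attributes) <= len(CHANNELS):
        raise ValueError(f"expected 1 to {len(CHANNELS)} attributes")
    return {
        channel: [row[attr] for row in rows]
        for channel, attr in zip(CHANNELS, attributes)
    }


def plot_encoded(encoded: Dict[str, List[object]], path: str) -> None:
    """Draw the encoded channels as a 3-D scatter plot and save it to `path`.

    Assumes all six channels are present, with numeric x, y, z, and size
    values and categorical color and marker values.
    """
    import matplotlib
    matplotlib.use("Agg")  # render off-screen so no display is needed
    import matplotlib.pyplot as plt

    palette = {}  # category -> color, assigned the first time it is seen
    shapes = {}   # category -> marker, assigned the first time it is seen
    marker_cycle = ["o", "s", "^", "D", "v"]
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    for x, y, z, c, s, m in zip(*(encoded[ch] for ch in CHANNELS)):
        color = palette.setdefault(c, f"C{len(palette) % 10}")
        marker = shapes.setdefault(
            m, marker_cycle[len(shapes) % len(marker_cycle)])
        ax.scatter(x, y, z, color=color, s=s, marker=marker)
    fig.savefig(path)
    plt.close(fig)


# Hypothetical sample records, made up for this example.
rows = [
    {"alcohol": 9.4, "acidity": 7.4, "sugar": 1.9,
     "type": "red", "sulfur": 34, "quality": "low"},
    {"alcohol": 10.1, "acidity": 6.3, "sugar": 1.6,
     "type": "white", "sulfur": 132, "quality": "medium"},
    {"alcohol": 12.8, "acidity": 6.6, "sugar": 1.2,
     "type": "white", "sulfur": 78, "quality": "high"},
]
encoded = encode_rows(
    rows, ["alcohol", "acidity", "sugar", "type", "sulfur", "quality"])
```

With matplotlib installed, `plot_encoded(encoded, "wine_6d.png")` would render the six channels in one figure. Passing fewer attribute names to `encode_rows` fills only the leading channels, which mirrors building a plot up from 1-D one attribute at a time.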

BONUS: We will also look at ways to visualize unstructured data with several dimensions including text, images and audio!

 
 

Outline/Structure of the Tutorial

The talk is usually a 90-minute session, but we will cover it in the scheduled 45-minute slot by focusing on the main aspects of effective data visualization with the grammar of graphics and on popular open-source frameworks in Python. As a bonus, we will also cover visualization of unstructured data, including text, audio and images.

Note: All the code and resources will be shared and open-sourced for your benefit! So you don't need to take extensive notes and can focus on the presentation/talk.

Outline:

  1. Introduction
    • What is Data Visualization?
    • Why Data Visualization?
  2. Motivation
    • Why Effective Data Visualization
  3. Effective Multi-dimensional Data Visualization
    • Whirlwind tour of the grammar of graphics
  4. Visualization tools and frameworks
    • General tools & frameworks
    • Python visualization frameworks
    • R visualization frameworks
  5. Visualizing Structured Data
    • Univariate analysis and visualizations
    • Multivariate analysis and visualizations
    • Visualizing from 1-D up to 6-D
  6. BONUS: Visualizing Unstructured Data
    • Text
    • Images
    • Audio
  7. Final words
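For the multivariate step in item 5 of the outline, a pairwise correlation matrix is a frequent starting point, since it is the table that a correlation heatmap draws. The sketch below is an illustration under assumptions: the `pearson` and `correlation_matrix` helpers and the tiny sample columns are hypothetical and do not come from the talk materials.

```python
from math import sqrt
from typing import Dict, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length numeric columns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("columns must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(
        columns: Dict[str, Sequence[float]]) -> Dict[str, Dict[str, float]]:
    """Correlation of every column with every other column (and itself)."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical sample columns, made up for this example.
columns = {
    "alcohol": [9.4, 9.8, 10.5, 12.0],
    "density": [0.9978, 0.9968, 0.9950, 0.9912],
    "sugar": [1.9, 2.6, 2.3, 1.2],
}
matrix = correlation_matrix(columns)
```

Each row of `matrix` is one row of the heatmap. A plotting library only has to color the cells, so the analysis and the visual encoding stay separate.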

Learning Outcome

  • Take a glance at the major data visualization frameworks
  • Get a clear understanding of univariate and multi-variate visualization
  • Learn effective strategies for visualizing data using the grammar of graphics
  • Get a clear perspective on which visualization techniques work best based on specific scenarios
  • See strategies for visualizing structured and unstructured data through actual examples

Target Audience

Data Enthusiasts, BI Developers, Data Scientists, Data Analysts

Prerequisites for Attendees

Knowledge of Python basics and data visualization techniques is helpful but not essential, since we will cover them during this session.

Submitted 9 months ago

Public Feedback

  • Naresh Jain  ~  9 months ago

    Thank you, DJ. I went through the article and I like how nicely (step-by-step) it shows the power of visualization of multi-dimensional data.

    Can I please request you to share a video from any of your past presentations? This will help the program committee to understand your presentation style.

    • Dipanjan Sarkar  ~  9 months ago

      Thank you, Naresh, for the kind words. Unfortunately, my recent presentations were at company (Intel) conferences or sponsored events, so the videos are not publicly available. However, if needed we can have a call, or I can meet and sync up with you and the team in Bangalore in case you want more details about my presentation style. Feel free to reach out as needed!

      • Naresh Jain  ~  9 months ago
        If possible, can you please record a quick trailer of your talk (1 min) and post it on your proposal?
        • Dipanjan Sarkar  ~  9 months ago

          Hey Naresh, I am traveling out of town this week, so I would have preferred a Skype video call :) but I have recorded a brief intro on my mobile device. Kindly excuse the background noise and choppy video. Hope this works for you. I am more than welcome to get on a call later if needed. Do let me know if you have any issues accessing this video.

          Video link

           

          • Naresh Jain  ~  9 months ago
            Thank you for the prompt response, DJ. This is more than sufficient.
            • Dipanjan Sarkar  ~  9 months ago

              Sure, no problem, Naresh. Thanks! I am also open to refining and improving the structure of the talk based on feedback from you and the team.


  • Jared Lander
    Chief Data Scientist
    Lander Analytics
    8 months ago
    Sold Out!
    480 Mins
    Workshop
    Beginner
    Modern statistics has become almost synonymous with machine learning - a collection of techniques that utilize today's incredible computing power. Jared Lander walks you through the available methods for implementing machine learning algorithms in R and explores underlying theories such as the elastic net and boosted trees.

    • Building the design matrix
    • Penalized regression with the lasso and ridge methods
    • Fitting models with glmnet
    • Interactive visualization of the coefficient path
    • Use cross-validation to choose the optimal lambda
    • Visualize coefficients with coefplot
    • Perform binary classification with a single tree with xgboost
    • Train a boosted tree
    • Tune xgboost hyperparameters
    • Use validation data to understand performance
    • Visualize variable importance
    • Train a boosted random forest with xgboost
  • Dipanjan Sarkar - Human Interpretable Machine Learning: The Need and Importance of Model Interpretation (with hands-on examples)

    Dipanjan Sarkar
    Data Scientist
    Red Hat
    9 months ago
    Sold Out!
    45 Mins
    Talk
    Beginner

    The field of machine learning has gone through some phenomenal changes over the last decade. In industry, the main focus of data science and machine learning is more 'applied' than theoretical, and effective application of these models on the right data to solve complex real-world problems is of paramount importance.

    A machine learning model by itself consists of an algorithm which tries to learn latent patterns and relationships from data without hard-coding fixed rules. Hence, explaining how a model works to the business always poses its own set of challenges. In this talk, I will cover the need for and importance of human-interpretable machine learning approaches, look at effective strategies for model interpretation, and walk through several hands-on examples. Detailed coverage of open-source frameworks for machine learning model interpretation will also be one of the major focus areas. Examples will be showcased in Python.

  • Dipanjan Sarkar - Unleash the Power of Deep Learning with Transfer Learning

    Dipanjan Sarkar
    Data Scientist
    Red Hat
    9 months ago
    Sold Out!
    45 Mins
    Talk
    Intermediate

    Transfer learning is a machine learning / deep learning technique where knowledge gained while training on one machine learning problem can be used to train models for other, similar problems. This is an extremely useful approach for leveraging pre-trained models to solve real-world problems where little data is available.

    This talk will cover the essentials of deep learning and transfer learning concepts, along with the various methodologies of transfer learning. We will then look at diverse ways in which transfer learning can be applied to complex real-world problems in the following areas.

    • Computer Vision
    • Natural Language Processing
    • Audio Categorization

    We will briefly look at a multitude of real-world case studies and problems around the preceding areas like text classification, image classification, image captioning, style transfer and audio classification.

  • joydeep bhattacharjee - Cutting edge NLP with fastText

    joydeep bhattacharjee
    Software Engineer
    Nineleaps
    9 months ago
    Sold Out!
    45 Mins
    Talk
    Intermediate

    FastText was open-sourced by Facebook in 2016, and with its release it became one of the fastest and most cutting-edge libraries for text classification and word representation. It includes implementations of two extremely important methodologies in NLP, namely the Continuous Bag of Words (CBOW) and Skip-gram models. FastText performs exceptionally well with supervised as well as unsupervised learning.
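
    To make the difference between the two training schemes concrete, the sketch below builds the training examples each one would see for a single sentence. It is plain Python rather than the fastText library itself, and the helper names, the sample sentence, and the window size of 1 are assumptions chosen for illustration.

```python
from typing import List, Tuple


def context_window(tokens: List[str], i: int, window: int) -> List[str]:
    """Words within `window` positions of index i, excluding tokens[i]."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def skipgram_pairs(tokens: List[str],
                   window: int = 1) -> List[Tuple[str, str]]:
    """Skip-gram: one (target, context word) pair per neighbor.

    The model is trained to predict each context word from the target.
    """
    return [(target, ctx)
            for i, target in enumerate(tokens)
            for ctx in context_window(tokens, i, window)]


def cbow_examples(tokens: List[str],
                  window: int = 1) -> List[Tuple[List[str], str]]:
    """CBOW: one (context words, target) example per position.

    The model is trained to predict the target from its pooled context.
    """
    return [(context_window(tokens, i, window), target)
            for i, target in enumerate(tokens)]


# Hypothetical three-word sentence used only for this example.
tokens = ["fast", "text", "rocks"]
pairs = skipgram_pairs(tokens)
examples = cbow_examples(tokens)
```

    Skip-gram produces one example per (word, neighbor) pair, while CBOW produces one example per word, which is part of why the two differ in training cost and in how they treat rare words.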