The Art of Effective Visualization of Multi-dimensional Data - A hands-on Approach

Bengaluru | Sep 1st, 02:00 - 02:45 PM IST | Jupiter | 107 Interested

Descriptive analytics is one of the core components of any analysis life-cycle pertaining to a data science project or even specific research. Data aggregation, summarization and visualization are some of the main pillars supporting this area of data analysis. However, dealing with multi-dimensional datasets, typically with more than two attributes, starts causing problems, since our medium of data analysis and communication is typically restricted to two dimensions. We will explore some effective strategies for visualizing data in multiple dimensions (ranging from 1-D up to 6-D) using a hands-on approach with Python and popular open-source visualization libraries like matplotlib and seaborn. We will also briefly cover excellent R visualization libraries like ggplot if we have time.
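To make the idea of going beyond two dimensions concrete, here is a minimal sketch (not taken from the talk material) that encodes four attributes in a single matplotlib scatter plot: two as position, one as colour and one as marker size. The data is synthetic and the attribute names are placeholders invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data, invented purely for illustration: four numeric attributes.
rng = np.random.default_rng(seed=42)
n = 60
x = rng.normal(10, 2, n)            # attribute 1 -> x position
y = 0.5 * x + rng.normal(0, 1, n)   # attribute 2 -> y position
hue = rng.uniform(0, 1, n)          # attribute 3 -> colour
size = rng.uniform(20, 200, n)      # attribute 4 -> marker area

fig, ax = plt.subplots(figsize=(6, 4))
points = ax.scatter(x, y, c=hue, s=size, cmap="viridis",
                    alpha=0.7, edgecolors="k")
fig.colorbar(points, ax=ax, label="attribute 3 (colour)")
ax.set_xlabel("attribute 1")
ax.set_ylabel("attribute 2")
ax.set_title("Four attributes in one 2-D scatter plot")
```

Calling `fig.savefig("scatter_4d.png")` writes the figure to disk. Further dimensions could be mapped to marker shape or to separate facets in the same spirit, though each added encoding makes the plot harder to read.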

BONUS: We will also look at ways to visualize unstructured data with several dimensions including text, images and audio!


Outline/Structure of the Tutorial

The talk is usually a 90-minute session, but we will cover it in the scheduled 45-minute slot, focusing on the main aspects of effective data visualization with the grammar of graphics and leveraging popular open-source frameworks in Python. As a bonus, we will also cover visualization of unstructured data, including text, audio and images.

Note: All the code and resources will be shared and open-sourced for your benefit, so you don't need to take extensive notes and can focus on the presentation/talk.


  1. Introduction
    • What is Data Visualization?
    • Why Data Visualization?
  2. Motivation
    • Why Effective Data Visualization?
  3. Effective Multi-dimensional Data Visualization
    • Whirlwind tour of the grammar of graphics
  4. Visualization tools and frameworks
    • General tools & frameworks
    • Python visualization frameworks
    • R visualization frameworks
  5. Visualizing Structured Data
    • Univariate analysis and visualizations
    • Multivariate analysis and visualizations
    • Visualizing from 1-D up to 6-D
  6. BONUS: Visualizing Unstructured Data
    • Text
    • Images
    • Audio
  7. Final words

Learning Outcome

  • Take a glance at the major data visualization frameworks
  • Get a clear understanding of univariate and multi-variate visualization
  • Learn effective strategies for visualizing data using the grammar of graphics
  • Get a clear perspective on which visualization techniques work best based on specific scenarios
  • Pick up strategies for visualizing structured and unstructured data with actual examples

Target Audience

Data Enthusiasts, BI Developers, Data Scientists, Data Analysts

Prerequisites for Attendees

Knowledge of Python basics and data visualization techniques is helpful but not essential, since we will cover them during this session.



Submitted 4 years ago

  • 480 Mins

    Modern statistics has become almost synonymous with machine learning - a collection of techniques that utilize today's incredible computing power. Jared Lander walks you through the available methods for implementing machine learning algorithms in R and explores underlying theories such as the elastic net and boosted trees.

    • Building the design matrix
    • Penalized regression with the lasso and ridge methods
    • Fitting models with glmnet
    • Interactive visualization of the coefficient path
    • Use cross-validation to choose the optimal lambda
    • Visualize coefficients with coefplot
    • Perform binary classification with a single tree with xgboost
    • Train a boosted tree
    • Tune xgboost hyperparameters
    • Use validation data to understand performance
    • Visualize variable importance
    • Train a boosted random forest with xgboost
  • Dipanjan Sarkar

    Dipanjan Sarkar - Human Interpretable Machine Learning  — The Need and Importance of Model Interpretation (with hands-on examples)

    45 Mins

    The field of Machine Learning has gone through some phenomenal changes over the last decade. In industry, the main focus of data science or machine learning is more ‘applied’ than theoretical, and the effective application of these models on the right data to solve complex real-world problems is of paramount importance.

    A machine learning model by itself consists of an algorithm which tries to learn latent patterns and relationships from data without hard-coded fixed rules. Hence, explaining how a model works to the business always poses its own set of challenges. In this talk, I will cover the need for and importance of human-interpretable machine learning approaches, look at effective strategies for model interpretation, and walk through several hands-on examples. Detailed coverage of open-source frameworks for machine learning model interpretation will also be one of the major focus areas. Examples will be showcased in Python.

  • Dipanjan Sarkar

    Dipanjan Sarkar - Unleash the Power of Deep Learning with Transfer Learning

    45 Mins

    Transfer learning is a machine learning / deep learning technique where knowledge gained while training on one machine learning problem can be used to train models for other, similar problems. This is an extremely useful approach for leveraging pre-trained models to solve real-world problems constrained by limited data availability.

    This talk will cover the essentials of deep learning and transfer learning concepts, along with the various methodologies of transfer learning. We will then look at diverse ways in which transfer learning can be applied to complex real-world problems in the following areas:

    • Computer Vision
    • Natural Language Processing
    • Audio Categorization

    We will briefly look at a multitude of real-world case studies and problems around the preceding areas like text classification, image classification, image captioning, style transfer and audio classification.

  • joydeep bhattacharjee

    joydeep bhattacharjee - Cutting edge NLP with fastText

    45 Mins

    FastText was open-sourced by Facebook in 2016, and with its release it became the fastest and most cutting-edge library for text classification and word representation. It includes implementations of two extremely important methodologies in NLP, i.e., the Continuous Bag of Words and Skip-gram models. FastText performs exceptionally well with supervised as well as unsupervised learning.
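As a rough illustration of how the two models named above differ, the sketch below builds the training examples each one would see for a toy sentence. It is plain Python written for this page, not fastText's own code, and the helper names are hypothetical. Real implementations add details such as subword information and sampling tricks that are left out here.

```python
from typing import List, Tuple


def context_window(tokens: List[str], i: int, window: int) -> List[str]:
    """Return up to `window` neighbours on each side of position i."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram: each word is used to predict each of its neighbours."""
    return [(tokens[i], ctx)
            for i in range(len(tokens))
            for ctx in context_window(tokens, i, window)]


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW: the neighbours are used jointly to predict the centre word."""
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]


tokens = "the quick brown fox".split()
sg = skipgram_pairs(tokens, window=1)
cb = cbow_pairs(tokens, window=1)
print(sg[:3])  # [('the', 'quick'), ('quick', 'the'), ('quick', 'brown')]
print(cb[1])   # (['the', 'brown'], 'quick')
```

Skip-gram produces one example per (word, neighbour) pair, while CBOW produces one example per word with the whole context as input.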