Discovering/Visualizing latent structure of complex data through dimensionality reduction using deep learning techniques

Visualizing complex data has always been a major problem in data science: complex datasets are often hard to comprehend, which makes them extremely difficult to interpret for people who aren't data scientists. Training deep learning models on these datasets aggravates the problem, as the models tend to be opaque and hard to interpret.

This problem is especially acute for medical datasets, where the black-box nature of deep learning algorithms is a cause for concern and doctors have difficulty trusting the model's predictions. Models are easier to interpret, and trust is easier to build, when they are coupled with visualizations that can be readily explained to doctors.

Visualizing complex datasets, though, is an arduous task because of the large number of features and the sparsity of the data. This problem can be tackled with variations of deep autoencoders, VAEs, VAE-GANs, etc., which can learn the latent structure in complex datasets and map it to a 3-D space that can be readily understood.
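
As a rough illustration of the idea, the sketch below trains a very small autoencoder with a 3-unit bottleneck and returns the 3-D codes that could then be plotted. It is not the model used in the talk: the NumPy-only implementation, the single tanh encoding layer, the synthetic data, and all names (train_autoencoder, code_dim, lr, steps) are assumptions made for this example.

```python
import numpy as np

def train_autoencoder(x, code_dim=3, lr=0.02, steps=2000, seed=0):
    """Train a one-layer tanh encoder and linear decoder with gradient descent.

    Hypothetical minimal sketch; real models would be deeper (or variational).
    Returns an encode function and the reconstruction loss per step.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w1 = rng.normal(scale=0.1, size=(d, code_dim))  # encoder weights
    b1 = np.zeros(code_dim)
    w2 = rng.normal(scale=0.1, size=(code_dim, d))  # decoder weights
    b2 = np.zeros(d)
    losses = []
    for _ in range(steps):
        z = np.tanh(x @ w1 + b1)          # 3-D code (the bottleneck)
        x_hat = z @ w2 + b2               # reconstruction
        err = x_hat - x
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        g = 2.0 * err / err.size
        grad_w2 = z.T @ g
        grad_b2 = g.sum(axis=0)
        dpre = (g @ w2.T) * (1.0 - z ** 2)  # derivative of tanh
        grad_w1 = x.T @ dpre
        grad_b1 = dpre.sum(axis=0)
        w1 -= lr * grad_w1
        b1 -= lr * grad_b1
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2

    def encode(data):
        return np.tanh(data @ w1 + b1)

    return encode, losses

# Synthetic stand-in for a high-dimensional dataset with 3 hidden factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
x = np.tanh(latent @ mixing) + 0.05 * rng.normal(size=(200, 10))
x = (x - x.mean(axis=0)) / x.std(axis=0)  # standardize each feature

encode, losses = train_autoencoder(x)
codes = encode(x)  # shape (200, 3): one 3-D point per sample, ready to plot
```

Each row of codes is one sample's position in the learned 3-D space; a 3-D scatter plot of these rows is the kind of visualization described above.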

 
 

Outline/Structure of the Talk

  • Why do we need to visualize data
  • Problems in visualizing complex data
  • Traditional dimensionality reduction techniques
  • 5 minutes for the first 3 sections
  • Why these don't work very well on complex real-world data
  • How can autoencoders help? Advantages and limitations
  • 5 minutes for the next 2 sections
  • How to use autoencoders (various types and architectures) for dimensionality reduction
  • Example of usage on a complex medical dataset (part of my current research at Leeds)
  • 10 minutes for the last 2 sections
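
For context on the "traditional techniques" item, here is a minimal sketch of one such technique, PCA, projecting data onto its top 3 principal components. The outline does not say which traditional techniques the talk covers, so the choice of PCA, the function name pca_project, and the random data are assumptions for illustration only.

```python
import numpy as np

def pca_project(x, n_components=3):
    """Project rows of x onto the top principal components via SVD."""
    x_centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    u, s, vt = np.linalg.svd(x_centered, full_matrices=False)
    components = vt[:n_components]
    # Fraction of total variance captured by each kept component.
    explained = (s[:n_components] ** 2) / np.sum(s ** 2)
    return x_centered @ components.T, explained

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 12))      # placeholder data: 100 samples, 12 features
coords, explained = pca_project(x)  # coords has shape (100, 3)
```

PCA is a linear projection, which is one reason it can struggle on complex real-world data where the structure is nonlinear.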

Learning Outcome

Attendees will gain a solid overview of traditional dimensionality reduction techniques and their limitations, and of how autoencoders, VAEs, and VAE-GANs come into the picture to tackle some of these problems.

Target Audience

Medical professionals, AI researchers, AI consultants, data visualization consultants, management

Prerequisites for Attendees

Familiarity with dimensionality reduction techniques and some statistics terminology. Knowledge of deep learning, autoencoders, and GANs is a plus.

Submitted 4 months ago

Public Feedback

  • Ashay Tamhane
    By Ashay Tamhane  ~  3 months ago

    Hi Yash, thanks for the proposal. You mention the need to explain a deep learning algorithm's output based on the dataset visualisation. Can you elaborate on how you connect the model output to the input dataset when interpreting the results?

    • Yash Deo
      By Yash Deo  ~  3 months ago

      Hey Ashay,

      Most of the research here involves complex operations on medical data. In most of our datasets it's impossible to distill the output down to a product of a few features and tell someone "this is why the model is predicting a high risk of sepsis".

      Hence the focus is instead on converting the complex data into a visualizable latent space that can be explained to nurses and doctors. For example, let us look at the plots I get after applying these techniques to complex data taken from ICUs.

      1.  Plot for a patient who ends up becoming septic [https://drive.google.com/open?id=1JtM8dhfQcPy0UsNvQhzmvZ38Em7RZniZ]
      2.  Plot for a patient who does not become septic [https://drive.google.com/open?id=1ScT-7ubrP0U7VLvgmz98D6LjyXpbCL-z]

      There is a clearly observable trend that patients who become septic seem to follow. We can then train nurses to watch these plots updating in real time and identify when a patient is on the way to becoming septic.

      I know this explanation may be a bit vague, but hopefully my point has come across. Let me know if you have any more questions.

      • Ashay Tamhane
        By Ashay Tamhane  ~  3 months ago

        Understood. Thanks for the detailed response and the plots.

  • Kuldeep Jiwani
    By Kuldeep Jiwani  ~  3 months ago

    The idea of the visualisation is not perfectly clear. Are you using autoencoders and VAEs, then mapping them to a 3-dimensional embedding and using those vectors for visualisation? Or do you plan to build a separate 3-D manifold for visualisation?

    • Yash Deo
      By Yash Deo  ~  3 months ago

      Hey Kuldeep,

      Yes, we use autoencoders, VAEs, etc., map them to a 3-dimensional encoding layer, and use the features from that layer to visualize.

      • Kuldeep Jiwani
        By Kuldeep Jiwani  ~  3 months ago

        Sounds good, Thanks

  • Natasha Rodrigues
    By Natasha Rodrigues  ~  4 months ago

    Hi Yash,

    Thanks for your proposal! Could you update the Outline/Structure section of your proposal with a time-wise breakup of how you plan to use the 20 mins for the topics you've highlighted?

    To help the program committee understand your proposal a little better, can you add the slides for your proposal?

    Also, in order to ensure the completeness of your proposal, we suggest you go through the review process requirements.

    Thanks,

    Natasha

    • Yash Deo
      By Yash Deo  ~  4 months ago

      Hey Natasha,

      I have updated the timeline and will be able to upload the slides in some time.

       

      Thanks