Semantic Segmentation is a key component of many practical applications. We humans have an innate understanding of the world around us: if someone points to something, we can immediately say what that object is. Put differently, given an image, we can say which object each pixel in that image belongs to. This ability underpins many decisions, so transferring this understanding to machines is crucial: we want machines to understand the world around them and make decisions based on their environment. Semantic Segmentation, labelling each pixel in an image, is how we make a machine understand its surroundings from an image.

In this talk, I briefly go over the techniques that were used in the past, those in use now, and some emerging techniques that are stealing the spotlight from state-of-the-art models like DeepLabv3+.


Outline/Structure of the Tutorial

In this presentation, I will talk about the world of Semantic Segmentation and how algorithms and approaches have evolved over time.

Below are some key points that I wish to cover:

  • What is Semantic Segmentation? [10 mins]
    • An introduction to Semantic Segmentation
    • Applications
    • A very brief overview of the approaches covered before Deep Learning
  • FCN - Fully Convolutional Networks for Semantic Segmentation [15 mins]
    • Walkthrough of the architecture along with the code to implement it
    • Results achieved using this architecture
  • UNet [15 mins]
    • Architecture Explained along with code
    • Results achieved using this architecture
  • PSPNet [15 mins]
    • PSPNet's architecture explained, along with the code
    • Results achieved (PSPNet's state-of-the-art results on real-time semantic segmentation)
  • DeepLab [15 mins]
    • Explaining how the architecture evolved from DeepLabv1 to v3+
    • Walkthrough of the code to implement the architecture
    • Results achieved using this architecture
  • Looking at the Future - EncNet and FastFCN [20 mins]
    • Architecture Design Changes
    • Promising State of the Art Results
  • Conclusion
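
Every architecture in this outline, from FCN to DeepLabv3+, shares one final step: the network produces a per-class score map over the image, and the predicted label at each pixel is the arg-max across classes. As a minimal, framework-free sketch of that labelling step (the function name, shapes, and toy numbers are illustrative, not taken from any of the papers):

```python
import numpy as np

def predict_mask(score_maps):
    """Turn per-class score maps from a fully convolutional net
    into a segmentation mask.

    score_maps: array of shape (C, H, W) -- one score map per class.
    Returns an (H, W) array of class indices: the label at each pixel.
    """
    return np.argmax(score_maps, axis=0)

# Toy example: 3 classes over a 2x2 image.
scores = np.zeros((3, 2, 2))
scores[1, 0, 0] = 5.0   # class 1 scores highest at pixel (0, 0)
scores[2, 1, 1] = 3.0   # class 2 scores highest at pixel (1, 1)
mask = predict_mask(scores)
```

The architectures in the talk differ only in how they compute `score_maps`; this final per-pixel arg-max is the same throughout.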

Note: Timings mentioned above might be updated.

Learning Outcome

Attendees will leave with a sound knowledge of what's going on in the semantic segmentation world and of which architecture to choose when building their next semantic segmentation project.

Target Audience

Anybody with the prerequisites given below

Prerequisites for Attendees

  • Basic Understanding of Deep Learning and Machine Learning
  • What Artificial Neural Networks are and how they work
  • What Convolutional Neural Networks are and how they work
Submitted 5 months ago

Public Feedback

Suggest improvements to the Speaker
  • By Kuldeep Jiwani  ~  4 months ago

    Hi Arunava,

    You have described the outline pretty well above.

    Just wanted to clarify a few things: you would first be comparing the different architectures theoretically / conceptually, and then are you also planning to show the differences among all the various architectures by running through a real example applied to each of them?

    • By Arunava Chakraborty  ~  4 months ago

      Hi Kuldeep!

      Thanks for getting back! As I said in my proposal, I will be walking through the code, which practically explains each architecture, and will also take snippets from the respective papers to explain the motivation behind the design of each architecture and why it was tweaked the way it was (to the extent explained by the authors of the papers).

      About your second point, where you ask whether I will be *running* each architecture on some real-world example (like the Cityscapes dataset) to show how each performs: I would really like to do so, but due to computational limits I am not able to train and tweak such large architectures as DeepLabv3 (and its predecessors).

      P.S. I got an email which asked me to update my proposal with some real-world example that I have applied semantic segmentation to; being a student, I haven't yet done so. So I kind of thought that my proposal was rejected. And I completely agree with that email: with all the information available on the internet, why would one come to an event and pay to hear someone speak if he/she is not talking about some real-world experience from his/her day-to-day life?

      Anyway, the point is, I have currently been working on Adversarial Machine Learning (attacks), and this is one serious field to consider, as state-of-the-art classifiers fail when inputs are even slightly perturbed, resulting in total misclassification! I have reproduced 4 attacks so far and can demonstrate each of them in 2-3 lines of code. I am currently working on reproducing more attacks and testing how robust each net is against them. I have benchmarked 2 nets (LeNet and AlexNet) on the simplest dataset of all, MNIST (here), and have observed how each of those 4 attacks performs on each of these nets (and how miserably these nets fail against even the simplest attacks).
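
      One classic attack of this kind really is only a couple of lines: FGSM (the fast gradient sign method) takes a single signed gradient step on the input to maximally increase the loss. A hedged sketch, using a plain binary logistic-regression classifier so the gradient can be written by hand (the function name and toy numbers are mine, purely for illustration, not from any of the benchmarks mentioned above):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y, eps):
    """One-step FGSM against binary logistic regression.

    For the cross-entropy loss, the gradient of the loss w.r.t. the
    input x is (sigmoid(w.x + b) - y) * w. FGSM moves x by eps in the
    sign of that gradient, which maximally increases the loss under
    an L-infinity perturbation budget of eps.
    """
    grad_x = (sigmoid(w @ x + b) - y) * w
    return x + eps * np.sign(grad_x)

# Toy demo: a point correctly classified as class 1 (score > 0)...
w, b = np.array([2.0, -1.0]), 0.0
x, y = np.array([0.5, 0.2]), 1.0
x_adv = fgsm_perturb(x, w, b, y, eps=0.3)
# ...is pushed across the decision boundary by a small perturbation.
```

      The deep-net version is identical in spirit: replace the hand-written gradient with the framework's autograd gradient of the model's loss with respect to the input image.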

      This, I believe, is a real-world scenario, because I have used one of these attacks against the IBM image classification algorithm (Watson API) and made it perform poorly (image attached).

      I believe that in the coming months I will have a few more attacks implemented, and I can live-demo each attack on a range of classification algorithms (LeNet to GoogLeNet), actually showing the original image and the perturbed image.

      If this looks good (and I believe this is a real issue, it really is), let me know and I can change my proposal.


      Adversarial Attack on IBM Watson

      • By Dipanjan Sarkar  ~  4 months ago

        Don't worry about being a student. I think your proposal is very much in line with current trends, which haven't been showcased very prominently outside of research papers and maybe a few examples here and there. Following are some options I see:

        1. Can you implement these models on standard open datasets having some practical significance (it doesn't have to be a so-called industry project)? If you can, and you showcase how they work, it would make a lot of sense.

        2. I totally understand the computational limitations, but do you think you could train on a deep learning cluster, maybe at your university/college, or on free-ish resources like Colab (though I have doubts, considering this is segmentation)? If you can, then maybe you can showcase whatever you have mentioned in your current proposal.

        3. You can convert your proposal into a talk on the importance of building robust classifiers, i.e. showcase adversarial attacks on models as you mentioned in your comment.


        And you can just train the models beforehand if possible and do a demo/code walkthrough. You don't really need to run them on the day of the conference, in case you were wondering.

        • By Arunava Chakraborty  ~  4 months ago

          Hey,

          Thanks to both of you for all the advice and kind words :) Really appreciated, and sorry for getting back late. I took some time out to prepare this proposal [here]. This proposal is on Adversarial Machine Learning and covers the points, attacks, and defences I am willing to go over. If needed, I am open to adding more attacks and defences. I have mentioned some defences which I know of and which have been proven to work well (as seen in the NeurIPS 2017 Adversarial Machine Learning Challenge (Defense Track)).

          Do let me know what you think of the proposal; if you like it, I can replace this current one with the adversarial one. As I have been saying, I will show the attacks performed live. I have attached a video [here]. Do look at it (at 1.5x) and let me know.

          Okay, now about the Semantic Segmentation proposal: I can give a detailed talk about what each architecture specified above does and all the technical details mentioned in the papers. And if it's just a talk, I can add 1 or 2 more architectures.

          About showing the models work, I can probably get some pretrained models from the internet to show how each works. About training on Colab: I tried it with the Cityscapes dataset and DeepLabv3, and it took me 2 hrs 30 mins for 1 epoch (LOL). Not saying it's not possible, it is; I have trained it for 15 epochs (over several days, as Colab's session got disconnected now and then).

          Lastly, since I am working on Adversarial Machine Learning, I would prefer that topic. Let me know, and I will update my proposal accordingly.

          Edit: My original message somehow got trimmed to just after `1 epoch`. I am updating this comment 5 hours after the original posting.

      • By Kuldeep Jiwani  ~  4 months ago

        Hi Arunava,

        Good to hear a detailed reply from you, and glad to see your passion for the subject. The conference is an open forum where ideas and knowledge can be shared; people come from far-away places and invest their valuable time to gain some knowledge.

        Our concern is just that we present the audience not merely with abstracts from research papers but also explain the conceptual differences via examples. So the "real" refers to an actual application of segmentation on images, showing how the different architectures manifest in different results.

        Being a student can't be an excuse for not covering technical depth at a conference. Your description looks good in terms of technical depth; that's why your proposal is still under consideration. It's just that the program committee wants to make sure the audience is able to grasp the topic, with sufficient supporting material and examples to understand it.

        You have mentioned adversarial attacks, which is also an interesting research area. See how you can refactor your content and bring out the best parts for the audience. I agree with Dipanjan that you could present it as a talk with all the detailed explanations. You always have support for compute resources via Google Colab; just use them and bring your best stuff out.

        Best of luck