Evolution Of Image Recognition And Object Segmentation: From Apes To Machines

For a long time, we have wondered how to harness the amazing gift of vision, because doing so could open up endless possibilities, such as cars that drive themselves. Along the way, we have found numerous algorithms. In this talk, we will cover the latest trends in this field, the architecture of each algorithm, and the evolution of algorithms for the image recognition task. We will cover it all, from the dinosaur age of image recognition to the cyborg age of object segmentation and beyond: CNNs to R-CNNs to Mask R-CNN. We will also closely compare the performance of these models.

 
 

Outline/Structure of the Talk

In this presentation, I will talk about the broadest (in terms of application) and most common, yet least explored, field of machine learning and AI: image recognition!

This talk will describe the relatively new algorithms that improve image recognition and object detection.

The talk will start in the earliest times, the dinosaur age of image recognition, move through the industrial age, and end in the cyborg age of these algorithms and their relation to ML.

On a very high level, the presentation will cover:

  • Image recognition using a neural network
    • An introduction and a straight dive in
    • The problems it solved
    • The architecture explained
    • The problems with this algorithm
  • Image recognition using a CNN
    • A simple outline
    • The problems it solved
    • The architecture explained
    • The problems with this algorithm
  • Object detection using R-CNN
    • A small warm-up and simple sketch
    • The problems it solved and eased
    • The architecture explained
    • The problems with this algorithm
  • Object detection using Fast R-CNN
    • An outline
    • The problems it solved
    • The architecture explained
    • The problems with this algorithm
  • Object detection using Faster R-CNN
    • The grand entrance
    • The problems it solved
    • The architecture explained
    • The problems with this algorithm
  • Object detection and segmentation using Mask R-CNN
    • The FAIR entrance
    • The problems it solved
    • The architecture explained
    • A small demo
  • Q&A session

Learning Outcome

In this talk, the audience will learn about the latest trends in image recognition and how its algorithms have evolved over the years, to the point where we no longer just detect an object but map out every pixel that belongs to it.

The audience will learn the theory behind the different algorithms, all the way from simple image recognition to object segmentation.
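
To make the difference between detecting an object and pinpointing its pixels concrete, here is a minimal sketch in plain Python. The tiny hand-written mask and the helper names `mask_to_box` and `box_area` are illustrative assumptions, not output from any of the models in the talk. It shows that a bounding box (the kind of result a detector returns) covers more pixels than the mask (the kind of result a segmentation model returns).

```python
# A tiny 5x6 "image" mask: 1 marks pixels that belong to the object.
# The values are made up for illustration.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def mask_to_box(mask):
    """Return the tightest box (top, left, bottom, right), inclusive, around the mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def box_area(box):
    """Number of pixels covered by an inclusive (top, left, bottom, right) box."""
    top, left, bottom, right = box
    return (bottom - top + 1) * (right - left + 1)


box = mask_to_box(mask)                           # what a detector would report
object_pixels = sum(sum(row) for row in mask)     # what a segmentation mask reports

print("bounding box:", box)
print("pixels inside the box:", box_area(box))
print("pixels that are actually the object:", object_pixels)
```

The box is a coarse summary of the object; the mask says exactly which pixels belong to it.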

Target Audience

Mostly data scientists, ML enthusiasts, and people who want to learn about image recognition and object detection.

Prerequisites for Attendees

Basic knowledge of neural networks is recommended but not required.

Submitted 6 months ago

Public Feedback

  • Kuldeep Jiwani
    By Kuldeep Jiwani  ~  3 months ago

    Hi Aamir,

    Can you add some more technical details on the content you wish to cover in the talk?

    Will this be a literature survey of available CNNs, like a collection of research summaries?
    Or will you be walking through some use cases to show how different variants of CNNs work and how they compare with each other?

    Also, please post a small video of yourself speaking on either the summary of the talk or one of the sub-topics.

    • Aamir Nazir
      By Aamir Nazir  ~  3 months ago

      Hello Kuldeep,

      Yes, in this talk I will take some use cases, show how each model works in each of them, and identify the best model for each. I will also discuss the performance and accuracy of each model while comparing them across use cases.

      I will also be running a sample video with these models in action to see how each one works in a real-life situation.

      And yes, I will be uploading a sample talk soon.

      Thanking You,

      Aamir Nazir

      • Kuldeep Jiwani
        By Kuldeep Jiwani  ~  3 months ago

        Thanks Aamir for the details, it looks good.

        Another quick question: do you also plan to cover the algorithmic differences among the various CNN choices?

        Will wait for the video, thanks.

        • Aamir Nazir
          By Aamir Nazir  ~  3 months ago

          Hi Kuldeep,

          Yes, I will also cover the full NN architecture of each model and compare the architectures.

          Thank You,

          Aamir Nazir


  • Dr. Vikas Agrawal - Non-Stationary Time Series: Finding Relationships Between Changing Processes for Enterprise Prescriptive Systems

    45 Mins
    Talk
    Intermediate

    It is too tedious to keep asking questions, seeking explanations, or setting thresholds for trends and anomalies. Why not find problems before they happen, find explanations for the glitches, and suggest the shortest paths to fixing them? Businesses are always changing, along with their competitive environment and processes. No static model can handle that. Using dynamic models that find time-delayed interactions between multiple time series, we need to make proactive forecasts of anomalous trends in risks and opportunities across operations, sales, revenue and personnel, based on multiple factors influencing each other over time. We need to know how to define what is “normal” and determine when the business processes from six months ago no longer apply, or apply to only 35% of cases today, while explaining the causes of risk and sources of opportunity, their relative directions and magnitude, in the context of the decision-making and transactional applications, using state-of-the-art techniques.

    Real-world processes and businesses keep changing, with one moving part changing another over time. Can we capture these changing relationships? Can we use multiple variables to find risks in the key ones of interest? We will take a fun journey culminating in the most recent developments in the field. Which methods work well and which break? What can we use in practice?

    For instance, we can show a CEO that they would miss their revenue target by over 6% for the quarter, and tell them why, i.e. in what ways their business has changed over the last year. Then we provide prioritized lists of the quickest, cheapest and least risky paths to help them turn the tide, with estimates of relative costs and expected probability of success.
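
As a rough illustration of what a "time-delayed interaction" between two series means, the sketch below scans a few candidate lags and reports the one at which a follower series correlates best with a driver series. This is not the method from the talk: it is a plain fixed-window Pearson correlation, which assumes the relationship stays stable over the window (the very assumption that non-stationary data breaks). The toy data and the names `pearson` and `best_lag` are assumptions made for this example.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def best_lag(driver, follower, max_lag):
    """Correlate driver[t] with follower[t + lag] for each lag in 0..max_lag."""
    scores = {}
    for lag in range(max_lag + 1):
        x = driver[: len(driver) - lag]   # drop the tail that has no partner
        y = follower[lag:]                # shift the follower back by `lag`
        scores[lag] = pearson(x, y)
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: the follower repeats the driver two steps later, scaled by 2.
driver = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10, 8, 11]
follower = [0, 0] + [2 * v for v in driver[:-2]]

best, scores = best_lag(driver, follower, max_lag=4)
print("best lag:", best)
for lag, score in scores.items():
    print(f"lag {lag}: correlation {score:.3f}")
```

On real business data the best lag may drift over time, so a scan like this would have to be repeated on rolling windows or replaced by a dynamic model.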

  • Dipanjan Sarkar - Explainable Artificial Intelligence - Demystifying the Hype

    Dipanjan Sarkar
    Data Scientist
    Red Hat
    7 months ago
    45 Mins
    Tutorial
    Intermediate

    The field of Artificial Intelligence powered by Machine Learning and Deep Learning has gone through some phenomenal changes over the last decade. Starting off as a purely academic and research-oriented domain, it has seen widespread industry adoption across diverse domains including retail, technology, healthcare, science and many more. More often than not, the standard toolbox of machine learning, statistical or deep learning models remains the same. New models such as Capsule Networks do come into existence, but their industry adoption usually takes several years. Hence, in the industry, the main focus of data science or machine learning is more ‘applied’ than theoretical, and effective application of these models on the right data to solve complex real-world problems is of paramount importance.

    A machine learning or deep learning model by itself consists of an algorithm which tries to learn latent patterns and relationships from data without hard-coding fixed rules. Hence, explaining how a model works to the business always poses its own set of challenges. There are some domains in the industry, especially in the world of finance such as insurance or banking, where data scientists often end up having to use more traditional machine learning models (linear or tree-based). The reason is that model interpretability is very important for the business to explain each and every decision taken by the model. However, this often leads to a sacrifice in performance. This is where complex models like ensembles and neural networks typically give us better and more accurate performance (since true relationships are rarely linear in nature). We, however, end up being unable to have proper interpretations for model decisions.

    To address these gaps, I will take a conceptual yet hands-on approach in which we will explore some of these challenges in explainable artificial intelligence (XAI) and human-interpretable machine learning in depth, and showcase some examples using state-of-the-art model interpretation frameworks in Python!

  • Akshay Bahadur - Minimizing CPU utilization for deep networks

    Akshay Bahadur
    SDE-I
    Symantec Softwares
    6 months ago
    45 Mins
    Demonstration
    Beginner

    The advent of machine learning, along with its integration with computer vision, has enabled users to efficiently develop image-based solutions for innumerable use cases. A machine learning model consists of an algorithm which draws meaningful correlations from the data without being tightly coupled to a specific set of rules. It's crucial to explain the subtle nuances of the network along with the use case we are trying to solve. As technology has advanced, the quality of images has increased, which in turn has increased the resources needed to process the images when building a model. The main question, however, is how to develop lightweight models while keeping the performance of the system intact.
    To connect the dots, we will talk about the development of applications specifically aimed at providing equally accurate results without using many resources. This is achieved by using image processing techniques along with optimizing the network architecture.
    These applications range from recognizing digits and letters which the user can 'draw' at runtime, to developing a state-of-the-art facial recognition system, predicting hand emojis, developing a self-driving system, and detecting malaria and brain tumors, along with Google's 'Quick, Draw' project of hand doodles.
    In this presentation, we will discuss the development of such applications with minimal CPU usage.

  • Maryam Jahanshahi - Applying Dynamic Embeddings in Natural Language Processing to Analyze Text over Time

    Maryam Jahanshahi
    Research Scientist
    TapRecruit
    7 months ago
    45 Mins
    Case Study
    Intermediate

    Many data scientists are familiar with word embedding models such as word2vec, which capture semantic similarity of words in a large corpus. However, word embeddings are limited in their ability to interrogate a corpus alongside other context or over time. Moreover, word embedding models either need significant amounts of data, or tuning through transfer learning of a domain-specific vocabulary that is unique to most commercial applications.

    In this talk, I will introduce exponential family embeddings. Developed by Rudolph and Blei, these methods extend the idea of word embeddings to other types of high-dimensional data. I will demonstrate how they can be used to conduct advanced topic modeling on medium-sized datasets that are specialized enough to require significant modifications of a word2vec model and that contain more general data types (including categorical, count, and continuous). I will discuss how my team implemented a dynamic embedding model using TensorFlow and our proprietary corpus of job descriptions. Using both categorical and natural language data associated with jobs, we charted the development of different skill sets over the last 3 years. I will focus the description of results on how tech and data science skill sets have developed, grown and pollinated other types of jobs over time.
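
For readers new to embeddings, the sketch below shows how "semantic similarity" is usually read off embedding vectors: by the cosine of the angle between them. The three-dimensional vectors and the skill names are invented for illustration; they do not come from word2vec, from exponential family embeddings, or from the job-description corpus mentioned above.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings; real ones have tens to hundreds of dimensions.
vectors = {
    "python": [0.9, 0.8, 0.1],
    "java": [0.8, 0.9, 0.2],
    "nursing": [0.1, 0.2, 0.9],
}

for word in ("java", "nursing"):
    score = cosine(vectors["python"], vectors[word])
    print(f"similarity(python, {word}) = {score:.3f}")
```

A dynamic embedding model would learn one such vector per word per time slice, so the same comparison can be repeated across years to see how a skill's neighbours change.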

  • Aamir Nazir - DeepMind Alpha Fold 101

    Aamir Nazir
    Researcher
    4 months ago
    45 Mins
    Talk
    Intermediate

    "Today we’re excited to share DeepMind’s first significant milestone in demonstrating how artificial intelligence research can drive and accelerate new scientific discoveries. With a strongly interdisciplinary approach to our work, DeepMind has brought together experts from the fields of structural biology, physics, and machine learning to apply cutting-edge techniques to predict the 3D structure of a protein based solely on its genetic sequence." source: https://deepmind.com/blog/alphafold/

    Over the past five decades, scientists have been able to determine shapes of proteins in labs using experimental techniques like cryo-electron microscopy, nuclear magnetic resonance or X-ray crystallography, but each method depends on a lot of trial and error, which can take years and cost tens of thousands of dollars per structure. This is why biologists are turning to AI methods as an alternative to this long and laborious process for difficult proteins.

    AlphaFold, recently released by DeepMind, beat top pharmaceutical companies with 100K+ employees, such as Pfizer and Novartis, at predicting protein structures in the CASP13 challenge. It outperformed all other competitors by a wide margin, correctly predicting 25 proteins whereas the second-place finisher predicted only 9, and it did so using only 29K of the 129K available data about different proteins.

    This research is a major breakthrough in predicting how proteins fold to form different types of proteins with different functions. This is important because it could lead to a better understanding of, and possibly a cure for, diseases like Alzheimer's and mad cow disease, which are believed to be caused by malfunctions in the folding of proteins in the body.

    The architecture of the network was simple: at a high level, it consisted of a residual convolutional neural network, with gradient descent used to optimize the full protein features at the end.

    The audience will learn how to reproduce the AlphaFold architecture, along with some basics of how different protein strands affect the body and the function of proteins. This talk will be mostly on the technical side of AlphaFold.

  • Aamir Nazir - All-out Deep Learning - 101

    Aamir Nazir
    Researcher
    6 months ago
    45 Mins
    Talk
    Beginner

    In this talk, we will discuss different problems and the different focus areas of deep learning. This session is aimed at intermediate learners looking to go deeper into deep learning. We will take different tasks, see which deep neural network architecture can solve each one, and learn about the different neural network architectures available for the same task.