Indian Sign Language Recognition (ISLAR)

Sample this: two cities in India, Mumbai and Pune, though only 80 km apart, have distinctly different spoken dialects. Stranger still, their sign languages are also distinct, with markedly different signs for the same objects, expressions, and phrases. While regional diversification in spoken languages and scripts is well known and widely documented, it has percolated into sign language as well, essentially resulting in multiple sign languages across the country. To help overcome these inconsistencies and to standardize sign language in India, I am collaborating with the Centre for Research and Development of Deaf & Mute (an NGO in Pune) and Google, adopting a two-pronged approach: a) I have developed an Indian Sign Language Recognition System (ISLAR) that uses artificial intelligence to accurately identify signs and translate them into text/speech in real time, and b) I have proposed the standardization of sign languages across India to the Government of India and the Indian Sign Language Research and Training Centre.

As previously mentioned, the initiative aims to develop a lightweight machine-learning model for India's 14 million speech- and hearing-impaired people, one that is suited to Indian conditions and flexible enough to incorporate multiple regional signs for the same expression. More importantly, unlike other implementations that rely on additional external hardware, this approach requires only a common surgical glove and a ubiquitous smartphone camera, and therefore has the potential for hardware-related savings at an all-India level. ISLAR received great attention from the open-source community, with Google inviting me to its India and global headquarters in Bangalore and California, respectively, to share my work with the TensorFlow team.


Outline/Structure of the Demonstration

Outline

  • Background of the problem - understanding the problems faced by the deaf and mute community. [2 mins]
    • 14 million people in India have speech and hearing impairments.
    • Current solutions are neither scalable nor ubiquitous.
  • Defining a strong problem statement. [2 mins]
  • Key aspects of designing the application. [8 mins]
    • Building a low-resource machine-learning model that can be deployed on the edge. [1 min]
    • Eliminating the need for external hardware. [1 min]
    • Phase 0: Localizing just the hand gestures (a minimal sketch of this phase follows the outline). [2 mins]
    • Phase 1: Adding facial key points along with hand localization. [2 mins]
    • Phase 2: Adding sequential information to each frame to carry context, enabling the model to pick up the entire context of the conversation. [2 mins]
  • Getting resources from Google and TensorFlow.
  • Results and conclusion. [1 min]
  • Future aspects. [1 min]
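
To make the Phase 0 idea concrete, here is a minimal, illustrative sketch of hand localization on commodity hardware. It is not the ISLAR implementation: it assumes OpenCV, a default webcam, and a hypothetical HSV colour band (ISLAR's surgical glove would make this band much tighter and more reliable than bare-skin segmentation).

    import cv2
    import numpy as np

    cap = cv2.VideoCapture(0)                         # default webcam
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        blurred = cv2.GaussianBlur(frame, (5, 5), 0)  # suppress sensor noise
        hsv = cv2.cvtColor(blurred, cv2.COLOR_BGR2HSV)
        # Illustrative HSV range only; a real system would calibrate to the glove.
        mask = cv2.inRange(hsv, np.array([0, 48, 80]), np.array([20, 255, 255]))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            # Assume the largest glove-coloured blob is the hand ROI.
            x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imshow("Phase 0 sketch", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()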

Demonstrations

  • Preparation [1 min]
  • ISLAR Phase 0 [1 min]
  • ISLAR Phase 1 [1 min]
  • Presentation at Google, Bangalore [1 min]
  • Presentation at Google, California [2 mins]

Learning Outcome

By the end of the session, the audience will have a clearer understanding of the problems faced by an underrepresented community in India, thereby catalyzing the attendees' thinking about addressing social issues in India as well as in other developing countries.

Target Audience

Machine Learning enthusiasts as well as virtuosos.

Prerequisites for Attendees

None

Submitted 6 months ago

Public Feedback

Suggest improvements to the Author
  • Ravi Balasubramanian ~ 5 months ago

    Hi Akshay,


    Thanks for the very interesting proposal. 

    1. In one of your earlier slides, you mention different sign dialects, similar to spoken-language dialects. Can you please add examples of a few of these sign dialects, say how two cities or regions sign the same thing, to illuminate the audience?

    2. It would be nice if you could add details on how this specific work will help the deaf and mute community. How do you envision them using it? The attached slide deck doesn't clearly make the connection from the problem statement to your ML work to post-application usage.

    Thanks,

    • Akshay Bahadur ~ 5 months ago

      Hi Ravi,

      Thanks for liking my proposal.

      1. Sure, I think it makes sense to add the information regarding the difference in dialects. I will also address other problems, e.g., that in India sign language is not considered a language but merely a skill. I will do this while formalizing the problem statement.
      2. I understand that the deck doesn't cover a lot of aspects, since I prepared it for Google and had to follow a very strict time frame. To give you a brief overview, the idea behind the project was to:
        • Address the issues faced by the deaf and mute community in India.
        • Develop a sign language recognition system that runs on low-resource devices.
        • Work with the Indian government to unify and enforce a universal sign language within India.
  • Ashay Tamhane ~ 5 months ago

    Thanks, Akshay, for a very interesting problem statement. Could you please clarify which ML/image-processing techniques you will be discussing in the talk?

    • Akshay Bahadur ~ 5 months ago

      Hey,
      These are the techniques I will discuss (illustrative sketches of two of them follow this list):

      • Identifying and removing background noise from images
        • Filtering and blurring
        • ROI extraction
      • Hand gesture detection and localization using
        • Semantic color segmentation
        • On-device, real-time hand tracking with MediaPipe
      • Facial keypoint tracking
        • Haar cascades
        • dlib face tracking
        • Custom face-tracking module (lightweight)
      • Machine learning algorithms
        • Simple DNN
        • Simple CNN (static gestures)
        • Rolling prediction averaging for working with videos
      • Resource-centric machine learning
        • Post-training quantization
        • Model pruning
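
      To give a flavour of two of these, here are minimal, hypothetical sketches (not the ISLAR code). Rolling prediction averaging smooths per-frame class probabilities over a window so that a single noisy frame doesn't flip the predicted sign; the window size of 16 is illustrative.

        from collections import deque
        import numpy as np

        window = deque(maxlen=16)   # illustrative window size

        def smoothed_class(frame_probs):
            """Average per-frame softmax outputs over a rolling window
            and return the smoothed class index."""
            window.append(frame_probs)
            return int(np.argmax(np.mean(window, axis=0)))

      And post-training quantization via the TensorFlow Lite converter shrinks a trained Keras model for on-device deployment (the model path here is a placeholder):

        import tensorflow as tf

        # "islar.h5" is a hypothetical path to a trained Keras model.
        model = tf.keras.models.load_model("islar.h5")
        converter = tf.lite.TFLiteConverter.from_keras_model(model)
        converter.optimizations = [tf.lite.Optimize.DEFAULT]  # weight quantization
        with open("islar.tflite", "wb") as f:
            f.write(converter.convert())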
      • Sujoy Roychowdhury ~ 5 months ago

        Given that it is a 20-minute talk, do you think covering so many approaches is too ambitious?

        • Akshay Bahadur ~ 5 months ago

          Yes, that's true.

          I have attached the outline below; I won't be able to deep-dive into everything in a 20-minute window.

          Outline

          • Background of the problem - understanding the problems faced by the deaf and mute community. [2 mins]
            • 14 million people in India have speech and hearing impairments.
            • Current solutions are neither scalable nor ubiquitous.
          • Defining a strong problem statement. [2 mins]
          • Key aspects of designing the application. [8 mins]
            • Building a low-resource machine-learning model that can be deployed on the edge. [1 min]
            • Eliminating the need for external hardware. [1 min]
            • Phase 0: Localizing just the hand gestures. [2 mins]
            • Phase 1: Adding facial key points along with hand localization. [2 mins]
            • Phase 2: Adding sequential information to each frame to carry context, enabling the model to pick up the entire context of the conversation. [2 mins]
          • Getting resources from Google and TensorFlow.
          • Results and conclusion. [1 min]
          • Future aspects. [1 min]

          Demonstrations

          • Preparation [1 min]
          • ISLAR Phase 0 [1 min]
          • ISLAR Phase 1 [1 min]
          • Presentation at Google, Bangalore [1 min]
          • Presentation at Google, California [2 mins]

          However, if we can have a bigger window (40 mins, as in ODSC 2018 and 2019), I could cover more elements.

          For 20 mins, I will select the most sophisticated method (the one that provides maximum efficiency).
  • Natasha Rodrigues ~ 5 months ago

    Hi Akshay,

    Thanks for your proposal! Could you update the Outline/Structure section of your proposal with a time-wise breakup of how you plan to use the 20 minutes across the topics you've highlighted?

    To help the program committee understand your proposal a little better, can you provide your slides as well?

    Thanks,

    Natasha

    • Akshay Bahadur ~ 5 months ago

      Hi Natasha,

      I have updated the outline with a time-wise breakup of the topics.

      I will need some time to prepare the slides for ODSC 2020, should my session be selected.
      However, I have presented the same (less mature) work at Google, Bangalore [slides] and Google, California [slides].
      Let me know if that works for the committee.


      • Natasha Rodrigues ~ 5 months ago

        Hi Akshay,

        Thanks, this will do; kindly add them to the Slides section of your proposal. We will let you know if we need more details.

        Regards,

        Natasha  

        • Akshay Bahadur ~ 5 months ago

          Hi Natasha,
          I have made the changes as per your comments.
          Let me know if you need any more details.

          Regards,
          Akshay Bahadur

