Automatic Speech Recognition - Behind the Scenes

Alexa, Google, Cortana, Watson: these are household names today. Speech has become an important mode of digital interaction. As a developer or as a user, you have probably wondered how it works. Automatic Speech Recognition (ASR) is the buzzword. This presentation gives a sneak peek into how audio gets converted into text and explains the technologies and algorithms behind it.

Outline/Structure of the Talk

Automatic Speech Recognition (ASR)

  • Use-cases of ASR
  • Difficulties faced in building ASR systems
  • History of ASR
  • Introduction to Deep Learning
  • ASR process and models
  • Different types of ASR systems
  • Customising ASRs
  • Demo of ASR

Learning Outcome

This presentation aims to introduce the audience to the basics of Automatic Speech Recognition. It will also cover some deep learning basics to get the point across. Overall, the audience will get a good understanding of the steps involved in converting speech to text.

Target Audience

The session is generic in nature and caters to any kind of developer who is interested in knowing how Automatic Speech Recognition (ASR) works.

Prerequisites for Attendees

There are no prerequisites for this session. It is a general session for anyone with a technical background.

Submitted 1 year ago

Public Feedback

  • Vishal Gokhale  ~  1 year ago

    Thanks for the proposal, Poornima!
    This is an interesting topic. 

    1. From what I understand, you intend to focus on the conversion (speech to text) logic, rather than just getting the job done with tools/libraries.
      Personally, I feel that is quite important! So kudos for choosing to do so!
      If the above understanding is correct, can you please help the program committee get a feel for the depth to which you intend to cover this topic?
      I believe it would be important to expose the limitations of commonly used tools/libraries, along with the reasons.
      Feel free to change (increase/decrease) the talk duration according to the points you intend to cover.

    2. The slides and the video are not publicly accessible.
      Can you please upload both to publicly accessible platforms like SlideShare and YouTube?

  • Liked Riya Mary Roy

    Riya Mary Roy / Poornima T A - Build an Omni-Channel Experience with Artificial Intelligence

    90 Mins

    This is the era of the omni-channel experience: one that offers different integration points, such as web browser, mobile and phone call, with personalisation. Artificial Intelligence, in the form of virtual agents and conversational bots, has stepped in to deliver that seamless experience. Learn how to build an omni-channel conversational bot that can converse in an Indian language and handle complex natural language queries or long-tail questions, which will give you a head start on delivering a potent customer experience. We will take you through the services and code required to quickly build an agent that can function over different channels like Web App, Mobile App, Slack, Facebook Messenger, Skype, Alexa and IVR.