Aug 31st, 11:00 AM - 11:45 AM | Grand Ball Room 1 | 63 Interested

In evolutionary history, the evolution of sensory organs and the brain has played a very important role in helping species survive and prosper. Extending human abilities to achieve a better life and a more efficient, sustainable world is a goal of artificial intelligence. Although recent advances in machine learning enable machines to perform as well as, or even better than, humans in many intelligent tasks, including automatic speech recognition, many aspects still need to be addressed to bridge the semantic gap and achieve seamless interaction with machines. Auditory intelligence is a key technology for enabling natural human-machine interaction and expanding human auditory ability. In this talk, I am going to address three aspects of it:

(1) non-speech audio recognition,

(2) video highlight detection,

(3) one technology for surpassing human auditory ability, namely source separation.

 
 

Outline/structure of the Session

1. What is auditory intelligence?

2. Non-speech audio recognition

3. Video highlight detection

4. Audio source separation

Learning Outcome

You will learn how important it is to analyze audio signals and what we can gain from doing so.

You will also see state-of-the-art music source separation performance and a demo of automatic video highlight detection in my talk.

Target Audience

Those who are interested in applications of deep learning to audio and in research at Sony.

Prerequisite

The topics I will mention in my talk have been published as journal or conference papers. However, I will explain the basic ideas and not go into too much detail, so you do not need expertise in machine learning or deep learning.

Submitted 5 months ago

Comments

  • Vishal Gokhale
    By Vishal Gokhale  ~  5 months ago

    Thanks for the proposal, Naoya! :-)
    This is a very interesting topic.

    Can you please share links to videos of your previous talks?


    • Liked Dr. Dakshinamurthy V Kolluru

      Dr. Dakshinamurthy V Kolluru - ML and DL in Production: Differences and Similarities

      45 Mins
      Talk
      Beginner

      While architecting a data-based solution, one needs to approach the problem differently depending on the specific strategy being adopted. In traditional machine learning, the focus is mostly on feature engineering. In DL, the emphasis is shifting to tagging larger volumes of data, with less focus on feature development. Similarly, synthetic data is a lot more useful in DL than in ML. So, the data strategies can be significantly different. Both call for very similar approaches to error analysis, but in most development processes those approaches are not followed, leading to substantial delays in reaching production. Hyperparameter tuning for performance improvement requires different strategies for ML and DL solutions because of the longer training times of DL systems. Transfer learning is a very important aspect to evaluate when building any state-of-the-art system, whether ML or DL. Last but not least is understanding the biases that the system is learning. Deeply non-linear models require special attention in this respect, as they can learn highly undesirable features.

      In our presentation, we will focus on all the above aspects with suitable examples and provide a framework for practitioners for building ML/DL applications.

    • Liked Swapan Rajdev

      Swapan Rajdev - Conversational Agents at Scale: Retrieval and Generative approaches

      Swapan Rajdev
      CTO
      Haptik
      4 months ago
      Sold Out!
      45 Mins
      Talk
      Beginner

      Conversational agents (chatbots) are machine learning programs designed to hold a conversation with a human and help them complete a particular task. In recent years, people have been using chatbots to communicate with businesses, get daily tasks done, and much more.

      With the emergence of open-source software and online platforms, building a basic conversational agent has become easier, but making agents work across multiple domains and handle millions of requests is still a challenge.

      In this talk, I am going to cover the different algorithms used to build good chatbots and the challenges of running them at scale in production.

    • Liked Mahesh Balaji

      Mahesh Balaji - Deep Learning in Medical Image Diagnostics

      Mahesh Balaji
      Sr. Director
      Cognizant
      4 months ago
      Sold Out!
      45 Mins
      Talk
      Intermediate

      Convolutional Neural Networks are revolutionizing the field of medical imaging analysis and computer-aided diagnostics. Medical images, from X-rays, CT, MRI, and retinal scans to digitized biopsy slides, are an integral part of a patient’s EHR. Current manual analysis and diagnosis by human radiologists and pathologists is prone to undue delays and erroneous diagnoses, and can therefore benefit from deep-learning-based AI for quantitative, standardized computer-aided diagnostic tools.

      In this session, we will review the state of the art in medical imaging and diagnostics, and important tasks like classification, localization, detection, segmentation and registration, along with the CNN architectures that enable them. Further, we will briefly cover data augmentation techniques and transfer learning, and walk through two case studies on Diabetic Retinopathy and Breast Cancer Diagnosis. Finally, we discuss inherent challenges, from sourcing training data to model interpretability.

    • Liked Dr. Rohit M. Lotlikar

      Dr. Rohit M. Lotlikar - The Impact of Behavioral Biases to Real-World Data Science Projects: Pitfalls and Guidance

      45 Mins
      Talk
      Intermediate

      Data science projects, unlike their software counterparts, tend to be uncertain and rarely fit into a standardized approach. Each organization has its own unique processes, tools, culture, data and inefficiencies, and a templatized approach, more common for software implementation projects, rarely fits.

      In a typical data science project, a data science team is attempting to build a decision support system that will either automate human decision making or assist a human in decision making. The dramatic rise in interest in data science means the typical data science project has a large proportion of relatively inexperienced members whose learnings draw heavily from academia, data science competitions and general IT/software projects.

      These data scientists learn over time, however, that the real world is very different from the world of data science competitions. In the real world, problems are ill-defined, data may not exist to start with, and it is not just model accuracy, complexity and performance that matter but also the ease of infusing domain knowledge, interpretability and the ability to provide explanations, the level of skill needed to build and maintain the model, the stability and robustness of the learning, ease of integration with enterprise systems, and ROI.

      Human factors play a key role in the success of such projects. Managers making the transition from IT/software delivery to data science frequently do not allow for sufficient uncertainty in outcomes when planning projects. Senior leaders and sponsors are under pressure to deliver outcomes but are unable to make a realistic assessment of payoffs and risks and to set investment and expectations accordingly. This makes the journey and outcome sensitive to various behavioural biases of project stakeholders. Knowing what the typical behavioural biases and pitfalls are makes it easier to identify them upfront and take corrective action.

      The speaker brings his nearly two decades of experience working at startups, in R&D and in consulting to lay forth these recurring behavioural biases and pitfalls.

      Many of the biases covered are grounded in the speaker's first-hand experience. The talk will provide examples of these biases and suggestions on how to identify and overcome or correct for them.

    • Liked Anuj Gupta

      Anuj Gupta - Sarcasm Detection : Achilles Heel of sentiment analysis

      Anuj Gupta
      Independent Researcher
      6 months ago
      Sold Out!
      45 Mins
      Talk
      Intermediate

      Sentiment analysis has long been the poster-boy problem of NLP and has attracted a lot of research. However, despite so much work in this sub-area, most sentiment analysis models fail miserably at handling sarcasm. The rise in the use of sentiment models for analyzing social data has only exposed this gap further. Owing to the subtlety of the language involved, sarcasm detection is not easy and has fascinated the NLP community.

      Most attempts at sarcasm detection still depend on hand-crafted features that are dataset-specific. In this talk, we look at some very recent attempts to leverage advances in NLP to build generic models for sarcasm detection.

      Key takeaways:
      + Challenges in sarcasm detection
      + Deep dive into an end-to-end solution using DL to build generic models for sarcasm detection
      + Shortcomings and the road ahead

    • Liked Dr. Arun Verma

      Dr. Arun Verma - Extracting Embedded Alpha Factors From Alternative Data Using Statistical Arbitrage and Machine Learning

      45 Mins
      Case Study
      Intermediate

      The high volume and time sensitivity of news and social media stories require automated processing to quickly extract actionable information. However, the unstructured nature of textual information presents challenges that are comfortably addressed through machine learning techniques.

    • Liked joydeep bhattacharjee

      joydeep bhattacharjee - Cutting edge NLP with fastText

      joydeep bhattacharjee
      Software Engineer
      Nineleaps
      4 months ago
      Sold Out!
      45 Mins
      Talk
      Intermediate

      FastText was open-sourced by Facebook in 2016 and, with its release, became the fastest and most cutting-edge library for text classification and word representation. It includes implementations of two extremely important methodologies in NLP, namely the Continuous Bag of Words and Skip-gram models. FastText performs exceptionally well with supervised as well as unsupervised learning.

    • Liked Harish Kashyap K

      Harish Kashyap K - Probabilistic Graphical Models (PGMs) for Fraud Detection and Risk Analysis.

      45 Mins
      Talk
      Advanced

      PGMs are generative models that are extremely useful for modeling stochastic processes. I shall talk about how fraud models and credit risk models can be built using Bayesian Networks. Generative models are great alternatives to deep neural networks, which cannot solve such problems. This talk focuses on Bayesian Networks, Markov Models, HMMs and their applications. Many areas of ML need to explain causality, and PGMs offer features that enable causal explanations.