This talk focuses on querying industry-grade big data systems. Enterprises have vast amounts of information spread across structured data stores (relational databases, data warehouses, etc.). Descriptive analytics over this data is limited to experts who are familiar with complex query languages (e.g., Structured Query Language, SQL) as well as the metadata and schema associated with such large data stores. The ability to convert natural language questions to SQL statements would make descriptive analytics and reporting much easier and more widespread. The problem of automatically converting natural language questions to SQL is well studied under the name Natural Language Interface to Databases (NLIDB). We present our work on an end-to-end (E2E) system focused on NLIDB.

We describe two main aspects of E2E NLIDB systems: i) converting natural language to structured language and ii) understanding natural language. Such E2E systems have many applications across domains, e.g., healthcare, finance, and logistics.
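To make the two aspects concrete, here is a minimal Python sketch of the general idea. It is an illustration only and not the system presented in the talk: simple keyword matching stands in for the NLP and ML models, and the `orders` table, its `region` and `amount` columns, and the functions `understand`, `to_sql`, and `answer` are all hypothetical names chosen for this example.

```python
import re
import sqlite3

# Hypothetical vocabulary: phrases a user might type -> SQL fragments.
# A real NLIDB would learn or derive this from the schema and metadata.
AGGREGATES = {
    "how many": "COUNT(*)",
    "total": "SUM(amount)",
    "average": "AVG(amount)",
}
KNOWN_TABLES = ("orders",)


def understand(question):
    """Aspect ii: extract a small structured intent from the question."""
    q = question.lower()
    table = next((t for t in KNOWN_TABLES if t in q), None)
    if table is None:
        raise ValueError("question does not mention a known table")
    select = next((expr for phrase, expr in AGGREGATES.items() if phrase in q), "*")
    match = re.search(r"\bin (\w+)", q)  # e.g. "... in europe"
    region = match.group(1) if match else None
    return {"table": table, "select": select, "region": region}


def to_sql(intent):
    """Aspect i: render the intent as a parameterised SQL statement."""
    sql = f"SELECT {intent['select']} FROM {intent['table']}"
    params = ()
    if intent["region"] is not None:
        sql += " WHERE region = ?"
        params = (intent["region"],)
    return sql, params


def answer(conn, question):
    """Run both stages and return the first column of the first row."""
    sql, params = to_sql(understand(question))
    return conn.execute(sql, params).fetchone()[0]


def demo_connection():
    """Build an in-memory database with made-up rows for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("europe", 10.0), ("europe", 5.5), ("asia", 7.0)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(answer(conn, "What is the total amount of orders in Europe?"))
    print(answer(conn, "How many orders are there?"))
```

Keeping the intent as a separate intermediate structure means the understanding stage could be swapped for a learned model without touching the SQL rendering, and the filter value is passed as a query parameter rather than pasted into the SQL string.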

 
 

Outline/Structure of the Talk

  • Background (1 min)
  • Problem Statement (2 mins)
    • Specific questions
      • Unstructured to Structured language
      • Understanding natural language
  • Challenges (2 mins)
    • Main challenges, e.g.
      • Problem complexity
      • Ambiguous human language
  • Solutions/Methodology (6 mins)
    • Pipeline
    • Different Models
  • Experiments (4 mins)
  • Findings (3 mins)
  • Summary and Conclusion (2 mins)

Learning Outcome

The talk will first cover the basics of the topic, then present the background, the problem statement, and the core topics incrementally. The learning outcomes are listed below:

  • Participants will learn about querying industry-grade big data systems
  • Participants will get an overview of an end-to-end system that translates unstructured language to a structured query
  • Participants will see how natural language processing and understanding are applied to handle and interpret user queries effectively
  • Participants will learn about the findings from the system built within our enterprise to query industry-grade big data systems

Target Audience

This topic is of wide interest, and there are two main types of audience. The first is participants who want to learn more about: i) how to query databases and records (from huge structured data stores) effectively, and ii) how to handle user-generated natural language queries for industry-grade big data systems. The second is participants interested or working in natural language processing (NLP) and understanding, and its applications to big data systems.

Prerequisites for Attendees

We will start with the basics and then move towards the core aspects of the talk. All interested people are welcome to attend. There is no specific prerequisite for attendees; however, some knowledge of natural language processing and databases will be helpful.

Submitted 6 months ago

Public Feedback

  • Ashay Tamhane
    By Ashay Tamhane  ~  5 months ago

    Hi Piyush, thanks for the proposal. Could you clarify which NLP models you have utilised for solving this problem?

    • Piyush Arora
      By Piyush Arora  ~  5 months ago

      Hello Ashay,

      This talk is based on our End2End system, which is deployed to address the task of querying big databases. We have a horizontal pipeline in which NLP and ML models are the main focus. On the NLP side, the main aspect is natural language understanding, where we focus on understanding user queries, paraphrasing, and entity extraction and identification.

      More details of the initial model, which have been shared publicly, are available at:  https://dl.acm.org/doi/10.1145/3371158.3371198

      I hope that helps to answer the query.

      Best Regards

      Piyush

      • Ashay Tamhane
        By Ashay Tamhane  ~  5 months ago

        Yes, thanks Piyush!

  • Natasha Rodrigues
    By Natasha Rodrigues  ~  5 months ago

    Hi Piyush,

    Thanks for your proposal and your voice-over video. However, to help the program committee understand your presentation style, can you provide a link to a past recording, or record a short 1-2 minute trailer of your talk and share the link?

    Thanks,

    Natasha

    • Piyush Arora
      By Piyush Arora  ~  5 months ago

      Hello Natasha,

      I had a thorough look, and it seems all other recordings are voice-only, so after receiving your message I recorded a 2-3 minute video using Cisco Webex, with screen recording and the video camera on, to give a better idea of the presentation style.

      By default, Cisco Webex saves the video in the cloud, so I will share the recording as soon as I receive it at my end. I wanted to mention this because I was wondering whether I will still be able to share the video after the deadline, or whether there will be some other mechanism to share the recording.

      Best Regards

      Piyush

      • Naresh Jain
        By Naresh Jain  ~  5 months ago

        Hi Piyush,

        Don't worry, you will be able to update the video link even after the submission is closed.

        • Piyush Arora
          By Piyush Arora  ~  5 months ago

          Thanks Naresh and Natasha,

          I have attached a sample recording (about 3 mins) at https://www.computing.dcu.ie/~parora/ODSC-Recording.mp4 . I hope it looks okay and gives a sense of the presentation style.

          Best Regards

          Piyush

          • Natasha Rodrigues
            By Natasha Rodrigues  ~  5 months ago

            Hi Piyush,

            Thanks for this. Kindly add it to the video/links section of your proposal.

            Regards,

            Natasha 

            • Piyush Arora
              By Piyush Arora  ~  5 months ago

              Hello Natasha,

              I have updated the video in the video section of the proposal.

              Best,

              Piyush 

    • Piyush Arora
      By Piyush Arora  ~  5 months ago

      Thanks for the feedback Natasha,

      Sure, I will upload the recording in a while.

      Best Regards

      Piyush