The Natural Language Decathlon: A Multitask Challenge for NLP
Deep learning has significantly improved state-of-the-art performance for natural language processing (NLP) tasks, but each one is typically studied in isolation. The Natural Language Decathlon (decaNLP) is a new benchmark for studying general NLP models that can perform a variety of complex, natural language tasks. By requiring a single system to perform ten disparate natural language tasks, decaNLP offers a unique setting for multitask, transfer, and continual learning. It is publicly available on GitHub and can be used for tasks such as question answering, machine translation, summarization, and sentiment analysis.
Outline/Structure of the Talk
- Introduction to DecaNLP
- Targeted NLP Tasks
- Open Source Collaboration on GitHub
- Patents / Publications in NLP, Computer Vision, AI.
People will be able to understand different problems of NLP like:
1. Question Answering
2. Machine Translation
3. Summarization
4. Natural Language Inference
5. Sentiment Analysis
6. Semantic Role Labeling
7. Relation Extraction
8. Goal-Oriented Dialogue
9. Semantic Parsing
10. Commonsense Reasoning
People will learn about the unified framework that decaNLP provides for solving the NLP tasks listed above.
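To make the idea of a unified framework concrete: decaNLP casts every task as question answering over a context, so one model interface covers all ten tasks. The sketch below illustrates that framing only. The `Example` class, the `to_model_input` helper, and the example strings are assumptions made for illustration and are not taken from the decaNLP codebase.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One task instance expressed as a (question, context, answer) triple."""
    task: str
    question: str
    context: str
    answer: str


# Illustrative instances only; the wording is hypothetical, not copied
# from the benchmark data.
EXAMPLES = [
    Example(
        task="sentiment analysis",
        question="Is this review positive or negative?",
        context="The food was wonderful and the service was quick.",
        answer="positive",
    ),
    Example(
        task="machine translation",
        question="What is the translation from English to German?",
        context="Good morning.",
        answer="Guten Morgen.",
    ),
    Example(
        task="natural language inference",
        question="Hypothesis: The restaurant was closed. "
                 "Entailment, neutral, or contradiction?",
        context="Premise: We had dinner at the restaurant last night.",
        answer="contradiction",
    ),
]


def to_model_input(example: Example) -> str:
    """Flatten a triple into the single text input a shared model would read."""
    return f"question: {example.question} context: {example.context}"


if __name__ == "__main__":
    for ex in EXAMPLES:
        print(f"[{ex.task}] {to_model_input(ex)} -> {ex.answer}")
```

Because every task shares the same input and output shape, adding a task means adding data rather than adding a task-specific model head.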
The talk is aimed at people with basic knowledge of NLP, machine learning, and deep learning.
Prerequisites for Attendees
Read up on the basics of NLP, machine learning, and deep learning.
Submitted 8 months ago
People who liked this proposal, also liked:
Ishita Mathur - How GO-FOOD built a Query Semantics Engine to help you find food faster
Ishita Mathur, Data Scientist, GO-JEK Tech
8 months ago. Sold Out!
Context: The Search problem
GOJEK is a SuperApp: 19+ apps within an umbrella app. One of these is GO-FOOD, the first food delivery service in Indonesia and the largest food delivery service in Southeast Asia. There are over 300 thousand restaurants on the platform with a total of over 16 million dishes between them.
Over two-thirds of those who order food online using GO-FOOD do so by utilising text search. Search engines are so essential to our everyday digital experience that we don’t think twice when using them anymore. Search engines involve two primary tasks: retrieval of documents and ranking them in order of relevance. While improving that ranking is an extremely important part of improving the search experience, actually understanding that query helps give the searcher exactly what they’re looking for. This talk will show you what we are doing to make it easy for users to find what they want.
GO-FOOD uses the ElasticSearch stack with restaurant and dish indexes to search for what the user types. However, this yields only exact text matches and, at best, fuzzy matches. We wanted to create a holistic search experience that not only personalised search results, but also retrieved restaurants and dishes that were more relevant to what the user was looking for. We are doing this not only by taking advantage of ElasticSearch features, but also by developing a Query Semantics Engine.
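As a rough illustration of what a fuzzy text match looks like, the sketch below builds an Elasticsearch `match` query with the `fuzziness` parameter. The field name `dish_name` and the helper `build_dish_query` are hypothetical; GO-FOOD's actual index mapping and queries are not described in this abstract.

```python
import json


def build_dish_query(text: str, fuzziness: str = "AUTO") -> dict:
    """Build a request body for a fuzzy full-text match on a dish index.

    "dish_name" is a hypothetical field name chosen for this example.
    """
    return {
        "query": {
            "match": {
                "dish_name": {
                    "query": text,
                    # "AUTO" lets the engine pick an edit distance
                    # based on the length of each term.
                    "fuzziness": fuzziness,
                }
            }
        }
    }


if __name__ == "__main__":
    # A misspelled query that an exact match would miss.
    print(json.dumps(build_dish_query("nasi gorng"), indent=2))
```

A query like this tolerates typos, but it still matches on surface text only, which is the limitation that motivates understanding the meaning of the query.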
Query Understanding: What & Why
This is where Query Understanding comes into the picture. It’s about using NLP to correctly identify the search intent behind the query and return more relevant search results; it’s about the interpretation process before the results are even retrieved and ranked. The semantic neighbours of the query itself become the focus of the search process: after all, if I don’t understand what you’re trying to ask for, how will I give you what you want?
During this talk, you will learn how we are taking advantage of word embeddings to build a Query Understanding Engine that is holistically designed to make the customer’s experience as smooth as possible. I will go over the techniques we used to build each component of the engine, the data and algorithmic challenges we faced, and how we solved each problem we came across.
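One common way to find the semantic neighbours of a query with word embeddings is to rank candidate terms by cosine similarity. The sketch below shows that idea with hand-made three-dimensional vectors. The vectors, the vocabulary, and the function names are assumptions for illustration; the talk's actual embeddings and engine components are not specified here.

```python
import math

# Toy embeddings invented for this example; real embeddings would be
# learned from data and have far more dimensions.
EMBEDDINGS = {
    "fried rice": [0.90, 0.10, 0.00],
    "nasi goreng": [0.85, 0.15, 0.05],
    "iced tea": [0.00, 0.20, 0.90],
    "es teh": [0.05, 0.25, 0.85],
}


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def neighbours(query: str, k: int = 2) -> list[tuple[str, float]]:
    """Return the k terms most similar to the query, best first."""
    q = EMBEDDINGS[query]
    scored = [
        (term, cosine(q, vec))
        for term, vec in EMBEDDINGS.items()
        if term != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    for term, score in neighbours("fried rice"):
        print(f"{term}: {score:.3f}")
```

With these toy vectors, "nasi goreng" ranks closest to "fried rice" even though the two strings share no characters, which is the kind of match that exact or fuzzy text search cannot produce.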
Joy Mustafi / Aditya Bhattacharya - Person Identification via Multi-Modal Interface with Combination of Speech and Image Data
Joy Mustafi, Founder and President, MUST Research
Aditya Bhattacharya, AI Researcher, MUST Research
10 months ago. Sold Out!
Having multiple modalities in a system gives users more affordances and can make the system more robust. It also improves accessibility for users who work more effectively with certain modalities. Multiple modalities can serve as a backup when certain forms of communication are not possible. This is especially true of redundant modalities, in which two or more modalities communicate the same information. Certain combinations of modalities can also make a computer-human or human-computer interaction more expressive, because each modality may be more effective than the others at expressing one form or aspect of information. For example, MUST researchers are working on a personalized humanoid equipped with various types of input devices and sensors that allow it to receive information from humans. These inputs are interchangeable and offer a standardized method of communication with the computer, affording practical adjustments to the user, providing a richer interaction depending on the context, and supporting a robust system with features such as keyboard, pointing device, touchscreen, computer vision, speech recognition, motion, and orientation.
There are six types of cooperation between modalities, and they help define how a combination or fusion of modalities works together to convey information more effectively.
- Equivalence: information is presented in multiple ways and can be interpreted as the same information
- Specialization: when a specific kind of information is always processed through the same modality
- Redundancy: multiple modalities process the same information
- Complementarity: multiple modalities take separate information and merge it
- Transfer: a modality produces information that another modality consumes
- Concurrency: multiple modalities take in separate information that is not merged
Computer - Human Modalities
Computers utilize a wide range of technologies to communicate and send information to humans:
- Vision - computer graphics typically through a screen
- Audition - various audio outputs
Adaptive: They MUST learn as information changes, and as goals and requirements evolve. They MUST resolve ambiguity and tolerate unpredictability. They MUST be engineered to feed on dynamic data in real time.
Interactive: They MUST interact easily with users so that those users can define their needs comfortably. They MUST interact with other processors, devices, services, as well as with people.
Iterative and Stateful: They MUST aid in defining a problem by asking questions or finding additional source input if a problem statement is ambiguous or incomplete. They MUST remember previous interactions in a process and return information that is suitable for the specific application at that point in time.
Contextual: They MUST understand, identify, and extract contextual elements such as meaning, syntax, time, location, appropriate domain, regulation, user profile, process, task and goal. They may draw on multiple sources of information, including both structured and unstructured digital information, as well as sensory inputs (visual, gestural, auditory, or sensor-provided).
Multi-Modal Interaction: https://www.youtube.com/watch?v=jQ8Gq2HWxiA
Gesture Detection: https://www.youtube.com/watch?v=rDSuCnC8Ei0
Speech Recognition: https://www.youtube.com/watch?v=AewM3TsjoBk
Assignment (Hands-on Challenge for Attendees)
Real-time multi-modal access control system for authorized access to a work environment. All the key concepts and individual steps will be demonstrated and explained in this workshop, and attendees will need to customize the generic code or approach for this assignment or hands-on challenge.
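As a starting point for thinking about the assignment, the sketch below combines a face-match score and a voice-match score with a weighted average (score-level fusion) and grants access above a threshold. This is one common fusion option, not necessarily the approach demonstrated in the workshop. The function names, the weight, the threshold, and the assumption that both scores lie in [0, 1] are all hypothetical.

```python
def fuse_scores(face_score: float, voice_score: float,
                face_weight: float = 0.5) -> float:
    """Weighted average of two match scores, each assumed to lie in [0, 1]."""
    for name, value in (("face_score", face_score),
                        ("voice_score", voice_score),
                        ("face_weight", face_weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return face_weight * face_score + (1.0 - face_weight) * voice_score


def grant_access(face_score: float, voice_score: float,
                 threshold: float = 0.8) -> bool:
    """Grant access only when the fused score reaches the threshold."""
    return fuse_scores(face_score, voice_score) >= threshold


if __name__ == "__main__":
    # Both modalities agree strongly: access is granted.
    print(grant_access(face_score=0.9, voice_score=0.9))
    # The voice does not match: the fused score falls below the threshold.
    print(grant_access(face_score=0.9, voice_score=0.2))
```

In a real system the two scores would come from face recognition and speaker verification models, and the weight and threshold would be tuned on labelled access attempts.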