Member for 1 year
I am part of the Intuit AI team. Prior to this, I headed ML efforts at Huawei Technologies, Freshworks (Chennai), and Airwoot (Delhi). I did my master's in theoretical computer science at IIIT Hyderabad, and I dropped out of my PhD at IIT Delhi to work with startups.
I am a regular speaker at ML conferences such as PyData, NVIDIA forums, Fifth Elephant, and Anthill. I have conducted a number of workshops attended by machine learning practitioners, and I am a co-organizer of one of the early deep learning meetups in Bangalore. I was also the editor of "Anthill-2018", a deep-learning-focused conference by HasGeek.
Continuous Learning Systems: Building ML systems that keep learning from their mistakes
Anuj Gupta, Scientist, Intuit
2 weeks ago · Sold Out!
Wouldn't it be great to have ML models that can update their "learning" as and when they make a mistake and a correction is provided in real time? In this talk, we look at a concrete business use case that warrants such a system. We will take a deep dive into the use case and how we went about building a continuously learning system for text classification: the approaches we took and the results we got.
For most machine learning systems, the "train once, just predict thereafter" paradigm works well. However, there are scenarios where this paradigm does not suffice and the model needs to be updated often. Two of the most common cases are:
- When the distribution is non-stationary, i.e. the distribution of the data changes over time. This implies that the test data will eventually have a very different distribution from the training data.
- The model needs to learn from its mistakes.
While (1) is often addressed by retraining the model, (2) is typically addressed with batch updates. Batch updating requires collecting a sizeable number of feedback points. What if you have far fewer feedback points? You need a model that can learn continuously — as and when it makes a mistake and feedback is provided. To the best of our knowledge, the literature on this is very limited.
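The talk's actual system is not described here, but the core idea — updating a text classifier from a single feedback point instead of a batch — can be sketched with a simple online perceptron over hashed bag-of-words features. Everything below (names, the perceptron rule, the hashing scheme) is illustrative, not the approach from the talk:

```python
# Illustrative sketch of a continuously learning text classifier:
# a binary perceptron over hashed bag-of-words features, updated
# one feedback point at a time (not the system from the talk).
DIM = 2 ** 16  # size of the hashed feature space

def featurize(text):
    """Map text to a list of hashed bag-of-words feature indices."""
    return [hash(tok) % DIM for tok in text.lower().split()]

class OnlineTextClassifier:
    def __init__(self):
        self.w = [0.0] * DIM  # one weight per hashed feature

    def predict(self, text):
        score = sum(self.w[i] for i in featurize(text))
        return 1 if score >= 0 else 0

    def feedback(self, text, label):
        """Update weights only when the model is wrong (perceptron rule)."""
        if self.predict(text) != label:
            step = 1.0 if label == 1 else -1.0
            for i in featurize(text):
                self.w[i] += step

clf = OnlineTextClassifier()
clf.feedback("refund my payment", 1)    # correction arrives in real time
clf.feedback("great product thanks", 0)
```

Unlike batch retraining, each `feedback` call costs only one forward pass and a sparse weight update, which is what makes per-mistake learning feasible.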
NLP Bootcamp
Anuj Gupta, Scientist, Intuit
2 weeks ago · Sold Out!
Recent advances in machine learning have rekindled the quest to build machines that can interact with the outside environment like we humans do: using visual cues, voice, and text. An important piece of this trilogy is systems that can process and understand text in order to automate various workflows such as chatbots, named entity recognition, machine translation, information extraction, summarization, FAQ systems, etc.
A key step towards achieving any of the above tasks is using the right set of techniques to represent text in a form that a machine can understand. Unlike images, where directly using pixel intensities is a natural representation, text has no such natural representation. No matter how good your ML algorithm is, it can only do so much unless there is a richer way to represent the underlying text data. Thus, whatever NLP application you are building, it is imperative to find a good representation for your text data.
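As a minimal, from-scratch illustration of what "representing text" means, here is a toy TF-IDF vectorizer over a tiny corpus. The corpus and every name below are made up for illustration; in practice one would use a library such as scikit-learn or Gensim, which the bootcamp's stack covers:

```python
import math
from collections import Counter

# Toy TF-IDF representation of a tiny, made-up corpus.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

tokenized = [doc.split() for doc in corpus]
vocab = sorted({tok for doc in tokenized for tok in doc})
n_docs = len(tokenized)

# Document frequency: in how many documents each term appears.
df = Counter(tok for doc in tokenized for tok in set(doc))

def tfidf(doc_tokens):
    """Dense TF-IDF vector over the corpus vocabulary."""
    tf = Counter(doc_tokens)
    return [
        (tf[t] / len(doc_tokens)) * math.log(n_docs / df[t])
        for t in vocab
    ]

vectors = [tfidf(doc) for doc in tokenized]
```

Each document becomes a fixed-length numeric vector — exactly the kind of representation a downstream ML algorithm can consume; richer schemes (word embeddings, sentence embeddings) refine this same idea.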
In this bootcamp, we will cover the key concepts, maths, and code behind state-of-the-art techniques for text representation. The aim is to demystify both the theory (key concepts, maths) and the practice (code) that go into building these techniques. By the end of the bootcamp, participants will have a fundamental understanding of these schemes and the ability to implement them on datasets of their interest.
This will be a 1-day, instructor-led, hands-on training session on learning and implementing an end-to-end deep learning model for natural language processing.
A Hands-on Introduction to Natural Language Processing
Data is the new oil, and unstructured data — especially text, images, and videos — contains a wealth of information. However, due to the inherent complexity of processing and analyzing this data, people often refrain from spending the extra time and effort to venture beyond structured datasets and analyze these unstructured sources, which can be a potential gold mine. Natural Language Processing (NLP) is all about leveraging tools, techniques, and algorithms to process and understand natural-language data, which is usually unstructured, like text and speech. In this workshop, we will look at tried and tested strategies, techniques, and workflows that practitioners and data scientists can leverage to extract useful insights from text data.
Specializing in domains like computer vision and natural language processing is no longer a luxury but a necessity expected of any data scientist in today's fast-paced world! With a hands-on and interactive approach, we will understand essential concepts in NLP, along with extensive case studies and hands-on examples, to master state-of-the-art tools, techniques, and frameworks for applying NLP to solve real-world problems. We leverage Python 3 and the latest state-of-the-art frameworks — including NLTK, Gensim, spaCy, scikit-learn, TextBlob, Keras, and TensorFlow — in our examples.
In my journey in this field so far, I have struggled with various problems, faced many challenges, and learned many lessons. This workshop distills a major chunk of the knowledge I have gained in the world of text analytics and natural language processing, where building a fancy word cloud from a bunch of text documents is not enough anymore. Perhaps the biggest problem in learning text analytics is not a lack of information but too much of it, often called information overload. There are so many resources, papers, books, and journals that they often overwhelm someone new to the field. You might have had questions like "What is the right technique to solve a problem?", "How does text summarization really work?", and "Which are the best frameworks for multi-class text categorization?", among many others! Based on my experience, including publishing a couple of books in this domain, this workshop should help participants avoid the pressing issues I have faced and learn strategies to master NLP.
This workshop follows a comprehensive and structured approach. The initial modules tackle the basics of natural language understanding and of using Python to handle text data. Once you are familiar with the basics, we cover text processing, parsing, and understanding. The remaining modules each address an interesting problem in text analytics, including text classification, clustering and similarity analysis, text summarization and topic models, semantic analysis and named entity recognition, and sentiment analysis and model interpretation. The final module looks at recent advancements in NLP driven by deep learning and transfer learning, and covers an example of text classification with universal sentence embeddings.
Sarcasm Detection: The Achilles' Heel of Sentiment Analysis
Anuj Gupta, Scientist, Intuit
1 year ago · Sold Out!
Sentiment analysis has long been the poster-boy problem of NLP and has attracted a lot of research. However, despite so much work in this subarea, most sentiment analysis models fail miserably at handling sarcasm. The rising use of sentiment models to analyze social data has only exposed this gap further. Owing to the subtlety of the language involved, sarcasm detection is not easy and has fascinated the NLP community.
Most attempts at sarcasm detection still depend on hand-crafted features that are dataset-specific. In this talk, we look at some very recent attempts to leverage advances in NLP to build generic models for sarcasm detection.
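For context on what "hand-crafted, dataset-specific features" means, here is a toy sketch of the kind of surface features older sarcasm baselines used. The feature names, word lists, and logic are purely illustrative — not from the talk or any specific paper — and real systems would feed such features into a trained classifier rather than use fixed rules:

```python
# Toy hand-crafted features of the kind older sarcasm detectors used
# (purely illustrative; word lists and features are made up, and real
# systems combine many more signals with a trained classifier).
POSITIVE = {"love", "great", "best"}
NEGATIVE = {"mondays", "traffic", "delays"}

def sarcasm_features(text):
    raw_tokens = text.split()
    tokens = [t.strip('!?.",').lower() for t in raw_tokens]
    return {
        # Heavy punctuation often marks exaggerated tone.
        "exclaim_count": text.count("!"),
        # ALL-CAPS words suggest shouting/emphasis.
        "all_caps_words": sum(t.isupper() and len(t) > 1 for t in raw_tokens),
        # Positive word applied to a commonly disliked topic.
        "pos_neg_contrast": bool(set(tokens) & POSITIVE)
                            and bool(set(tokens) & NEGATIVE),
    }

feats = sarcasm_features("Oh I just LOVE Mondays!!!")
```

The brittleness is visible immediately: the word lists only fit one domain, which is exactly the dataset-specificity that motivates the generic, learned models the talk discusses.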
Key takeaways:
+ Challenges in sarcasm detection
+ A deep dive into an end-to-end solution using DL to build generic models for sarcasm detection
+ Shortcomings and the road ahead