Sr. Lead Data Scientist
Envestnet | Yodlee
Member since 11 months
Shibsankar works at Envestnet | Yodlee as a Senior Lead Data Scientist. Before that, he completed a master’s degree at the Indian Institute of Science, Bangalore, and worked at Microsoft Research and Capgemini. He was named among the “40 Under 40 Data Scientists” by Analytics India Magazine for demonstrating expertise in foundational Machine Learning and Analytics, particularly in Deep Learning, Generative Models and Deep Reinforcement Learning.
Algorithms that learn to solve tasks by watching (one) YouTube video
Samiran Roy, Sr. Lead Data Sciences, Envestnet | Yodlee
Shibsankar Das, Sr. Lead Data Scientist, Envestnet | Yodlee
Two branches of AI, Deep Learning (DL) and Reinforcement Learning (RL), are now responsible for many real-world applications. Machine Translation, Speech Recognition, Object Detection, Robot Control, and Drug Discovery are some of the numerous examples.
Both approaches are data hungry: DL requires many examples of each class, and RL needs to play through many episodes to learn a policy. Contrast this with human intelligence. A small child can typically see an image just once and instantly recognize it in other contexts and environments. We seem to possess an innate model or representation of how the world works, which helps us grasp new concepts and adapt to new situations quickly. Humans are excellent one-shot and few-shot learners. We are able to learn complex tasks by observing and imitating other humans (e.g. cooking, dancing or playing soccer), despite having a different point of view, sense modalities, body structure and mental faculties.
Humans may be very good at picking up novel tasks, but Deep RL agents surpass us in performance. Once a Deep RL agent has learned a good representation, it can surpass human performance in complex tasks like Go, Dota 2, and StarCraft. We are biologically limited by time, memory and computation, whereas a computer can be made to simulate thousands of plays in a minute.
RL struggles with tasks that have sparse rewards. Take the example of a soccer-playing robot, controlled by applying a torque to each of its joints. The environment rewards the robot only when it scores a goal. If the policy is initialized randomly (we apply a random torque to each joint every few milliseconds), the probability of the robot scoring a goal is negligible; it won't even be able to learn how to stand up. In tasks requiring long-term planning or low-level skills, getting to that initial reward can prove impossible. These situations have the potential to benefit greatly from a demonstration: in this case, showing the robot how to walk and kick, and then letting it figure out how to score a goal.
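To make the sparse-reward problem concrete, here is a minimal sketch in Python. It replaces the soccer robot with a hypothetical one-dimensional corridor (an assumption made purely for illustration): the agent starts at position 0 and is rewarded only if it reaches the goal within a fixed number of steps. The names `run_episode`, `random_policy` and `demonstrated_policy` are illustrative and do not come from the talk.

```python
import random


def run_episode(policy, rng, goal=20, max_steps=40):
    """Return 1.0 if the agent reaches `goal` within `max_steps`, else 0.0.

    The reward is sparse: nothing is paid out before the goal is reached.
    """
    position = 0
    for _ in range(max_steps):
        # The corridor has a wall at 0, so the position never goes negative.
        position = max(0, position + policy(position, rng))
        if position >= goal:
            return 1.0
    return 0.0


def random_policy(position, rng):
    """Randomly initialized behaviour: step left or right with equal chance."""
    return rng.choice((-1, 1))


def demonstrated_policy(position, rng):
    """Behaviour copied from a (hypothetical) demonstration: always step right."""
    return 1


def success_rate(policy, episodes=2000, seed=0):
    """Fraction of episodes in which the policy collects the sparse reward."""
    rng = random.Random(seed)
    return sum(run_episode(policy, rng) for _ in range(episodes)) / episodes


if __name__ == "__main__":
    print("random policy:      ", success_rate(random_policy))
    print("demonstrated policy:", success_rate(demonstrated_policy))
```

Under these assumptions the random policy almost never sees a reward, so it has very little signal to learn from, while a policy that simply copies the demonstrated behaviour reaches the goal every time.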
We have an abundance of visual data on humans performing various tasks, available in the public domain as videos from sources like YouTube. On YouTube alone, 400 hours of video are uploaded every minute, and it is easy to find demonstration videos for almost any skill imaginable. What if we could harness this by designing agents that learn how to perform tasks just by watching a video clip?
Imitation Learning, also known as apprenticeship learning, teaches an agent a sequence of decisions through demonstration, often by a human expert. It has been used in many applications, such as teaching drones how to fly and autonomous cars how to drive. It relies on domain-engineered features or extremely precise representations such as motion capture (mocap). Directly applying imitation learning to learn from videos proves challenging because there is a misalignment of representation between the demonstrations and the agent’s environment. For example, how can a robot that senses its world through a 3D point cloud learn from a noisy 2D video clip of a soccer player dribbling?
Leveraging recent advances in Reinforcement Learning, Self-Supervised Learning and Imitation Learning, we present a technical deep dive into an end-to-end framework which:
1) Has prior knowledge about the world through Self-Supervised Learning, a relatively new area which seeks to build efficient deep learning representations from unlabelled data by training on a surrogate task. The surrogate task can be rotating an image and predicting the rotation angle, cropping two patches of the image and predicting their relative positions, or a combination of several such objectives.
2) Has the ability to align the representation of how it senses the world, with that of the video - allowing it to learn diverse tasks from video clips.
3) Has the ability to reproduce a skill from only a single demonstration, using applied techniques from imitation learning.
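The rotation-based surrogate task mentioned in point 1 can be sketched as follows. This is a minimal illustration, not the framework presented in the talk: images are assumed to be small 2D grids stored as lists of rows, and the helper names (`rotate90`, `rotate`, `make_rotation_examples`) are hypothetical. The point is that labels are generated from the data itself, so no human annotation is needed.

```python
import random


def rotate90(image):
    """Rotate a 2D grid (a list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate(image, quarter_turns):
    """Rotate `image` clockwise by `quarter_turns` * 90 degrees (returns a copy)."""
    result = [list(row) for row in image]
    for _ in range(quarter_turns % 4):
        result = rotate90(result)
    return result


def make_rotation_examples(images, rng):
    """Turn unlabelled images into (rotated_image, label) training pairs.

    The label is the number of clockwise quarter turns applied (0 to 3).
    A model trained to predict it has to pick up on the image content.
    """
    examples = []
    for image in images:
        quarter_turns = rng.randrange(4)
        examples.append((rotate(image, quarter_turns), quarter_turns))
    return examples


if __name__ == "__main__":
    unlabelled = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    for rotated, label in make_rotation_examples(unlabelled, random.Random(0)):
        print(label, rotated)
```

A classifier trained on these pairs would then be reused as a feature extractor for the downstream task; that training step is omitted here.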
Semi-Supervised Insight generation from petabyte scale Text data
Samiran Roy, Sr. Lead Data Sciences, Envestnet | Yodlee
Shibsankar Das, Sr. Lead Data Scientist, Envestnet | Yodlee
Existing state-of-the-art supervised methods in Machine Learning require large amounts of annotated data to achieve good performance and generalization. However, manually constructing such a training data set with sentiment labels is a labor-intensive and time-consuming task. With the proliferation of data acquisition in domains such as images, text and video, the rate at which we acquire data is greater than the rate at which we can label it. Techniques that reduce the amount of labelled data needed to achieve competitive accuracies are of paramount importance for deploying scalable, data-driven, real-world solutions. Semi-Supervised Learning algorithms generally provide a way of learning about the structure of the data from the unlabelled examples, alleviating the need for labels.
At Envestnet | Yodlee, we have deployed several state-of-the-art Machine Learning solutions which process millions of data points daily under very stringent service-level commitments. A key aspect of our Natural Language Processing solutions is Semi-Supervised Learning (SSL): a family of methods that also make use of unlabelled data for training, typically a small amount of labelled data together with a large amount of unlabelled data. Purely supervised solutions fail to exploit the rich syntactic structure of the unlabelled data to improve decision boundaries.
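One common member of the SSL family is self-training (also called pseudo-labelling), sketched below. It is shown only as an illustration of combining a little labelled data with more unlabelled data; it is not claimed to be the method deployed at Envestnet | Yodlee. The nearest-centroid classifier, the 2D points standing in for text features, and the `min_margin` threshold are all assumptions made for this example.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def distance(a, b):
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def fit(labelled):
    """Fit a nearest-centroid model from (point, label) pairs."""
    by_label = {}
    for point, label in labelled:
        by_label.setdefault(label, []).append(point)
    return {label: centroid(points) for label, points in by_label.items()}


def predict(model, point):
    """Return (label, margin); the margin is the distance gap between
    the nearest and the second-nearest centroid, used as confidence."""
    ranked = sorted((distance(point, c), label) for label, c in model.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def self_train(labelled, unlabelled, min_margin=1.0, rounds=5):
    """Repeatedly pseudo-label confident unlabelled points and refit.

    Returns the final model and the points that were never labelled.
    """
    labelled, pool = list(labelled), list(unlabelled)
    for _ in range(rounds):
        model = fit(labelled)
        confident, remaining = [], []
        for point in pool:
            label, margin = predict(model, point)
            if margin >= min_margin:
                confident.append((point, label))
            else:
                remaining.append(point)
        if not confident:
            break  # nothing new to learn from
        labelled.extend(confident)
        pool = remaining
    return fit(labelled), pool


if __name__ == "__main__":
    seed_labels = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
    unlabelled = [(1, 1), (2, 0), (9, 9), (8, 10), (5, 5)]
    model, leftover = self_train(seed_labels, unlabelled)
    print(model, leftover)
```

The confidence threshold matters in practice: set too low, wrong pseudo-labels reinforce themselves; set too high, the unlabelled data is never used.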
There is an abundance of published work in the field, but few papers have succeeded in showing significantly better results than state-of-the-art supervised learning. Often, methods rely on simplifying assumptions that fail to transfer to real-world scenarios. There is a lack of practical guidelines for deploying effective SSL solutions. We attempt to bridge that gap by sharing our learnings from successful SSL models deployed in production.
We will talk about best practices and challenges in deploying SSL solutions in NLP. We shall cover:
- Our findings while working on SSL
- Techniques which have worked for us, and which have not
- Which SSL method is suitable for a given use case
- How to deal with different distributions for labelled and unlabelled data
- How to quantify the effectiveness of each point in our training data
- How to build a feedback loop that chooses points for training that result in the greatest accuracy boosts
- The effect of the relative sizes of labelled and unlabelled data
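As a rough illustration of the feedback-loop item in the list above, the sketch below uses uncertainty sampling, a standard active-learning heuristic: the points whose predicted class distribution has the highest entropy are sent for labelling first. The abstract does not say which selection criterion the speakers use, so this is an assumed stand-in, and the function names and example probabilities are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_most_uncertain(predictions, k):
    """Pick the k items the current model is least sure about.

    `predictions` maps an item id to its predicted class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical model outputs for three unlabelled text snippets.
    predictions = {
        "t1": [0.98, 0.02],
        "t2": [0.55, 0.45],
        "t3": [0.80, 0.20],
    }
    print(select_most_uncertain(predictions, k=2))
```

In a full loop, the selected items would be labelled, added to the training set, and the model retrained before the next selection round.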