Reinforcement Learning: Demystifying the hype to successful enterprise applications
In 2014, Google acquired DeepMind, a small, London-based AI startup, for a reported $500 million. DeepMind was conducting research on AI that would learn to play computer games in a fashion similar to humans. In 2015, DeepMind published a paper in Nature describing a learning algorithm called Deep Q-Learning, which achieved human-level or better performance on a diverse range of Atari 2600 games. It did so without any domain-specific engineering: the algorithm took only the raw game images as input and was guided by the game score. Seen by many as a first step toward Artificial General Intelligence, the result came from pioneering the fusion of two fields of research: Reinforcement Learning (RL) and Deep Learning.
RL is a learning paradigm inspired by operant conditioning, and it closely mimics the way humans learn. It shifts the focus from ML-based pattern recognition to learning through trial and error: an agent interacts with an environment and is guided by a reward signal, or reinforcement. Imagine an agent teaching itself how to steer by navigating the streets of Grand Theft Auto, and then transferring this knowledge to a driverless car. Think of a team of autonomous robots collaborating to outwit their opponents in a game of Robot Soccer. Almost any practical real-world application suffers from the curse of dimensionality (a camera mounted on a robot and feeding it a 64×64 grayscale image with 256 intensity levels per pixel has 256^4096 possible inputs). A Deep Neural Network automatically learns compact and efficient feature representations from noisy, high-dimensional sensory inputs in its hidden layers, which lets RL algorithms scale up and deliver strong results in dynamic and complex domains.
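The trial-and-error loop described above can be illustrated with a minimal sketch. It uses plain tabular Q-learning rather than Deep Q-Learning, and everything in it is an assumption made for illustration: the five-cell corridor task, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values. The agent starts at the left end, receives a reward of 1 only when it reaches the right end, and gradually learns which action to prefer in each cell.

```python
import random

N_STATES = 5        # hypothetical toy task: corridor cells 0..4, cell 4 is the goal
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right


def step(state, action):
    """Environment: move along the corridor; reward 1.0 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or when the estimates are tied),
            # otherwise exploit the action with the higher estimated value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Preferred action in each non-goal cell according to the learned values."""
    return ["left" if left > right else "right" for left, right in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

A table of values works here only because the task has five states. With image inputs the table would be impossibly large, which is where a neural network would replace `q` as a function approximator.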
The most notable example of this is AlphaGo Zero, the latest version of AlphaGo, which was the first computer program to defeat a world champion at the game of Go (known as Weiqi in Chinese). AlphaGo Zero uses RL to learn by playing games against itself, starting from completely random play, and quickly surpasses human expert performance. Not only is the game extremely complex (a 19×19 Go board has roughly 10^170 legal positions), but even accomplished Go players often struggle to evaluate whether a certain move is good or bad. Many AI researchers were astonished by this feat, as it had been speculated that it would take at least a decade for a computer to play Go at an expert human level.
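The sizes quoted above can be sanity-checked with a few lines of arithmetic. The sketch below counts decimal digits using logarithms; the `decimal_digits` helper is a hypothetical name introduced here. For Go it computes 3^361, the number of ways to mark each of the 361 points as empty, black, or white. That is only an upper bound, since it includes illegal positions; the commonly cited count of legal positions is about 2.1 × 10^170.

```python
import math


def decimal_digits(base, exponent):
    """Number of decimal digits in base**exponent, computed via logarithms."""
    return math.floor(exponent * math.log10(base)) + 1


# 64x64 grayscale image with 256 intensity levels per pixel: 256**4096 possible inputs.
image_digits = decimal_digits(256, 64 * 64)

# 19x19 Go board, each point empty, black, or white: 3**361 raw colourings (upper bound).
go_digits = decimal_digits(3, 19 * 19)

if __name__ == "__main__":
    print(f"256^4096 has {image_digits} decimal digits")
    print(f"3^361 has {go_digits} decimal digits")
```

Both numbers are far too large to enumerate, which is why learned representations and search guided by learned evaluations matter in these domains.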
RL, which was largely confined to academia for several decades, is now beginning to see successful applications and products in industry, in fields such as robotics, automated trading systems, manufacturing, energy, dialog systems and recommendation engines. For most companies it is an exciting prospect because of the AI hype, but very few organizations have identified use cases where RL can play a valuable role. In reality, RL is best suited to a niche class of problems where it can help automate some tasks (or augment a human expert). This presentation will give a practical introduction to the RL setting, show how to formulate problems as RL problems, and present successful use cases in industry.
Outline/Structure of the Case Study
• An introduction to Deep Reinforcement Learning - Its history and successes (5 minutes)
• Formulating a problem in the framework of RL (10 minutes)
• Combining RL with Deep Neural Networks in practice (10 minutes)
• Discussing specific use cases in industry (15 minutes)
• Conclusion, and best places to learn RL (5 minutes)
Learning Outcome
• A concrete understanding of the RL framework, and formulating a problem in its context
• A broad understanding of how deep neural networks can be applied to RL problems
• An understanding of industry-specific classes of problems where RL can be successfully applied
Target Audience
Data scientists, managers, and industry experts who want to identify problem statements where RL can be applied successfully in enterprise applications.
Prerequisites for Attendees
Basic familiarity with machine learning and deep learning.
A cursory reading on Reinforcement Learning (for example, self-driving cars or AlphaGo) would also be helpful.