3 ways reinforcement learning is changing the world around you
Sahika Genc, senior scientist with Amazon AI, writes about three important ways reinforcement learning is used in the real world and explains how you can get hands-on with it.
Sahika Genc is a senior scientist with Amazon AI. Her team works on reinforcement learning (RL) algorithms for Amazon SageMaker, which provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly. Genc also leads the science team on AWS DeepRacer, which gives developers a way to get hands-on with RL, experiment, and learn through autonomous driving.
In this article, Genc discusses three important ways reinforcement learning is used in the real world, and explains how you can get hands-on with reinforcement learning.
When the concept of reinforcement learning was first introduced in the 1950s, it followed two threads: one focused on developing learning methods through trial and error, while the other provided a more theoretical framework for solving optimal control problems. These practical and theoretical threads merged in the 1980s, giving rise to reinforcement learning as a more formalized field of study and development.
At the time, luminaries like Richard Sutton and Andrew Barto highlighted theories such as optimal control and dynamic programming, and identified key component ideas such as temporal difference learning and function approximation.
Fast forward to the 2000s, when deep learning gave reinforcement learning a massive boost by eliminating the need to manually configure features and making it possible to use raw sensor data (such as the pixels of an image rather than a segmented image).
But what exactly is reinforcement learning?
As opposed to supervised learning (which uses labeled training data) or unsupervised learning (where you draw inferences from input data without labeled responses), reinforcement learning involves a system making short-term decisions while optimizing for a longer-term goal through trial and error. Deep learning is used to make mathematical representations of important variables, while the reinforcement learning agent learns the actions needed to maximize rewards over a longer period of time.
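To make the trade-off between short-term decisions and a longer-term goal concrete, here is a minimal sketch of the discounted return, the quantity a reinforcement learning agent typically tries to maximize. The discount factor `gamma` and both reward sequences are made-up values chosen for illustration, not figures from any real system.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, where a reward t steps in the future is scaled by gamma**t."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future total.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

# Two hypothetical reward sequences over four time steps.
greedy_now = [1.0, 0.0, 0.0, 0.0]   # small reward immediately, nothing afterwards
patient = [0.0, 0.0, 0.0, 5.0]      # nothing at first, larger payoff at the end

print("greedy_now:", discounted_return(greedy_now))
print("patient:   ", discounted_return(patient))
```

With these numbers the patient sequence has the higher return even after discounting, which is the sense in which an agent can accept a worse short-term outcome in exchange for a better long-term one.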
Here are three applications of reinforcement learning that are changing our world in profound ways:
1. Recommendation systems
Reinforcement learning has obvious advantages in developing recommendation systems for news feeds, products or videos. In this case, the goal of the system is to personalize product recommendations.
The state of a system changes constantly as users interact with it. This makes supervised learning less than ideal for recommendation systems, as you would constantly need additional infrastructure for deploying recurring model updates. On the other hand, systems that use reinforcement learning can continually update recommendations based on user feedback. Deep learning provides mathematical representations of the product, consumer interest, and consumer satisfaction. The reinforcement learning agent can personalize the content to each individual based on their preferences over a period of time, in a way that maximizes the reward over the long term.
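As a rough illustration of updating recommendations from user feedback, the sketch below uses an epsilon-greedy bandit, a deliberately simplified relative of full reinforcement learning that ignores user state and long-term effects. The class name `EpsilonGreedyRecommender`, the three items, and the simulated click probabilities are all hypothetical; a production system would use deep learning representations rather than a table of averages.

```python
import random

class EpsilonGreedyRecommender:
    """Keeps a running average reward per item and mostly recommends the best one."""

    def __init__(self, n_items, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_items
        self.values = [0.0] * n_items      # estimated reward (click rate) per item
        self.rng = random.Random(seed)

    def recommend(self):
        # Explore a random item occasionally, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, item, reward):
        # Incremental mean: no retraining step, the estimate shifts with each click.
        self.counts[item] += 1
        self.values[item] += (reward - self.values[item]) / self.counts[item]

# Simulated users with made-up click probabilities for three items.
click_prob = [0.05, 0.20, 0.60]
agent = EpsilonGreedyRecommender(n_items=3)
user_rng = random.Random(1)

for _ in range(5000):
    item = agent.recommend()
    clicked = 1.0 if user_rng.random() < click_prob[item] else 0.0
    agent.update(item, clicked)

best = max(range(3), key=agent.values.__getitem__)
print("estimated click rates:", [round(v, 2) for v in agent.values])
print("most recommended item:", best)
```

The point of the design is that every interaction updates the estimates directly, so there is no separate retrain-and-redeploy cycle.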
In recent years, there has been increased uptake of deep reinforcement learning for use cases such as push notifications, faster video loading through content pre-fetching, and product recommendations. Visit the Amazon SageMaker notebook on recommendation systems for a deep dive into reinforcement learning in action.
2. Energy Smart Grids
According to the International Energy Agency (IEA), global energy consumption grew by 2.3% in 2018 – twice as fast as the average over the last ten years. Reinforcement learning has outperformed the advanced control systems traditionally used for energy optimization in applications like data center cooling and select smart grid applications.
Energy systems interact with the environment in complex and non-linear ways. Traditional formula-based engineering and human intuition cannot adapt to rapidly changing conditions like the weather. It is impossible to come up with rules and heuristics for every operating scenario. A general intelligence framework is needed to understand the data center’s interactions with the environment.
Deep reinforcement learning has been used to extract knowledge from past consumption patterns, production time series and available forecasts to tailor energy distribution for data centers and buildings. Here, deep learning is used to make mathematical representations of complex thermodynamic equations. By seeking reward maximization, the reinforcement learning agent learns the right actions to take (such as which systems to turn on and off) over the course of entire days, weeks, months and years. See the Amazon SageMaker notebook for energy use cases to get hands-on with practical applications of reinforcement learning.
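The idea of learning which systems to turn on and off by maximizing reward can be sketched with tabular Q-learning on a toy cooling problem. Everything here is an assumption made for illustration: the five temperature levels, the one-level-per-step dynamics, the energy cost, and the overheating penalty. Real deployments replace the table with a deep network and the toy dynamics with measured data.

```python
import random

OFF, ON = 0, 1            # actions: cooling unit off / on
MAX_TEMP = 4              # hypothetical temperature levels 0 (cool) to 4 (overheated)
ENERGY_COST = 1.0         # assumed cost of running the cooling unit for one step
OVERHEAT_PENALTY = 10.0   # assumed penalty for ending a step at MAX_TEMP

def step(temp, action):
    """Toy dynamics: cooling lowers the temperature one level, idling raises it."""
    next_temp = max(temp - 1, 0) if action == ON else min(temp + 1, MAX_TEMP)
    reward = -ENERGY_COST if action == ON else 0.0
    if next_temp == MAX_TEMP:
        reward -= OVERHEAT_PENALTY
    return next_temp, reward

def train(episodes=3000, steps=30, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(MAX_TEMP + 1)]    # q[temp][action]
    for _ in range(episodes):
        temp = rng.randrange(MAX_TEMP + 1)            # random starting temperature
        for _ in range(steps):
            if rng.random() < epsilon:                # explore
                action = rng.choice((OFF, ON))
            else:                                     # exploit current estimates
                action = ON if q[temp][ON] > q[temp][OFF] else OFF
            next_temp, reward = step(temp, action)
            # Q-learning update: move toward reward plus discounted best next value.
            target = reward + gamma * max(q[next_temp])
            q[temp][action] += alpha * (target - q[temp][action])
            temp = next_temp
    return q

q_table = train()
policy = [ON if q[ON] > q[OFF] else OFF for q in q_table]
print("cooling policy by temperature level:", policy)
```

In this toy setup the agent should learn to leave cooling off at low temperatures and switch it on only near the limit, spending energy just often enough to avoid the overheating penalty.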
3. Robotics
Most of the industrial robots used in environments like manufacturing floors are blind. This is because image sensing was not a commodity until recently. However, the use of image data from camera, LIDAR, or radar sensors has been increasing.
Consequently, deep reinforcement learning can be used to train robots to take actions such as picking up or moving objects in warehouses and factories. In this scenario, deep learning is used to interpret images by looking at every pixel, while reinforcement learning agents learn how to make the right decisions over a period of time based on which action was successful. The Amazon SageMaker notebook is a great place to get started with reinforcement learning and robotics.
There are still many challenges we must work through. These have to do with not only the high volume but also the high dimensionality of the data, which can make it challenging to design responsive systems. In addition, be it for recommender systems or energy grids, both the data and the relationships between variables can change over time. This makes it difficult to keep a model accurate in the face of concept drift.
Finally, the moral of the story of Midas is applicable to machine learning. Be careful what you wish for. There can be a huge gap between the intended reward and stated reward – and you can find your system maximizing for end states that aren’t entirely desirable.
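A small, entirely hypothetical example shows how the stated reward and the intended reward can pull apart. Here the system is told to maximize click rate, while what we actually want is reader satisfaction; the items and numbers are invented for illustration.

```python
# Hypothetical content items with made-up numbers.
# "click_rate" is the stated reward the system is told to maximize;
# "satisfaction" stands in for the intended reward, which is harder to measure.
items = {
    "sensational headline": {"click_rate": 0.30, "satisfaction": 0.10},
    "in-depth explainer": {"click_rate": 0.12, "satisfaction": 0.80},
    "product update": {"click_rate": 0.08, "satisfaction": 0.50},
}

def best_item(metric):
    """Return the item name that scores highest on the given metric."""
    return max(items, key=lambda name: items[name][metric])

stated_choice = best_item("click_rate")
intended_choice = best_item("satisfaction")

print("maximizing the stated reward picks:  ", stated_choice)
print("maximizing the intended reward picks:", intended_choice)
```

An agent optimizing the first metric is doing exactly what it was asked to do, which is the Midas problem in miniature.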
In many ways, it’s still early days when it comes to deep reinforcement learning. There’s no better time to get on board. With AWS DeepRacer, you now have a way to get hands-on with RL, experiment, and learn through autonomous driving. You can get started with the virtual car and tracks in the cloud-based 3D racing simulator. For a real-world experience, you can deploy your trained models onto AWS DeepRacer and race your friends, or take part in the global AWS DeepRacer League. Visit the AWS DeepRacer page to get started.
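For a sense of what getting hands-on looks like, AWS DeepRacer models are shaped by a reward function written in Python. The sketch below assumes the documented interface in which `reward_function(params)` receives a dictionary with keys such as `all_wheels_on_track`, `track_width`, and `distance_from_center`; check the current AWS DeepRacer documentation before relying on these names. The linear scaling and the `1e-3` floor are illustrative choices, not recommended settings.

```python
def reward_function(params):
    """Reward the car for staying close to the center line of the track."""
    # Keys assumed from the AWS DeepRacer reward-function input parameters.
    all_wheels_on_track = params["all_wheels_on_track"]
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    if not all_wheels_on_track:
        return 1e-3  # near-zero reward when the car leaves the track

    # Scale linearly from 1.0 at the center line down to about 0 at the track edge.
    closeness = 1.0 - distance_from_center / (0.5 * track_width)
    return float(max(closeness, 1e-3))

# Hypothetical input for a quick local check, not real simulator output.
example = {"all_wheels_on_track": True, "track_width": 1.0, "distance_from_center": 0.1}
print(reward_function(example))
```

Because the agent will maximize whatever this function returns, it is worth asking what behavior the stated reward might encourage beyond the one you intended, such as hugging the center line at very low speed.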