Machine learning has long been one of the most popular approaches to training AI systems, and reinforcement learning is rapidly gaining popularity in the business world. The two are fundamentally different techniques, and each is better suited to certain types of problems. Understanding when to use each is key to building effective AI.
Machine learning is a broad term that technically includes reinforcement learning, but in everyday usage, and throughout this article, it refers to supervised learning: training algorithms on labelled data, where the labels provide the correct answers. For example, a churn prediction model would be trained on thousands (or even millions) of examples of past contract renewal behaviour. By examining these example inputs and outputs, machine learning algorithms can detect patterns and make predictions on new, unlabelled data. The goal is for the algorithm to generalise beyond the training data to make accurate classifications, predictions, or decisions.
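To make the idea of learning from labelled examples concrete, here is a minimal sketch in Python. It is not how a production churn model would be built: the two features, the six customer records, and the choice of a k-nearest-neighbours vote are all assumptions made purely for illustration.

```python
import math
from collections import Counter

# Hypothetical labelled examples (invented for illustration):
# (months_as_customer, support_tickets) -> did the customer churn?
TRAINING_DATA = [
    ((2, 5), True),
    ((3, 4), True),
    ((4, 6), True),
    ((24, 0), False),
    ((30, 1), False),
    ((36, 1), False),
]

def predict_churn(features, training_data=TRAINING_DATA, k=3):
    """Predict a label by majority vote among the k nearest labelled examples."""
    by_distance = sorted(
        training_data,
        key=lambda example: math.dist(example[0], features),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# New, unlabelled customers the model has never seen.
new_customers = [(5, 4), (28, 0)]
predictions = [predict_churn(customer) for customer in new_customers]
print(predictions)
```

The correct answers are supplied up front in the labels, and the algorithm's only job is to generalise from them to new inputs.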
Reinforcement learning takes a different approach. The algorithm is not shown correct answers but instead must discover them through trial and error. The algorithm, usually called an agent, performs actions within an environment and receives positive or negative rewards in return. Over time, the agent seeks to maximise its total reward through its actions.
For example, a robot vacuum cleaner could use reinforcement learning to work out how to clean a room efficiently, learning from its own experience of bumping into things and assessing the results. The key difference from supervised machine learning is that the agent must independently determine the ideal behaviour through environmental feedback, rather than being trained on examples of ideal behaviour.
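The trial-and-error loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm. Everything in this example is an assumption made for illustration: the environment is a hypothetical five-cell corridor with a reward in the last cell (a heavily simplified stand-in for a room), and the learning rate, discount factor, and exploration rate are arbitrary choices rather than recommended settings.

```python
import random

N_CELLS = 5          # corridor cells 0..4; the reward sits in the last cell
ACTIONS = (-1, +1)   # move one cell left or one cell right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # illustrative hyperparameters

def step(state, action):
    """Environment: move along the corridor; reward 1.0 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_CELLS - 1)
    done = next_state == N_CELLS - 1
    return next_state, (1.0 if done else 0.0), done

def choose_action(q, state, rng):
    """Epsilon-greedy: usually exploit the best-known action, sometimes explore."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state].values())
    # Break ties at random so the untrained agent does not favour one direction.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])

def train(episodes=500, max_steps=200, seed=0):
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_CELLS)}
    for _episode in range(episodes):
        state = 0
        for _step in range(max_steps):
            action = choose_action(q, state, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate towards the reward
            # plus the discounted value of the best next action.
            future = 0.0 if done else GAMMA * max(q[next_state].values())
            q[state][action] += ALPHA * (reward + future - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
# Greedy policy for each non-terminal cell: -1 means left, +1 means right.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_CELLS - 1)]
print(policy)
```

No cell is ever labelled with the correct move. The agent only sees the reward signal, yet the learned policy ends up heading towards the goal from every cell.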
In general, machine learning excels at well-defined tasks where plenty of labelled training data is available. For example, machine learning algorithms can classify images once shown many labelled examples, or translate languages given many bilingual text examples. A downside is that the resulting models can be difficult to explain, even with modern ML explainability techniques.
Reinforcement learning shines when an agent needs to determine ideal behaviour through experience. It is well suited to developing policies that maximise rewards in complex environments. Applications like recommendation systems, robotics and gaming often use reinforcement learning. The downside of reinforcement learning is that it typically requires extensive training periods and can be computationally expensive. Additionally, there are far fewer real-world examples of reinforcement learning in the data science field, so building such systems requires a high level of expertise.
Understanding the strengths and weaknesses of machine learning versus reinforcement learning is key to selecting the right approach. Machine learning requires labelled training data, while reinforcement learning relies on environmental rewards and penalties. When in doubt, start with machine learning, given its speed and scalability. But for problems requiring independent discovery of complex behaviour policies, it may be time to turn to reinforcement learning.