What is Reinforcement Learning?

If you’re new to machine learning, you’ve probably heard of supervised learning and unsupervised learning. Reinforcement learning is another branch of the field, and in this article I will explain what it is and how it works.

What is Reinforcement learning?

Reinforcement learning is a branch of machine learning. It is about taking suitable actions to maximise reward in a given situation. Software and machines use it to find the best possible action or path to take in a specific situation.

How does Reinforcement learning work?

Reinforcement learning assigns positive values to desired actions and negative values to undesired ones. This trains the agent to seek the maximum overall reward over the long run, rather than the largest immediate reward, in order to arrive at the best possible outcome.

These long-term goals keep the agent from stalling on lesser goals. Over time, the agent learns to avoid the negative and seek the positive. This teaching strategy has been widely adopted in artificial intelligence.
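To make "maximal overall reward over the long run" concrete, here is a minimal sketch of one common way to score a sequence of rewards, the discounted return. The discount factor `gamma` and the two example episodes are assumptions for illustration; they are not taken from any particular library or from the text above.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each weighted by gamma raised to its time step."""
    total = 0.0
    # Work backwards so each step folds in the discounted future return.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: two small penalties, then a large reward.
patient = discounted_return([-1, -1, 10])
# Hypothetical episode: one small reward straight away, then nothing.
impatient = discounted_return([1, 0, 0])

print(patient, impatient)
```

Under these assumed numbers the patient episode scores higher, even though it starts with penalties. That is the sense in which long-term objectives stop the agent from settling for small, early rewards.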

Take the dog shown in the picture as an example, and say you are training it to sit. In the second picture, the dog is punished for not sitting, and it learns from that punishment. After many training sessions, the dog is fully trained: when we tell it to sit, it obeys and is rewarded.
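The dog example can be sketched as a tiny simulation. This is only an illustration under assumed numbers: the reward values (+1 for sitting, -1 for ignoring the command), the exploration rate `epsilon`, the `step_size`, and the function name `train_dog` are all hypothetical.

```python
import random


def train_dog(sessions=200, epsilon=0.1, step_size=0.1, seed=0):
    """Learn action values for 'sit' vs 'ignore' from rewards alone."""
    rng = random.Random(seed)
    rewards = {"sit": 1.0, "ignore": -1.0}  # assumed reward scheme
    values = {"sit": 0.0, "ignore": 0.0}    # the dog's current estimates
    for _ in range(sessions):
        if rng.random() < epsilon:
            # Occasionally try an action at random (exploration).
            action = rng.choice(list(values))
        else:
            # Otherwise pick the action with the highest estimate so far.
            action = max(values, key=values.get)
        # Nudge the estimate for the chosen action towards the reward received.
        values[action] += step_size * (rewards[action] - values[action])
    return values


values = train_dog()
print(values)
```

After training, the estimate for "sit" ends up above the estimate for "ignore", so a dog that follows its estimates will sit when told to.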

Types of Reinforcement learning

Reinforcement is usually divided into two types:

Positive Reinforcement

Positive reinforcement means adding something, such as a reward, after a desired behaviour so that the behaviour is more likely to occur again. It has a beneficial effect on the agent’s behaviour and increases the strength of that behaviour.
This form of reinforcement can sustain a change for a long time, but too much positive reinforcement may lead to an overload of states, which can diminish the results.

Negative Reinforcement

Negative reinforcement also increases the likelihood that a behaviour will recur, but it does so by removing or avoiding a negative condition rather than by adding a reward. It should not be confused with punishment, which aims to make a behaviour less likely.
Depending on the situation and the behaviour, it can be more effective than positive reinforcement, although it only provides enough reinforcement to meet the minimum required behaviour.

Models in Reinforcement learning

Three of the most commonly used models in reinforcement learning are:

State-action-reward-state-action (SARSA)

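This reinforcement learning technique begins by giving the agent a policy. A policy is a rule, often probabilistic, that tells the agent which action to take in each state. SARSA is an on-policy method: the agent estimates how much reward each action leads to while following that same policy, and each update uses the action the policy actually picks next.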
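A minimal sketch of a single tabular SARSA update may help here. The learning rate `alpha`, the discount factor `gamma`, the state names, and the function name `sarsa_update` are assumptions chosen for illustration.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9):
    """Apply one SARSA update to the value table q, in place."""
    # The target uses the action that was actually chosen in the next state.
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical table: unseen (state, action) pairs default to 0.0.
q = defaultdict(float)
q[("s1", "right")] = 2.0

new_value = sarsa_update(q, "s0", "right", 1.0, "s1", "right")
print(new_value)
```

The name comes from the five items each update consumes: state, action, reward, next state, next action.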


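Q-learning

This approach takes the opposite route to SARSA. The agent is not given a policy to follow, so its exploration of the environment is more self-directed. It is an off-policy method: whichever action the agent actually takes while exploring, it updates its estimates as if it will act greedily from the next state onwards.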
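A widely used technique that fits this description is Q-learning. Below is a minimal sketch of one tabular Q-learning update, with `alpha`, `gamma`, the state names, and the function name `q_learning_update` all assumed for illustration. Unlike SARSA, the target looks at the best available next action rather than the one the agent goes on to take.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one Q-learning update to the value table q, in place."""
    # The target uses the best next action, whatever the agent does next.
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; unseen pairs default to 0.0.
actions = ["left", "right"]
q = defaultdict(float)
q[("s1", "left")] = 0.5
q[("s1", "right")] = 2.0

new_value = q_learning_update(q, "s0", "left", 1.0, "s1", actions)
print(new_value)
```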

Deep Q-Networks

These algorithms combine reinforcement learning techniques with neural networks. They rely on self-directed exploration of the environment and use a neural network in place of a table of values. The network is trained on random samples of the agent’s past experience so that it can predict the value of future actions.
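The "random sample of previous behaviours" is usually implemented as a replay buffer. The sketch below shows only that piece, using the Python standard library; the class name `ReplayBuffer`, its capacity, and the dummy transitions are assumptions, and the neural network that would train on each batch is deliberately left out.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled at random for training."""

    def __init__(self, capacity, seed=0):
        # A deque with maxlen drops the oldest transition once it is full.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sampling without replacement breaks up runs of consecutive steps.
        return self.rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


# Hypothetical usage: store 150 dummy transitions, keep the latest 100.
buffer = ReplayBuffer(capacity=100)
for step in range(150):
    buffer.add(step, "right", 1.0, step + 1, False)

batch = buffer.sample(8)
print(len(buffer), len(batch))
```

In a full Deep Q-Network, each sampled batch would be passed to the network as training data for its value predictions.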

Applications of Reinforcement learning

  • Robotics
  • Game Playing



I have good news for you: I will be publishing more articles that explain machine learning models with code, so leave a comment and tell me how excited you are about this.



Aviral Bhardwaj


One of the youngest writers and mentors on AI-ML & Technology.