Underfitting and Overfitting

Understanding Underfitting and Overfitting in Machine Learning

Aviral Bhardwaj
3 min read · Jul 7, 2022

When we discuss a machine learning model, we are referring to its performance and accuracy, which are measured through its prediction errors. Consider that we are creating a machine learning model. A model is deemed excellent if it correctly predicts new input data from the problem domain. Overfitting and underfitting are two of the main factors that contribute to the poor performance of machine learning systems.

In this article, we will discuss what underfitting, overfitting, and perfect fit are in machine learning, how to reduce them, and why they occur.

What is Underfitting?

Underfitting occurs when a model has not learned the patterns in the training data well and is therefore unable to generalise to new data. An underfit model performs poorly even on the training data and produces inaccurate predictions. Underfitting happens when a model has high bias and low variance.

Reasons for Underfitting

  1. Low variance and high bias
  2. The training dataset is too small.
  3. The model is too simple.
  4. The training data has not been cleaned and contains noise.

Techniques to reduce underfitting:

  1. Increase the model’s complexity.
  2. Increase the number of features through feature engineering.
  3. Remove noise from the data.
  4. Increase the number of epochs or the duration of training.
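To make the first two techniques concrete, here is a small sketch (using scikit-learn and synthetic data of my own choosing, not anything from a specific project): a plain linear model underfits quadratic data, and adding polynomial features increases the model's complexity enough to capture the pattern.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic nonlinear data: y = x^2 plus noise
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(-3, 3, 200)).reshape(-1, 1)
y = X.ravel() ** 2 + rng.normal(scale=0.5, size=200)

# A straight line is too simple for quadratic data, so it underfits
simple = LinearRegression().fit(X, y)

# Adding polynomial features (technique 1 and 2) lets the model fit the curve
complex_model = make_pipeline(
    PolynomialFeatures(degree=2), LinearRegression()
).fit(X, y)

print(f"Linear model R^2:     {simple.score(X, y):.2f}")
print(f"Polynomial model R^2: {complex_model.score(X, y):.2f}")
```

The underfit line scores poorly even on the data it was trained on, which is the telltale sign of underfitting described above.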

What is Overfitting?

Overfitting occurs when a model performs exceptionally well on training data but poorly on test data (new, unseen data). In this situation, the machine learning model learns the details and noise in the training data, which has a detrimental impact on the model’s performance on test data. Overfitting occurs as a result of low bias and high variance.
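You can see this gap between training and test performance in a quick sketch (again with scikit-learn and synthetic data, chosen just for illustration): an unconstrained decision tree memorises a noisy training set, including its label noise, and then stumbles on held-out data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic noisy data: flip_y injects 20% label noise
X, y = make_classification(
    n_samples=300, n_features=20, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# With no depth limit, the tree memorises the training set, noise included
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

print(f"Train accuracy: {tree.score(X_train, y_train):.2f}")
print(f"Test accuracy:  {tree.score(X_test, y_test):.2f}")
```

Training accuracy is perfect while test accuracy is noticeably lower: the model has fit the noise, not just the signal.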

Reasons for Overfitting

  1. Low bias and high variance
  2. The model is too complex.
  3. The training dataset is too small.

Techniques to reduce overfitting:

  1. Increase the amount of training data.
  2. Reduce the model’s complexity.
  3. Stop training early (monitor the loss during training, and stop as soon as the loss on held-out data begins to increase).
  4. Apply Ridge or Lasso regularization.
  5. Use dropout to combat overfitting in neural networks.
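Technique 4 can be sketched in a few lines (scikit-learn, with a made-up dataset where there are more features than samples, which is a classic recipe for overfitting): Ridge adds an L2 penalty that shrinks the coefficients, trading a little bias for much lower variance.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

# Synthetic data: 50 features but only 60 samples; only feature 0 matters
rng = np.random.RandomState(0)
X = rng.normal(size=(60, 50))
y = X[:, 0] + rng.normal(scale=0.5, size=60)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Plain least squares can fit the training noise almost perfectly here
plain = LinearRegression().fit(X_train, y_train)

# Ridge's L2 penalty shrinks coefficients and limits how much noise it fits
ridge = Ridge(alpha=10.0).fit(X_train, y_train)

print(f"Plain test R^2: {plain.score(X_test, y_test):.2f}")
print(f"Ridge test R^2: {ridge.score(X_test, y_test):.2f}")
```

Lasso works the same way with an L1 penalty (`sklearn.linear_model.Lasso`), which additionally drives some coefficients exactly to zero.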

What Is a Good Fit In Machine Learning?

To find the best-fit model, examine the performance of a machine learning model on training data over time. As the algorithm learns, the model’s error on the training data decreases, and so does the error on the test dataset. If you train the model for too long, however, it starts to pick up unnecessary details and noise in the training set, leading to overfitting. To attain a good fit, you must stop training at the point where the test error begins to rise.
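The stopping rule above can be written as a small early-stopping loop. This is a minimal sketch in plain Python: `validation_loss` is a stand-in I made up to simulate a U-shaped loss curve; in a real training script you would train for one epoch and then measure the loss on a held-out validation set.

```python
def validation_loss(epoch):
    # Simulated stand-in: loss falls, bottoms out at epoch 10,
    # then rises again once the model starts overfitting
    return (epoch - 10) ** 2 / 100 + 1.0

best_loss = float("inf")
best_epoch = 0
patience, bad_epochs = 3, 0  # tolerate 3 bad epochs before stopping

for epoch in range(100):
    loss = validation_loss(epoch)  # in practice: train one epoch, then evaluate
    if loss < best_loss:
        best_loss, best_epoch = loss, epoch
        bad_epochs = 0             # improvement: reset the counter
    else:
        bad_epochs += 1
        if bad_epochs >= patience: # loss has risen for `patience` epochs
            break                  # stop and keep the best checkpoint

print(f"Stopped at epoch {epoch}; best model was at epoch {best_epoch}")
```

Deep learning frameworks ship this as a ready-made callback (for example, Keras's `EarlyStopping`), but the underlying logic is just this patience counter.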

Figure: comparison of underfit, perfect-fit, and overfit models.

If you like my article and my efforts for the community, you can support and encourage me by simply buying me a coffee.

Conclusion

Well, I have good news for you: I will be bringing more articles that explain machine learning models with code, so leave a comment and tell me how excited you are about this.


Aviral Bhardwaj

One of the youngest writers and mentors on AI-ML & Technology.