Health Insurance Price Prediction — Random Forest Regressor

Using a Random Forest Regressor, we will build a health insurance price prediction model that predicts the price a person will be charged when buying health insurance.

Aviral Bhardwaj
4 min read · Feb 6, 2022

In the previous article I gave you an introduction to Random Forest. In this article I will show you how to build a Random Forest Regressor model with a few lines of code.

So let's start.

The first step is to download the dataset, which we will then feed to the model. You can download or copy the data from the URL:

Importing the libraries

Now we will import pandas and NumPy. If these libraries are not installed on your system, you can install them using the pip command.
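A minimal sketch of the import step. The aliases pd and np are the usual convention and are an assumption here, since the article's own code is not shown. The install command is given as a comment.

```python
# If the libraries are missing, install them from a terminal first:
#   pip install pandas numpy
import pandas as pd
import numpy as np

# Quick check that both imports worked.
print("pandas", pd.__version__)
print("NumPy", np.__version__)
```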

Data preparation

Now we'll read the data using pandas and save it in a variable called data so we don't have to load it again and again. Using the head method, we can view the first 5 rows of the data; if you want to see more, pass the number of rows inside the parentheses.
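Reading and previewing the data could look roughly like the sketch below. The file name insurance.csv and the column name charges are assumptions, and the inline CSV text is only a stand-in so the example runs without the file. Only its first row comes from this article; the other rows are made up.

```python
import io
import pandas as pd

# Stand-in for the downloaded file. Only the first row comes from the
# article; the other rows are made up so the sketch runs on its own.
csv_text = """age,sex,bmi,children,smoker,charges
19,female,27.9,0,yes,16884.924
30,male,25.0,1,no,4000.0
45,female,31.5,2,no,8000.0
52,male,29.0,3,yes,25000.0
23,male,22.4,0,no,2500.0
61,female,35.1,1,no,13000.0
"""

# With the real file this would be pd.read_csv("insurance.csv"),
# where the file name is an assumption.
data = pd.read_csv(io.StringIO(csv_text))

print(data.head())    # first 5 rows (the default)
print(data.head(6))   # pass a number to see more rows
```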

Some of the data columns contain strings (words), so we first have to convert them into integers. We will do this with an encoding technique, using a label encoder.

We have to encode 2 columns: sex and smoker.
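A small sketch of the encoding step with scikit-learn's LabelEncoder, run on a few made-up rows. LabelEncoder numbers the classes in sorted order, so female becomes 0, male becomes 1, no becomes 0 and yes becomes 1.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# A few made-up rows with the two string columns.
data = pd.DataFrame({
    "sex": ["female", "male", "male", "female"],
    "smoker": ["yes", "no", "no", "no"],
})

for column in ["sex", "smoker"]:
    encoder = LabelEncoder()
    # Replace the strings with integer codes (classes are sorted first).
    data[column] = encoder.fit_transform(data[column])
    # Show which string was mapped to which integer.
    print(column, dict(zip(encoder.classes_, range(len(encoder.classes_)))))

print(data)
```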

Defining X and Y

Now I have created a list of the columns that I think are the deciding factors for a health insurance price, i.e. age, sex, bmi, children, and smoker, and assigned it to the variable features. I then select these features from the dataset and store them as x, and store the price of the health insurance as y.

I believe these x parameters are the most appropriate, but if you believe other columns are relevant, you can modify the list.
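As a rough sketch, defining x and y might look like this. The rows are made up and already encoded (only the first one comes from this article), and the target column name charges is an assumption.

```python
import pandas as pd

# Made-up, already-encoded rows; only the first one comes from the article.
data = pd.DataFrame({
    "age": [19, 30, 45, 52],
    "sex": [0, 1, 0, 1],
    "bmi": [27.9, 25.0, 31.5, 29.0],
    "children": [0, 1, 2, 3],
    "smoker": [1, 0, 0, 1],
    "charges": [16884.924, 4000.0, 8000.0, 25000.0],  # assumed column name
})

features = ["age", "sex", "bmi", "children", "smoker"]
x = data[features]      # inputs the model learns from
y = data["charges"]     # target: the insurance price

print(x.shape, y.shape)
```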

Making the model

Splitting the Dataset

Before splitting our dataset into a train and a test dataset, we must first import train_test_split from sklearn.model_selection.
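The split could be sketched as follows. The data is synthetic and generated with a made-up formula, purely so the example runs on its own. The 80/20 split (test_size=0.2) and random_state=0 are assumptions, not values taken from this article.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded insurance data (made up for this sketch).
rng = np.random.default_rng(0)
n = 100
data = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "sex": rng.integers(0, 2, n),
    "bmi": rng.uniform(18, 40, n).round(1),
    "children": rng.integers(0, 4, n),
    "smoker": rng.integers(0, 2, n),
})
data["charges"] = 250 * data["age"] + 20000 * data["smoker"] + rng.normal(0, 1000, n)

features = ["age", "sex", "bmi", "children", "smoker"]
x = data[features]
y = data["charges"]

# Hold out 20% of the rows for testing (an assumed ratio).
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)
print(len(x_train), len(x_test))
```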

Model

Here we will build a random forest regressor model, because the value we want to predict is a continuous number (a price) rather than a category, and name our model regressor.
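A minimal sketch of creating and training the model, again on synthetic made-up data. The settings n_estimators=100 (the scikit-learn default) and random_state=0 are assumptions; the article does not state which settings were used.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded insurance data (made up for this sketch).
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "sex": rng.integers(0, 2, n),
    "bmi": rng.uniform(18, 40, n).round(1),
    "children": rng.integers(0, 4, n),
    "smoker": rng.integers(0, 2, n),
})
data["charges"] = 250 * data["age"] + 20000 * data["smoker"] + rng.normal(0, 1000, n)

features = ["age", "sex", "bmi", "children", "smoker"]
x_train, x_test, y_train, y_test = train_test_split(
    data[features], data["charges"], test_size=0.2, random_state=0
)

# Build the forest and train it on the training rows only.
regressor = RandomForestRegressor(n_estimators=100, random_state=0)
regressor.fit(x_train, y_train)
print("trees in the forest:", len(regressor.estimators_))
```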

Predicting

Now here I have made a variable named prediction, which holds the values the model predicts for our test data.
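Predicting on the test data might look like the sketch below. The data is synthetic and made up, so the printed numbers will not match the article's results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded insurance data (made up for this sketch).
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "sex": rng.integers(0, 2, n),
    "bmi": rng.uniform(18, 40, n).round(1),
    "children": rng.integers(0, 4, n),
    "smoker": rng.integers(0, 2, n),
})
data["charges"] = 250 * data["age"] + 20000 * data["smoker"] + rng.normal(0, 1000, n)

features = ["age", "sex", "bmi", "children", "smoker"]
x_train, x_test, y_train, y_test = train_test_split(
    data[features], data["charges"], test_size=0.2, random_state=0
)
regressor = RandomForestRegressor(n_estimators=100, random_state=0)
regressor.fit(x_train, y_train)

# One predicted price per row of the test set.
prediction = regressor.predict(x_test)
print(prediction[:5])
```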

We can also predict manually entered values simply by adding the values to a list.

Here the values I have entered are:

age=19

sex=Female=0

bmi=27.900

children=0

smoker=Yes=1 (a label encoder numbers the classes alphabetically, so no=0 and yes=1)

These are the values of the first row of our dataset. The actual price is 16884.92400 and our model predicts 17032.580588, which is very close.

We can also predict values from our dataset; for example, here I am predicting the first 5 rows of the dataset.
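Both kinds of prediction described above can be sketched as follows. The model here is trained on synthetic made-up data, so its output will differ from the numbers quoted in this article. The sketch assumes smoker = yes is encoded as 1, which is what a label encoder's alphabetical ordering gives.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the encoded insurance data (made up for this sketch).
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "sex": rng.integers(0, 2, n),
    "bmi": rng.uniform(18, 40, n).round(1),
    "children": rng.integers(0, 4, n),
    "smoker": rng.integers(0, 2, n),
})
data["charges"] = 250 * data["age"] + 20000 * data["smoker"] + rng.normal(0, 1000, n)

features = ["age", "sex", "bmi", "children", "smoker"]
x = data[features]
y = data["charges"]
regressor = RandomForestRegressor(n_estimators=100, random_state=0)
regressor.fit(x, y)

# Manually entered values: age, sex, bmi, children, smoker.
# smoker=1 assumes "yes" was encoded as 1.
entered = pd.DataFrame([[19, 0, 27.9, 0, 1]], columns=features)
manual_prediction = regressor.predict(entered)
print("entered values ->", manual_prediction[0])

# Predictions for the first 5 rows of the dataset.
first_five = regressor.predict(x.head(5))
print(first_five)
```

Passing the entered values as a DataFrame with the same column names keeps them in the order the model was trained on.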

Accuracy

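Now we'll look at our model's accuracy, measured by the Mean Absolute Error (MAE): the average absolute difference between the predicted and the actual prices, where lower is better. Our model's MAE is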
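Computing the MAE could be sketched with scikit-learn's mean_absolute_error. The data is synthetic and made up, so the resulting value says nothing about the MAE of the article's model.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded insurance data (made up for this sketch).
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "sex": rng.integers(0, 2, n),
    "bmi": rng.uniform(18, 40, n).round(1),
    "children": rng.integers(0, 4, n),
    "smoker": rng.integers(0, 2, n),
})
data["charges"] = 250 * data["age"] + 20000 * data["smoker"] + rng.normal(0, 1000, n)

features = ["age", "sex", "bmi", "children", "smoker"]
x_train, x_test, y_train, y_test = train_test_split(
    data[features], data["charges"], test_size=0.2, random_state=0
)
regressor = RandomForestRegressor(n_estimators=100, random_state=0)
regressor.fit(x_train, y_train)
prediction = regressor.predict(x_test)

# Average absolute gap between actual and predicted prices.
mae = mean_absolute_error(y_test, prediction)
print("Mean Absolute Error:", mae)
```

Because the MAE is in the same unit as the price, it can be read directly as the typical size of the model's error on the test rows.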

Source code

You can check the link for the full code.

Conclusion

In this article I have given you the information and code needed to build a health insurance price prediction model using a random forest regressor, along with the source code. I will be making more exciting models for you, so stay connected.

Aviral Bhardwaj

One of the youngest writers and mentors on AI-ML & Technology.