Building a random forest on a dataset containing missing (NA) values


I have a modified copy of the "iris" dataset that contains missing values:

iris1 <- iris
iris1$Sepal.Length[c(1, 3, 57, 103)] <- NA

and I want to fit a random forest model on it:

randomForest(Species~Sepal.Length,data=iris1)

But I get this error:

Error in na.fail.default(list(Species = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,  : missing values in object

Is there a way to fit a random forest on data that contains missing values?

Apr 2, 2018 in Data Analytics by nirvana
• 3,060 points

edited Apr 2, 2018 by nirvana 110 views

1 answer to this question.


You have two options: impute the missing values, or omit the rows that contain them.

To impute missing values in the predictors, use the rfImpute() function from the randomForest package. The response (Species here) must not contain any missing values:

iris1 <- rfImpute(Species ~ ., data = iris1)

You can then call randomForest() on the imputed iris1 dataset:

randomForest(Species ~ Sepal.Length, data = iris1)
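The imputation route can be sketched end to end as follows. This is a minimal example, assuming that iris1 starts as a plain copy of the built-in iris data; the names iris_imputed and fit_imputed are made up for illustration, and the seed value is arbitrary.

```r
library(randomForest)

set.seed(42)  # rfImpute() and randomForest() are both randomized

# Recreate the data from the question (assumption: iris1 is a copy of iris)
iris1 <- iris
iris1$Sepal.Length[c(1, 3, 57, 103)] <- NA

# Impute missing predictor values; the response (Species) has no NAs
iris_imputed <- rfImpute(Species ~ ., data = iris1)

# Check that no missing values remain before fitting
sum(is.na(iris_imputed))

# Fit the model on the imputed data
fit_imputed <- randomForest(Species ~ Sepal.Length, data = iris_imputed)
print(fit_imputed)
```

Storing the result under a new name keeps the original iris1 intact, which makes it easy to compare imputed values against the rows that were set to NA.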

If only a few values are missing, you can instead drop the affected rows with the na.omit() function:

iris1 <- na.omit(iris1)

After removing the rows with missing values, fit the model on the "iris1" dataset:

randomForest(Species ~ Sepal.Length, data = iris1)
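A short sketch of the omission route, again assuming iris1 starts as a copy of the built-in iris data; iris_complete, fit_complete, and fit_inline are illustrative names. It also shows what appears to be an equivalent shortcut: the formula interface of randomForest() accepts an na.action argument (the default, na.fail, is what raises the error in the question), so the rows can be dropped at fit time.

```r
library(randomForest)

# Recreate the data from the question (assumption: iris1 is a copy of iris)
iris1 <- iris
iris1$Sepal.Length[c(1, 3, 57, 103)] <- NA

# Drop every row that contains at least one NA
iris_complete <- na.omit(iris1)
nrow(iris1)          # 150 rows before
nrow(iris_complete)  # 146 rows after (4 rows had an NA)

set.seed(42)  # arbitrary seed, for reproducibility
fit_complete <- randomForest(Species ~ Sepal.Length, data = iris_complete)

# Alternative: leave iris1 untouched and drop incomplete rows at fit time
fit_inline <- randomForest(Species ~ Sepal.Length, data = iris1,
                           na.action = na.omit)
```

Dropping rows is reasonable here because only 4 of 150 observations are affected; with many missing values, imputation usually preserves more of the data.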
answered Apr 2, 2018 by Bharani
• 4,550 points
