Building a random forest on a dataset containing missing (NA) values

0 votes

I have a modified copy of the "iris" dataset that contains missing values:

iris1 <- iris
iris1$Sepal.Length[c(1,3,57,103)]<-NA

and I want to fit a random forest model on it:

randomForest(Species~Sepal.Length,data=iris1)

But I get this error:

Error in na.fail.default(list(Species = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,  : missing values in object

Is there a way to fit a random forest on this data despite the missing values?

Apr 3, 2018 in Data Analytics by nirvana
• 3,130 points

edited Apr 3, 2018 by nirvana 1,043 views

1 answer to this question.

0 votes

You have two options: impute the missing values or omit the rows that contain them.

To impute missing values in the predictors, you can use the rfImpute() function from the randomForest package. It only fills in predictors, so the response (Species here) must not contain any NAs.

The command below imputes the missing predictor values and overwrites iris1 with the completed data:

rfImpute(Species~.,data=iris1)->iris1

Now you can use the randomForest() function to fit the model on the imputed iris1 dataset:

randomForest(Species~Sepal.Length,data=iris1)
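Put together, the imputation route might look like the sketch below. It assumes the randomForest package is installed; the names iris_imputed and fit and the seed value are illustrative choices, not part of the original answer. The seed is set because both rfImpute() and randomForest() are randomized, so results differ between runs without it.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only so the run is repeatable

# Recreate the data from the question
iris1 <- iris
iris1$Sepal.Length[c(1, 3, 57, 103)] <- NA

# Impute NAs in the predictors; the response (Species) has no NAs
iris_imputed <- rfImpute(Species ~ ., data = iris1)

# Count missing values before and after imputation
sum(is.na(iris1$Sepal.Length))
sum(is.na(iris_imputed$Sepal.Length))

# Fit the model on the completed data
fit <- randomForest(Species ~ Sepal.Length, data = iris_imputed)
print(fit)
```

Storing the result under a new name rather than overwriting iris1 keeps the original data with its NAs available for comparison.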

If there are only a few missing values in your dataset, you can instead remove the affected rows with the na.omit() function:

na.omit(iris1)->iris1

After removing the rows with missing values, you can fit the random forest on the "iris1" dataset:

randomForest(Species~Sepal.Length,data=iris1)
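As a rough sketch of the omission route (again assuming the randomForest package is installed; iris_complete, fit_a and fit_b are illustrative names): the error in the question comes from the formula interface defaulting to na.action = na.fail, so besides calling na.omit() yourself you can pass na.action = na.omit to randomForest() and leave the data frame untouched.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only so the run is repeatable

# Recreate the data from the question
iris1 <- iris
iris1$Sepal.Length[c(1, 3, 57, 103)] <- NA

# Option A: drop incomplete rows first, then fit
iris_complete <- na.omit(iris1)
nrow(iris_complete)  # 150 rows minus the 4 with NA
fit_a <- randomForest(Species ~ Sepal.Length, data = iris_complete)

# Option B: let randomForest() drop incomplete rows itself
# (na.action must be passed as a named argument)
fit_b <- randomForest(Species ~ Sepal.Length, data = iris1,
                      na.action = na.omit)
```

Both options train on the same 146 complete rows here. Dropping rows is reasonable when few are affected; with many NAs it discards a lot of data, and imputation is usually the better choice.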
answered Apr 3, 2018 by Bharani
• 4,660 points
