Calculating accuracy of prediction of rpart model


I have a modified iris data set that consists of the first 100 rows, i.e. only the 'setosa' and 'versicolor' species. I have shuffled the rows using the sample() function:

iris1 <- iris1[sample(nrow(iris1)), ]  # shuffle the rows

Then I divided the data set in a 65:35 ratio using the sample.split() function from the caTools package:

library(caTools)
mysplit <- sample.split(iris1$Species, SplitRatio = 0.65)  # logical vector, TRUE = training row
train <- subset(iris1, mysplit == TRUE)
test <- subset(iris1, mysplit == FALSE)

After that, I built an rpart model on the "train" set and predicted the classes for the "test" set:

library(rpart)
mod1 <- rpart(Species ~ ., data = train)
result1 <- predict(mod1, test, type = "class")  # predicted class labels

Now I want to find the accuracy of the predictions on the test set. How can I do that?

Apr 4, 2018 in Data Analytics by nirvana (edited Apr 4, 2018)

1 answer to this question.


Your first task is to build a confusion matrix of the actual values against the predicted values. You can do that with the table() function:

table(test$Species,result1)

This gives the confusion matrix below, with the actual species in the rows and the predicted species in the columns:

             result1
             setosa versicolor
  setosa         18          0
  versicolor      0         18
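
If you want the matrix to say which dimension is which, you can name the dimensions. The sketch below uses small hypothetical vectors (actual and predicted) in place of test$Species and result1, and it assumes you want to drop the unused 'virginica' level, since a subset of iris normally keeps all three factor levels:

```r
# Hypothetical labels standing in for test$Species and result1
actual    <- factor(c("setosa", "setosa", "versicolor", "versicolor", "versicolor"),
                    levels = c("setosa", "versicolor", "virginica"))
predicted <- factor(c("setosa", "setosa", "versicolor", "setosa", "versicolor"),
                    levels = c("setosa", "versicolor", "virginica"))

# Drop the unused "virginica" level so it does not add an empty row and column
actual    <- droplevels(actual)
predicted <- factor(predicted, levels = levels(actual))

# dnn names the dimensions: rows are actual classes, columns are predictions
cm <- table(actual, predicted, dnn = c("Actual", "Predicted"))
print(cm)
```

In this made-up example one versicolor row is predicted as setosa, so it shows up as an off-diagonal count.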

Now you can find the accuracy by dividing the number of correct predictions (the diagonal of the matrix) by the total number of predictions:

(18+18)/(18+18+0+0)

This gives an accuracy of 1, i.e. 100%: all 36 test rows were classified correctly.
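
Rather than typing the counts by hand, you can compute the same ratio from the matrix itself. This is a minimal sketch: the matrix is entered manually from the counts above, and the two short label vectors at the end are hypothetical stand-ins, not the question's data:

```r
# Confusion matrix from the answer, entered by hand (rows = actual, columns = predicted)
cm <- matrix(c(18, 0, 0, 18), nrow = 2, byrow = TRUE,
             dimnames = list(Actual = c("setosa", "versicolor"),
                             Predicted = c("setosa", "versicolor")))

# Accuracy = correct predictions (the diagonal) / all predictions
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)

# Equivalent shortcut straight from label vectors (hypothetical stand-ins)
actual    <- c("setosa", "versicolor", "versicolor")
predicted <- c("setosa", "versicolor", "setosa")
print(mean(predicted == actual))
```

With the objects from the question, the same idea would be sum(diag(cm)) / sum(cm) with cm <- table(test$Species, result1), or simply mean(result1 == test$Species).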

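You can also use the confusionMatrix() function from the caret package. It expects the predictions in the rows and the actual values in the columns, so pass the predicted values first (the accuracy is the same either way, but the per-class statistics would otherwise be swapped):

library(caret)
confusionMatrix(table(result1, test$Species))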


This would be the result:

               Accuracy : 1
                 95% CI : (0.9026, 1)
    No Information Rate : 0.5
    P-Value [Acc > NIR] : 1.455e-11
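
If you are wondering where the interval and the p-value come from, you can cross-check them with binom.test() from base R. The sketch below assumes 36 correct predictions out of 36 and a no-information rate of 0.5, as in the output above. As far as I know caret uses an exact binomial test for these figures, but treat that as an assumption rather than something stated here:

```r
correct <- 36   # 18 + 18 correct predictions
total   <- 36   # all test-set predictions
nir     <- 0.5  # no-information rate: share of the largest class

# Exact (Clopper-Pearson) 95% confidence interval for the accuracy
ci <- binom.test(correct, total)$conf.int
print(round(ci, 4))

# One-sided test of whether the accuracy exceeds the no-information rate
p_value <- binom.test(correct, total, p = nir, alternative = "greater")$p.value
print(signif(p_value, 4))
```

The lower bound is well below 1 even though every prediction was correct, because 36 test rows are only limited evidence about the true accuracy.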

answered Apr 4, 2018 by Bharani
