Splitting the data into training and testing sets - R

I am working with the 'beaver1' data set. Below is a sample:

  day time  temp activ
1 346  840 36.33     0
2 346  850 36.34     0
3 346  900 36.35     0
4 346  910 36.42     0
5 346  920 36.55     0
6 346  930 36.69     0

I want to split this data into 'train' and 'test' sets in a 65:35 ratio so that I can build a machine learning model on top of it. How can I do it?

May 7, 2018 in Data Analytics by DataKing99

1 answer to this question.

You can use the sample.split() function from the caTools package for this purpose:

Start off by loading the 'caTools' package:

library(caTools)

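Then call the sample.split() function, which takes two parameters: a vector of labels (here the 'activ' column of 'beaver1') and SplitRatio, set to 0.65. Pass a column rather than the whole data frame, because the length of a data frame is its number of columns, so the split would otherwise be drawn over the 4 columns instead of the rows. The result is a logical vector with one TRUE/FALSE value per row:

mysplit <- sample.split(beaver1$activ, SplitRatio = 0.65)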
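
Because the split is random, each run selects different rows. As a minimal sketch (the seed value 123 is arbitrary, and the split is assumed to be drawn on the 'activ' label column, since sample.split() expects a vector of labels), you can fix the seed and inspect the result before subsetting:

```r
library(caTools)

set.seed(123)  # arbitrary seed so the same rows are selected on every run
mysplit <- sample.split(beaver1$activ, SplitRatio = 0.65)

length(mysplit) == nrow(beaver1)  # should be TRUE: one flag per row
mean(mysplit)                     # share of rows flagged TRUE, expected near 0.65
table(beaver1$activ, mysplit)     # how each 'activ' label is divided between the parts
```

If the length check fails, the split was most likely computed on the data frame instead of on one of its columns.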

Next, use the subset() function to select all observations where 'mysplit' is TRUE and store them in 'train':

train <- subset(beaver1, mysplit == TRUE)

Similarly, select all observations where 'mysplit' is FALSE and store them in 'test':

test <- subset(beaver1, mysplit == FALSE)

And that's how you split the data into 'train' and 'test' sets.
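
If you would rather not depend on caTools, a similar split can be sketched in base R with sample(). This is an alternative technique, not the caTools method: it draws row numbers purely at random and does not balance the 'activ' labels between the two sets. The names 'n_train' and 'train_idx' and the seed value are illustrative assumptions.

```r
# Base-R sketch: random 65:35 row split without extra packages.
data(beaver1)   # beaver1 ships with R's built-in 'datasets' package
set.seed(42)    # arbitrary seed, used only for reproducibility

n_train   <- floor(0.65 * nrow(beaver1))               # number of training rows
train_idx <- sample(seq_len(nrow(beaver1)), n_train)   # row numbers, drawn without replacement

train <- beaver1[train_idx, ]    # rows selected for training
test  <- beaver1[-train_idx, ]   # all remaining rows

nrow(train)
nrow(test)
```

Every row ends up in exactly one of the two sets, so the two row counts add up to nrow(beaver1).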

answered May 7, 2018 by Bharani
