How to sample n random rows per group in a dataframe?

0 votes

Here are some sample data:

df <- data.frame(matrix(rnorm(80), nrow=40))
df$color <-  rep(c("blue", "red", "yellow", "pink"), each=10)

df[sample(nrow(df), 3), ]  # samples 3 random rows from df, without replacement

To sample 3 random rows from just the 'pink' color, for example, using the kimisc package:

library(kimisc)
sample.rows(subset(df, color == "pink"), 3)

Or by writing a custom function:

sample.df <- function(df, n) df[sample(nrow(df), n), , drop = FALSE]
sample.df(subset(df, color == "pink"), 3)

I want to sample 3 (or n) random rows from each level of the factor, i.e. the new df would have 12 rows (3 from blue, 3 from red, 3 from yellow, 3 from pink).

I am looking for a really simple solution.

Jul 2, 2018 in Data Analytics by anonymous
98 views

1 answer to this question.

0 votes

You can use ave to assign a random ID to each row within its factor level, so that the rows of every level are numbered 1 to the level's size in random order. Then select the rows whose random ID falls in the range you want (here, 1 to 3).

rndid <- with(df, ave(X1, color, FUN=function(x) {sample.int(length(x))}))
df[rndid<=3,]
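
The same idea can be wrapped in a small reusable function. This is only a sketch: the name sample_per_group and its arguments are hypothetical, not part of base R or of the answer above, and set.seed is used only to make the example reproducible. A split/lapply variant built on the question's sample.df idea is shown alongside for comparison.

```r
# Hypothetical helper (illustrative name): keep at most n random rows per group.
sample_per_group <- function(data, group, n) {
  # seq_len(nrow(data)) is just a placeholder vector; ave() replaces each
  # group's entries with a random permutation of 1..(group size).
  rndid <- ave(seq_len(nrow(data)), data[[group]],
               FUN = function(x) sample.int(length(x)))
  data[rndid <= n, , drop = FALSE]
}

set.seed(42)  # only for reproducibility of this example
df <- data.frame(matrix(rnorm(80), nrow = 40))
df$color <- rep(c("blue", "red", "yellow", "pink"), each = 10)

res <- sample_per_group(df, "color", 3)
table(res$color)  # 3 rows for each of the four colors

# Alternative sketch: split by group, sample each piece, bind the pieces back.
res2 <- do.call(rbind, lapply(split(df, df$color),
                              function(d) d[sample(nrow(d), 3), , drop = FALSE]))
```

One difference worth noting: if a group has fewer than n rows, the ave version simply keeps all of that group's rows, whereas sample(nrow(d), 3) raises an error because it cannot draw more rows than exist without replacement.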

answered Jul 2, 2018 by darklord
• 6,140 points
