How to sample n random rows per group in a dataframe

0 votes

Here are some sample data:

df <- data.frame(matrix(rnorm(80), nrow=40))
df$color <-  rep(c("blue", "red", "yellow", "pink"), each=10)

df[sample(nrow(df), 3), ] #samples 3 random rows from df, without replacement.

To sample 3 random rows from just the 'pink' color, for example, I can use sample.rows from library(kimisc):

library(kimisc)
sample.rows(subset(df, color == "pink"), 3)

or write a custom function:

sample.df <- function(df, n) df[sample(nrow(df), n), , drop = FALSE]
sample.df(subset(df, color == "pink"), 3)

I want to sample 3 (or n) random rows from each level of the factor, i.e. the new df would have 12 rows (3 from blue, 3 from red, 3 from yellow, 3 from pink).

I am looking for a really simple solution.

Jul 3, 2018 in Data Analytics by anonymous
4,682 views

1 answer to this question.

0 votes

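You can use ave to give the rows within each factor level a random ID, which is a random permutation of 1 up to the number of rows in that level. Then you select the rows whose random ID falls in a certain range, here 1 to 3, which keeps 3 random rows per level.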

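# within each color, replace X1 by a random permutation of 1..(rows in that color)
rndid <- with(df, ave(X1, color, FUN = function(x) sample.int(length(x))))
df[rndid <= 3, ]  # rows with random ID 1, 2 or 3 in each color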
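
To make the n part of the question explicit, the same ave idea can be wrapped in a small helper. This is only a sketch under some assumptions: the function name sample_per_group and its arguments are hypothetical (they are not from kimisc or any other package), set.seed is there only so the run is reproducible, and seq_along is used instead of X1 so the helper does not depend on a particular numeric column.

```r
set.seed(42)  # assumption: fixed seed only to make the example reproducible

# Same sample data as in the question
df <- data.frame(matrix(rnorm(80), nrow = 40))
df$color <- rep(c("blue", "red", "yellow", "pink"), each = 10)

# Hypothetical helper: keep at most n randomly chosen rows per value of `group`
sample_per_group <- function(data, group, n) {
  # ave() splits the row positions by group; inside each group,
  # sample.int() returns a random permutation of 1..(group size)
  rndid <- ave(seq_along(group), group,
               FUN = function(idx) sample.int(length(idx)))
  # keep the rows whose random ID is 1..n within their group
  data[rndid <= n, , drop = FALSE]
}

picked <- sample_per_group(df, df$color, 3)
nrow(picked)         # 12 rows in total
table(picked$color)  # each color appears 3 times
```

If a group has fewer than n rows, this version simply returns all of its rows rather than raising an error, which may or may not be what you want.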

answered Jul 3, 2018 by Sahiti
• 6,370 points
