How can I remove duplicated rows in R?

Apr 27, 2018 in Data Analytics by zombie

1 answer to this question.

The distinct() function in the dplyr package removes duplicate rows, either across all columns or based only on the columns you specify.

Data:

dat <- data.frame(m = rep(c(1, 2), 4), n = rep(LETTERS[1:4], 2))

Remove rows where specified columns have been duplicated:

library(dplyr)
dat %>% distinct(m, .keep_all = TRUE)

  m n
1 1 A
2 2 B

Remove rows which are complete duplicates of other rows:

dat %>% distinct()

  m n
1 1 A
2 2 B
3 1 C
4 2 D
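
If you would rather not load dplyr, the same two results can be sketched in base R with duplicated(). This is only an illustrative equivalent, not part of the original answer; the names by_m and by_m_n are made up for the example.

```r
# Same example data as above
dat <- data.frame(m = rep(c(1, 2), 4), n = rep(LETTERS[1:4], 2))

# Keep the first row for each distinct value of m
# (comparable to distinct(m, .keep_all = TRUE))
by_m <- dat[!duplicated(dat$m), ]

# Keep the first row for each distinct (m, n) combination
# (comparable to distinct() on the whole data frame)
by_m_n <- dat[!duplicated(dat[, c("m", "n")]), ]
```

Passing a subset of columns to duplicated() decides which columns count when rows are compared, while the row index on the outside keeps all columns in the result.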

General answer for duplicate row removal, using base R:

m <- c(rep("A", 3), rep("B", 3), rep("C", 2))
n <- c(1, 1, 2, 4, 1, 1, 2, 2)
df <- data.frame(m, n)

duplicated(df)
[1] FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE

df[duplicated(df), ]
  m n
2 A 1
6 B 1
8 C 2

df[!duplicated(df), ]
  m n
1 A 1
3 A 2
4 B 4
5 B 1
7 C 2
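
Two related base R variations may also be useful. The sketch below reuses the df from above; first_kept and last_kept are illustrative names, not from the original answer. unique() gives the same result as df[!duplicated(df), ], and fromLast = TRUE keeps the last copy of each duplicated row instead of the first.

```r
m <- c(rep("A", 3), rep("B", 3), rep("C", 2))
n <- c(1, 1, 2, 4, 1, 1, 2, 2)
df <- data.frame(m, n)

# unique() on a data frame drops duplicated rows, keeping the first copy
first_kept <- unique(df)

# fromLast = TRUE scans from the bottom, so the last copy of each row is kept
last_kept <- df[!duplicated(df, fromLast = TRUE), ]
```

With this data, first_kept holds rows 1, 3, 4, 5 and 7, while last_kept holds rows 2, 3, 4, 6 and 8.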
answered Apr 27, 2018 by shams
