How can I remove duplicated rows in R?

Apr 26, 2018 in Data Analytics by zombie

1 answer to this question.

The distinct() function in the dplyr package removes duplicate rows, either across all columns or based on a chosen subset of columns.

Data:

dat <- data.frame(m = rep(c(1, 2), 4), n = rep(LETTERS[1:4], 2))

Remove rows that duplicate an earlier row in the specified columns (.keep_all = TRUE retains the remaining columns from the first matching row):

library(dplyr)
dat %>% distinct(m, .keep_all = TRUE)

  m n
1 1 A
2 2 B

Remove rows which are complete duplicates of other rows:

dat %>% distinct()

  m n
1 1 A
2 2 B
3 1 C
4 2 D
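
As a small sketch of how the column arguments behave, the snippet below reuses the example data and assumes dplyr is installed. The names only_m and by_m_n are illustrative and not part of the original answer.

```r
library(dplyr)

# Same example data as above
dat <- data.frame(m = rep(c(1, 2), 4), n = rep(LETTERS[1:4], 2))

# Without .keep_all = TRUE, only the columns named in distinct() are returned
only_m <- dat %>% distinct(m)

# Several columns can be listed; rows are then unique on the combination
by_m_n <- dat %>% distinct(m, n)

print(only_m)
print(by_m_n)
```

Here only_m should contain a single column m, while by_m_n keeps every distinct (m, n) pair.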

General answer for duplicate row removal, using base R:

m <- c(rep("A", 3), rep("B", 3), rep("C", 2))
n <- c(1, 1, 2, 4, 1, 1, 2, 2)
df <- data.frame(m, n)

duplicated(df)
[1] FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE

df[duplicated(df), ]
  m n
2 A 1
6 B 1
8 C 2

df[!duplicated(df), ]
  m n
1 A 1
3 A 2
4 B 4
5 B 1
7 C 2
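
The same base R approach can probably be extended to a single column, or shortened with unique(). The following is a minimal sketch using only base R functions. The variable names full_dedup, first_per_m and last_per_m are made up for illustration.

```r
# Same example data as above
m <- c(rep("A", 3), rep("B", 3), rep("C", 2))
n <- c(1, 1, 2, 4, 1, 1, 2, 2)
df <- data.frame(m, n)

# unique() drops fully duplicated rows, like df[!duplicated(df), ]
full_dedup <- unique(df)

# Keep only the first row for each value of column m
first_per_m <- df[!duplicated(df$m), ]

# fromLast = TRUE keeps the last occurrence of each value instead
last_per_m <- df[!duplicated(df$m, fromLast = TRUE), ]

print(full_dedup)
print(first_per_m)
print(last_per_m)
```

Passing a column (or a subset such as df[, c("m", "n")]) to duplicated() controls which columns decide whether a row counts as a duplicate.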
answered Apr 26, 2018 by shams
