How to join two tables tibbles by list columns in R

0 votes

If the list of both key variables is identical, I want the row to be included in the final table, and if not then it should not be included. In the below example there are two key variables x and y. So the result should be only the first row since it's the only identical one in both key variables.

Refer to the code below:

tib1 <- tibble(x = list(1, 2, 3), y = list(4, 5, 6))
tib1
# A tibble: 3 × 2
      x         y
 <list>    <list>
1 <dbl [1]> <dbl [1]>
2 <dbl [1]> <dbl [1]>
3 <dbl [1]> <dbl [1]>

tib2 <- tibble(x = list(1, 2, 4, 5), y = list(4, c(5, 10), 6, 7))
tib2
# A tibble: 4 × 2
      x         y
 <list>    <list>
1 <dbl [1]> <dbl [1]>
2 <dbl [1]> <dbl [2]>
3 <dbl [1]> <dbl [1]>
4 <dbl [1]> <dbl [1]>

dplyr::inner_join(tib1, tib2)
Apr 6, 2018 in Data Analytics by DataKing99
• 8,250 points
1,683 views

1 answer to this question.

0 votes

You can use the hash from digest as follows:

tib1 <- tibble(x = list(1, 2, 3), y = list(4, 5, 6))
tib2 <- tibble(x = list(1, 2, 4, 5), y = list(4, c(5, 10), 6, 7))

tib1 <- mutate_all(tib1, funs(hash = map_chr(., digest::digest)))
tib2 <- mutate_all(tib2, funs(hash = map_chr(., digest::digest)))

dplyr::inner_join(tib1, tib2, c('x_hash', 'y_hash')) %>%
  select(x.x, x.y)

answered Apr 6, 2018 by kappa3010
• 2,090 points

Related Questions In Data Analytics

+1 vote
2 answers

How to sort a data frame by columns in R?

You can use dplyr function arrange() like ...READ MORE

answered Aug 21, 2019 in Data Analytics by anonymous
• 33,030 points
1,789 views
0 votes
2 answers

How to use group by for multiple columns in dplyr, using string vector input in R?

data = data.frame(   zzz11def = sample(LETTERS[1:3], 100, replace=TRUE),   zbc123qws1 ...READ MORE

answered Aug 6, 2019 in Data Analytics by anonymous
14,079 views
0 votes
1 answer

How to convert a list to data frame in R?

Let's assume your list of lists is ...READ MORE

answered Apr 12, 2018 in Data Analytics by nirvana
• 3,130 points

edited Apr 12, 2018 by nirvana 22,092 views
+4 votes
3 answers

How to sum a variable by group in R?

You can also try this way, x_new = ...READ MORE

answered Aug 1, 2019 in Data Analytics by Cherukuri
• 33,030 points
78,114 views
+1 vote
5 answers

How to remove NA values with dplyr::filter()

Try this: df %>% filter(!is.na(col1)) READ MORE

answered Mar 26, 2019 in Data Analytics by anonymous
330,909 views
0 votes
1 answer

How can I use parallel so that it preserves the list of data frames

You can use pmap as follows: nc <- ...READ MORE

answered Apr 4, 2018 in Data Analytics by kappa3010
• 2,090 points
1,052 views
0 votes
1 answer

Join list of data.frames using map() call

You can use Reduce set.seed(24) r1 <- map(c(5, 10, 15), ...READ MORE

answered Apr 6, 2018 in Data Analytics by Sahiti
• 6,370 points
1,030 views
0 votes
1 answer

How to use dplyr functions such as filter() inside nested data frames with map()

You can use map() call as follows:  map(full, ...READ MORE

answered Apr 6, 2018 in Data Analytics by Sahiti
• 6,370 points
4,664 views
0 votes
1 answer
0 votes
1 answer
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP