How to merge data frames using joins?

0 votes

Below are two data frames:

data1 = data.frame(Cust_Id = c(1:6), Product = c(rep("Headphones", 3), rep("Toaster", 3)))
data2 = data.frame(Cust_Id = c(2, 4, 6), State = c(rep("New York", 2), rep("California", 1)))

data1
#  Cust_Id    Product
#        1 Headphones
#        2 Headphones
#        3 Headphones
#        4    Toaster
#        5    Toaster
#        6    Toaster

data2
#  Cust_Id      State
#        2   New York
#        4   New York
#        6 California

How can I apply joins on these data frames?

  • An inner join of data1 and data2:
    This should return only the rows in which the left table has matching keys in the right table.
  • An outer join of data1 and data2:
    This should return all rows from both tables, joining records from the left table that have matching keys in the right table.
  • A left outer join or a simple left join of data1 and data2:
    This should return all rows from the left table, and any rows with matching keys from the right table.
  • A right outer join of data1 and data2:
    This should return all rows from the right table, and any rows with matching keys from the left table.
Apr 12, 2018 in Data Analytics by CodingByHeart77
• 3,680 points
47 views

1 answer to this question.

0 votes

You can use the merge function with its optional parameters.

Inner join: merge(data1, data2) will work.

This is because, by default, R joins the data frames on the variable names they have in common.
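As a quick check of that default, here is a minimal sketch using the two data frames from the question. The helper variable names (common_cols, inner) are only for illustration.

```r
# The two data frames from the question
data1 <- data.frame(Cust_Id = 1:6,
                    Product = c(rep("Headphones", 3), rep("Toaster", 3)))
data2 <- data.frame(Cust_Id = c(2, 4, 6),
                    State = c(rep("New York", 2), rep("California", 1)))

# Columns present in both frames; merge() joins on these when `by` is not given
common_cols <- intersect(names(data1), names(data2))
print(common_cols)

# Inner join on the shared column: only customers found in both frames remain
inner <- merge(data1, data2)
print(inner)
```

Here the only shared column is Cust_Id, so the result should contain just the customers that appear in both data frames.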

To make sure that you are matching by only those fields that you desire, you can specify the 'by' parameter like this:

 merge(data1, data2, by = "Cust_Id")

You can also use the by.x and by.y parameters if the matching variable has a different name in each data frame.
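For example, here is a small sketch of by.x and by.y. The data frame data3 and its Customer column are hypothetical, made up only to show a key that is named differently from Cust_Id.

```r
# data1 as defined in the question
data1 <- data.frame(Cust_Id = 1:6,
                    Product = c(rep("Headphones", 3), rep("Toaster", 3)))

# Hypothetical frame: same content as data2, but the key column is called Customer
data3 <- data.frame(Customer = c(2, 4, 6),
                    State = c("New York", "New York", "California"))

# by.x names the key in the first frame, by.y names the key in the second
joined <- merge(data1, data3, by.x = "Cust_Id", by.y = "Customer")
print(joined)
```

The key column in the result takes its name from the first data frame, so it appears as Cust_Id.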

For all other joins you can use the commands mentioned below:

Outer join: merge(x = data1, y = data2, by = "Cust_Id", all = TRUE)

Left outer: merge(x = data1, y = data2, by = "Cust_Id", all.x = TRUE)

Right outer: merge(x = data1, y = data2, by = "Cust_Id", all.y = TRUE)

Cross join: merge(x = data1, y = data2, by = NULL)
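To see how these differ, the sketch below runs each join on the data frames from the question and compares the number of rows. The variable names (inner, outer, left, right, cross) are only for illustration.

```r
# The two data frames from the question
data1 <- data.frame(Cust_Id = 1:6,
                    Product = c(rep("Headphones", 3), rep("Toaster", 3)))
data2 <- data.frame(Cust_Id = c(2, 4, 6),
                    State = c(rep("New York", 2), rep("California", 1)))

inner <- merge(x = data1, y = data2, by = "Cust_Id")                # matching keys only
outer <- merge(x = data1, y = data2, by = "Cust_Id", all = TRUE)    # all rows from both
left  <- merge(x = data1, y = data2, by = "Cust_Id", all.x = TRUE)  # all rows from data1
right <- merge(x = data1, y = data2, by = "Cust_Id", all.y = TRUE)  # all rows from data2
cross <- merge(x = data1, y = data2, by = NULL)                     # every pair of rows

# Row counts per join type
print(c(inner = nrow(inner), outer = nrow(outer), left = nrow(left),
        right = nrow(right), cross = nrow(cross)))

# Rows of data1 without a match in data2 get NA in the State column
print(left)
```

With this data every Cust_Id in data2 also exists in data1, so the outer join and the left join return the same rows, and the right join matches the inner join. In the cross join the shared column name is disambiguated with suffixes (Cust_Id.x and Cust_Id.y).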

answered Apr 12, 2018 by kappa3010
• 2,010 points
