.SD in data.table in R

0 votes

What does .SD stand for? How is it helpful and when to use it?

According to the data.table documentation, .SD is a data.table containing the Subset of x's Data for each group, excluding the group column(s).

It can be used when grouping by i, when grouping by by, with keyed by, and with ad hoc by.

Does that mean that the subset data.tables are held in memory for the next operation?

Apr 13, 2018 in Data Analytics by kappa3010
• 2,090 points
5,034 views

1 answer to this question.

0 votes

.SD stands for "Subset of Data.table". The leading dot has no special meaning; it only makes a clash with a user-defined column name unlikely.

Consider the following data.table:

library(data.table)
DT = data.table(a=rep(c("x","y","z"),each=2), b=c(1,3), p=1:6)
setkey(DT, b)
DT
#    a b p
# 1: x 1 1
# 2: y 1 3
# 3: z 1 5
# 4: x 3 2
# 5: y 3 4
# 6: z 3 6

Try the code below to see what .SD does:

DT[ , .SD[ , paste(a, p, sep="", collapse="_")], by=b]
#    b       V1
# 1: 1 x1_y3_z5
# 2: 3 x2_y4_z6

The by=b argument divides the original data.table into two sub-data.tables

DT[ , print(.SD), by=b]
# 1st sub-data.table, called '.SD' while it's being operated on:
#    a p
# 1: x 1
# 2: y 3
# 3: z 5
# 2nd sub-data.table, called '.SD' while it's being operated on:
#    a p
# 1: x 2
# 2: y 4
# 3: z 6
# Final output, since print() doesn't return anything
# Empty data.table (0 rows) of 1 col: b
and operates on them in turn.

While it is operating on any one of the subsets, it lets you refer to the current sub-data.table by the nickname/handle/symbol .SD.

So, you can access and operate on the columns very easily.
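
As a small sketch of that point, the code below uses the same toy table as above and inspects .SD inside j. The result column names (n_rows, cols, p_sum) are made up for this illustration and are not part of data.table.

```r
library(data.table)

# Same toy table as in the answer above.
DT <- data.table(a = rep(c("x", "y", "z"), each = 2), b = c(1, 3), p = 1:6)

# Inside j, .SD holds the rows of the current group without the grouping
# column b, so its columns can be used like those of any data.table.
res <- DT[, .(n_rows = nrow(.SD),                      # rows in this group
              cols   = paste(names(.SD), collapse = ","), # columns .SD exposes
              p_sum  = sum(.SD$p)),                    # operate on one column
          by = b]
print(res)
```

Each group should report the columns a and p only, since b is the grouping column and is left out of .SD.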

data.table carries out the operation on every sub-data.table defined by the combinations of the by column(s), then "pastes" the per-group results back together and returns them in a single data.table.
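
To make the split, operate, and combine steps visible, here is a hand-rolled version placed next to the grouped call. This is only an illustration of the idea, not how data.table implements grouping internally; the names pieces and manual are hypothetical.

```r
library(data.table)

DT <- data.table(a = rep(c("x", "y", "z"), each = 2), b = c(1, 3), p = 1:6)

# Grouped call: j is evaluated once per value of b.
grouped <- DT[, .(V1 = paste(a, p, sep = "", collapse = "_")), by = b]

# Manual equivalent, for illustration only:
# 1. split the table into one sub-data.table per value of b
pieces <- split(DT, by = "b")
# 2. run the same operation on each piece
# 3. bind the per-group results back into one data.table
manual <- rbindlist(lapply(pieces, function(sd) {
  data.table(b = sd$b[1], V1 = paste(sd$a, sd$p, sep = "", collapse = "_"))
}))

print(grouped)
print(manual)
```

Both tables should hold the same two rows, which is why the grouped form is usually preferred: it expresses the same steps in one call.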

answered Apr 13, 2018 by nirvana
• 3,130 points
