5 ways to measure running time of R code:

1. Using Sys.time

The run time of a chunk of code can be measured by taking the difference between the time at the start and at the end of the code chunk.

Simple yet flexible :sunglasses:.

sleep_for_a_minute <- function() { Sys.sleep(60) }

start_time <- Sys.time()

sleep_for_a_minute()

end_time <- Sys.time()

end_time - start_time

# Time difference of 1.000327 mins

2. Library tictoc

The functions tic and toc are used in the same manner for benchmarking as the just demonstrated Sys.time.

However tictoc adds a lot more convenience to the whole.

One can time a single code chunk:

library(tictoc)

tic("sleeping")

print("falling asleep...")

sleep_for_a_minute()

print("...waking up")

toc()

# [1] "falling asleep..."

# [1] "...waking up"

# sleeping: 60.026 sec elapsed

Or nest multiple timers:

tic("total")

tic("data generation")

X <- matrix(rnorm(50000*1000), 50000, 1000)

b <- sample(1:1000, 1000)

y <- runif(1) + X %*% b + rnorm(50000)

toc()

tic("model fitting")

model <- lm(y ~ X)

toc()

toc()

# data generation: 3.792 sec elapsed

# model fitting: 39.278 sec elapsed

# total: 43.071 sec elapsed

3. Using system.time

One can time the evaluation of an R expression using system.time.

For example, we can use it to measure the execution time of the function sleep_for_a_minute (defined above) as follows.

system.time({ sleep_for_a_minute() })

# user system elapsed

# 0.004 0.000 60.051

4. Library rbenchmark

The documentation to the function benchmark from the rbenchmark R package describes it as “a simple wrapper around system.time”.

However it adds a lot of convenience compared to bare system.time calls.

For example it requires just one benchmark call to time multiple replications of multiple expressions.

library(rbenchmark)

benchmark("lm" = {

X <- matrix(rnorm(1000), 100, 10)

y <- X %*% sample(1:10, 10) + rnorm(100)

b <- lm(y ~ X + 0)$coef

},

"pseudoinverse" = {

X <- matrix(rnorm(1000), 100, 10)

y <- X %*% sample(1:10, 10) + rnorm(100)

b <- solve(t(X) %*% X) %*% t(X) %*% y

},

"linear system" = {

X <- matrix(rnorm(1000), 100, 10)

y <- X %*% sample(1:10, 10) + rnorm(100)

b <- solve(t(X) %*% X, t(X) %*% y)

},

replications = 1000,

columns = c("test", "replications", "elapsed",

"relative", "user.self", "sys.self"))

# test replications elapsed relative user.self sys.self

# 3 linear system 1000 0.167 1.000 0.208 0.240

# 1 lm 1000 0.930 5.569 0.952 0.212

# 2 pseudoinverse 1000 0.240 1.437 0.332 0.612

5. Library microbenchmark

Much like benchmark from the package rbenchmark, the function microbenchmark can be used to compare running times of multiple R code chunks.

But it offers a great deal of convenience and additional functionality.

library(microbenchmark)

set.seed(2017)

n <- 10000

p <- 100

X <- matrix(rnorm(n*p), n, p)

y <- X %*% rnorm(p) + rnorm(100)

check_for_equal_coefs <- function(values) {

tol <- 1e-12

max_error <- max(c(abs(values[[1]] - values[[2]]),

abs(values[[2]] - values[[3]]),

abs(values[[1]] - values[[3]])))

max_error < tol

}

mbm <- microbenchmark("lm" = { b <- lm(y ~ X + 0)$coef },

"pseudoinverse" = {

b <- solve(t(X) %*% X) %*% t(X) %*% y

},

"linear system" = {

b <- solve(t(X) %*% X, t(X) %*% y)

},

check = check_for_equal_coefs)

mbm

# Unit: milliseconds

# expr min lq mean median uq max neval cld

# lm 96.12717 124.43298 150.72674 135.12729 188.32154 236.4910 100 c

# pseudoinverse 26.61816 28.81151 53.32246 30.69587 80.61303 145.0489 100 b

# linear system 16.70331 18.58778 35.14599 19.48467 22.69537 138.6660 100 a