As everyone knows, when you are solving a problem or looking for guidance, an excellent example is always helpful.

How do I create an example? What details should I include? How do I paste data structures from R in a text format?

Are there any tips and tricks available in addition to using dput(), dump() or structure()?

When should you include library() or require() statements?

Which reserved words should one avoid, in addition to c, df, data, etc?

Any help is highly appreciated! Apr 10, 2018

## 1 answer to this question.

An excellent example consists of the following items:

• A small dataset
• The minimal runnable code necessary to reproduce the error on that dataset
• The necessary system information: the R version and the packages used, with their versions

You can also look at the examples in the help files, as they are often helpful. In general, all the code given there fulfills these requirements: data is provided and the code is minimal.

# How to produce a minimal dataset?

In most cases, a simple dataset can be built by providing a vector or data frame with some values. Alternatively, you can use one of the built-in datasets.

library(help = "datasets") lists every built-in dataset with a short description. For more information on a particular dataset, put a question mark in front of its name.

Example: ?mtcars, where 'mtcars' is one of the datasets in the list.
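As a minimal sketch of using a built-in dataset, the snippet below lists what the datasets package offers and cuts mtcars down to a few rows and columns. The column selection and the object names (available, small_cars) are only illustrations.

```r
# Names of the datasets shipped with the 'datasets' package
available <- data(package = "datasets")$results[, "Item"]
head(available)

# Keep only the rows and columns needed to show the problem
# (the three columns chosen here are an arbitrary illustration)
small_cars <- head(mtcars[, c("mpg", "cyl", "wt")], 5)
small_cars
```

A slice like this is usually enough for readers to run your code without downloading anything.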

You can also create a vector yourself. Several functions generate or shuffle values, such as:

• x <- rnorm(20) for a normal distribution, x <- runif(20) for a uniform distribution
• sample() to randomize a vector: x <- sample(1:10) gives the vector 1:10 in random order
• letters is a useful vector containing the alphabet. It can be used for making factors, for example: x <- sample(letters[1:4], 20, replace = TRUE)
• For matrices, you can use matrix(1:30, ncol = 3)
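Because these functions draw random values, every reader would otherwise get different data. A common remedy, not mentioned in the list above, is to call set.seed() first. The sketch below assumes an arbitrary seed of 42 and uses illustrative object names.

```r
# Fix the random number generator so everyone gets the same values
set.seed(42)
x_norm <- rnorm(20)                                        # normal distribution
x_unif <- runif(20)                                        # uniform distribution
x_perm <- sample(1:10)                                     # 1:10 in random order
x_fac  <- factor(sample(letters[1:4], 20, replace = TRUE)) # factor with up to 4 levels
m      <- matrix(1:30, ncol = 3)                           # 10 x 3 matrix

# Resetting the seed reproduces the same draws
set.seed(42)
a <- rnorm(5)
set.seed(42)
b <- rnorm(5)
identical(a, b)
```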

If you want to make a data frame, use data.frame(). Keep the names of the entries simple.

For example:

` Data <- data.frame( X = sample(1:20), Y = sample(c("yes", "no"), 20, replace = TRUE) ) `

# How to copy your data?

If you have a large dataset, you can always create a subset of your original data using head(), subset() or indices.

After that, you can use dput() to produce output that others can paste into R immediately.

` dput(head(iris,4)) `
`structure(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6), Sepal.Width = c(3.5, 3, 3.2, 3.1), Petal.Length = c(1.4, 1.4, 1.3, 1.5), Petal.Width = c(0.2, 0.2, 0.2, 0.2), Species = structure(c(1L, 1L, 1L, 1L), .Label = c("setosa", "versicolor", "virginica"), class = "factor")), .Names = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species"), row.names = c(NA, 4L), class = "data.frame") `

If your data frame has a factor with many levels, the dput() output can be unwieldy, because it still lists all the possible factor levels even if they aren't present in your chosen subset.

To avoid this, you can use the droplevels() function.

`dput(droplevels(head(iris, 4)))`
`structure(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6), Sepal.Width = c(3.5, 3, 3.2, 3.1), Petal.Length = c(1.4, 1.4, 1.3, 1.5), Petal.Width = c(0.2, 0.2, 0.2, 0.2), Species = structure(c(1L, 1L, 1L, 1L), .Label = "setosa", class = "factor")), .Names = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species"), row.names = c(NA, 4L), class = "data.frame") `
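To check that dput() output really can be pasted back into R, the following sketch captures the printed text and evaluates it again. Using capture.output() and eval(parse()) here is just one way to simulate copy and paste; the object names are illustrative.

```r
# Small subset with unused factor levels dropped
small <- droplevels(head(iris, 4))

# Capture the text that dput() prints, as a reader would copy it
txt <- capture.output(dput(small))

# Evaluate the text, as a reader would after pasting it into R
restored <- eval(parse(text = txt))

# The restored object should match the original data
all.equal(small, restored)
levels(restored$Species)
```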

Also, dput() does not work on keyed data.table objects or on grouped tbl_df objects (class grouped_df) from dplyr.

In these cases you can convert the data back to a regular data frame before sharing: dput(as.data.frame(my_data)).

You can also give a text representation that can be read in using the text parameter of read.table(). Each row must be on its own line, so the string below separates the rows with \n:

`zz <- "Sepal.Length Sepal.Width Petal.Length Petal.Width Species\n1 5.1 3.5 1.4 0.2 setosa\n2 4.9 3.0 1.4 0.2 setosa\n3 4.7 3.2 1.3 0.2 setosa\n4 4.6 3.1 1.5 0.2 setosa\n5 5.0 3.6 1.4 0.2 setosa\n6 5.4 3.9 1.7 0.4 setosa"; Data <- read.table(text = zz, header = TRUE)`
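One way to produce such a text representation from your own data, sketched here under the assumption that the data has no spaces inside its values, is to print it with write.table() and read it back to confirm it parses. The object names are illustrative.

```r
# Print the first rows as plain, whitespace-separated text
txt <- capture.output(write.table(head(iris, 6), quote = FALSE))
cat(txt, sep = "\n")

# Read the text back; the header has one field fewer than the data rows,
# so read.table() treats the first column as row names
Data <- read.table(text = paste(txt, collapse = "\n"), header = TRUE)
str(Data)
```

If values contain spaces, this round trip breaks and dput() is the safer choice.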

# How to produce minimal code?

• You should not add all kinds of data formats (unless that is the problem of course)
• You should not copy-paste a whole function/chunk of code that gives an error.

What you should do instead:

• Include library() calls for the packages that are actually used.
• If you open connections or create files, add some code to close them or delete the files (using unlink()).
• If you change options, make sure the code contains a statement to revert them to the original ones.
• Test run your code in a new, empty R session to make sure the code is runnable.
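The clean-up advice above can be sketched as follows. The option being changed (digits) and the temporary CSV file are only illustrations chosen for this sketch.

```r
# Change an option, keeping the old value so it can be restored
old_opts <- options(digits = 4)

# Create a temporary file instead of writing into the reader's working directory
tmp <- tempfile(fileext = ".csv")
write.csv(head(mtcars, 3), tmp)
dat <- read.csv(tmp, row.names = 1)
print(dat)

# Clean up: delete the file and restore the original options
unlink(tmp)
options(old_opts)
```

options() returns the previous values of whatever it changes, which is why passing old_opts back restores them.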

# Give the required information

• Make sure you give complete information on your R version, operating system, etc.
• If you are running R in RStudio, rstudioapi::versionInfo() can help you report your RStudio version.
• If you have a problem with a specific package, you can provide its version by giving the output of packageVersion("name of the package").

answered Apr 10, 2018

edited Apr 12, 2018
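The version details asked for in the answer can be collected with a few base R calls. This is a minimal sketch; the package passed to packageVersion() is only an illustration, and sessionInfo() is a suggestion that goes slightly beyond what the answer lists.

```r
# R version as a single string
r_version <- R.version.string
r_version

# Operating system name
os_name <- Sys.info()[["sysname"]]
os_name

# Version of one specific package ("stats" is just an example)
pkg_version <- packageVersion("stats")
pkg_version

# R version, platform and attached packages in one printout
info <- sessionInfo()
print(info)
```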
