How to treat missing values during analysis?

0 votes
I am trying to perform analysis in R and I have a lot of missing values in my dataset.

I want to know how to treat these missing values.

Can someone please help?
Jul 12, 2018 in Data Analytics by CodingByHeart77
• 3,710 points
74 views

1 answer to this question.

0 votes

First identify which variables have missing values, then measure how much is missing in each. If the missing values follow a pattern, concentrate on it, because the pattern itself can lead to interesting and meaningful business insights.
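As a rough sketch of that first step, the snippet below counts missing values per variable. It uses R's built-in `airquality` dataset purely for illustration (an assumption, not the asker's data); substitute your own data frame for `df`.

```r
# airquality ships with base R (datasets package) and contains NAs
df <- airquality

# Number of missing values in each variable
na_count <- colSums(is.na(df))

# Share of missing values in each variable, as a percentage
na_pct <- round(100 * colMeans(is.na(df)), 1)

# Show only the variables that actually have missing values
print(na_count[na_count > 0])
print(na_pct[na_pct > 0])

# Number of rows with no missing values at all
n_complete <- sum(complete.cases(df))
print(n_complete)
```

Looking at which rows are incomplete (for example `df[!complete.cases(df), ]`) is a simple way to start checking whether the missingness follows a pattern.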

If no pattern is identified, the missing values can be substituted with the mean or median (imputation), or they can simply be ignored.
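A minimal sketch of both options in base R follows. The vector `x` is made-up data and `impute_na` is a hypothetical helper written for this example, not a function from any package.

```r
# Small illustrative vector with missing values (made-up data)
x <- c(1, 2, 3, NA, 4, 5, NA, NA)

# Hypothetical helper: replace NAs with a summary statistic
# computed from the observed (non-missing) values
impute_na <- function(v, stat = mean) {
  v[is.na(v)] <- stat(v, na.rm = TRUE)
  v
}

x_mean   <- impute_na(x, mean)    # NAs replaced by the mean of the observed values
x_median <- impute_na(x, median)  # NAs replaced by the median of the observed values

# "Ignoring" the NAs instead: many summary functions accept na.rm = TRUE
observed_mean <- mean(x, na.rm = TRUE)
```

Because `stat` is just a function argument, passing `min` or `max` fills the gaps with the minimum or maximum instead.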

Another option is to assign a default value, which can be the mean, minimum or maximum of the variable. Which one is appropriate depends on the data, so it is important to explore the data first.

If the variable is categorical, assign a default value to its missing entries. If the variable is numeric and roughly normally distributed, use the mean.
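The answer does not say what the default for a categorical variable should be. Two common choices, shown here as assumptions rather than a prescription, are the most frequent level and an explicit "Unknown" level. The `colour` factor is made-up data.

```r
# Made-up categorical variable with missing entries
colour <- factor(c("red", "blue", NA, "red", NA, "green", "red"))

# Option 1: fill with the most frequent level (the mode).
# table() leaves NAs out of the counts by default.
mode_level <- names(which.max(table(colour)))
colour_mode <- colour
colour_mode[is.na(colour_mode)] <- mode_level

# Option 2: make missingness its own explicit level
colour_unknown <- as.character(colour)
colour_unknown[is.na(colour_unknown)] <- "Unknown"
colour_unknown <- factor(colour_unknown)

print(table(colour_mode))
print(table(colour_unknown))
```

Option 2 keeps the information that a value was missing, which can matter if the missingness turns out to follow a pattern.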

If around 80% of the values for a variable are missing, it is usually better to drop the variable than to treat its missing values.
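One way to apply that rule is sketched below on a made-up data frame. The 0.8 cut-off comes from the answer; treat it as a tunable rule of thumb rather than a fixed standard.

```r
# Made-up data frame: column b is almost entirely missing
df <- data.frame(
  a = c(1, 2, NA, 4, 5, 6),
  b = c(NA, NA, NA, NA, NA, 10),
  c = c("x", "y", "z", NA, "x", "y")
)

# Fraction of missing values per column
na_frac <- colMeans(is.na(df))

# Columns with at least this share of NAs are dropped
threshold <- 0.8

# drop = FALSE keeps the result a data frame even if one column remains
df_kept <- df[, na_frac < threshold, drop = FALSE]

print(names(df_kept))
```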

In practice, R lets you apply any of these methods: mean imputation, median imputation, replacing with dummy values, filling based on correlations and similarities with other variables, removing the record entirely, or leaving the record as it is.
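Two of the listed methods that differ most are removing incomplete records and filling from a correlated variable. The sketch below shows both on the built-in `airquality` dataset. Reading "filling with correlations" as a simple linear regression is an assumption on my part, and using `Temp` to predict `Ozone` is only an illustrative choice.

```r
df <- airquality

# Remove every record that has at least one NA
df_complete <- na.omit(df)

# Fill from a correlated variable: regress Ozone on Temp,
# then predict Ozone where it is missing.
# lm() drops rows with NA in the model variables by default.
fit <- lm(Ozone ~ Temp, data = df)
missing_ozone <- is.na(df$Ozone)
df$Ozone[missing_ozone] <- predict(fit, newdata = df[missing_ozone, ])

print(nrow(df_complete))
print(sum(is.na(df$Ozone)))
```

Regression-based filling preserves all rows but makes the filled values look more regular than real observations would, so it is worth comparing results against the complete-case version.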

answered Jul 12, 2018 by Sahiti
• 6,290 points
