How to treat missing values during analysis

0 votes
I am trying to perform analysis in R and I have a lot of missing values in my dataset.

I want to know how to treat these missing values.

Can someone please help?
Jul 12, 2018 in Data Analytics by CodingByHeart77
• 3,740 points
795 views

1 answer to this question.

0 votes

First identify which variables have missing values, then measure how much is missing in each of them. If the missing values follow a pattern, the analyst should concentrate on it, as the pattern itself could lead to interesting and meaningful business insights.
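
As a rough illustration of measuring the extent of missing values, here is a minimal sketch in base R. The data frame `df` and its column names are made up for this example and do not come from the question.

```r
# Hypothetical example data; the columns are invented for illustration.
df <- data.frame(
  age    = c(25, NA, 31, 40, NA),
  income = c(50000, 62000, NA, 58000, 61000),
  city   = c("Pune", NA, "Delhi", "Pune", "Delhi"),
  stringsAsFactors = FALSE
)

# Number of missing values in each variable
na_count <- colSums(is.na(df))

# Share of missing values in each variable (between 0 and 1)
na_share <- colMeans(is.na(df))

# Number of rows that contain at least one missing value
incomplete_rows <- sum(!complete.cases(df))

print(na_count)
print(na_share)
print(incomplete_rows)
```

Looking at which rows are incomplete, for example with `df[!complete.cases(df), ]`, is one simple way to start checking whether the missing values follow a pattern.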

If no pattern is found, the missing values can be replaced with the mean or median of the variable (imputation), or they can simply be ignored.
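
Mean and median imputation could look something like the sketch below. The helper `impute_with` and the vector `x` are hypothetical names chosen for this example, not part of any package.

```r
# Hypothetical numeric vector with missing values and one large value
x <- c(1, 2, 3, NA, 4, 100, NA)

# Replace every NA in v with a summary of the observed values.
# `fun` is any function that accepts na.rm, such as mean or median.
impute_with <- function(v, fun = mean) {
  v[is.na(v)] <- fun(v, na.rm = TRUE)
  v
}

x_mean   <- impute_with(x, mean)    # NAs become 22 (mean of 1, 2, 3, 4, 100)
x_median <- impute_with(x, median)  # NAs become 3 (median of the same values)

print(x_mean)
print(x_median)
```

The large value pulls the mean up to 22 while the median stays at 3, which is why the median is often preferred when the data is skewed.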

Another option is to assign a default value, which can be the mean, minimum, or maximum of the variable. Which one is appropriate depends on the data, so it is important to examine the data first.

For a categorical variable, assign a default category to the missing values. For a numeric variable that is approximately normally distributed, use the mean.
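
For the categorical case, a minimal sketch might look like this. The vector `city` and the label "Unknown" are assumptions made for illustration. Using the most frequent category as the default is one possible choice that the answer does not prescribe.

```r
# Hypothetical categorical variable stored as a character vector
city <- c("Pune", NA, "Delhi", "Pune", NA)

# Option 1: assign an explicit default category
city_default <- city
city_default[is.na(city_default)] <- "Unknown"

# Option 2 (an assumption, not from the answer): use the most frequent
# category. table() ignores NA by default.
most_common <- names(which.max(table(city)))
city_mode <- city
city_mode[is.na(city_mode)] <- most_common

# For a factor, the new level has to exist before it can be assigned
city_factor <- factor(city)
levels(city_factor) <- c(levels(city_factor), "Unknown")
city_factor[is.na(city_factor)] <- "Unknown"

print(city_default)
print(city_mode)
print(table(city_factor))
```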

If around 80% of the values for a variable are missing, you can drop the variable instead of treating its missing values.
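
Dropping mostly-empty variables can be sketched as follows. The data frame and the name `threshold` are hypothetical, and the 0.8 cut-off simply mirrors the 80% figure above.

```r
# Hypothetical data: `notes` is missing in 5 of 6 rows
df <- data.frame(
  id    = 1:6,
  score = c(10, NA, 12, 11, 9, 14),
  notes = c(NA, NA, NA, NA, NA, "ok"),
  stringsAsFactors = FALSE
)

threshold <- 0.8  # drop variables with at least this share of NAs

# Share of missing values per variable
na_share <- colMeans(is.na(df))

# Keep only the variables below the threshold;
# drop = FALSE keeps the result a data frame even if one column remains
df_reduced <- df[, na_share < threshold, drop = FALSE]

print(names(df_reduced))
```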

In practice, you can do this in R with methods such as mean imputation, median imputation, replacing with dummy values, filling in values based on correlations and similarities with other records, removing the record entirely, or leaving the record as it is.
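
The last two options, removing records and leaving them as they are, could be sketched like this in base R. The data frame is again made up for illustration.

```r
# Hypothetical data with missing values in both columns
df <- data.frame(
  age    = c(25, NA, 31, 40),
  income = c(50000, 62000, NA, 58000)
)

# Remove every record that has at least one missing value
df_complete <- na.omit(df)

# complete.cases() selects the same rows and is easy to combine
# with other conditions
df_complete2 <- df[complete.cases(df), ]

# Leave the records as they are and skip the NAs at computation time
mean_age <- mean(df$age, na.rm = TRUE)

print(df_complete)
print(mean_age)
```

Removing records is simple but can discard a lot of data when many rows have a single missing field, so it is worth checking how many rows remain afterwards.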

answered Jul 12, 2018 by Sahiti
• 6,370 points
