How do I remove redundant data from a dataset?

0 votes
I have a dataset with 100 columns and 18,854 rows. How do I eliminate redundant data?
Nov 13, 2018 in Data Analytics by Ali
• 11,360 points
1,231 views

1 answer to this question.

0 votes

You can use dimensionality reduction methods such as PCA.

Dimensionality reduction is the process of reducing the number of random variables (here, columns) under consideration by obtaining a smaller set of principal variables that still captures most of the information in the data.

PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables (entities each of which takes on various numerical values) into a set of values of linearly uncorrelated variables called principal components.
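As a rough illustration of the idea, here is a minimal sketch in Python with NumPy. It is not taken from the answer above: the helper name `pca_reduce`, the 95% variance threshold, and the synthetic data are all assumptions made for the example. The synthetic table has the same shape as the one in the question (18,854 rows, 100 columns), but its columns are deliberately built as noisy mixes of 5 hidden factors so that most of them are redundant.

```python
import numpy as np

def pca_reduce(X, variance_to_keep=0.95):
    """Project X (rows = observations, columns = variables) onto the fewest
    principal components that together explain `variance_to_keep` of the
    total variance. Hypothetical helper, written for illustration only."""
    X = np.asarray(X, dtype=float)
    # Centre each column; PCA is defined on mean-centred data.
    X_centered = X - X.mean(axis=0)
    # SVD of the centred data: the rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Share of total variance carried by each component.
    explained = S**2 / np.sum(S**2)
    # Smallest k whose cumulative explained variance reaches the threshold.
    k = int(np.searchsorted(np.cumsum(explained), variance_to_keep) + 1)
    k = min(k, len(S))
    # Scores: the data expressed in the first k principal components.
    return X_centered @ Vt[:k].T, explained[:k]

# Synthetic stand-in for the dataset in the question (assumed, not real data):
# 100 correlated columns generated from 5 hidden factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(18854, 5))
mixing = rng.normal(size=(5, 100))
X = latent @ mixing + 0.01 * rng.normal(size=(18854, 100))

scores, explained = pca_reduce(X, variance_to_keep=0.95)
print("reduced shape:", scores.shape)
print("variance explained per component:", np.round(explained, 3))
```

The returned columns are linearly uncorrelated, which is what makes them a compact replacement for the original correlated columns. If the columns are measured on very different scales, it is common to standardise them (divide each by its standard deviation) before applying PCA; that step is left out here for brevity.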

answered Nov 13, 2018 by Maverick
• 10,840 points
