What is the difference between LDA and PCA for dimensionality reduction?

0 votes
I have a dataset with n dimensions. What would be the ideal algorithm to approach it?
Aug 2, 2018 in Data Analytics by Anmol
• 1,610 points
1,220 views

2 answers to this question.

0 votes

PCA is a Dimensionality Reduction algorithm.

Basically, it's a machine-learning technique for extracting hidden factors from the dataset.


  • Describes your data using fewer components that explain most of the variance in the data
  • Reduces the number of dimensions in the data, which lowers computational complexity

Working of PCA:

Consider a scenario where you have data plotted on the x and y axes.

Applying PCA generates components that are orthogonal to each other and therefore uncorrelated. This also solves the problem of multicollinearity.
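To make the "uncorrelated components" point concrete, here is a minimal sketch using NumPy. The data is synthetic and the variable names are only for illustration; it assumes two strongly correlated features and checks the correlation before and after projecting onto the principal directions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D data: y depends strongly on x, so the features are correlated.
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(scale=0.5, size=500)
X = np.column_stack([x, y])

# Center the data, then take the SVD; the rows of Vt are the principal directions.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

# Project the centered data onto the principal directions.
scores = Xc @ Vt.T

corr_before = np.corrcoef(X, rowvar=False)[0, 1]
corr_after = np.corrcoef(scores, rowvar=False)[0, 1]
print(f"correlation between original features:    {corr_before:.3f}")
print(f"correlation between principal components: {corr_after:.3f}")
```

The second number should be zero up to floating-point error, which is why regressing on the components instead of the raw features avoids multicollinearity.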


PCA reduces dimensions, but when dealing with multi-class data it is also necessary to reduce dimensions in a way that preserves the separation between classes. LDA is an algorithm designed for this. Let's discuss it in detail:

  1. Reduces Dimensions
  2. Searches for a linear combination of variables that best separates 2 classes
  3. Reduces the degree of overfitting

Working of LDA: 

  1. Assume a set of D-dimensional samples {x^(1), x^(2), …, x^(N)}, N1 of which belong to class ω1 and N2 to class ω2.

  2. Obtain a scalar y by projecting each sample x onto a line: y = w^T x

  3. Of all the possible lines, select the one that maximizes the separability of the scalars.
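The steps above can be sketched in a few lines of NumPy. This is only an illustration on made-up two-class data; it uses the standard closed-form result for the two-class case, where the best direction w is proportional to S_w^{-1}(m1 - m2), with S_w the within-class scatter matrix and m1, m2 the class means. The names X1, X2 and S_w are my own, not from any library.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-class data in D = 2 dimensions with a shared covariance.
cov = np.array([[3.0, 2.0], [2.0, 3.0]])
X1 = rng.multivariate_normal([0.0, 0.0], cov, size=200)  # samples of class ω1
X2 = rng.multivariate_normal([3.0, 0.0], cov, size=150)  # samples of class ω2

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)

# Within-class scatter matrix: sum of the per-class scatter matrices.
S_w = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)

# Fisher's direction: w is proportional to S_w^{-1} (m1 - m2).
w = np.linalg.solve(S_w, m1 - m2)
w /= np.linalg.norm(w)

# Project every sample onto the line: y = w^T x.
y1, y2 = X1 @ w, X2 @ w

print("direction w:", w)
print("projected class means:", y1.mean(), y2.mean())
print("projected class std devs:", y1.std(), y2.std())
```

Note that the chosen line is generally not the one joining the two class means, because the within-class scatter also shapes the direction.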

answered Aug 3, 2018 by Anmol
• 3,620 points
0 votes

Principal Component Analysis (PCA) is an unsupervised learning algorithm: it ignores the class labels and finds the directions (the so-called principal components) that maximize the variance in a dataset. In other words, PCA is basically a summarization of the data. PCA does not select a set of features and discard the others; instead, it infers new features from the existing ones that best describe the variance in the data.

PCA works on the eigenvectors and eigenvalues of the covariance matrix, which is equivalent to fitting straight principal-component lines to the variance of the data. Why? Because the eigenvectors of the covariance matrix point along the directions in which the data varies, and each eigenvalue gives the variance along its eigenvector. In other words, PCA determines the lines of variance in the dataset, called principal components, with the first principal component having the maximum variance, the second having the second-largest variance, and so on.
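As a rough illustration of the eigenvector view, the sketch below builds a covariance matrix from synthetic data, sorts its eigenpairs by eigenvalue, and checks that the variance of the data along each eigenvector matches the corresponding eigenvalue. The data and variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: 300 samples, 3 features with different spreads,
# mixed together by a random matrix so the features are correlated.
Z = rng.normal(size=(300, 3)) * np.array([5.0, 2.0, 0.5])
X = Z @ rng.normal(size=(3, 3))

Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Variance of the data projected on each eigenvector.
projected_var = (Xc @ eigvecs).var(axis=0, ddof=1)

print("eigenvalues (descending):      ", eigvals)
print("variance along each component: ", projected_var)
print("explained variance ratio:      ", eigvals / eigvals.sum())
```

Keeping only the first k columns of `eigvecs` gives the k-dimensional reduced representation `Xc @ eigvecs[:, :k]`.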

Linear Discriminant Analysis is a supervised algorithm as it takes the class label into consideration. It is a way to reduce ‘dimensionality’ while at the same time preserving as much of the class discrimination information as possible.

LDA helps you find the boundaries between classes. It projects your data points onto a line so that the classes are as separated as possible, with the points of each class lying close to that class's centroid.

So the question arises: how are these clusters defined, and how do we get the reduced feature set in the case of LDA?

Basically, LDA finds the centroid of the data points in each class. For example, with thirteen different features, LDA finds the centroid of each class in the thirteen-dimensional feature space. On the basis of these centroids, it determines a new dimension, which is simply an axis that should satisfy two criteria:
1. Maximize the distance between the centroids of the classes.
2. Minimize the variation within each class (which LDA calls scatter, represented by s²).
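One way to see how the two criteria combine is to score candidate axes by the ratio of squared centroid distance to total within-class scatter, then search for the best one. The sketch below does this by brute force over angles on made-up two-feature data. The function name `fisher_score` and the data are hypothetical, and a real implementation would solve for the axis directly rather than scan angles.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical two-class data with two features (more features work the same way).
cov = np.array([[3.0, 2.0], [2.0, 3.0]])
A = rng.multivariate_normal([0.0, 0.0], cov, size=300)
B = rng.multivariate_normal([3.0, 0.0], cov, size=300)

def fisher_score(w, A, B):
    """Squared distance between projected centroids over summed projected scatter."""
    a, b = A @ w, B @ w
    between = (a.mean() - b.mean()) ** 2  # criterion 1: want this large
    within = ((a - a.mean()) ** 2).sum() + ((b - b.mean()) ** 2).sum()  # criterion 2: want this small
    return between / within

# Try candidate axes at angles from 0 to 180 degrees and keep the best one.
angles = np.linspace(0.0, np.pi, 181)
scores = [fisher_score(np.array([np.cos(t), np.sin(t)]), A, B) for t in angles]
best = angles[int(np.argmax(scores))]

print(f"best axis angle: {np.degrees(best):.0f} degrees")
print(f"score along best axis: {max(scores):.4f}")
print(f"score along x-axis:    {scores[0]:.4f}")
```

Even though the centroids here differ only along the x-axis, the best axis is tilted away from it, because tilting reduces the within-class scatter more than it reduces the centroid distance.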

PCA tends to perform better when the number of samples per class is small, whereas LDA tends to work better with large datasets that have multiple classes, where class separability is an important factor in reducing dimensionality.

answered Mar 6 by Seema
• 140 points
Thanks @Seema, that was very well explained.
