I have a dataset with n dimensions. What would be the ideal algorithm to approach it?

Aug 2, 2018, 6,902 views

## 2 answers to this question.

Principal Component Analysis (PCA) is a dimensionality reduction algorithm.

Basically, it's a machine learning technique that extracts hidden factors from the dataset. It:

• Describes your data with fewer components that still explain most of the variance in your data
• Reduces the number of dimensions in the data, so that your computational complexity is reduced
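As a rough illustration of both points, here is a minimal NumPy sketch. The synthetic dataset (5 features driven by 2 hidden factors) and all variable names are assumptions made for this example, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative assumption): 200 samples, 5 features,
# generated from 2 hidden factors plus a little noise.
factors = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = factors @ mixing + 0.05 * rng.normal(size=(200, 5))

# Center the data, then take the SVD; the rows of Vt are the principal directions.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of the total variance explained by each component.
explained = S**2 / np.sum(S**2)

# Keep only the first k components: 5 dimensions become 2.
k = 2
X_reduced = Xc @ Vt[:k].T

print("explained variance ratio:", np.round(explained, 3))
print("reduced shape:", X_reduced.shape)
```

In this setup the first two components should carry almost all of the variance, which is why dropping the remaining three loses very little information.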

Working of PCA:

Consider a scenario where you have data on the x and y axes. Applying PCA generates components that are orthogonal to each other and therefore uncorrelated, which also solves the problem of multicollinearity.

Although PCA reduces dimensions, when dealing with multi-class data it is necessary to reduce dimensions in a way that also preserves the separation between classes. Linear Discriminant Analysis (LDA) is an algorithm used for this purpose. Let's discuss it in detail:

1. Reduces dimensions
2. Searches for a linear combination of variables that best separates two classes
3. Reduces the degree of overfitting

Working of LDA:

1. Assume a set of $D$-dimensional samples $\{x^{(1)}, x^{(2)}, \dots, x^{(N)}\}$, $N_1$ of which belong to class $\omega_1$ and $N_2$ to class $\omega_2$.

2. Obtain a scalar $y$ by projecting the samples $x$ onto a line: $y = w^T x$

3. Of all the possible lines, select the one that maximizes the separability of the scalars.

answered Aug 3, 2018

Principal Component Analysis (PCA) is an unsupervised learning algorithm: it ignores the class labels and finds the directions (the so-called principal components) that maximize the variance in a dataset. In other words, PCA is basically a summarization of the data. PCA does not select a subset of features and discard the others; instead, it infers new features from the existing ones that best describe the variation in the data.

PCA works on the eigenvectors and eigenvalues of the covariance matrix, which is equivalent to fitting straight principal-component lines to the variance of the data. Why? Because the eigenvectors of the covariance matrix point along the directions in which the data varies, and each eigenvalue gives the variance along its eigenvector. In other words, PCA determines the lines of variance in the dataset, which are called principal components: the first principal component has the maximum variance, the second has the second-largest variance, and so on.
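The eigenvector view can be sketched in a few lines of NumPy. The correlated 2-D data below is an assumption chosen for illustration; the point is only to show the covariance matrix being decomposed and the components being sorted by variance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic 2-D data with correlated x and y (illustrative assumption).
x = rng.normal(size=500)
y = 0.8 * x + 0.3 * rng.normal(size=500)
X = np.column_stack([x, y])

# Center the data and compute its covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# eigh handles symmetric matrices and returns eigenvalues in ascending order,
# so reorder to put the largest-variance direction first.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project the data onto the principal components.
Z = Xc @ eigvecs
cov_Z = np.cov(Z, rowvar=False)

print("variance along each component:", eigvals)
print("covariance between projected components:", cov_Z[0, 1])
```

After projection, the covariance between the two components should be zero up to floating-point error, and the variances on the diagonal of `cov_Z` should match the eigenvalues.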

Linear Discriminant Analysis is a supervised algorithm as it takes the class label into consideration. It is a way to reduce ‘dimensionality’ while at the same time preserving as much of the class discrimination information as possible.

LDA helps you find the boundaries around clusters of classes. It projects your data points onto a line so that the clusters are as separated as possible, with the points of each cluster lying close to its centroid.

So the question arises: how are these clusters defined, and how do we get the reduced feature set in the case of LDA?

Basically, LDA finds the centroid of each class's data points. For example, with thirteen different features, LDA finds the centroid of each class in that thirteen-dimensional feature space. On this basis, it determines a new dimension, which is simply an axis that should satisfy two criteria:
1. Maximize the distance between the centroids of the classes.
2. Minimize the variation (which LDA calls scatter, represented by $s^2$) within each class.
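For two classes, these two criteria are commonly combined into Fisher's ratio $J(w) = (\tilde{\mu}_1 - \tilde{\mu}_2)^2 / (\tilde{s}_1^2 + \tilde{s}_2^2)$, where the tildes denote quantities measured after projecting onto the axis $w$. The sketch below is one way to compute that axis with NumPy; the synthetic classes and the helper name `fisher_score` are assumptions made for this example. It also scores the first PCA direction for comparison.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two synthetic classes (illustrative assumption): both are stretched along
# the first axis, but their centroids differ only along the second axis.
cov = np.array([[4.0, 0.0], [0.0, 0.25]])
X1 = rng.multivariate_normal([0.0, 0.0], cov, size=300)
X2 = rng.multivariate_normal([0.0, 2.0], cov, size=300)

def fisher_score(w, A, B):
    """Squared distance between projected centroids over summed projected scatter."""
    a, b = A @ w, B @ w
    between = (a.mean() - b.mean()) ** 2
    within = np.sum((a - a.mean()) ** 2) + np.sum((b - b.mean()) ** 2)
    return between / within

# Class centroids and the within-class scatter matrix.
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
S_w = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)

# Two-class Fisher direction: w is proportional to S_w^{-1} (mu1 - mu2).
w_lda = np.linalg.solve(S_w, mu1 - mu2)
w_lda /= np.linalg.norm(w_lda)

# First PCA direction of the pooled data, which ignores the labels.
X = np.vstack([X1, X2])
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
w_pca = Vt[0]

print("Fisher score along LDA axis:", fisher_score(w_lda, X1, X2))
print("Fisher score along PCA axis:", fisher_score(w_pca, X1, X2))
```

In this setup the PCA axis follows the direction of largest overall spread, while the LDA axis follows the direction that separates the centroids relative to the within-class scatter, so the LDA axis should score higher.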

PCA tends to perform better when the number of samples per class is small, whereas LDA works better with large datasets that have multiple classes, where class separability is an important factor while reducing dimensionality.

answered Mar 6, 2019
Thanks @Seema, that was very well explained.
