What is Principal Component Analysis and how do I create its model in R?
Jul 17, 2018 163 views

Principal Component Analysis (PCA) is a method for dimensionality reduction. An observation is often described by many dimensions (features), which makes the data hard to analyze and visualize. Reducing the number of dimensions addresses this.

The concept of Principal Component Analysis is this:

• The data is transformed to a new space with the same number of dimensions or fewer. These new dimensions (features) are known as principal components.
• The first principal component captures the largest possible share of the variance in the original data.
• The second principal component is orthogonal to the first and captures the largest share of the remaining variance.
• The same holds for each subsequent principal component: all components are mutually uncorrelated, and each captures less variance than the one before it.
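The properties listed above can be checked by hand. The following is a minimal sketch, not a reference implementation: it assumes the built-in iris data set as an example and computes the components from an eigen-decomposition of the covariance matrix of the standardized variables. The variable names are only illustrative.

```r
# Manual PCA on the built-in iris measurements (the data set is an assumption).
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = TRUE)

# Eigen-decomposition of the covariance matrix of the standardized data.
eig <- eigen(cov(X))
rotation <- eig$vectors      # columns are the principal directions
scores <- X %*% rotation     # the data expressed in the new space

# Variance captured by each component, largest first.
variances <- eig$values
print(round(variances / sum(variances), 3))

# The directions are orthogonal: this should be the identity matrix.
print(round(crossprod(rotation), 10))

# The component scores are uncorrelated: off-diagonal entries should be ~0.
print(round(cov(scores), 10))
```

The diagonal of the last matrix holds the variance of each component, which is why the components can be ranked by importance.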

You can perform PCA in R with the prcomp() function.
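As a rough illustration of a prcomp() call, here is a short sketch. The iris data set and the choice to center and scale the variables are assumptions made for the example; other data may call for different settings.

```r
# PCA on the numeric columns of iris (the data set is an assumption).
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

print(summary(pca))     # standard deviation and proportion of variance per component
print(head(pca$x))      # scores: the observations in principal-component space
print(pca$rotation)     # loadings: weight of each original variable per component

# Proportion of variance explained, computed from the standard deviations.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Keep only the first two components as a reduced data set.
reduced <- pca$x[, 1:2]
```

Scaling matters when the variables are measured in different units. Without it, the variables with the largest raw variance tend to dominate the first components.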

• 6,140 points

Principal component analysis (PCA) is routinely employed on a wide range of problems, from the detection of outliers to predictive modeling. PCA projects observations described by $p$ variables onto a few orthogonal components, defined along the directions in which the data ‘stretch’ the most, which gives a simplified overview. PCA is particularly powerful in dealing with multicollinearity and with variables that outnumber the samples ($p \gg n$).
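To make the $p \gg n$ case concrete, here is a small sketch on simulated data. The sizes, the seed, and the use of random normal values are all assumptions for illustration only.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
n <- 10                          # number of samples
p <- 200                         # number of variables, far more than samples
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# No scaling here, since the simulated columns already share one scale.
pca <- prcomp(X, center = TRUE)

print(dim(pca$x))                # n rows and at most n components
# After centering, n samples span at most n - 1 directions,
# so the last component carries essentially no variance.
print(signif(pca$sdev, 3))
```

Even with 200 variables, the observations are fully described by a handful of components, because the number of samples limits how many directions of variation can exist.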

PCA is an unsupervised method, meaning it always seeks the greatest sources of variation regardless of how the data are structured. Its supervised counterpart, partial least squares (PLS), performs a similar covariance decomposition but builds a user-defined number of components (frequently called latent variables) that minimize the sum of squared errors (SSE) when a specified outcome is predicted by ordinary least squares (OLS).

Although there is a plethora of PCA methods available for R, I will only introduce two:

• prcomp, a function from the stats package, which ships with every R installation
• pcaMethods, a Bioconductor package that I frequently use for my own PCAs
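The two options can be compared side by side. The sketch below assumes the iris data set and only runs the pcaMethods part if that package is installed. The argument values (method = "svd", scale = "uv") are one plausible configuration chosen to mirror a centered and scaled prcomp() call, not the only valid one.

```r
X <- as.matrix(iris[, 1:4])      # example data (an assumption)

# Base-R route: prcomp from the stats package.
pca_base <- prcomp(X, center = TRUE, scale. = TRUE)
scores_base <- pca_base$x[, 1:2]

# pcaMethods route, run only if the package is installed
# (it can be installed with BiocManager::install("pcaMethods")).
if (requireNamespace("pcaMethods", quietly = TRUE)) {
  pca_bioc <- pcaMethods::pca(X, method = "svd", nPcs = 2,
                              scale = "uv", center = TRUE)
  scores_bioc <- pcaMethods::scores(pca_bioc)
  # The two sets of scores are expected to agree up to the sign of each component.
  print(head(cbind(scores_base, scores_bioc)))
}
```

The sign of a principal component is arbitrary, so a flipped sign between the two results does not indicate a disagreement.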
• 3,690 points
