In this simple tutorial, I will explain the concept of Principal Components Analysis (PCA) in machine learning. I will try to be as simple and clear as possible.

Then, in Tutorial 2, we will use Python for some hands-on work, actually performing principal components analysis.

**What is Principal Components Analysis?**

Principal Components Analysis is an unsupervised statistical technique used to explain high-dimensional data using a smaller number of variables, called the principal components.

In PCA, we compute the principal components and use them to explain the data.

**How Does PCA Work?**

Assume we have a set X made up of n measurements, each represented by a set of p features, X_{1}, X_{2}, … , X_{p}. If we want to plot this data in a 2-dimensional plane, we can plot the n measurements using two features at a time. But if the number of features is more than three or four, plotting the data in two dimensions becomes a challenge: the number of pairwise plots would be p(p-1)/2, which quickly becomes too many to examine.

We would like to visualize this data in two dimensions without losing the information contained in it. This is what PCA allows us to do.
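As a quick sketch of this idea (assuming scikit-learn and NumPy are installed, and using a small random dataset purely for illustration), PCA can project p-dimensional data onto two components for plotting:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))  # n = 100 measurements, p = 6 features

# With p = 6 features there are p*(p-1)/2 = 15 pairwise scatter plots,
# but PCA summarizes the data in a single 2-D plot.
pca = PCA(n_components=2)
Z = pca.fit_transform(X)  # scores of the first two principal components
print(Z.shape)  # (100, 2)
```

Each row of Z gives the two coordinates of one measurement in the new 2-dimensional space, ready to be scatter-plotted.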

**How to Compute Principal Components?**

Given a dataset X of dimension n x p, how do we compute the first principal component?

To do this we look for the linear combination of the feature values of the form:

z_{i1} = Ф_{11}x_{i1} + Ф_{21}x_{i2} + … + Ф_{p1}x_{ip}

that has the largest sample variance, subject to the constraint that:

Ф_{11}² + Ф_{21}² + … + Ф_{p1}² = 1

This means that the first principal component loading vector Ф_{1} = (Ф_{11}, Ф_{21}, … , Ф_{p1}) solves an optimization problem: we maximize an objective function subject to a constraint.

The objective function is given by:

maximize (1/n) Σ_{i=1…n} ( Σ_{j=1…p} Ф_{j1}x_{ij} )²

And this is subject to the constraint:

Σ_{j=1…p} Ф_{j1}² = 1

The objective function (the function to maximize) can be rewritten as:

(1/n) Σ_{i=1…n} z_{i1}²

Since we assume each feature has been centered to have mean zero, this also holds:

(1/n) Σ_{i=1…n} x_{ij} = 0 for every feature j

Therefore the average of z_{11}, …, z_{n1} will also be zero, and the objective function being maximized is simply the sample variance of the n values z_{i1}.

z_{11}, z_{21}, …, z_{n1} are referred to as the scores of the first principal component.
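The zero-mean property above is easy to verify numerically. In this sketch (assuming NumPy, with a random toy dataset and an arbitrary unit-norm vector Ф standing in for a candidate loading vector), centering the features makes the scores average to zero, so the objective really is their sample variance:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
X = X - X.mean(axis=0)  # center each feature so its average is zero

phi = np.array([1.0, 2.0, -1.0])
phi = phi / np.linalg.norm(phi)  # enforce the constraint sum(phi**2) = 1

z = X @ phi  # candidate scores z_i = sum_j phi_j * x_ij
print(z.mean())         # close to zero, because the features are centered
print(np.mean(z**2))    # the objective: the sample variance of the scores
```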

**How then do we maximize the given objective function?**

We do this by performing an eigen decomposition of the covariance matrix. The details of how to perform eigen decomposition are explained here.
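A minimal sketch of this step (assuming NumPy and a random toy dataset): the eigenvector of the sample covariance matrix with the largest eigenvalue is the first principal component loading vector, and the variance of its scores equals that eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4))
X = X - X.mean(axis=0)  # PCA assumes centered features

cov = (X.T @ X) / X.shape[0]            # sample covariance matrix (p x p)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: for symmetric matrices

# eigh returns eigenvalues in ascending order, so the last column
# is the first principal component loading vector phi_1
phi1 = eigvecs[:, -1]
z1 = X @ phi1  # scores of the first principal component

print(np.sum(phi1**2))  # the constraint: unit norm
print(np.var(z1), eigvals[-1])  # score variance equals the top eigenvalue
```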

**Explaining the Principal Components**

The loading vector Ф_{1} with elements Ф_{11}, Ф_{21},…,Ф_{p1} defines a direction in the feature space along which there is maximum variance in the data.

Thus, if we project the n data points x_{1}, x_{2},…, x_{n} onto this direction, the projected values are exactly the principal component scores z_{11}, z_{21}, …, z_{n1}.

After the first principal component Z_{1} of the features has been determined, the second principal component is the linear combination of X_{1}, X_{2}, … , X_{p} that has the highest variance out of all the linear combinations that are uncorrelated with Z_{1}. The second principal component scores z_{12}, z_{22},…,z_{n2} take the form:

z_{i2} = Ф_{12}x_{i1} + Ф_{22}x_{i2} + … + Ф_{p2}x_{ip}

where Ф_{2} is the second principal component loading vector, with elements Ф_{12}, Ф_{22}, … , Ф_{p2}. It turns out that constraining Z_{2} to be uncorrelated with Z_{1} is the same as constraining the direction of Ф_{2} to be orthogonal to the direction of Ф_{1}.
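This equivalence can be checked numerically. In this sketch (assuming NumPy and a random toy dataset), the top two eigenvectors of the covariance matrix give orthogonal loading directions and uncorrelated score vectors:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
X = X - X.mean(axis=0)  # center the features

cov = (X.T @ X) / X.shape[0]
eigvals, eigvecs = np.linalg.eigh(cov)
phi1, phi2 = eigvecs[:, -1], eigvecs[:, -2]  # first and second loading vectors

z1, z2 = X @ phi1, X @ phi2  # first and second principal component scores

print(phi1 @ phi2)       # close to zero: directions are orthogonal
print(np.mean(z1 * z2))  # close to zero: scores are uncorrelated
```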

We will now work through an example to see how PCA works in practice.
