April 13, 2021

What is Principal Component Analysis (PCA) – A Simple Tutorial

In this simple tutorial, I will explain the concept of Principal Component Analysis (PCA) in machine learning. I will try to be as simple and clear as possible.
Then, in Tutorial 2, we will use Python to get hands-on and actually perform principal component analysis.

What is Principal Components Analysis?
Principal Components Analysis is an unsupervised learning technique, a class of statistical methods used to explain high-dimensional data using a smaller number of variables called the principal components.
In PCA, we compute the principal components and use them to explain the data.

How Does PCA Work?
Assume we have a dataset X made up of n measurements, each represented by a set of p features X1, X2, …, Xp. If we want to plot this data in a 2-dimensional plane, we can plot the n measurements using two features at a time. But if there are more than three or four features, plotting the data this way becomes a challenge: the number of pairwise plots is p(p-1)/2, which quickly becomes too many to examine (with p = 10 features there are already 45 plots).
We would like to visualize this data in two dimensions without losing the information contained in the data. This is what PCA allows us to do.
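As a preview of the hands-on part, here is a minimal sketch of that idea using scikit-learn's PCA class. The dataset, its size, and the random seed here are made-up illustrations, not part of this tutorial's example.

```python
# A minimal sketch of using PCA to visualize high-dimensional data in 2D.
# Assumes scikit-learn is installed; the data below are synthetic.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))      # n = 100 measurements, p = 10 features

pca = PCA(n_components=2)
Z = pca.fit_transform(X)            # n x 2 matrix of principal component scores

print(Z.shape)                      # (100, 2): one 2-D point per measurement
```

Each row of Z can then be plotted as a single point, giving one scatterplot instead of the 45 pairwise plots mentioned above.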

 

How to Compute Principal Components?
Given a dataset X of dimension n x p, how do we compute the first principal component?
To do this, we look for the linear combination of the feature values of the form:

Z1 = φ11 X1 + φ21 X2 + … + φp1 Xp

that has the largest sample variance, subject to the constraint that:

φ11² + φ21² + … + φp1² = 1
This means that the first principal component loading vector φ1 = (φ11, φ21, …, φp1) solves an optimization problem: we maximize an objective function subject to a constraint.
The objective function is given by:

(1/n) · Σi (φ11 xi1 + φ21 xi2 + … + φp1 xip)²,   summing over i = 1, …, n

and this is subject to the constraint:

φ11² + φ21² + … + φp1² = 1

The objective function (the function to maximize) can be rewritten as:

(1/n) · Σi zi1²,   where zi1 = φ11 xi1 + φ21 xi2 + … + φp1 xip

Since we assume each feature has been centered to have mean zero, this also holds:

(1/n) · Σi xij = 0   for every feature j

Therefore the average of z11, …, zn1 will also be zero, and the objective function being maximized is simply the sample variance of the n values zi1.
z11, z21, …, zn1 are referred to as the scores of the first principal component.
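The short check below makes this concrete on made-up data, with an arbitrary unit-length loading vector chosen only for illustration: once the features are centered, the candidate scores average to zero, so the quantity being maximized is exactly their sample variance.

```python
# A quick numeric check on synthetic data: when each feature is centered to
# mean zero, any linear combination of the features also has mean zero, so
# maximizing (1/n) * sum(z_i1 ** 2) is maximizing the sample variance of the
# scores. The loading vector here is an arbitrary unit-length vector.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(loc=5.0, size=(100, 4))        # raw data with non-zero feature means
Xc = X - X.mean(axis=0)                       # center each feature at zero

phi = np.array([0.5, 0.5, 0.5, 0.5])          # unit-length candidate loading vector
z = Xc @ phi                                  # candidate scores z_11, ..., z_n1

print(np.isclose(z.mean(), 0.0))              # True: the scores average to zero
print(np.isclose((z ** 2).mean(), z.var()))   # True: objective = sample variance
```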


How then do we maximize the given objective function? 
We do this by performing an eigendecomposition of the covariance matrix of the data. Details of how to perform an eigendecomposition are explained here.
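As a rough sketch of that recipe (NumPy only, on synthetic data, not the worked example of Tutorial 2): the first loading vector is the eigenvector of the sample covariance matrix with the largest eigenvalue, and the variance of the resulting scores equals that eigenvalue.

```python
# A minimal sketch: compute the first principal component loading vector as
# the top eigenvector of the sample covariance matrix. Data are synthetic.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))                 # n = 100, p = 5, illustrative data

Xc = X - X.mean(axis=0)                       # center each feature at zero
cov = np.cov(Xc, rowvar=False)                # p x p sample covariance matrix

eigvals, eigvecs = np.linalg.eigh(cov)        # eigh handles symmetric matrices
phi1 = eigvecs[:, np.argmax(eigvals)]         # eigenvector with largest eigenvalue

z1 = Xc @ phi1                                # scores of the first principal component
print(z1.var(ddof=1), eigvals.max())          # the two numbers agree
```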

Explaining the Principal Components
The loading vector φ1, with elements φ11, φ21, …, φp1, defines a direction in the feature space along which there is maximum variance in the data.
Thus, if we project the n data points x1, x2, …, xn onto this direction, the projected values are exactly the principal component scores z11, z21, …, zn1.
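The small check below illustrates this projection view (scikit-learn and NumPy on synthetic data, not the tutorial's example): projecting the centered points onto the first loading vector reproduces the scores returned by PCA's transform method.

```python
# Projecting centered data onto the first loading vector gives the same
# scores that scikit-learn's PCA.transform returns. Data are synthetic.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 4))
Xc = X - X.mean(axis=0)                       # center the data first

pca = PCA(n_components=1).fit(Xc)
phi1 = pca.components_[0]                     # first loading vector (unit length)

scores_by_projection = Xc @ phi1              # project points onto the phi1 direction
scores_by_transform = pca.transform(Xc)[:, 0]

print(np.allclose(scores_by_projection, scores_by_transform))   # True
```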

After the first principal component Z1 of the features has been determined, the second principal component is the linear combination of X1, X2, …, Xp that has the highest variance out of all the linear combinations that are uncorrelated with Z1. The second principal component scores z12, z22, …, zn2 take the form

zi2 = φ12 xi1 + φ22 xi2 + … + φp2 xip

where φ2 is the second principal component loading vector, with elements φ12, φ22, …, φp2. It turns out that constraining Z2 to be uncorrelated with Z1 is the same as constraining the direction of φ2 to be orthogonal to the direction of φ1.
We will now take an example to see how PCA works.
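Before that example, here is a minimal NumPy sketch (illustrative random data, not the tutorial's worked example) that verifies the two facts just stated: the loading vectors φ1 and φ2 are orthogonal, and the score vectors Z1 and Z2 are uncorrelated.

```python
# Compute the first two principal components via eigendecomposition and
# verify that the loadings are orthogonal and the scores uncorrelated.
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 6))                 # synthetic data, n = 200, p = 6
Xc = X - X.mean(axis=0)                       # center the features

eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]             # largest eigenvalue first
phi1, phi2 = eigvecs[:, order[0]], eigvecs[:, order[1]]

z1, z2 = Xc @ phi1, Xc @ phi2                 # first and second PC scores

print(np.isclose(phi1 @ phi2, 0.0))                 # True: orthogonal loadings
print(np.isclose(np.corrcoef(z1, z2)[0, 1], 0.0))   # True: uncorrelated scores
```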

 

 
