Dimensionality Reduction and Principal Component Analysis (PCA)

We will explain the concept of dimensionality reduction in a very simple way. In line with the principle of my articles, I will try to be as clear as possible.
In this lesson, we will focus on explaining the concept, and in the next lesson we will look at the underlying derivation of the technique.

  • Problem with High Dimensional Data
  • What is Dimensionality Reduction
  • Two Types of Dimensionality Reduction
  • What is Principal Component Analysis
  • Methods of Dimensionality Reduction
  • How PCA Works
    • Demonstration of PCA
    • Obtain Covariance Matrix
    • Obtain Eigen Pairs
    • Obtain Scores and Loadings
  • Extract the Principal Components
  • Summary

The Simple Explanation
You already know that if you are given data in two dimensions, say x and y, you could plot the graph and see the relationship. What if you are given data in three dimensions? You could still try to create the plot, but if the data set is large enough, reading the plot becomes difficult.
Now what if the data has 10, 20 or even 100 or more dimensions? How could you plot it? Even if you could, you would find that the plot may not make much sense. This is where dimensionality reduction (also called dimension reduction) comes in.

Formal Definition of Dimensionality Reduction
“…is the process of reducing the number of random variables under consideration by obtaining a set of principal variables” – Wikipedia
“… is the process of reducing the number of variables or features in review” – Big Data University

Problem of High-Dimensional Data

  • Training a model on high-dimensional data costs more time and memory
  • The model is more likely to overfit
  • Not all the features of the data are relevant to the problem being solved
  • High-dimensional data tends to carry more noise (unnecessary parts of the data) than a lower-dimensional representation

Types of Dimensionality Reduction
The two types of dimensionality reduction are:
1. Feature Extraction: This technique builds new features by transforming the data from a high-dimensional space to a low-dimensional space.

2. Feature Selection: This has to do with finding the features most relevant to a problem. This is done by keeping a subset of the original variables.
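
To make the difference concrete, here is a minimal sketch in plain Python. The data set, the column meanings and the weights are made up for illustration, and the helper names select_features and extract_feature are hypothetical. Selection keeps some of the original columns as they are, while extraction builds a new column out of all of them.

```python
# Toy data set: each row is (height_cm, weight_kg, shoe_size). Values are made up.
rows = [
    (160.0, 55.0, 37.0),
    (170.0, 65.0, 40.0),
    (180.0, 80.0, 43.0),
    (175.0, 72.0, 42.0),
]

def select_features(data, keep):
    """Feature selection: keep a subset of the original columns unchanged."""
    return [tuple(row[i] for i in keep) for row in data]

def extract_feature(data, weights):
    """Feature extraction: build one new feature as a weighted sum of all columns."""
    return [sum(w * x for w, x in zip(weights, row)) for row in data]

# Selection: the result still contains the original height and weight values.
selected = select_features(rows, keep=[0, 1])

# Extraction: the result is a new variable that mixes all three columns.
# The weights here are arbitrary; a method such as PCA would compute them from the data.
extracted = extract_feature(rows, weights=[0.5, 0.4, 0.1])

print(selected[0])
print(extracted[0])
```

In both cases the data ends up with fewer columns than it started with. The difference is whether those columns are original variables or new ones.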

Methods of Dimensionality Reduction

Principal Component Analysis (PCA): This is a classical method that provides a sequence of best linear approximations to a given high-dimensional observation. It is one of the most popular dimensionality reduction techniques. However, its effectiveness is limited by its global linearity.
Multidimensional Scaling (MDS): This technique is closely related to PCA and has the same limitations as PCA.
Factor Analysis: This technique assumes that the underlying manifold is a linear subspace.
Independent Component Analysis(ICA): This technique starts from a factor analysis solution and searches for rotations that lead to independent components.
Principal Component Analysis (PCA)
PCA is a variance-maximising technique: it projects the original data onto the directions along which the data varies the most. In other words, PCA performs a linear mapping of the original data to a lower-dimensional space such that the variance of the data in the low-dimensional representation is maximized.
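
The idea of "a direction that maximizes variance" can be checked by brute force on a small example. The sketch below is only an illustration, not how PCA is computed in practice. The points are made up, and projected_variance is a hypothetical helper. It projects the points onto directions from 0 to 179 degrees and reports the direction with the largest variance.

```python
import math

# Made-up 2-D points that roughly follow the line y = x.
points = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 3.8), (5.0, 5.2)]

def projected_variance(data, angle):
    """Sample variance of the data projected onto the unit vector at `angle` (radians)."""
    ux, uy = math.cos(angle), math.sin(angle)
    proj = [x * ux + y * uy for x, y in data]
    mean = sum(proj) / len(proj)
    return sum((p - mean) ** 2 for p in proj) / (len(proj) - 1)

# Try every whole-degree direction and keep the one with the largest variance.
best_deg = max(range(180), key=lambda d: projected_variance(points, math.radians(d)))

print("best direction (degrees):", best_deg)
print("variance along it:", projected_variance(points, math.radians(best_deg)))
print("variance along the x axis:", projected_variance(points, 0.0))
```

Because the points lie close to the line y = x, the best direction comes out near 45 degrees. PCA finds this direction directly, without trying every angle.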
How PCA Works
In math terms, PCA is performed by carrying out an eigen-decomposition of the covariance matrix.
The result is a set of eigenvectors and a set of eigenvalues, which can then be used to describe the original data.
A Little More Detail
An eigenvector in linear algebra is a vector that does not change its direction under the associated linear transformation. If v is a non-zero vector, then it is an eigenvector of a square matrix A if Av is a scalar multiple of v, that is, Av = λv.

The eigenvalue λ is the scalar associated with the eigenvector v.
Eigenvalues are the coefficients attached to the eigenvectors, and that is what gives the axes their magnitude. In PCA, the eigenvalue is the variance of the data along its eigenvector.
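
As a quick check of the definition, here is a small sketch in plain Python. The matrix A and the two vectors are chosen just for illustration, and matvec is a hypothetical helper. Multiplying A by an eigenvector only stretches it, while multiplying A by another vector changes its direction.

```python
# A small symmetric matrix chosen for illustration.
A = [[2.0, 1.0],
     [1.0, 2.0]]

def matvec(m, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

v = [1.0, 1.0]  # an eigenvector of A
w = [1.0, 0.0]  # not an eigenvector of A

print(matvec(A, v))  # [3.0, 3.0], which is 3 * v, so the eigenvalue is 3
print(matvec(A, w))  # [2.0, 1.0], not a multiple of w, so the direction changed
```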

Dimensionality reduction with PCA reduces data in a high dimension to a lower dimension by obtaining the principal components.

PCA is performed by:

  • constructing the covariance matrix of the data
  • performing an eigen-decomposition of that matrix to obtain a set of eigenvectors (W)
  • ordering the columns of W by the size of their corresponding eigenvalues, largest first
  • choosing the first n columns of W and using them to describe your data
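
The steps above can be sketched in plain Python for the simplest case of two variables. This is a minimal illustration under stated assumptions, not a general implementation. The data is made up, the eigen-decomposition uses the closed-form solution that exists only for a symmetric 2x2 matrix, and unit_eigenvector is a hypothetical helper that assumes the covariance between the two variables is not zero. For real data you would use a linear algebra library for the eigen-decomposition step.

```python
import math

# Made-up 2-D data: each row is one observation (x, y).
data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
        (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]
num_rows = len(data)

# Preparation: centre the data so that each variable has mean zero.
mean_x = sum(x for x, _ in data) / num_rows
mean_y = sum(y for _, y in data) / num_rows
centred = [(x - mean_x, y - mean_y) for x, y in data]

# Step 1: covariance matrix [[sxx, sxy], [sxy, syy]].
sxx = sum(x * x for x, _ in centred) / (num_rows - 1)
syy = sum(y * y for _, y in centred) / (num_rows - 1)
sxy = sum(x * y for x, y in centred) / (num_rows - 1)

# Step 2: eigen-decomposition (closed form for a symmetric 2x2 matrix).
half_trace = (sxx + syy) / 2
radius = math.hypot((sxx - syy) / 2, sxy)

# Step 3: order by eigenvalue, largest first (true by construction here).
eigenvalues = [half_trace + radius, half_trace - radius]

def unit_eigenvector(lam):
    """Unit eigenvector for eigenvalue lam. Assumes sxy != 0."""
    vx, vy = sxy, lam - sxx
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)

# W holds the eigenvectors in the same order as their eigenvalues.
W = [unit_eigenvector(lam) for lam in eigenvalues]

# Step 4: keep only the first eigenvector and project the data onto it.
pc1 = W[0]
scores = [x * pc1[0] + y * pc1[1] for x, y in centred]

print("eigenvalues:", eigenvalues)
print("first principal direction:", pc1)
print("share of variance kept:", eigenvalues[0] / sum(eigenvalues))
```

The list scores is the one-dimensional description of the original two-dimensional data. Its variance equals the largest eigenvalue, which is why the eigenvalues tell you how much information each direction keeps.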

In the next lesson (which will be a web video), we will go through some of the derivations behind Principal Component Analysis.
We will also perform PCA on real data using MATLAB and R.
So you can follow this course to get updates (just click on the follow button under the name of the author) and also subscribe to the video channel here.