{"id":1886,"date":"2019-01-12T12:00:00","date_gmt":"2019-01-12T11:00:00","guid":{"rendered":"https:\/\/kindsonthegenius.com\/blog\/principal-components-analysispca-in-python-step-by-step\/"},"modified":"2026-07-05T03:22:08","modified_gmt":"2026-07-05T01:22:08","slug":"principal-components-analysispca-in-python-step-by-step","status":"publish","type":"post","link":"https:\/\/kindsonthegenius.com\/blog\/principal-components-analysispca-in-python-step-by-step\/","title":{"rendered":"Principal Components Analysis (PCA) in Python \u2013 Step by Step"},"content":{"rendered":"<p>In this simple tutorial, we are going to learn how to perform Principal Components Analysis in Python. This tutorial uses Jupyter Notebook, which I assume you have installed. You can also learn about the concept of PCA from the following two tutorials:<\/p>\n<ul>\n<li><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/pca-tutorial-1-introduction-to-pca-and-dimensionality-reduction\/\">Introduction to PCA and Dimensionality Reduction<\/a><\/li>\n<li><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/pca-tutorial-1-how-to-perform-principal-components-analysis-pca\/\">How to Perform Principal Components Analysis &#8211; PCA (Theory)<\/a><\/li>\n<\/ul>\n<p>The following are the eight steps to performing PCA in Python:<\/p>\n<ul>\n<li><a href=\"#t1\">Step 1: Import the Necessary Modules<\/a><\/li>\n<li><a href=\"#t2\">Step 2: Obtain Your Dataset<\/a><\/li>\n<li><a href=\"#t3\">Step 3: Preview Your Data<\/a><\/li>\n<li><a href=\"#t4\">Step 4: Standardize the Data<\/a><\/li>\n<li><a href=\"#t5\">Step 5: Perform PCA<\/a><\/li>\n<li><a href=\"#t6\">Step 6: Combine Target and Principal Components<\/a><\/li>\n<li><a href=\"#t7\">Step 7: Do a Scree Plot of the Principal Components<\/a><\/li>\n<li><a href=\"#t8\">Step 8: Visualize your New Data in 2D<\/a><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h5 id=\"t1\">Step 1: Import the Necessary Modules<\/h5>\n<p>The modules we 
would need are pandas, numpy, sklearn and matplotlib. To import them, write the following import statements in the first cell of your Jupyter Notebook.<\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; border: solid gray; border-width: .1em .1em .1em .8em; padding: .2em .6em;\">\n<pre style=\"margin: 0; line-height: 125%;\"><span style=\"color: #008800; font-weight: bold;\">import<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">pandas<\/span> <span style=\"color: #008800; font-weight: bold;\">as<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">pd<\/span>\n<span style=\"color: #008800; font-weight: bold;\">import<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">numpy<\/span> <span style=\"color: #008800; font-weight: bold;\">as<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">np<\/span>\n<span style=\"color: #008800; font-weight: bold;\">from<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">sklearn.decomposition<\/span> <span style=\"color: #008800; font-weight: bold;\">import<\/span> PCA\n<span style=\"color: #008800; font-weight: bold;\">from<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">sklearn<\/span> <span style=\"color: #008800; font-weight: bold;\">import<\/span> preprocessing\n<span style=\"color: #008800; font-weight: bold;\">import<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">matplotlib.pyplot<\/span> <span style=\"color: #008800; font-weight: bold;\">as<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">plt<\/span>\n<\/pre>\n<\/div>\n<p>Listing 1.0: Import necessary modules<\/p>\n<p>&nbsp;<\/p>\n<h5 id=\"t2\">Step 2: Obtain the Dataset<\/h5>\n<p>The dataset is obtained from the UCI Machine Learning Repository. 
To do that, you can download a copy of the dataset from the URL in Listing 1.1 to your local drive, or load it directly from that URL as we do here.<\/p>\n<p>Add the following lines to the next cell to load the dataset into a variable.<\/p>\n<p><!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; border: solid gray; border-width: .1em .1em .1em .8em; padding: .2em .6em;\">\n<pre style=\"margin: 0; line-height: 125%;\">url <span style=\"color: #333333;\">=<\/span> <span style=\"background-color: #fff0f0;\">\"https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/iris\/iris.data\"<\/span>\n\ndf <span style=\"color: #333333;\">=<\/span> pd<span style=\"color: #333333;\">.<\/span>read_csv(url, names<span style=\"color: #333333;\">=<\/span>[<span style=\"background-color: #fff0f0;\">'sepal length'<\/span>, <span style=\"background-color: #fff0f0;\">'sepal width'<\/span> , <span style=\"background-color: #fff0f0;\">'petal length'<\/span>, <span style=\"background-color: #fff0f0;\">'petal width'<\/span>, <span style=\"background-color: #fff0f0;\">'target'<\/span>])\n<\/pre>\n<\/div>\n<p>Listing 1.1: Obtain and load your dataset<\/p>\n<p>In Listing 1.1, the first line specifies the URL of the dataset, and the second line loads the dataset into a dataframe df (a dataframe is simply used to hold data).<\/p>\n<p><strong>pd.read_csv()<\/strong> is a function in pandas. The first argument is the path to the data, and the second argument is a list of the column names. What this means is that the first column of the data would be named &#8216;sepal length&#8217;, the second column &#8216;sepal width&#8217; and so on.<\/p>\n<p>When the code in Listing 1.1 executes, your dataset is available in the variable df.<\/p>\n<p>&nbsp;<\/p>\n<h5 id=\"t3\">Step 3: Preview Your Data<\/h5>\n<p>You can view your data by typing df into the next cell and running it as shown in Figure 1.0. You can also type print(df). 
In the table, there are four features and one target (or class).<\/p>\n<figure id=\"attachment_419\" aria-describedby=\"caption-attachment-419\" style=\"width: 537px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-419\" src=\"https:\/\/www.kindsonthegenius.com\/wp-content\/uploads\/2020\/09\/DataFrame-in-Python.jpg\" alt=\"\" width=\"537\" height=\"299\" \/><figcaption id=\"caption-attachment-419\" class=\"wp-caption-text\">Figure 1.0: Data in DataFrame<\/figcaption><\/figure>\n<p>&nbsp;<\/p>\n<h5 id=\"t4\">Step 4: Perform Scaling on the Data<\/h5>\n<p>Scaling means that we center and scale the data, so that each feature (column) has a mean of 0 and a variance of 1.<\/p>\n<p>To scale our data, we would use StandardScaler, which is available in sklearn.<\/p>\n<p>Note that we are only going to scale the features and not the target. So to do this, we:<\/p>\n<ul>\n<li>first import StandardScaler<\/li>\n<li>separate the features from the target<\/li>\n<li>scale the features<\/li>\n<\/ul>\n<p>These three operations are accomplished using the lines of code below.<\/p>\n<p><!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; border: solid gray; border-width: .1em .1em .1em .8em; padding: .2em .6em;\">\n<pre style=\"margin: 0; line-height: 125%;\"><span style=\"color: #008800; font-weight: bold;\">from<\/span> <span style=\"color: #0e84b5; font-weight: bold;\">sklearn.preprocessing<\/span> <span style=\"color: #008800; 
font-weight: bold;\">import<\/span> StandardScaler\n\nfeatures <span style=\"color: #333333;\">=<\/span> [<span style=\"background-color: #fff0f0;\">'sepal length'<\/span>, <span style=\"background-color: #fff0f0;\">'sepal width'<\/span>, <span style=\"background-color: #fff0f0;\">'petal length'<\/span>, <span style=\"background-color: #fff0f0;\">'petal width'<\/span>]\n\nx <span style=\"color: #333333;\">=<\/span> df<span style=\"color: #333333;\">.<\/span>loc[:, features]<span style=\"color: #333333;\">.<\/span>values\n\ny <span style=\"color: #333333;\">=<\/span> df<span style=\"color: #333333;\">.<\/span>loc[:, [<span style=\"background-color: #fff0f0;\">'target'<\/span>]]<span style=\"color: #333333;\">.<\/span>values\n\nx <span style=\"color: #333333;\">=<\/span> StandardScaler()<span style=\"color: #333333;\">.<\/span>fit_transform(x)\n<\/pre>\n<\/div>\n<p>Listing 1.2: Separate features from target and standardize features<\/p>\n<p>&nbsp;<\/p>\n<h5 id=\"t5\">Step 5: Perform PCA<\/h5>\n<p>To perform PCA, we use the PCA class from sklearn.decomposition, which we already imported in Step 1. In Listing 1.3 below, the first and second lines perform the PCA, and the third line loads the principal components into a dataframe. 
You can view your data by typing principalComponents or principalDataframe in a cell and running it.<\/p>\n<p><!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; border: solid gray; border-width: .1em .1em .1em .8em; padding: .2em .6em;\">\n<pre style=\"margin: 0; line-height: 125%;\">pca <span style=\"color: #333333;\">=<\/span> PCA(n_components<span style=\"color: #333333;\">=<\/span><span style=\"color: #0000dd; font-weight: bold;\">2<\/span>)\n\nprincipalComponents <span style=\"color: #333333;\">=<\/span> pca<span style=\"color: #333333;\">.<\/span>fit_transform(x)\n\nprincipalDataframe <span style=\"color: #333333;\">=<\/span> pd<span style=\"color: #333333;\">.<\/span>DataFrame(data <span style=\"color: #333333;\">=<\/span> principalComponents, columns <span style=\"color: #333333;\">=<\/span> [<span style=\"background-color: #fff0f0;\">'PC1'<\/span>, <span style=\"background-color: #fff0f0;\">'PC2'<\/span>])\n<\/pre>\n<\/div>\n<p>Listing 1.3: PCA for two Principal Components<\/p>\n<p>&nbsp;<\/p>\n<h5 id=\"t6\">Step 6: Combine the Target and the Principal Components<\/h5>\n<p>Remember that the original data has five columns: four features and one target column. Now, after performing PCA, we have just two columns for the features. The target column was not touched. Therefore, we attach the target column back to the new set of principal components. 
To do that, use the code below.<!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; border: solid gray; border-width: .1em .1em .1em .8em; padding: .2em .6em;\">\n<pre style=\"margin: 0; line-height: 125%;\">targetDataframe <span style=\"color: #333333;\">=<\/span> df[[<span style=\"background-color: #fff0f0;\">'target'<\/span>]]\n\nnewDataframe <span style=\"color: #333333;\">=<\/span> pd<span style=\"color: #333333;\">.<\/span>concat([principalDataframe, targetDataframe],axis <span style=\"color: #333333;\">=<\/span> <span style=\"color: #0000dd; font-weight: bold;\">1<\/span>)\n<\/pre>\n<\/div>\n<p>Listing 1.4: Combine Principal Components with target<\/p>\n<p>You can also view your new dataset by just typing newDataframe and running the cell. Your output would be as shown in Figure 1.1.<\/p>\n<figure id=\"attachment_420\" aria-describedby=\"caption-attachment-420\" style=\"width: 549px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-420\" src=\"https:\/\/www.kindsonthegenius.com\/wp-content\/uploads\/2020\/09\/newDataframe.jpg\" alt=\"\" width=\"549\" height=\"282\" \/><figcaption id=\"caption-attachment-420\" class=\"wp-caption-text\">Figure 1.1: New Dataset after performing PCA<\/figcaption><\/figure>\n<p>&nbsp;<\/p>\n<h5 id=\"t7\">Step 7: Perform a Scree Plot of the Principal Components<\/h5>\n<p>A scree plot is a bar chart showing the share of variance captured by each of the principal components. It helps us to visualize the percentage of variation captured by each of the principal components. To perform a scree plot you need to:<\/p>\n<ul>\n<li>first of all, compute the percentage of variance explained by each principal component<\/li>\n<li>then, create a list of labels for the PCs<\/li>\n<li>finally, do the scree plot using plt<\/li>\n<\/ul>\n<p>Now, copy and paste the code in Listing 1.5 below into Jupyter Notebook and then run it. 
Consequently, your output would be as shown in Figure 1.2<\/p>\n<p><!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; border: solid gray; border-width: .1em .1em .1em .8em; padding: .2em .6em;\">\n<pre style=\"margin: 0; line-height: 125%;\">percent_variance <span style=\"color: #333333;\">=<\/span> np<span style=\"color: #333333;\">.<\/span>round(pca<span style=\"color: #333333;\">.<\/span>explained_variance_ratio_<span style=\"color: #333333;\">*<\/span> <span style=\"color: #0000dd; font-weight: bold;\">100<\/span>, decimals <span style=\"color: #333333;\">=<\/span><span style=\"color: #0000dd; font-weight: bold;\">2<\/span>)\n<span style=\"color: #888888;\"># One label per fitted component (pca was created with n_components=2)<\/span>\ncolumns <span style=\"color: #333333;\">=<\/span> [<span style=\"background-color: #fff0f0;\">'PC'<\/span> <span style=\"color: #333333;\">+<\/span> <span style=\"color: #007020;\">str<\/span>(i) <span style=\"color: #008800; font-weight: bold;\">for<\/span> i <span style=\"color: #000000; font-weight: bold;\">in<\/span> <span style=\"color: #007020;\">range<\/span>(<span style=\"color: #0000dd; font-weight: bold;\">1<\/span>, <span style=\"color: #007020;\">len<\/span>(percent_variance) <span style=\"color: #333333;\">+<\/span> <span style=\"color: #0000dd; font-weight: bold;\">1<\/span>)]\nplt<span style=\"color: #333333;\">.<\/span>bar(x<span style=\"color: #333333;\">=<\/span> <span style=\"color: #007020;\">range<\/span>(<span style=\"color: #0000dd; font-weight: bold;\">1<\/span>, <span style=\"color: #007020;\">len<\/span>(percent_variance) <span style=\"color: #333333;\">+<\/span> <span style=\"color: #0000dd; font-weight: bold;\">1<\/span>), height<span style=\"color: #333333;\">=<\/span>percent_variance, tick_label<span style=\"color: #333333;\">=<\/span>columns)\nplt<span style=\"color: #333333;\">.<\/span>ylabel(<span style=\"background-color: #fff0f0;\">'Percentage of Variance Explained'<\/span>)\nplt<span style=\"color: #333333;\">.<\/span>xlabel(<span style=\"background-color: #fff0f0;\">'Principal Component'<\/span>)\nplt<span style=\"color: #333333;\">.<\/span>title(<span style=\"background-color: #fff0f0;\">'PCA Scree Plot'<\/span>)\nplt<span style=\"color: #333333;\">.<\/span>show()\n<\/pre>\n<\/div>\n<p>Listing 1.5: PCA Scree Plot<\/p>\n<p>The bar labels are built from the length of pca.explained_variance_ratio_, so the plot has one bar for each component that pca was fitted with. You can see the scree plot below.<\/p>\n<figure id=\"attachment_429\" aria-describedby=\"caption-attachment-429\" style=\"width: 
735px\" class=\"wp-caption alignnone\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-429 size-large\" src=\"https:\/\/www.kindsonthegenius.com\/wp-content\/uploads\/2020\/09\/Scree-Plot-1024x483.jpg\" alt=\"\" width=\"735\" height=\"347\" \/><figcaption id=\"caption-attachment-429\" class=\"wp-caption-text\">Figure 1.2: Scree Plot<\/figcaption><\/figure>\n<p>&nbsp;<\/p>\n<h5 id=\"t8\">Step 8: Plot the Principal Components on 2D<\/h5>\n<p>Now that we have performed PCA, we need to visualize the new dataset to see how PCA makes it easier to explain the original data. We will use a scatter plot.<\/p>\n<p><!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; border: solid gray; border-width: .1em .1em .1em .8em; padding: .2em .6em;\">\n<pre style=\"margin: 0; line-height: 125%;\">plt<span style=\"color: #333333;\">.<\/span>scatter(principalDataframe<span style=\"color: #333333;\">.<\/span>PC1, principalDataframe<span style=\"color: #333333;\">.<\/span>PC2)\nplt<span style=\"color: #333333;\">.<\/span>title(<span style=\"background-color: #fff0f0;\">'PC1 against PC2'<\/span>)\nplt<span style=\"color: #333333;\">.<\/span>xlabel(<span style=\"background-color: #fff0f0;\">'PC1'<\/span>)\nplt<span style=\"color: #333333;\">.<\/span>ylabel(<span style=\"background-color: #fff0f0;\">'PC2'<\/span>)\n<\/pre>\n<\/div>\n<p>Listing 1.6: 2D Plot of PC1 and PC2<\/p>\n<p>If you execute the code above, you will have the plot given in Figure 1.3.<\/p>\n<figure id=\"attachment_421\" aria-describedby=\"caption-attachment-421\" style=\"width: 525px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-421 \" src=\"https:\/\/www.kindsonthegenius.com\/wp-content\/uploads\/2020\/09\/PCA-Plot-1.jpg\" alt=\"\" width=\"525\" height=\"296\" \/><figcaption id=\"caption-attachment-421\" class=\"wp-caption-text\">Figure 1.3: First PCA plot of PC1 and PC2<\/figcaption><\/figure>\n<p><strong>So 
what have we achieved?<\/strong><\/p>\n<p>We will repeat this plot, this time with a color for each of the targets (Iris-setosa, Iris-versicolor and Iris-virginica). In this way we can see how PCA helps explain the data. To keep things simple, I will not explain this code line by line. Write and run the code below.<\/p>\n<p><!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; border: solid gray; border-width: .1em .1em .1em .8em; padding: .2em .6em;\">\n<pre style=\"margin: 0; line-height: 125%;\">fig <span style=\"color: #333333;\">=<\/span> plt<span style=\"color: #333333;\">.<\/span>figure(figsize <span style=\"color: #333333;\">=<\/span> (<span style=\"color: #0000dd; font-weight: bold;\">8<\/span>,<span style=\"color: #0000dd; font-weight: bold;\">8<\/span>))\nax <span style=\"color: #333333;\">=<\/span> fig<span style=\"color: #333333;\">.<\/span>add_subplot(<span style=\"color: #0000dd; font-weight: bold;\">1<\/span>,<span style=\"color: #0000dd; font-weight: bold;\">1<\/span>,<span style=\"color: #0000dd; font-weight: bold;\">1<\/span>) \nax<span style=\"color: #333333;\">.<\/span>set_xlabel(<span style=\"background-color: #fff0f0;\">'PC1'<\/span>)\nax<span style=\"color: #333333;\">.<\/span>set_ylabel(<span style=\"background-color: #fff0f0;\">'PC2'<\/span>)\n\nax<span style=\"color: #333333;\">.<\/span>set_title(<span style=\"background-color: #fff0f0;\">'Plot of PC1 vs PC2'<\/span>, fontsize <span style=\"color: #333333;\">=<\/span> <span style=\"color: #0000dd; font-weight: bold;\">20<\/span>)\n\ntargets <span style=\"color: #333333;\">=<\/span> [<span style=\"background-color: #fff0f0;\">'Iris-setosa'<\/span>, <span style=\"background-color: #fff0f0;\">'Iris-versicolor'<\/span>, <span style=\"background-color: #fff0f0;\">'Iris-virginica'<\/span>]\n\ncolors <span style=\"color: #333333;\">=<\/span> [<span style=\"background-color: #fff0f0;\">'r'<\/span>, <span style=\"background-color: 
#fff0f0;\">'g'<\/span>, <span style=\"background-color: #fff0f0;\">'b'<\/span>]\n\n<span style=\"color: #008800; font-weight: bold;\">for<\/span> target, color <span style=\"color: #000000; font-weight: bold;\">in<\/span> <span style=\"color: #007020;\">zip<\/span>(targets,colors):\n    indicesToKeep <span style=\"color: #333333;\">=<\/span> newDataframe[<span style=\"background-color: #fff0f0;\">'target'<\/span>] <span style=\"color: #333333;\">==<\/span> target\n    ax<span style=\"color: #333333;\">.<\/span>scatter(newDataframe<span style=\"color: #333333;\">.<\/span>loc[indicesToKeep, <span style=\"background-color: #fff0f0;\">'PC1'<\/span>]\n               , newDataframe<span style=\"color: #333333;\">.<\/span>loc[indicesToKeep, <span style=\"background-color: #fff0f0;\">'PC2'<\/span>]\n               , c <span style=\"color: #333333;\">=<\/span> color\n               , s <span style=\"color: #333333;\">=<\/span> <span style=\"color: #0000dd; font-weight: bold;\">50<\/span>)\n    \nax<span style=\"color: #333333;\">.<\/span>legend(targets)\nax<span style=\"color: #333333;\">.<\/span>grid()\n<\/pre>\n<\/div>\n<p>Listing 1.7: Plot of PC1 vs PC2 with color codes<\/p>\n<p>Likewise, if you execute the code in Listing 1.7 above, you will have the output given in Figure 1.4 below:<\/p>\n<figure id=\"attachment_422\" aria-describedby=\"caption-attachment-422\" style=\"width: 485px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-422 \" 
src=\"https:\/\/www.kindsonthegenius.com\/wp-content\/uploads\/2020\/09\/PCA-Plot-2-1024x682.jpg\" alt=\"\" width=\"485\" height=\"323\" \/><figcaption id=\"caption-attachment-422\" class=\"wp-caption-text\">Figure 1.4: Final Plot of PC1 and PC2<\/figcaption><\/figure>\n<p><strong>Explaining the Variance Using Principal Components<\/strong><\/p>\n<p>Finally, we need to see how much of the variance in our data the two principal components explain. To do that we would use the command below:<\/p>\n<p><!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; border: solid gray; border-width: .1em .1em .1em .8em; padding: .2em .6em;\">\n<pre style=\"margin: 0; line-height: 125%;\">pca<span style=\"color: #333333;\">.<\/span>explained_variance_ratio_\n<\/pre>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Then you will get the output:<\/p>\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"output_subarea output_text output_result\">\n<pre>array([0.72770452, 0.23030523])<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<p>These values show that the first principal component PC1 explains <strong>72.77%<\/strong> of the variation in the original data while the second principal component explains<strong> 23.03%<\/strong> of the variation in the original data.<\/p>\n<p>In conclusion, this means that the original 4-dimensional data can be reduced to 2 dimensions using PCA with little loss of information, because the two components together explain about 95.8% of the variance.<\/p>\n<p>Finally, I hope that this lesson has clearly helped you to see how you can perform Principal Components Analysis using Python.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this simple tutorial, we are going to learn how to perform Principal Components Analysis in Python.\u00a0 This tutorial would be completed using Jupyter Notebook. 
&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"pagelayer_contact_templates":[],"_pagelayer_content":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-1886","post","type-post","status-publish","format-standard","hentry","category-python-tutorials"],"_links":{"self":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/1886","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/comments?post=1886"}],"version-history":[{"count":1,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/1886\/revisions"}],"predecessor-version":[{"id":2054,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/1886\/revisions\/2054"}],"wp:attachment":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/media?parent=1886"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/categories?post=1886"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/tags?post=1886"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}