{"id":192,"date":"2018-01-10T18:58:00","date_gmt":"2018-01-10T17:58:00","guid":{"rendered":"https:\/\/kindsonthegenius.com\/blog\/2018\/01\/10\/introduction-to-support-vector-machine-svm\/"},"modified":"2020-08-22T14:38:04","modified_gmt":"2020-08-22T12:38:04","slug":"introduction-to-support-vector-machine-svm","status":"publish","type":"post","link":"https:\/\/kindsonthegenius.com\/blog\/introduction-to-support-vector-machine-svm\/","title":{"rendered":"Introduction to Support Vector Machine (SVM)"},"content":{"rendered":"<p>Today we will give a clear and simple explanation of Support Vector Machines, discussing the basics in very clear terms.<\/p>\n<p><a href=\"https:\/\/youtu.be\/Wa0Re_U3W38\">Video 1: Introduction to Support Vector Machines<\/a><br \/>\n<a href=\"https:\/\/youtu.be\/xud3_VaeWUw\">Video 2: Support Vector Machines Tutorial<\/a><\/p>\n<div style=\"clear: both; text-align: center;\">\u00a0<a style=\"margin-left: 1em; margin-right: 1em;\" href=\"https:\/\/1.bp.blogspot.com\/-nVJgEr5ST2o\/WlZh1SjC9YI\/AAAAAAAAAtg\/aQz5zx_MtfUwjv3tWhvqwyNrlw1RRlIpgCLcBGAs\/s1600\/Introduction-to-Support-Vector-Machines%2528SVM%2529-in-Machine-Learning.jpg\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/1.bp.blogspot.com\/-nVJgEr5ST2o\/WlZh1SjC9YI\/AAAAAAAAAtg\/aQz5zx_MtfUwjv3tWhvqwyNrlw1RRlIpgCLcBGAs\/s320\/Introduction-to-Support-Vector-Machines%2528SVM%2529-in-Machine-Learning.jpg\" width=\"320\" height=\"165\" border=\"0\" data-original-height=\"593\" data-original-width=\"1144\" \/><\/a><\/div>\n<p>We will cover the following 7 topics in this lesson.<\/p>\n<ol>\n<li><a href=\"#t1\">What are Support Vector Machines?<\/a><\/li>\n<li><a href=\"#t2\">How Do SVMs Work?<\/a><\/li>\n<li><a href=\"#t3\">Maximum-Margin Hyperplane<\/a><\/li>\n<li><a href=\"#t4\">What are the Advantages of SVM?<\/a><\/li>\n<li><a href=\"#t5\">What are the Cons of SVM?<\/a><\/li>\n<li><a href=\"#t6\">Introduction to the Kernel Trick
<\/a><\/li>\n<li><a href=\"#t7\">Final Notes<\/a><\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<p>You need to note the following points:<\/p>\n<ul>\n<li>A Support Vector Machine is a binary classification algorithm that maximises the margin<\/li>\n<li>Support vectors are the points closest to each of the classes, while the maximum-margin hyperplane is the hyperplane that lies midway between the two classes. That is why it is called the maximum-margin hyperplane<\/li>\n<li>The support vectors determine the margin<\/li>\n<li>The distance between the planes is called the margin<\/li>\n<li>The SVM algorithm tries to find the hyperplane with the largest margin<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h3 id=\"t1\">1. What is a Support Vector Machine?<\/h3>\n<p>A support vector machine (SVM), also called a support vector network, is a supervised learning method used for classification, regression and outlier detection.<br \/>\nGiven a training data set, with each observation marked as belonging to one of two classes, a Support Vector Machine builds a model that assigns a new observation to one class or the other.<\/p>\n<p>&nbsp;<\/p>\n<h3 id=\"t2\">2. How Do Support Vector Machines Work?<\/h3>\n<p>SVM separates the two classes of data using a hyperplane.<br \/>\n<span style=\"color: black;\"><i>But what is a hyperplane anyway?<\/i><\/span><br \/>\nA hyperplane is a subspace with one dimension less than the input space that separates the data into classes. If the space is 3-dimensional, its hyperplanes are 2-dimensional planes. If the space is 2-dimensional, its hyperplanes are 1-dimensional lines.<br \/>\nNow that you understand what hyperplanes are, let&#8217;s continue our discussion of how SVMs work.<\/p>\n<p>&nbsp;<\/p>\n<h3 id=\"t3\">3. The Maximum-Margin Hyperplane<\/h3>\n<p>Assuming a linearly separable dataset, we can select two parallel hyperplanes that separate the two data classes such that the distance between them is as large as possible.
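To make this concrete, here is a minimal sketch of fitting a maximum-margin (linear) SVM on a toy dataset using scikit-learn; the data and variable names are purely illustrative, and scikit-learn is assumed to be installed. With a linear kernel, the learned weight vector w and bias b define the decision function, and the distance between the two margin hyperplanes equals 2 divided by the norm of w.

```python
# Illustrative sketch (not from the original lesson): a linear SVM on toy data.
import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters, one per class
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],   # class -1 (blue)
              [5.0, 5.0], [5.5, 6.0], [6.0, 5.5]])  # class +1 (red)
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates a hard margin on separable data
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]                   # weight vector w
b = clf.intercept_[0]              # bias b
margin = 2.0 / np.linalg.norm(w)   # distance between the two hyperplanes

print("support vectors:", clf.support_vectors_)
print("margin width:", margin)
```

Only the support vectors (the points nearest the opposite class) appear in `clf.support_vectors_`; moving any other training point slightly would leave the fitted hyperplane unchanged.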
The distance between these two hyperplanes is known as the &#8216;margin&#8217;, and each of the two hyperplanes serves as the boundary for one of the classes. This is illustrated in Figure 2 below.<\/p>\n<table style=\"margin-left: auto; margin-right: auto; text-align: center;\" cellspacing=\"0\" cellpadding=\"0\" align=\"center\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a style=\"margin-left: auto; margin-right: auto;\" href=\"https:\/\/2.bp.blogspot.com\/-gYhgDw__5-Q\/WlZTYxws5FI\/AAAAAAAAAtA\/AyMMVQBhSw0mUAf_u0vgT53b24cusavAgCLcBGAs\/s1600\/Support-Vector-Machines-HyperPlanes.jpg\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/2.bp.blogspot.com\/-gYhgDw__5-Q\/WlZTYxws5FI\/AAAAAAAAAtA\/AyMMVQBhSw0mUAf_u0vgT53b24cusavAgCLcBGAs\/s400\/Support-Vector-Machines-HyperPlanes.jpg\" width=\"400\" height=\"232\" border=\"0\" data-original-height=\"571\" data-original-width=\"981\" \/><\/a><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><b>Figure 2:<\/b> Margin between the two Hyperplanes<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The maximum-margin hyperplane is the one that lies at the midpoint between these two hyperplanes.<\/p>\n<p>The two classes can now be separated using the linear model given by:<\/p>\n<div style=\"text-align: center;\"><span style=\"color: black;\"><i>y(x) = w<sup>T<\/sup>\u03c6(<b>x<\/b>) + b<\/i><\/span><\/div>\n<p>where w is the weight vector,<br \/>\n\u03c6(x) is a transformation function and<br \/>\nb is the bias<\/p>\n<p>For the example in Figure 2, the parameters have to be chosen such that y(x) &lt; 0 for the blue data points and y(x) &gt; 0 for the red data points.<br \/>\nSo we can now define the equations for the two margin hyperplanes as follows.<\/p>\n<p>For the blue class on the left:<\/p>\n<div style=\"text-align: center;\"><span style=\"color: black;\"><i>w<sup>T<\/sup>\u03c6(<b>x<\/b>) + b = -1<\/i><\/span><\/div>\n<p>For the red class on the right:<\/p>\n<div style=\"text-align: center;\"><span style=\"color: black;\"><i>w<sup>T<\/sup>\u03c6(<b>x<\/b>) + b = +1<\/i><\/span><\/div>\n<p>This means that each data point must lie on the correct side of its class&#8217;s margin hyperplane.<br \/>\nWe can now determine the maximum-margin hyperplane by fitting a line exactly halfway between the two hyperplanes. This is illustrated in Figure 3 below.<\/p>\n<p>&nbsp;<\/p>\n<table style=\"margin-left: auto; margin-right: auto; text-align: center;\" cellspacing=\"0\" cellpadding=\"0\" align=\"center\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a style=\"margin-left: auto; margin-right: auto;\" href=\"https:\/\/3.bp.blogspot.com\/-eV-3s1lxJnE\/WlZaGFFr7CI\/AAAAAAAAAtQ\/Ld3U5eJ7TCYsXK_PjgztziUZ7SLX6P3zgCLcBGAs\/s1600\/Maximum-margin-hyperplane.jpg\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/3.bp.blogspot.com\/-eV-3s1lxJnE\/WlZaGFFr7CI\/AAAAAAAAAtQ\/Ld3U5eJ7TCYsXK_PjgztziUZ7SLX6P3zgCLcBGAs\/s400\/Maximum-margin-hyperplane.jpg\" width=\"400\" height=\"227\" border=\"0\" data-original-height=\"573\" data-original-width=\"1006\" \/><\/a><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><b>Figure 3:<\/b> Maximum-Margin Hyperplane<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3 id=\"t4\">4. 
Advantages of SVM<\/h3>\n<p>&nbsp;<\/p>\n<ul>\n<li>SVMs are very efficient in handling data in high-dimensional spaces<\/li>\n<li>SVM is memory efficient because it uses only a subset of the training points (the support vectors) in the decision function<\/li>\n<li>It is very effective when the number of dimensions is greater than the number of observations<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h3 id=\"t5\">5. Cons of SVM<\/h3>\n<p>&nbsp;<\/p>\n<ul>\n<li>It is not suitable for very large data sets<\/li>\n<li>It is not very effective on data sets with many outliers<\/li>\n<li>It doesn&#8217;t directly provide probability estimates<\/li>\n<\/ul>\n<h3 id=\"t6\">6. Introduction to the Kernel Trick<\/h3>\n<p>For complex classification problems, where the classes cannot be separated by a linear boundary, the simple linear SVM method may not be very effective. In such cases, a method known as the kernel trick is employed.<br \/>\nA function that takes vectors in the original input space as input and returns the dot product of their images in the feature space is called a kernel function; using such a function in place of an explicit transformation is known as the kernel trick. We will not discuss the kernel trick further here, since there is a complete lesson on it.<\/p>\n<p>&nbsp;<\/p>\n<h3 id=\"t7\">7. Final Notes<\/h3>\n<p>We have examined the basics of Support Vector Machines. This lesson is meant to give you an overview of the concept. For an in-depth study of SVM, I would recommend a textbook such as &#8216;<i>Pattern Recognition and Machine Learning<\/i>&#8217; by Bishop.<\/p>\n<p>I would like to thank you for reading. 
For any observation, you can leave a comment on the form on the left side of the page.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Today we would give a clear and simple explanation of Support Vector Machines. We would discuss the basics of support vector machines in very clear &hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0},"categories":[11,16,395],"tags":[],"_links":{"self":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/192"}],"collection":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/comments?post=192"}],"version-history":[{"count":2,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/192\/revisions"}],"predecessor-version":[{"id":932,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/192\/revisions\/932"}],"wp:attachment":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/media?parent=192"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/categories?post=192"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/tags?post=192"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}