{"id":177,"date":"2018-01-21T08:54:00","date_gmt":"2018-01-21T07:54:00","guid":{"rendered":"https:\/\/kindsonthegenius.com\/blog\/2018\/01\/21\/how-to-minimize-misclassifcation-rate-in-classification-machine-learning\/"},"modified":"2020-08-22T09:23:07","modified_gmt":"2020-08-22T07:23:07","slug":"how-to-minimize-misclassifcation-rate-in-classification-machine-learning","status":"publish","type":"post","link":"https:\/\/kindsonthegenius.com\/blog\/how-to-minimize-misclassifcation-rate-in-classification-machine-learning\/","title":{"rendered":"How to Minimize Misclassifcation Rate in Classification (Machine Learning)"},"content":{"rendered":"<div style=\"color: #555555; font-size: 18px; line-height: 30px; text-align: justify;\">\n<div style=\"font-family: 'segoe ui';\">Remember that classification is a supervised learning concept that has to do with determining the the class a new input variable belongs.<br \/>In trying to assign input variables to classes, there is possibility that we may assign to a wrong class. This is called misclassification.<\/p>\n<div style=\"clear: both; text-align: center;\"><a href=\"https:\/\/4.bp.blogspot.com\/-14mo2qXuYhc\/WmRVMQTXb6I\/AAAAAAAAA3I\/e_i8nfgyWnARbCcbi_nKkzslUwEhSBwAQCLcBGAs\/s1600\/Minimizing%2BMisclassification%2B-%2BThumbnail.jpg\" style=\"margin-left: 1em; margin-right: 1em;\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" border=\"0\" data-original-height=\"518\" data-original-width=\"1263\" height=\"131\" src=\"https:\/\/4.bp.blogspot.com\/-14mo2qXuYhc\/WmRVMQTXb6I\/AAAAAAAAA3I\/e_i8nfgyWnARbCcbi_nKkzslUwEhSBwAQCLcBGAs\/s320\/Minimizing%2BMisclassification%2B-%2BThumbnail.jpg\" width=\"320\" \/><\/a><\/div>\n<p>By definition, misclassifcation occurs when an input variable is assigned to the wrong class.<\/p>\n<p>One of the goals of classification is to minimize the number of misclassifications. This is done by defining a rule that assigns input x to one of the available classes.<br \/>The approach is to divide the input space into regions <span style=\"color: black;\"><i><span style=\"font-family: Times, &quot;Times New Roman&quot;, serif;\">R<sub>k<\/sub><\/span><\/i><\/span> called decision regions, one region for each class.<br \/><span style=\"color: black;\"><i><span style=\"font-family: Times, &quot;Times New Roman&quot;, serif;\">R<sub>k<\/sub><\/span><\/i><\/span> is assigned to class <span style=\"font-family: Times, &quot;Times New Roman&quot;, serif;\"><span style=\"color: black;\"><i>C<sub>k<\/sub><\/i><\/span><\/span><\/p>\n<p>Consider the case of two classes C<sub>1<\/sub> and C<sub>2<\/sub>. A mistake occurs when an input vector belonging to R<sub>1<\/sub> is assigned to C<sub>2<\/sub> or vector x belonging to R<sub>2<\/sub> is assigned to C<sub>1<\/sub>.<br \/>We can represent this as follows:<\/p>\n<div style=\"clear: both; text-align: center;\"><\/div>\n<p><\/p>\n<div style=\"clear: both; text-align: center;\"><a href=\"https:\/\/4.bp.blogspot.com\/-S99PbJ4AVYU\/WmLDA9S4V3I\/AAAAAAAAA04\/Ee668P0jODoW4IlXDwgZBYOayWgJDfjxgCLcBGAs\/s1600\/Minimizing%2BMisclassification.jpg\" style=\"margin-left: 1em; margin-right: 1em;\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" border=\"0\" data-original-height=\"203\" data-original-width=\"964\" height=\"81\" src=\"https:\/\/4.bp.blogspot.com\/-S99PbJ4AVYU\/WmLDA9S4V3I\/AAAAAAAAA04\/Ee668P0jODoW4IlXDwgZBYOayWgJDfjxgCLcBGAs\/s400\/Minimizing%2BMisclassification.jpg\" width=\"400\" \/><\/a><\/div>\n<div style=\"clear: both; text-align: center;\"><\/div>\n<p>To minimize misclassification, we must choose to assign x to which of the classes has the smaller value of the integrand.<\/p>\n<p>So if <span style=\"color: black;\"><span style=\"font-family: Times, &quot;Times New Roman&quot;, serif;\"><i>p(x, C<sub>1<\/sub>)<\/i><\/span><\/span>&nbsp; is greater than <span style=\"font-family: Times, &quot;Times New Roman&quot;, serif;\"><span style=\"color: black;\"><i>p(x, C<sub>2<\/sub>)<\/i><\/span><\/span>, then x would be assigned to C<sub>1<\/sub>.<\/p>\n<p>Using the product rule, we can&nbsp; determine the posterior probability:<\/p>\n<div style=\"text-align: center;\"><span style=\"color: black;\"><i><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\">p(x, C<sub>k<\/sub>) = p(C<sub>k<\/sub> | x)p(x)<\/span><\/i><\/span><\/div>\n<p>Note that the term<span style=\"color: black;\"><i><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\"> p(C<sub>k<\/sub> | x)<\/span><\/i><\/span> is known as the posterior probability and x should be assigned to the class having the largest posterior probability <span style=\"color: black;\"><i><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\">p(C<sub>k<\/sub> | x).<\/span><\/i><\/span><\/p>\n<p>For the more general case of K classes, it would be a bit easier to maximize the probability of being correct and this is given by:<\/p>\n<div style=\"clear: both; text-align: center;\"><a href=\"https:\/\/3.bp.blogspot.com\/-T0OI-heI-qM\/WmRQ_i5bCoI\/AAAAAAAAA28\/LYXPweqbAeIokCeQhXMJ4hpuHVk5fcIUACLcBGAs\/s1600\/Minimizing%2BMisclassification%2B-%2BP-Correct.jpg\" style=\"margin-left: 1em; margin-right: 1em;\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" border=\"0\" data-original-height=\"331\" data-original-width=\"752\" height=\"140\" src=\"https:\/\/3.bp.blogspot.com\/-T0OI-heI-qM\/WmRQ_i5bCoI\/AAAAAAAAA28\/LYXPweqbAeIokCeQhXMJ4hpuHVk5fcIUACLcBGAs\/s320\/Minimizing%2BMisclassification%2B-%2BP-Correct.jpg\" width=\"320\" \/><\/a><\/div>\n<div style=\"clear: both; text-align: center;\"><\/div>\n<p>This means the to minimize misclassification, we need to maximize this probability over the region <span style=\"color: black;\"><i><span style=\"font-family: Times, &quot;Times New Roman&quot;, serif;\">R<sub>k<\/sub><\/span><\/i><\/span>.<br \/>Using the product rule which states that:<\/p>\n<div style=\"text-align: center;\"><span style=\"color: black;\"><i><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\">p(<\/span><\/i><b><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\">x<\/span><\/b><i><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\">, C<sub>k<\/sub>) = p(C<sub>k<\/sub> | <\/span><\/i><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\"><b>x<\/b><\/span><i><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\">)p(x)<\/span><\/i><\/span><\/div>\n<p>We can see that each class has to be assigned to the class that have the highest posterior probability&nbsp; <span style=\"color: black;\"><i><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\">p(C<sub>k<\/sub> | <\/span><\/i><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\"><b>x<\/b><\/span><i><span style=\"font-family: &quot;times&quot; , &quot;times new roman&quot; , serif;\">)<\/span><\/i><\/span> <\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Remember that classification is a supervised learning concept that has to do with determining the the class a new input variable belongs.In trying to assign &hellip; <\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"pagelayer_contact_templates":[],"_pagelayer_content":"","footnotes":""},"categories":[16],"tags":[],"class_list":["post-177","post","type-post","status-publish","format-standard","hentry","category-machine-learning"],"_links":{"self":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/177","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/comments?post=177"}],"version-history":[{"count":1,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/177\/revisions"}],"predecessor-version":[{"id":1442,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/177\/revisions\/1442"}],"wp:attachment":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/media?parent=177"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/categories?post=177"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/tags?post=177"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}