{"id":116,"date":"2018-03-27T13:51:00","date_gmt":"2018-03-27T13:51:00","guid":{"rendered":"https:\/\/kindsonthegenius.com\/blog\/2018\/03\/27\/classification-in-machine-learning\/"},"modified":"2020-07-25T22:57:13","modified_gmt":"2020-07-25T20:57:13","slug":"classification-in-machine-learning","status":"publish","type":"post","link":"https:\/\/kindsonthegenius.com\/blog\/classification-in-machine-learning\/","title":{"rendered":"Classification in Machine Learning"},"content":{"rendered":"<p>In this lesson, we are going to examine classification in machine learning.\u00a0 Below are the topics we are going to cover in this lesson:<\/p>\n<ol>\n<li><a href=\"https:\/\/kindsonthegenius.com\/blog\/classification-in-machine-learning#t1\">Formulation of the Problem<\/a><\/li>\n<li><a href=\"https:\/\/kindsonthegenius.com\/blog\/classification-in-machine-learning#t2\">The Cancer Diagnosis Example<\/a><\/li>\n<li><a href=\"https:\/\/kindsonthegenius.com\/blog\/classification-in-machine-learning#t3\">The Inference and Decision Problems<\/a><\/li>\n<li><a href=\"https:\/\/kindsonthegenius.com\/blog\/classification-in-machine-learning#t4\">The Role of Probability<\/a><\/li>\n<li><a href=\"https:\/\/kindsonthegenius.com\/blog\/classification-in-machine-learning#t5\">Minimizing Rate of Misclassification<\/a><\/li>\n<li><a href=\"https:\/\/kindsonthegenius.com\/blog\/classification-in-machine-learning#t6\">Minimizing Expected Loss<\/a><\/li>\n<li><a href=\"#t7\">Approaches to Classification Based on a Priori Knowledge<\/a><\/li>\n<\/ol>\n<p>So let&#8217;s begin with the first one.<\/p>\n<p>&nbsp;<\/p>\n<h3  id=\"t1\">1. Formulation of the Problem<\/h3>\n<div style=\"color: #555555;\">Suppose we have an input vector x together with a corresponding target vector t. The set of input vectors x and target vectors t forms the training data set.<\/div>\n<div><\/div>\n<div style=\"color: #555555;\">Now, the goal is to predict t for a new value of x. 
For classification problems, t represents class labels. The joint probability distribution p(x, t) provides a complete summary of the uncertainty associated with these variables. Determining this distribution from the training data set is an example of an inference problem.<\/div>\n<p>&nbsp;<\/p>\n<h3 id=\"t2\">2. The Cancer Diagnosis Example<\/h3>\n<div style=\"color: #555555;\">Let&#8217;s take the example of the medical diagnosis problem where we need to determine whether or not the patient has cancer. An x-ray image of the patient is taken, so the input vector x is the set of pixel intensities of the x-ray image.<\/div>\n<div style=\"color: #555555;\">The output variable t will represent the presence or absence of cancer. Now we can form two classes C<sub>1<\/sub> and C<sub>2<\/sub>. C<sub>1<\/sub> represents the presence of cancer while C<sub>2<\/sub> represents the absence of cancer.<\/div>\n<div style=\"color: #555555;\">We need to assign the input to one of the two classes.<\/div>\n<p>&nbsp;<\/p>\n<h3  id=\"t3\">3. The Inference and Decision Problems<\/h3>\n<div style=\"color: #555555;\">The general inference problem involves determining the joint distribution p(x, C<sub>k<\/sub>), or p(x, t), which gives the most complete probabilistic description of the problem. This is the inference step.<\/div>\n<div style=\"color: #555555;\">After this, a decision is made whether to give treatment or not. This is the decision step. This step becomes easy if the inference problem has been solved.<\/div>\n<p>&nbsp;<\/p>\n<h3  id=\"t4\">4. The Role of Probability<\/h3>\n<div style=\"color: #555555;\">When an x-ray image x is obtained for a new patient, the goal is to determine which of the two classes to assign the image to.<\/div>\n<div><\/div>\n<div style=\"color: #555555;\">We could assign it to either of the two classes, C<sub>1<\/sub> or C<sub>2<\/sub> (let&#8217;s use C<sub>k<\/sub>, for k = 1, 2). We need to find the probabilities of the two classes given the image. These probabilities are given by p(C<sub>k<\/sub> | x). 
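These posteriors come from combining class-conditional likelihoods with prior probabilities. A minimal sketch of this computation, assuming the likelihoods p(x | Ck) and priors p(Ck) are already known (the numbers below are made up for illustration, not taken from any real x-ray data):

```python
# Hypothetical sketch: computing the posteriors p(Ck | x) via Bayes' theorem.
# The likelihoods p(x | Ck) and priors p(Ck) below are illustrative, made-up values.

def posteriors(likelihoods, priors):
    """Return p(Ck | x) for each class, given p(x | Ck) and p(Ck)."""
    evidence = sum(l * p for l, p in zip(likelihoods, priors))  # p(x), the normalizer
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# p(x | C1), p(x | C2) for one image x, and priors p(C1), p(C2) (assumed values)
post = posteriors([0.6, 0.1], [0.01, 0.99])
print(post)  # the two posteriors sum to 1
```

Note that even though the image is far more likely under the cancer class, the small prior p(C1) keeps the posterior for C1 low.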
Using Bayes&#8217; theorem, these probabilities can be expressed in the form:<\/div>\n<div><\/div>\n<div style=\"color: #555555;\"><\/div>\n<div style=\"clear: both; color: #555555; text-align: center;\"><a style=\"margin-left: 1em; margin-right: 1em;\" href=\"https:\/\/3.bp.blogspot.com\/-_xOUL80zqAU\/WmSTn9rciSI\/AAAAAAAAA4M\/IAZTHVvp-icqK7pQ8lrlxGHEtjd-y_2zACLcBGAs\/s1600\/Bayes%2BTheory%2Bof%2BProbability.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"\" src=\"https:\/\/3.bp.blogspot.com\/-_xOUL80zqAU\/WmSTn9rciSI\/AAAAAAAAA4M\/IAZTHVvp-icqK7pQ8lrlxGHEtjd-y_2zACLcBGAs\/s320\/Bayes%2BTheory%2Bof%2BProbability.jpg\" width=\"260\" height=\"61\" border=\"0\" data-original-height=\"183\" data-original-width=\"780\" \/><\/a><\/div>\n<div style=\"color: #555555;\">\n<p>&nbsp;<\/p>\n<h3 id=\"t5\">5. Minimizing Misclassification<\/h3>\n<p>Sometimes in classification, we may assign an input to the wrong class. In this case, a misclassification has occurred. The goal is to make as few misclassifications as possible.<\/p>\n<\/div>\n<div><\/div>\n<div style=\"color: #555555;\">We handle this by creating a rule that assigns each x to one of the available classes. 
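A minimal sketch of such a rule, assuming the posteriors p(Ck | x) have already been computed (the posterior values below are made up for illustration), is to pick the class whose posterior is largest:

```python
# Hypothetical sketch of a decision rule: assign each input x to the class
# Ck whose posterior p(Ck | x) is largest. The posteriors are made-up values.

def decide(posterior):
    """Return the index k of the class with the largest p(Ck | x)."""
    return max(range(len(posterior)), key=lambda k: posterior[k])

# Posteriors [p(C1 | x), p(C2 | x)] for three different inputs (assumed values)
inputs = [[0.9, 0.1], [0.3, 0.7], [0.45, 0.55]]
print([decide(p) for p in inputs])  # [0, 1, 1]
```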
This rule divides the input space into regions R<sub>k<\/sub> called decision regions, one region for each class.<\/div>\n<div style=\"color: #555555;\">Points in R<sub>k<\/sub> are assigned to class C<sub>k<\/sub>.<\/div>\n<div><\/div>\n<div style=\"color: #555555;\">A misclassification occurs when a point in one region is assigned to the wrong class, e.g.<\/div>\n<ul style=\"color: #555555;\">\n<li>x is in R<sub>1<\/sub> but is assigned to class C<sub>2<\/sub><\/li>\n<li>x is in R<sub>2<\/sub> but is assigned to class C<sub>1<\/sub><\/li>\n<\/ul>\n<div style=\"color: #555555;\"><\/div>\n<div style=\"color: #555555;\">The probability of misclassification is given by:<\/div>\n<div style=\"color: #555555;\"><\/div>\n<div style=\"clear: both; color: #555555; text-align: center;\"><a style=\"margin-left: 1em; margin-right: 1em;\" href=\"https:\/\/3.bp.blogspot.com\/-5NzGAv667wc\/WmTiFQ0szEI\/AAAAAAAAA48\/Q_KuwBnXkoAWzTKGbDQg6uu6fZ66v_ifQCLcBGAs\/s1600\/Minimizing%2BMisclassification.jpg\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/3.bp.blogspot.com\/-5NzGAv667wc\/WmTiFQ0szEI\/AAAAAAAAA48\/Q_KuwBnXkoAWzTKGbDQg6uu6fZ66v_ifQCLcBGAs\/s400\/Minimizing%2BMisclassification.jpg\" width=\"400\" height=\"83\" border=\"0\" data-original-height=\"203\" data-original-width=\"964\" \/><\/a><\/div>\n<div style=\"color: #555555;\"><\/div>\n<div style=\"color: #555555;\">To minimize this error, the rule must assign <span style=\"font-family: 'georgia' , 'times new roman' , serif;\"><b>x<\/b><\/span> to the class with the smaller value of the integrand:<\/div>\n<div><\/div>\n<div style=\"color: #555555; text-align: center;\"><span style=\"font-family: 'georgia' , 'times new roman' , serif;\"><span style=\"color: black;\"><i>p(<\/i><b>x<\/b><i>, C<sub>k<\/sub>) = p(C<sub>k<\/sub> | <\/i><b>x<\/b><i>)p(<\/i><b>x<\/b><i>)<\/i><\/span><\/span><\/div>\n<div><\/div>\n<div style=\"color: #555555;\">From the above formula, to minimize misclassification, x should be assigned to the class for which the posterior probability is largest.<\/div>\n<div style=\"color: #555555;\"><\/div>\n<div style=\"color: #555555;\">\n<p>&nbsp;<\/p>\n<h3 id=\"t6\">6. Minimizing Expected Loss<\/h3>\n<p>The case of cancer diagnosis shows that the loss incurred by a misclassification can be of different degrees.<\/p>\n<\/div>\n<div style=\"color: #555555; font-family: 'segoe ui';\">So we introduce a loss function, or cost function, with expected value E(L).<\/div>\n<div style=\"color: #555555; font-family: 'segoe ui';\">Suppose that for a new value of x, the correct class is C<sub>k<\/sub> but we assign it to C<sub>j<\/sub> (where j &ne; k). In this case, we have incurred some loss L<sub>kj<\/sub>,<\/div>\n<div style=\"color: #555555; font-family: 'segoe ui';\">where L<sub>kj<\/sub> is the corresponding element of the loss matrix (as in the case of cancer diagnosis).<\/div>\n<div style=\"color: #555555; font-family: 'segoe ui';\">The optimal solution is the one that minimizes the value of the loss function.<\/div>\n<div style=\"color: #555555; font-family: 'segoe ui';\">The average loss depends on the joint probability p(x, C<sub>k<\/sub>) and is given by:<\/div>\n<div><\/div>\n<div style=\"color: #555555; font-family: 'segoe ui';\"><\/div>\n<div style=\"clear: both; color: #555555; text-align: center;\"><a style=\"margin-left: 1em; margin-right: 1em;\" href=\"https:\/\/2.bp.blogspot.com\/-hw9FmMTepE8\/WmTmfV5MyAI\/AAAAAAAAA5I\/2NjTeoIizvwMa7agrxry9A41FWtGdvzLwCLcBGAs\/s1600\/Loss%2BFunction.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"\" src=\"https:\/\/2.bp.blogspot.com\/-hw9FmMTepE8\/WmTmfV5MyAI\/AAAAAAAAA5I\/2NjTeoIizvwMa7agrxry9A41FWtGdvzLwCLcBGAs\/s320\/Loss%2BFunction.jpg\" width=\"259\" height=\"55\" border=\"0\" data-original-height=\"131\" data-original-width=\"617\" \/><\/a><\/div>\n<div style=\"color: #555555; font-family: 'segoe ui';\"><\/div>\n<div style=\"color: #555555; font-family: 'segoe ui';\">The objective is to choose the regions R<sub>j<\/sub> in order to minimize the expected loss. If we know the posterior probability p(C<sub>k<\/sub> | x), then we can minimize the loss.<\/div>\n<div style=\"color: #555555; font-family: 'segoe ui';\">\n<p>&nbsp;<\/p>\n<h3 id=\"t7\"><b>7. 
Three Approaches to Classification<\/b><\/h3>\n<p>The three approaches to classification are:<\/p>\n<ul>\n<li>Determination of the class-conditional probabilities<\/li>\n<li>Direct determination of the posterior probabilities<\/li>\n<li>Use of a discriminant function<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><b>a. Determine the Class Conditional Probabilities<\/b><br \/>\nDetermine the class-conditional probabilities p(x | C<sub>k<\/sub>) for each class. Then determine the prior probabilities p(C<sub>k<\/sub>) for each class. Then, using Bayes&#8217; theorem, determine the posterior probabilities p(C<sub>k<\/sub> | x).<\/p>\n<p>&nbsp;<\/p>\n<p><b>b. Directly Determine the Posterior Probabilities<\/b><br \/>\nSolve the inference problem by first obtaining the posterior class probabilities p(C<sub>k<\/sub> | x) for each class. Then use decision theory to assign new values of x to classes.<\/p>\n<p>&nbsp;<\/p>\n<p><b>c. Use a Discriminant Function<\/b><br \/>\nFind a function f(x), called a discriminant function, that maps each input directly to a class label. For example:<\/p>\n<div style=\"clear: both; text-align: center;\"><a style=\"margin-left: 1em; margin-right: 1em;\" href=\"https:\/\/2.bp.blogspot.com\/-O-Q6lToclZE\/WmTpnOIKQ5I\/AAAAAAAAA5U\/uonr-KNZ1Rs6QtObyDlKdjIbbIyrPvNYACLcBGAs\/s1600\/Discriminant%2BFunction.jpg\"><br \/>\n<img decoding=\"async\" loading=\"lazy\" class=\"\" src=\"https:\/\/2.bp.blogspot.com\/-O-Q6lToclZE\/WmTpnOIKQ5I\/AAAAAAAAA5U\/uonr-KNZ1Rs6QtObyDlKdjIbbIyrPvNYACLcBGAs\/s320\/Discriminant%2BFunction.jpg\" width=\"216\" height=\"56\" border=\"0\" data-original-height=\"144\" data-original-width=\"556\" \/><br \/>\n<\/a><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>In this lesson, we are going to examine classification in machine learning.\u00a0 Below are the topics we are going to cover in this lesson Formulation &hellip; 
<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0},"categories":[11,16],"tags":[],"_links":{"self":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/116"}],"collection":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/comments?post=116"}],"version-history":[{"count":11,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/116\/revisions"}],"predecessor-version":[{"id":1244,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/116\/revisions\/1244"}],"wp:attachment":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/media?parent=116"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/categories?post=116"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/tags?post=116"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}