- Determination of the Class-Conditional Densities
*p(x | C*_{k}) - Determination of the a Posteriori Class Probabilities
*p(C*_{k}| x) - Use of a Discriminative Function

Let’s begin with the first one.

**1. Determination of the Class-Conditional Densities p(x | C_{k})**

The first step is to solve the inference problem of determining the class-conditional probability densities

*p(x|C*) for each class

_{k}*C*individually. Also separately infer the prior class probabilities

_{k}*p(C*

_{k}).Then use Bayes theory of the form

Bayes Theorem |

to find the posterior class probabilities* p(C _{k} | x)*.

We not that the expression

*p(x)*in the numerator can be found by the formula:

Having found the posterior probability *p(C _{k} | x)*, we can use decision theory to determine the class membership. Such approaches that model the inputs as well as the outputs are called generative models, because they can be used to generate synthetic data points in the input space.

**2. First Determine the Posterior Class Probabilities, p(C_{k} | x)**

This approach first solves the inference problem of determining the posterior class probabilities

*p(C*and then subsequently use decision theory to assign each new x to one of the available classes. Such approaches that model the posterior probabilities directly are called discriminative models.

_{k}| x),**3. Using a Discriminant Function**

The third approach is to find a function *f(x)*, called a discriminant function. This function maps each input *x* directly to a class label.

For example, in the case of the two-class problem, the function could have a binary output such that *f(x) = 0* represents class C_{1} while *f(x) = 1* represents class C_{2}. In the use of discriminant function, probability is not used. The binary output discriminant function is shown:

**Final Notes**

The first approach is the most demanding of the three. This is because it involves finding the joint distribution over both x and *C _{k}*. But this approach have the advantage of allowing for marginal dencity of data to be determined.

The simplest approach is the last one as it does not require probability functions.