As we already mentioned in the previous sections, in order to estimate the class-conditional density $p(x|X)$, an assumption is usually made that $x$ is distributed normally with unknown parameters. In the framework of this tutorial we'll make another simplifying assumption: the covariance matrix is known.
In this section we'll focus on the case where the feature vectors are one-dimensional. That is, the only parameter in the parameter vector $\theta$ is the mean $\mu$. Recall from the approach section that the a priori density of the parameter vector is known. In our case, we'll assume that

$$p(\mu) \sim N(\mu_0, \sigma_0^2),$$

that is, the mean $\mu$ is distributed normally with the parameters $\mu_0$ and $\sigma_0^2$. Moreover, the parameter vector $\theta$ completely defines the probability density function $p(x)$, which means

$$p(x|\mu) \sim N(\mu, \sigma^2),$$

where the variance $\sigma^2$ is known and the mean $\mu$ is the parameter.
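To make this setup concrete, here is a minimal Python sketch of the generative model. The concrete values of $\mu_0$, $\sigma_0$, $\sigma$, and the sample size are illustrative assumptions, not part of the tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustrative parameters (not from the tutorial text).
mu0, sigma0 = 0.0, 1.0   # prior: p(mu) ~ N(mu0, sigma0^2)
sigma = 2.0              # known std of p(x|mu), i.e. variance sigma^2 = 4

# Nature first draws the unknown mean from the prior...
mu_true = rng.normal(mu0, sigma0)

# ...and then the samples X = {x_1, ..., x_n} from p(x|mu).
n = 50
X = rng.normal(mu_true, sigma, size=n)
```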
Let's now look how to compute $p(\mu|X)$. It holds that

$$p(\mu|X) = \frac{p(X|\mu)\,p(\mu)}{\int p(X|\mu)\,p(\mu)\,d\mu} = c_1 \prod_{k=1}^{n} p(x_k|\mu)\,p(\mu),$$

where $n$ is the number of samples and $c_1$ is a constant independent of $\mu$. Since given $\mu$, $p(x|\mu)$ is distributed normally with the parameters $(\mu, \sigma^2)$, it holds that

$$p(\mu|X) = c_1 \prod_{k=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x_k-\mu)^2}{2\sigma^2}\right)\cdot\frac{1}{\sqrt{2\pi}\,\sigma_0}\exp\left(-\frac{(\mu-\mu_0)^2}{2\sigma_0^2}\right).$$

Therefore,

$$p(\mu|X) = c_2 \exp\left[-\frac{1}{2}\left(\sum_{k=1}^{n}\frac{(x_k-\mu)^2}{\sigma^2}+\frac{(\mu-\mu_0)^2}{\sigma_0^2}\right)\right].$$

Let's pay attention that

$$\sum_{k=1}^{n}\frac{(x_k-\mu)^2}{\sigma^2}+\frac{(\mu-\mu_0)^2}{\sigma_0^2}=\left(\frac{n}{\sigma^2}+\frac{1}{\sigma_0^2}\right)\mu^2-2\left(\frac{1}{\sigma^2}\sum_{k=1}^{n}x_k+\frac{\mu_0}{\sigma_0^2}\right)\mu+C,$$

where $C$ does not depend on $\mu$. Therefore,

$$p(\mu|X) = c_3 \exp\left[-\frac{1}{2}\left(\left(\frac{n}{\sigma^2}+\frac{1}{\sigma_0^2}\right)\mu^2-2\left(\frac{1}{\sigma^2}\sum_{k=1}^{n}x_k+\frac{\mu_0}{\sigma_0^2}\right)\mu\right)\right].$$

It's easy to verify that $p(\mu|X)$ is also distributed normally, that is

$$p(\mu|X)=\frac{1}{\sqrt{2\pi}\,\sigma_n}\exp\left(-\frac{(\mu-\mu_n)^2}{2\sigma_n^2}\right),$$

where the parameters $\mu_n$ and $\sigma_n^2$ depend on the number of samples $n$. It holds that

$$\mu_n=\frac{n\sigma_0^2}{n\sigma_0^2+\sigma^2}\,\bar{x}_n+\frac{\sigma^2}{n\sigma_0^2+\sigma^2}\,\mu_0,\qquad \sigma_n^2=\frac{\sigma_0^2\,\sigma^2}{n\sigma_0^2+\sigma^2},$$

where

$$\bar{x}_n=\frac{1}{n}\sum_{k=1}^{n}x_k.$$
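These closed-form expressions translate directly into code. The sketch below continues the one above (the helper name posterior_params is ours, introduced only for illustration):

```python
def posterior_params(X, mu0, sigma0, sigma):
    """Return (mu_n, sigma_n^2) of the normal posterior p(mu|X)."""
    n = len(X)
    x_bar = X.mean()                      # the sample mean x-bar_n
    denom = n * sigma0**2 + sigma**2
    mu_n = (n * sigma0**2 / denom) * x_bar + (sigma**2 / denom) * mu0
    sigma_n2 = sigma0**2 * sigma**2 / denom
    return mu_n, sigma_n2

# Reusing X, mu0, sigma0, sigma from the sketch above.
mu_n, sigma_n2 = posterior_params(X, mu0, sigma0, sigma)
```

Note that $\mu_n$ is a weighted average of the sample mean and the prior mean $\mu_0$, with the weight shifting toward the sample mean as $n$ grows.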
Now we are ready to compute the desired class-conditional density function:

$$p(x|X)=\int p(x|\mu)\,p(\mu|X)\,d\mu.$$

Therefore, $p(x|X)$, which is equal to the class-conditional density function $p(x|\omega_k, X_k)$, is normally distributed:

$$p(x|X) \sim N(\mu_n,\ \sigma^2+\sigma_n^2).$$
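In code, the class-conditional density is then just a normal density with the combined variance $\sigma^2+\sigma_n^2$; this continues the sketch above and is not a definitive implementation:

```python
def predictive_density(x, mu_n, sigma_n2, sigma):
    """Evaluate p(x|X) ~ N(mu_n, sigma^2 + sigma_n^2) at x."""
    var = sigma**2 + sigma_n2
    return np.exp(-(x - mu_n)**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

# Example: evaluate the class-conditional density at x = 0.
print(predictive_density(0.0, mu_n, sigma_n2, sigma))
```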
Let's now look at an example illustrating the formulas we developed.
Suppose $p(x|\mu) \sim N(\mu, 4)$, where $p(\mu) \sim N(0, 1)$.

The figure above contains 50 samples.
Let's see how $p(\mu|X)$ changes as the number of samples increases from 10 to 50:

Since $\sigma_n^2$ approaches zero as $n$ goes to infinity, the density function gets sharper as the number of samples increases.
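This sharpening is easy to check numerically with the example's parameters; the short loop below is only a sketch of the same effect, not the applet itself:

```python
# Example parameters: p(x|mu) ~ N(mu, 4), p(mu) ~ N(0, 1).
sigma2, sigma0_2 = 4.0, 1.0
for n in (10, 20, 30, 40, 50):
    sigma_n2 = sigma0_2 * sigma2 / (n * sigma0_2 + sigma2)
    print(f"n={n:2d}  sigma_n^2 = {sigma_n2:.4f}")
# sigma_n^2 falls from about 0.286 at n=10 to about 0.074 at n=50,
# so p(mu|X) concentrates ever more tightly around mu_n.
```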
Click here to play with an applet illustrating the theory discussed above.