Theory


The NN Rule

Recall that the only knowledge we have is a set of i points x correctly classified into categories : . By intuition, it is reasonable to assume that observations which are close together -- for some appropriate metric -- will have the same classification.

Thus, when classifying an unknown sample x, it seems appropriate to weight the evidence of the nearby 's heavily.

One simple non-parametric decision procedure of this form is the nearest neighbor rule or NN-rule. This rule classifies x in the category of its nearest neighbor. More precisely, we call

a neast neighbor to x if

where .

The nearest neighbor rule chooses to classify x to the category , where is the nearest neighbor to x and belongs to class . A mistake is made if is not the same as . Also, notice that the NN-rule only uses the nearest neighbor as a classifier, while ignoring the remaining pre-labeled data points.

Figure 1: The NN rule

Figure 1 shows an example of the NN rule. In this problem, there are two classes: (yellow triangles) and (blue squares). The circle represents the unknown sample x and as its nearest neighbor comes from class , it is labeled as class .


The k-NN Rule

If the number of pre-classified points is large it makes good sense to use, instead of the single nearest neighbor, the majority vote of the nearest k neighbors. This method is referred to as the k-NN rule.

The number k should be:
1) large to minimize the probability of misclassifying x.
2) small (with respect to the number of samples) so that the points are close enough to x to give an accurate estimate of the true class of x.

Figure 2: The k-NN rule with k=3

Figure 2 shows an example of the k-NN rule, with k=3. As before, there are two classes: (yellow triangles) and (blue squares). The circle represents the unknown sample x and as two of its nearest neighbors come from class , it is labeled class .


Nearest Neighbor Rule: A Short Tutorial
March, 1999