Finding close matches to words in text,
searching a database for entries, and homology (similarity) searches on DNA
sequences are but a few of the applications of string matching, which extends
to many instances of Pattern Recognition problems. The problem of
approximate string matching can be formalized as the problem of finding in
a given text of length *n* all text positions which match a pattern
*p* of length *m* with up to *k* errors, where allowed errors
are character deletions, insertions and substitution. Less formally, this
can be viewed as the problem of transforming a part of the input text to obtain
*p* using *k* insertion, deletion and substitution operations.
The *error level*
is defined to be a = *k*/*m*, for
which typical values range from 0.1 to 0.3.

The fastest classical algorithm for low error levels is a*
filtration algorithm*
, and as such its performance degrades quickly as error levels are increased.
The algorithm shown here improves upon previous techniques for finding potential
matches, and improves the filter for biased texts such as natural languages
. These new features allow the algorithm to continues to perform well even
on moderate error levels, making it the fastest algorithm in the field for
string matching in all "interesting" cases.

The fastest classical algorithm for low error levels is a