octo's bayesian mail filter 0.05
--------------------------------

The most important thing first: This is a toy. There are a lot of programs
out there that do exactly what this code does, but probably better. Please
visit obmf's homepage for some links to other programs if you want a
system that simply works.

If you are looking for documentation, you can find it as POD within
obmf.pl itself. If you are not familiar with Perl, try the following:
$ perldoc obmf.pl

obmf uses a database to store its data. This has advantages: The average
database is a lot faster than any Perl program could ever be, databases
often have multi-user capabilities so you can benefit from other people's
spam, and databases can be used network-wide.
By default obmf wants to use a MySQL database, simply because that's what
I use. There is no configuration file yet, so you will have to adjust
the database settings at the very beginning of the `obmf.pl' file. If you
experience any problems with the SQL statements when using a database
other than MySQL, please, PLEASE tell me. Thanks :)

Since you will probably soon get tired of piping all messages to obmf by
hand, there are sample configurations for procmail and mutt included. It
is important that you re-classify all false negatives as well as false
positives. For a first, basic training you can pipe an entire mailbox
(Unix style only, I'm afraid) to obmf and train it this way.

When used with --filter, obmf adds an X-Spam-Probability header field
that contains the probability as a _number_. If you want to match this
field later on, you can use the following regexp, which matches all mail
with a probability of 60% and above:
^X-Spam-Probability: (100|[6-9][0-9])
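
To see which values that regexp accepts, here is a minimal sketch in
Python (not part of obmf). It assumes the header value is a whole number
between 0 and 100; the sample header blocks and the helper name
is_probable_spam are made up for illustration.

```python
import re

# Pattern copied from the README. MULTILINE lets ^ match at the start of
# any header line, not only at the start of the whole string.
SPAM_RE = re.compile(r"^X-Spam-Probability: (100|[6-9][0-9])", re.MULTILINE)


def is_probable_spam(headers: str) -> bool:
    """Return True if the header block carries a probability of 60 or more.

    Assumption: the value is an integer percentage in the range 0..100.
    """
    return SPAM_RE.search(headers) is not None


# Hypothetical header blocks, as they might look after filtering.
samples = [
    "Subject: hello\nX-Spam-Probability: 5\n",
    "Subject: hello\nX-Spam-Probability: 59\n",
    "Subject: offer\nX-Spam-Probability: 60\n",
    "Subject: offer\nX-Spam-Probability: 100\n",
]

for block in samples:
    value = block.strip().split(": ")[-1]
    print(value, is_probable_spam(block))
```

Single-digit values and anything below 60 are not matched, because the
second alternative requires two digits with the first one in 6-9.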

Like a lot of other Bayesian filters, this one was created after reading
"A Plan for Spam" by Paul Graham. However, although the idea was taken
from that paper, the algorithms used are different.

obmf was written in 2002-2003 by Florian "octo" Forster and may be
distributed under the terms of the "GNU General Public License".
You can reach me at <octo at verplant org>.
