THE PRE-CONFIGURED DEFAULT BAYESIAN TOKEN DATABASE
As soon as Praetor is installed, it filters spam using the preconfigured
default Bayesian token database. This database contains information
processed from CMS' study of thousands of spam and non-spam email messages.
With it, Praetor filters over 90% of all spam messages immediately, before
any site-specific training.
TRAINING THE BAYESIAN TOKEN DATABASE
The Bayesian training process starts with Praetor receiving messages and using your
pre-existing rules, or the supplied default rules, to make the first classification of
each email: is the message spam (unwanted) or ham (good)?
Once 1000 or more samples have been captured, review them to determine whether the classifications
are correct. Usually this can be determined from the summary line shown in the Log Viewer; if
more detail is needed, the message can be opened. For any misclassified message (spam accepted as
ham, or ham quarantined as spam), press the appropriate button to remove the wrong classification.
After the review is complete, CMS recommends waiting until the normal mail traffic of the business
day has subsided, then running the training wizard. The wizard analyzes each message, breaks it
down into tokens, and updates the frequency count of each token, which is stored in an SQL database.
This training process can take several hours. Praetor will continue to receive messages, but CPU
utilization will stay high during the training period. For sites with high mail volume (even after
business hours), CMS recommends temporarily taking the Praetor box off-line by stopping the SMTP server.
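The token-counting step described above (break each reviewed message into tokens, then update
per-token frequency counts in an SQL database) can be illustrated with a minimal sketch. This is
not Praetor's actual implementation: the table layout, the tokenizing rule, and the function names
tokenize, open_db, train and counts_for are all assumptions made for illustration, and SQLite
stands in for whatever SQL database Praetor uses.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical tokenizing rule: runs of letters, digits and apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Split a message body into lowercase tokens."""
    return TOKEN_RE.findall(text.lower())


def open_db(path=":memory:"):
    """Open the token database and create the (assumed) table layout."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tokens ("
        "token TEXT PRIMARY KEY, "
        "spam_count INTEGER NOT NULL DEFAULT 0, "
        "ham_count INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def train(conn, text, is_spam):
    """Add one reviewed message's token frequencies to the database."""
    column = "spam_count" if is_spam else "ham_count"
    counts = Counter(tokenize(text))
    with conn:  # commit all updates for this message together
        for token, n in counts.items():
            conn.execute(
                "INSERT OR IGNORE INTO tokens (token) VALUES (?)", (token,)
            )
            conn.execute(
                f"UPDATE tokens SET {column} = {column} + ? WHERE token = ?",
                (n, token),
            )


def counts_for(conn, token):
    """Return (spam_count, ham_count) for a token, or (0, 0) if unseen."""
    row = conn.execute(
        "SELECT spam_count, ham_count FROM tokens WHERE token = ?", (token,)
    ).fetchone()
    return row if row else (0, 0)


if __name__ == "__main__":
    db = open_db()
    train(db, "Free money! Claim your free prize", is_spam=True)
    train(db, "Meeting notes for the free workshop", is_spam=False)
    print(counts_for(db, "free"))
```

The sketch counts every occurrence of a token within a message; some Bayesian filters count a
token only once per message, and the source does not say which approach Praetor takes. Each
message costs a pair of SQL statements per distinct token, which gives a sense of why training
on a thousand or more samples keeps the CPU busy for an extended period.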
For tips and suggestions on how to train Praetor's Bayesian filter, read the
"Bayesian Training Tips" page.