FAQ - Antispam - Bayesian filter training

Computer Mail Services, Inc.
Software / Services / eMail Tools: IP Address Blocking, Spam Filtering, Log Data Mining and DNS Blacklist Monitoring

TELEPHONE: 248.352.6700 or 800.883.2674 (USA Only)


What is the process of training Praetor for Bayesian filtering?

THE PRE-CONFIGURED DEFAULT BAYESIAN TOKEN DATABASE

As soon as Praetor is installed, it filters spam using the preconfigured default Bayesian token database. This database contains information derived from CMS' study of thousands of spam and non-spam email messages. With it, Praetor immediately filters over 90% of all spam messages.
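To illustrate how a token database can drive filtering, here is a minimal Python sketch of a generic naive Bayes score. It is not Praetor's actual algorithm, which this page does not describe. The function names, the in-memory dictionary standing in for the token database, the add-one smoothing, and the equal weighting of spam and ham are all assumptions made for illustration.

```python
import math

def token_spam_probability(spam_count, ham_count, n_spam, n_ham):
    """Estimate how strongly one token suggests spam, with add-one smoothing.

    n_spam and n_ham are the numbers of spam and ham messages used in training.
    """
    p_token_given_spam = (spam_count + 1) / (n_spam + 2)
    p_token_given_ham = (ham_count + 1) / (n_ham + 2)
    return p_token_given_spam / (p_token_given_spam + p_token_given_ham)

def message_spam_probability(tokens, token_db, n_spam, n_ham):
    """Combine per-token evidence by summing log-odds (assumes independent tokens).

    token_db maps token -> (spam_count, ham_count); unknown tokens count as (0, 0).
    """
    log_odds = 0.0
    for token in set(tokens):
        spam_count, ham_count = token_db.get(token, (0, 0))
        p = token_spam_probability(spam_count, ham_count, n_spam, n_ham)
        log_odds += math.log(p) - math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical counts, invented purely for this example.
example_db = {"viagra": (90, 1), "meeting": (2, 80)}
spam_score = message_spam_probability(["viagra"], example_db, 100, 100)
ham_score = message_spam_probability(["meeting"], example_db, 100, 100)
```

A score near 1 suggests spam and a score near 0 suggests ham. A token that has never been seen contributes no evidence either way, which is why ongoing training on a site's own mail matters.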

TRAINING THE BAYESIAN TOKEN DATABASE

The Bayesian training process starts with Praetor receiving messages and using your pre-existing rules, or the supplied default rules, to make a first classification of each message: is it spam (unwanted) or ham (good)?

Once 1,000 or more samples have been captured, review them to confirm that the classifications are correct. The summary line shown in the Log Viewer is usually enough to decide, but a message can be opened if more detail is needed. For any misclassified message (spam accepted as ham, or ham quarantined as spam), press the appropriate button to remove the wrong classification.

After the review is complete, CMS recommends waiting until the normal mail traffic of the business day has subsided, and then running the training wizard. The wizard analyzes each message, breaks it down into tokens, and updates the frequency count of each token, which is stored in an SQL database.
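The training step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not Praetor's implementation: the tokenizing rule, the table layout, and the function names are hypothetical, and SQLite stands in for whichever SQL database Praetor uses.

```python
import re
import sqlite3

# Hypothetical tokenizing rule: runs of 3 or more word-like characters.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'!-]{3,}")

def tokenize(text):
    """Break a message into lowercase tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]

def open_db(path=":memory:"):
    """Open the token database and create the (assumed) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tokens ("
        "token TEXT PRIMARY KEY, "
        "spam_count INTEGER NOT NULL DEFAULT 0, "
        "ham_count INTEGER NOT NULL DEFAULT 0)"
    )
    return conn

def train(conn, text, is_spam):
    """Increment the spam or ham count once for each distinct token in a message."""
    column = "spam_count" if is_spam else "ham_count"
    for token in set(tokenize(text)):
        conn.execute("INSERT OR IGNORE INTO tokens (token) VALUES (?)", (token,))
        conn.execute(
            f"UPDATE tokens SET {column} = {column} + 1 WHERE token = ?", (token,)
        )
    conn.commit()

def counts(conn, token):
    """Return (spam_count, ham_count) for a token, or (0, 0) if it is unknown."""
    row = conn.execute(
        "SELECT spam_count, ham_count FROM tokens WHERE token = ?", (token,)
    ).fetchone()
    return row if row else (0, 0)

# Train on three reviewed sample messages (invented for this example).
conn = open_db()
train(conn, "Cheap pills, buy cheap now", is_spam=True)
train(conn, "Meeting notes for the budget review", is_spam=False)
train(conn, "Buy the budget report", is_spam=False)
```

Because each message updates many rows, training on thousands of samples generates a large number of database writes, which is consistent with the advice to run the wizard outside busy hours.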

The training process can take several hours. Praetor continues to receive messages during this period, but CPU utilization stays high. For high-volume sites (even after business hours), CMS recommends temporarily taking the Praetor box off-line by stopping the SMTP server.

For tips and suggestions on how to train Praetor's Bayesian filter, read the "Bayesian Training Tips" page.

Copyright 2011 Computer Mail Services, Inc.