
Does Bayesian Filter Properly Learn Whitelisted Entries

Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
URL: https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=3553
Printed Date: 15 January 2025 at 9:58am


Topic: Does Bayesian Filter Properly Learn Whitelisted Entries
Posted By: Guests
Subject: Does Bayesian Filter Properly Learn Whitelisted Entries
Date Posted: 06 May 2004 at 12:44am

We recently reset our Bayesian filter because it was not catching enough junk mail. At the same time, we whitelisted the domains of all our clients and added more words to the blacklist keywords filter. The result was very positive, with less spam and fewer false positives, until today.

Today the Bayesian filter hit the 5000 good email mark again (with 30,000+ spam), and now every incoming email that is not on the whitelist is flagged as 100% spam. Words like "the", "and", and "to" have a spam probability of 0.99.

This was not the case before we put in the whitelist. Does the filter properly learn the words in whitelisted emails as well as those in blacklisted emails? Also, I read postings about a tool to edit the Corpus. When is that due out?




Replies:
Posted By: LogSat
Date Posted: 06 May 2004 at 11:53pm

Doug,

SpamFilter does not "learn" from the black/white lists. Instead, it examines the incoming emails, analyzes their content, and updates the statistical database with them. The more accurate your existing "regular" filter, the better trained the statistical filter becomes. At the beginning it helps a lot to check the quarantine and force delivery of false positives (legitimate email that has been blocked). This trains the filter to recognize the mistake it has made and to weight similar emails more heavily as good mail in the future. The filter also becomes more accurate over time.
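To illustrate the kind of statistics involved, here is a minimal Python sketch of Graham-style per-token scoring. It is not SpamFilter's actual implementation, which is not described in this thread; the class and method names (TokenStats, train, reclassify_as_good, spam_probability) and the 0.01/0.99 clamps are assumptions made for the example. It shows how a common word can reach 0.99 when it has been seen almost only in mail counted as spam, and how force-delivering a false positive pulls that score back down.

```python
from collections import Counter


class TokenStats:
    """Toy Graham-style token statistics (illustrative, not SpamFilter's code)."""

    def __init__(self):
        self.good = Counter()   # token -> number of good messages containing it
        self.bad = Counter()    # token -> number of spam messages containing it
        self.n_good = 0         # good messages seen
        self.n_bad = 0          # spam messages seen

    def train(self, text, is_spam):
        tokens = set(text.lower().split())
        if is_spam:
            self.bad.update(tokens)
            self.n_bad += 1
        else:
            self.good.update(tokens)
            self.n_good += 1

    def reclassify_as_good(self, text):
        """Move a message previously counted as spam into the good counts."""
        tokens = set(text.lower().split())
        self.bad.subtract(tokens)
        self.n_bad -= 1
        self.good.update(tokens)
        self.n_good += 1

    def spam_probability(self, token):
        g = self.good[token] / max(self.n_good, 1)   # share of good mail with token
        b = self.bad[token] / max(self.n_bad, 1)     # share of spam with token
        if g + b == 0:
            return 0.5                               # never seen: neutral score
        return min(0.99, max(0.01, b / (g + b)))     # clamp away from 0 and 1


stats = TokenStats()
stats.train("the invoice is attached", is_spam=True)   # a false positive
stats.train("buy the pills now", is_spam=True)
stats.train("meeting notes", is_spam=False)

before = stats.spam_probability("the")   # 0.99: "the" seen only in spam so far
stats.reclassify_as_good("the invoice is attached")
after = stats.spam_probability("the")    # about 0.67 after the correction
print(before, after)
```

In this toy model, a word such as "the" only scores near 0.99 if the good-mail counts contain almost no messages with it, which is why correcting false positives early matters so much.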

Also make sure that when you "reset" the Bayesian filter, you delete ALL files in the SpamFilter\Corpus directory to properly start from scratch.
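If you prefer to script that cleanup, a small Python sketch along these lines would remove every file in the directory. The function name clear_corpus is made up for the example, and the path in the usage comment is a placeholder for wherever SpamFilter is installed on your server. Whether the service needs to be stopped first is not covered here, so treat this as a starting point only.

```python
from pathlib import Path


def clear_corpus(corpus_dir):
    """Delete every regular file directly inside corpus_dir.

    Returns the sorted names of the files that were removed.
    Subdirectories, if any, are left untouched.
    """
    removed = []
    # Take a snapshot of the listing first so we do not delete while iterating.
    for entry in list(Path(corpus_dir).iterdir()):
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)


# Example call (placeholder path, adjust to your installation):
# clear_corpus(r"C:\path\to\SpamFilter\Corpus")
```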

If you continue to have inaccurate results, it may help to double or triple the minimum threshold for the Bayes activation (MinEmailsForBayesKickIm setting in the SpamFilter.ini file) so that the filter kicks in later, after it has performed more learning.
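As a rough sketch of that change, the Python helper below multiplies a numeric setting in the text of an .ini file. It assumes the setting is stored as a plain key=value line, which is an assumption about the file layout; the key name is copied as written above, and the helper name scale_ini_setting and the value 5000 in the usage comment are illustrative only. Back up SpamFilter.ini before rewriting it.

```python
import re


def scale_ini_setting(ini_text, key, factor):
    """Return ini_text with the integer value of `key` multiplied by `factor`.

    Only lines of the form "key = <integer>" are changed; everything else,
    including line endings, is preserved. Raises KeyError if the key is absent.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)(\d+)([ \t]*\r?)$",
        re.MULTILINE,
    )

    def bump(match):
        new_value = int(match.group(2)) * factor
        return f"{match.group(1)}{new_value}{match.group(3)}"

    new_text, count = pattern.subn(bump, ini_text)
    if count == 0:
        raise KeyError(key)
    return new_text


# Example: triple the threshold in a copy of the file contents.
# text = open("SpamFilter.ini").read()
# text = scale_ini_setting(text, "MinEmailsForBayesKickIm", 3)
```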

Roberto F.
LogSat Software



