Bayesian Filter still operational?
yapadu
Senior Member Joined: 12 May 2005 Status: Offline Points: 297
Posted: 24 June 2017 at 2:42am
I know Bayesian filtering was popular with LogSat several years ago, but we ended up disabling it (can't really remember why).
Today we wanted to turn it back on and see how it performed. We have it set to activate at 5000 messages ham/spam. Currently the corpus file says we have processed many more than that:

Spam=15689 Good=12866

We have yet to see it catch anything. Our logs show entries like this:

06/24/17 06:13:43:788 -- (2664055264) Token Good Spam Prob is Spam
06/24/17 06:13:43:788 -- (2664055264) EMail from g-21254616771-21161-2102972555-1498284821079@bounce.news.crumble.com to m_w@example.com passes Bayesian filter - 0% spam (172ms)

Should the "Token Good Spam Prob is Spam" line say anything more? All messages in the logs say 0% spam. I would have assumed messages have at least some 'spam' component, so they should not all be 0%.
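To check whether every message really scores 0%, the Bayesian verdict lines can be pulled out of the log and summarized. The following is only a rough sketch: the regular expression is inferred from the single sample line quoted above, so the actual SpamFilter log format may differ, and the helper names `bayes_scores` and `summarize` are made up for illustration.

```python
import re

# Pattern inferred from the one sample log line in this post (an assumption,
# not a documented SpamFilter log format).
LINE_RE = re.compile(
    r"EMail from (?P<sender>\S+) to (?P<recipient>\S+) .*?"
    r"Bayesian filter - (?P<pct>\d+(?:\.\d+)?)% spam"
)

def bayes_scores(lines):
    """Yield (sender, recipient, percent) for each Bayesian verdict line."""
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            yield m.group("sender"), m.group("recipient"), float(m.group("pct"))

def summarize(lines):
    """Count verdict lines, how many scored above 0%, and the highest score."""
    scores = [pct for _, _, pct in bayes_scores(lines)]
    return {
        "messages": len(scores),
        "nonzero": sum(1 for s in scores if s > 0),
        "max": max(scores, default=None),
    }

sample = [
    "06/24/17 06:13:43:788 -- (2664055264) Token Good Spam Prob is Spam",
    "06/24/17 06:13:43:788 -- (2664055264) EMail from "
    "g-21254616771-21161-2102972555-1498284821079@bounce.news.crumble.com "
    "to m_w@example.com passes Bayesian filter - 0% spam (172ms)",
]
print(summarize(sample))
```

Running `summarize` over a full day's log file (for example `summarize(open(path))`, with `path` pointing at your own log) would show whether any message at all scores above 0%.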
--------------------------------------------------------------
I am a user of SF, not an employee. Use any advice offered at your own risk. |
LogSat
Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4104
The Bayesian filter is very selective: most of the emails it classifies end up with a probability very close to one of the two extremes, either 0% or 100%. That is normal and is simply the nature of Bayesian filtering.
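A small sketch can illustrate why the scores cluster at the extremes. It uses the naive Bayes combination popularized by Paul Graham's "A Plan for Spam"; whether SpamFilter combines token probabilities in exactly this way is an assumption, and the `combine` function and its clamping bounds are illustrative only.

```python
import math

def combine(token_probs, lo=0.01, hi=0.99):
    """Combine per-token spam probabilities into one message probability.

    Implements p = prod(p_i) / (prod(p_i) + prod(1 - p_i)) in log space.
    The clamp to [lo, hi] is an assumed safeguard against log(0).
    """
    clamped = [min(max(p, lo), hi) for p in token_probs]
    log_spam = sum(math.log(p) for p in clamped)
    log_ham = sum(math.log(1.0 - p) for p in clamped)
    # p = 1 / (1 + exp(log_ham - log_spam)); cap the exponent to avoid overflow.
    diff = min(log_ham - log_spam, 700.0)
    return 1.0 / (1.0 + math.exp(diff))

# Fifteen tokens that each lean only moderately one way still push the
# combined result almost all the way to an extreme.
print(combine([0.8] * 15))   # very close to 1.0
print(combine([0.2] * 15))   # very close to 0.0
print(combine([0.8, 0.2]))   # evenly balanced evidence stays near 0.5
```

Because the per-token evidence is multiplied together, even mildly "hammy" tokens compound quickly, so a score that rounds to 0% does not mean the message had no spam-like tokens.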
SpamFilter applies several different filters to incoming emails in a specific order to optimize performance. The Bayesian filter is one of the last to run, and thus catches a very small percentage of spam compared to the other filters. On our own ISP, for example, the Bayesian filter (when activated) catches only about 0.1% of spam, with the other filters catching the remaining 99.9% (we disabled this filter a few years ago on our own live server). In addition, Bayesian filters were "the thing" 9-10 years ago, and for a while this was the "star" filter in SpamFilter. However, spammers have since learned how to bypass them easily, making the Bayesian filter even less effective. We often suggest disabling this filter for companies that receive large volumes of email (~250,000 or more per day), as it uses a lot of resources for only minor gains.
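The effect of filter ordering can be sketched as follows. This is not SpamFilter's actual pipeline: the filter names, message fields, and thresholds are all hypothetical, chosen only to show that a filter placed last is credited only with the spam that every earlier filter let through.

```python
from collections import Counter

def run_pipeline(messages, filters):
    """Apply filters in order; the first filter to flag a message gets credit."""
    caught = Counter()
    for msg in messages:
        for name, is_spam in filters:
            if is_spam(msg):
                caught[name] += 1
                break  # later filters never see this message
    return caught

# Hypothetical filters and message fields, for illustration only.
ip_blacklist = ("ip_blacklist", lambda m: m["ip_listed"])
keyword = ("keyword", lambda m: "pills" in m["body"])
bayesian = ("bayesian", lambda m: m["bayes_score"] >= 0.9)

messages = [
    {"ip_listed": True, "body": "cheap pills", "bayes_score": 0.99},
    {"ip_listed": False, "body": "cheap pills", "bayes_score": 0.99},
    {"ip_listed": False, "body": "hello", "bayes_score": 0.95},
    {"ip_listed": False, "body": "meeting notes", "bayes_score": 0.0},
]

bayes_last = run_pipeline(messages, [ip_blacklist, keyword, bayesian])
bayes_first = run_pipeline(messages, [bayesian, ip_blacklist, keyword])
print(bayes_last)
print(bayes_first)
```

In this toy data the Bayesian check would flag three messages if it ran first, but is credited with only one when it runs last, which mirrors why a late-running filter shows such a small share of catches.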