Spam Filter ISP Support Forum


Bayesian Filter

Allan (Guest Group)
Posted: 08 March 2004 at 3:42am

I've purchased v.2, waited for 5000 rejected mails, and finally this weekend the filter kicked in. But to what use? I have seen two emails score higher than 0.05%; the rest score lower.

Am I doing something wrong, or is my database just not "good enough"?

I am using the extension blocker. Does this have an effect on the efficiency of the Bayesian filter?

 

Regards,

Allan

Andy (Guest Group)
Posted: 08 March 2004 at 9:58am

I too have reached the point where Bayes has kicked in, and it appears the majority of Bayes-inspected emails score either 0% or 100%. It seems hard to believe that any message would score 0%, since some of its tokens would have to be common in spam or non-spam.
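For readers wondering why scores cluster at the extremes, here is a minimal sketch. It assumes a Graham-style naive Bayes combination of per-token probabilities; the thread does not say which formula SpamFilter actually uses, so the function name `combine` and the sample probabilities are illustrative only. Multiplying many probabilities together pushes the result toward 0 or 1 even when each token is only moderately one-sided.

```python
from math import prod

def combine(token_probs):
    """Combine per-token spam probabilities, naive-Bayes style.

    Assumption: this is the classic Graham formula, not necessarily
    what SpamFilter implements.
    """
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)

# Hypothetical messages: ten moderately one-sided tokens plus two
# tokens leaning the other way.
mostly_hammy = [0.2] * 10 + [0.6] * 2
mostly_spammy = [0.8] * 10 + [0.4] * 2

print(f"{combine(mostly_hammy):.6f}")   # very close to 0
print(f"{combine(mostly_spammy):.6f}")  # very close to 1
```

So a displayed 0% may simply be a tiny nonzero score rounded down, not proof that no token in the message looked spammy.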

I have another anti-spam system behind SpamFilter (MDaemon), and it shows a lot of stuff getting through that its own Bayes filter then catches. I was hoping that SpamFilter would replace MDaemon, but now I'm not so sure.

Am I just being impatient? Is there a 'Best Practices' document for implementing Bayes in SpamFilter?

I used MAPS, Keywords, and blocked attachments to feed the Bayes through the quarantine database. I can't see keeping these going to the quarantine db once the Bayes has been initially trained.

Are others seeing different results?

Thanks!
Andy

 

 

Alan (Guest Group)
Posted: 08 March 2004 at 12:36pm

I am really unclear as to how much use the Bayesian filtering can be at this point if there is still no way to submit spam that has passed through unfiltered back into the corpus. As it stands now, the Bayesian filter appears to be good only for catching spam that is similar to spam already caught by the other existing SpamFilter filters. It appears to have no way to learn anything beyond what your existing filters already do for you.

Users need a way to manually submit unfiltered email as spam.  Without that feature it seems the corpus will never amount to much beyond what your other filters do already.
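The concern above can be illustrated with a toy corpus. This is a sketch under stated assumptions, not SpamFilter's implementation: `TinyCorpus`, `train`, and `token_prob` are hypothetical names, the probability formula is Graham's, and it is assumed that any message the other filters let through is recorded as clean.

```python
from collections import Counter

class TinyCorpus:
    """Toy spam/ham token corpus (hypothetical, for illustration only)."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def train(self, tokens, is_spam):
        # Count each token once per message.
        (self.spam if is_spam else self.ham).update(set(tokens))
        if is_spam:
            self.n_spam += 1
        else:
            self.n_ham += 1

    def token_prob(self, token):
        b = self.spam[token] / max(self.n_spam, 1)
        g = self.ham[token] / max(self.n_ham, 1)
        if b + g == 0:
            return 0.4  # unseen token: mildly innocent default
        return min(0.99, max(0.01, b / (b + g)))

corpus = TinyCorpus()
corpus.train(["viagra", "cheap"], is_spam=True)         # caught by a keyword filter
corpus.train(["meeting", "agenda"], is_spam=False)      # genuine mail
corpus.train(["refinance", "mortgage"], is_spam=False)  # spam that slipped through

before = corpus.token_prob("refinance")  # corpus believes this token is innocent
corpus.train(["refinance", "mortgage"], is_spam=True)   # hypothetical manual submission
after = corpus.token_prob("refinance")
print(before, after)
```

Without the manual submission step, the token's probability stays pinned at the floor, which matches the complaint that the corpus cannot learn more than the other filters already know.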

And seeing that SpamFilter still does not scan the full headers, you are still losing out on another big advantage that Bayesian filtering would offer you.

A recent networking trade journal (was it Network World? I forget off the top of my head) had an article on Bayesian filtering, and they agreed that the BEST Bayesian filters also scan the full headers. It would sure be nice if SpamFilter could too.

Desperado (Senior Member)
Posted: 08 March 2004 at 3:06pm

All,

I am running the Bayesian filter and am having very good success. The Bayesian filter is only checked if all other filters fail to detect anything "bad" in the message. In the last 3 days, it has blocked 2634 messages AFTER all the other filters. My average is around 30-40 an hour, or about one every 2 minutes. This doesn't sound like a lot, but it has removed some of the messages that I had been unable to create a reliable filter to block otherwise. If your volume of messages is lower than mine (around 50,000 / day), you may want to change or add the ini setting

CleanUpCorpusIntervalDays=5

to a longer time. The 5 days is my value, but I am not sure what the default is. I have one instance that handles very little email, and I am trying a value of 20 there. In 10 days, it has not yet reached the 5000 / 5000 level, so the filter may not be very effective for that customer. He also has many other features turned off, which drastically affects the value of SpamFilter as a whole, but that is his choice.
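The ordering described in this post (Bayes runs last, and only once the corpus is large enough) can be sketched roughly as follows. Everything here is an assumption made for illustration: the `classify` function, the 0.9 rejection threshold, and the reading of "5000 / 5000" as 5000 spam plus 5000 clean messages are not taken from SpamFilter itself.

```python
MIN_MESSAGES = 5000  # assumed per-class corpus size before Bayes activates

def classify(message, early_filters, bayes_score, n_spam, n_ham, threshold=0.9):
    """Run the cheaper filters first; consult Bayes only as a last step."""
    for name, check in early_filters:
        if check(message):
            return f"rejected by {name}"
    if n_spam < MIN_MESSAGES or n_ham < MIN_MESSAGES:
        return "passed (Bayes not yet active)"
    if bayes_score(message) >= threshold:
        return "rejected by bayes"
    return "passed"

# Stand-in filters and scorer, purely hypothetical.
early = [("keyword filter", lambda m: "viagra" in m.lower())]
score = lambda m: 0.97 if "refinance" in m.lower() else 0.02

print(classify("Cheap VIAGRA", early, score, 6000, 6000))
print(classify("Refinance now", early, score, 1200, 6000))
print(classify("Refinance now", early, score, 6000, 6000))
```

On a low-volume instance the second case is the common one: the message passes simply because the corpus has not filled yet, which is why a longer corpus cleanup interval may help there.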

Also, I believe LogSat is looking into ways to "slant" the Corpus data but that is best answered by them.

Regards,

Dan S.

LogSat (Admin Group)
Posted: 08 March 2004 at 10:51pm

Please see Notes about the Bayesian statistical filtering

Roberto F.
LogSat Software

Andy (Guest Group)
Posted: 09 March 2004 at 12:00am

Any suggestions on how to deal with spam that was not learned by Bayes during training and is now showing up as 0%?

I had been using MDaemon 6.8.5 (with Bayes) for months prior to buying SpamFilter, and am now using it behind SpamFilter. I see a LOT of spam in MDaemon which I had hoped would be caught by SpamFilter. When I compare the date/time and to/from of those messages against the SpamFilter log, I see that SF tagged those particular messages as 0%. That means the corpus will always see that spam as being OK. What hope is there of changing the corpus to make it score that spam as something other than 0%?

Thanks!
Andy

ASB (Guest Group)
Posted: 10 March 2004 at 1:23am

===
Users need a way to manually submit unfiltered email as spam.  Without that feature it seems the corpus will never amount to much beyond what your other filters do already.
===

I agree... Too many things are getting by the Bayesian filter for me...

 
