Spam Filter ISP Support Forum

Sawmill analysis

Alan
Guest Group
Posted: 20 May 2004 at 4:59pm

Roberto, I am using Sawmill as an analysis tool for SF. It works well for some things but not for others.

For instance, I was hoping to use it to evaluate the effectiveness of keyword filters. But since it only works from the logs, it can tell me only which items work very well, not which items never match and are worth deleting.

Also, when checking the Reasons for filtering, it lists only items that matched a content filter, reverse DNS, or an RBL. It does not include entries for other reasons such as attachments, IP filters, From and To blacklists, etc., even though I know messages are being filtered for those reasons. Is this info missing from the logs, and is that why it is not included in the analysis?

Desperado
Senior Member
Joined: 27 January 2005
Location: United States
Posted: 21 May 2004 at 10:24pm

Alan,

Sawmill DOES give you the information you require. Try going to "Keywords", then under Options set "Show Parenthesized Items" and "Show all rows".

Dan

pcmatt
Senior Member
Joined: 15 February 2005
Location: United States
Posted: 22 May 2004 at 8:16am

I found that Sawmill provides limited information about these log files. The problem is that one connection to your Spamfilter server can create anywhere from several to dozens of log entries, covering one or many email messages. Because email connections are so varied and complicated, it's not possible to get accurate information from the type of log processing Sawmill provides. Sawmill is designed for, and works best with, transactions captured in a single log entry.

I wrote a C++ program to do my own log analysis. It reads the log entries and groups all related columns of information together to create a single database row for each email message processed. This way I can combine sender, recipient, source IP, source hostname, keyword, relevant message and result in one row per email message. The end result is a database of all email transactions, with the information pertaining to each message processed by Spamfilter. This makes it really easy to query and report on email processing.
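The grouping idea described above can be sketched roughly as follows. This is not pcmatt's program: it is in Python rather than C++, and the log layout (`<date> <time> <msg-id> <field>=<value>`), the `msg_id` key and the field names are all hypothetical, since the real Spamfilter log format is not shown in this thread. The pattern would need to be adapted to the actual logs.

```python
import re
from collections import defaultdict

# HYPOTHETICAL log layout, for illustration only:
#   "<date> <time> <msg-id> <field>=<value>"
# The real Spamfilter log format is not shown in the thread.
LINE_RE = re.compile(r"^(?P<ts>\S+ \S+) (?P<msg_id>\S+) (?P<key>\w+)=(?P<value>.*)$")

# Columns named in the post; the exact spellings here are assumptions.
COLUMNS = ("sender", "recipient", "source_ip", "source_host",
           "keyword", "message", "result")


def group_entries(lines):
    """Collapse many log lines into one row (dict) per message id."""
    fields_by_msg = defaultdict(dict)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not fit the assumed layout
        if m.group("key") in COLUMNS:
            fields_by_msg[m.group("msg_id")][m.group("key")] = m.group("value")
    # One row per message; columns never seen in the log stay None.
    return [
        {"msg_id": msg_id, **{c: fields.get(c) for c in COLUMNS}}
        for msg_id, fields in fields_by_msg.items()
    ]
```

Each returned row could then be inserted into a database table with one column per field, which is what makes per-message queries straightforward.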

If you are interested in trying out my program, send me an email at pcmatt@idp.net and I'll send you a copy to test on your log files.

Alan
Guest Group
Posted: 24 May 2004 at 2:00pm

Dan, if I am thinking correctly, Sawmill can only show results that are reflected in the logs. Keywords that never trigger a filter will not appear in the logs, and thus not in Sawmill's analysis either. You get the keywords that triggered on one or more occasions, but not the keywords that have not triggered a filter during a specified time period. You can probably export the data and work backwards to derive this info, but I don't believe you can get it directly. I was just looking for a better way to do occasional cleanup of keyword list entries that are no longer effective, and was disappointed that this info was not directly available. If this is incorrect, I would be happy to find out how it can be done.
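The "export the data and work backwards" step amounts to a set difference between the full keyword list and the keywords that appear in the export. A minimal sketch, assuming the keyword list and the exported report have each already been read into a list of strings; the function name and the case-insensitive comparison are assumptions, not anything Sawmill or SF provides:

```python
def unused_keywords(keyword_list, triggered):
    """Return keywords from the filter list that never appear in the export.

    keyword_list: every entry in the keyword filter list.
    triggered:    keywords that matched at least once in the period,
                  e.g. taken from an exported report.
    Comparison is case-insensitive, which is an assumption about how
    the filter matches.
    """
    seen = {k.strip().lower() for k in triggered if k.strip()}
    return [
        k.strip()
        for k in keyword_list
        if k.strip() and k.strip().lower() not in seen
    ]
```

The result is the list of cleanup candidates for the chosen period. A keyword can be absent from the export simply because the period was too short, so it is worth checking a long enough range before deleting anything.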

I like how the Bayesian database is cleaned up based on efficiency; it would be nice if the text lists could also be maintained this way.

Dan, did you have any observations about why the Reasons selection does not seem to provide the other info I mentioned? It seems like it should, as that info is in the logs.
