
Sawmill analysis

Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
URL: https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=3625
Printed Date: 27 December 2024 at 4:31pm


Topic: Sawmill analysis
Posted By: Guests
Subject: Sawmill analysis
Date Posted: 20 May 2004 at 4:59pm

Roberto, I am using Sawmill as an analysis tool for SF.  It seems to work well for some things but not for others.

For instance, I was hoping to use it to evaluate the effectiveness of keyword filters. But since it only works from the logs, it can only show me the keywords that trigger often, not the ones that never trigger and are worth deleting.

Also, when checking the Reasons for filtering, it only lists items that matched a content filter, reverse DNS, or an RBL. It does not include any entries for other reasons such as attachments, IP filters, From and To blacklists, etc., even though I know items are being filtered for those reasons. Is this info missing from the logs, and is that why it is not included in the analysis?




Replies:
Posted By: Desperado
Date Posted: 21 May 2004 at 10:24pm

Alan,

Sawmill DOES give you the information you require.  Try going to "Keywords", then under Options set "Show Parenthesized Items" and "Show all rows".

Dan



Posted By: pcmatt
Date Posted: 22 May 2004 at 8:16am

I found that Sawmill provided limited information about the log files.  The problem is that one connection to your Spamfilter server can create anywhere from several to dozens of log entries, covering one or many email messages.  Because email connections are so varied and complicated, it's not possible to get accurate information from the type of log processing Sawmill provides.  Sawmill is designed for, and works best with, transactions captured in a single log entry.

I wrote a C++ program to do my own log analysis.  My program analyzes the log entries and groups all related pieces of information together to create a single row in a database for each email message that is processed.  This way I can combine sender, recipient, source IP, source hostname, keyword, relevant message and result all in one row for each email message.  The end result is a database of all email transactions, with the information pertaining to each email message processed by Spamfilter.  This makes it real easy to query and report on email processing.

If you are interested in trying out my program, send me an email at pcmatt@idp.net and I'll send you a copy to test on your log files.
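
For illustration only, the grouping idea described above could be sketched roughly as follows. This is not the C++ program mentioned in this post (which is not shown in the thread), and it does not use the real Spamfilter log format. The `connection_id|field|value` line layout, the `end` and `disconnect` markers, and every field name are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical log format, NOT the real Spamfilter format: each line is
# "connection_id|field|value". An "end" entry closes one message and a
# "disconnect" entry closes the connection.
SAMPLE_LOG = """\
17|source_ip|192.0.2.10
17|sender|a@example.com
18|source_ip|192.0.2.77
17|recipient|b@example.net
17|keyword|viagra
17|result|rejected
17|end|
18|sender|c@example.org
17|sender|e@example.com
18|recipient|d@example.net
18|result|delivered
18|end|
17|recipient|f@example.net
17|result|delivered
17|end|
17|disconnect|
18|disconnect|
"""

# Fields that belong to the connection and carry over to every message on it.
CONNECTION_FIELDS = {"source_ip", "source_host"}


def group_messages(lines):
    """Collapse interleaved log entries into one dict (row) per email message."""
    pending = defaultdict(dict)  # connection_id -> fields collected so far
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        conn_id, field, value = line.split("|", 2)
        if field == "end":
            row = dict(pending[conn_id], connection_id=conn_id)
            rows.append(row)
            # Keep connection-level fields for the next message, drop the rest.
            pending[conn_id] = {k: v for k, v in pending[conn_id].items()
                                if k in CONNECTION_FIELDS}
        elif field == "disconnect":
            pending.pop(conn_id, None)
        else:
            pending[conn_id][field] = value
    return rows


rows = group_messages(SAMPLE_LOG.splitlines())
for row in rows:
    print(row)
```

Each resulting row could then be inserted into a database table, so that sender, recipient, source IP, keyword and result can be queried together per message. A real parser would have to work out from the actual log lines which entries belong to the same connection and where one message ends.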

 

 



Posted By: Guests
Date Posted: 24 May 2004 at 2:00pm

Dan, if I am thinking correctly, Sawmill is only capable of showing results reflected in the logs.  So keywords that never trigger a filter will not appear in the logs and thus not in Sawmill's analysis either.  You get the keywords that triggered on one or more occasions, but not the keywords that did not trigger at all during a specified time period.  You can probably export the data and work backwards to derive this info, but I don't believe you can get it directly.  I was just looking for a better way to do occasional cleanup of old entries from the keyword lists that are no longer effective, and was disappointed that this info was not directly available.  If this is incorrect I would be happy to find out how this can be done.
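
Working backwards from exported data could look something like the sketch below: take the full keyword list, take the keywords that the log analysis reports as having triggered, and keep the difference. How the two lists are obtained (an export from Sawmill, the keyword text file) is not covered here. The function name `unused_keywords`, the sample keywords, and the case-insensitive comparison are all assumptions for the example.

```python
def unused_keywords(configured, triggered):
    """Return configured keywords that never appear among the triggered ones."""
    # Assumption: keyword matching is case-insensitive, so compare lowercased.
    seen = {k.strip().lower() for k in triggered if k.strip()}
    unique_configured = {c.strip() for c in configured if c.strip()}
    return sorted(k for k in unique_configured if k.lower() not in seen)


# Hypothetical data: the full keyword list and the keywords seen in the logs
# for the chosen time period.
configured = ["viagra", "mortgage", "free money", "Lottery"]
triggered = ["Viagra", "lottery", "viagra"]

stale = unused_keywords(configured, triggered)
print(stale)  # ['free money', 'mortgage']
```

The keywords left over are candidates for cleanup, but only for the period covered by the logs. A keyword that did not trigger last month may still catch something next month.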

I like how the Bayesian database is cleaned up based on efficiency. It would be nice if the text lists could also be maintained this way.

Dan, did you have any observations about why the Reasons selection does not seem to provide the other info I mentioned?  It seems like it should, as that info is in the logs.



