Sawmill analysis |
Alan
Guest Group |
Posted: 20 May 2004 at 4:59pm |
Roberto, I am using Sawmill as an analysis tool for SF. It seems to work well for some things but not for others. For instance, I was hoping to use it to evaluate the effectiveness of keyword filters, but alas, since it only checks against the logs, it cannot tell me which items are totally ineffective and worth deleting, only the items that work very well. Also, when checking the Reasons for filtering, it only lists items that match a content filter, reverse DNS and RBL items. It does not include any entries for other reasons such as attachments, IP filters, From and To blacklists, etc., even though I know there are items being filtered for those reasons. Is this info missing from the logs, and is that why it is not included in the analysis?
|
Desperado
Senior Member Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Alan, Sawmill DOES give you the information you require. Try going to "Keywords", then under Options set "Show Parenthesized Items" and "Show all rows". Dan
|
pcmatt
Senior Member Joined: 15 February 2005 Location: United States Status: Offline Points: 116 |
I found that Sawmill provided limited information about log files. The problem is that one connection to your Spamfilter server can create several to dozens of log entries accounting for one to many email messages. Because email connections are so varied and complicated, it's not possible to get accurate information from the type of log processing Sawmill provides. Sawmill is designed for, and works best with, transactions captured in a single log entry. I wrote a C++ program to do my own log analysis processing. My program analyzes log entries and groups all related columns of information together to create a single row in a database for each email message that is processed. This way I can combine sender, recipient, source IP, source hostname, keyword, relevant message and result all in one row for each email message. The end result is a database that gives you all email transactions with the information pertaining to each email message processed by Spamfilter. This makes it really easy to query and report on email processing. If you are interested in trying out my program, send me an email at pcmatt@idp.net and I'll send you a copy to test on your log files.
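For anyone curious what that kind of grouping looks like, here is a minimal Python sketch of the idea. It is not the C++ program mentioned above, and the log layout is an assumption made up for illustration (each line starts with a hypothetical message id, followed by one `field=value` pair); real Spamfilter logs will need their own parsing.

```python
from collections import OrderedDict


def group_entries(lines):
    """Collapse many log lines into one row (dict) per message.

    Assumed, hypothetical line format: "<message_id> <field>=<value>".
    """
    rows = OrderedDict()  # keeps messages in first-seen order
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        msg_id, _, rest = line.partition(" ")
        field, _, value = rest.partition("=")
        # Create the row on first sight of this id, then add the field to it.
        row = rows.setdefault(msg_id, {"id": msg_id})
        row[field] = value
    return list(rows.values())


# Made-up sample: entries for two messages arrive interleaved.
sample = [
    "m1 sender=a@example.com",
    "m1 recipient=b@example.org",
    "m2 sender=c@example.com",
    "m1 result=blocked",
    "m2 result=delivered",
]
rows = group_entries(sample)
```

Each dict in `rows` could then be written as a single database row, which is what makes per-message queries straightforward.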
|
|
Alan
Guest Group |
Dan, if I am thinking correctly, Sawmill is only capable of showing results reflected in the logs. So keywords that never trigger a filter will not appear in the logs and thus not in Sawmill's analysis either. You get the keywords that triggered on one or more occasions, but not the keywords that have stopped triggering a filter over a specified time period. You can probably export the data and work backwards to derive this info, but I don't believe you can get it directly. I was just looking for a better way to do occasional cleanup of old entries from the keyword lists that are no longer effective, and was disappointed that this info was not directly available. If this is incorrect, I would be happy to find out how this can be done. I like how the Bayesian database is cleaned up based on efficiency; it would be nice if the text lists could also be maintained this way. Dan, did you have any observations about why the Reasons selection seems not to provide the other info I mentioned? It seems like it should, as that info is in the logs.
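The "export and work backwards" step could be sketched roughly as below. This assumes you can get two plain lists, one with the configured keywords and one with the keywords that show up as triggered in the exported data; how those lists are produced, and the function name `unused_keywords`, are hypothetical and not something Sawmill or Spamfilter provides.

```python
def unused_keywords(keyword_list, triggered):
    """Return configured keywords that never appear among the triggered ones.

    Comparison is case-insensitive and ignores surrounding whitespace
    (an assumption; adjust if the filter matches case-sensitively).
    """
    seen = {k.strip().lower() for k in triggered}
    return [k for k in keyword_list if k.strip().lower() not in seen]


# Made-up example data.
configured = ["viagra", "Refinance", "lottery"]
from_logs = ["VIAGRA", "lottery", "lottery"]
candidates = unused_keywords(configured, from_logs)
```

The entries in `candidates` are only candidates for cleanup: a keyword missing from one reporting period may still trigger in another.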
|