LogSat Software



Topic: The keyword filter now also searches in the "Received:" headers

Alan (Guest Group), Posted: 28 April 2004 at 6:09pm
	I noticed in the newer pre-release versions:
	// New to VersionNumber = '2.0.1.333';
	{TODO -cNew : The keyword filter now also searches in the Received: headers}
	I think this will be a big plus for the Bayesian filtering, in that spam routed through the same open relays and/or sent from the same sources will be filtered.
	Roberto, can you clarify the limitations of this new feature and how you see it being best utilized?

LogSat (Admin Group, Joined: 25 January 2005, Location: United States), Posted: 28 April 2004 at 11:48pm
	Alan, how to use the ability to search the headers is something we'll leave to the user's inventiveness. What SpamFilter does is retrieve all the "Received:" header values and add them to the body of the email, so that the keyword filter will scan through them as well. We have not included them in the Bayes analysis yet, as during our initial testing (which included all other headers as well) we were losing some performance. This is something we may revisit in the near future, however.
	Roberto F. LogSat Software
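As a rough illustration of the approach Roberto describes (collect every "Received:" value, append it to the body, then run the keyword scan over the combined text), here is a minimal Python sketch. It is not SpamFilter's code, which the thread does not show. The function names, the sample message, and the simple case-insensitive substring match are all assumptions made for the example, and it only handles a single-part plain-text message.

```python
from email import message_from_string

# Hypothetical sample message; the hosts and addresses are made up.
SAMPLE = (
    "Received: from relay.example.net (relay.example.net [192.0.2.10])\n"
    "\tby mx.example.org; Wed, 28 Apr 2004 18:09:00 -0400\n"
    "From: someone@example.com\n"
    "Subject: hello\n"
    "\n"
    "Just a short note.\n"
)


def text_for_keyword_scan(raw_message):
    """Return the body with every Received: header value appended to it."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []  # get_all returns None if absent
    body = msg.get_payload()  # a str for a non-multipart message
    return body + "\n" + "\n".join(received)


def find_keywords(raw_message, keywords):
    """Return the keywords found in the body or in any Received: header."""
    haystack = text_for_keyword_scan(raw_message).lower()
    return [k for k in keywords if k.lower() in haystack]


if __name__ == "__main__":
    # The relay name appears only in the Received: header, not in the body.
    print(find_keywords(SAMPLE, ["relay.example.net", "lottery"]))
```

With this arrangement, a keyword such as the name of a known open relay matches even though it never appears in the visible message text.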

Alan (Guest Group), Posted: 29 April 2004 at 12:15pm
	Roberto, can you make it so the header info (or just parts of the header info, such as "Received:") can be included in Bayesian filtering as an option? Maybe using a check box? I think it would be a really powerful new tool for catching spam that passed through some of the known open relays that some spammers find and continue to reuse. That way, those who have powerful hardware can take advantage of the feature, and those who feel they don't need or want to take a performance hit can leave it turned off.

LogSat (Admin Group), Posted: 30 April 2004 at 1:04am
	Alan, we're testing build 2.0.1.345 which, as you requested, looks at all the Received: headers in the Bayesian filtering. More testing will be needed to see the effect this has on performance and the average size increase of the corpus database. This build is available for download in the registered user area on our website. If you'd like to test it, we'd like to hear back from you on how it performs.
	Roberto F. LogSat Software

LogSat (Admin Group), Posted: 30 April 2004 at 1:08am
	As a follow-up to the previous answer: if you use the new 345 build, to be accurate you will probably need to start with a fresh corpus so that the Received headers have the proper weight in the corpus database.
	Roberto F. LogSat Software

Alan (Guest Group), Posted: 30 April 2004 at 12:13pm
	I am giving the 345 release a try. It may take a few days to build up tokens. It looks like you haven't implemented any way to turn the feature on or off? One possible problem did come to mind. If you use backup spooling servers in your MX record, some spammers target them as a secondary entryway. If you get a lot of spam sent using this method, I suspect the spooling servers could eventually be detected as spam by the Bayesian filtering? Roberto, does this sound correct?

LogSat (Admin Group), Posted: 30 April 2004 at 11:02pm
	I would let statistics do their work... If spammers send mail to your backup MX server, most of the email you receive from it will be spam. The Received headers will contain your backup's IP, and they will then be taken into consideration. When SpamFilter receives email from your backup, it will see the IP/server name in the Received headers, which will cause the spam probability to increase slightly, but this is correct, since most email from your backup is spam. If the message is good to begin with, statistically the number of "good" tokens will likely make up for the "bad" score caused by the IP. This is all theory, however; actual use will prove its validity.
	Roberto F. LogSat Software
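To make the "good tokens outweigh the bad IP token" argument concrete, here is a small worked sketch in Python. The thread does not say which formula SpamFilter uses to combine token scores, so this assumes the naive Bayes combination popularized by Paul Graham's "A Plan for Spam". The per-token probabilities are invented purely for illustration, and the function name is hypothetical.

```python
from math import prod  # available in Python 3.8+


def combined_spam_probability(token_probs):
    """Combine per-token spam probabilities (each strictly between 0 and 1).

    Assumed formula (Graham-style naive Bayes), not confirmed for SpamFilter:
        P = prod(p) / (prod(p) + prod(1 - p))
    """
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


# Invented numbers: three tokens from a legitimate message that lean "good",
# plus one token for the backup MX IP that has mostly been seen in spam.
GOOD_TOKENS = [0.1, 0.2, 0.1]
BACKUP_MX_IP_TOKEN = 0.9

without_ip = combined_spam_probability(GOOD_TOKENS)
with_ip = combined_spam_probability(GOOD_TOKENS + [BACKUP_MX_IP_TOKEN])

if __name__ == "__main__":
    print(f"without backup MX token: {without_ip:.4f}")
    print(f"with backup MX token:    {with_ip:.4f}")
```

Under these assumed numbers the backup server's token raises the score, but the message still lands well below an even 0.5 split, which is the behavior Roberto expects in theory.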


Spam Filter ISP Support Forum
