The keyword filter now also searches in the "Received:" headers
Alan (Guest Group)
Posted: 28 April 2004 at 6:09pm
I noticed this in the newer pre-release versions: // New to VersionNumber = '2.0.1.333'; I think this will be a big plus for the Bayesian filtering, since spam that is routed through the same open relays and/or comes from the same sources will be filtered. Roberto, can you clarify the limitations of this new feature and how you see it being best utilized?
LogSat (Admin Group)
Joined: 25 January 2005 | Location: United States | Status: Offline | Points: 4104
Alan, how best to use the ability to search the headers is something we'll leave to the user's inventiveness. What SpamFilter does is retrieve all the "Received:" header values and add them to the body of the email, so that the keyword filter scans through them as well. We have not included them in the Bayes analysis yet, because during our initial testing (which included all the other headers as well) we were losing some performance. This is something we may revisit in the near future, however. Roberto F.
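The mechanism Roberto describes (collect every "Received:" value, append it to the body, then run the keyword filter over the combined text) can be sketched roughly as follows. This is only an illustration built on Python's standard `email` module, not SpamFilter's actual code; the function names, the sample message, and the simple substring match are all assumptions made for the example.

```python
from email import message_from_string

# Hypothetical sample message; hosts and addresses are placeholders.
SAMPLE = (
    "Received: from relay.example.net ([192.0.2.10]) by mx.example.org; "
    "Wed, 28 Apr 2004 18:09:00 -0400\n"
    "From: sender@example.com\n"
    "Subject: Hello\n"
    "\n"
    "This is the message body.\n"
)


def body_text(raw_message: str) -> str:
    """Return the plain body of a single-part message ('' for multipart)."""
    payload = message_from_string(raw_message).get_payload()
    return payload if isinstance(payload, str) else ""


def searchable_text(raw_message: str) -> str:
    """Body plus every 'Received:' header value, one per line."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []  # get_all returns None if absent
    return body_text(raw_message) + "\n" + "\n".join(received)


def matches_keyword(raw_message: str, keywords) -> bool:
    """Assumed keyword filter: case-insensitive substring match."""
    text = searchable_text(raw_message).lower()
    return any(keyword.lower() in text for keyword in keywords)
```

With this sketch, a keyword such as a relay's host name or IP address matches even though it never appears in the message body itself.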
Alan (Guest Group)
Roberto, can you make it so the header info (or just parts of it, such as "Received:") can be included in Bayesian filtering as an option? Maybe using a check box? I think it would be a really powerful new tool to catch spam that passed through some of the known open relays that spammers find and continue to reuse. That way, those who have powerful hardware can take advantage of the feature, and those who don't need or want to take a performance hit can leave it turned off.
LogSat (Admin Group)
Alan, we're testing build 2.0.1.345 which, as you requested, looks at all the Received: headers in the Bayesian filtering. More testing will be needed to see the effect this has on performance and on the average size of the corpus database. This build is available for download in the registered user area on our website. If you test it, we'd like to hear back from you on how it's performing. Roberto F.
LogSat (Admin Group)
As a follow-up to the previous answer: if you use the new 345 build, you will probably need to start with a fresh corpus to get accurate results, so that the Received headers have the proper weight in the corpus database. Roberto F.
Alan (Guest Group)
I am giving the 345 release a try. It may take a few days to build up tokens. It looks like you haven't implemented any way to turn the feature on or off? One possible problem did come to mind. If you use backup spooling servers in your MX record, some spammers target them as a secondary entryway. If you get a lot of spam sent using this method, I suspect the spooling servers could eventually be detected as spam by the Bayesian filtering. Roberto, does this sound correct?
LogSat (Admin Group)
I would let statistics do their work... If spammers send mail to your backup MX server, most of the email you receive from it will be spam. The Received headers will contain your backup's IP, so that IP will be taken into consideration. When SpamFilter receives email from your backup, it will see the IP/server name in the Received headers, which will slightly increase the probability that the message is spam. This is correct, since most email from your backup is spam. If the message is good to begin with, statistically the number of "good" tokens will likely make up for the "bad" score caused by the IP. This is all theory, however; actual use will prove its validity. Roberto F.
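To make the statistical argument above concrete, here is a minimal sketch of how per-token spam probabilities are commonly combined in a naive Bayes filter (the formula popularized by Paul Graham's "A Plan for Spam"). Whether SpamFilter combines tokens this way is an assumption, and the token probabilities in the usage note are invented purely for illustration.

```python
from math import prod


def combined_spam_probability(token_probs):
    """Combine per-token spam probabilities with the naive Bayes formula.

    p = prod(p_i) / (prod(p_i) + prod(1 - p_i))

    Each probability is clamped to [0.01, 0.99] so that a single token
    can never force the result to exactly 0 or 1 (or divide by zero).
    """
    clamped = [min(0.99, max(0.01, p)) for p in token_probs]
    spam = prod(clamped)
    ham = prod(1.0 - p for p in clamped)
    return spam / (spam + ham)
```

For example, suppose the backup MX server's IP has become a "bad" token with probability 0.9, while a legitimate message also contains three "good" tokens at 0.1, 0.1 and 0.2. `combined_spam_probability([0.9, 0.1, 0.1, 0.2])` comes out well below 0.5, so the good tokens outweigh the IP. If the other tokens look spammy too, as in `[0.9, 0.9, 0.8]`, the IP pushes the score even higher.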