2.0 Beta Question
Dan B
Guest Group |
Posted: 20 October 2003 at 4:29pm |
R, We have been running the 2.0 beta for 4 days or so. At what percentage does the spam filter mark an email message as spam? Will a future release make it possible to change that percentage in the GUI, or even to turn off the Bayesian portion? I read on your website that there is an issue with legitimate emails being blocked after the corpus has grown to several MB in size. Any updates on this? We are seeing legitimate emails being blocked. Thanks,
LogSat
Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4104 |
Dan, The current beta blocks emails if the Bayesian probability is above 0.9 (90%). The second beta, which we are going to release in a few days, will either have this value hardcoded to 0.99 or make it user-selectable. The final version will definitely have it user-selectable. The second beta will also automatically prune the corpus database, removing old keyword tokens that have not been seen in emails recently. This should decrease the size of the corpus and help with the memory leak issues present in the current beta. We still do not have any valid data on possible high rejection rates with large corpus databases. Any user input on this, where the corpus db.dat file is > 10MB, will be appreciated. Roberto F.
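For readers following the thread, the two behaviours described above (blocking above a probability threshold, and pruning tokens not seen recently) can be sketched roughly as below. This is an illustration only, not the filter's actual code: the names `should_block`, `TokenStats`, `prune_corpus`, and the `max_age_days` parameter are all hypothetical, and the real db.dat layout and pruning rule are not described in this thread.

```python
from dataclasses import dataclass
from typing import Dict

# 0.9 is the value quoted for the current beta; 0.99 is mentioned as a
# possible value for the second beta.
BLOCK_THRESHOLD = 0.9


def should_block(spam_probability: float, threshold: float = BLOCK_THRESHOLD) -> bool:
    """Block only when the Bayesian probability is strictly above the threshold."""
    return spam_probability > threshold


@dataclass
class TokenStats:
    # Hypothetical per-token record; the real corpus format is an assumption.
    spam_count: int
    ham_count: int
    last_seen_day: int  # day number on which the token last appeared in an email


def prune_corpus(corpus: Dict[str, TokenStats], today: int, max_age_days: int) -> Dict[str, TokenStats]:
    """Return a copy of the corpus without tokens unseen for more than max_age_days."""
    return {
        token: stats
        for token, stats in corpus.items()
        if today - stats.last_seen_day <= max_age_days
    }
```

With these assumptions, a message scored at 0.95 is blocked at the 0.9 threshold but would pass at 0.99, which is why raising the threshold should reduce the number of legitimate emails being blocked.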
Dan B
Senior Member Joined: 09 February 2005 Location: United States Status: Offline Points: 105 |
R, >>We still do not have any valid data on possible high rejections with large corpus databases. Any user input on this, where the corpus db.dat file is > 10MB will be appreciated. What info do you need? Some legitimate emails that were rejected? Our corpus files? Dan B
LogSat
Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4104 |
Ideally we'd like statistics, not the email content itself. We'd like to know what happens to the percentage of false positives (good emails being blocked) and the percentage of "misses" (spam slipping through the filters) as the corpus increases in size. Roberto F.
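As a rough sketch of the statistics being asked for, the two percentages could be computed from simple counts taken at a given corpus size. The function name `filter_rates` and its parameters are hypothetical, and the counts themselves would have to come from manually reviewing what the filter blocked and what it let through.

```python
from typing import Tuple


def filter_rates(ham_total: int, ham_blocked: int,
                 spam_total: int, spam_passed: int) -> Tuple[float, float]:
    """Return (false_positive_pct, miss_pct) for one measurement period.

    false_positive_pct: share of legitimate emails that were blocked.
    miss_pct: share of spam emails that slipped through the filter.
    """
    false_positive_pct = 100.0 * ham_blocked / ham_total if ham_total else 0.0
    miss_pct = 100.0 * spam_passed / spam_total if spam_total else 0.0
    return false_positive_pct, miss_pct
```

Recording the pair alongside the size of db.dat at the time of each measurement would show whether either rate drifts as the corpus grows.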