Memory leak or a feature of Windows 2008?
Neolisk | Newbie | Joined: 13 July 2009 | Location: Toronto, ON | Status: Offline | Points: 27
Posted: 17 July 2009 at 10:54am
After about 2 days of runtime, the memory usage of SpamFilterSvc.exe is 728MB. Isn't that too much for such a small program? Will it grow even bigger?
Edited by Neolisk - 17 July 2009 at 10:54am

LogSat | Admin Group | Joined: 25 January 2005 | Location: United States | Status: Offline | Points: 4104
Neolisk,
All the blacklists/whitelists and the Bayesian database are kept in RAM to optimize lookups. Usually the Bayesian database is the one that grows the largest. Can you please let us know the file sizes of the corpus.data and the corpus.dat.prb files in the \SpamFilter\corpus directory?

Neolisk | Newbie | Joined: 13 July 2009 | Location: Toronto, ON | Status: Offline | Points: 27
db.dat ~ 127 MB
db.dat.prb ~ 103 MB

Current memory consumption = 829 MB. If it goes up that fast, we'll have to restart the service quite often.

P.S. I wanted to post a screenshot of the whole folder, but the forum engine would not allow that.

Edited by Neolisk - 17 July 2009 at 4:17pm

LogSat | Admin Group | Joined: 25 January 2005 | Location: United States | Status: Offline | Points: 4104
Actually the Bayesian database should be loaded within about a minute or two after SpamFilter is started, so you should see the RAM usage go up within a few minutes.
Please note that the Bayesian filter is the last one to be used by SpamFilter, and thus will catch a very small percentage of spam compared to the other filters. In our own ISP for example, the Bayesian filter used to catch only about 0.1% of spam, compared to 99.9% for the other filters (we disabled this filter about a year ago on our own live server). Adding to this, Bayesian filters were “the thing” 5 years ago, and for a while this was the “star” filter in our SpamFilter. However, the spammers have since learned how to easily bypass them, making the Bayesian filter even less effective.

As the Bayesian filter is the one that uses the most CPU and the most RAM, if that is affecting your server you may want to consider disabling it as well. If you wish to reset the Bayesian database instead, you can stop SpamFilter, delete (or rename) the SpamFilter/corpus directory, and then restart SpamFilter. Please note however that the database size will grow again.

You can, however, have SpamFilter clean up the database more frequently, thus reducing its size, by lowering this parameter in the SpamFilter.ini file:

;Remove any stale token in the corpus db.dat file that did not appear in incoming emails for the past n days
CleanUpCorpusIntervalDays=7
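To illustrate what a setting like CleanUpCorpusIntervalDays implies, here is a minimal Python sketch of age-based token cleanup. It is not SpamFilter's actual code: the function name `prune_stale_tokens`, the `last_seen` mapping, and the sample tokens and dates are all hypothetical, and the real on-disk format of db.dat is not known from this thread.

```python
from datetime import datetime, timedelta


def prune_stale_tokens(last_seen, now, interval_days):
    """Return a copy of last_seen without tokens not seen for interval_days.

    last_seen maps a token to the datetime it last appeared in an email.
    This is a hypothetical model, not SpamFilter's real data structure.
    """
    cutoff = now - timedelta(days=interval_days)
    return {tok: seen for tok, seen in last_seen.items() if seen >= cutoff}


# Hypothetical sample data for illustration only.
now = datetime(2009, 7, 17)
last_seen = {
    "viagra": datetime(2009, 7, 16),
    "invoice": datetime(2009, 7, 12),
    "lottery": datetime(2009, 7, 1),
}

kept_7 = prune_stale_tokens(last_seen, now, 7)  # cutoff is 10 July
kept_3 = prune_stale_tokens(last_seen, now, 3)  # cutoff is 14 July

print(sorted(kept_7))  # ['invoice', 'viagra']
print(sorted(kept_3))  # ['viagra']
```

In this toy model, a shorter interval keeps fewer tokens, which is the effect the post describes: lowering the value should shrink the database, at the cost of forgetting token statistics sooner.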

Neolisk | Newbie | Joined: 13 July 2009 | Location: Toronto, ON | Status: Offline | Points: 27
CPU hovers around 20% while running on 1 core. Acting as a spam gateway is the only role of that server, so that's acceptable.
It looks like we had a power outage at the weekend (or the server randomly rebooted, or the program did something to its memory consumption), so now it's just 67MB. Anyway, thanks for the suggestion. We'll watch its behavior this week and decide if we really need to do anything. But even if one service restart per week is necessary, I think it's not a big problem. We're not an ISP and don't need Exchange running 24/7.

Neolisk | Newbie | Joined: 13 July 2009 | Location: Toronto, ON | Status: Offline | Points: 27
Now it eats 1.18 GB! Even after I restarted the service. Corpus files are smaller than 400MB, if taken together. What's wrong with it?

LogSat | Admin Group | Joined: 25 January 2005 | Location: United States | Status: Offline | Points: 4104
As we mentioned in the previous post, SpamFilter keeps the Bayesian database in memory (we're actually storing two copies of the database to optimize performance, as both read and write access are required on it at any time). If the files total 400MB, that would add up to 800MB of RAM, plus a small percentage of overhead for memory swaps. In this case, 1GB of RAM usage is thus legitimate.
The issue is thus not a memory leak, but rather whether this database size is justified. The more emails per day are received, and the longer the statistical tokens (words) in the emails are kept, the larger the database grows. Can you please let us know roughly how many emails per day you receive, and what value the setting mentioned above in the SpamFilter.ini file, CleanUpCorpusIntervalDays, has? In addition, if you can please zip and email us one of your latest SpamFilter activity logfiles for an entire day, we'll debug it to ensure there are no errors occurring while cleaning up the Bayesian database. If the zip is over 8MB in size, I'll send you a PM with the login info for our FTP site.
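The arithmetic in this post (two in-memory copies of the corpus files, plus some overhead) can be written down as a rough estimate. This is only a sketch under that stated assumption: the function name `estimate_ram_mb` and the `overhead_fraction` parameter are hypothetical, the post only says "a small percentage", and the in-memory size of the data may not match the file size exactly.

```python
def estimate_ram_mb(file_sizes_mb, copies=2, overhead_fraction=0.0):
    """Rough RAM estimate: total file size times number of in-memory copies.

    copies=2 follows the post above (two copies of the database).
    overhead_fraction is a hypothetical knob; no exact figure is given.
    """
    on_disk_mb = sum(file_sizes_mb)
    return on_disk_mb * copies * (1 + overhead_fraction)


# 400MB of corpus files, two copies, no overhead counted.
print(estimate_ram_mb([400]))        # 800.0

# The sizes reported earlier in the thread: db.dat ~127MB, db.dat.prb ~103MB.
print(estimate_ram_mb([127, 103]))   # 460.0
```

By this estimate the files reported on 17 July would account for roughly 460MB, so comparing the estimate against the observed usage is one way to judge whether the database alone explains the memory consumption.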

Neolisk | Newbie | Joined: 13 July 2009 | Location: Toronto, ON | Status: Offline | Points: 27
Now it's even bigger: 1287MB! And it wasn't like this yesterday: only ~70MB was occupied although the corpus DB was about the same size. That's weird!
We can allocate any reasonable amount of memory. The question is: how much do we need in order to forget about this problem?

CleanUpCorpusIntervalDays=7

I uploaded today's current logs to your FTP.

Another question: since we don't manage spam emails and delete everything that is considered spam at any level of protection, is there any point in storing the Bayesian database? How does it work after all?

LogSat | Admin Group | Joined: 25 January 2005 | Location: United States | Status: Offline | Points: 4104
The Bayesian database holds statistical information about the various "tokens" (words/symbols) that your incoming emails contain. As email arrives and is categorized by the other filters as "spam" or "clean", the Bayesian filter "learns" about the various token patterns, and is, statistically, able to distinguish similar patterns in the future to help identify emails as spam. As spam emails are dynamic (the type of spam you receive today will usually be different from the one you'll receive next week), these statistical "tokens" are set to expire after a few days if during that time new emails do not contain them any more.
However, your question of whether there is any point in storing the Bayesian database was in a way already answered earlier in this thread.
We've received your log and will be looking over it shortly.

Edited by LogSat - 23 July 2009 at 7:07pm
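The token "learning" described in this post can be sketched as a tiny naive Bayes classifier. This is a generic textbook-style illustration in Python, not SpamFilter's implementation: the class name `TokenStats`, the whitespace tokenizer, the add-one smoothing, and the sample messages are all assumptions, and token expiry is left out.

```python
import math
from collections import Counter


class TokenStats:
    """Hypothetical per-token statistics learned from labeled emails."""

    def __init__(self):
        self.spam = Counter()    # token -> number of spam emails containing it
        self.clean = Counter()   # token -> number of clean emails containing it
        self.spam_msgs = 0
        self.clean_msgs = 0

    def learn(self, text, is_spam):
        # Count each token once per message (simple whitespace tokenizer).
        tokens = set(text.lower().split())
        if is_spam:
            self.spam.update(tokens)
            self.spam_msgs += 1
        else:
            self.clean.update(tokens)
            self.clean_msgs += 1

    def spam_probability(self, text):
        # Sum log-odds over the message's tokens, with add-one smoothing
        # so unseen tokens never produce a zero probability.
        log_odds = math.log((self.spam_msgs + 1) / (self.clean_msgs + 1))
        for tok in set(text.lower().split()):
            p_spam = (self.spam[tok] + 1) / (self.spam_msgs + 2)
            p_clean = (self.clean[tok] + 1) / (self.clean_msgs + 2)
            log_odds += math.log(p_spam / p_clean)
        return 1 / (1 + math.exp(-log_odds))


# Hypothetical training data, labeled as the other filters would label it.
stats = TokenStats()
stats.learn("cheap pills online", True)
stats.learn("cheap watches online", True)
stats.learn("meeting agenda attached", False)
stats.learn("lunch meeting tomorrow", False)

spam_like = stats.spam_probability("cheap pills")
clean_like = stats.spam_probability("meeting tomorrow")
print(spam_like > 0.5, clean_like < 0.5)
```

The two counters are what has to be kept around between emails, which is why a database of this kind grows with the number of distinct tokens seen.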

Neolisk | Newbie | Joined: 13 July 2009 | Location: Toronto, ON | Status: Offline | Points: 27
Well, as long as it was using the most CPU and RAM but the values were generally low, I didn't care about it being enabled. Extra protection never hurts, but now... again, it depends on how big it will grow. Currently 75% of memory is occupied. I also noticed that when it reaches 90%, huge lags happen, so the antispam gateway is in a somewhat critical state.
Anyway, thanks for looking!