
100% cpu usage

Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
URL: https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=5198
Printed Date: 21 December 2024 at 11:16pm


Topic: 100% cpu usage
Posted By: Guests
Subject: 100% cpu usage
Date Posted: 27 May 2005 at 9:34am
I'm trying to figure out an issue I've been having with spamfilter the past couple of weeks. It seems that our "current inbound connections" count has grown exponentially over that time. We tried clustering two boxes together to handle this additional volume, but one of the boxes was also our primary mail server and we experienced performance issues with that.

So now we have reloaded a new server box, reinstalled spamfilter, moved our white/blacklists and quarantine database over to it.  Spamfilter installed with a default of 100 simultaneous connections this time.  Gradually during a 12 to 24 hour period, the number of simultaneous connections will build up to the max of 100.  Before a few weeks ago, this number never got over 10 and our max was 20 concurrent sessions.

During all this, our CPU usage is maxed out at 100%, however spamfilter seems to be able to accept and pass through new email sessions fairly quickly despite task manager saying the cpu is maxed.

Currently we have spamfilter on a 1.8 GHz Pentium 4 system with 512 MB of RAM running Windows 2003 Server Standard Edition. We have one domain that we host that seems to be a spam and virus magnet and is creating the most traffic, averaging about 15,000 legitimate emails per month. As of 05/27/2005 at 2:39AM, our new install of Spamfilter has processed 3,428 inbound connections, blocked 3,348 emails and forwarded 612 emails. I don't believe that this is a very high volume of email, so I'm a bit puzzled by the high cpu usage I'm seeing. I'm also puzzled as to why this has become an issue only in the past few weeks.

Any insight from the spamfilter/mail server gurus out there would be greatly appreciated.

Thanks.



Replies:
Posted By: Desperado
Date Posted: 27 May 2005 at 9:50am

Fred,

About how many messages per day do you handle? Also, can you set the following in your ini file?

IdleDisconnectMinutesTimeout=15

This will force connections that are not actually doing anything to close.
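
To make the idea behind that setting concrete, here is a minimal sketch of an idle-disconnect check. It is an illustration only: the thread does not describe how SpamFilter implements the timeout, and the `Session` class and `idle_sessions` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass

# Mirrors the value suggested above (IdleDisconnectMinutesTimeout=15).
IDLE_DISCONNECT_MINUTES = 15


@dataclass
class Session:
    """Hypothetical record of one inbound SMTP connection."""
    peer: str             # remote address, for reporting only
    last_activity: float  # time of the last command or data, in seconds


def idle_sessions(sessions, now, timeout_minutes=IDLE_DISCONNECT_MINUTES):
    """Return the sessions that have been silent for at least the timeout."""
    limit_seconds = timeout_minutes * 60
    return [s for s in sessions if now - s.last_activity >= limit_seconds]
```

A server loop would call something like `idle_sessions(...)` periodically and close whatever it returns. Note that a check like this only catches connections that are silent; whether it also catches a session that is stuck while the filter is busy depends on the implementation.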

Regards,



-------------
The Desperado
Dan Seligmann.
Work: http://www.mags.net
Personal: http://www.desperado.com



Posted By: Guests
Date Posted: 27 May 2005 at 10:35pm

I would estimate about 6,000 to 7,000 per day given the current stats on the spamfilter. I checked the ini file and there is already an IdleDisconnectMinutesTimeout set to 15 minutes.




Posted By: Desperado
Date Posted: 27 May 2005 at 11:17pm

Fred,

I will not be responding for a bit as I am heading down to Indiana for the weekend ... BUT, I am doing over 150,000 / day and do not see the same issue. What version / build are you running?

Regards,



-------------
The Desperado
Dan Seligmann.
Work: http://www.mags.net
Personal: http://www.desperado.com



Posted By: LogSat
Date Posted: 28 May 2005 at 11:08am
Fred,

Could you email us at support@logsat.com your SpamFilter.ini file and a couple of SpamFilter's activity logfiles so we can try to see what's going on? Can you please also let us know what version of SpamFilter you're using?

In the meantime, what database platform are you using? Could you check to see if the database is able to handle the incoming load of emails without problems? We've seen *one* case where MySQL was freezing when quarantining large emails, and the MySQL ODBC driver that SpamFilter used, in turn, was also freezing the connections.


-------------
Roberto Franceschetti

http://www.logsat.com - LogSat Software

http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP


Posted By: Guests
Date Posted: 29 May 2005 at 1:37am

I'm running the latest build, v 2.5.2.457. I had upgraded to v 2.5.1.441 when we first began experiencing issues with concurrent sessions maxing out and bogging down the cpu. At first I thought maybe we were getting SMTP flooded or something. I boosted our max sessions up and configured two server boxes with spamfilter into a load balancing cluster in hopes this would resolve the issue. Then we began seeing both machines max out their cpus and max sessions. So I went back to a single, faster server box with v 2.5.2.457, assuming that perhaps it was a cpu capacity issue.

We are currently using the default Access database and I'm wondering if that could be the issue now. I'm also thinking about flushing the database to see if that may resolve this issue. The current size of the quarantine database is 449MB.

I'll email my .ini file and some of the log files to support tomorrow or Monday when I get a chance.




Posted By: LogSat
Date Posted: 29 May 2005 at 10:54am
Fred,

You *definitely* do not want to use MS Access for that kind of traffic. Access was designed for single-user applications. It is able to handle multiple users, but only to a certain extent, and performance is very bad. It is not designed for high-usage applications. We include support for it in SpamFilter so users can easily test their implementations, and possibly use it in low-volume installations, but we do not recommend its use in live environments with multiple concurrent incoming connections as in your case.


-------------
Roberto Franceschetti

http://www.logsat.com - LogSat Software

http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP


Posted By: Guests
Date Posted: 31 May 2005 at 8:01am

Ok, that may explain the 100% cpu usage.  I've always noticed spikes in the cpu whenever traffic was coming in.

In watching my activity though, it seems like it's the same emails that keep coming back and building up the concurrent sessions to max.  It's as if the sending mail server(s) never realize that the email was actually sent.   This is happening intermittently with some emails we are receiving and is causing some of our end users to receive multiple copies of the same email.  I'm sending you a copy of some of my logfiles and my ini file now to see if there's something else involved here besides the access database.

Thanks for all of your feedback.



Posted By: vrspock
Date Posted: 31 May 2005 at 8:58pm
Today I wiped the corpus database and wiped the quarantine database. I've confirmed with some of my end users that senders are getting NDRs from their mail servers despite the fact that their email is being delivered to the recipient's mailbox. This may explain why users are getting multiple copies of the same emails.

I activated the connections grid and the sessions that appear "hung" are sitting at the RCPT TO status. When doing an SMTP debug of one of these sessions, it seems to send the first few SMTP commands followed by the message body, then just sits there indefinitely until the spamfilter service is restarted.

As I said before, this suddenly became an issue just a couple of weeks ago.  I'm at a loss as to what could be going on.  Going to try to reboot our firewall just to see if there is something weird going on with it.


Posted By: JimMeredith
Date Posted: 01 June 2005 at 12:31am

Originally posted by Fred Dickey:

I'm trying to figure out an issue I've been having with spamfilter the past couple of weeks...

About two weeks ago... did you add the "German Spam" subject lines to your keywords filter around that time? Or do you make regular additions to your keywords list? Either way, this thread might apply: http://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=5093

 



Posted By: vrspock
Date Posted: 01 June 2005 at 9:51am
Thanks for the post. I'll take a very careful look at our keyword list. Most of our keywords are URLs of known spam, so the SURBL may be an answer to significantly reducing the size of our keyword list and thus making our spam filter more efficient. We will have to experiment with it to find the right balance.

I think I fixed the hanging sessions issue...at least, with a workaround for now. It seems to be the same "from" addresses that were hanging with multiple sessions all the time, so I added them to the exclfrom white list to see what that would do and it seems to have worked. No more hung sessions from those people...at least, no more 100+ sessions all sitting at the RCPT TO status.

Just 2 to 8 simultaneous sessions at any given time...whew...back to normal....I think. I would still like to track down the cause of this issue, as I'm sure I'll run into more sessions that I will have to apply the same workaround to.


 



Posted By: vrspock
Date Posted: 03 June 2005 at 12:50pm
Haven't found anything obvious in the keywords filter that may be causing this. Sessions seem to be hanging at about the time it checks the corpus database; however, wiping that database didn't fix the issue.

I'm still running into sessions hanging intermittently and the only temporary fix seems to be to white list the from address and restart the spam filter service so that the next time their mail server attempts to resend the email it passes without incident. Some of our users are still sporadically getting duplicate emails due to this issue.




Posted By: vrspock
Date Posted: 03 June 2005 at 11:04pm

I found something in the white lists tonight that didn't look too kosher. Not sure if it has anything to do with my issue or not, but it may have shown up about the time this issue started. In my autowhitelistforcedelivery.txt file was an entry for "System Administrator <>|nospam@v-sources.com". This is obviously not a valid address, so I'm wondering if it may be what has been causing some smtp sessions to hang like they have been.

I'll continue to monitor my server and post an update if that fixed it.



Posted By: Guests
Date Posted: 13 July 2005 at 10:36am

Any official way to fix this?

We have 4 different servers that have been running v 2.5.1.441 since it was released, with no problems since install. Suddenly yesterday, at about the same time, all 4 started freezing connections in the "rcpt to" state and maxing out the cpu as described above.

Removed all filters (keyword, blocked ips, etc.) one by one - same

Removed the use of Quarantine db - same

If I enter the addresses in "Excluded from emails" ( as vrspock states above ) it does allow those to pass correctly. 

This is not happening on all email, some aol.com and some list servers in particular.

Ideas?

Thanks,

Marcus



Posted By: LogSat
Date Posted: 13 July 2005 at 4:05pm
Marcus,

To date we have not been able to reproduce the problem. Any info you may have can be useful. In your case we can probably use:

Copies of the emails that made it through after you added them to the whitelist, if you have them; that would be of great help (we'd need the original email, headers and body)
One of SpamFilter's activity logs for a day this occurred
The email addresses you think are causing the problem
Your SpamFilter.ini file and all your local blacklist/whitelist and keyword files

Could you zip everything and email it to us at support_at_logsat.com?


-------------
Roberto Franceschetti

http://www.logsat.com - LogSat Software

http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP


Posted By: Guests
Date Posted: 13 July 2005 at 4:55pm

I will gather as much data as i can and email to you.

Marcus



Posted By: Dan B
Date Posted: 14 July 2005 at 11:27am

A few weeks ago we had the same thing, but only on 2 of the 4 servers. The 2 that were having the issue serve our primary domain, which receives a couple hundred thousand per day on each server. While the CPU was pegged at 100%, memory usage was 200 MB plus. The only way I was able to get it back to normal was to set IdleDisconnectMinutesTimeout=5. Within minutes it went back to normal. I have left the setting at 5 minutes and we haven't had any issues since. We are using MySQL with 4 SF servers.

Thanks,
Dan B

 



Posted By: Guests
Date Posted: 14 July 2005 at 1:33pm

Same problem here with 2.5.1.441 version.

I get only 1-5 concurrent connections and around 10,000 mails per day. I'm running SpamFilter on Win2003 on a Dual Xeon 3.06 GHz with 2 GB RAM.

SpamFilter is configured to use an MSSQL database, and uses 50-90% CPU!!!

...until I figured out what consumes so much CPU - SpamFilter's AntiVirus. I removed AntiVirus and CPU consumption dropped down to a normal 1-9%.



Posted By: Guests
Date Posted: 14 July 2005 at 1:58pm

Dan

I have tried as low as IdleDisconnectMinutesTimeout=3 and it doesn't seem to have any effect.

Logsat

I sent you some data via email 7/13 10:30pm cst and 7/14 12:15pm cst from Site-1M and Site-1C respectively.

After a complete redownload, reinstall and .ini reconfiguration at Site-1M, the problem has not recurred; same for Site-1B and Site-1A. The Qdb is not active at these sites currently. Site-1T has shown no instances of the problem and receives the most email and spam. Site-1J has shown no instances of the problem. Site-1C is a completely different story: I have done a complete redownload, reinstall and .ini reconfiguration, both without and with the Qdb, and the problem keeps recurring. Lots of info in the email.

Marcus



Posted By: vrspock
Date Posted: 14 July 2005 at 2:08pm
I haven't had any more issues with this since I removed that keyword entry that appeared to be causing the issue.


Posted By: LogSat
Date Posted: 14 July 2005 at 11:38pm
With Marcus we found a specific keyword RegEx that was causing 100% CPU with a specific email's body. We're trying to determine the cause of the issue and will hopefully have a fix soon now that the problem is reproducible.
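
For readers wondering how a single expression plus a single message body can pin a CPU, the sketch below shows one well-known mechanism, catastrophic backtracking. This is an assumption for illustration: the thread does not publish the keyword expression involved, and the underlying problem is described only as a bug in the RegEx processor, which may be something else entirely. The patterns `NESTED` and `FLAT` and the helper `time_search` are made up for this example and use Python's `re` module, not SpamFilter's engine.

```python
import re
import time

# Hypothetical patterns, for illustration only.
NESTED = re.compile(r"^(a+)+$")  # nested quantifier: many ways to split a run of "a"
FLAT = re.compile(r"^a+$")       # accepts the same strings without the nesting


def time_search(pattern, text):
    """Run one search and return (matched, elapsed_seconds)."""
    start = time.perf_counter()
    matched = pattern.search(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    for n in (10, 14, 18):
        # A body that almost matches and then fails on the last character
        # forces a backtracking engine to try every split of the "a" run.
        body = "a" * n + "!"
        _, nested_secs = time_search(NESTED, body)
        _, flat_secs = time_search(FLAT, body)
        print(f"n={n:2d}  nested={nested_secs:.4f}s  flat={flat_secs:.6f}s")
```

Both patterns give the same verdict on every input; only the amount of work differs, and for the nested form it roughly doubles with each extra character of a near-miss body. A message that happens to contain such a near miss can therefore hold one session at full CPU while all other mail passes normally, which is at least consistent with the symptoms reported in this thread.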

-------------
Roberto Franceschetti

http://www.logsat.com - LogSat Software

http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP


Posted By: LogSat
Date Posted: 17 July 2005 at 6:20pm
As mentioned before, thanks to Marcus' email samples and help we were able to find a bug in the RegEx processor that caused the CPU to peak at 100%. We've posted a fix for it with build 2.6.3.473, now available in the registered user area. While the build appears very stable, it has not received all the testing we would have liked to perform before releasing it. However, due to the seriousness of the problem, we decided to make it available now for users who wish to deploy it immediately.

-------------
Roberto Franceschetti

http://www.logsat.com - LogSat Software

http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP


Posted By: Guests
Date Posted: 18 July 2005 at 4:46pm

Please let me correct one of my statements above.  Evidently I did not (due to old-age disease) remove my keyword file.  I found the keyword problem about the same time Roberto did.

Thanks to Roberto and the team at LogSat.  Excellent support of your software. 

Marcus



