Too many connections. Disconnecting
swaber
Newbie Joined: 21 February 2006 Location: United States Status: Offline Points: 15 |
Posted: 09 October 2006 at 8:06pm |
I just updated our system to version 3.1.3.597 last Saturday, 9/30/06, and have been running into an issue ever since. Our server runs fine for a day or two, then drops into a mode where it rejects all connections with “Too many connections” and requires a restart of the service to correct. While it is in this state, the Connections tab shows no current connections in the table, but the system reports 25. Our Max connections settings are the same as previously configured under 2.7.1.515.

Log file sample:
10/06/06 06:52:18:614 -- (8432) Connection from: 218.15.1.19 - Originating country : China
|
Scott Waber, MCSE, CCNP
Systems Administration Specialist City of Las Vegas |
|
jerbo128
Senior Member Joined: 06 March 2006 Status: Offline Points: 178 |
Scott, take a look here:
http://www.logsat.com/spamfilter/forums2/5637?TID=5637&PN=1
I think it's the same issue that we are having; ours just does not reach the max.
jerbo128
|
swaber
Newbie Joined: 21 February 2006 Location: United States Status: Offline Points: 15 |
Thanks. I had seen your post; your issue looked similar but seemed to be different. I'll confirm on the next occurrence.
|
Scott Waber, MCSE, CCNP
Systems Administration Specialist City of Las Vegas |
|
swaber
Newbie Joined: 21 February 2006 Location: United States Status: Offline Points: 15 |
I checked the system after it had been up for about a day and found what appeared to be 10 stuck connections, then checked it the next day and found about 18. So it appears that this is the same issue: we slowly lose available connections until none are left.
|
Scott Waber, MCSE, CCNP
Systems Administration Specialist City of Las Vegas |
|
LogSat
Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4104 |
Scott,
Could you please email us SpamFilter's activity logfile for the day, a screenshot of your "Connections" tab, and the output of the netstat -n command from a DOS prompt (screenshot and netstat command should be performed at the same time if possible)? |
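For anyone comparing the "Connections" tab against what the operating system actually sees, here is a minimal Python sketch that tallies TCP states from saved netstat -n output. It assumes the usual four-column Windows layout (Proto, Local Address, Foreign Address, State) and that SpamFilter listens on port 25; the function name and the sample addresses are made up for illustration.

```python
from collections import Counter


def count_smtp_states(netstat_output, port=25):
    """Tally TCP connection states for connections whose local port is `port`."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows are assumed to have exactly four columns:
        # Proto, Local Address, Foreign Address, State
        if len(fields) != 4 or fields[0].upper() != "TCP":
            continue
        local_port = fields[1].rsplit(":", 1)[-1]
        if local_port == str(port):
            states[fields[3]] += 1
    return states


# Hypothetical sample using documentation-only IP ranges.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    192.0.2.10:25          198.51.100.7:4312      ESTABLISHED
  TCP    192.0.2.10:25          198.51.100.8:1187      ESTABLISHED
  TCP    192.0.2.10:25          203.0.113.5:2950       CLOSE_WAIT
  TCP    192.0.2.10:3389        203.0.113.9:50122      ESTABLISHED
"""

if __name__ == "__main__":
    for state, count in count_smtp_states(SAMPLE).items():
        print(state, count)
```

If the total here is far below the number SpamFilter reports, the gap would be consistent with "stuck" entries in the program's own counter rather than real open sockets.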
|
Stephane
Newbie Joined: 16 October 2006 Status: Offline Points: 5 |
Hi,
I started to get this same error. It happened 2 days ago, and again this morning. It looks like it is related to a timeout with the SFDB:
10/16/06 03:07:34:460 -- (92564) HTTP Error in SFDBUploadIP check:HTTP/1.1 500 Server Error
10/18/06 10:09:29:159 -- (173708) HTTP Error in DoSFDBCheck:Socket Error # 10060 -- Connection timed out.
As soon as I get a timeout or error with the SFDB, connections stay stuck, and eventually I get the "Too many connections" error and no more email comes through.
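To check whether stuck connections line up with SFDB errors, one option is to count the "HTTP Error in ..." entries per day in the activity log. The sketch below is only an illustration: the regular expression is inferred from the log lines quoted in this thread, not from any documented log format, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from lines quoted in this thread, for example:
# 10/18/06 10:09:29:159 -- (173708) HTTP Error in DoSFDBCheck:Socket Error # 10060 ...
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}:\d{3}) -- "
    r"\((?P<thread>\d+)\) HTTP Error in (?P<call>\w+)(?: check)?:(?P<detail>.*)$"
)


def sfdb_errors_per_day(log_lines):
    """Count 'HTTP Error in <call>' entries, keyed by (date, call)."""
    counts = Counter()
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("date"), match.group("call"))] += 1
    return counts


if __name__ == "__main__":
    lines = [
        "10/06/06 06:52:18:614 -- (8432) Connection from: 218.15.1.19 - Originating country : China",
        "10/16/06 03:07:34:460 -- (92564) HTTP Error in SFDBUploadIP check:HTTP/1.1 500 Server Error",
        "10/18/06 10:09:29:159 -- (173708) HTTP Error in DoSFDBCheck:Socket Error # 10060 -- Connection timed out.",
    ]
    for key, count in sorted(sfdb_errors_per_day(lines).items()):
        print(key, count)
```

Days with a spike in these entries can then be compared against the days the connection count climbed.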
|
mikek
Senior Member Joined: 22 February 2005 Location: Switzerland Status: Offline Points: 133 |
Stephane could have a point here - I reached "max connections" this afternoon as well and after restarting the service I see the number of connections "hanging" at the "RCPT TO" status rising constantly...
Running 3.1.3.601 registered |
|
swaber
Newbie Joined: 21 February 2006 Location: United States Status: Offline Points: 15 |
Yes, I too have seen SFDB-related errors in my log. I discounted them initially since they seem to occur on a regular basis. Over the last couple of days I have been actively watching the server, trying to collect the data tech support asked for, and have developed a new theory. Two days ago I disabled SFDB out of frustration with major ISPs being blocked, and I have not lost one connection since. My guess is that the problem is related to SFDB: our previous version (2.7.1.515) did not have SFDB, and we never had this issue.
|
Scott Waber, MCSE, CCNP
Systems Administration Specialist City of Las Vegas |
|
WebGuyz
Senior Member Joined: 09 May 2005 Location: United States Status: Offline Points: 348 |
Starting to suspect SFDB overload myself. Seeing a lot of SFDB hanging as well.... Any chance the bad guyz are attacking the SFDB server(s)?
Edited by WebGuyz |
|
http://www.webguyz.net
|
|
Stephane
Newbie Joined: 16 October 2006 Status: Offline Points: 5 |
Hi,
Again after lunch:
10/18/06 12:26:56:724 -- (102792) HTTP Error in SFDBUploadIP check:Socket Error # 10054 -- Connection reset by peer.
|
Stephane
Newbie Joined: 16 October 2006 Status: Offline Points: 5 |
Sorry, forgot to mention: my version is 3.1.3.598
|
|
LogSat
Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4104 |
We're trying to figure out what the relationship is, but it is indeed a strange coincidence that we're looking into.
This morning, between 8am and 1pm EST, we experienced severe slowdowns in our internet connection (it was an internal problem at our ISP; no hackers were attacking the SFDB, so no worries there), and from all your reports, the issues do seem related. We're looking into HTTP timeouts, and are trying to replicate the scenario in our labs. Hopefully we'll have updates soon.
|
LogSat
Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4104 |
SpamFilter has failsafes in place so that mail processing continues even if the SFDB service is not available.
We did find a problem when the SFDB web services are *slooow* rather than unavailable. In this case, the HTTP request timeouts are too long when reporting a spammer IP to the SFDB. The reporting occurs upon disconnect, and this can affect the counter that keeps track of the current connections. The counter misses are very rare, and were thus hard to locate. We were able to replicate this by placing SpamFilter behind a 56K modem and hitting it with 200 concurrent connections. Here we were finally able to reproduce the "stuck connections" problem!
We're testing build 3.1.603 with a fix. So far it looks fine, but it will need to be tested quite a bit more to ensure there are no other issues. If anyone is still suffering from major issues with the "stuck connections", we've made the build available on the website.
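As a general illustration of this class of bug (not SpamFilter's actual code or the actual fix, neither of which is shown in this thread), here is a minimal Python sketch of one way to keep a connection counter accurate when a report made at disconnect time can be slow or fail: release the slot first, then make the report a bounded, best-effort call. Every name here (ConnectionCounter, handle_disconnect, report_ip, the 5-second timeout) is hypothetical.

```python
import threading


class ConnectionCounter:
    """Thread-safe count of current connections with an upper limit."""

    def __init__(self, max_connections):
        self._lock = threading.Lock()
        self._current = 0
        self.max_connections = max_connections

    def try_acquire(self):
        """Reserve a slot; return False if the limit is already reached."""
        with self._lock:
            if self._current >= self.max_connections:
                return False
            self._current += 1
            return True

    def release(self):
        """Free a slot when a connection ends."""
        with self._lock:
            self._current -= 1

    @property
    def current(self):
        with self._lock:
            return self._current


def handle_disconnect(counter, report_ip, ip, timeout_seconds=5.0):
    """Free the slot first, then attempt a best-effort, time-limited report.

    `report_ip` is a hypothetical callable standing in for an HTTP upload.
    Returns the exception raised by the report, or None on success.
    """
    counter.release()  # the count is corrected no matter what happens next
    try:
        report_ip(ip, timeout=timeout_seconds)
    except Exception as exc:  # a slow or failing service must not leak a slot
        return exc
    return None
```

With this ordering, a report that times out costs only the short timeout and never leaves the counter above the number of real connections.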
|
mikek
Senior Member Joined: 22 February 2005 Location: Switzerland Status: Offline Points: 133 |
Build 603 looks very promising so far - connection stats have been accurate for the last 3 hours...
|
|
WebGuyz
Senior Member Joined: 09 May 2005 Location: United States Status: Offline Points: 348 |
We've run 12k msgs thru on .603 so far this morning and it looks great. <fingers crossed>
|
|
http://www.webguyz.net
|
|
kfries
Newbie Joined: 16 August 2006 Status: Offline Points: 7 |
Any updates on the .603 version? I have been having this issue as well but am hesitant to install .603. Those of you that have it running, has it been stable so far?
|
|
jerbo128
Senior Member Joined: 06 March 2006 Status: Offline Points: 178 |
I have had no problems with 603. We've processed about 75K messages.
|
|
BigDog
Newbie Joined: 26 January 2005 Location: United States Status: Offline Points: 11 |
Been having bad problems with SF just as described; it seems that SFDB is creating problems. SF has been down a lot in the last couple of months. It goes for hours, if not noticed, accepting no messages. Turning the SFDB option off appears to make it stable.
Running 3.1.3.597 since it came out; have not tried any pre-release as there is a Barracuda sitting on the workbench awaiting implementation. :(
Here are some excerpts from the logs....
10/25/06 16:50:41:446 -- Starting to process queue directory...
10/25/06 16:50:41:493 -- (24572) Blacklist cache - starting cleanup
10/25/06 16:50:41:524 -- (30004) HTTP Error in GetSFDBStats:Access violation at address 7C81D150 in module 'ntdll.dll'. Read of address FFBE001F
10/25/06 16:50:41:571 -- (24572) Blacklist cache - removed IP 190.45.235.212 from limbo during cleanup
10/25/06 16:50:41:618 -- (29492) Sending email from parmentieuoralie@hydranautics.com to sjw@mydomain.com--
10/25/06 16:50:41:649 -- (24572) Blacklist cache - removed IP 202.96.114.27 from limbo during cleanup
10/25/06 16:50:41:696 -- (29492) Exception - Access Violation Access violation at address 7C81D150 in module 'ntdll.dll'. Read of address FFBE001F
10/25/06 16:50:41:743 -- (31452) Sending email from Hortencia@aallonlineprofits.net to bjbrooks@mydomain.com --
10/25/06 16:50:41:774 -- (24572) Blacklist cache - removed IP 209.16.28.247 from limbo during cleanup
10/25/06 16:50:41:821 -- (31452) Exception - Access Violation Access violation at address 7C81D150 in module 'ntdll.dll'. Read of address FFBE001F
10/25/06 16:50:41:852 -- (24572) Blacklist cache - removed IP 216.150.25.108 from limbo during cleanup
10/25/06 16:50:41:899 -- (32284) Sending email from Hortencia@aallonlineprofits.net to cje@mydomain.com --
10/25/06 16:50:41:946 -- (24572) Blacklist cache - removed IP 217.76.36.51 from limbo during cleanup
10/25/06 16:50:41:977 -- (27712) Sending email from Hortencia@aallonlineprofits.net to jml@mydomain.com --
10/25/06 16:50:42:024 -- (32284) Exception - Access Violation Access violation at address 7C81D150 in module 'ntdll.dll'. Read of address FFBE001F
10/25/06 16:50:42:071 -- (24572) Blacklist cache - removed IP 221.162.107.163 from limbo during cleanup
10/25/06 16:50:42:102 -- (27712) Exception - Access Violation Access violation at address 7C81D150 in module 'ntdll.dll'. Read of address FFBE001F
10/25/06 16:50:42:149 -- (24572) Blacklist cache - removed IP 222.154.30.23 from limbo during cleanup
10/25/06 16:50:42:196 -- (29544) Sending email from Hortencia@aallonlineprofits.net to ljp@mydomain.com --
10/25/06 16:50:42:227 -- (24572) Blacklist cache - removed IP 222.37.134.95 from limbo during cleanup
10/25/06 16:50:42:274 -- (28576) Sending email from Hortencia@aallonlineprofits.net to blt@mydomain.com --
10/25/06 16:50:42:321 -- (29544) Exception - Access Violation Access violation at address 7C81D150 in module 'ntdll.dll'. Read of address FFBE001F
10/25/06 16:50:42:352 -- (24572) Blacklist cache - removed IP 24.123.22.198 from limbo during cleanup
10/25/06 16:50:42:399 -- (28576) Exception - Access Violation Access violation at address 7C81D150 in module 'ntdll.dll'. Read of address FFBE001F
10/25/06 16:50:42:446 -- (24572) Blacklist cache - removed IP 24.123.28.53 from limbo during cleanup
10/25/06 16:56:41:539 -- (11696) HTTP Error in GetSFDBStats:Access violation at address 7C81D150 in module 'ntdll.dll'. Read of address FFBE001F
10/25/06 18:07:03:117 -- Bayesian Thread is not running - starting...
10/25/06 18:07:03:149 -- (4524) BayesianThread starting
10/25/06 18:07:03:196 -- (4524) TBayesianThread - Begin LoadFromFile for corpus.db (db.dat)
10/25/06 18:07:03:305 -- (4524) TBayesianThread - LoadFromFile for Corpus.db - copied db.dat -> IndA142.tmp
10/25/06 18:07:03:399 -- (4524) TBayesianThread - LoadFromFile for Corpus.db - copied db.dat.prb -> IndA143.tmp
10/25/06 18:07:03:446 -- (4524) TBayesianThread - LoadFromFile for Corpus.db - setting Buffer size to 20930398
10/25/06 18:07:03:477 -- (4524) TBayesianThread - LoadFromFile for Corpus.db - Reading Buffer in mem
10/25/06 18:07:03:571 -- (4524) TBayesianThread - LoadFromFile for Corpus.db - loaded files in memory - IndA142.tmp
10/25/06 18:07:03:649 -- (4524) TBayesianThread - LoadFromFile for Corpus.db - loaded files in memory - IndA143.tmp
10/25/06 18:07:06:117 -- (4524) TBayesianThread - End LoadFromFile for corpus.db (db.dat) (2670)
10/25/06 18:07:41:446 -- (9384) Blacklist cache - starting cleanup
10/25/06 18:08:41:446 -- Starting to process queue directory...
10/25/06 18:08:41:492 -- (4848) HTTP Error in GetSFDBStats:Cannot allocate socket.
10/25/06 18:08:41:524 -- (8660) Blacklist cache - starting cleanup
10/25/06 18:09:41:446 -- (13180) Blacklist cache - starting cleanup
10/25/06 18:10:41:446 -- (16824) Blacklist cache - starting cleanup
10/25/06 18:10:41:492 -- (18148) HTTP Error in GetSFDBStats:Cannot allocate socket.
10/25/06 18:11:41:446 -- Starting to process queue directory...
10/25/06 18:11:41:477 -- (17376) Blacklist cache - starting cleanup
10/25/06 18:12:41:446 -- (23116) Blacklist cache - starting cleanup
10/25/06 18:12:41:492 -- (22684) HTTP Error in GetSFDBStats:Cannot allocate socket.
10/25/06 18:13:41:446 -- (23664) Blacklist cache - starting cleanup
Edited by BigDog |
|