RegEx filter
Erik Reed
Guest Group
Posted: 27 June 2003 at 12:01pm
I tried using the "magic" block suggested previously in this forum:
(<[!--]*[a-zA-Z0-9]{11,})
Immediately upon applying it, EVERY email that was received was blocked... What did I do wrong here? And no, the messages were perfectly valid emails. I immediately removed this line. Upon inspecting the messages I noticed a lot of !--fjkdfdfkjdjf-like stuff in the headers, but none preceded by a <. Most of the messages were from AOL accounts, which appear to add a TON of junk to the headers... Any help would be appreciated, as I want to eliminate the junk with invisible (invalid) HTML tags... Thanks
Desperado
Senior Member Joined: 27 January 2005 Location: United States Status: Offline Points: 1143
Erik, If you actually are using the following:
(<[!--]*[a-zA-Z0-9]{11,})
it should work... However, it does produce some "false positives", though not on ALL messages. An earlier posting, and I cannot remember by whom, made the following very subtle change:
(<[!--]+[a-zA-Z0-9]{11,})
This is the one I am using, and I have not seen or received comments about ANY "false positives". Double-check the expression you actually entered (and change the "*" to "+") and see if it works then. Dan S.
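A minimal Python sketch of why the "*" and "+" variants can behave so differently. It is only an illustration, not the SpamFilter engine: the sample strings are made up, and the character class is written [!-] (the intended "!" and "-") because a doubled hyphen inside a class is ambiguous in some regex engines. With "*", the class may match zero characters, so any < followed by 11 or more alphanumerics is a hit, which would include a typical angle-bracketed Message-ID header. That may be why every message was blocked, though the thread does not confirm the cause.

```python
import re

# "*" lets the "!"/"-" run be empty; "+" requires at least one of them.
STAR = re.compile(r"(<[!-]*[a-zA-Z0-9]{11,})")
PLUS = re.compile(r"(<[!-]+[a-zA-Z0-9]{11,})")

# Made-up sample lines (assumptions, not taken from real traffic).
samples = {
    "message_id_header": "Message-ID: <20030627120100abc@example.com>",
    "hidden_comment": "Vi<!--fjkdfdfkjdjf-->agra",
    "plain_text": "Hello, see you at lunch.",
}


def hits(pattern):
    """Return which samples the pattern would flag."""
    return {name: bool(pattern.search(text)) for name, text in samples.items()}


star_hits = hits(STAR)
plus_hits = hits(PLUS)

for name in samples:
    print(f"{name:18} star={star_hits[name]!s:5} plus={plus_hits[name]}")
```

In this sketch the "+" version still flags the hidden comment but leaves the ordinary header alone.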
Erik Reed
Guest Group
Thanks. I just applied the "subtle" change, and it seems to be working...
Dan B
Senior Member Joined: 09 February 2005 Location: United States Status: Offline Points: 105
We had that in our keywords black list but I had to remove it. It was catching messages from anyone who was using IncrediMail. IncrediMail inserts a comment in the body of the message:
<!--IncrediMail-->
It triggers the filter because "IncrediMail" is 11 characters long. It's too bad that we can't do logic within the black list, like if/then/else statements: if it finds <!--IncrediMail--> in a message, bypass the regex filter. If anyone has a fix for this, please send it my way. Thanks,
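One possible way to get the "bypass" effect inside the expression itself is a negative lookahead. The Python sketch below is hypothetical: it assumes the filter's regex engine supports (?!...) lookaheads, which the thread does not confirm, and it writes the character class as [!-] for the intended "!" and "-". The sample strings are made up.

```python
import re

# Expression from the thread (class written [!-] for the intended "!" and "-").
ORIGINAL = re.compile(r"(<[!-]+[a-zA-Z0-9]{11,})")

# Hypothetical variant: the lookahead refuses the exact word "IncrediMail".
# Assumption: the target regex engine supports (?!...) lookaheads.
EXEMPT = re.compile(r"(<[!-]+(?!IncrediMail\b)[a-zA-Z0-9]{11,})")

# Made-up sample bodies.
incredimail = "<BODY><!--IncrediMail--></BODY>"
spam = "Vi<!--fjkdfdfkjdjf-->agra"

results = {
    "original_incredimail": bool(ORIGINAL.search(incredimail)),
    "original_spam": bool(ORIGINAL.search(spam)),
    "exempt_incredimail": bool(EXEMPT.search(incredimail)),
    "exempt_spam": bool(EXEMPT.search(spam)),
}

for name, flagged in results.items():
    print(f"{name:22} flagged={flagged}")
```

The exemption is deliberately narrow: only the exact word IncrediMail is skipped, so a spammer could still defeat it by using that word as the comment text.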
Desperado
Senior Member Joined: 27 January 2005 Location: United States Status: Offline Points: 1143
2 things ... I guess you could change the 11 to a 12, and I do not think I am getting false positives on the expression. Got a good way to search a 2 GB DB for IncrediMail? A way that I won't grow old waiting for? Dan S.
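A quick Python sketch of what changing the 11 to a 12 does. The helper name and the sample strings are made up for illustration, and the character class is written [!-] for the intended "!" and "-". Since "IncrediMail" is exactly 11 characters, a minimum of 12 no longer matches it, at the cost of also letting 11-character junk through.

```python
import re


def build(min_len):
    """Build the thread's expression with a configurable minimum run length."""
    return re.compile(r"(<[!-]+[a-zA-Z0-9]{%d,})" % min_len)


# Made-up samples: the IncrediMail marker plus 12- and 11-character junk.
samples = {
    "incredimail": "<!--IncrediMail-->",
    "junk_12": "<!--fjkdfdfkjdjf-->",
    "junk_11": "<!--abcdefghijk-->",
}


def hits(min_len):
    pattern = build(min_len)
    return {name: bool(pattern.search(text)) for name, text in samples.items()}


hits_11 = hits(11)
hits_12 = hits(12)

print("min 11:", hits_11)
print("min 12:", hits_12)
```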
Dan B
Senior Member Joined: 09 February 2005 Location: United States Status: Offline Points: 105
Dan, Question for you. We are having problems with memory consumption. Here are the specs of our 4 servers.
4 servers are the same.
Database server
Every time the SF does a quarantine auto refresh, the CPU on that server spikes to 100%. The CPU spike lasts for about 4 minutes. I monitor the system resources, and while the CPU is at 100% the memory usage history starts to climb. After the CPU comes back to normal, the memory does not get released. I do not know what the quarantine auto-refresh interval is, but when it refreshes the CPU is again at 100% and the memory climbs more. This continues until there is no more memory available. At that point either the SF stops responding or it throws an application error on the screen.
The database has 2 days of quarantine data in it: about 366,000 records in tblquarantine and 296,000 records in tblmsgs.
We were using MySQL to handle the load until tblquarantine reached about 600,000 records. Once that was reached we started getting more table read & write locks. When that occurred, all the spamfilter servers just stopped responding since they couldn't insert or select records. We actually had to stop and restart the MySQL service to get the spamfilters back to normal. I prefer to use MySQL instead of MS SQL, not because of price, but because I feel that it's faster on select & write statements. Thanks,
LogSat
Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4104
Dan B, Dan S., The large memory consumption is mainly caused by the quarantine grid display. We're testing a new build internally with some enhancements, one of which is to clear the grid display, which releases all the memory used by it. I was not aware you had such large databases; this should greatly decrease RAM usage, possibly down to just a few dozen megs. We've also added the "0" option to disable the quarantine delete, as requested. The release notes for this build are the following: // New to VersionNumber = '1.2.0.171';
I'll be sending you both a private email with the link to the updated EXEs if you wish to test the build before we release it officially. Roberto Franceschetti
Desperado
Senior Member Joined: 27 January 2005 Location: United States Status: Offline Points: 1143
Dan, I have been up for a VERY long time. I will go into more detail after I get some rest, but I will address some of your questions. All this is PRIOR to the "Private Build".
>>>What database are you using?
>>>We are having problems with memory consumption.
My Primary Server sits on the same Ethernet segment as my DB server. Primary Specs: Secondary Server is 20 miles away in my home on a T1 Database server
>>>Every time the SF does a quarantine auto refresh the CPU on that server spikes 100%....
>>>The database has 2 days of quarantine data in it.
>>>About 366,000 records in the tblquarantine and 296,000 records in the tblmsgs.
>>>We have a scheduled task to delete the tblquarantine & tblmsgs records it's much faster deleting them than SF is doing it.
Again, I am not seeing this ... perhaps the higher RAM and perhaps the dual procs are helping. I have my delete interval set to (the default?) 60 minutes.
>>>We were using MySQL .....
As a side note, I am almost worried about how little memory this test build is using. I am still only at 15 MB. I am running about 10 MB on my secondary.
SOME KEY NOTES:
I have MS SQL set to use 1 GB of RAM, and NOT DYNAMIC but static.
Also, and this is really critical, the DB transaction logging is set to "SIMPLE".
I also do not allow my system PageFile to resize .... If I run out of memory ... well, I guess I crash, but I have sized it so that does not happen.
The data partition for the databases ABSOLUTELY CAN NOT USE NTFS COMPRESSION. Unless you like corrupt data and want SQL to constantly try to repair it.
My SQL "Maintenance plan" DOES NOT re-index or reorganize the data.
I have to get home now. More later. I asked Roberto to give you my e-mail address if you want to shout at me directly. Regards, Dan S.
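For anyone wondering what a scheduled purge like the one Dan B mentions might look like, here is a rough Python sketch. Almost everything in it is an assumption: only the table name tblquarantine comes from the thread, the id and received_at columns are hypothetical, and an in-memory SQLite database stands in for MS SQL or MySQL. It deletes old rows in small batches with a commit after each one, which may keep each transaction (and any locks it holds) short.

```python
import sqlite3


def purge_in_batches(conn, cutoff, batch_size):
    """Delete rows older than cutoff in small batches; return the batch count."""
    batches = 0
    while True:
        cur = conn.execute(
            "DELETE FROM tblquarantine WHERE id IN ("
            "SELECT id FROM tblquarantine WHERE received_at < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # end the transaction after every batch
        if cur.rowcount == 0:
            break
        batches += 1
    return batches


# Hypothetical schema and data: 100 rows with received_at values 0..99.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblquarantine (id INTEGER PRIMARY KEY, received_at INTEGER)"
)
conn.executemany(
    "INSERT INTO tblquarantine (received_at) VALUES (?)",
    [(stamp,) for stamp in range(100)],
)
conn.commit()

batches = purge_in_batches(conn, cutoff=60, batch_size=25)
remaining = conn.execute("SELECT COUNT(*) FROM tblquarantine").fetchone()[0]
print(f"batches={batches} remaining={remaining}")
```

The batch size here is arbitrary; on a real server it would need tuning against the actual load.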
Desperado
Senior Member Joined: 27 January 2005 Location: United States Status: Offline Points: 1143
Dan,
Here are my current (as of "now") DB statistics:
Dan S
Desperado
Senior Member Joined: 27 January 2005 Location: United States Status: Offline Points: 1143
Roberto, So far, the memory usage is VERY low, around 13 MB or so. I will let you know what happens when traffic picks up. The lack of quarantine information is "just what the Dr. ordered". I like that better. Question ... What effect will lowering or disabling the delete interval have? I am not aware of it causing any problem ... Regards, Dan S.
Desperado
Senior Member Joined: 27 January 2005 Location: United States Status: Offline Points: 1143
Roberto, Some stats:
SpamFilter Mem after 2 hours running ~ 12MB
SpamFilter Mem after refreshing DB ~ 255MB
SpamFilter Mem after clearing Data Grid ~ 12MB
The more I think about it, the more I feel that 99% of the problems Dan is having are directly related to high memory usage. I didn't see it because I have 4 times as much RAM. Lookin' good. Dan S.
LogSat
Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4104
Yes, 10MB-20MB usage is much more like it! Again, I was not aware of the DB size, and didn't realize the memory consumption it could cause. Our experience with the Quarantine Delete Interval is the following. The larger the interval, the more emails (records) will be deleted from the database when it kicks off. The more records, the slower and more memory intensive this operation is, both for SpamFilter and for the database. In SpamFilter most "things" are performed in separate threads, so it should not affect normal operations, but the CPU does go high when this is done. Lowering the interval makes this process happen more often, but also makes each run less resource intensive. We found that 60 minutes is a good balance for us. Roberto F.
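To put rough numbers on that trade-off, here is a back-of-the-envelope Python sketch. It uses the figures Dan B reported earlier in the thread (about 366,000 tblquarantine records covering 2 days) and assumes mail arrives at a steady rate and the table is in steady state, so each run deletes roughly what arrived during one interval. Real traffic is burstier, so treat the results as ballpark only.

```python
QUARANTINE_ROWS = 366_000      # tblquarantine rows reported in the thread
WINDOW_MINUTES = 2 * 24 * 60   # "2 days of quarantine data"


def rows_per_run(interval_minutes):
    """Approximate rows deleted each time the job fires (steady-rate assumption)."""
    return QUARANTINE_ROWS * interval_minutes // WINDOW_MINUTES


# Compare a short, the default, and a long delete interval.
estimates = {minutes: rows_per_run(minutes) for minutes in (15, 60, 240)}

for minutes, rows in estimates.items():
    print(f"every {minutes:3d} min -> about {rows:,} rows per run")
```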
George
Guest Group
Dan,
Desperado
Senior Member Joined: 27 January 2005 Location: United States Status: Offline Points: 1143
George, It is a bunch of SQL queries running from ASP. I just came in from mowing to get H2O ... I can provide the information when I come in for real ... 5 acres to mow and I need to "Hog" another 2. Takes a while, so be patient. Much to my dismay, I lost the cooler part of the day servicing my side bar. However, it IS good to get away from computers and networks for a bit! Dan
Alan
Guest Group
Yes please do share your coding with the rest, Dan.
Desperado
Senior Member Joined: 27 January 2005 Location: United States Status: Offline Points: 1143
All, A new "Thread" should be started for this ... it has nothing to do with RegEx ... we kinda drifted a little! The ASP code I have is specific to my own menu system, but the SQL queries are "general". Do you want just the SQL queries or a copy of the 2 ASP pages I have that use them? I have 2 pages because one just gets the general stats and is very fast. The second one adds the "spread" of the DNSBLs that I use, and with around 350K messages in my database it takes just under a minute to get 4 of them, so I only do that when I am trying to see if my order of BLs is optimal (whatever optimal is). Dan S
Alan
Guest Group
Why not supply both so others can simply get what they need?