Dan S Keyword RegEx Update
Trinidad | Guest Group
Posted: 26 September 2003 at 9:11am
Your regex keywords are the best. Do you have any updates since your last posting?
Desperado | Senior Member | Joined: 27 January 2005 | Location: United States | Status: Offline | Points: 1143
Trinidad,

Hmmmm, good question. I have added some, removed some, changed some, and added some temporary stuff to stop a couple of floods we got. Below is a list of my "mainstream" keywords; by that I mean that I use these as a baseline for all the instances that I run. Some of these word wrap, so be careful.

((http|3dhttp)://.{0,26}(((%.+%))|@|:)[(\d|\w)])

I hope these don't "break" anything for you. I check these for false positives often and, as I stated in an earlier post, I do have to allow quite a few listservers in my whitelist because they do "bad" things in their content.

Here is my "Excluded From Addresses" list:

*@listproc.pcworld.com

I also have this single entry in my keyword whitelist to resolve an issue with PayPal. I used to have *@paypal.com in my from whitelist, but there are a boatload of spoofed PayPal addresses, so that opened up a big hole. The keyword whitelist solved that.

Here are my "Blocked From Addresses":

(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)

Hope this all helps.

Regards,
Dan S.
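For anyone who wants to try the two patterns from the post above before loading them into a filter, here is a minimal Python sketch. The patterns are copied verbatim; the function names, the sample strings, and the choice to compile case-insensitively are assumptions made for illustration, and the spam filter's own regex engine may behave differently from Python's `re` module.

```python
import re

# Patterns copied verbatim from the post. re.IGNORECASE is an assumption:
# the post does not say whether the filter matches case-insensitively.
URL_OBFUSCATION = re.compile(
    r"((http|3dhttp)://.{0,26}(((%.+%))|@|:)[(\d|\w)])", re.IGNORECASE
)
BLOCKED_FROM = re.compile(
    r"(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)", re.IGNORECASE
)


def hits_keyword(body):
    """Return True if the message body matches the URL keyword (hypothetical helper)."""
    return URL_OBFUSCATION.search(body) is not None


def is_blocked_sender(address):
    """Return True if the From address matches the blocked pattern (hypothetical helper)."""
    return BLOCKED_FROM.search(address) is not None


if __name__ == "__main__":
    for body in (
        "Click http://www.paypal.com@203.0.113.5/login",
        "See http://example.com/index.html",
        "Admin page at http://example.com:8080/",
    ):
        print(hits_keyword(body), body)
    for address in ("12345abc@hotmail.com", "john.smith@hotmail.com"):
        print(is_blocked_sender(address), address)
```

Note that the URL keyword also fires on an ordinary host-and-port URL such as `http://example.com:8080/`, because a `:` followed by a digit satisfies the last part of the pattern. That is the kind of case worth checking when hunting false positives.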
Trinidad | Guest Group
You're the man when it comes to RegEx. Thanks much.
Desperado | Senior Member
Trinidad,

Minor correction to 2 filters:

(<[!--]+[a-zA-Z0-9]{2}(-\->))

(<(!\-\- )+[a-zA-Z0-9]{1}(\x20[a-zA-Z0-9]{3,20}){3,5}(-\->))

Dan S.
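A rough way to see what the two corrected filters catch is to run them against a few sample strings. The sketch below uses Python's `re` module, which is an assumption (the filter's own engine may differ), and the names `SHORT_COMMENT`, `WORDY_COMMENT`, and `is_filler_comment` are made up for illustration. The patterns themselves are copied verbatim.

```python
import re
import warnings

with warnings.catch_warnings():
    # Python reads "[!--]" as the character range "!" through "-" and emits a
    # FutureWarning about a possible set difference. The range still contains
    # both "!" and "-", so the pattern matches "<!--" as intended.
    warnings.simplefilter("ignore", FutureWarning)
    SHORT_COMMENT = re.compile(r"(<[!--]+[a-zA-Z0-9]{2}(-\->))")

# "<!-- ", one character, then 3 to 5 space-separated "words" of 3 to 20
# characters, closed immediately by "-->".
WORDY_COMMENT = re.compile(
    r"(<(!\-\- )+[a-zA-Z0-9]{1}(\x20[a-zA-Z0-9]{3,20}){3,5}(-\->))"
)


def is_filler_comment(text):
    """Return True if either corrected filter matches (hypothetical helper)."""
    return bool(SHORT_COMMENT.search(text) or WORDY_COMMENT.search(text))


if __name__ == "__main__":
    samples = (
        "pr<!--qx-->ice",
        "<!-- z ojdjgnugps hkabc iqwe-->",
        "<!-- navigation menu starts here -->",
    )
    for sample in samples:
        print(is_filler_comment(sample), sample)
```

The third sample, an ordinary descriptive comment, is not matched by either filter: the first requires exactly two characters between `<!--` and `-->`, and the second requires a single character right after `<!-- `.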
Carl Giljam | Guest Group
The messages that slip through the filter now tend to be where the text has been broken up by a lot of comments, example:
mppa iofyydehhz pn
z ojdjgnugps hk i-->er - uer - uer - up to 36 ho
Carl Giljam | Guest Group
The messages that slip through the filter now tend to be ones where the text has been broken up by a lot of comments, example: (I'm leaving out the example this time; the forum truncated my previous message, so I'm trying to see if this works better. You will have to imagine a lot of nonsense comments) ...etcetera.

This doesn't seem to be caught by your current RegEx, although to the human eye it is very obvious junk (no matter what language you speak!). I'm not good at writing RegExes but have a few ideas. Comments, anyone?

1) Filter out on "nonsense" words in comments. Nonsense = 4 or more consecutive consonants.
2) Maybe change the above to 4 or more consecutive non-vowels (to filter out things like glk4zm2pq).
3) Same as 1) and 2) but for 3 or more consecutive vowels.
4) Same idea as all the above but applied only to the Subject of the email. This often contains nonsense words and I can see no good reason at all to allow them.
5) In all the above, it may be necessary to allow for more than 4 consonants (or 3 vowels) to get things like KPMG (for accountants) etc. through, but there should be some reasonable number, say 5-6, where the filter could kick in.
6) Filter out if number of chars in comments > number of non-comment chars, either over one single line or in the message as a whole.
7) Filter out if number of chars in comments is > X (in the message as a whole).
8) Filter out if there are any comments at all...

Well, can any of this be done in RegEx, and would it create any problems?
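Some of these ideas can be expressed as plain regexes (1, 2, 5), while the counting ideas (6, 7) need a little code around the regex. Here is a minimal Python sketch, assuming the message body is available as a string and that comments are HTML `<!-- ... -->` comments. All names, the run length of 5, and the 200-character limit are illustrative assumptions, not part of any existing filter.

```python
import re

# Captures the text inside each HTML comment.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)

# Ideas 1 and 5: a run of 5 or more consonants, so KPMG (4) gets through.
CONSONANT_RUN = re.compile(r"[b-df-hj-np-tv-z]{5,}")

# Idea 2: a run of 5 or more letters or digits that are not vowels.
NON_VOWEL_RUN = re.compile(r"[^\Waeiou_]{5,}")


def comment_texts(html):
    """Return the lowercased text of every comment in the message."""
    return [text.lower() for text in COMMENT.findall(html)]


def comments_match(html, pattern):
    """Return True if any comment contains a match for the given pattern."""
    return any(pattern.search(text) for text in comment_texts(html))


def comment_heavy(html, max_comment_chars=200):
    """Ideas 6 and 7: too many comment characters, relative or absolute."""
    comment_chars = sum(len(text) for text in COMMENT.findall(html))
    # Everything left after removing comments, including any HTML tags.
    other_chars = len(COMMENT.sub("", html))
    return comment_chars > other_chars or comment_chars > max_comment_chars


if __name__ == "__main__":
    spam = "Lo<!-- z ojdjgnugps hk i-->w pr<!-- mppa iofyydehhz pn-->ices"
    print(comments_match(spam, CONSONANT_RUN), comment_heavy(spam))
```

The vowel-run idea (3) is left out here because ordinary words such as "beautiful" contain three vowels in a row, so it would need a higher threshold or a word list to be usable.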
Desperado | Senior Member
Well ... I like your ideas, but one thing is that, at the moment, we can't specify filters for the "Subject" only. I will take a very serious look at your ideas and see if I can come up with a "clean" RegEx or two.

Dan S.