Dan S Keyword RegEx Update

Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
URL: https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=2070
Printed Date: 14 March 2025 at 4:02am


Topic: Dan S Keyword RegEx Update
Posted By: Guests
Subject: Dan S Keyword RegEx Update
Date Posted: 26 September 2003 at 9:11am
Your regex key words are the best, do you have any updates since your last posting?



Replies:
Posted By: Desperado
Date Posted: 29 September 2003 at 3:52am

Trinidad,

Hmmmm, good question. I have added some, removed some, changed some, and added some temporary stuff to stop a couple of floods we got. Below is a list of my "mainstream" keywords, by which I mean the baseline I use for all the instances I run. Some of these word wrap, so be careful.

((http|3dhttp)://.{0,26}(((%.+%))|@|:)[(\d|\w)])
((<[!--]+[\x20]{0,1}[a-zA-Z0-9]{10,}[\x20]{0,1}[!--](.+)){2,})
(http://+[\d]{1,3}\.{1}[\d]{1,3}\.{1}[\d]{1,3}\.{1}[\d]{1,3})
(<[!--]+[a-zA-Z0-9]{2}(-->))
((http://http:/\w)|(<(\w){3,10}(\x20/>)|(\*http://w)))
((limited time (special|offer)))
(((arge your p)|(1-4 inches)|(3 - 5 inches\!)|(generic viagra)|(123respmarket)|(herbalpillsonline)|(herbaltrials\.com)|(naturalherbal)|(pillsavings)|(gsc\-100)|(go771world)))
((your privacy is extremely important to us)|(this is not spam))
(((www\.)|(http://))(\w){1,20}(4u)\.(biz|com|net)|(medsonsale\.biz)|(freeandgetsave)|(opportunit12)|(thirdw\.com)|(teflondoninc)|(epromotionad))
((lsgone\.php)|(Isgone\.php)|(exit\.asp)|(mc4\.idetermination)|(medsusa\.biz)|(getitwhileucan)|(\&\#105)|(4improvement\.biz)|(best\-ratez\.biz)|(genoveseinc\.biz))
((remove\.php)|(hit\.php))
(<(!\-\- )+[a-zA-Z0-9]{1}(\x20[a-zA-Z0-9]{3,20}){3,5}(-->))
((text\-decoration: blink)|(click here to start))

I hope these don't "break" anything for you. I check these for false positives often and, as I stated in an earlier post, I do have to allow quite a few listservers in my white list because they do "bad" things in their content.
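
One way to check a keyword for false positives before deploying it is to run it against a handful of known-good and known-bad lines. A minimal sketch in Python follows. It assumes the filter matches case-insensitively and that Python's `re` dialect is close enough to SpamFilter's for these two patterns; both are assumptions, and the sample lines are made up.

```python
import re

# Two keywords copied from the list above.
KEYWORDS = [
    r"((limited time (special|offer)))",
    r"((remove\.php)|(hit\.php))",
]

# Case-insensitive matching is an assumption about the filter's behaviour.
COMPILED = [re.compile(k, re.IGNORECASE) for k in KEYWORDS]


def first_hit(text):
    """Return the first keyword pattern that matches text, or None."""
    for pattern in COMPILED:
        if pattern.search(text):
            return pattern.pattern
    return None


# Hypothetical sample lines, not taken from real mail.
samples = [
    "Limited time offer, act now!",
    "Unsubscribe: http://example.com/remove.php?id=1",
    "Minutes of Tuesday's meeting are attached.",
]

for line in samples:
    print(first_hit(line), "<-", line)
```

Running a larger set of legitimate mail through the same loop is a cheap way to spot a keyword that is too greedy.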

Here is my "Excluded From Addresses" List

*@listproc.pcworld.com
*@industryweek.com
*@gpsadvantage.com
*@gwbakeries.com
*@peoples.com
*@*.lga2.nytimes.com
*@*.*.nytimes.com
*@softshare.com
*@regulusgroup.com
*@e-news.fsonline.com
*@lists.n-email.net
*@lists.techtarget.com
*@lyris.stockupticks.com
*@multexinvestornetwork.com
*@newsletter.online.com
*@insightmedia.info
*@nhfairfield.com
*@newhorizons.com
*@rootsweb.com
*@*.rootsweb.com
*@returns.groups.yahoo.com
*@cygnuspub.com
*@*.classmates.com
*@listserv.usairways.com
*@jkp.com
*@laurin.com
*.*@dell.com
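
For anyone unsure how the wildcard entries behave, here is a rough sketch in Python. It assumes `*` matches any run of characters, the way shell globs do, and that matching ignores case; I have not confirmed that SpamFilter treats these entries exactly this way. The test addresses are invented.

```python
from fnmatch import fnmatchcase

# A few entries from the list above.
EXCLUDED_FROM = [
    "*@rootsweb.com",
    "*@*.rootsweb.com",
    "*.*@dell.com",
]


def is_excluded(sender):
    """True if sender matches any wildcard entry (glob semantics assumed)."""
    sender = sender.lower()
    return any(fnmatchcase(sender, entry.lower()) for entry in EXCLUDED_FROM)


print(is_excluded("list-owner@rootsweb.com"))    # covered by *@rootsweb.com
print(is_excluded("digest@lists.rootsweb.com"))  # covered by *@*.rootsweb.com
print(is_excluded("jane.doe@dell.com"))          # covered by *.*@dell.com
print(is_excluded("janedoe@dell.com"))           # no dot before the @
```

Under these semantics the bare domain and its subdomains need separate entries, which would explain why both rootsweb lines are in the list.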

I also have this single entry in my keyword whitelist to resolve an issue with paypal. 

https://www.paypal.com

I used to have *@paypal.com in my from whitelist, but there are a boatload of spoofed paypal addresses, so that opened up a big hole. The keyword whitelist solved that.

Here are my "Blocked From Addresses"

(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)
(\b[\d]+@(aol\.com|msn\.com|bellsouth\.net|brandeis\.edu))
(\w{17,}@(canada|aol|hotbot|msn)\.com)
((@hello\.com|@veriopt\.com|ha@sexyfun\.net|@himailer.com|clubhotlist@aol.com))
((\*@)(\w){1,30}(\.(com|net|org)){1})
(([\x20]{7,})|([\x09]{1,}))
(@(.){1,22}(\x20)(.){1,22})
(test[\d]{0,5}\.com)
(dsl\-verizon\.net)
anyone@*
noone@*
friend@*
someone@*
*@fcc-network.com
*@topprodsource.com
*@myobdeals.com
offers@*
senders@*
*@loyus.com:null
*@163.net:null
*@21cn.com:null
*@24horas.com
*@263.net:null
*@263.net.cn:null
*.*@bounce.e-i1.com
*@amazingoffersdirect.net
*@godomains.com.au
w-mstenson@mindspring.com:null
*@mail.play4keeps.com:null
directmail@badmail.worldnow.com
admin@internet.com:null
admin@mags.net:null
postmaster@vienybe.lt:null
*@selectgroupmedia.com
domainreg@paypal.com
*@biginkspot.com
*@shaw.ca
*@the-dot-com-ink.com
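
To illustrate what two of the regex entries at the top of this list are aimed at, here is a small sketch in Python. Python's `re` may differ from SpamFilter's engine in details, and the sender addresses below are invented for the example.

```python
import re

# Two "Blocked From" regexes copied from the list above.
NUMERIC_LOCAL = re.compile(
    r"(\b[\d]+@(aol\.com|msn\.com|bellsouth\.net|brandeis\.edu))"
)
LONG_LOCAL = re.compile(r"(\w{17,}@(canada|aol|hotbot|msn)\.com)")


def is_blocked(sender):
    """True if either regex matches the sender address."""
    return bool(NUMERIC_LOCAL.search(sender) or LONG_LOCAL.search(sender))


print(is_blocked("12345678@aol.com"))               # all-digit local part
print(is_blocked("abcdefghijklmnopqrstu@msn.com"))  # 21-character local part
print(is_blocked("bob123@aol.com"))                 # digits follow a letter
```

The `\b` in the first pattern is what keeps an address like bob123@aol.com from matching: there is no word boundary between the letter and the digits, so the all-digit requirement applies to the whole local part.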

Hope this all helps.

Regards,

Dan S.



Posted By: Guests
Date Posted: 29 September 2003 at 3:37pm

You're the man when it comes to RegEx

Thanks much



Posted By: Desperado
Date Posted: 30 September 2003 at 12:49am

Trinidad,

Minor correction to 2 filters:

(<[!--]+[a-zA-Z0-9]{2}(-\->))

(<(!\-\- )+[a-zA-Z0-9]{1}(\x20[a-zA-Z0-9]{3,20}){3,5}(-\->))
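
A quick way to sanity-check the second corrected filter is to feed it a comment of the shape it targets. The sketch below uses Python's `re`, which may not match SpamFilter's regex engine in every detail; the sample strings are made up.

```python
import re

# Second corrected filter from this post, copied verbatim.
COMMENT_FILTER = re.compile(
    r"(<(!\-\- )+[a-zA-Z0-9]{1}(\x20[a-zA-Z0-9]{3,20}){3,5}(-\->))"
)

# One leading character, then three to five "words" of 3-20 characters,
# closed immediately by "-->".
junk = "Buy <!-- q xkcd plugh zork-->now"
# An ordinary short comment has too few words to match.
normal = "<!-- header -->"

print(bool(COMMENT_FILTER.search(junk)))
print(bool(COMMENT_FILTER.search(normal)))
```

Note that the pattern expects no space before the closing `-->`, so a comment padded with a trailing space would not be caught by this filter.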

Dan S.



Posted By: Guests
Date Posted: 30 September 2003 at 3:40am
The messages that slip through the filter now tend to be where the text has been broken up by a lot of comments, example:

mppa iofyydehhz pn z ojdjgnugps hk i-->er - uer - uer - up to 36 ho


Posted By: Guests
Date Posted: 30 September 2003 at 3:44am
The messages that slip through the filter now tend to be where the text has been broken up by a lot of comments, example:

(I'm leaving out the example this time, the forum truncated my previous message so trying to see if this works better - you will have to imagine a lot of nonsense comments)

..etcetera. This doesn't seem to be caught by your current RegEx, although to the human eye it is very obvious junk (no matter what language you speak!). I'm not good at writing RegExes but have a few ideas. Comments, anyone?

1) Filter out on "nonsense" words in comments. Nonsense = 4 or more consecutive consonants.

2) Maybe change the above to 4 or more consecutive non-vowels (to filter out things like glk4zm2pq)

3) Same as 1) and 2) but for 3 or more consecutive vowels.

4) Same idea as all the above but applied only to the Subject of the email. This often contains nonsense words and I can see no good reason at all to allow them.

5) In all the above, it may be necessary to allow for more than 4 consonants (or 3 vowels) to get things like KPMG (for accountants) etc. through, but there should be some reasonable number, say 5-6, where the filter could kick in.

6) Filter out if number of chars in comments > number of non-comment chars, either over one single line or in the message as a whole.

7) Filter out if number of chars in comments is > X (in the message as a whole).

8) Filter out if any comments at all...

Well - can any of this be done in RegEx and would it create any problems?
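
As a rough illustration of the ideas above: 1) to 3) map onto plain regexes, while 6) and 7) need counting, which a single regex cannot do. The Python sketch below covers both parts. The thresholds (a run of 5 non-vowels or 4 vowels) are arbitrary guesses in the spirit of point 5, comments are assumed to be HTML-style `<!-- -->`, and the names `looks_like_nonsense` and `comment_ratio` are made up for the example.

```python
import re

# Ideas 1/2: a run of non-vowel letters or digits (threshold of 5 is a guess).
NON_VOWEL_RUN = re.compile(r"[b-df-hj-np-tv-z0-9]{5,}", re.IGNORECASE)
# Idea 3: a run of vowels (threshold of 4 is a guess).
VOWEL_RUN = re.compile(r"[aeiou]{4,}", re.IGNORECASE)
# HTML comments, non-greedy so each comment is matched separately.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def looks_like_nonsense(word):
    """True if the word contains a long run of non-vowels or of vowels."""
    return bool(NON_VOWEL_RUN.search(word) or VOWEL_RUN.search(word))


def comment_ratio(body):
    """Idea 6: characters inside comments divided by characters outside."""
    inside = sum(len(m.group(1)) for m in COMMENT.finditer(body))
    outside = len(COMMENT.sub("", body))
    return inside / outside if outside else float("inf")


print(looks_like_nonsense("glk4zm2pq"))  # long run of non-vowels
print(looks_like_nonsense("KPMG"))       # only four consonants
print(comment_ratio("of<!-- mppa iofyydehhz -->fer") > 1)
```

On the "would it create any problems" side: with these thresholds ordinary English words such as "strengths" and "queueing" also trip the nonsense test, so any such filter would need a higher limit or a whitelist.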


Posted By: Desperado
Date Posted: 30 September 2003 at 5:36am

Well ... I like your ideas, but one thing is that at the moment we can't specify filters for the "Subject" only. I will take a very serious look at your ideas and see if I can come up with a "Clean" RegEx or 2.

Dan S.



