
key words case sensitive?

Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
URL: https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=1043
Printed Date: 05 February 2025 at 1:47am


Topic: key words case sensitive?
Posted By: Guests
Subject: key words case sensitive?
Date Posted: 23 June 2003 at 3:20pm
I have both "viagra" and "v1agra" in my keyword list, as separate entries, but I still get messages with these words in the subject line. Why?



Replies:
Posted By: LogSat
Date Posted: 23 June 2003 at 3:29pm

Without looking at the source of the email we can't give a definite answer, but the following forum posting usually explains the reason for most failures:

========================================================

Starting with build v1.2.0.151 SpamFilter is able to scan the whole email content + subject header for RegEx (Regular Expression) keywords.

This allows very powerful keyword searches. Many spammers send html emails containing invalid (thus invisible) html tags or html comments in between letters to avoid normal keyword detection.

For example, the following html source:

<!--fxkbu8116c72f6-->SP<mynqhy2d9bswg-->AM 
    <!--ei2rq7erjldy3y-->MER<!--ywf1ph1zmgcik9-->

will actually display SPAMMER in an email client.

We've been using the following RegEx search string to, so far, successfully block a lot of this spam:

(<[!--]*[a-zA-Z0-9]{11,})

This is what the above expression looks for (remember that SpamFilter requires a RegEx expression to be surrounded by parentheses () in order to distinguish it from regular keywords):

  • <   look for an open tag start character, immediately followed by...
  • [!--]*   this looks for zero or more occurrences of the  !--  characters indicating an html comment, immediately followed by...
  • [a-zA-Z0-9]  any letter or digit....
  • {11,}  repeated at least 11 times. The run must consist only of letters and digits. Any space, tab, single quote, double quote etc. will break the sequence.

For example, <a href="aaaa.htm"> will not cause a trigger since there is a space immediately following the a before href.
We chose a minimum repetition of 11 since <blockquote> is a valid tag 10 characters long...
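
The behaviour described above can be tried out with a small Python sketch. This is not SpamFilter's own engine, only the same idea run through Python's re module. The names OBFUSCATION and find_obfuscation are made up for this illustration, and the character class is spelled [!\-] (meaning "! or -", the reading given in the list above) because Python warns about a doubled hyphen inside a class.

```python
import re

# Python adaptation of the forum's pattern (<[!--]*[a-zA-Z0-9]{11,}).
# Assumption: the class is written [!\-] here, i.e. "! or -", because
# Python emits a warning for a doubled hyphen inside a character class.
OBFUSCATION = re.compile(r"(<[!\-]*[a-zA-Z0-9]{11,})")


def find_obfuscation(html):
    """Return every '<' (optionally followed by ! or -) that is followed
    by a run of at least 11 letters or digits."""
    return OBFUSCATION.findall(html)


# The sample html source quoted in the post.
spam = ("<!--fxkbu8116c72f6-->SP<mynqhy2d9bswg-->AM \n"
        "    <!--ei2rq7erjldy3y-->MER<!--ywf1ph1zmgcik9-->")

print(find_obfuscation(spam))
# ['<!--fxkbu8116c72f6', '<mynqhy2d9bswg', '<!--ei2rq7erjldy3y', '<!--ywf1ph1zmgcik9']

# A normal link: the space after "a" breaks the run, so nothing matches.
print(find_obfuscation('<a href="aaaa.htm">'))  # []

# "blockquote" is only 10 characters long, one short of the minimum.
print(find_obfuscation("<blockquote>"))  # []
```

In SpamFilter itself the expression would still be entered with its surrounding parentheses, as noted above, so that it is treated as a RegEx rather than a plain keyword.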

If anyone has comments, problems, or improvements with this "apparently magic" keyword search, please let us know!

Roberto Franceschetti
LogSat Software



Posted By: Guests
Date Posted: 23 June 2003 at 3:37pm

Yes, I'm using this.  It works well.

Below is an excerpt from my keyword list:

sex, energy
hot, sex
hot, babe
hot, teen
hot, girl
viagra
v1agra

Today I received an email with the following subject line:

"Viagra and Diet Pills prescribed online!  US doctors and pharmacies! Overnight Shipping  fdf pqgfgg"



Posted By: LogSat
Date Posted: 23 June 2003 at 5:23pm

Jack,

The keywords are not case sensitive. If the subject header source is as you posted, then yes, it should have been blocked. However, even subject headers can be encoded, so the raw source may not be what you see in your mail client.
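
To illustrate the point about encoded subject headers, here is a minimal Python sketch. It assumes the encoding in question is the standard MIME "encoded-word" form (RFC 2047), which this thread does not confirm, and it is not SpamFilter's code. The names matches_keyword and decoded_subject are hypothetical helpers, and the encoded header is built in the script purely for demonstration.

```python
import base64
from email.header import decode_header, make_header

# Two entries from the keyword list quoted earlier in the thread.
KEYWORDS = ["viagra", "v1agra"]


def matches_keyword(text, keywords=KEYWORDS):
    """Case-insensitive substring check (a hypothetical stand-in for
    a plain keyword filter)."""
    lowered = text.lower()
    return any(word.lower() in lowered for word in keywords)


def decoded_subject(raw_subject):
    """Decode RFC 2047 encoded-words into the text a mail client displays."""
    return str(make_header(decode_header(raw_subject)))


plain = "Viagra and Diet Pills prescribed online!"

# Assumption for illustration: the same kind of subject sent as a
# base64 encoded-word, which is what would sit in the raw header source.
encoded = ("=?UTF-8?B?"
           + base64.b64encode(b"Viagra and Diet Pills").decode("ascii")
           + "?=")

print(matches_keyword(plain))                     # True: case does not matter
print(matches_keyword(encoded))                   # False: raw source hides the word
print(matches_keyword(decoded_subject(encoded)))  # True once decoded
```

A check against the raw header and a check against the displayed subject can therefore disagree, which is why the full source of the email is needed.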

Without the full source of the email, again, we can't give a definite answer. Please indicate also which version of SpamFilter you're using.

Roberto Franceschetti
LogSat Software



Posted By: Guests
Date Posted: 23 June 2003 at 5:55pm

Version is 1.2.0.162

My mail server is GroupWise, which does not give me the raw source of an html message.  The subject I included was taken from the message header though.  Is there an address I can forward the message to?



Posted By: LogSat
Date Posted: 23 June 2003 at 7:59pm

Yes, support@logsat.com

Roberto F.
LogSat Software



