
To all RegEx Experts

Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
URL: https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=1497
Printed Date: 13 March 2025 at 1:17pm


Topic: To all RegEx Experts
Posted By: Desperado
Subject: To all RegEx Experts
Date Posted: 27 July 2003 at 10:29pm

All,

Perhaps we should have a contest!  This is one that has me going around in circles.  Anyone have a good block for this?

<html>
<TABLE class=rmsg cellSpacing=0 cellPadding=10 width="100%" align=center
border=0 nowrap>
<TBODY>
<TR>
<TD>
<DIV>
<SCRIPT>
<!--
function Filtered()
{
return 0
}
//-->
</SCRIPT>
X-Server: LogSat Software SMTP Server

<CENTER>
<p></p>
</CENTER></DIV>
<TD><FONT face=VERDANA,ARIAL color=#0000a0 size=4>
<CENTER>Hellodomainreg@mags.net<BR><B>Get The Lowest Price On Enlargement Pills<BR>Without  any hassle in minutes!</B><FONT size=2>
<P>
<P align=center>Would you like to get extra 3 inches<BR>to your penis?<BR></:sz27tdndomainreg@mags.net               mailto:sz27tdqdomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tdqdomainreg@mags.net                mailto:sz27tdidomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tdidomainreg@mags.net   mailto:sz27tdbdomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tdbdomainreg@mags.net                mailto:sz27tdxdomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tdxdomainreg@mags.net                >
<CENTER>
<P>
<TABLE border=0>
<TBODY>
<TR>
<TD align=left><FONT color=#ff0000
size=2><B></:sz27td3domainreg@mags.net  mailto:sz27tdvdomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tdvdomainreg@mags.net      mailto:sz27tdydomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tdydomainreg@mags.net                mailto:sz27tnrdomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tnrdomainreg@mags.net               mailto:sz27tnadomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tnadomainreg@mags.net    >
<LI>No prescription needed.
<LI>No exercises to do.
<LI>No shame.
<LI>Real Results.
<LI>Confidential overnight shipping take your pick.</FONT>
<P></:sz27tn1domainreg@mags.net        mailto:sz27tngdomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tngdomainreg@mags.net                mailto:sz27tnsdomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tnsdomainreg@mags.net                mailto:sz27tn5domainreg@mags.netsz27tnodomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tn5domainreg@mags.netsz27tnodomainreg@mags.net                ><FONT
face=VERDANA,ARIAL color=#0000a0 size=5><a href=" undefined" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - Click'> http://red.ecablenetwork.com/4dre/" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - http://red.ecablenetwork.com/4dre/ ">Click Here</a></CENTER> 
<P></P></FONT></FONT></CENTER></FONT></TD></TR></TBODY></TABLE></:sz27tn2domainreg@mags.net               mailto:sz27tnndomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tnndomainreg@mags.net     mailto:sz27tnqdomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tnqdomainreg@mags.net                mailto:sz27tnidomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tnidomainreg@mags.net             mailto:sz27tnbdomainreg@mags.net" CLASS="ASPForums" TITLE="WARNING: URL created by poster. - sz27tnbdomainreg@mags.net   ><FONT
size=2><BR><BR>&nbsp;<A
target=_blank><FONT color=#000000><DIV></DIV></FONT></A></FONT>
<head>
<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<title>Hello</title>
</head>

<body>

</body>

</html>

Dan S.

 




Replies:
Posted By: Guests
Date Posted: 28 July 2003 at 12:23pm

I cannot think of any HTML tags that use more than one consecutive space inside the tag or that contain the "@" symbol.  Why not filter on those?
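As a rough illustration of that idea (a sketch in Python, not a Spam Filter ISP rule; the names TAG and suspicious_tags are made up for this example), one could pull out each tag and flag the ones that contain an "@" or a run of two or more spaces:

```python
import re

# One HTML tag: "<", then anything that is not "<" or ">", then ">".
TAG = re.compile(r"<[^<>]*>")

def suspicious_tags(html):
    """Return the tags that contain an '@' or two or more consecutive spaces."""
    return [tag for tag in TAG.findall(html) if "@" in tag or "  " in tag]

sample = "<P align=center>Hi<BR></:sz27tdn@mags.net     mailto:x@mags.net  ><FONT size=2>"
print(suspicious_tags(sample))
```

Note that a legitimate link such as an anchor with a mailto: address also has an "@" inside the tag, so a check like this would probably need a whitelist or a score rather than an outright block.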

The one that gets me is using common keywords in the tags instead of HTML comments as a way of trying to get around statistical Bayesian filtering.  Any good browser ignores them, yet they will not get caught by other filtering methods either.  Now this would be a challenge.



Posted By: Desperado
Date Posted: 28 July 2003 at 12:57pm

Alan,

In this email, I was more concerned by the function that seems to create or decode the tags.  I do not see where it is actually being invoked, but if I remove it the message gets garbled.  Do you understand the function?

Dan S.

 



Posted By: Guests
Date Posted: 29 July 2003 at 11:08am

Laughing... I know this is not what you are asking but I could not resist!

Simple Keywords in Black List:

penis,3 inches

This works great for me.  I have to laugh that they always use 3 inches!  Why not 2 or 4, nooooo, there is something about that 3 extra inches, LOL!

The real problem that I am having is the newest spam that has almost ZERO text and 1 image tag that pulls the spam image from their website.  I HATE THAT and it appears there is nothing you can do about it since many valid newsletters and support emails contain those linked images also...

Erik!

 



Posted By: Guests
Date Posted: 29 July 2003 at 11:50am

Hi,

I think you are wrong. I have this one!

Increase your penis size by 2 to 5 full inches  booth

Do you want more information? :-)

Gaby



Posted By: Guests
Date Posted: 29 July 2003 at 11:52am

Are you posting the actual email content or just the generated page code?  I am unclear what you are removing that is causing it to generate garbled output. 

I am no expert, but to me it doesn't appear to do much of anything (beyond the basic HTML portion, that is).  It looks like some other components may be missing for it to do what it's supposed to.  It looks almost like something that an amateur spammer cobbled together.  Maybe this was created using a basic template and the script portion was not properly utilized?  If I had to make a guess based on the various hanging fragments, I suspect the intent was to get you to click on the link (uh, yeah right) and you would get a multitude of child windows opening up, all of which submit your email address and their referral ID (sz27t) for credit, but the sender didn't know what they were doing.

Either way, you had originally asked about blocking this.  Seems to me you can still block via the "@" or extra spaces in a tag.  So do I get a Kewpie doll?



Posted By: Desperado
Date Posted: 29 July 2003 at 7:34pm

Wow!  5 Inches? (that's gotta be a life changer!) ...

Seriously ... the "Function" is my concern.  Something that I am not yet able to understand is using that function (I believe) to create or decode the obscured code. Every message I get is totally different EXCEPT the function part.

Also, "Simple Keywords" are out of the question from my vantage point.  First, we have a "No Censorship" policy and second, we find that literal keywords have little value due to all the obscured code.

Last thought ... This is for LogSat support also:  I wonder if any would-be spammers look at any of our posts about filters.  Is this a possible concern?

Dan S.

 



Posted By: Guests
Date Posted: 30 July 2003 at 8:42am

I agree, a spam that only has a picture to click on defeats a good regex filter. In that case I resort to a keyword filter:

@mags.net

They won't put it in the email address but it's always in the link they want you to click on. The downside is that this filter only applies to this one spammer.



Posted By: Desperado
Date Posted: 30 July 2003 at 8:56pm

Should I take it personally that you used my domain as an example?

Dan S.

 



Posted By: Guests
Date Posted: 30 July 2003 at 10:29pm

Occam's Razor!

;)



Posted By: Desperado
Date Posted: 30 July 2003 at 10:41pm

OK .... Or ... KISS!  So I won't read anything into it!  (I didn't anyway)

das



Posted By: Guests
Date Posted: 31 July 2003 at 7:41am
Did I do anything wrong with (\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com))? Maybe a typing error?

07.31.03 13:13:29:800 -- (752) String matching error for (mos.187902.33882.gemeindebrief.dbounce@news.messagizer.de --and-- (\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)) : TRegExpr(exec): Loop Stack Exceeded
07.31.03 13:13:29:800 -- (752) Mail from: mos.187902.33882.gemeindebrief.dbounce@news.messagizer.de
07.31.03 13:13:30:080 -- (752) - MAPS search done... .
07.31.03 13:13:30:080 -- (752) RCPT TO: [del]@brainlift.de accepted
07.31.03 13:13:35:749 -- (752) EMail from mos.187902.33882.gemeindebrief.dbounce@news.messagizer.de to [del]@brainlift.de was queued. Size: 41 KB
07.31.03 13:13:35:769 -- (1004) Sending email from gemeindebrief@geizkragen.de to [del]@brainlift.de
07.31.03 13:13:35:819 -- (752) Disconnect


Posted By: Desperado
Date Posted: 31 July 2003 at 8:44am

Frank,

You have an extra close paren on the end.

YOUR RegEx:

(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com))

Should be:

(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)

Also, what build are you running?

Dan

 



Posted By: Guests
Date Posted: 31 July 2003 at 9:03am
Dan, thanks a lot (too stupid of me). It's build 190. Have a nice day...


Posted By: Desperado
Date Posted: 31 July 2003 at 9:19am

Hold on a second ... On build 190 and up, I am getting a string match error also.  I have been working directly with LogSat on this for another RegEx so I will look into this also.  I had ZERO errors in the past and this has been a good block for me so it is a high priority.

Dan S.

 



Posted By: Guests
Date Posted: 31 July 2003 at 9:26am
OK, just as I looked again to fix it, I saw there is no extra close paren in the filter. It's only the close paren for the whole log text line (mos.187902.33882.gemeindebrief.dbounce@news.messagizer.de --and-- (\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com))


Posted By: Desperado
Date Posted: 31 July 2003 at 9:32am

Frank,

The extra Paren, as you saw, is only in the log.  This is what I have so far and what I sent off to LogSat:

Using the RegEx test in the SF GUI, the following RegEx in the from-email blacklist fails as follows:
 
(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)
 
If there are 32 or fewer characters before the @, the test works.  If there are more than 32, it fails.  Could this be related to the String Matching error?
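For anyone who wants to check the pattern outside of SF, here is a small Python sketch. Two assumptions to flag loudly: Python's re module is not the TRegExpr engine that SF uses and has no loop stack limit, so it cannot reproduce the error; and the flattened variant below (FLAT, a name made up for this example) is only a guess at a form with less nesting. Whether it avoids the Loop Stack Exceeded error in SF is untested.

```python
import re

# The filter exactly as posted in this thread.
ORIGINAL = re.compile(r"(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)")

# Hypothetical flattened variant: same characters accepted, but without
# the repeated capture group around the character class.
FLAT = re.compile(r"\b[\d+]+[-a-z0-9_.+]+@(?:hotmail|juno)\.com")

def matches(pattern, address):
    """True if the pattern is found anywhere in the address."""
    return pattern.search(address) is not None

short_addr = "123abc@hotmail.com"              # well under 32 chars before the @
long_addr = "1" + "a" * 40 + "@juno.com"       # 41 chars before the @
log_addr = "mos.187902.33882.gemeindebrief.dbounce@news.messagizer.de"

for addr in (short_addr, long_addr, log_addr):
    print(addr, matches(ORIGINAL, addr), matches(FLAT, addr))
```

In Python both patterns agree on all three addresses, which at least suggests that the pattern itself is fine and that the 32-character behavior comes from the engine limit rather than from a typing error.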
 
Dan
 


Posted By: Desperado
Date Posted: 31 July 2003 at 9:43am

Frank,

Do not panic on this yet.  LogSat has placed a limit on the string length due to a REALLY bad issue that came up on my server.  I have requested that we take another look at the limit setting, but in the meantime it is NOT a failure, although the message may "sneak" past the filter.  I am looking into that now but I am leaving the RegEx in my server for now.

Dan



Posted By: Guests
Date Posted: 31 July 2003 at 10:23am
Dan, I am relaxed and leaving it in my server too for the next days. Frank.


Posted By: Desperado
Date Posted: 31 July 2003 at 10:41am

Frank,

This is from Roberto of LogSat:

If the loop stack is exceeded for a RegEx, that single RegEx expression will be skipped, so there won't be a match on it. All other keywords/blacklists are still processed.
 
I'll [LogSat] see how much I can increase the loop stack without running into the problems you [Dan S.] had.
 
[LogSat's] been working on the statistical filtering for a while now, and will most likely have an alpha version ready for internal testing by this weekend. If all goes well, a public beta will be released within a week or two. This should help a LOT in catching more spam.
 
Dan S.


