| Keyword may have not been scanned | 
| sgeorge   Senior Member   Joined: 23 August 2005 Status: Offline Points: 178 |  Topic: Keyword may have not been scanned Posted: 16 February 2007 at 9:58am | ||
| 
   
Hi All, long time no see. :) One message came through that I was hoping a RegEx blacklist keyword would match. I checked my logs to see whether there was any whitelisting, or whether part of the message was skipped for being over the max scan size; from the logs, it looks like neither was the case. Here's the RegEx keyword: 
 Here is a copy of the plain-text content of the message: 
 And here's the relevant snippet from the log files (IPs and addresses have been changed...): 
 I am running v 3.1.3.615. Also, my max scan setting in SpamFilter.ini is:
 MaxMsgSizeForKeywordScan=64
 Thanks for your help. I'm hoping that I'm just missing something, but it seems kind of funky. Stephen | |||
|  | |||
| sgeorge   Senior Member   Joined: 23 August 2005 Status: Offline Points: 178 |  Posted: 16 February 2007 at 10:04am | ||
| 
   Also, I meant to mention something interesting I noticed in the "RegEx Test" tab in SpamFilter. When I enter the RegEx search string "(?i)\w ?\w ?\w ?\w ?\. ?p ?k" (no quotes), I get the following... The pattern was found in this text: 
 But it was not found in this text: 
 Thanks for listenin'. :) Stephen | |||
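
For anyone who wants to check the keyword outside SpamFilter, here is a minimal sketch in Python. It assumes Python's `re` module treats this pattern the same way SpamFilter's engine does, which may not hold in every detail. The sample strings are made up for illustration and are not the message text from this thread.

```python
import re

# The keyword from the post above; (?i) makes it case-insensitive.
KEYWORD = re.compile(r"(?i)\w ?\w ?\w ?\w ?\. ?p ?k")


def regex_test(text: str) -> str:
    """Mimic a found / not found check for KEYWORD against the given text."""
    return "found" if KEYWORD.search(text) else "not found"


# Hypothetical sample strings (not the elided message text from the thread).
samples = {
    "compact": "visit abcd.pk today",
    "spaced": "visit a b c d . p k today",
    "unrelated": "nothing to see here",
}

for name, text in samples.items():
    print(name, "->", regex_test(text))
```

If a pattern is found in a short snippet here but not in the full message inside SpamFilter, that points at the engine or the input size rather than at the pattern itself.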
|  | |||
| sgeorge   Senior Member   Joined: 23 August 2005 Status: Offline Points: 178 |  Posted: 23 February 2007 at 6:49pm | ||
| 
   Just a mini-update... I tried doing a full uninstall & reinstall of v 3.1.3.615. Oddly, it did not fix the problem. Stephen | |||
|  | |||
| ImInAfrica   Groupie     Joined: 27 June 2006 Location: FL, USA Status: Offline Points: 60 |  Posted: 25 February 2007 at 4:20pm | ||
| I tested this on 650 and can confirm the same issue. Amir | |||
|  | |||
| sgeorge   Senior Member   Joined: 23 August 2005 Status: Offline Points: 178 |  Posted: 26 February 2007 at 5:11pm | ||
| 
   
Hey, thanks for testing it, man. | |||
|  | |||
| mikek   Senior Member     Joined: 22 February 2005 Location: Switzerland Status: Offline Points: 133 |  Posted: 08 March 2007 at 10:54am | ||
| 
   I can confirm this. I was always wondering why so many spams with inline images came through, even though I had the correct "src=cid:..." keywords set. I just tested my keyword with a mail that came through: if I paste the whole email, the regex test outputs "not found"; if I paste only a few lines around the src=cid, it outputs "found", as it should... This is a serious issue that has to be looked into! Cheers, Mike | |||
|  | |||
| LogSat   Admin Group     Joined: 25 January 2005 Location: United States Status: Offline Points: 4106 |  Posted: 08 March 2007 at 11:05am | ||
| 
   Mike, can you please forward us the whole email (headers and email body included)? | |||
|  | |||
| mikek   Senior Member     Joined: 22 February 2005 Location: Switzerland Status: Offline Points: 133 |  Posted: 08 March 2007 at 11:06am | ||
| 
   
Just did some more tests, and it looks like it has something to do with the regex that is used... For me, the error shows up with this regex: src="cid:(.)*\$(.)*@(.)*" The e-mail is on its way... | |||
|  | |||
| LogSat   Admin Group     Joined: 25 January 2005 Location: United States Status: Offline Points: 4106 |  Posted: 09 March 2007 at 11:34pm | ||
| 
   Everyone,

It seems that some of your RegEx are complex enough to cause a stack overflow. SpamFilter recovers from the error, but it then misses the keyword match in that particular string. We're currently looking at the "greedy" option in RegEx, which is enabled by default in SpamFilter. In the sample mikek provided, we modified his RegEx to include the modifier (?-g) at the beginning of the expression. This disables "greedy" mode, and the string is then detected successfully.

Mike, if you change your string from:
((?i)(src="cid:(.)*\$(.)*@(.)*"))
to
((?-gi)(src="cid:(.)*\$(.)*@(.)*"))
or
((?-g)(?i)(src="cid:(.)*\$(.)*@(.)*"))
your expression will work. Unfortunately, this means you may have to add the (?-g) modifier to all your RegEx. We're looking into what side effects there would be if we were to disable greedy mode by default in SpamFilter... | |||
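
To make the greedy versus non-greedy difference concrete, here is a rough sketch in Python rather than in SpamFilter's own engine. Python's `re` module has no `(?-g)` modifier, so the lazy quantifier `*?` stands in for it; that substitution and the sample HTML line are assumptions made for illustration. The sketch does not reproduce the stack overflow. It only shows how much further the greedy form reaches, which is where the extra backtracking on long input comes from.

```python
import re

# Greedy form, as in the keyword from the thread.
GREEDY = re.compile(r'(?i)src="cid:(.)*\$(.)*@(.)*"')
# Lazy form: "*?" is used here as a stand-in for SpamFilter's (?-g) modifier.
LAZY = re.compile(r'(?i)src="cid:(.)*?\$(.)*?@(.)*?"')

# Hypothetical sample line, not taken from the forwarded email.
sample = '<img src="cid:part1$abc@example.invalid" alt="x" width="10">'


def matched_text(pattern, text):
    """Return the full matched substring, or None when nothing matches."""
    match = pattern.search(text)
    return match.group(0) if match else None


greedy_hit = matched_text(GREEDY, sample)
lazy_hit = matched_text(LAZY, sample)

print("greedy:", greedy_hit)  # runs on to the last double quote on the line
print("lazy:  ", lazy_hit)    # stops at the first closing double quote
```

Both forms find the keyword in this short line. The greedy one first swallows the rest of the input and then backtracks, so its work grows with the size of the text pasted in.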
|  | |||
| mikek   Senior Member     Joined: 22 February 2005 Location: Switzerland Status: Offline Points: 133 |  Posted: 12 March 2007 at 6:52am | ||
| 
   Hi Roberto, turning off "greedy" mode worked! Personally, I would not change the default behaviour, but maybe update the documentation to state that greedy mode is on by default (as it is with most regex implementations) and mention the -g parameter. It would also be nice if an exception caused by a regex were logged... Cheers, Mike | |||
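
The logging suggestion could look roughly like the following Python sketch. It is not how SpamFilter is implemented; the function name `safe_search` and the logger name are hypothetical, and the point is only that a failing keyword leaves a trace in the log instead of being skipped silently.

```python
import logging
import re
from typing import Optional

logger = logging.getLogger("keyword_scan")  # hypothetical logger name


def safe_search(pattern: str, text: str) -> Optional[re.Match]:
    """Run one keyword regex and log any failure instead of hiding it."""
    try:
        return re.search(pattern, text)
    except (re.error, RecursionError, MemoryError) as exc:
        # re.error covers invalid patterns; the other two cover resource
        # exhaustion while matching.
        logger.error("regex %r failed on %d-char input: %s",
                     pattern, len(text), exc)
        return None


# A valid keyword returns a match object; a broken one is logged, returns None.
print(safe_search(r'src="cid:', '<img src="cid:a$b@c">') is not None)
print(safe_search(r"(", "anything") is None)
```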
|  | |||
| sgeorge   Senior Member   Joined: 23 August 2005 Status: Offline Points: 178 |  Posted: 12 March 2007 at 10:46am | ||
| 
   ...Nice detective work.  Thanks you two! Stephen | |||
|  | |||