Spam Filter ISP Support Forum

.TOKEN files

Ric Marques (Guest Group)
Posted: 20 November 2003 at 8:13pm

Roberto,

Is it normal for the files in the /corpus/queue directory to be as large as 273k?

I did the upgrade to .263, deleted the files in the /corpus and /corpus/queue directories and restarted.  The same issue is happening.  Right now there are 65 files in that directory, 7 of which are 273k each. The others range in size from 1k-5k and there are some 0k .tmp files there as well.

-Ric
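
For anyone wanting to check their own queue directory for the same symptom, here is a minimal Python sketch that lists files above a chosen size. The function name `oversized_files` and the `limit_kb` threshold are illustrative assumptions, not part of SpamFilter; the directory path is whatever your installation uses.

```python
from pathlib import Path

def oversized_files(directory, limit_kb):
    """Return (file name, size in KB) for each file in `directory` larger than `limit_kb`."""
    limit_bytes = limit_kb * 1024
    results = []
    for entry in sorted(Path(directory).iterdir()):
        # Only regular files are of interest; subdirectories are skipped.
        if entry.is_file():
            size = entry.stat().st_size
            if size > limit_bytes:
                results.append((entry.name, size // 1024))
    return results
```

Calling `oversized_files("corpus/queue", limit_kb=100)` (path and threshold are examples only) would single out the handful of unusually large files from the many 1k-5k ones.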

LogSat (Admin Group)
Joined: 25 January 2005, Location: United States
Posted: 21 November 2003 at 3:31pm

Ric,

No, actually it's not normal. There's an upper limit on the msg size scanned for tokens, and for performance reasons it defaults to 64 KB. Furthermore, binary attachments are not scanned; only the text/html content in the emails is.

Can you check the spamfilter.ini file for the line:

MaxMsgSizeForKeywordScan

and ensure the value is 64 or less?

Roberto F.
LogSat Software
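
The check described above can be scripted. The following is a minimal Python sketch, assuming the setting appears in spamfilter.ini as a plain `MaxMsgSizeForKeywordScan=<number>` line with the value expressed in KB; that line format and both helper names are assumptions made for illustration, not documented SpamFilter behavior.

```python
import re

def find_ini_value(ini_text, key):
    """Return the value assigned to `key` in INI-style text, or None if the key is absent."""
    pattern = re.compile(r"^\s*" + re.escape(key) + r"\s*=\s*(.*?)\s*$", re.IGNORECASE)
    for line in ini_text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None

def scan_limit_ok(ini_text, key="MaxMsgSizeForKeywordScan", max_kb=64):
    """True only if `key` is present and holds an integer no larger than `max_kb`."""
    value = find_ini_value(ini_text, key)
    if value is None:
        return False
    try:
        return int(value) <= max_kb
    except ValueError:
        # A non-numeric value is treated as a failed check.
        return False
```

Read the file contents with something like `open("spamfilter.ini").read()` (adjust the path to your install) and pass the text to `scan_limit_ok`.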

Ric Marques (Guest Group)
Posted: 21 November 2003 at 3:52pm

Roberto -

The key is there.  I was only seeing the large files created from messages that were released from quarantine. The .TOKEN file was flagged as '.falsepositive' in the first line, and the entire file content appeared to be MIME-encoded gobbledygook. Here are the first few lines:

<snip>
.falsepositive
&
_
_nextpart_002_01c3afab
0
00
000c05a6
0080iahdo4qy45zpugu6g75fw1er4ykbuwicv2iasvxoocr5dpk9aiofdcq1l7madpg
00bmwd7joomh9rc6jypazit1hswms2klfhmmhniycoragwwp1
00jg59vgc
00tk5nbj0burvixfhhp3zmou7v1zznpt8masgj3gmbmvx
00wkzw
02bpslnq7
02l1oiyxe44a60
033y8goesdoiijp6u
0344ns5h7eyoxij58yvgswjczzc5pxfbesn018tufpxjjo3ncwpikl3bwryzwohcnig
039k
03bpfj
03y8mqyavhj0wjkklbjaa5rhebyb8
04arwuinm8xreji6ggbdnvuplkksr20l0a6vighpwicvlipgouau
04mobtwnzyn6x5ig8vwwjb9
04tj0mg30twjpe9oisgqbia
04v5h8xae
05
05239q5tw82f4dnrklc3mc5zjcpapqv2j2qctlek8
05ewsrs7mhxdn
05naa6fetku5
05oipxkt5lmgmuklkha3pynaabgmc7f8ul1st6vql8nmw4o1uohkgafxfkyub7b5j4smsjl40hxi
</snip>

It looks like messages that are released from quarantine are possibly being parsed for tokens in the binary area?

I re-installed .263 this morning, and I haven't seen anything like this appear yet - but there haven't been any messages bouncing back with attachments like yesterday.

Unfortunately, it also looks like there's another problem: the corpus file isn't saving.  I ran for a couple of hours this morning and there were no changes to the corpus.ini or db.dat files.  I stopped the service and restarted it. The two files updated at that time, but the .ini only showed 1 good/1 spam, even though SpamFilterISP had processed over a thousand messages.

The logfile shows the corpus file being saved at startup, but not again:

At startup:
<snip>
11/21/03 11:59:07:320 -- Listening on xx.xx.xx.xx:25,
11/21/03 11:59:08:271 -- (3060) Connection from: 200.63.157.190  -  Originating country : Argentina
11/21/03 11:59:09:173 -- Starting to process queue directory...
11/21/03 11:59:10:685 -- ***Memory info*******
11/21/03 11:59:10:685 -- TotalAddrSpace = 1,048,576
11/21/03 11:59:10:685 -- TotalUncommitted = 196,608
11/21/03 11:59:10:685 -- TotalCommitted = 851,968
11/21/03 11:59:10:685 -- TotalAllocated = 806,988
11/21/03 11:59:10:685 -- TotalFree = 14,852
11/21/03 11:59:10:685 -- FreeSmall = 14,852
11/21/03 11:59:10:685 -- FreeBig =
11/21/03 11:59:10:685 -- Unused =
11/21/03 11:59:10:685 -- Overhead = 30,128
11/21/03 11:59:10:685 -- HeapErrorCode =
11/21/03 11:59:10:685 -- AllocMemCount = 7,559
11/21/03 11:59:10:685 -- AllocMemSize = 808,500
11/21/03 11:59:10:685 -- **********
11/21/03 11:59:10:685 -- Begin Cleanup of Corpus.db
11/21/03 11:59:10:695 -- End Cleanup of Corpus.db
11/21/03 11:59:10:695 -- Begin Sync Corpus.db
11/21/03 11:59:10:695 -- Sync Corpus.db - 1 - 0
11/21/03 11:59:10:695 -- Sync Corpus.db pass 1 (0)
11/21/03 11:59:10:695 -- Sync Corpus.db pass 2 (0)
11/21/03 11:59:10:695 -- Sync Corpus.db pass 3 (0)
11/21/03 11:59:10:695 -- Sync Corpus.db pass 4 (0)
11/21/03 11:59:10:695 -- Begin Saving Corpus.db
11/21/03 11:59:10:895 -- End Saving Corpus.db (200)
11/21/03 11:59:10:895 -- End Sync Corpus.db (200)
11/21/03 11:59:11:906 -- (3060) Resolving 200.63.157.190 - Not found
11/21/03 11:59:11:916 -- (3060) - Reverse DNS not found -
</snip>

Subsequent log entries:
<snip>
11/21/03 12:44:50:464 -- Begin Sync Corpus.db
11/21/03 12:44:50:464 -- Sync Corpus.db - 9864 - 594
11/21/03 12:44:50:595 -- Sync Corpus.db pass 1 (130)
11/21/03 12:44:50:595 -- Sync Corpus.db pass 2 (130)
11/21/03 12:44:50:595 -- Sync Corpus.db pass 3 (130)
11/21/03 12:44:50:605 -- Sync Corpus.db pass 4 (141)
11/21/03 12:44:50:605 -- End Sync Corpus.db (141)
</snip>

I hope this is helpful... I'll keep a close eye on what's happening...

-Ric
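
To compare sync and save activity over a longer log than the excerpts above, a small Python sketch like the following could be used. It assumes every log line has the `MM/DD/YY HH:MM:SS:mmm -- message` layout seen in the excerpts and matches only the literal "Begin Sync Corpus.db" and "Begin Saving Corpus.db" messages; the function name `corpus_events` is made up for this example.

```python
from datetime import datetime

def corpus_events(log_lines):
    """Collect timestamps of corpus sync starts and corpus save starts from log lines."""
    events = {"sync": [], "save": []}
    for line in log_lines:
        stamp, sep, message = line.partition(" -- ")
        if not sep:
            continue  # not a timestamped log line
        message = message.strip()
        if message == "Begin Sync Corpus.db":
            kind = "sync"
        elif message == "Begin Saving Corpus.db":
            kind = "save"
        else:
            continue
        try:
            # Split "11/21/03 11:59:10:695" into date, H:M:S and milliseconds.
            date_part, time_part = stamp.strip().split(" ")
            hms, _, millis = time_part.rpartition(":")
            when = datetime.strptime(f"{date_part} {hms}", "%m/%d/%y %H:%M:%S")
            when = when.replace(microsecond=int(millis) * 1000)
        except ValueError:
            continue  # skip lines whose timestamp does not fit the assumed layout
        events[kind].append(when)
    return events
```

If the list under `"sync"` keeps growing while `"save"` holds only the startup entry, that matches the behavior described in this post.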

LogSat (Admin Group)
Posted: 21 November 2003 at 3:59pm

Ric,

Thanks for the reports. The msgs released from the quarantine are scanned and tokenized, since this allows the filter to learn that messages like them are not spam. Subsequent emails with similar content will be much less likely to be stopped. We're now looking into whether there is a bug with scanning the binary attachments in them, as they too should only be scanned for text/html.

The corpus database is now saved much less frequently: every 2 hours if traffic is not high, and again when SpamFilter shuts down. This is normal.

Roberto F.
LogSat Software
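
As a rough illustration of the intended behavior (tokenize only the textual parts of a message, skip binary attachments, and stop at a size cap), here is a Python sketch built on the standard `email` package. It is not SpamFilter's actual tokenizer: the function name `text_tokens`, the choice to accept both text/plain and text/html parts, the simple word-splitting regex, and the way the 64 KB cap is applied are all assumptions made for this example.

```python
import re
from email import message_from_string

def text_tokens(raw_message, max_kb=64):
    """Return the set of lowercase tokens found in the text parts of a raw message."""
    msg = message_from_string(raw_message)
    budget = max_kb * 1024  # bytes of decoded text still allowed to be scanned
    tokens = set()
    for part in msg.walk():
        if budget <= 0:
            break
        if part.is_multipart():
            continue  # containers hold no content of their own
        if part.get_content_disposition() == "attachment":
            continue  # never tokenize attachments
        if part.get_content_type() not in ("text/plain", "text/html"):
            continue  # skip binary and other non-text parts
        payload = (part.get_payload(decode=True) or b"")[:budget]
        budget -= len(payload)
        charset = part.get_content_charset() or "utf-8"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name: fall back to UTF-8.
            text = payload.decode("utf-8", errors="replace")
        tokens.update(re.findall(r"[a-z0-9_]+", text.lower()))
    return tokens
```

With a filter like this, a base64-encoded attachment contributes no tokens at all, so the resulting token list stays small and readable instead of filling up with long random-looking strings.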
