New feature request - counters in database
peet Newbie Joined: 01 August 2007 Location: United States Status: Offline Points: 21
Posted: 05 March 2010 at 11:58pm
Hi,
I've written an enhanced web GUI (still working on it) that uses a SQL DB, from which I pick up the quarantined e-mail data. As each new e-mail is added by the filter app, I trigger a counter that adds the new e-mail address to a separate table and keeps a per-day count of how many e-mails come in for that address. I do the same for the address's domain name to get totals. SQL does all the work using a stored procedure.

But I can only capture what the filter quarantines. I'd love to see a more comprehensive set of counters per e-mail account and domain name:
- incoming e-mail count
- blocked e-mail count
- forwarded (good) e-mail count
- quarantined e-mail count (this is all I have)

I build a chart from the daily count so the user can see how traffic fluctuates over time. It is amazing how an account averaging 30-50 quarantined spams per day can suddenly drop to under 10 for a week, and then in another month jump into the hundreds for just one day. But I'm only seeing quarantined mail.

So would it be possible, and would others also benefit from this? Basically the filter would do a one-way communication to the quarantine DB, or to a local file or cache that is later written to file: one record per e-mail address per day in a counter table, holding each of the counters. Perhaps it could call a stored procedure and let that do the updating and counting, freeing the filter's process from that work. It would be great to know the total of good vs. bad e-mails per e-mail account.

Please, others, add your comment/support for this feature if you'd like to see it!
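A minimal sketch of the counter table described above, assuming one row per e-mail address per day with a column for each of the four counters. It uses Python's built-in `sqlite3` as a stand-in for the real quarantine DB; the table name `mail_counter`, the column names, and the `bump` helper are hypothetical, not part of the filter.

```python
import sqlite3

# Hypothetical counter names; the thread only lists them informally.
COUNTERS = ("incoming", "blocked", "forwarded", "quarantined")

def open_db(path=":memory:"):
    """Create the (hypothetical) counter table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS mail_counter (
               address     TEXT NOT NULL,
               day         TEXT NOT NULL,            -- ISO date, e.g. 2010-03-05
               incoming    INTEGER NOT NULL DEFAULT 0,
               blocked     INTEGER NOT NULL DEFAULT 0,
               forwarded   INTEGER NOT NULL DEFAULT 0,
               quarantined INTEGER NOT NULL DEFAULT 0,
               PRIMARY KEY (address, day)
           )"""
    )
    return conn

def bump(conn, address, day, counter):
    """Add 1 to one counter for (address, day), creating the row on first use."""
    if counter not in COUNTERS:
        raise ValueError(f"unknown counter: {counter}")
    address = address.lower()
    # Make sure the row exists, then increment; this is the whole "one-way" write.
    conn.execute(
        "INSERT OR IGNORE INTO mail_counter (address, day) VALUES (?, ?)",
        (address, day),
    )
    # The column name is safe to interpolate because it was checked against COUNTERS.
    conn.execute(
        f"UPDATE mail_counter SET {counter} = {counter} + 1 "
        "WHERE address = ? AND day = ?",
        (address, day),
    )
    conn.commit()

def daily_counts(conn, address):
    """Return (day, incoming, blocked, forwarded, quarantined) rows for charting."""
    cur = conn.execute(
        "SELECT day, incoming, blocked, forwarded, quarantined "
        "FROM mail_counter WHERE address = ? ORDER BY day",
        (address.lower(),),
    )
    return cur.fetchall()
```

The filter would only ever call something like `bump(conn, "user@example.com", "2010-03-05", "blocked")`; the GUI reads the rows back with `daily_counts` to draw the chart. The same layout would work for domain totals by storing the bare domain in place of the address.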
yapadu Senior Member Joined: 12 May 2005 Status: Offline Points: 297
I have actually been working on a stats system for our users for several weeks now.
I am using Sawmill to process the raw logs, then I extract the data I want from the database that Sawmill builds and put it into my own tables. We generate about 1 GB of logs per day, so the major issue has been the volume of data to deal with. I am tracking messages received, quarantined, flagged as virus, and forwarded as good. This data is broken down by e-mail address (including addresses that are invalid), by domain, and by day, so users can see what is going on. I would also like to capture country data and inbound e-mail senders... but that is just too much data.
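As a rough illustration of that breakdown (not Sawmill's own processing), here is a sketch that rolls already-parsed log records up by address, by domain, and by day in a single pass. The record shape `(day, recipient, disposition)` and the `valid_addresses` set are assumptions made for the example; the actual log format is not given in this thread.

```python
from collections import Counter

def rollup(records, valid_addresses):
    """Count messages per (day, address, disposition) and per (day, domain, disposition).

    records         -- iterable of (day, recipient, disposition) tuples (assumed shape)
    valid_addresses -- set of known mailboxes; anything else is tallied as invalid
    """
    by_address = Counter()
    by_domain = Counter()
    invalid = Counter()
    for day, recipient, disposition in records:
        recipient = recipient.lower()
        # Everything after the last '@' is treated as the domain.
        domain = recipient.rpartition("@")[2]
        by_domain[(day, domain, disposition)] += 1
        if recipient in valid_addresses:
            by_address[(day, recipient, disposition)] += 1
        else:
            invalid[(day, recipient, disposition)] += 1
    return by_address, by_domain, invalid
```

Because `records` can be a generator, the logs are streamed rather than loaded at once, and memory grows with the number of distinct keys rather than with the raw log size.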
peet Newbie Joined: 01 August 2007 Location: United States Status: Offline Points: 21
Wow, processing the entire raw log. I didn't think about that, but on the other hand I didn't want the server processing so much data, just a simple counter.
My current stored procedure uses MS SQL's temporary hold of the records being added: on an event such as INSERT, a trigger fires the stored procedure, which grabs the e-mail address, extracts the domain, and increments the count for that day. It is really easy and fast, and it is also fast to generate a bar chart from the result. So, for example, a user can see the daily fluctuation of quarantined e-mails for their domain, around 50,000 e-mails daily, and compare it with a graph next to it of their personal mailbox's quarantine, which is around 50 daily for that particular e-mail account. So if possible I'd like to avoid using raw logs. The server is busy enough already blocking junk e-mails with its many filters.
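The trigger-driven approach can be sketched roughly as follows. This is not the actual MS SQL stored procedure; it is an SQLite trigger run from Python as a stand-in, and the table names (`quarantine`, `daily_count`), column names, and trigger name are all hypothetical. It also assumes every recipient contains an `@`.

```python
import sqlite3

SCHEMA = """
CREATE TABLE quarantine (
    id        INTEGER PRIMARY KEY,
    recipient TEXT NOT NULL,
    received  TEXT NOT NULL              -- ISO date, e.g. 2010-03-05
);
CREATE TABLE daily_count (
    scope TEXT NOT NULL,                 -- full address or bare domain
    day   TEXT NOT NULL,
    total INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (scope, day)
);
-- Fires once per quarantined message and bumps two counters:
-- one for the address, one for the domain after the '@'.
CREATE TRIGGER count_quarantined AFTER INSERT ON quarantine
BEGIN
    INSERT OR IGNORE INTO daily_count (scope, day)
        VALUES (lower(NEW.recipient), NEW.received);
    UPDATE daily_count SET total = total + 1
        WHERE scope = lower(NEW.recipient) AND day = NEW.received;
    INSERT OR IGNORE INTO daily_count (scope, day)
        VALUES (lower(substr(NEW.recipient, instr(NEW.recipient, '@') + 1)),
                NEW.received);
    UPDATE daily_count SET total = total + 1
        WHERE scope = lower(substr(NEW.recipient, instr(NEW.recipient, '@') + 1))
          AND day = NEW.received;
END;
"""

def make_db():
    """Build an in-memory database with the tables and the counting trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def chart_series(conn, scope):
    """Return (day, total) pairs for one address or domain, ready for a bar chart."""
    cur = conn.execute(
        "SELECT day, total FROM daily_count WHERE scope = ? ORDER BY day",
        (scope.lower(),),
    )
    return cur.fetchall()
```

The application only inserts into `quarantine`; the counting happens inside the database, and the chart query reads a handful of pre-aggregated rows instead of scanning the message table.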
Edited by peet - 06 March 2010 at 1:51am |
yapadu Senior Member Joined: 12 May 2005 Status: Offline Points: 297
I don't know how much additional load it would place on the spamfilter server to have the software do it, but under heavy load (hundreds of connections) I would imagine the overhead of the stats would be quite a bit.
Even for my servers, which usually have only a few connections (maybe 10), I have had to set up two servers to process the data: one for Sawmill and one for the MySQL database that stores the stats. Crunching data on a table with 50 million rows takes some serious power, so there is no way I could do it on the spamfilter server itself. I am actually trying to figure out a way to compress the data and store it in the Amazon cloud or something, so it is there when the users want it but not taking up massive amounts of space on the production servers.