Spam filters can only give a guess about whether a message is spam or not …
Last week we talked about messages disappearing before you receive them and the problem of over-zealous spam filters. In this issue we’ll look inside spam filters to see why they are not perfect.
You and I can glance at a message and know right away whether it’s spam. Computers are nowhere near as smart and probably never will be. All a spam filter can do is analyse a message and work out the likelihood that it is spam. The result is not a simple Yes/No but a sliding scale of ‘spamminess’.
When an email is analysed by a spam filter it is given a ‘score’ – the higher the score, the more likely that the message is spam. Sometimes the score is added to the email header information so you can see what’s happening if you dig into the message header:
X-Note: Total spam weight of this E-mail is 0.
This message has a zero score and is almost certainly not spam.
X-Note: Total spam weight of this E-mail is 10.
This is a high score, so the message is very probably spam.
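If you want to read that score with a script rather than by eye, the following is a minimal sketch in Python. It assumes the header is called X-Note and uses exactly the wording shown above; other filters use different header names and formats, so the pattern and the `spam_weight` function are illustrative only.

```python
import re
from email import message_from_string

# Assumed pattern: matches only the X-Note wording shown in the examples above.
WEIGHT_PATTERN = re.compile(r"Total spam weight of this E-mail is (\d+(?:\.\d+)?)")

def spam_weight(raw_message):
    """Return the spam weight found in an X-Note header, or None if there is none."""
    msg = message_from_string(raw_message)
    # A message can carry several X-Note headers, so check each one.
    for value in msg.get_all("X-Note", []):
        match = WEIGHT_PATTERN.search(value)
        if match:
            return float(match.group(1))
    return None

# A made-up message, for illustration only.
sample = (
    "From: someone@example.com\n"
    "Subject: Hello\n"
    "X-Note: Total spam weight of this E-mail is 10.\n"
    "\n"
    "Message body\n"
)
print(spam_weight(sample))  # prints 10.0
```

A message without the header returns None, which is worth treating as ‘not scored’ rather than as a score of zero.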
High and Low
Messages with a high score are almost certainly spam.
Messages with a low score are probably not spam.
The problem is usually with the middle-range scores. These are a mix of spam and non-spam messages – some messages you want and others you don’t.
If middle-ranking messages are all treated as spam then you’ll miss some messages you want to get (each one is a ‘false positive’). This is the problem with some spam filters at companies or ISPs – they can remove messages that you expect to receive.
Note: There’s no certainty in spam filtering, only scores and likelihoods, which is why we talk about messages ‘probably’ or ‘maybe’ being spam based on their scores. The scoring system can be wrong either way (i.e. giving a low score to spam or a high score to a message you want).
The solution to this ‘gray area’ of spam and non-spam is to have spam filtering at various points in the passage of an email message from sender to you. Your company or email host can filter out the high-scoring messages, while passing through the middle- and low-scoring messages for the spam filter in your email client to handle.
A spam filter on your computer can do things that an ‘off-site’ spam filter cannot. On your computer, the spam filter can check your own Contacts or ‘Safe Senders’ list to work out if a message is from someone you know. You have much more control over your local spam filter than anything available on your email host.
This multi-level approach means that even if a message is wrongly put into your Junk E-mail folder, you can still find it. If the company or email host is too aggressive about deleting spam (i.e. it deletes messages with middle-rank spam scores) then you don’t have an opportunity to do any filtering yourself.
Rather than expecting your ISP to remove all spam, it’s better for them to remove the most obvious spam only and leave the rest for your computer to handle.
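The two-stage idea can be sketched in a few lines of Python. This is only an illustration: the 0 to 10 scale, both thresholds and all the function names are assumptions, not the behaviour of any real mail host or email client.

```python
# Assumed thresholds on a 0-10 scale; real filters use their own scales.
SERVER_DELETE_AT = 8   # the host removes only the most obvious spam
CLIENT_JUNK_AT = 4     # the local filter treats middle scores as suspect

def server_filter(score):
    """Stage 1 (company or ISP): delete only high-scoring messages."""
    return "deleted" if score >= SERVER_DELETE_AT else "passed"

def client_filter(score, sender, safe_senders):
    """Stage 2 (your computer): use local knowledge such as Safe Senders."""
    if sender.lower() in safe_senders:
        return "inbox"
    return "junk" if score >= CLIENT_JUNK_AT else "inbox"

def deliver(score, sender, safe_senders):
    """Run a message through both stages and report where it ends up."""
    if server_filter(score) == "deleted":
        return "deleted"
    return client_filter(score, sender, safe_senders)

safe = {"friend@example.com"}
print(deliver(9, "stranger@example.com", safe))  # deleted
print(deliver(5, "friend@example.com", safe))    # inbox
print(deliver(5, "stranger@example.com", safe))  # junk
print(deliver(1, "stranger@example.com", safe))  # inbox
```

The middle-scoring message from a stranger lands in ‘junk’, where you can still retrieve it. If the server threshold were lowered to 4, that same message would be deleted before your own filter ever saw it.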
Spam scores in Outlook
A good example of this is Outlook 2003 and 2007, which have a built-in spam filter that operates separately from anything at your company or ISP. Behind the scenes, Outlook assigns a spam score to each incoming message, then works out what to do based on your configuration of the Junk E-mail filter.
Under Actions | Junk E-mail | Junk E-mail options you’ll find settings for your spam filter, which determine how messages with different spam scores are handled.
- Low means only the high-scoring messages are filtered
- High means messages with high scores and somewhat lower scores are filtered
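One way to picture the two settings is as different cut-off points on the same score. The Python sketch below assumes a 0 to 10 scale and invented cut-off values; Outlook does not show its internal scores or thresholds, so treat the numbers as placeholders.

```python
# Assumed cut-offs on a 0-10 scale; Outlook's real values are not published here.
JUNK_THRESHOLD = {
    "Low": 8,   # only high-scoring messages are filtered
    "High": 4,  # high and somewhat lower scores are filtered
}

def goes_to_junk(score, level):
    """Return True if a message with this score is filtered at this setting."""
    return score >= JUNK_THRESHOLD[level]

for score in (2, 5, 9):
    print(score, goes_to_junk(score, "Low"), goes_to_junk(score, "High"))
# 2 False False
# 5 False True
# 9 True True
```

The middle score of 5 is the one that changes: it reaches the Inbox on Low but goes to Junk E-mail on High, which is why the High setting catches more spam and also more messages you wanted.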
In the next issue we’ll show you a practical example of how this works with an ‘under the hood’ look at the spam filtering available on Exchange Server. Even if you don’t work with Exchange Server, it will give you a good idea of how effective spam filtering works and how messages with different spam ‘scores’ can be treated differently.