Tuning Your Mailscan Account 07/28/06 – 3 – AJR

Daily Training
– We recommend that you process your Cache Contents daily for each non-zero cache. Be sure to click “Confirm the Status of these Items” when you are done with each cache.

Adjusting Your Threshold
– Your spam threshold is a number representing Mailscan's dividing line between spam and non-spam (good email). This number is usually 5.0 or less. Any email that scores at or above the threshold is considered spam, and anything less is non-spam.
– An email's spam score is the sum of many small numbers, each of which gets added whenever the email fails one of Mailscan's 700+ spam tests.
– Suppose an email accumulates a total spam score of 4.1. If your threshold is 4.25, that email is classified as non-spam; if your threshold is 4.0, it's classified as spam. The lower your threshold, the more spam you'll catch, BUT you also increase the likelihood of a high-scoring piece of good email being classified as spam. Setting your threshold is an art, not a science.
– To adjust your threshold, click the gear icon on your Mailscan home page.
– Then click on your email address in the upper right corner of the E-Mail Addresses box.
– You'll now be in your Mail Filter Settings page.
– Change your threshold by editing the value Consider mail 'Spam' when Score is >= (in this screenshot it's being set to 4.0), then click Update This Address' Settings. The new value takes effect immediately for all new email.
– We recommend adjusting your threshold up or down in increments of 0.25 or less. It's smart to adjust slowly and watch the results for a few days.

Examining Mailscan Statistics
– It is instructive and impressive to check Mailscan's stats to see what it's really doing.
– Click the chart icon on your Mailscan home page.
– This will take you to your Statistics page.
– The information in this section also applies to your personal stats,
but the systemwide stats are a better indication of Mailscan's overall performance. At the bottom of this page, click View Systemwide Statistics.
– The colors of the pie chart wedges correspond to the rows of the table (e.g., medium pink = confirmed spam).
– Look at the top three Mail Types:
– Unconfirmed Non-spam is good email that has not yet been classified by trainers.
– Confirmed Non-spam is good email that has been classified by trainers.
– False Positives are bad, since they mean Mailscan has mistakenly classified good email as spam.
– The sum of these three Mail Types represents all the non-spam we get (18% of incoming email in this screenshot).
– Look at the next three Mail Types:
– Suspected Spam is spam that has not yet been classified by trainers.
– Confirmed Spam is spam that has been classified by trainers.
– False Negatives are good. When we trainers find spam in our non-spam cache and reclassify it, we're telling Mailscan, “here's a piece of junk you missed...
learn about it so you'll be more likely to recognize it next time and classify it as spam.” False negatives are a sign that Mailscan is “learning.”
– The sum of these second three Mail Types represents all the spam we get (78.8% of incoming email in this screenshot). This is consistent with recent studies, which suggest that 75% or more of all Internet email is spam.
– Look at the bottom of this table:
– Efficiency is the best overall measure of Mailscan's performance. The more diligently we trainers confirm and reclassify spam and non-spam, the higher the efficiency will become. We hope to push this slow-moving number into the high 90s.
– Viruses/Malware are found in 2.9% of incoming email. “Viruses” is a catch-all term for viruses, worms and trojans (bad stuff which can directly infect a PC). “Malware” is a catch-all term for spyware, adware, phishing scams, and HTML-borne programming (more bad stuff which can slow down or disable your PC, steal your data and passwords, turn your PC into a spam generator, turn your PC into a robot [“bot”] that attacks other computers, reprogram your browser, etc). All infected email is immediately discarded.
– Click on View Virus Statistics.
– The most frequently detected viruses and malware are at the top of the Viruses list. Notice that the top five are phishing scams embedded in HTML-formatted email, and the top one alone (HTML.Phishing.Pay-168) accounts for 54.3% of all viruses received to date. Can you see why HTML-formatted email should be avoided like the plague?
– Return to the systemwide stats page, then click on View Spamassassin Rule Statistics.
– SpamAssassin is the name of the software which actually does the spam analysis. It uses over 700 rules to examine each email, and every time an email meets a rule's criteria, the rule “triggers.” This table is sorted with the most frequently triggered rules at the top.
– Razor2 is one of the two global anti-spam networks to which we've connected Mailscan, and the fact that Razor2 rules are so high on the list shows how much this participation helps us fight spam. The other network is DCC, which appears in the fifth rule.
– The second rule (RAZOR2_CHECK) has a Score of 1.511. This means each time an email triggers this rule, 1.511 points are added to its accumulating spam score, helping tip it toward a classification as spam.
– Even more significant is the BAYES_99 rule. The type of artificial intelligence used by SpamAssassin is known as a “Bayesian filter,” and it is the Bayesian filter that we are training. When the BAYES_99 rule triggers it means “because of my training I'm 99-100% certain this email is spam, and I'm adding 3.500 points to its spam score.” 3.5 points are a lot, and help to quickly tip the email toward a classification as spam.

White and Black Lists
– A whitelist is a list of email addresses (e.g., joe@boguscom.com) and/or domains (e.g., boguscom.com) from which email addressed to you is always accepted, no matter what. This means you must completely trust the address or domain because its email will bypass Mailscan's tests, including the virus check. In other words, don't take whitelists lightly.
– A blacklist is just the opposite: it's a list of addresses and/or domains from which email addressed to you will always be blocked and discarded, no matter what.
– The purpose of whitelists and blacklists is to allow or block email which Mailscan is not handling correctly for you. They are your last resort (not first) in
tuning Mailscan.
– Click on the divided rectangle icon to reach your White/Blacklist Settings.
– To add a single address to your whitelist, type it into the box (joe@boguscom.com in this example), then click Add to List.
– To remove an entry from your whitelist or blacklist, click in the empty circle under Remove, then click Update.

Strategy and Expectations
– Now that you understand how Mailscan works, here's a strategy for tuning it.
– When confirming your non-spam, watch for two trends in spam values: 1) the lowest normal value for false negatives you reclassify as spam, and 2) the highest normal value for non-spam (your actual good email).
– If you're lucky, your lowest normal false negative value (e.g., 3.9) will be higher than your highest normal value for good email (e.g., 3.0). If this is the case, set your threshold a little lower than your false negative value (e.g., 3.75). Watch the results for a few days and adjust as necessary.
– If you're not so lucky and are constantly reclassifying low-scoring spam, consider lowering your threshold and whitelisting the few known-good addresses that regularly score above it. An alternative is to blacklist recurring spam addresses if there aren't too many of them (but there will always be more and you'll always be playing catch-up).
– A reasonable goal is to keep tweaking your threshold, whitelists and blacklists until Mailscan approaches 100% accuracy in classifying your email.
– It is unreasonable to expect Mailscan to consistently achieve 100% accuracy because “things change.” Your idea of spam may change, and without question the tactics of spammers will change. The smarter spammers (who increasingly are well trained and may be financed by organized crime syndicates) know exactly how SpamAssassin works and keep trying to find ways around its rules.

Opting Out
– If you eventually get tired of handling your own email and training Mailscan, you can easily opt
out and return to “non-interactive mode.”
– To opt out, email the HelpDesk <helpdesk@nicc.edu> with your request.
– As soon as we delete your Mailscan account (including your personal threshold, blacklists and whitelists), all your email will be processed according to Mailscan's settings for the NICC domain. You may get a little more spam, but you won't have to do anything more than delete it from your New mail folder.

Thank You
– Once again, thank you for being a Mailscan trainer. Since all of NICC's email is processed by the same rules and training, the few minutes you contribute daily to the training of the Bayesian filter pay off for everyone.
– If you have further questions or suggestions about Mailscan, please use the server's built-in Help system or contact the HelpDesk <helpdesk@nicc.edu>.
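
For readers who like to see the arithmetic spelled out, the scoring and threshold ideas from Adjusting Your Threshold and Strategy and Expectations can be sketched in a few lines of Python. This is only an illustration, not Mailscan's or SpamAssassin's actual code: the function names are hypothetical, the email that triggers both RAZOR2_CHECK and BAYES_99 is made up, and the rule of rounding down to the next 0.25 step is one assumed reading of "a little lower" that happens to reproduce the 3.75 example.

```python
import math
from typing import Iterable, Optional


def total_score(rule_scores: Iterable[float]) -> float:
    """Add up the points from every rule an email triggered."""
    return round(sum(rule_scores), 3)


def classify(score: float, threshold: float) -> str:
    """At or above the threshold is spam; anything less is non-spam."""
    return "spam" if score >= threshold else "non-spam"


def suggest_threshold(lowest_false_negative: float,
                      highest_good: float,
                      step: float = 0.25) -> Optional[float]:
    """Hypothetical helper (an assumption, not a Mailscan feature).

    Picks the largest multiple of `step` strictly below the lowest
    false-negative score. Returns None when that value would not sit
    above the highest good-email score (the "not so lucky" case).
    """
    candidate = (math.ceil(lowest_false_negative / step) - 1) * step
    return candidate if candidate > highest_good else None


if __name__ == "__main__":
    # The handout's example: a score of 4.1 against two thresholds.
    print(classify(4.1, 4.25))  # non-spam
    print(classify(4.1, 4.0))   # spam

    # Made-up email that triggers RAZOR2_CHECK and BAYES_99.
    score = total_score([1.511, 3.500])
    print(score, classify(score, 5.0))

    # Lucky case from the strategy section, then an unlucky one.
    print(suggest_threshold(3.9, 3.0))
    print(suggest_threshold(2.8, 3.0))
```

When the helper returns None, the two score ranges overlap and no single threshold separates them, which is where the whitelist and blacklist advice above comes in.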