UCLA Mathematics Spam FAQ
What is SPAM?
Spam is undesirable or unsolicited mail, generally coming from a previously unknown or unfamiliar source.

Describe the SPAM Filtering Method
The SPAM Filtering Method (at Math) is a way for users to 'manage' spam and/or dispense with it easily. The tool we are implementing as part of the Spam Filtering Method is a program called SpamAssassin. SpamAssassin is an automated program that helps you determine whether a piece of mail qualifies as spam, and lets you choose what you would like to do with it.

What do I need to do?
You will need to log in to your UNIX account to set up Spam Filtering. Log in to your home directory and run (type) spamscript. Then follow the instructions. **Please** read the rest of this FAQ before you run the command. You need to understand what is happening before you implement it, or you could easily lose mail that *could be important* to you.

How does the SPAM Filtering Method work?
Mail arrives at your mailbox on these UNIX machines. This mail is intercepted by SpamAssassin. Based on your configuration settings, SpamAssassin evaluates whether it considers the mail to be spam, and then deposits 'potential' spam into a separate location. If you use UNIX 'PINE' mail, it's a folder called 'spam'. If you use the POP mail program Eudora, you can also have it deposit potential spam into a folder named 'spam'.

What will I need to do after installing?
Once you've installed it, it's running. After that, you will need to view (open) the 'spam' folder and delete the messages that are actually spam (watch out for `FALSE POSITIVES`).

What is 'potential' spam? And what is a FALSE POSITIVE?
'Potential' spam is so called because SpamAssassin's evaluation is at least a 'partially subjective' process, so a piece of mail that it considers spam may not actually be spam.
A "FALSE POSITIVE" is a piece of mail that SpamAssassin has considered to be 'potential' spam, but that YOU know is really a valid, useful, or important piece of mail.

How do I view the 'potential' spam?
Your 'potential' spam will be deposited into a folder in your UNIX home directory called mail/spam. You can either open the file directly to view it, or run the UNIX mail program called PINE to look at the list of messages, Subject lines, etc. If you use your Eudora mail to filter spam, you would create a folder called 'spam', where 'potential' spam is deposited for subsequent viewing.

How do I use PINE?
Please go to the following web site to learn how: Learn about PINE on Unix

What if I want to just dump 'potential' spam?
Our Spam Filtering Method is not set up to do this. The reason is that if you just dump what SpamAssassin thinks is `potential` spam, you RUN THE RISK of losing mail.

How 'specifically' does SpamAssassin work?
SpamAssassin works by performing "SPAM Tagging", and then allows you to decide what to do, based on "SPAM Filtering".

What is SPAM Tagging? The Spam SCORE?
If SpamAssassin is fully configured, it looks for various qualities of the mail that suggest it could be spam. It assigns a numerical value to each quality. It adds up the values of all the qualities, and the total is the spam 'SCORE'. This value is written into the mail's header before delivery. This is called Spam Tagging.

What is SPAM Filtering?
SPAM Filtering is performing an action based upon the spam SCORE. The value (SCORE) produced by SpamAssassin is compared with the value you choose as your limiting value. This limiting value is called the "SPAM Hit Level".

What is the "SPAM Hit Level"?
It is the value (a number) that you assign to decide above which level a piece of mail is considered spam. For example, if your "Spam Hit Level" is set to 5.0, and SpamAssassin determines that a piece of mail has Spam SCORE 5.1, it will classify that mail as SPAM.
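
The tagging and filtering steps described above can be sketched in a few lines of Python. This is only an illustration, not SpamAssassin's actual code: the rule names and point values are made up, and treating a score exactly equal to the hit level as spam is an assumption, since this FAQ only covers scores above the level.

```python
# Hypothetical qualities and point values, for illustration only.
RULE_SCORES = {
    "subject_all_caps": 1.5,
    "mentions_free_money": 2.6,
    "forged_sender": 1.0,
}


def spam_score(matched_rules):
    """Spam Tagging: add up the values of every quality found in the mail."""
    return sum(RULE_SCORES[name] for name in matched_rules)


def is_potential_spam(score, hit_level=5.0):
    """Spam Filtering: compare the SCORE with the Spam Hit Level."""
    # Assumption: a score exactly equal to the hit level also counts as spam.
    return score >= hit_level


score = spam_score(["subject_all_caps", "mentions_free_money", "forged_sender"])
print(round(score, 1), is_potential_spam(score))
```

With a hit level of 5.0, the three qualities above add up to about 5.1, so the message would be filed as 'potential' spam. Raising the hit level makes the filter more lenient; lowering it makes the filter stricter.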

What is a whitelist?
A whitelist is a list of sites / email addresses that you want to "always NOT be considered spam". SpamAssassin could tag mail from a specific site as spam even though you do not consider it spam. In that case, you can add the site to the whitelist. All mail from that site will subsequently pass through, no matter what the Spam Score is.

What is a blacklist?
A blacklist is a list of sites / email addresses that you want to "always be considered spam". Sometimes the spam filtering lets mail pass through (based on your Spam Hit Level), but you are familiar with that particular site and never want to receive mail from it. You then add the site to the blacklist, and mail from it is always filed away as spam.

How do I use Bayesian Filtering?
Bayesian Filtering is a way to do two things:
1) Take 'spam' mail that has somehow passed through SpamAssassin's mail filter, and 'teach' SpamAssassin to consider this type of mail as spam.
2) Take 'non-spam' mail (called 'ham') and teach SpamAssassin to consider this type of mail as 'non-spam'.

To do 1), using the mail program 'pine', save what you consider spam into a folder called 'badmail' (for example). Then, from your home directory, type
    > cd mail
    > sa-learn --spam --mbox badmail

To do 2), using the mail program 'pine', save what you consider 'ham' (good mail) into a folder called 'goodmail' (for example). Then, from your home directory, type
    > cd mail
    > sa-learn --ham --mbox goodmail

Summary of Instructions
1) Log in to your UNIX account
2) Type 'cd' to make sure you're in your home directory
3) Type 'spamscript'
4) Follow the instructions
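
As a rough illustration of how the whitelist and blacklist described in this FAQ override the Spam Hit Level, here is a small Python sketch. It is a simplification under stated assumptions, not how SpamAssassin is implemented: the function names are hypothetical, entries are matched as either a full address or the domain after the '@', and the whitelist is checked first (this FAQ does not say what happens if a sender is on both lists).

```python
def listed(sender, entries):
    """True if the sender's full address or its site (domain) is in entries."""
    domain = sender.rsplit("@", 1)[-1]
    return sender in entries or domain in entries


def classify(sender, score, hit_level=5.0, whitelist=(), blacklist=()):
    """Return 'inbox' or 'spam' for one message."""
    if listed(sender, whitelist):
        return "inbox"  # whitelisted: passes through whatever the score is
    if listed(sender, blacklist):
        return "spam"   # blacklisted: always filed away as spam
    # Assumption: a score equal to the hit level is treated as spam.
    return "spam" if score >= hit_level else "inbox"


# Example addresses are placeholders.
print(classify("colleague@example.edu", 9.0, whitelist=["example.edu"]))
print(classify("ads@example.com", 0.5, blacklist=["ads@example.com"]))
```

The first message goes to the inbox despite its high score, and the second is filed as spam despite its low score, which is the behaviour the whitelist and blacklist answers describe.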