Incoming filter technology

SpamPort uses a large set of filtering techniques to sort mail into spam and ham (valid mail). This page briefly describes the process.

Because spammers send a lot of mail, they don't always follow the same process as normal mailers. For example, spam servers usually don't retry when we temporarily reject a mail, because their mail queue is usually full at that time. A hacked home computer that sends out spam will usually present a hostname that differs from those of 'normal' mail servers. By combining these kinds of information we can distinguish hacked servers, spam servers and hacked computers from normal mail servers.
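
The retry behaviour described above is what a greylisting-style policy relies on. The text does not say how SpamPort implements its temporary rejections, so the following is only a minimal sketch under that assumption: the first attempt from an unknown combination of IP address, sender and recipient gets a temporary failure, and a retry after a waiting period is accepted. The class name `Greylist`, the `min_delay` value and the reply strings are illustrative, not taken from SpamPort.

```python
from dataclasses import dataclass, field


@dataclass
class Greylist:
    """Temporarily reject the first delivery attempt from an unknown sender."""

    min_delay: float = 300.0  # hypothetical: seconds a sender must wait before a retry counts
    first_seen: dict = field(default_factory=dict)  # (ip, sender, recipient) -> first attempt time

    def check(self, ip: str, sender: str, recipient: str, now: float) -> str:
        key = (ip, sender, recipient)
        # Remember when this combination was first seen; keep the old time on retries.
        seen_at = self.first_seen.setdefault(key, now)
        if now - seen_at >= self.min_delay:
            return "250 accepted"
        # 4xx is an SMTP temporary failure: a normal mail server queues the mail and retries.
        return "451 try again later"
```

A normal mail server that retries a few minutes later passes this check, while a sender that never retries does not get its mail delivered.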


Spam filtering at the gate

We start by checking the techniques used by the sender of the mail. When a server wants to deliver a mail, it identifies itself by telling us who it is. We get its hostname, its IP address and the e-mail address it is trying to send from.
This gives us a lot of information to work with:

  • We can check if the hostname is valid.
  • We can check if the hostname exists.
  • We can check if the hostname resolves to an IP address.
  • We can check if the IP address has a valid reverse DNS record.
  • We can check if the domain name they are trying to send from exists.
  • We can check if the domain name they are trying to send from resolves to an IP address.

A lot of mails (currently around 33%) are blocked at this stage.
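
As an illustration of the sender checks listed above, here is a minimal sketch using only the Python standard library. It is not SpamPort's actual implementation. The function names are hypothetical, and the syntax rules (at least two labels, last label not purely numeric) are assumptions about what counts as a "valid" hostname.

```python
import re
import socket

# One DNS label: 1-63 letters, digits or hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(name: str) -> bool:
    """Syntax check only: no DNS lookup is performed."""
    if name.endswith("."):
        name = name[:-1]  # allow a single trailing dot (fully qualified form)
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    # Assumption: require at least two labels and a non-numeric last label,
    # so "localhost" and bare IP addresses are rejected.
    if len(labels) < 2 or labels[-1].isdigit():
        return False
    return all(_LABEL.fullmatch(label) for label in labels)


def resolves(name: str) -> bool:
    """True if the name resolves to at least one IP address (needs network access)."""
    try:
        return bool(socket.getaddrinfo(name, None))
    except (socket.gaierror, UnicodeError):
        return False


def has_reverse_dns(ip: str) -> bool:
    """True if the IP address has a reverse DNS (PTR) record (needs network access)."""
    try:
        socket.gethostbyaddr(ip)
        return True
    except (socket.herror, socket.gaierror):
        return False
```

The same `resolves` helper can be applied to the domain part of the sender's e-mail address.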

Next, we check whether the sender's IP address is listed on a small number of external, well-maintained blacklists. This blocks around 48% of all mails.
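
The text does not say how these blacklists are queried. Many public blacklists are DNS-based (DNSBLs), so the sketch below assumes that model: the octets of the IPv4 address are reversed, the blacklist zone is appended, and the address counts as listed if that name resolves. The zone `dnsbl.example.org` is a placeholder, not a real list.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    # IPv4Address validates the input and raises ValueError for anything else.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """True if the zone returns an address for the query name (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name means "not listed"; other lookup failures are also treated that way here.
        return False
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example.org")` builds the name `99.2.0.192.dnsbl.example.org`.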

In total, around 81% of all incoming mails are already blocked at this point, even before we let the sending server push its mail into our system...


Spam filtering based on the sent message

Once we allow the sender to send us the message, we scan the remaining 19% of mails for unwanted content.

We first check whether the user has blacklisted or whitelisted the sender. If the sender is blacklisted, we immediately stop processing the message. If the sender is whitelisted, we skip the scans and deliver the mail.
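
This short-circuit can be sketched as a small decision function. The names are hypothetical, and checking the blacklist before the whitelist is an assumption; the text does not say which list wins when a sender is on both.

```python
def list_decision(sender: str, blacklist: set, whitelist: set) -> str:
    """Return 'reject', 'deliver' or 'scan' for a sender address."""
    sender = sender.lower()  # assumes both lists store lower-case addresses
    if sender in blacklist:
        return "reject"   # stop processing immediately
    if sender in whitelist:
        return "deliver"  # skip all content scans
    return "scan"         # continue with the content checks
```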

We then continue to scan:

  • We scan for viruses, trojans and other malware. Around 1.4% of all mails are stopped here.
  • We perform over 1,000 checks to see if a mail is spam. Around 2.6% of all mails are stopped here. These checks include:
    • More external blacklists that are less reliable. We don't block based on them, but give the mail an increased 'spam score'. Above a certain score, we consider the mail to be spam.
    • We check DKIM and SPF records to see if the sender might be forged, and lower the spam score if everything checks out.
    • We check if the checksum of the message is known on external lists, which means others received the exact same message, and increase the spam score based on that.
    • We check if any URLs in the message are listed on blacklists.
    • We check if the techniques used in the mail are valid.
    • We check for common 'spam texts' and increase the score based on that.
  • We check for any attempts at phishing, for example mails that contain a link whose target differs from the URL it pretends to be. Even when a mail is otherwise considered 'clean', we place a warning about this in the mail.
  • We check a daily updated list of known bad and good domains to increase or decrease the spam score.
  • We use a Bayesian filter to calculate the probability that a text is valid mail and base a score on that.
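
The scoring idea behind these checks can be sketched as follows: each check that matches adds its weight to the spam score (negative weights lower it), and the mail is treated as spam when the total exceeds a threshold. The three example checks, their weights and the threshold of 5.0 are made-up placeholders, not SpamPort's real rules.

```python
from typing import Callable, Iterable, Tuple

# A check is (name, predicate on the raw message, weight added when it matches).
Check = Tuple[str, Callable[[str], bool], float]


def spam_score(message: str, checks: Iterable[Check]) -> float:
    """Sum the weights of all checks that match the message."""
    return sum(weight for _name, matches, weight in checks if matches(message))


def is_spam(message: str, checks: Iterable[Check], threshold: float = 5.0) -> bool:
    """A mail is considered spam when its score is above the threshold."""
    return spam_score(message, checks) > threshold


# Hypothetical checks for illustration only.
EXAMPLE_CHECKS = [
    ("common spam text", lambda m: "act now" in m.lower(), 3.0),
    ("blacklisted URL", lambda m: "bad.example" in m, 4.0),
    ("authentication passed", lambda m: "dkim=pass" in m, -2.0),
]
```

With these placeholders, a message that contains a spam phrase and a blacklisted URL scores 7.0 and is stopped, while the same message with passing authentication scores 5.0 and stays at the threshold rather than above it.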

Mails stopped at this level can be viewed and released in the rare case that a mail is falsely identified as spam.


Clean mails will be delivered

After all the scanning is done, we release the clean mails to the hostname or IP address that is the final destination for your domain's mail. Around 15% of all mails we receive are clean, which means 85% of all email is garbage...
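
The percentages quoted on this page add up as follows. This is only a worked check of the arithmetic, using the figures from the text. The dictionary keys are descriptive labels, not SpamPort terminology.

```python
# Share of all incoming mail stopped at each stage, in percent (figures from the text).
blocked = {
    "sender checks": 33.0,
    "external blacklists": 48.0,
    "malware scan": 1.4,
    "spam checks": 2.6,
}

# Blocked before the message itself is accepted.
at_gate = blocked["sender checks"] + blocked["external blacklists"]

# Rounded to one decimal to avoid floating-point noise in the sum.
total_blocked = round(sum(blocked.values()), 1)
clean = round(100.0 - total_blocked, 1)

print(f"blocked at the gate: {at_gate}%")
print(f"blocked in total:    {total_blocked}%")
print(f"delivered as clean:  {clean}%")
```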