ePrism
Spam Email Filtering Appliance
ePrism is a total spam solution that delivers the most comprehensive
spam mail management available. It features a combination of easily
managed whitelists, Realtime Blackhole List (RBL) lookups, Distributed
Checksum Clearinghouse (DCC) bulk-mail control and industry-leading
Statistical Token Analysis (STA). With local learning, ePrism provides
complete control over the classification of e-mail and its
disposition. ePrism's spam control package is highly accurate and has
a low incidence of false positives.
ePrism gives you flexible control over your spam management system:
you can trap or block spam, or simply tag the mail that is considered
spam. This flexibility reduces the time a mail administrator spends
dealing with users who think their mail is being deleted.
Overview of ePrism's Automated Anti-Spam Tools
ePrism provides two complementary mechanisms for controlling spam:

- Rules-based tools that can provide automated protection.
These are RBL (for identifying known spammers), DCC (for identifying
bulk mail) and STA (for advanced lexical analysis). Used properly,
these tools will handle the great majority of spam.
- Locally specified filters for exceptions, overrides, whitelists
and blacklists. These tools allow you to avoid the problems that can
come from over-reliance on automated methods. It is inevitable that
some spam will not be caught by the automated tools; it is also
inevitable that some mail will be wrongly classified as spam (for
example, mailing lists wrongly marked as "bulk").

ePrism uses a combination of the following automated anti-spam tools
to provide a comprehensive spam filter.
Realtime Blackhole Lists (RBLs)
RBLs are databases of known spammers (or servers reported as sources
of spam). There are many of these, and St. Bernard provides a list of
the better-known ones. Some of these lists are free; others charge an
access fee.
The RBL mechanism is based on the Domain Name System (DNS). DNS is a
data query service used on the Internet for translating hostnames into
Internet addresses. Every server that attempts to connect to ePrism
will be looked up on the specified RBL servers using DNS. Because the
check is an ordinary DNS query, it is a low-cost operation that has
little impact on performance. If the lookup succeeds, the server is
listed as a spammer and the connection is dropped.
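
The lookup described above can be sketched in a few lines of Python.
This is a minimal illustration, not ePrism's implementation: it
assumes the common DNSBL convention of reversing the IPv4 octets and
appending the list's zone name, and the zone `rbl.example.org` and the
function names are hypothetical.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets plus the RBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the DNS lookup succeeds, i.e. the server is listed."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record exists for this name: the server is not on the list.
        return False


# Name that would be queried for a connecting server (hypothetical zone).
print(dnsbl_query_name("192.0.2.99", "rbl.example.org"))
```

The `resolve` parameter defaults to the system resolver; passing a
different callable makes it possible to exercise the logic without
network access.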
Distributed Checksum Clearinghouse (DCC)
DCC, or Distributed
Checksum Clearinghouse, is based on a number of open servers (rather
like the RBL scheme) that maintain databases of message checksums
(derived numeric values that uniquely identify a message).
Mail users and Internet Service Providers all over the world submit
checksums of all messages received. The database records how many
times each message has been submitted. If requested, the DCC server
can return a count of how many instances of a message have been
recorded.
The ePrism Mail
Filter uses this count to determine the disposition of a message.
DCC can reduce
spam by up to 90% in many cases. It is almost entirely hands-free,
creates little overhead and can be configured to block, quarantine
or tag and deliver messages.
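
The checksum-and-count idea can be illustrated with a small in-memory
stand-in for a DCC server. This is a sketch under stated assumptions:
the class and function names are hypothetical, a plain SHA-256 of the
normalized body stands in for the checksums a real DCC server uses,
and no network protocol is involved.

```python
import hashlib
from collections import Counter


class ChecksumClearinghouse:
    """In-memory stand-in for a DCC server (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    @staticmethod
    def checksum(body: str) -> str:
        # Normalize case and whitespace so trivially different copies match.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, body: str) -> int:
        """Record one more sighting of the message and return the new count."""
        key = self.checksum(body)
        self._counts[key] += 1
        return self._counts[key]

    def count(self, body: str) -> int:
        """Return how many times the message has been reported so far."""
        return self._counts[self.checksum(body)]


def disposition(count: int, threshold: int, action: str = "tag") -> str:
    """Apply the configured action once the count exceeds the threshold."""
    return action if count > threshold else "deliver"


dcc = ChecksumClearinghouse()
for _ in range(5):
    dcc.report("Buy   NOW\n")
print(disposition(dcc.count("buy now"), threshold=3, action="quarantine"))
```

A message seen only a few times stays below the threshold and is
delivered normally; the threshold and the action (block, quarantine or
tag) are the locally configured parts.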
Statistical Token Analysis (STA)
STA is a new, sophisticated method of identifying spam based on
content. It applies Bayesian statistics to the problem of classifying
mail by content.
Simple text
matches can lead to false positives, since a word or phrase can
have many meanings depending on context. What is needed is a way
to accurately measure how likely any particular message is to be
spam without having to specify every word and phrase.
STA achieves this by deriving, for each word or phrase, a measure of
how much it contributes to the likelihood that a message is spam,
based on the relative frequency of words and phrases in a large number
of spam messages. From this analysis, it creates a table of
"discriminators" (words and phrases associated with spam) and
associated measures of "spam-ness."
When a new incoming message is received, STA analyzes the message,
looks up its discriminators (words and phrases) and their measures in
the table and then aggregates them to create a metric. ePrism uses
this metric to determine the disposition of a message.
A special feature
of STA is that it "learns" from local legitimate and spam
mail to build more accurate measures of what constitutes spam based
on local language usage.
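
The following sketch shows one common way to build such a table and
aggregate it into a single metric. It is a generic naive Bayes
illustration, not ePrism's actual algorithm: the class name, the
tokenizer, the add-one smoothing and the equal-prior assumption are
all choices made for this example.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> set:
    """Split a message into a set of lowercase word tokens."""
    return set(re.findall(r"[a-z0-9$'-]+", text.lower()))


class TokenAnalyzer:
    """Toy statistical token analyzer (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.spam_counts: Counter = Counter()
        self.ham_counts: Counter = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def learn(self, text: str, is_spam: bool) -> None:
        """Update token frequencies from one locally classified message."""
        if is_spam:
            self.spam_counts.update(tokens(text))
            self.n_spam += 1
        else:
            self.ham_counts.update(tokens(text))
            self.n_ham += 1

    def spamness(self, token: str) -> float:
        """Estimate P(spam | token) with add-one smoothing and equal priors."""
        p_spam = (self.spam_counts[token] + 1) / (self.n_spam + 2)
        p_ham = (self.ham_counts[token] + 1) / (self.n_ham + 2)
        return p_spam / (p_spam + p_ham)

    def score(self, text: str) -> float:
        """Aggregate per-token measures into one metric between 0 and 1."""
        log_odds = 0.0
        for t in tokens(text):
            p = self.spamness(t)
            log_odds += math.log(p) - math.log(1.0 - p)
        # Clamp so that very long messages cannot overflow math.exp.
        log_odds = max(-30.0, min(30.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


sta = TokenAnalyzer()
sta.learn("cheap pills buy now", is_spam=True)
sta.learn("meeting agenda for monday", is_spam=False)
print(sta.score("cheap pills") > sta.score("monday meeting"))
```

Calling `learn` with local legitimate and spam mail is what adapts the
table to local language usage; a token the analyzer has never seen
scores a neutral 0.5.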
How Messages Are Processed for Spam
ePrism applies a series of spam filters to messages, starting with the
simplest and proceeding to the most complex. Messages are processed in
the following sequence:

- Mail is processed for spam only if it arrives from an "untrusted"
source, which is any system that is not on the local network and has
not been specifically "trusted" by the administrator.
- The source of the message is compared against the list locally
specified in the Source Address Filters. If found, it may be
"rejected", "accepted" for immediate delivery, or accepted for relay.
- Optionally, the source may also be checked against an RBL and
rejected if found.
- The message is then passed through the content filters, which look
for a text or pattern match against a specified part of the message.
If a filter rule is triggered, an associated action is executed, which
can include "reject" or "accept" for immediate delivery.
- The message is optionally checked by DCC, which reports if the
message is "bulk" or has been reported on the Internet n times. If
this count exceeds the locally set threshold, the message may be
rejected, quarantined or tagged and delivered as required.
- The message is optionally checked by STA to see if its contents
exceed a locally specified threshold of "spam-ness". If so, the
message may be rejected, quarantined or tagged and delivered as
required.
- Prior to delivery, ePrism will apply any locally specified
attachment checks and, if available, virus checks on the contents of
the message.
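
The ordering of these checks can be sketched as a single function in
which the first decisive result wins. This is a simplified model, not
ePrism's code: the `Message` and `Policy` types, their fields and the
default thresholds are hypothetical, the DCC count and STA score are
supplied as plain values, and relaying and the final attachment and
virus checks are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    source: str
    trusted: bool = False
    body: str = ""
    dcc_count: int = 0      # count a DCC server would report (supplied here)
    sta_score: float = 0.0  # "spam-ness" an STA engine would compute


@dataclass
class Policy:
    source_filters: dict = field(default_factory=dict)   # source -> action
    rbl_listed: set = field(default_factory=set)         # sources on an RBL
    content_filters: list = field(default_factory=list)  # (text, action)
    dcc_threshold: int = 100
    sta_threshold: float = 0.9
    spam_action: str = "tag"  # or "reject" / "quarantine"


def process(msg: Message, policy: Policy) -> str:
    """Run the checks in order; the first decisive result wins."""
    if msg.trusted:
        return "deliver"  # trusted sources skip spam processing
    action = policy.source_filters.get(msg.source)
    if action == "accept":
        return "deliver"
    if action == "reject":
        return "reject"
    if msg.source in policy.rbl_listed:
        return "reject"
    for text, rule_action in policy.content_filters:
        if text in msg.body:
            return "deliver" if rule_action == "accept" else rule_action
    if msg.dcc_count > policy.dcc_threshold:
        return policy.spam_action
    if msg.sta_score > policy.sta_threshold:
        return policy.spam_action
    return "deliver"


policy = Policy(rbl_listed={"203.0.113.5"}, spam_action="quarantine")
print(process(Message("203.0.113.5"), policy))
print(process(Message("192.0.2.1", sta_score=0.95), policy))
```

Putting the cheap source-based checks first means that many messages
are decided before the more expensive content analysis runs.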