DNA Analysis Spots E-mail Spam
Introduction
Biologists at IBM's Watson Research Centre have devised an anti-spam filter based on the way scientists analyse genetic sequences. It has proved 96.5% effective at spotting spam. The underlying algorithm helps determine the properties of a protein, such as its function and structure, automatically and directly from its sequence, which is represented as a string. Pattern-discovery algorithms of this kind apply to a wide range of problems.
One property of the algorithm is that it spots any pattern that occurs two or more times, wherever the occurrences fall in the message. It can also be trained so that it is not fooled by cunning replacements of "S" with "$", a common ploy spammers use to bypass conventional e-mail filters. Further, the method builds up a database of patterns from known spam and keeps adding new patterns as it spots them. It compares this vocabulary with e-mail that it knows is not spam. An incoming message put through this pattern analysis is then rejected if it contains a large proportion of the spam vocabulary patterns.
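The idea described above can be illustrated with a small Python sketch. It is not IBM's algorithm: fixed-length character n-grams stand in for the patterns the real system discovers, and the names (`learn_patterns`, `spam_score`, `is_spam`), the pattern length, the 0.5 threshold and the choice to discard any pattern also seen in legitimate mail are all assumptions made for illustration. The article only mentions the "$" for "S" substitution, so that is the only one handled here.

```python
from collections import Counter

# Assumption: only the "$" -> "s" substitution mentioned in the article is undone.
LOOKALIKES = str.maketrans({"$": "s"})


def normalize(text):
    """Lower-case the text and undo the look-alike substitution."""
    return text.lower().translate(LOOKALIKES)


def ngrams(text, n):
    """Return every length-n substring of the normalized text."""
    t = normalize(text)
    return [t[i:i + n] for i in range(len(t) - n + 1)]


def learn_patterns(spam, ham, n=6, min_count=2):
    """Keep n-grams that occur at least min_count times across the spam
    messages and never occur in the known-good (ham) messages."""
    counts = Counter(g for msg in spam for g in ngrams(msg, n))
    ham_grams = {g for msg in ham for g in ngrams(msg, n)}
    return {g for g, c in counts.items() if c >= min_count and g not in ham_grams}


def spam_score(message, patterns, n=6):
    """Fraction of the message's n-grams that match a learned spam pattern."""
    grams = ngrams(message, n)
    if not grams:
        return 0.0
    return sum(g in patterns for g in grams) / len(grams)


def is_spam(message, patterns, n=6, threshold=0.5):
    """Reject the message when a large proportion of it matches spam patterns."""
    return spam_score(message, patterns, n) >= threshold


# Toy training data, invented for this example.
spam = ["Buy cheap pills now", "cheap pills for sale"]
ham = ["Meeting notes for Monday", "Lunch at noon?"]
patterns = learn_patterns(spam, ham)

print(is_spam("CHEAP PILL$", patterns))      # disguised spelling is normalized first
print(is_spam("Lunch on Monday", patterns))  # shares no pattern with the spam set
```

Scoring by the proportion of matching n-grams mirrors the article's "large proportion of the same vocabulary patterns"; a real filter would need far more training data and a threshold tuned on held-out mail.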
The system has to go through more pilot studies and testing before it is let loose to protect inboxes.
Source
Wista Innovation