Jan 20, 2016
www.bitdefender.com
An Anti-Spam filter based on Adaptive Neural Networks
Alexandru Catalin Cosoi
Researcher / BitDefender AntiSpam Laboratory
Neural Networks
Consist of a large number of processing elements, called neurons
Offer a different approach to problem solving
Neural networks and conventional algorithmic computers complement each other
Adaptive Resonance Theory
Proposed by Carpenter and Grossberg in 1976-86
Solves the stability-plasticity dilemma
ART architectures can self-organize in real time, producing stable recognition while learning input patterns beyond those originally stored
Contains two components: an attentional and an orienting subsystem
The orienting subsystem works like a novelty detector
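The interplay between the two subsystems can be illustrated with a minimal sketch. It assumes the Fuzzy ART choice and match rules (fuzzy AND via element-wise minimum, complement coding); the class name, parameter names, and default values are hypothetical and are not taken from the presented system.

```python
import numpy as np

def complement_code(x):
    """Append 1 - x so that every coded input has the same L1 norm."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, 1.0 - x])

class FuzzyART:
    """Illustrative Fuzzy ART clusterer (hypothetical names and defaults)."""

    def __init__(self, vigilance=0.75, alpha=0.001, beta=1.0):
        self.rho = vigilance      # how closely an input must match a category
        self.alpha = alpha        # small choice parameter
        self.beta = beta          # learning rate (1.0 = fast learning)
        self.weights = []         # one weight vector per committed category

    def train(self, x):
        i = complement_code(x)
        # Attentional subsystem: rank stored categories by the choice function.
        scores = [np.minimum(i, w).sum() / (self.alpha + w.sum())
                  for w in self.weights]
        for j in np.argsort(scores)[::-1]:
            w = self.weights[j]
            match = np.minimum(i, w).sum() / i.sum()
            if match >= self.rho:
                # Resonance: refine the existing category.
                self.weights[j] = (self.beta * np.minimum(i, w)
                                   + (1.0 - self.beta) * w)
                return int(j)
        # Orienting subsystem rejected every category: the input is novel,
        # so a new category is committed without touching the old ones.
        self.weights.append(i.copy())
        return len(self.weights) - 1

net = FuzzyART(vigilance=0.8)
first = net.train([0.9, 0.1])     # commits category 0
similar = net.train([0.88, 0.12]) # close enough to resonate with category 0
novel = net.train([0.1, 0.9])     # fails the vigilance test, new category
```

Raising the vigilance makes the novelty detector stricter and yields more, narrower categories; lowering it yields fewer, broader ones.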
ARTMAP
ARTMAP: a class of neural network architectures
Performs incremental supervised learning
Learns multi-dimensional maps
Accepts input vectors presented in arbitrary order
Fuzzy ARTMAP: features are expressed in fuzzy logic
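To make the supervised variant concrete, here is a sketch of a simplified Fuzzy ARTMAP in which each category carries a single class label (a common simplification that collapses the second ART module into a label table). The class name, parameters, and the "spam"/"ham" toy data are assumptions for illustration, not the configuration of the presented filter.

```python
import numpy as np

class SimplifiedFuzzyARTMAP:
    """Illustrative simplified Fuzzy ARTMAP (hypothetical names and defaults)."""

    def __init__(self, baseline_vigilance=0.5, alpha=0.001, epsilon=1e-6):
        self.rho_base = baseline_vigilance
        self.alpha = alpha
        self.epsilon = epsilon
        self.weights = []   # category weight vectors
        self.labels = []    # class label attached to each category

    @staticmethod
    def _code(x):
        # Complement coding of features given as fuzzy values in [0, 1].
        x = np.asarray(x, dtype=float)
        return np.concatenate([x, 1.0 - x])

    def _ranked(self, i):
        scores = [np.minimum(i, w).sum() / (self.alpha + w.sum())
                  for w in self.weights]
        return np.argsort(scores)[::-1]

    def train(self, x, label):
        i = self._code(x)
        rho = self.rho_base
        for j in self._ranked(i):
            match = np.minimum(i, self.weights[j]).sum() / i.sum()
            if match < rho:
                continue
            if self.labels[j] == label:
                # Correct prediction: fast learning on the winning category.
                self.weights[j] = np.minimum(i, self.weights[j])
                return
            # Wrong label: match tracking raises vigilance just above the
            # current match so the search moves on to other categories.
            rho = match + self.epsilon
        self.weights.append(i.copy())
        self.labels.append(label)

    def predict(self, x):
        if not self.weights:
            return None
        return self.labels[int(self._ranked(self._code(x))[0])]

model = SimplifiedFuzzyARTMAP()
# Items arrive one at a time and in arbitrary order.
for features, label in [([0.9, 0.8], "spam"), ([0.1, 0.2], "ham"),
                        ([0.8, 0.9], "spam"), ([0.2, 0.1], "ham")]:
    model.train(features, label)
```

Because learning is incremental, a later call to `train` with a new item only adds or refines categories; previously learned items do not need to be presented again.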
System
A complex system that will
gather the spam and ham corpus
study its characteristics
learn
with no human involvement
Inputs
words like viagra, mortgage, xanax
obfuscated words
information extracted from headers
other heuristics used in Anti-Spam filters
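The kinds of inputs listed above could be turned into a feature vector roughly as follows. This is only a sketch: the word list comes from the slide, but the character substitution table, the two header checks (missing Message-ID, all-caps subject), and the function names are assumptions chosen for illustration.

```python
import re
import string
from email import message_from_string

SPAM_WORDS = ("viagra", "mortgage", "xanax")
# Assumed substitution table for common character swaps (0->o, 1->i, ...).
LEET = str.maketrans("01345@$", "oieasas")

def deobfuscate(token):
    """Undo simple character swaps and drop separators such as dots."""
    return re.sub(r"[^a-z]", "", token.lower().translate(LEET))

def extract_features(raw_email):
    """Return a list of values in [0, 1], usable as fuzzy inputs."""
    msg = message_from_string(raw_email)
    tokens = str(msg.get_payload()).split()
    plain = {t.strip(string.punctuation).lower() for t in tokens}
    hidden = {deobfuscate(t) for t in tokens}
    features = []
    for word in SPAM_WORDS:
        # The word appears as written.
        features.append(1.0 if word in plain else 0.0)
        # The word appears only after de-obfuscation.
        features.append(1.0 if word in hidden and word not in plain else 0.0)
    # Hypothetical header heuristics.
    features.append(0.0 if msg["Message-ID"] else 1.0)
    features.append(1.0 if (msg["Subject"] or "").isupper() else 0.0)
    return features

sample = ("Subject: CHEAP MEDS\n\n"
          "Buy v1agra and x.a.n.a.x today, refinance your mortgage!")
vector = extract_features(sample)
```

Each position in the vector corresponds to one heuristic, so adding a heuristic means adding one more input to the network.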
Hierarchy
Initial implementation: single neural network
Increasing number of heuristics
Increasing number of training items
Train both on spam and ham
Improvements
Next step: multiple neural networks (a hierarchy)
Run only requested heuristics
Perform a refined classification
Split email into several categories
Increase detection speed
Learn new patterns without losing detection on older spam
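The idea of running only the requested heuristics can be sketched as a tree of decision nodes with lazy, cached heuristic evaluation. In the sketch below, simple threshold functions stand in for the neural networks, and the node layout, heuristic names, and category names ("spam/pharmacy", "spam/financial") are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Node:
    heuristics: List[str]                       # heuristics this node needs
    decide: Callable[[Dict[str, float]], str]   # stand-in for a neural network
    children: Dict[str, "Node"] = field(default_factory=dict)

def classify(root, email, heuristics):
    """Walk the hierarchy, computing each heuristic at most once and only
    when a visited node asks for it."""
    values = {}
    node = root
    while True:
        for name in node.heuristics:
            if name not in values:
                values[name] = heuristics[name](email)
        outcome = node.decide(values)
        if outcome not in node.children:
            return outcome, sorted(values)      # final category, heuristics run
        node = node.children[outcome]

HEURISTICS = {
    "has_spam_word": lambda e: float(
        any(w in e.lower() for w in ("viagra", "mortgage", "xanax"))),
    "mentions_drug": lambda e: float(
        any(w in e.lower() for w in ("viagra", "xanax"))),
    "mentions_loan": lambda e: float("mortgage" in e.lower()),
}

def spam_kind(v):
    if v["mentions_drug"] > 0.5:
        return "spam/pharmacy"
    if v["mentions_loan"] > 0.5:
        return "spam/financial"
    return "spam/other"

spam_node = Node(["mentions_drug", "mentions_loan"], spam_kind)
root = Node(["has_spam_word"],
            lambda v: "suspect" if v["has_spam_word"] > 0.5 else "ham",
            {"suspect": spam_node})

category, used = classify(root, "Cheap xanax here", HEURISTICS)
```

A legitimate message stops at the root after a single heuristic, which is where the speed gain would come from; only suspicious messages pay for the refined classification.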
Correction module and noise reduction
Performs noise reduction on the input data before it enters the learning phase
Increases the discrimination rate between input patterns
Eliminates or modifies patterns that can cause misclassification (the same pattern appearing in multiple categories)
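One simple way to realize the elimination step is to drop every pattern that occurs under more than one category before training. The sketch below shows only that elimination strategy (not modification), and the function name and data layout are assumptions.

```python
from collections import defaultdict

def remove_conflicting_patterns(samples):
    """Filter (pattern, category) pairs before the learning phase.

    A pattern seen with more than one category is ambiguous and is dropped;
    exact duplicates of a kept pattern are collapsed to one entry.
    """
    categories = defaultdict(set)
    for pattern, category in samples:
        categories[tuple(pattern)].add(category)

    cleaned, seen = [], set()
    for pattern, category in samples:
        key = tuple(pattern)
        if len(categories[key]) > 1:
            continue            # same pattern in multiple categories
        if key in seen:
            continue            # duplicate of an already kept pattern
        seen.add(key)
        cleaned.append((key, category))
    return cleaned

training = [
    ((1.0, 0.0, 1.0), "spam"),
    ((0.0, 0.0, 0.0), "ham"),
    ((1.0, 1.0, 0.0), "spam"),
    ((1.0, 1.0, 0.0), "ham"),   # conflicts with the previous item
    ((1.0, 0.0, 1.0), "spam"),  # duplicate
]
cleaned = remove_conflicting_patterns(training)
```

With conflicting items removed, the remaining patterns map to exactly one category each, which avoids committing contradictory categories during training.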
Results
Table 3: Detection results for an increasing number of training items. Both the training and test corpora were analyzed.
Detection results on training items
Detection results on test items
Conclusions
Fast learning method
Solves the stability-plasticity dilemma (property preserved from the ART modules)
Consistently improves the heuristic filter
• Faster
• The analysis is based on pattern recognition
Performs a refined analysis
High detection rates
Advanced categorization
Multiple spam categories
Can also be used for parental control
Can perform email classification (business, school, personal)
In conclusion, this system improves both speed and detection