A Survey on Anomaly Ditection
by Akira Imada (December 2004)
All the topic of this page is, in short, designing an algorithm
to detect patterns called 'non-self' (anomalcy)
among patterns called 'self' (nomalcy).
Those patterns are basically in a n dimensional
Eucledean space.
Anomaly Detection in General --- Artificial Immune System (AIS) Approach
We explore, first of all, anomaly detection by AIS in this section of this page.
Title of this section includes "in General" while some of the papers
bellow describe specifically "Network Intrusion." IMHO, however,
the algorithms proposed are easily extended to more general case of
"Anomaly Detection."
- Detection of Binary Encoded Anomaly
Let's start with binary-pattern detection as anomaly, since one of my concerns
is search for a-needle-in-a-haystack problem.
- [1]
Ayara, M., J. Timmis, R. D. Lemos, L. N. D. Castro, and R. Duncan (2002)
"Negative Selection: How to Generate Detectors."
Proceedings of 1st International Conference on Artificial Immune Systems (ICARIS)
pp. 89--98.
After a short summary of
"Exhaustive Detector Generating Algorithm"
as a basis for comparison, the authors showed
"Greedy Detector Generating Algorithm"
in which
matching rule "r-chunk" is exploit. The author wrote that a more detailed
algorithm was described in:
- [2]
P. Helman and S. Forrest (1994)
"An Efficient Algorithm for Generating Random Antibody Strings."
Technical Report 94-07, University of New Mexico, Albuquerque, NM.
But so far I have not succeded in getting it via ftp. Don't worry, however,
I finally found a detaild description of "r-contiguous bits matching"
in the paper
A simplest question of "how many strings can a single detector match
with at least r contiguous bit positions?" is addressed in
The other analysis for detector of anomalies in binary space is found in
- Clonal Selection
- [6]
J. Kim and P. Bentley
"An Artificial Immune Model for Network Intrusion Detection"
Althought this paper specifically intends to propose
a Network Intrusion Detection System, the algorithm could be
easily extended to a general Anomaly Detection. Authors wrote:
"It combines three
evolutionary stages. Gene library evolution simulates the first stage of evolution,
which learns knowledge of currently existing antigens.
......
Gene expression and negative selection form the second stage of evolution,
generating diverse pre-detectors and selecting mature detector sets by
eliminating false pre-detectors in a self-organising way.
......
Clonal selection is the third stage of evolution, detecting various
intrusions with a limited number of detector sets using approximate binding,
and generating memory detectors.
- Positive Selection
- Negative Selection
What we should specially note in algorithms using negative selection is
"only the negative (or normal) training data are needed" as Dasgupta et all wrote in
- [8]
Dasgupta, et al (1999) "An Anomaly Detection Algorithm
Inspired by the Immune System."
Dasgupta et al. (Eds), Artificial Immune System and Their Application
Dasgupta's group in Memphis US gave us an excellent series of systematic proposal
every year of a detector
to detect "non-self" in the continuous multi-dimensional Eucledian space.
The following a couple of papers are by Dasgupta's group from newer to older ones.
- [9]
Z. Ji and D. Dasgupata (2004)
"Augmented Negative Selection Algorithm with Variable-Coverage Detectors."
Proceedings of the Congress on Evolutionary Computation
As authors denote, Negative selection in this series of papers are very simple
in the sense that
"The idea of negative selection was from T cell development process in the thymus.
If a T cell recognizes self cells, it is eliminated before deployment
for immune functionality. In an analogous manner,
the negative selection algorithm generates the detector set
by eliminating any detector candidates that match self
samples. It is thus used as an anomaly detection
mechanism with the advantage that only the negative (normal)
training data are needed.
Then authors proposed two algorithms: one is to generate detectors
of constant sized and the other variable sized sphers. (See my article
"Hints for Implementations".
This paper is currently the latest vergion of them in this series, but the described algorithm
is not easy to interpret, unfortunately. So I reccommend to read
The other ones in the series by Dasgupt's group:
-
[12] Gonzalez, F., D. Dasgupta, J. Gomez (2003)
"The Effect of Binary Matching Rules in Negative Selection."
GECCO-03.
Also negative selection by other groups.
Immuno-fuzzy Approach
Fuzzy Rules not based on Immune System but Evolved by Genetic Algorithm
- [17]
J. Gomez and D. Dasgupta (2001)
"Evolving Fuzzy Classifiers for Intrusion Detection"
Proc. of IEEE Workshop on Information Assurance.
Authors wrote:
"proposes a technique to generate
good fuzzy classifiers using genetic algorithms that can detect
anomalies ... The main idea is to
evolve two rules, one for the normal class and other for the
abnormal class ..."
And their algorithm is in
Also the detailed algorithm seems to be in
- [19]
J. Gomez and D. Dasgupta (2002)
"Using Competitive Operators and a Local Selection Scheme in Genetic Search."
Proceedings (Late-Breaking) of the Genetic and Evolutionary Computation Conference,
pp. 193-200.
But I have not succeeded in getting the paper via Internet
and it has not been possible to contact the first author (Gomes) in Columbia.
Messy-GA Approach
Yet Another Intrusion Detection --- Data Mining Approach