

Using the Immune Metaphor for Machine Learning and Data Analysis
Summary of the Work
Over the past few years we have created a novel data analysis technique
inspired by the natural immune system. Building on the foundations of
previous work, we extracted salient features of the immune system,
simplified them and applied them as metaphors to create a principled and
effective data analysis technique.
Earlier work created a system called JISYS, which
was an attempt at creating an immune-based approach to fraud detection.
Some success was achieved with this work, but a
decision was made to adopt a more principled approach and create a
generic learning algorithm.
Our initial attempt at an Artificial Immune System (AIS) for data
analysis used the process of cloning and mutation to build up a network
of B cells that were a diverse representation of the data being analysed.
This network was visualised via a specially developed tool, which allows
the user to interact with the network and use the system for exploratory
data analysis. Experiments were performed on two different data sets: a
simple simulated data set and the Fisher Iris data set. Good results were
obtained by the AIS on both sets, with the AIS being able to identify
clusters known to exist within them. The algorithm's behaviour was
investigated extensively, including how the algorithm parameters affected
performance and results. The work was then compared with other similar
techniques, such as Kohonen networks and cluster analysis.
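The cloning and mutation step described above can be illustrated with a minimal sketch. This is not the original AIS implementation: the affinity measure (based on Euclidean distance), the rule that the clone count scales with affinity, the Gaussian mutation, and the names `affinity` and `clone_and_mutate` are all assumptions made for illustration.

```python
import math
import random


def affinity(cell, item):
    """Similarity in (0, 1]; 1.0 means the B cell matches the item exactly.

    Assumption: affinity is derived from Euclidean distance.
    """
    return 1.0 / (1.0 + math.dist(cell, item))


def clone_and_mutate(cell, item, max_clones=5, mutation_rate=0.1, rng=None):
    """Return mutated copies of `cell`, stimulated by one data `item`.

    Assumption: the number of clones grows with affinity, and each clone
    is perturbed by Gaussian noise scaled by `mutation_rate`.
    """
    rng = rng or random.Random()
    n_clones = round(max_clones * affinity(cell, item))
    clones = []
    for _ in range(n_clones):
        clone = [x + rng.gauss(0.0, 1.0) * mutation_rate for x in cell]
        clones.append(clone)
    return clones


if __name__ == "__main__":
    rng = random.Random(0)
    # A B cell that matches the data item well produces several clones.
    for clone in clone_and_mutate([5.1, 3.5], [5.0, 3.4], rng=rng):
        print(clone)
```

Under these assumptions, B cells close to a data item produce many slightly varied copies, while distant cells produce none, which is one way a network can come to represent the data diversely.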
Despite the initial success of the original AIS, problems were identified
with the algorithm and a second stage of research was undertaken. This
resulted in the resource-limited artificial immune system, which we call
AINE (Artificial Immune NEtwork). AINE creates a stable network of
objects that does not deteriorate or lose patterns once they are
discovered. Periods of stable network size are observed, punctuated by
perturbations of the network size. Shown below are results from the Iris
data set at two different points in the training cycle. As you can see,
the networks appear to be the same: the network population has
stabilised and a good fit has been found for the data. Additionally, AINE
has successfully identified three clusters within the data set, a known
feature of the data. The user can interact with the network to gain an
understanding of why items may be connected and of the relationships
between them. AINE has only four parameters: the number of resources, the
mutation rate, a scalar that controls network connectivity and a value
that controls the maximum number of clones produced by an ARB (artificial
recognition ball).
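To make the four parameters and the idea of resource limitation concrete, here is a small sketch. It is not the AINE implementation: the class `AineParams`, the function `allocate_resources`, the rule that an ARB claims resources in proportion to its stimulation, and the rule that the weakest ARBs are removed first are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class AineParams:
    """The four parameters named in the text (field names are hypothetical)."""
    num_resources: int          # total resources the ARBs compete for
    mutation_rate: float        # how strongly clones are mutated
    connectivity_scalar: float  # scalar controlling network connectivity
    max_clones: int             # maximum clones produced by one ARB


def allocate_resources(stimulation, params):
    """Return the indices of ARBs that survive resource competition.

    Assumption: each ARB claims stimulation * max_clones resources, and
    the weakest ARBs are removed until the total claim fits the budget.
    """
    claims = {i: s * params.max_clones for i, s in enumerate(stimulation)}
    survivors = sorted(claims, key=claims.get, reverse=True)
    total = sum(claims.values())
    while survivors and total > params.num_resources:
        weakest = survivors.pop()  # the last entry has the smallest claim
        total -= claims[weakest]
    return sorted(survivors)


if __name__ == "__main__":
    params = AineParams(num_resources=12, mutation_rate=0.1,
                        connectivity_scalar=1.0, max_clones=8)
    # Stimulation levels of three ARBs (made-up values).
    print(allocate_resources([0.5, 0.25, 1.0], params))
```

With a fixed resource budget, the population cannot grow without bound, which is consistent with the stable network sizes described above.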
We feel this work is a successful application of immune system
metaphors to create a novel data analysis technique. Furthermore, AINE
goes a long way toward being a viable contender for effective data
analysis.
An animated GIF showing network evolution. This network is evolved using
simple data that contains two clusters. The network starts as one large
connected structure, then eventually separates into two distinct clusters.
The network after 10 iterations of the training data
The network after 12 iterations of the training data
         