Using the Immune Metaphor for Machine Learning and Data Analysis

Summary of the Work


Over the past few years we have developed a novel data analysis technique inspired by the natural immune system. Building on the foundations of previous work, we extracted salient features of the immune system, simplified them into immunological metaphors, and applied them to create a principled and effective data analysis technique.

Earlier work created a system called JISYS, which was an attempt at creating an immune-based approach to fraud detection. Some success was achieved with this work, but a decision was made to adopt a more principled approach and create a generic learning algorithm.

Our initial Artificial Immune System (AIS) for data analysis used the processes of cloning and mutation to build up a network of B cells that was a diverse representation of the data being analysed. This network was visualised via a specially developed tool, which allows the user to interact with the network and use the system for exploratory data analysis. Experiments were performed on two data sets: a simple simulated data set and the Fisher Iris data set. Good results were obtained by the AIS on both sets, with the AIS being able to identify clusters known to exist within them. The algorithm's behaviour was investigated extensively, including how the algorithm parameters affected performance and results. The work was then compared with other similar techniques, such as Kohonen networks and cluster analysis.
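The cloning and mutation step described above might be sketched as follows. This is a minimal illustration and not the published algorithm: the affinity measure (inverse Euclidean distance), the Gaussian mutation, and the names affinity, clone_and_mutate, max_clones and mutation_rate are all assumptions made for the example.

```python
import math
import random

def affinity(cell, antigen):
    """Assumed similarity in (0, 1]: 1 for identical vectors, falling with distance."""
    return 1.0 / (1.0 + math.dist(cell, antigen))

def clone_and_mutate(cell, antigen, max_clones=10, mutation_rate=0.2, rng=None):
    """Return mutated copies of a B cell; closer matches produce more clones."""
    rng = rng or random.Random()
    # Assumption: clone count scales linearly with affinity to the data item.
    n_clones = round(max_clones * affinity(cell, antigen))
    clones = []
    for _ in range(n_clones):
        # Each component is perturbed with probability mutation_rate.
        clone = [
            x + rng.gauss(0.0, 0.1) if rng.random() < mutation_rate else x
            for x in cell
        ]
        clones.append(clone)
    return clones
```

A B cell that matches a data item exactly yields max_clones copies, while a distant cell yields few or none, so the population drifts toward regions where the data lie.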

Despite the initial success of the original AIS, problems were identified with the algorithm and a second stage of research was undertaken. This resulted in the resource limited artificial immune system, which we call AINE (Artificial Immune NEtwork). AINE creates a stable network of objects that does not deteriorate or lose patterns once they are discovered. Periods of stable network size are observed, with perturbations of the network size. Shown below are results from the Iris data set at two different points in the training cycle; the networks appear to be the same. In fact, the network population has stabilised and a good fit has been found for the data. Additionally, AINE has successfully identified three clusters within the data set, a known feature of the data. The user can interact with the network to gain an understanding of why items may be connected and of the relationships between them. AINE has only four parameters: the number of resources, the mutation rate, a scalar that controls network connectivity, and a value that controls the maximum number of clones produced by an ARB (artificial recognition ball).
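To illustrate how a fixed pool of resources can keep the network size bounded, here is a rough sketch of one possible allocation scheme. It is an assumption-laden illustration, not the AINE implementation: the names AineParams and allocate_resources, the field names, and the rule that the weakest ARBs lose resources first are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class AineParams:
    """The four parameters named in the text; field names are hypothetical."""
    num_resources: int
    mutation_rate: float
    connectivity_scalar: float
    max_clones: int

def allocate_resources(claims, num_resources):
    """Trim resource claims to fit the pool and drop ARBs left with nothing.

    claims maps an ARB identifier to the resources it claims, assumed here
    to reflect how strongly the ARB is stimulated by the data.
    """
    allocation = dict(claims)
    excess = sum(allocation.values()) - num_resources
    # Assumption: the weakest claimants give up resources first.
    for arb in sorted(allocation, key=allocation.get):
        if excess <= 0:
            break
        taken = min(allocation[arb], excess)
        allocation[arb] -= taken
        excess -= taken
    # ARBs holding no resources are removed from the network.
    return {arb: r for arb, r in allocation.items() if r > 0}

params = AineParams(num_resources=6, mutation_rate=0.1,
                    connectivity_scalar=1.0, max_clones=10)
survivors = allocate_resources({"a": 5, "b": 3, "c": 1}, params.num_resources)
# survivors == {"a": 5, "b": 1}; "c" is culled
```

Because the pool is fixed, new clones can only survive by displacing weaker ARBs, which is one way a population could settle to a stable size.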

We feel this work is a successful application of immune system metaphors to create a novel data analysis technique. Furthermore, AINE goes a long way toward being a viable contender for effective data analysis.

An animated gif showing network evolution


This network is evolved using simple data that contains two clusters. The network starts as one large connected structure then eventually separates into two distinct clusters.
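One way to read clusters off such a network is to treat them as connected components. The sketch below assumes that two cells are linked when their Euclidean distance falls below a threshold; the linking rule and the names link_cells, find_clusters and threshold are illustrative assumptions, not the tool's actual method.

```python
import math

def link_cells(cells, threshold):
    """Link every pair of cells closer than threshold (assumed linking rule)."""
    edges = {i: set() for i in range(len(cells))}
    for i in range(len(cells)):
        for j in range(i + 1, len(cells)):
            if math.dist(cells[i], cells[j]) < threshold:
                edges[i].add(j)
                edges[j].add(i)
    return edges

def find_clusters(edges):
    """Return the connected components of the network as sorted index lists."""
    seen = set()
    groups = []
    for start in edges:
        if start in seen:
            continue
        group = set()
        stack = [start]
        while stack:  # depth-first walk over linked cells
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(edges[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

cells = [(0, 0), (0, 1), (5, 5), (5, 6)]
loose = find_clusters(link_cells(cells, threshold=10))  # one connected structure
tight = find_clusters(link_cells(cells, threshold=2))   # two separate clusters
```

With a generous threshold all four cells form a single component; with a tighter one the network splits into two groups, mirroring the separation described above.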


The network after 10 iterations of the training data


The network after 12 iterations of the training data







http://www.cs.ukc.ac.uk/people/staff/jt6/index.local
Last modified Thu Jun 8 12:01:45 BST 2000