A Survey on Data Mining by Artificial Immune System

by Akira Imada (October 2004)

Data is not knowledge but just a set of data. Simply put, Data Mining is the extraction of knowledge from a set of data.

To tell you the truth, I didn't even know that. All I thought was "Data Mining is a tool to mine the important data from an enormous amount of mixed data, ranging from golden to muddy." Now I know Data Mining is a tool to extract knowledge from data. But how? Actually, the title of the paper I happened to come across is "An Artificial Immune System for Fuzzy-Rule Induction in Data Mining." My ignorance also extended to Fuzzy Logic (FL). I once thought "Fuzzy Logic is a logic to fuzzify an idea when we are not so clear." Well, the latter is a joke, of course, but joking aside, it's true that I am starting this topic almost from scratch. It will be fun, however, to start quite a new topic. Feel free to join us on this topic.

At this moment, this document is only a framework in preparation for describing the results of my survey of this topic.

Data-Mining based on Fuzzy-Logic by Artificial Immune System.

In his nicely organized book, Kecman [99] wrote, paraphrasing Zadeh (the originator of Fuzzy Logic theory), "... Fuzzy Logic is a tool for embedding human knowledge, which is approximate rather than exact, into computer algorithms by using a set of IF-THEN rules as a linguistic form of structured human knowledge..." (also paraphrased by me). From this point of view, Fuzzy Logic is a good candidate tool for creating a data-mining method. Here, we see two studies where Fuzzy Logic is used in Data Mining based on an Artificial Immune System.
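To make the idea of IF-THEN rules over approximate, linguistic terms a little more concrete, here is a minimal sketch in Python. It is not taken from Kecman's book or from any of the papers surveyed here: the attributes ("age", "income"), the triangular membership functions, their breakpoints, and the use of min for AND are all illustrative assumptions.

```python
def triangular(x, left, peak, right):
    """Degree (0..1) to which x belongs to a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms, given as (left, peak, right) breakpoints.
TERMS = {
    "age": {"young": (0, 20, 40), "middle": (30, 45, 60), "old": (50, 70, 90)},
    "income": {"low": (0, 20, 40), "high": (30, 60, 90)},
}


def membership(attribute, term, value):
    """Membership degree of a crisp value in a linguistic term."""
    return triangular(value, *TERMS[attribute][term])


def firing_strength(antecedent, record):
    """Degree to which a record satisfies the IF part (AND taken as min)."""
    return min(membership(attr, term, record[attr]) for attr, term in antecedent)


# IF age is young AND income is high THEN class is "buyer" (made-up rule).
rule = {"if": [("age", "young"), ("income", "high")], "then": "buyer"}
record = {"age": 30, "income": 45}
strength = firing_strength(rule["if"], record)
```

With these made-up breakpoints, a 30-year-old is "young" to degree 0.5 and an income of 45 is "high" to degree 0.5, so the rule fires with strength 0.5 rather than simply matching or not matching.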

The first one below is our TARGET paper for the time being.

You might also see [2] below, which is a kind of sister paper of [0]. You will find that our current target paper was heavily influenced by this paper. Hence it's very helpful for understanding the background of the method described in [0] too.

The second paper using Fuzzy Logic and an Artificial Immune System for Data Mining purposes is:

Evolution of Fuzzy Rules by Evolutionary Computations (AIS is not employed)

These works use not an Artificial Immune System but only Evolutionary Computations to evolve fuzzy rules. The paper below is a good survey for the purpose.

Another good survey from this viewpoint is in

Yet another approach is "Evolution of Fuzzy Rules." A somewhat old paper, but IMHO still worth reading, is

Or, a much simpler and clearer version by the same author et al. can be found

Evolution of Rules to Classify Data but NOT based on Fuzzy Logic

The paper below gives us an overview of "What is Data Mining," and in its GA implementation a very simple chromosome design is proposed. That is, each gene is made up of two continuous values like (a, b), which means the attribute is supposed to lie between these two values.
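The (a, b) gene design can be sketched in a few lines of Python. Only the idea that each gene is an interval which the attribute must fall into comes from the description above. The function names, the toy records, and the fitness measure (fraction of records on which "the rule matches" agrees with "the record has the target class") are assumptions made for illustration, not the paper's actual definitions.

```python
import random


def random_chromosome(bounds, rng):
    """One gene (a, b) with a <= b per attribute, drawn within that attribute's range."""
    genes = []
    for low, high in bounds:
        a, b = sorted((rng.uniform(low, high), rng.uniform(low, high)))
        genes.append((a, b))
    return genes


def matches(chromosome, record):
    """A record satisfies the rule if every attribute lies inside its gene's interval."""
    return all(a <= value <= b for (a, b), value in zip(chromosome, record))


def fitness(chromosome, records, labels, target):
    """Assumed measure: share of records where 'matches' agrees with 'label == target'."""
    agree = sum(
        matches(chromosome, record) == (label == target)
        for record, label in zip(records, labels)
    )
    return agree / len(records)


# Toy data with two continuous attributes (made up for this sketch).
records = [(5.0, 1.0), (6.0, 1.5), (2.0, 3.0), (1.0, 4.0)]
labels = ["A", "A", "B", "B"]

# A hand-written chromosome: attribute 0 in [4, 7] and attribute 1 in [0.5, 2].
rule_for_a = [(4.0, 7.0), (0.5, 2.0)]
score = fitness(rule_for_a, records, labels, target="A")

# A random individual, as a GA would create for its initial population.
individual = random_chromosome([(0.0, 10.0), (0.0, 5.0)], random.Random(0))
```

On this toy data the hand-written rule covers exactly the two class-A records, so its score is 1.0. A GA would start from random individuals like `individual` and apply selection, crossover, and mutation to the interval endpoints.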


Although our target paper [0] is fairly clearly written, some descriptions are not nearly detailed enough for a real implementation. For example, the authors employ "data pruning based on information gain" and describe it just as "a statistical procedure." Instead, the authors refer to the next paper, and it's a must-read in order to know "what is data pruning and how?"
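Since [0] does not spell the procedure out, the sketch below shows only the standard textbook definition of information gain (the entropy of the class labels minus the weighted entropy after splitting on an attribute). How the authors actually turn this quantity into data pruning is not described here, so the toy data and the attribute names are purely illustrative assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


# Toy categorical data (made up): one informative and one uninformative attribute.
labels = ["yes", "yes", "no", "no"]
outlook = ["sunny", "sunny", "rain", "rain"]
windy = ["t", "f", "t", "f"]

gain_outlook = information_gain(outlook, labels)
gain_windy = information_gain(windy, labels)
```

Here `outlook` separates the classes perfectly (gain of 1 bit), while `windy` tells us nothing (gain of 0). A pruning step could, for instance, discard whatever contributes little gain, but that reading is an assumption until the referenced paper is checked.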


Data Mining & Artificial Immune System.

Using neither Fuzzy Logic nor Evolutionary Computations but only an Artificial Immune System.

Data Mining & Ant Colony Optimization (ACO)

It is quite natural to think of using ACO to solve the Traveling Salesperson Problem, since real ants in nature are good at finding a near-shortest path from their nest to a food source. But here ACO is applied to Data Mining. An interesting topic, isn't it? We found the papers on this topic below.

A more detailed version of the paper above by the same authors.


The two papers above are also by Freitas's group, as are [2] and [3]. It really seems to be good group work, right? The other paper concerning Data Mining using Ant Colony is



Yet another application of ACO to Data Mining

An Artificial Immune Model for Network Intrusion Detection


The Other Topics




Data Mining as Anomaly Detection.

This is a big topic, and a separate page is provided. In case you are interested, click the title immediately above.


Data Mining in Finance.

Though this item might belong under a different categorization, we add it for possible collaboration with parties who specialize in this topic.



Bibliography

[99] V. Kecman, "Learning and Soft Computing: Support Vector Machines, Neural Networks, and Fuzzy Logic Models," MIT Press, 2001.