ENIS SENERDEM


Gopi K. Kuchimanchi, Vir V. Phoha, Kiran S. Balagani, Shekhar R. Gaddam
"Dimension Reduction Using Feature Extraction Methods for Real-time Misuse Detection Systems"

They reduce the dimensionality of the dataset using a randomly selected 25\% of the 
127,437 TCP sessions in the original dataset.

First they excluded two features, {\it number_of_outbound_commands} and {\it is_host_login},
because their values remain constant. They then computed the correlation matrix for the 25\%
subdataset. By sorting the eigenvalues in decreasing order and then
applying the Scree Plot test, they concluded that the first six components or the first 19
components are the most significant in the data.
19 out of 41
src_bytes
dst_bytes
duration
is_guest_login
is_host_login
srv_diff_host_rate
diff_srv_rate
service
flag
protocol_type
num_root
hot
num_compromised
dst_host_same_srv_rate
dst_host_count
rerror_rate
srv_count
dst_host_srv_diff_host_rate
count
(dst_host_same_src_port_rate)
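
[sketch] The procedure noted above (drop constant features, build the correlation matrix, sort its eigenvalues in decreasing order for a scree-style inspection) could be tried out roughly as follows. This is only a minimal pure-Python sketch and not the authors' code: the helper names, the Jacobi eigenvalue routine and the tiny data matrix are my own assumptions for illustration.

```python
import math

def drop_constant_columns(rows):
    """Remove features whose value never changes (zero variance)."""
    cols = list(zip(*rows))
    keep = [j for j, col in enumerate(cols) if max(col) != min(col)]
    return [[row[j] for j in keep] for row in rows], keep

def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) for c, m in zip(cols, means)]
    z = [[(x - m) / s for x in c] for c, m, s in zip(cols, means, stds)]
    d = len(z)
    return [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(d)]
            for i in range(d)]

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-18):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

# Made-up data: 5 sessions, 4 features; the third feature is constant.
data = [[1.0, 2.0, 0.0, 5.0],
        [2.0, 4.1, 0.0, 3.0],
        [3.0, 6.2, 0.0, 4.0],
        [4.0, 7.9, 0.0, 1.0],
        [5.0, 10.1, 0.0, 2.0]]
reduced, kept = drop_constant_columns(data)
eigenvalues = jacobi_eigenvalues(correlation_matrix(reduced))
explained = [ev / sum(eigenvalues) for ev in eigenvalues]  # values one would plot in a scree plot
print(kept, eigenvalues, explained)
```

The eigenvalues of a correlation matrix sum to the number of retained features, which gives a quick sanity check on the decomposition.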

The goal of the authors is dimension reduction using


==========
An Immuno-Fuzzy Approach to Anomaly Detection
Jonatan Gomez et al.

10% dataset, which includes 492,021 records
Each record is made up of 42 attributes, of which 22 are numerical

33 attributes (after removing categorical attributes?)
are normalized using their maximum and minimum values
80% of the normal samples were picked at random and used as the training set
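
[sketch] The two preprocessing steps noted above (max/min normalization, then a random 80% of the normal samples as training set) might look like this. The function names, the seed and the small data matrix are my assumptions, not taken from the paper.

```python
import random

def min_max_normalize(rows):
    """Scale every attribute to [0, 1] using its own minimum and maximum."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [[(x - lo) / (hi - lo) if hi > lo else 0.0
             for x, lo, hi in zip(row, lows, highs)]
            for row in rows]

def split_normal(normal_rows, train_fraction=0.8, seed=0):
    """Pick a random fraction of the normal samples as the training set."""
    rng = random.Random(seed)
    shuffled = normal_rows[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Made-up normal samples with two numerical attributes.
normal = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0], [2.0, 12.0], [8.0, 28.0]]
scaled = min_max_normalize(normal)
train, held_out = split_normal(scaled)
print(len(train), len(held_out))
```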


==================================================================================================
Tina Yu and J. Miller (2002)
"Finding Needles in Haystacks is Not Hard with Neutrality."
EURO GP

----------
There is one kind of search space, needle-in-haystack, in which it is difficult for heuristic search 
algorithms to outperform random search.

In a needle-in-haystack type of search space, a solution is either a needle or a piece
of hay. In other words, a search algorithm either finds a perfect solution (the needle) or 
otherwise (the hay). 
--------------------------------------------------------------------------------------------------
No knowledge about the location of the needles can be obtained from examining the hays. 
--------------------------------------------------------------------------------------------------
In this kind of situation, a heuristic search algorithm works like a random search algorithm.
When the number of solutions in the search space is small, finding a good solution is difficult,
no matter what search algorithm one uses.

What if the search space only has two possible fitness values (one for the needles and the other 
for the hays)? Evolutionary algorithms seem to become helpless in this kind of situation. In this
 study, 
--------------------------------------------------------------------------------------------------
we investigate building a network within the "hay" to provide a trail for the search process. In 
this way, the discovery of the "needle" solutions may become easier. 
--------------------------------------------------------------------------------------------------
Since the network connects solutions with the same fitness (within the hays), it is called 
"neutral network." Moreover, an evolutionary algorithm utilizing such a network for search is 
said to support neutrality, a term borrowed from evolutionary biology.

The theory of natural evolution established by Darwin has had profound impact on biology. Most 
biologists are convinced that selection acting on advantageous mutations is the driving force of
evolution. It was not until the late 1970s when molecular data became available, that the theory
was challenged. In particular, Motoo Kimura found that the number of mutant substitutions in
amino acid sequences of hemoglobin was too large to be explained by the theory of natural
selection. Based on this discrepancy, he proposed the neutral theory, which states that 
--------------------------------------------------------------------------------------------------
most mutants at the molecular level in evolution are caused by random genetic drift rather than 
by natural selection [3]. In other words, the mutants involved are neither advantageous nor
disadvantageous to the survival or reproduction of the individual. ([3] Kimura, M.: The Neutral Theory of Molecular Evolution. Cambridge Univ. Press (1983)).
--------------------------------------------------------------------------------------------------
But can neutral mutations (those that are neither advantageous nor disadvantageous) benefit 
evolutionary search?
--------------------------------------------------------------------------------------------------
In particular, we measure the number of neutral mutations that occur in the evolved entities
during evolutionary search. In this way, the impact of neutrality on search performance can be
analyzed quantitatively. Using this approach, we have studied a Boolean function problem. The
results show that 
--------------------------------------------------------------------------------------------------
there is a positive relationship between neutral mutations and success rate: the larger the 
allowed neutral mutations quantity the greater is the possibility for the evolutionary search to 
find a solution.
--------------------------------------------------------------------------------------------------
To investigate these questions, we have devised a methodology for systematic study of this 
subject [12].

The amount of neutral mutations is measured in the selection step, which evaluates both the 
fitness and the number of neutral mutations in the evolved entities. More precisely, 
-----------------------------------------------------------------------------
an offspring solution is selected to replace the current winner only when it has a better fitness
or it has the same fitness but its neutral mutants are within a specified range (the Hamming 
bound). 
-----------------------------------------------------------------------------
One can envisage that all solutions that have the same fitness and satisfy the Hamming bound are connected 
in a network (neutral network). The search process selects solutions in the network one after 
another in the manner of a neutral walk. We found that such a walk can lead to a solution with 
a better fitness if it satisfies the fitness improvement criterion. 

The criterion is concerned with the ratio of adaptive and neutral mutations. The analysis 
indicated that when this ratio for the neutral walks was close to the ratio for the fitness
improvement, a high probability of success occurred.


=====
Claus Wilke and colleagues studied the evolution of digital organisms (as computer programs) 
using the Avida system [11]. They reported that
__________________________________________________________________________________________________
under high mutation rates, an organism that has its neighbors (those accessible by one mutation
step) with a similar fitness (not necessarily the same fitness) had a higher reproduction rate.
The reason is that such a flat fitness landscape is more robust against mutations than a fitness
landscape that has a high and narrow peak. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Although they didn't mention neutral networks (where the neighbors have the same fitness), one
would expect the same findings.


=====
Marc Ebner and colleagues also studied the relationship between neutral networks and 
evolvability [2]. Particularly, they investigated a search space with 2^16 possible fitness 
values. Moreover, these fitness values were divided into 64 groups. Their selection criterion was
 similar to ours in that they allowed an individual with better or equal fitness to replace 
the current winner. They experimented with 3 different sizes of neutral network 
(1, 2^112, 2^320) using a single point mutation. They reported that the larger the network
(more neutrality), the higher the average population fitness. ([2] Ebner, M., Langguth, P., Albert, J., Shackleton, M. and Shipman, R.: On neutral networks and evolvability. In: Proceedings of the 2001 Congress on Evolutionary Computation, IEEE Press (2001) 1-8.)

=====
Evolutionary Algorithm
1. Randomly generate an initial population of 5 genotypes with the lowest possible fitness and 
   select one (randomly) as the winner.
2. Carry out point-wise mutation on the winning parent to generate 4 offspring;
3. Construct a new generation with the winner and its offspring;
4. Select a winner from the current population using the following rules: 
       1) If any offspring has a better fitness, it becomes the winner.
          Otherwise, an offspring with the same fitness is randomly selected.
       2) If the parent-offspring pair has a Hamming distance within the permitted range the 
             offspring becomes the winner.
          Otherwise, the parent remains as the winner.
5. Go to step 2 unless the maximum number of generations is reached or a solution with needle 
      fitness is found.

6.2 Control Parameters
Eleven different mutation rates and 7 neutrality levels were used in the experiments. 
   Mutation Rate (%) on genotype 1,2,4,6,8,10,12,14,16,18,20
   Max Generation 10,000
   Neutrality Level (Hamming distance range) 0,50,100,150,200,250,300
   Population Size 5
   Number of Runs 100
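
[sketch] To get a feel for the algorithm and control parameters above, here is a minimal (1+4) loop with the Hamming-bound rule. It is not the CGP implementation of Yu and Miller: the bit-string genotype, the toy needle fitness, reading "within the permitted range" as "distance <= bound", and treating the mutation rate as a per-gene flip probability are all my assumptions.

```python
import random

def hamming(a, b):
    """Number of positions at which two genotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def mutate(genotype, rate, rng):
    """Point-wise mutation: flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genotype]

def evolve(fitness, needle, length, rate, hamming_bound, max_generations, rng):
    # Step 1: five random genotypes that are not already needles; one is the winner.
    population = []
    while len(population) < 5:
        g = [rng.randint(0, 1) for _ in range(length)]
        if fitness(g) < needle:
            population.append(g)
    winner = rng.choice(population)
    for generation in range(1, max_generations + 1):
        # Steps 2-3: the winner plus four mutated offspring form the new generation.
        offspring = [mutate(winner, rate, rng) for _ in range(4)]
        f_win = fitness(winner)
        better = [o for o in offspring if fitness(o) > f_win]
        if better:
            # Rule 1: a fitter offspring always replaces the winner.
            winner = max(better, key=fitness)
        else:
            equal = [o for o in offspring if fitness(o) == f_win]
            if equal:
                # Rule 2: an equally fit offspring replaces the winner only
                # if it lies within the Hamming bound (neutral move).
                candidate = rng.choice(equal)
                if hamming(winner, candidate) <= hamming_bound:
                    winner = candidate
        # Step 5: stop as soon as a needle is found.
        if fitness(winner) >= needle:
            return winner, generation, True
    return winner, max_generations, False

def toy_fitness(genotype):
    """Hypothetical needle-in-haystack fitness: 1 only if the first 6 bits are all 1."""
    return 1 if all(genotype[:6]) else 0

rng = random.Random(1)
winner, generations, found = evolve(toy_fitness, needle=1, length=20, rate=0.10,
                                    hamming_bound=20, max_generations=2000, rng=rng)
print(found, generations)
```

Setting hamming_bound=0 forbids neutral moves to a different genotype, so the same loop can be reused to compare runs with and without neutrality.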


==================================================================================================
We can consider whether neutral mutation makes search easier or 
harder by applying this algorithm to our problem and comparing it with our random search (Fig)
==================================================================================================


For even-5-parity, all implementations (mutation rates and Hamming distances) have a 100% 
success rate, i.e. all 100 runs find a solution (see Figure 2A). 

In contrast, even-8-parity (a harder problem) has lower success rates in some cases. In
particular, the combination of low Hamming distance and low mutation rate has produced some 
unsuccessful runs. When mutation rate is 1% or 2%, a small amount of neutrality (50) is able to
improve success rates. However, when mutation rate is 4%, a neutrality level equal to Hamming
distance 100 is required to improve the performance. For example, increasing mutation rate from
2% to 4% gives Hamming distance 0 a success rate jump from 82% to 94%. In comparison, increasing
neutrality level from 0 to 50 on mutation rate 4% does not improve success rate. This suggests
that raising mutation rate has a stronger impact than raising neutrality level on the 
evolutionary search. The relationship between neutrality and mutation rates will be discussed in more detail 
in Section 8.2.

Regardless of mutation rates, increasing neutrality level beyond 150 does not improve or impair 
the performance; all of them have 100% success rate. This suggests that equilibrium between the
benefits of exploitation and exploration is reached at this point. Indeed, the adaptive/neutral
mutation analysis shows that the exploitation/exploration ratios are very similar for all
Hamming distances beyond 150. Increasing neutrality or mutation rates does not affect this
equilibrium.

Even-10-parity is harder than even-8-parity: there are more unsuccessful runs, especially when
low Hamming distance values were used. Similar to even-8-parity, the success rate remains
approximately the same after a certain Hamming distance value is reached (200 is the equilibrium 
point). Moreover, mutation rates are more influential in this problem: neutrality level 150 is
required to give consistent improvement of success rates (in contrast to neutrality level 100 in
even-8-parity).

Even-12-parity is the hardest among all; none of the implementations has a 100% success rate. 
Although not as precise as other problems, even-12-parity reaches the equilibrium point around
neutrality level 200. We also made 100 experimental runs using Hamming distance 0 and 100% 
mutation rate. Among them, 48 runs find a solution (48% success rate). This suggests that high 
neutrality and mutation rate are not sufficient for the search algorithm to find a solution to
this problem. Modification of other parameters, such as gene length and the maximum number of
generations, is required to improve performance. The performance of the 7 different Hamming
distance implementations can be roughly divided into two groups: the first group consists of
neutrality level 0, 50, 100 while the second group consists of neutrality level 150, 200, 250 
and 300. In the first group, increasing mutation rates increases success rates while in the 
second group little performance improvement is gained after mutation rate exceeds 4%.
Nevertheless, the group with higher neutrality level also gives higher success rates.


Even-12-Parity:
The ratio pattern of even-12-parity has higher active gene changes than those in the other 
problems. They are also associated with lower success rates. This suggests that the algorithm
does not provide sufficient inactive gene change (exploration) for the search to find a solution.
Moreover, each time problem difficulty is increased more inactive gene changes were required
for the evolutionary search to be successful. 
----------------------------------------------------------------------------------------------
This suggests exploration is more important than exploitation for search in needle-in-haystack
type of space.
----------------------------------------------------------------------------------------------

<my>
---------------------------------------------------------------------------------------------------
Concluding Remarks
We observed impressive improvement by ...
We have compared the improvement with three other methods proposed so far.
None of them has revealed better results.

Moreover, when we applied this algorithm in the context of Network Intrusion Detection,
we faced a tougher problem: how can the system learn from self data alone to detect
non-self? Although we have an enormous amount of self data (hay), we have no information
about non-self (needles) until it is too late.
---------------------------------------------------------------------------------------------------


==================================================================================================
M. Collins
Finding Needles in Haystacks is Harder with Neutrality.


This research presents an analysis of the reported successes
of the Cartesian Genetic Programming method on a simplified form of 
the Boolean parity problem.

We present results indicating that the loss of performance
is caused by the sampling bias of the CGP, due to the
neutrality-friendly representation.

We implement a simple intron free random sampling algorithm 
which performs considerably better on the same problem.


We show that this algorithm considerably exceeds expectations,
unexpectedly performing better than random sampling. An
analysis of how simply removing introns from the representation 
can produce such a result is provided.


3. REPORTED PERFORMANCE
The CGP has performance reported for the 5, 8, 10 and
12 parity problems. We will concentrate on the 12 parity
problem, since a) this is the hardest of the reduced Boolean
parity problems for which CGP results are reported, b) Yu
and Miller reported the failure of random sampling on this
problem (see [7], page 5, table 1), and c) Yu and Miller
obtained an outstanding >55 % success rate from a hundred
runs, each of a mere 10,000 iterations. We use only reported
results for comparison in this investigation.

[7] T. Yu, J. Miller. Finding Needles in Haystacks Is Not
Hard with Neutrality. J. Foster et al. (Eds.) EuroGP 2002,
LNCS 2278, pp. 13-25.


Introns in CGP are sections of the genotype which are left out
of the interpretation due to simply not being included in the
chain of references from the chosen output gate.

Mutation of the introned code has no effect on the expression of the
genotype or the phenotype and is termed 'neutral'.


Since the CGP representation has introns as an inevitable
part of its representation, 
--------------------------------------------------------------------
it has a bias towards representing smaller solutions more frequently 
than a representation in which all the code was expressed: 
--------------------------------------------------------------------
There is no one-to-one mapping between the formulas which are represented by CGP
solutions and the space of possible formulas. The question
is, does this help or hinder? As a benchmark for researching
neutrality the reduced Boolean parity problem is actually a
strange choice: due to the fitness function, all movement except that 
which finds a solution is score neutral. Differences
in performance between the CGP and other tactics on the
same landscape must then be a function of the effect of neutrality 
on the sampling strategy, and not due to permitting
score neutral movement.

For Boolean formulas using between 1 and 100 operations,
the solution density of the reduced Boolean 12 parity problem
space is 0.0019531 percent for two operator types, and
0.003756 percent for one operator type. Since the reduced
Boolean search space is devoid of fitness gradient clues, and
consequently search is unguided, this represents the best
expected performance for an algorithm which samples the
space without bias.


In [7] Yu and Miller report CGP has a peak success rate of over 55% 
on the reduced Boolean 12 parity problem. The
average success rate is approximately 45% for the optimal
combination of parameters. These results are achieved using
the CGP algorithm over 10,000 iterations, which is a sample
of at most 40,000 points.
A search using one operator type (EQ), sampling evenly
from the search space of possible formulas until finding a solution 
and performing 40,000 samples, has an expected success rate of
 1 - (1 - 0.00003756)^40000 = 0.778
This contrasts with the empirically obtained expectation of just over 0.55
which was reported for the CGP. The difference in performance is then 
a consequence of the mapping imposed by the CGP implicitly neutral 
representation and the CGP search tactics. 
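
[sketch] The expected success rate of unbiased random sampling quoted above can be re-derived from the solution density (0.003756 percent for one operator type) and the 40,000 samples. The function name is my own; the calculation comes out at roughly 0.78, in line with the quoted figure.

```python
def expected_success(solution_density, samples):
    """Probability that at least one of `samples` independent uniform draws is a solution."""
    return 1.0 - (1.0 - solution_density) ** samples

density_eq_only = 0.003756 / 100.0   # 0.003756 percent, expressed as a fraction
p_random = expected_success(density_eq_only, 40_000)
print(p_random)
```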


==================================================================================================
[1] M. Collins (2004) 
"Counting Solutions in Reduced Boolean Parity" 
GECCO 2004.

-> empirically proved the accuracy of a method for counting the number of solutions 
to the reduced Boolean parity problem in the space of all possible formulas.


Using the Cartesian GP, Yu and Miller have shown limited success in solving
a simplified form of the boolean parity problem [4]. 
The form of the boolean parity problem examined uses only the Boolean eq and xor operators, 
and shall be referred to as the reduced Boolean parity problem.

[4] T. Yu, J. Miller. Finding Needles in Haystacks Is Not Hard with Neutrality. J. Foster
et al. (Eds.) EuroGP 2002, LNCS 2278, pp. 13-25.


If the parity problem is even and the maximum number
of functions is even, then all the arrangements which use the maximum number
of functions (by far the majority of the space) are non-solutions. The same effect occurs 
when the problem to be solved is odd parity and the maximum number of functions is odd.

Conclusion
This paper presents efficient methods for counting the solutions to the reduced
Boolean parity problem, and provides example results for some common
parameters. The structure of the solution space; distinct solutions permuted by
various intron assignments, indicates the space is regularly populated with
elementary solutions which have functionally identical alternative representations
at distances governed by the possible permutations of the intron groups. This
suggests that the space may be better explored by moving between equivalence
classes; a promising topic for future work.


==================================================================================================
again
An Immuno-Fuzzy Approach to Anomaly Detection
Jonatan Gomez Fabio Gonzalez Dipankar Dasgupta


Abstract
This paper presents a new technique for generating
a set of fuzzy rules that can characterize the non-self space (abnormal)
---------------------------------
using only self (normal) samples. 
---------------------------------
Because fuzzy logic can provide a better definition of the boundary between normal and
abnormal, it can increase the accuracy in solving the anomaly detection
problem. 

Experiments with synthetic and real data sets are
performed in order to show the applicability of the proposed approach
and to compare with other works reported in the literature.


2) Results and Analysis: The performance reached by the
PHC and EFR algorithms is almost the same, and both are better
than the performance reached by ERD, see Figure 8. Table III
compares the performance of the tested algorithms and some
results reported in the literature. The FA-DR reported in Table
III is the closest value to the optimal point (0,1). Amazingly, the
number of detectors using fuzzification is very small compared
to the number of detectors using the crisp characterization. It
can be due to the high dimensionality of the data set (33 attributes).
TABLE III
COMPARATIVE PERFORMANCE IN THE KDD CUP 99 PROBLEM
Algorithm        DR%     FA%    # Detectors
EFR              98.22   1.9    14
PHC              99.17   3.9    32
ERD              96.02   1.9    699
EFRID[25]        98.95   7.0    -
RIPPER-AA[26]    94.26   2.02   -


about kdd-dataset 99
==================================================================================================
Shrijit S. Joshi and Vir V. Phoha
Investigating Hidden Markov Models Capabilities in Anomaly Detection
---
For our experiment, we have used the KDD Cup 1999 intrusion detection data set 
prepared by Lee et al. [9]. The data set contains 41 features representing 
selected measurements of <<normal and intrusive TCP sessions>>.
Each labeled TCP session is either normal or 
<<a member of one of the 22 attack classes in the dataset>>.

[9] Lee, W. and Stolfo, S. J., (2000)
A Framework for Constructing features and models for Intrusion Detection Systems.
In ACM Transactions on Information and System Security, pp 227 -261.


==========
K-means+: An Autonomous Clustering Algorithm
Yu Guan, Ali A. Ghorbani, Nabil Belacel

The KDD-99 dataset was used for The Third International Knowledge Discovery
and Data Mining Tools Competition, which was held in conjunction with KDD-99,
the fifth International Conference on Knowledge Discovery and Data Mining [12].

The competition task was to build a network intrusion detector. This database was
acquired from the 1998 DARPA intrusion detection evaluation program. 

An environment was set up to acquire raw TCP/IP dump data for a local-area network (LAN) 
simulating a typical U.S. Air Force LAN, which was operated as if it was a
true environment, but blasted with multiple attacks. 

In total, 4,898,431 connections were recorded, of which 3,925,650 are attacks. 
For each TCP/IP connection, 41 various quantitative and qualitative features were 
extracted [12].
-----
[12] KDD Cup 1999 Data, University of California, Irvine, October, 1999,
http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html, July, 2002.
-----
There are 42 features in total for each datum. The first three qualitative features are
protocol_type, service and flag. Currently, only three protocol types (tcp, udp or icmp)
are used. In KDD-99 data, there are 70 different services (such as, http or smtp) 
and 11 flags (such as, SF or S2). 

We map these three qualitative features into quantitative features so as to calculate 
the similarities of instances. There are also some other qualitative features, such as 
  - root_shell (1 if root shell is obtained; 0 otherwise),
  - logged_in (1 if successfully logged in; 0 otherwise),
  - land (1 if connection is from/to the same host/port; 0 otherwise). 
They are also used as quantitative features here because they are in the form of an integer. 
The rest of the features except the last one are positive quantitative features, such as 
  - src_bytes (number of data bytes from source to destination),
  - urgent (number of urgent packets) and
  - serror_rate (percentage of connections that have SYN errors). 
They can be used directly to calculate the similarity of instances. 
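
[sketch] One simple way to map the qualitative features (protocol_type, service, flag) to numbers is an integer code per distinct value, as below. The paper does not say which mapping was used, so the first-appearance coding, the function names and the three sample records are only my assumptions.

```python
def build_codebook(values):
    """Assign integer codes in order of first appearance (an arbitrary choice)."""
    codebook = {}
    for v in values:
        codebook.setdefault(v, len(codebook))
    return codebook

def encode(records, qualitative_columns):
    """Replace qualitative columns by integer codes; keep the rest as floats."""
    codebooks = {c: build_codebook(r[c] for r in records) for c in qualitative_columns}
    encoded = [[float(codebooks[c][v]) if c in codebooks else float(v)
                for c, v in enumerate(r)]
               for r in records]
    return encoded, codebooks

# Hypothetical records: duration, protocol_type, service, flag, src_bytes.
records = [[0, "tcp", "http", "SF", 181],
           [0, "udp", "domain_u", "SF", 105],
           [2, "tcp", "smtp", "S2", 1684]]
encoded, codebooks = encode(records, qualitative_columns=[1, 2, 3])
print(encoded)
```

Integer codes impose an artificial ordering on the categories, which matters when the codes feed a distance-based similarity; one-hot coding would be an alternative.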
The last feature is the label, which indicates the identification of the instance. 
If the instance is a normal instance, the label is normal; otherwise it is 
<<a string of an attack type.>>

From the KDD-99 dataset, which contains 4 898 431 labeled data, we randomly
select 101 000 data for training. Among them, 100 000 are normal and 1 000 are
intrusive. From the KDD-99 dataset, we also randomly select 200 000 normal data
and 200 000 intrusive data for test. If a datum has been selected for training, it will
not be selected for test.
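
[sketch] The rule "a datum selected for training is never selected for test" can be enforced by drawing both sets in one sampling call per class. The counts below are scaled down by 1000 from the ones quoted above, and the index ranges and names are my assumptions.

```python
import random

def sample_disjoint(indices, n_train, n_test, rng):
    """Draw training and test indices from one sample so that they cannot overlap."""
    picked = rng.sample(indices, n_train + n_test)
    return picked[:n_train], picked[n_train:]

rng = random.Random(0)
normal_ids = range(0, 1000)        # hypothetical indices of normal data
intrusive_ids = range(1000, 1400)  # hypothetical indices of intrusive data

train_normal, test_normal = sample_disjoint(normal_ids, 100, 200, rng)
train_intr, test_intr = sample_disjoint(intrusive_ids, 1, 200, rng)
train_ids = train_normal + train_intr
test_ids = test_normal + test_intr
print(len(train_ids), len(test_ids))
```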


also with iris
---
3.2 Tests with the Iris data
The Iris data, which is created by R.A. Fisher, is a well-known dataset for
classification. It has been used for testing many classification methods. 
This dataset contains 3 classes: Setosa, Versicolor, and Virginica [19]. 
Each class has 50 instances and refers to a type of Iris flower. 

Figure 3 illustrates the Iris data distribution in 3-dimensional space. 

There are 4C3 = 4 combinations in total of three attributes of the Iris data. 
The graphs show that the Setosa class can be linearly separated from the Versicolor 
and Virginica classes, and the latter two classes are overlapping so that they are 
not linearly separable.

Two-fold cross-validation is used for evaluating the classification methods. 
The Iris data are divided into two halves. One half is used as training data 
while the other half is used for testing. Each of the two data sets has 25 Setosa data, 
25 Versicolor data and 25 Virginica data. After the training data are partitioned
into clusters, each cluster is identified according to the majority of the data inside. 
For example, if the Setosa data in a cluster has the largest population, 
the cluster is identified as Setosa. During the test, each datum was assigned to its closest 
centroid, and identified with the same label as the closest cluster.
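
[sketch] The labelling and test steps described above (majority label per cluster, then nearest-centroid assignment) can be written down as follows. The clustering itself (K-means+) is not reproduced here: the centroids are simply given, and they, the two-attribute points and the function names are my assumptions for illustration.

```python
from collections import Counter

def nearest(centroids, point):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(centroids[k], point)))

def label_clusters(centroids, train_points, train_labels):
    """Give every cluster the majority label of the training data inside it."""
    votes = [Counter() for _ in centroids]
    for p, y in zip(train_points, train_labels):
        votes[nearest(centroids, p)][y] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]

def classify(centroids, cluster_labels, point):
    """Assign a test datum the label of its closest cluster."""
    return cluster_labels[nearest(centroids, point)]

# Made-up centroids and training data (petal length, petal width).
centroids = [(1.5, 0.3), (4.3, 1.3), (5.6, 2.0)]
train_points = [(1.4, 0.2), (1.6, 0.4), (4.5, 1.5), (4.1, 1.3), (5.8, 2.2), (5.1, 1.9)]
train_labels = ["Setosa", "Setosa", "Versicolor", "Versicolor", "Virginica", "Virginica"]

cluster_labels = label_clusters(centroids, train_points, train_labels)
print(cluster_labels, classify(centroids, cluster_labels, (4.9, 1.6)))
```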


=====
Gopi K. Kuchimanchi, Vir V. Phoha, Kiran S. Balagani, Shekhar R. Gaddam (2004)
Dimension Reduction Using Feature Extraction Methods for Real-time Misuse Detection Systems
Proceedings of the IEEE Workshop on Information Assurance and Security

We present two neural network methods for feature extraction:
(1) NNPCA and (2) NLCA for reducing the 41-dimensional KDD Cup 1999 data. 

Intrusion detection Systems (IDS) [1] have become popular tools for identifying anomalous 
and malicious activities in computer systems and networks. There are two types of IDS: 
  (1) Misuse Detection Systems [2] that detect abnormal patterns in system usage 
      by comparing them with known signature patterns and 
  (2) Anomaly Detection Systems [3] that construct profiles of normal behavior and 
      flag all deviations from estimated profiles as intrusions. 
Recently, a new class of IDS employing data mining techniques, called
Intrusion Detection using Data Mining (IDDM) [4], has gained popularity because of 
its ability to automatically extract attack signatures, detect unseen anomalies,
maintain high detection accuracies with low false alarm rates, and scale on large 
distributed datasets. A considerable subset of IDDMs [5], [6], [7] perceive misuse
intrusion detection as a data partitioning problem in which data samples are
classified as an attack or a non-attack. 


For our experiments we use the KDD Cup 1999 intrusion detection dataset prepared by 
Lee et al. [5]. The dataset contains 41 features representing selected measurements of 
normal and intrusive TCP sessions. Each labeled TCP session is either normal or 
a member of one of the 22 attack classes in the dataset.


[5] W. Lee and S. J. Stolfo (2000)
"A Framework for Constructing Features and Models for Intrusion Detection Systems," 
ACM Transactions on Information and System Security, vol. 3, pp. 227-261.


The first task of feature selection was undertaken by Lee et al. [5] and the datasets are 
available in [19]. 

[19] http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html (UCI KDD Archive).

In this paper we perform the second task of feature extraction by using the 10% subset 
of KDD Cup 1999 dataset for identifying key features that contribute to classifier based 
misuse detection. The dataset contains approximately 500,000 sessions. Each session contains 41
selected measurements of TCP connections and is labeled either as an attack or a non-attack.


IV. Results and Discussion
We present the performance of the Non-linear classifier (NC) and the decision tree 
classifier (DC) in terms of detection accuracy and false positive rate. 
The detection accuracy of an IDS is the percentage of attack samples detected
as attacks. The false positive rate of an IDS is the percentage of normal samples 
detected as attacks.
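
[sketch] The two measures defined above translate directly into code. The label strings "attack" and "normal", the function name and the ten example sessions are my assumptions.

```python
def detection_and_false_positive_rates(actual, predicted):
    """Return (detection accuracy %, false positive rate %) as defined above."""
    detected = sum(a == "attack" and p == "attack" for a, p in zip(actual, predicted))
    false_alarms = sum(a == "normal" and p == "attack" for a, p in zip(actual, predicted))
    n_attack = actual.count("attack")
    n_normal = actual.count("normal")
    return 100.0 * detected / n_attack, 100.0 * false_alarms / n_normal

# Made-up labels for ten sessions: 4 attacks followed by 6 normal sessions.
actual = ["attack"] * 4 + ["normal"] * 6
predicted = ["attack", "attack", "attack", "normal",
             "attack", "normal", "normal", "normal", "normal", "normal"]
dr, fpr = detection_and_false_positive_rates(actual, predicted)
print(dr, fpr)
```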


The first 20 features and corresponding eigenvalues and
standard deviations (S.D). The first 19 features are selected
based on results of Scree test and Critical Eigenvalue test.


Feature                        Eigenvalue   S.D.
src_bytes                      51926        9.74
dst_bytes                      29423        4.75
duration                       773.65       3.65
is_guest_login                 247.93       2.76
is_host_login                  217.9        1.87
srv_diff_host_rate             106.64       1.60
diff_srv_rate                  69.196       1.27
service                        1.2974       1.14
flag                           1.1499       1.11
protocol_type                  0.9664       1.04
num_root                       0.7784       1.02
hot                            0.7769       1.00
num_compromised                0.6720       0.96
dst_host_same_srv_rate         0.4849       0.95
dst_host_count                 0.4128       0.88
rerror_rate                    0.3886       0.85
srv_count                      0.3816       0.82
dst_host_srv_diff_host_rate    0.3815       0.76
count                          0.3813       0.74
dst_host_same_src_port_rate    0.3812       0.45

===================================================================================================

__________________________________________________________________________________
R. Shipman, M. Shackleton and I. Harvey
"The use of neutral genotype-phenotype mappings for improved evolutionary search."
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
=> OK. Quoted in Bibliography

__________________________________________________________________________________
Starting from a given genotype, a single mutation neighbour was randomly
selected that mapped into the same phenotype, i.e. a neutral mutation was made. 
Each single mutation neighbour of the new genotype was then assessed to determine 
whether any new phenotypes were discovered. This process was repeated for 
a fixed number of steps and the cumulative number of different phenotypes recorded. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
=> OK. Quoted in Sub-section "Experiment-3"
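
[sketch] The neutral walk quoted above can be imitated on a toy many-to-one mapping. This is not the mapping of Shipman et al.: the majority-vote genotype-phenotype map over 3-bit blocks, the choice to count only phenotypes different from the starting one, and all names are my assumptions.

```python
import random

BLOCK = 3  # each phenotype bit is the majority vote of a 3-bit block

def phenotype(genotype):
    """Hypothetical redundant mapping: one phenotype bit per block of BLOCK genes."""
    return tuple(int(sum(genotype[i:i + BLOCK]) * 2 > BLOCK)
                 for i in range(0, len(genotype), BLOCK))

def neighbours(genotype):
    """All genotypes reachable by flipping exactly one bit."""
    for i in range(len(genotype)):
        n = list(genotype)
        n[i] = 1 - n[i]
        yield n

def neutral_walk(start, steps, rng):
    current = list(start)
    home = phenotype(current)
    seen = set()     # phenotypes other than `home` discovered so far
    history = []     # cumulative number of different phenotypes after each step
    for _ in range(steps):
        # Move to a random single-mutation neighbour with the same phenotype.
        neutral = [n for n in neighbours(current) if phenotype(n) == home]
        current = rng.choice(neutral)
        # Assess every single-mutation neighbour of the new genotype.
        seen.update(phenotype(n) for n in neighbours(current))
        seen.discard(home)
        history.append(len(seen))
    return current, history

start = [0] * 12
final, history = neutral_walk(start, steps=50, rng=random.Random(0))
print(history[-1])
```

With majority voting every genotype has at least one neutral neighbour, so the walk never gets stuck; a mapping without that property would need an explicit check for an empty neighbour list.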


This resulted in a genotype of length 16 x 21 = 336 bits.
A total of 2^336 genotypes map onto 2^16 possible phenotypes,
and there is therefore a very large degree of neutrality
in this mapping.

The total number of phenotypes discovered by the end of the walk 
was over 4600 on average.


===================================================================================================

[my-icnnai]
  - where q is a random number uniformly distributed in [0, ..., 7]
  - KDDcup99 best performance?

so that the number of 0 through 7 is uniformly distributed
---
Section Artificial data for IDS
---
This invites a tradeoff in  using synthetic vs. real data.

[Quote?]
BUT most economists aren't prepared to call the globally flattening yield curves - which resemble bunny hills more than ski slopes - a harbinger of slower growth.
e
ng => They (How about "We" to quote?) need a big piece of good luck. I don't know what it is.
Those who possess big dreams are stronger than those who know the realities."


--------------------------------------------------------------------------------------------------
quotation
An ant climbs a blade of grass, over and over, seemingly without purpose, seeking neither nourishment nor home. It persists in its futile climb, explains Daniel C. Dennett at the opening of his new book, "Breaking the Spell: Religion as a Natural Phenomenon" (Viking), because its brain has been taken over by a parasite, a lancet fluke, which, over the course of evolution, has found this to be a particularly efficient way to get into the stomach of a grazing sheep or cow where it can flourish and reproduce. The ant is controlled by the worm, which, equally unconscious of purpose, maneuvers the ant into place.
--------------------------------------------------------------------------------------------------
=> Done


KDD cup 1999
the closest two points, one of which is legal and the other an anomaly


--------------------------------------------------------------------------------------------------
Subsection => When an iris flower is normal then are others abnormal?
\noindent
{\bf $\Box$ A visualization of IRIS data by Sammon Mapping}\\
\begin{figure}[h]
 \begin{center}
   \epsfile{file=final-2d.eps,width=18cm,height=8cm}
 \end{center}
\end{figure}
===================================================================================================
=> DONE


--------------------------------------------------------------------------------------------------
[6] G. Castellano and A. M. Fanelli (2000)
Fuzzy Inference and Rule Extraction using a Neural Network.
Neural Network World Journal Vol.3, pp.361--371.

Castellano et al. [6] clearly described their data-set as: The validity of our approach
to fuzzy inference and rule extraction has been tested on the well-known benchmark
Iris data problem. The classification problem of the Iris data consists of classifying three
species of iris flowers (setosa, versicolor and virginica).
There are 150 samples for this problem, 50 of each class. A sample is a four-dimensional
pattern vector representing four attributes of the iris flower (sepal length, sepal width,
petal length, and petal width).

Then what about KDD cup 99 data
--------------------------------------------------------------------------------------------------
=> DONE


===========================================================================================
An Immuno-Fuzzy Approach to Anomaly Detection
Jonatan Gomez Fabio Gonzalez Dipankar Dasgupta

http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

B. KDD Cup 99
This data set is a version of the 1998 DARPA intrusion detection evaluation data set prepared
and managed by MIT Lincoln Labs [24]. Experiments were conducted on the ten percent that is
available at the University of Irvine Machine Learning repository 1. Forty-two attributes,
that usually characterize network traffic behavior, compose each record of the 10% data set
(twenty-two of them numerical). Also, the number of records in the 10% is huge (492021).

1) Experimental settings: We generated a reduced version of the 10% data set including only 
the numerical attributes, i.e., the categorical attributes were removed from the data set.
Therefore, the reduced 10% data set is composed by thirty-three attributes. 

The attributes were normalized between 0 and 1 using the maximum and minimum values found. 
An 80% of the normal samples were picked randomly and used as training data set, 
while the remaining 20% was used along with the abnormal samples as a testing set. 
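
The preprocessing described above can be sketched as follows. The function names, the handling of constant columns (mapped to 0.0) and the fixed seed are my assumptions, not taken from the paper.

```python
import random


def min_max_normalize(rows):
    """Scale every column to [0, 1] using the column minimum and maximum."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # Assumption: a constant column (max == min) is mapped to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def split_normal(normal_rows, abnormal_rows, train_fraction=0.8, seed=0):
    """Train on a random 80% of normal rows; test on the rest plus all abnormal rows."""
    rng = random.Random(seed)
    shuffled = list(normal_rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:] + list(abnormal_rows)


scaled = min_max_normalize([[0, 10], [5, 20], [10, 30]])
train, test = split_normal([[i] for i in range(10)], [[100], [200], [300]])
print(scaled)
print(len(train), len(test))
```

Note that the training set contains normal samples only, so the detector never sees an abnormal record before testing.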

Five fuzzy sets were defined for the 33

[24] M. Labs, "Darpa intrusion detection evaluation."
http://www.ll.mit.edu/IST/ideval/index.html, 1999. => No. this is for DARPA data
=> http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html


===========================================================================
1) random parachuter
2) random walk after fall --- what Hinton and Nowlan called lifetime learning
3) one to many mapping from genotype to phenotype
4) neutral mutation


R Shipman, M Shackleton and I Harvey
"The use of neutral genotype-phenotype mappings for improved evolutionary search."
=> when and where?

-------------------------------------------------------------------------------------------
"Mappings can be constructed that introduce the possibility of a number of different
genotypes producing the same phenotype... Thus it is possible to change the genotype, via
mutation for example, without affecting the phenotype.
Such mutations are called neutral mutations. It has been
theorised [1] that in natural systems a considerable fraction
of all mutations are neutral with only a minute fraction of
non-neutral mutations being beneficial."

[1] Kimura M (1994)
"Population genetics, molecular evolution, and the neutral
theory: selected papers"
The University of Chicago Press, Chicago
-------------------------------------------------------------------------------------------


It is studied by applying it to
(1) a random Boolean network 
(2) telecommunications networks.


But why not a simpler example, if it is to work universally?

---------------------------------------
1024 walkers 500 steps of each of them
---------------------------------------


-------------------------------------------------------------------------------------------
It is very unlikely that the needle in this
haystack would be discovered. 

The encoding of the scaling parameters introduced a significant probability of 
neutral mutation. However, the bias shifted the balance between neutral mutations
and non-neutral mutations too much in favour of the former. Neutral ridges had been 
formed but the accessibility between them
had been reduced -- too much hay, not enough needles.


She applied this only to a very specific problem, that is 
... So, this is a kind of needle in a haystack.
She asserts neutral mutation outperformed random search.
To be more specific...


icnnai-addition
my-icnnai
cite 2 as to kdd99 => neurofuzzy andcpa


==============
eng for icnnai
These remarks are clearly music to French,
---
Kretschmer underlined that, especially in the agriculture sector, it was time to ...
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Foreign Ministry spokesman Namik underlined on Wednesday that the Turkish government expected ...
                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
probably for ACS
from Turkish Daily News on 
(HEAD-0329: BUSINESS REPORT ... ENIS SENERDEM)

Some security systems become outdated a day after they go on the market. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

One effective way to stop hackers invading a network is to invade "their minds" and train people who would "think like a thief to catch a thief." 

The ethical hacker is an individual who is employed or contracted to undertake an attempted penetration test. These individuals use the same methods employed by hackers. The goal of the ethical hacker is to help the organization take preemptive measures against malicious attacks by attacking the system himself/herself, all the while staying within legal limits.  => ACS we can create a counter intrusion dataset that yields a zero detection rate and 100% false alarms if we want.
(if at all.)

indeed the article went on to write

"A second CEH training program is scheduled to start on April 8 by the same organizer, Bilginc IT Academy. Cuneyt Aktan, a trainer in the program, told Referans that hackers are now working very professionally, "Those highly qualified hackers who provide security services to companies during the daytime and then go home at night to conduct totally illegal hacking are the ones who are the most dangerous.""



probably for ACS
from TDN
Some security systems become outdated a day after they go on the market. 

One effective way to stop hackers invading a network is to invade "their minds" and train people who would "think like a thief to catch a thief." 

The ethical hacker is an individual who is employed or contracted to undertake an attempted penetration test. These individuals use the same methods employed by hackers. The goal of the ethical hacker is to help the organization take preemptive measures against malicious attacks by attacking the system himself/herself, all the while staying within legal limits. 

A training program organized by the U.S. Department of Commerce International Electronic Trade Council initialized the formal training of Certified Ethical Hackers. About 30,000 individuals trained under this program undertook responsibility for the security of corporations' Web sites. Those who have completed the program were handed certificates that legally authorized them as Certified Ethical Hackers. 

A second CEH training program is scheduled to start on April 8 by the same organizer, Bilginc IT Academy. Cuneyt Aktan, a trainer in the program, told Referans that hackers are now working very professionally, "Those highly qualified hackers who provide security services to companies during the daytime and then go home at night to conduct totally illegal hacking are the ones who are the most dangerous."


automatic review
perfect


Tina Yu and J. Miller (2002)
"Finding Needles in Haystacks is Not Hard with Neutrality."
EURO GP

==========
there is one kind of search space, needle-in-haystack, which is difficult for heuristic search 
algorithms to outperform random search.

In a needle-in-haystack type of search space, a solution is either a needle or a piece
of hay. In other words, a search algorithm either finds a perfect solution (the needle) or 
otherwise (the hay). 
--------------------------------------------------------------------------------------
No knowledge about the location of the needles can be obtained from examining the hays. 
--------------------------------------------------------------------------------------
In this kind of situation, a heuristic search algorithm works like a random search algorithm.
When the number of solutions in the search space is small, finding a good solution is difficult,
no matter what search algorithm one uses.

What if the search space only has two possible fitness values (one for the needles and the other 
for the hays)? Evolutionary algorithms seem to become helpless in this kind of situation. In this
 study, 
--------------------------------------------------------------------------------------------------
we investigate building a network within the "hay" to provide a trail for the search process. In 
this way, the discovery of the "needle" solutions may become easier. 
--------------------------------------------------------------------------------------------------
Since the network connects solutions with the same fitness (within the hays), it is called 
"neutral network." Moreover, an evolutionary algorithm utilizing such a network for search is 
said to support neutrality, a term borrowed from evolutionary biology.

The theory of natural evolution established by Darwin has had profound impact on biology. Most 
biologists are convinced that selection acting on advantageous mutations is the driving force of
evolution. It was not until the late 1970s when molecular data became available, that the theory
was challenged. In particular, Motoo Kimura found that the number of mutant substitutions in
amino acid sequences of hemoglobin was too large to be explained by the theory of natural
selection. Based on this discrepancy, he proposed the neutral theory, which states that 
--------------------------------------------------------------------------------------------------
most mutants at the molecular level in evolution are caused by random genetic drift rather than 
by natural selection [3]. In other words, the mutants involved are neither advantageous nor
disadvantageous to the survival or reproduction of the individual. ([3] Kimura, M.: The Neutral Theory of Molecular Evolution. Cambridge Univ. Press (1983)).
--------------------------------------------------------------------------------------------------
But can neutral mutations (those are neither advantageous nor disadvantageous) benefit 
evolutionary search?
--------------------------------------------------------------------------------------------------
In particular, we measure the number of neutral mutations that occur in the evolved entities
during evolutionary search. In this way, the impact of neutrality on search performance can be
analyzed quantitatively. Using this approach, we have studied a Boolean function problem. The
results show that 
--------------------------------------------------------------------------------------------------
there is a positive relationship between neutral mutations and success rate: the larger the 
allowed neutral mutations quantity the greater is the possibility for the evolutionary search to 
find a solution.
--------------------------------------------------------------------------------------------------
To investigate these questions, we have devised a methodology for systematic study of this 
subject [12].

The amount of neutral mutations is measured in the selection step, which evaluates both the 
fitness and the number of neutral mutations in the evolved entities. More precisely, 
-----------------------------------------------------------------------------
an offspring solution is selected to replace the current winner only when it has a better fitness
or it has the same fitness but its neutral mutants are within a specified range (the Hamming 
bound). 
-----------------------------------------------------------------------------
One can envisage all solutions with the same fitness and satisfy the Hamming bound are connected 
in a network (neutral network). The search process selects solutions in the network one after 
another in the manner of a neutral walk. We found that such a walk can lead to a solution with 
a better fitness if it satisfies the fitness improvement criterion. 

The criterion is concerned with the ratio of adaptive and neutral mutations. The analysis 
indicated that when this ratio for the neutral walks was close to the ratio for the fitness
improvement, a high probability of success occurred.

=====
Claus Wilke and colleagues studied the evolution of digital organisms (as computer programs) 
using the Avida system [11]. They reported that
__________________________________________________________________________________________________
under high mutation rates, an organism that has its neighbors (those accessible by one mutation
step) with a similar fitness (not necessary the same fitness) had a higher reproduction rate.
The reason is that such flat fitness landscape is more robust against mutations than a fitness
landscape that has high and narrow peak. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Although they didn't mention neutral networks (where the neighbors have the same fitness), one
would expect the same findings.

=====
Marc Ebner and colleagues also studied the relationship between neutral networks and 
evolvability [2]. Particularly, they investigated a search space with 2^16 possible fitness 
values. Moreover, these fitness values were divided into 64 groups. Their selection criterion was
 similar to ours in that they allowed an individual with better or equal fitness to replace 
the current winner. They experimented with 3 different sizes of neutral network 
(1, 2^112 , 2^320 ) using a single point mutation. They reported that the larger the network
(more neutrality), the higher the average population fitness. ([2] Ebner, M., Langguth, P., Albert, J., Shackleton, M. and Shipman, R.: On neutral networks and evolvability. In: Proceedings of the 2001 Congress on Evolutionary Computation, IEEE Press (2001) 1-8.)


=====
6.1 Evolutionary Algorithm
1. Randomly generate an initial population of 5 genotypes with the lowest possible fitness and 
   select one (randomly) as the winner.
2. Carry out point-wise mutation on the winning parent to generate 4 offspring;
3. Construct a new generation with the winner and its offspring;
4. Select a winner from the current population using the following rules: 
       1) If any offspring has a better fitness, it becomes the winner.
          Otherwise, an offspring with the same fitness is randomly selected.
       2) If the parent-offspring pair has a Hamming distance within the permitted range the 
             offspring becomes the winner.
          Otherwise, the parent remains as the winner.
5. Go to step 2 unless the maximum number of generations reached or a solution with needle 
      fitness is found.
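
Steps 1-5 above can be sketched as below. This is a minimal illustration under my own assumptions, not the authors' implementation: genotypes are plain bit lists, the Hamming distance is taken over the whole genotype, a single random genotype replaces the initial population of 5, and the names `evolve`, `mutate` and `needle_in_haystack` are hypothetical.

```python
import random


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def mutate(parent, rate, rng):
    """Point-wise mutation: flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in parent]


def evolve(fitness, length, rate, hamming_bound, max_gen, needle, rng):
    """Return (winner, generations used) for a 1+4 strategy with a Hamming bound."""
    # Step 1 (simplified): one random genotype instead of picking from 5.
    winner = [rng.randint(0, 1) for _ in range(length)]
    for gen in range(max_gen):
        # Step 5: stop as soon as the winner has needle fitness.
        if fitness(winner) == needle:
            return winner, gen
        # Steps 2-3: four mutated offspring of the current winner.
        offspring = [mutate(winner, rate, rng) for _ in range(4)]
        best = max(offspring, key=fitness)
        if fitness(best) > fitness(winner):
            # Rule 1: a strictly better offspring always wins.
            winner = best
            continue
        # Rule 1 (otherwise): pick a random offspring of equal fitness ...
        neutral = [c for c in offspring if fitness(c) == fitness(winner)]
        if neutral:
            candidate = rng.choice(neutral)
            # Rule 2: ... and accept it only within the Hamming bound.
            if hamming(candidate, winner) <= hamming_bound:
                winner = candidate
    return winner, max_gen


def needle_in_haystack(bits):
    """Toy two-valued fitness: 1 for the all-ones string, 0 for everything else."""
    return 1 if all(bits) else 0


winner, gens = evolve(needle_in_haystack, length=10, rate=0.1,
                      hamming_bound=10, max_gen=10_000, needle=1,
                      rng=random.Random(0))
print(needle_in_haystack(winner), gens)
```

With `hamming_bound=0` a neutral offspring is accepted only if it is identical to the parent, so the winner cannot drift through the hay; larger bounds allow the neutral walk. The caller should check the fitness of the returned winner, since the loop may also end by reaching `max_gen`.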

6.2 Control Parameters
Eleven different mutation rates and 7 neutrality levels were used in the experiments. 
   Mutation Rate (%) on genotype 1,2,4,6,8,10,12,14,16,18,20
   Max Generation 10,000
   Neutrality Level (Hamming distance range) 0,50,100,150,200,250,300
   Population Size 5
   Number of Runs 100
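
The size of this experimental grid can be checked with a short sketch; the variable names are mine, the values are the ones listed above.

```python
from itertools import product

mutation_rates_percent = [1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
neutrality_levels = [0, 50, 100, 150, 200, 250, 300]
runs_per_setting = 100
max_generations = 10_000

# Every (mutation rate, neutrality level) pair is one setting.
settings = list(product(mutation_rates_percent, neutrality_levels))
total_runs = len(settings) * runs_per_setting

print(len(settings), total_runs)  # 77 7700
```

So each parity problem is covered by 77 settings and 7,700 runs, each run limited to 10,000 generations.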


==================================================================================================
We can examine whether neutral mutation makes search easier or harder by applying this
algorithm to our problem and comparing it with our random search (Fig)
==================================================================================================


For even-5-parity, all implementations (mutation rates and Hamming distances) have a 100% 
successful rate, i.e. all 100 runs find a solution (see Figure 2A). 
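
To make the even-n-parity target concrete, here is a small sketch of a two-valued (needle/hay) fitness for it. The scoring is my assumption: a candidate gets 1 only if it is correct on all 2^n input rows, and 0 otherwise. `almost_parity` is a hypothetical candidate used only for illustration.

```python
from itertools import product


def even_parity(bits):
    """Target function: 1 when the number of 1-inputs is even, else 0."""
    return 1 if sum(bits) % 2 == 0 else 0


def needle_fitness(candidate, n):
    """1 (needle) if `candidate` matches even-n-parity on all 2**n inputs, else 0 (hay)."""
    return 1 if all(candidate(bits) == even_parity(bits)
                    for bits in product((0, 1), repeat=n)) else 0


def almost_parity(bits):
    """Hypothetical candidate that is wrong on exactly one input row (all ones)."""
    return 1 - even_parity(bits) if all(bits) else even_parity(bits)


for n in (5, 8, 10, 12):
    print(n, 2 ** n, needle_fitness(even_parity, n), needle_fitness(almost_parity, n))
```

A candidate that is wrong on a single row scores the same as any other piece of hay, which is why the fitness gives the search no direction.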

In contrast, even-8-parity (a harder problem) has lower success rates in some cases. In
particular, the combination of low Hamming distance and low mutation rate has produced some 
unsuccessful runs. When mutation rate is 1% or 2%, a small amount of neutrality (50) is able to
improve success rates. However, when mutation rate is 4%, a neutrality level equal to Hamming
distance 100 is required to improve the performance. For example, increasing mutation rate from
2% to 4% gives Hamming distance 0 a success rate jump from 82% to 94%. In comparison, increasing
neutrality level from 0 to 50 on mutation rate 4% does not improve success rate. This suggests
that raising mutation rate has a stronger impact than raising neutrality level on the 
evolutionary search. The relationship between neutrality and mutation rates will be discussed in more
detail in Section 8.2.

Regardless of mutation rates, increasing neutrality level beyond 150 does not improve or impair 
the performance; all of them have 100% success rate. This suggests that equilibrium between the
benefits of exploitation and exploration is reached at this point. Indeed, the adaptive/neutral
mutation analysis shows that the exploitation/exploration ratios are very similar for all
Hamming distance beyond 150. Increasing neutrality or mutation rates do not affect this
equilibrium.

Even-10-parity is harder than even-8-parity: there are more unsuccessful runs, especially when
low Hamming distance values were used. Similar to even-8-parity, the success rate remains
approximately the same after a certain Hamming distance value is reached (200 is the equilibrium 
point). Moreover, mutation rates are more influential in this problem: neutrality level 150 is
required to give consistent improvement of success rates (in contrast to neutrality level 100 in
even-8-parity).

Even-12-parity is the hardest among all; none of the implementations has 100% success rate.
Although not as precise as other problems, even-12-parity reaches the equilibrium point around
neutrality level 200. We also made 100 experimental runs using Hamming distance 0 and 100% 
mutation rate. Among them, 48 runs find a solution (48% success rate). This suggests that high 
neutrality and mutation rate are not sufficient for the search algorithm to find a solution to
this problem. Modification of other parameters, such as gene length and the maximum number of
generation, is required to improve performance. The performance of the 7 different Hamming
distances implementations can be roughly divided into two groups: the first group consists of
neutrality level 0, 50, 100 while the second group consists of neutrality level 150, 200, 250 
and 300. In the first group, increasing mutation rates increases success rates while in the 
second group little performance improvement is gained after mutation rate exceeds 4%.
Nevertheless, the group with higher neutrality level also gives higher success rates.


Even-12-Parity:
The ratio pattern of even-12-parity has higher active gene changes than those in the other 
problems. They are also associated with lower success rates. This suggests that the algorithm
does not provide sufficient inactive gene change (exploration) for the search to find a solution.
Moreover, each time problem difficulty is increased more inactive gene changes were required
for the evolutionary search to be successful. 
----------------------------------------------------------------------------------------------
This suggests exploration is more important than exploitation for search in needle-in-haystack
type of space.
----------------------------------------------------------------------------------------------

<my>
==================================================================================================
Concluding Remarks
We observed impressive improvement by ...
We have compared the improvement with three other methods proposed so far.
None of them has revealed better results.

Moreover, when we apply this algorithm in the context of Network Intrusion Detection,
we face a tougher problem: how the system can learn from self data only to detect
non-self. Although we have an enormous amount of self data (hay), we have no information
about non-self (needles) until it's too late.
==================================================================================================


Tina Yu and J. Miller (2002)
"Finding Needles in Haystacks is Not Hard with Neutrality."
EURO GP

==========
there is one kind of search space, needle-in-haystack, which is difficult for heuristic search 
algorithms to outperform random search.

In a needle-in-haystack type of search space, a solution is either a needle or a piece
of hay. In other words, a search algorithm either finds a perfect solution (the needle) or 
otherwise (the hay). 
--------------------------------------------------------------------------------------
No knowledge about the location of the needles can be obtained from examining the hays. 
--------------------------------------------------------------------------------------
In this kind of situation, a heuristic search algorithm works like a random search algorithm.
When the number of solutions in the search space is small, finding a good solution is difficult,
no matter what search algorithm one uses.

What if the search space only has two possible fitness values (one for the needles and the other 
for the hays)? Evolutionary algorithms seem to become helpless in this kind of situation. In this
 study, 
--------------------------------------------------------------------------------------------------
we investigate building a network within the "hay" to provide a trail for the search process. In 
this way, the discovery of the "needle" solutions may become easier. 
--------------------------------------------------------------------------------------------------
Since the network connects solutions with the same fitness (within the hays), it is called 
"neutral network." Moreover, an evolutionary algorithm utilizing such a network for search is 
said to support neutrality, a term borrowed from evolutionary biology.

The theory of natural evolution established by Darwin has had profound impact on biology. Most 
biologists are convinced that selection acting on advantageous mutations is the driving force of
evolution. It was not until the late 1970s when molecular data became available, that the theory
was challenged. In particular, Motoo Kimura found that the number of mutant substitutions in
amino acid sequences of hemoglobin was too large to be explained by the theory of natural
selection. Based on this discrepancy, he proposed the neutral theory, which states that 
--------------------------------------------------------------------------------------------------
most mutants at the molecular level in evolution are caused by random genetic drift rather than 
by natural selection [3]. In other words, the mutants involved are neither advantageous nor
disadvantageous to the survival or reproduction of the individual. ([3] Kimura, M.: The Neutral Theory of Molecular Evolution. Cambridge Univ. Press (1983)).
--------------------------------------------------------------------------------------------------
But can neutral mutations (those are neither advantageous nor disadvantageous) benefit 
evolutionary search?
--------------------------------------------------------------------------------------------------
In particular, we measure the number of neutral mutations that occur in the evolved entities
during evolutionary search. In this way, the impact of neutrality on search performance can be
analyzed quantitatively. Using this approach, we have studied a Boolean function problem. The
results show that 
--------------------------------------------------------------------------------------------------
there is a positive relationship between neutral mutations and success rate: the larger the 
allowed neutral mutations quantity the greater is the possibility for the evolutionary search to 
find a solution.
--------------------------------------------------------------------------------------------------
To investigate these questions, we have devised a methodology for systematic study of this 
subject [12].

The amount of neutral mutations is measured in the selection step, which evaluates both the 
fitness and the number of neutral mutations in the evolved entities. Moreprecisely, 
-----------------------------------------------------------------------------
an offspring solution is selected to replace the current winner only when it has a better fitness
or it has the same fitness but its neutral mutants are within a specified range (the Hamming 
bound). 
-----------------------------------------------------------------------------
One can envisage all solutions with the same fitness and satisfy the Hamming bound are connected 
in a network (neutral network). The search process selects solutions in the network one after 
another in the manner of a neutral walk. We found that such a walk can lead to a solution with 
a better fitness if it satisfies the fitness improvement criterion. 

The criterion is concerned with the ratio of adaptive and neutral mutations. The analysis 
indicated that when this ratio for the neutral walks was close to the ratio for the fitness
improvement, a high probability of success occurred.

=====
Claus Wilke and colleagues studied the evolution of digital organisms (as computer programs) 
using the Avida system [11]. They reported that
__________________________________________________________________________________________________
under high mutation rates, an organism that has its neighbors (those accessible by one mutation
step) with a similar fitness (not necessary the same fitness) had a higher reproduction rate.
The reason is that such flat fitness landscape is more robust against mutations than a fitness
landscape that has high and narrow peak. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Although they didn't mention neutral networks (where the neighbors have the same fitness), one
would expect the same findings.

=====
Marc Ebner and colleagues also studied the relationship between neutral networks and 
evolvability [2]. Particularly, they investigated a search space with 2^16 possible fitness 
values. Moreover, these fitness values were divided into 64 groups. Their selection criterion was
 similar to ours in that they allowed an individual with better or equal fitness to replace 
the current winner. They experimented with 3 different sizes of neutral network 
(1, 2^112 , 2^320 ) using a single point mutation. They reported that the larger the network
(more neutrality), the higher the average population fitness. ([2] Ebner, M., Langguth, P., Albert, J., Shackleton, M. and Shipman, R.: On neutral networks and evolvability. In: Proceedings of the 2001 Congress on Evolutionary Computation, IEEE Press (2001) 1-8.)


=====
6.1 Evolutionary Algorithm
1. Randomly generate an initial population of 5 genotypes with the lowest possible fitness and 
   select one (randomly) as the winner.
2. Carry out point-wise mutation on the winning parent to generate 4 offspring;
3. Construct a new generation with the winner and its offspring;
4. Select a winner from the current population using the following rules: 
       1) If any offspring has a better fitness, it becomes the winner.
          Otherwise, an offspring with the same fitness is randomly selected.
       2) If the parent-offspring pair has a Hamming distance within the permitted range the 
             offspring becomes the winner.
          Otherwise, the parent remains as the winner.
5. Go to step 2 unless the maximum number of generations reached or a solution with needle 
      fitness is found.

6.2 Control Parameters
Eleven different mutation rates and 7 neutrality levels were used in the experiments. 
   Mutation Rate (%) on genotype 1,2,4,6,8,10,12,14,16,18,20
   Max Generation 10,000
   Neutrality Level (Hamming distance range) 0,50,100,150,200,250,300
   Population Size 5
   Number of Runs 100


==================================================================================================
We can give it a consideration on the discussion that neutral mutaion makes search easier or 
harder by applying this algorithm to our problem conparing with our random search (Fig)
==================================================================================================


For even-5-parity, all implementations (mutation rates and Hamming distances) have a 100% 
successful rate, i.e. all 100 runs find a solution (see Figure 2A). 

In contrast, even-8-parity (a harder problem) has lower success rates in some cases. In
particular, the combination of low Hamming distance and low mutation rate has produced some 
unsuccessful runs. When mutation rate is 1% or 2%, a small amount of neutrality (50) is able to
improve success rates. However, when mutation rate is 4%, a neutrality level equal to Hamming
distance 100 is required to improve the performance. For example, increasing mutation rate from
2% to 4% gives Hamming distance 0 a success rate jump from 82% to 94%. In comparison, increasing
neutrality level from 0 to 50 on mutation rate 4% does not improve success rate. This suggests
that raising mutation rate has a stronger impact than raising neutrality level on the 
evolutionary search. The relationship between neutrality and mutation rates will be discussed 
in more detail in Section 8.2.

Regardless of mutation rates, increasing neutrality level beyond 150 does not improve or impair 
the performance; all of them have 100% success rate. This suggests that equilibrium between the
benefits of exploitation and exploration is reached at this point. Indeed, the adaptive/neutral
mutation analysis shows that the exploitation/exploration ratios are very similar for all
Hamming distances beyond 150. Increasing neutrality or mutation rate does not affect this
equilibrium.

Even-10-parity is harder than even-8-parity: there are more unsuccessful runs, especially when
low Hamming distance values were used. Similar to even-8-parity, the success rate remains
approximately the same after a certain Hamming distance value is reached (200 is the equilibrium 
point). Moreover, mutation rates are more influential in this problem: neutrality level 150 is
required to give consistent improvement of success rates (in contrast to neutrality level 100 in
even-8-parity).

Even-12-parity is the hardest among all; none of the implementations has 100% success rate. 
Although not as precise as other problems, even-12-parity reaches the equilibrium point around
neutrality level 200. We also made 100 experimental runs using Hamming distance 0 and 100% 
mutation rate. Among them, 48 runs find a solution (48% success rate). This suggests that high 
neutrality and mutation rate are not sufficient for the search algorithm to find a solution to
this problem. Modification of other parameters, such as gene length and the maximum number of
generations, is required to improve performance. The performance of the 7 different Hamming
distance implementations can be roughly divided into two groups: the first group consists of
neutrality level 0, 50, 100 while the second group consists of neutrality level 150, 200, 250 
and 300. In the first group, increasing mutation rates increases success rates while in the 
second group little performance improvement is gained after mutation rate exceeds 4%.
Nevertheless, the group with higher neutrality level also gives higher success rates.


Even-12-Parity:
The ratio pattern of even-12-parity shows more active gene changes than those of the other 
problems. They are also associated with lower success rates. This suggests that the algorithm
does not provide sufficient inactive gene change (exploration) for the search to find a solution.
Moreover, each time problem difficulty was increased, more inactive gene changes were required
for the evolutionary search to be successful. 
----------------------------------------------------------------------------------------------
This suggests exploration is more important than exploitation for search in a needle-in-haystack
type of space.
----------------------------------------------------------------------------------------------

<my>
==================================================================================================
Concluding Remarks
We observed impressive improvement by ...
We have compared the improvement with three other methods proposed so far.
None of them has revealed better results.

Moreover, when we apply this algorithm in the context of Network Intrusion Detection,
we face a tougher problem: how the system can learn from self data only in order to
detect non-self. Although we have an enormous amount of self data (hay), we have no
information about non-self (needles) until it's too late.
==================================================================================================