Intrusion detection plays a key role in detecting attacks over networks, and the increasing usage of Internet services has given rise to several security threats. Though an intrusion detection system (IDS) detects attacks efficiently, it also generates a large number of false alerts, which makes it difficult for a system administrator to identify attacks. This paper proposes automatic fuzzy rule generation combined with a Wiener filter to identify attacks. Further, to optimize the results, simplified swarm optimization is used. After training on a large dataset, various fuzzy rules are generated automatically for testing, and a Wiener filter is used to filter out attacks, which act as noisy data; this improves the accuracy of the detection. By combining automatic fuzzy rule generation with a Wiener filter, an IDS can handle intrusion detection more efficiently. Experimental results, which are based on collected live network data, show that the proposed method provides a competitively high detection rate and a reduced false alarm rate in comparison with other existing machine learning techniques.
Internet usage has become increasingly widespread in relation to online banking, shopping, stock exchanges, and online auctions. Due to its open public nature, the security of the Internet and any related data is always an issue. The tremendous growth of Internet use has led to the development of network intrusion detection, which is especially critical for protection. Intrusion detection is mainly used to identify malicious activity within a network or system; such activity makes network resources vulnerable to attacks [1].

Intrusion detection is mainly based on either misuse detection or anomaly detection. Misuse detection searches for a pattern of user behavior that matches well-known attack scenarios. Meanwhile, anomaly detection is based on the normal behavior of network data, and it detects attacks based on significant deviations from the normal data flow [2]. The main advantage of anomaly detection is that it detects attacks even when there is no existing signature. However, it suffers from a comparatively larger false positive rate.

Moreover, it is very difficult to train an intrusion detection method, such as fuzzy rule generation combined with a Wiener filter, using only unclassified data, because this ultimately leads to a drop in accuracy.

Recently, many soft computing methods, such as fuzzy sets, neural networks [3], genetic algorithms (GAs) [4], and simulated annealing, have been used in the field of intrusion detection [5]–[6]. Broadly speaking, the fuzzy concept is one where the knowledge of human experts is used to create a fuzzy rule; such a fuzzy rule will then make an intrusion detection system (IDS) more robust [7]; however, such an IDS will still be deficient with regard to learning and training adaptation. It is this fact that has led to the development of automatic fuzzy rule generation, such as neuro-fuzzy rule generation and genetic-fuzzy rule generation.

From the point of view of classification, an intrusion detection module is mainly used to identify data as either normal or abnormal (that is, an attack). In addition, machine learning techniques, such as data mining [8]–[9], statistical analysis [10], immunological-inspired techniques [11], and computational intelligence algorithms, such as genetic programming [12], artificial immune systems [11], and swarm intelligence (SI), are also reported in [6]–[13] as a means for solving the IDS problem. Among the aforementioned methods, particle swarm optimization (PSO) is one of the most popular SI algorithms. PSO has been reported to outperform GAs over a series of optimization tasks [14]; however, very little research has been performed on applying the PSO technique to network intrusion detection problems. As a result, this paper focuses on a new simplified version of PSO (namely, simplified swarm optimization (SSO)) along with automatic fuzzy rule generation for detecting intrusion patterns.

In this paper, a live network dataset is used both for training and for testing attacks. The network data are collected as a live dataset using tools from Linkware Technologies [15], a provider of embedded systems; its network simulation capture (NSC) tool is used to capture live data for analyzing various attacks in a network. The proposed approach detects intrusion in several stages. First, it selects the most relevant attributes that represent a pattern of network intrusion using a new hybrid of SSO with a random forest algorithm (SSO-RF). Second, automatic fuzzy rules are generated based on a Wiener filter as a decision-making process to detect and classify current intrusion activity as normal or as an attack. Finally, SSO is employed to optimize the structure of the fuzzy sets of the fuzzy decision-making engine. The main features of the proposed work are summarized as follows:
Fuzzy rule generation based on a Wiener filter is used to collect and analyze network data to detect anomalous behavior in network traffic.
The hybrid feature-selection method (SSO-RF) is used to reduce the attributes more efficiently.
The proposed adaptive fuzzy granule fitness function is used to mine more new rules with higher accuracy.
The proposed framework is more flexible in detecting intrusion behavior in real-time scenarios.
A high detection rate (DR) and low false alarm rate (FAR) are obtained for the overall network data.
The rest of this paper is organized as follows. Section II describes related work on various intrusion detection methods. Section III explains the concept of fuzziness and the Wiener filter in relation to detection. Section IV discusses the proposed system architecture. Section V presents and analyzes the experimental results. Finally, the conclusions and future work are detailed in Section VI.
II. Related Work
IDS research has been advanced by the datasets developed at the MIT Lincoln Laboratory under the sponsorship of the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL) [16], which are widely used as training and testing datasets for evaluating IDSs. Later, the KDDCup99 dataset was developed as a standard benchmark and is commonly used to evaluate intrusion detection [17]. It consists of five million single connection records of training data and two million connection records of testing data. Due to its huge size, only 10% of this dataset is typically used for IDS evaluation. However, various statistical issues [18] degrade the performance of systems evaluated on KDDCup99, a fact that led to the creation of the NSL-KDD dataset [19], which consists of only selected records from the KDDCup99 data, whereby no redundant or duplicate records are present in either the training data or the testing data.

A.N. Toosi and M. Kahani [20] have proposed a neuro-fuzzy classifier based on an adaptive neuro-fuzzy inference system using GAs. In addition, subtractive clustering is used to group data from large databases. No feature selection methods were used. In terms of training time, testing time, and cost per example, the method proposed by A.N. Toosi and M. Kahani was reported to perform better than the existing systems it was compared with.

Y.Y. Chung and N. Wahid [21] have proposed a hybrid network IDS using SSO. The proposed system achieves a classification accuracy of 93.3%, which is higher than that of the compared systems, and it is a competitive classifier for IDS.

S. Mabu and others [22] have proposed an intrusion detection model based on fuzzy class association (FCA) rule-mining using genetic network programming (GNP). Experimental results based on FCA and GNP and using the KDDCup99 and DARPA98 databases show that it affords competitively high DRs compared with other learning techniques.

S. Zaman and others [23] have proposed a new concept for feature selection based on various soft computing methodologies, such as PSO, GA, and differential evolution (DE), for reducing attributes in the KDD dataset to detect intrusion. The classification accuracy, FAR, training time, and testing time are evaluated using SVMs and NNs, and it was found that DE reduces more attributes than the other two approaches.

M. Davarynejad and others [24] proposed a new adaptive fuzzy granule fitness function to generate a fitness queue for various evolutionary optimization techniques, such as GAs, producing better results.

A.J. Malik and others [25] have proposed a new concept, hybrid PSO with a random forest algorithm, to reduce the number of required feature attributes. Their new concept achieves an accuracy of approximately 95%.

M. Al-Kasassbeh [26] proposed a new concept for a Wiener filter as an agent to filter out various attacks that pass through a network. The main use of this filter is to eliminate scalability issues based on a statistical approach.
III. Fuzziness and Wiener Filter
- 1. Fuzzy System
Fuzzy systems are rule-based systems that can be easily structured based on prior knowledge. A fuzzy system is based on fuzzy logic, which provides a computational framework for manipulating and reasoning about imprecise expressions of knowledge. L.A. Zadeh [27] explains that fuzzy logic is an extension of Boolean logic. In a classical (crisp) set A, an element is either a full member or a non-member:
$$\mu_{A}(x)=\begin{cases}1, & x\in A,\\ 0, & x\notin A.\end{cases} \tag{1}$$
A fuzzy set relaxes this restriction so that complex decisions can be made on imprecise data: the membership of each element is given by a triangular membership function, μ(x), with values in the range [0, 1].

A fuzzy rule is generally a fuzzy If-Then rule, in which the “If” part acts on a given input as a fuzzy antecedent and the “Then” part is a fuzzy consequent.

Seven linguistic values are used for each attribute: very small (VS), small (S), small-medium (SM), medium (M), large-medium (LM), large (L), and very large (VL). Figure 1 shows the membership functions (MFs) of the seven linguistic values.
Fuzzification. Converts classical (crisp) data into fuzzy data, that is, into degrees of membership given by the MFs.
Rule Evaluation. Each fuzzy rule is evaluated based on the degree of membership of the crisp input values in the fuzzy set of its antecedent.
Defuzzification. Converts the fuzzy outputs back into crisp values.
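The fuzzification step can be illustrated with a minimal Python sketch. It assumes that each attribute has been normalized to [0, 1] and that the seven triangular MFs have evenly spaced peaks; both are assumptions made for illustration, since the actual shapes are given only in Fig. 1. The names `triangular`, `fuzzify`, and `best_label` are hypothetical.

```python
LABELS = ["VS", "S", "SM", "M", "LM", "L", "VL"]


def triangular(x, left, peak, right):
    """Triangular MF: 0 outside (left, right), rising linearly to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify(x):
    """Return the membership degree of a crisp x in [0, 1] for each label."""
    n = len(LABELS)
    step = 1.0 / (n - 1)  # assumption: peaks evenly spaced on [0, 1]
    degrees = {}
    for k, label in enumerate(LABELS):
        peak = k / (n - 1)
        degrees[label] = triangular(x, peak - step, peak, peak + step)
    return degrees


def best_label(x):
    """Linguistic value with the highest membership degree for x."""
    degrees = fuzzify(x)
    return max(degrees, key=degrees.get)
```

With this layout, a crisp value activates at most two neighboring linguistic values, and their degrees sum to 1, which keeps rule evaluation simple.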
A Wiener filter is used to discriminate an original signal from a noisy signal. It is employed in a wide range of applications, such as echo cancellation, linear prediction, signal restoration, channel equalization, and system identification [28]. The coefficients of a Wiener filter are designed to minimize the mean square error (MSE) between the desired output and the actual output of the filter. A Wiener filter is an optimum linear filter that estimates a desired signal sequence from another related sequence. It relies on statistical parameters, such as the mean and the correlation function, of the original signal with unwanted additive noise. The problem for such a linear filter is to minimize the effect of noisy data at the output according to a statistical criterion.

In this research work, a Wiener filter based on the minimum mean square error (MMSE) criterion is used to filter out attacks, which act as noise in the data. Based on fuzzy rule generation, the Wiener filter is trained as an error detector and flags abnormalities that exist within network data.

The Wiener filter input consists of a time series, x(0), x(1), x(2), ... , x(n), and the filter itself is characterized by the impulse response w_{0}, w_{1}, w_{2}, ... . At some discrete time, n, the filter produces an output denoted by y(n). This output is used to provide an estimate of a desired response denoted by d(n). With the filter input and the desired response representing single realizations of their respective stochastic processes, the estimation is accompanied by an error with statistical characteristics of its own.

In particular, the filter is designed so that the estimation error, e(n) = d(n) − y(n), is as small as possible in a statistical sense. When the Wiener filter (see Fig. 2) operates under optimum conditions, the estimation error takes on the following special form:
$$e_{\min}(n)=d(n)-y_{\text{opt}}(n), \tag{2}$$
whereby the terms may be rearranged as follows:
$$d(n)=y_{\text{opt}}(n)+e_{\min}(n). \tag{3}$$
Let J_{MSE} denote the MMSE, which can be defined as follows:
$$J_{\text{MSE}}=\text{E}\left[|e_{\min}(n)|^{2}\right]. \tag{4}$$
Here, e_{min}(n) is the estimation error under optimum conditions; the mean square values of both sides of (3) are used below to express the MMSE in terms of variances. Next, consider the estimation problem of finding the vector w that minimizes the cost function J_{MSE}(w). Applying the principle of orthogonality gives (5), where 0_{L×1} is the L-by-1 zero vector and L is the filter length:
$$\text{E}[X(n)e_{\min}(n)]=\text{E}\left[X(n)\left(d(n)-w_{\text{opt}}^{\text{T}}X(n)\right)\right]=0_{L\times 1}. \tag{5}$$
The principle of orthogonality can be interpreted as follows: at time n, the input vector X(n) = [x(n), x(n − 1)]^{T} (shown here for L = 2) passes through the optimal filter w_{opt} = [w_{opt,0}, w_{opt,1}]^{T} to generate the output y_{opt}(n). Given d(n), y_{opt}(n) is the only element in the space spanned by X(n) that leads to an error e(n) that is orthogonal to x(n), x(n − 1), and y_{opt}(n). The optimal solution therefore satisfies
$$\text{E}[X(n)X^{\text{T}}(n)]\,w_{\text{opt}}=\text{E}[X(n)d(n)]. \tag{6}$$
We introduce the following definitions for the input auto-correlation matrix and the cross-correlation vector, which are the two expectations in (6):
$$\begin{array}{c}R_{XX}=\text{E}[X(n)X^{\text{T}}(n)],\\ r_{xd}=\text{E}[X(n)d(n)].\end{array} \tag{7}$$
The elements of R_{XX} are
$$[R_{XX}]_{k,i}=\text{E}[x(n-k)x(n-i)]=r(i-k),\quad k,i=0,1,\ldots,L-1, \tag{8}$$
where the signal statistic r(i − k) is the auto-correlation function of the filter input x(n) for a lag of i − k. The elements of r_{xd} are
$$[r_{xd}]_{k}=\text{E}[x(n-k)d(n)]=p(-k),\quad k=0,1,\ldots,L-1, \tag{9}$$
where p(−k) is the cross-correlation between the filter input x(n − k) and the desired response d(n) for a lag of −k.

Denoting the auto-correlation matrix R_{XX} by R_{0}, the equation for the optimum filter coefficient vector w can be written as follows:
$$R_{0}\,w=r_{xd}, \tag{10}$$
$$w=R_{0}^{-1}\,r_{xd}, \tag{11}$$
where r_{xd} is the L-by-1 cross-correlation vector between the tap inputs x(n), x(n − 1), ... , x(n − L + 1) of the filter and the desired response d(n):
$$r_{xd}=\left[p(0),p(-1),\ldots,p(1-L)\right]^{\text{T}}, \tag{12}$$
where x(n) and d(n) are assumed to be jointly wide-sense stationary. For L = 4, the correlation matrix can be written as
$$R=\left[\begin{array}{cccc}r(0) & r(1) & r(2) & r(3)\\ r(-1) & r(0) & r(1) & r(2)\\ r(-2) & r(-1) & r(0) & r(1)\\ r(-3) & r(-2) & r(-1) & r(0)\end{array}\right], \tag{13}$$
where the correlation matrix R is a Toeplitz matrix [29]–[30]. The matrix elements along its main diagonal are all equal, as are those along any diagonal parallel to the main diagonal. For real-valued data, r(−k) = r(k), so R is also symmetric. The correlation matrix R is always nonnegative definite. Taking the mean square values of both sides of (3) and using (4), the MMSE equals the difference between the variance of the desired response and the variance of the estimate that the filter produces as its output; that is,
$$J_{\text{MSE}}=\sigma_{d}^{2}-\sigma_{y_{\text{opt}}}^{2}. \tag{14}$$
To find the normal and abnormal states of the network, the eigen-decomposition of the auto-correlation matrix R is used. For instance, an eigenvector marks the boundary between normal data traffic and abnormal traffic based on minimum and maximum values. Values beyond the maximum may indicate an abnormality in the data flow.
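As an illustration of the normal equations above, the following Python sketch estimates the correlation statistics from sample data, builds the Toeplitz matrix, and solves for the filter taps. It is a minimal sketch under stated assumptions: the input is real-valued and wide-sense stationary, expectations are replaced by sample averages, and the helper names (`autocorr`, `crosscorr`, `solve`, `wiener_taps`, `make_example`) are hypothetical rather than taken from the paper.

```python
import random


def autocorr(x, lag):
    """Sample estimate of r(lag) = E[x(n) x(n - lag)]."""
    n = len(x)
    return sum(x[i] * x[i - lag] for i in range(lag, n)) / (n - lag)


def crosscorr(x, d, lag):
    """Sample estimate of p(-lag) = E[x(n - lag) d(n)]."""
    n = len(x)
    return sum(x[i - lag] * d[i] for i in range(lag, n)) / (n - lag)


def solve(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w


def wiener_taps(x, d, length):
    """Estimate the optimum taps from R w = r_xd using sample statistics."""
    r = [autocorr(x, k) for k in range(length)]
    # Toeplitz matrix; symmetric because the data are real-valued.
    big_r = [[r[abs(i - k)] for i in range(length)] for k in range(length)]
    r_xd = [crosscorr(x, d, k) for k in range(length)]
    return solve(big_r, r_xd)


def make_example(n=2000, seed=0):
    """Synthetic data: d(n) = 0.5 x(n) - 0.3 x(n - 1) with white-noise x."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    d = [0.5 * x[i] - 0.3 * (x[i - 1] if i > 0 else 0.0) for i in range(n)]
    return x, d
```

On the synthetic data from `make_example`, the estimated taps should come out close to the true coefficients 0.5 and −0.3, up to sampling error. For long filters, a Toeplitz-specific solver such as the Levinson recursion would be a more efficient alternative to general Gaussian elimination.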
IV. Proposed System Architecture
- 1. Feature Reduction
Live network server data is collected using a network data recorder (or sniffer), which reads raw network packets upon transmission and stores them on disk. This unclassified data is then separated into either attack data, which is based on various attack categories (DOS, Probe, U2R, and R2L), or normal data. These datasets are then stored in a database, which contains around 20,971,520 records, and are used for both the training and testing of the IDS. There are more than 13,000 attacks in the training and testing dataset that cover the four major attack categories. This dataset contains 41 attributes, some of which may be used to detect, as opposed to identify, intrusions. Table 1 shows the number of records in the dataset.
The traditional pre-processing algorithms, such as particle swarm optimization with random forest (PSO-RF), DE, and GAs, do not adapt well to benchmark datasets, such as KDDCup99, DARPA, and NSL-KDD, due to the huge sizes of such datasets; thus, applying them to such datasets would result in large numbers of false recommendations.

In this paper, a new hybrid technique, SSO-RF, is used. SSO is a simplified version of PSO and can be used to find the global minimum of nonlinear functions [21]. This approach is used to solve the problem of classification and reduce the dimensions of a dataset. Random forest (RF) methods are used to find the best split of the data and to identify the most important attributes for detecting intrusion [25]. The intrusion DR (IDR) is used as the fitness measure of the SSO-RF classifier, and the particle with the highest IDR in the swarm is considered to be the best. At each iteration, attribute selection is performed by SSO, and within its loop, RF is used to classify the data. The SSO loop terminates when a stopping condition (the maximum number of iterations or 100% IDR) is reached. The evolutionary algorithm for SSO-RF is as follows [31]:
1. Let p be the population size, MG be the maximum generation, MF be the maximum fitness function, and Cw, Cp, Cg be the three predetermined constants.
2. Generate a random number R between 0 and 1 for d-dimension data.
3. Perform the comparison strategy where:
If (0 ≤ R
Else if (Cw ≤ R
Else if (Cp ≤ R
Else if (Cg ≤ R ≤ 1), then {xnd = new (xnd)};
4. RF (n, d) = 1/exp (x(n, d))
Update (xnd).
5. Update pbest.
6. Update gbest.
7. Repeat the process until the stopping condition is satisfied or the maximum IDR is reached.
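The comparison strategy in step 3 can be sketched in Python as follows. Because the first three branches of step 3 are incomplete in the listing above, the sketch assumes the update rule as it is commonly stated in the SSO literature: keep the current value when R < Cw, copy pbest when Cw ≤ R < Cp, copy gbest when Cp ≤ R < Cg, and draw a new random value otherwise. This is an assumption, not the authors' confirmed implementation, and the function and parameter names are hypothetical.

```python
import random


def sso_update(x, pbest, gbest, cw, cp, cg, new_value, rng=random):
    """One SSO position update, applied independently to each dimension.

    Assumed rule (from the general SSO literature, not from this paper):
      R <  Cw        -> keep the current value
      Cw <= R < Cp   -> copy the particle's personal best (pbest)
      Cp <= R < Cg   -> copy the swarm's global best (gbest)
      Cg <= R        -> draw a new random value
    """
    updated = []
    for d in range(len(x)):
        r = rng.random()  # uniform in [0, 1)
        if r < cw:
            updated.append(x[d])
        elif r < cp:
            updated.append(pbest[d])
        elif r < cg:
            updated.append(gbest[d])
        else:
            updated.append(new_value())
    return updated
```

For feature selection, each dimension could be a 0/1 flag that marks whether an attribute is kept, with `new_value=lambda: random.randint(0, 1)`; the fitness of the resulting mask would then be the IDR returned by the RF classifier.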
The experimental setup for intrusion detection is split into two steps. Initially, the SSO-RF algorithm is implemented to reduce the feature set for the IDS. Next, the result is validated using an RF classifier. The experiment is repeated ten times during both the training and testing phases. The SSO-RF method is evaluated for 50 particles, and the iterations are repeated for 100 generations or until the maximum IDR is reached.

The proposed method filters the most suitable attributes based on the number of normal connections and reduces the dimensions of the live dataset. The eleven selected attributes (see Table 2) are further used in the fuzzy Wiener detection engine to analyze the intrusive behavior. The proposed method (SSO-RF) produces near-optimal results compared to other intelligent swarm techniques [31].
In the proposed system, initially, the fuzzy rules are automatically generated using Lagrange interpolation with the successive approximation method. This is based on a sequence of rule approximations that converge on the solution and that are constructed recursively (iteratively); that is, each new rule is calculated on the basis of the preceding rule generation.

Let f(x) be a continuous function in the interval [a, b]. To locate the position of the initial rule of the function f(x) = 0, we divide the interval [a, b] into n subintervals as a = x_{0} < x_{1} < x_{2} < ... < x_{n} = b, where x_{i} = x_{0} + ih, i = 0, 1, 2, ... , n, and h = (b − a)/n. Let the rule generation equation be x = f(x) for x_{1}, ... , x_{n}, ... , such that x_{1} = f(x_{0}), x_{2} = f(x_{1}), x_{3} = f(x_{2}), ... , x_{n} = f(x_{n−1}). Iterate the same process until |x_{n} − x_{n−1}| is smaller than some specified tolerance (that is, the maximum IDR is reached). The result of the successive approximation is then processed using Lagrange interpolation to check whether the fuzzy rule belongs to one of the seven linguistic values; attributes that fall within the seven linguistic values are further filtered using a Wiener filter to classify the data into the normal or attack categories. The procedure for fuzzy rule generation using the Wiener filter is presented below.

A. Procedure for Fuzzy Rule Generation Using Wiener Filter to Detect Attacks
Step 1. An input dataset, D, is divided into five subsets as D = {D_{i}; 1 ≤ i ≤ 5} based on a fuzzy predictor. The dataset is then sent as the input signal x_{i}(n) to the Wiener filter.
Step 2. The Wiener filter input is modeled as the sum of normal dataflow s_{i}(n) and abnormal dataflow η(n), where x_{i}(n) = s_{i}(n) + η(n).
Step 3. The Wiener filter is used to remove abnormal data as the noise signal and extract the normal output data s_{i}.
Step 4. To achieve a balanced estimator, the fuzzy predictor uses seven linguistic variables (that is, VS, S, SM, M, LM, L, and VL) to generate rules.
Step 5. Based on the fuzzy rule, the Wiener filter is used as the error detector to classify the attack data in the network. The filter is used to reduce the MSE.
Step 6. The correlation matrix generated by the Wiener filter is used in the chi-squared test, which returns an output result of either 0 or 1.
Step 7. The result 0 indicates no significant difference (normal data), and the result 1 indicates a significant difference (abnormal data).

Figure 3 shows the process flow for automatic fuzzy rule generation. The transaction values are used to identify security threats that have already been classified according to a given predefined intruder detection rule. A normal network transaction can easily pass through this process (the loop executed in our computer to check data); however, an ambiguous transaction will be suspended from subsequent phases. In the case of an ambiguous transaction, an alert event and inspection criteria are triggered. If there is a rule to identify the ambiguous transaction, then the numerical and logical values are adjusted and fine-tuned to identify any attack related to the ambiguous transaction. If there is no rule to identify the ambiguous transaction, then a new rule is generated based on logical and numerical values. When a new rule is generated, it is merged with the existing reference database, so the numerical and logical values stored in our database are updated following further similar transactions with similar input values; this is the essential aspect of the machine learning procedure that makes the proposed system resilient to attacks. The reference database is frequently analyzed and updated to accurately group the harmonic parameters (11 attributes) into a correlative group and the non-harmonic parameters into an auto-correlative group.
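The two numerical tools named in the rule generation step, successive approximation and Lagrange interpolation, can be sketched in Python as follows. This is a generic illustration of the techniques rather than the authors' rule generator: the iterated function, the starting point, and the tolerance are placeholders chosen for the example, and the function names are hypothetical.

```python
import math


def successive_approximation(f, x0, tol=1e-8, max_iter=1000):
    """Iterate x_n = f(x_{n-1}) until |x_n - x_{n-1}| < tol.

    Returns the final iterate and the number of iterations used.
    """
    x_prev = x0
    for n in range(1, max_iter + 1):
        x_next = f(x_prev)
        if abs(x_next - x_prev) < tol:
            return x_next, n
        x_prev = x_next
    raise RuntimeError("successive approximation did not converge")


def lagrange(points, x):
    """Evaluate the Lagrange interpolating polynomial through points at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Illustrative use: the fixed point of cos(x), starting from x0 = 1.
root, iterations = successive_approximation(math.cos, 1.0)
```

The iteration converges only when f is a contraction near the solution, which is why the sketch caps the number of iterations and raises an error instead of looping indefinitely.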
B. SSO Module
To improve the accuracy of the proposed method, SSO is used to evolve the optimal solution for the current population. The SSO algorithm repeatedly modifies the population based on the movement of each particle between its pbest (personal best) and gbest (global best) positions.

In the proposed system, each individual particle is given as input to the MF of the fuzzy decision module. The adaptive fuzzy granule fitness function [24] is used for fitness generation to evaluate DR and FAR. The fitness queue stores the individuals that are not sufficiently similar to any known granule; if an individual is sufficiently similar to a known fuzzy granule, then that granule’s fitness is used instead of computing the exact fitness.

Each fuzzy granule in the queue is defined as
$$\begin{array}{l}G=\{c_{d},\ \sigma_{d},\ L_{d}\},\\ \text{where } c_{d}\in R^{m},\ \sigma_{d}\in R,\ L_{d}\in R,\ d=1,2,\ldots,l,\end{array} \tag{15}$$
where initially l = 0, so that the fuzzy queue is empty, c_{d} is an m-dimensional vector of centers, σ_{d} is the width of the MFs of the dth fuzzy granule, and L_{d} is the granule’s life index. The average similarity of an individual to the dth granule can be computed using
$$\overline{\mu}_{j,d}=\frac{1}{m}\sum_{r=1}^{m}\mu_{d,r}\left(x_{j,r}^{i}\right), \tag{16}$$
where X_{j}^{i} = {x_{j,1}^{i}, x_{j,2}^{i}, ... , x_{j,r}^{i}, ... , x_{j,m}^{i}} is the ith individual in the jth particle. The fitness of X_{j}^{i} is either computed by the exact fitness function or estimated by associating it with the granule in the queue that is most similar to X_{j}^{i}, provided that the similarity exceeds a predefined threshold, as follows:
$$f(X_{j}^{i})=\begin{cases}f(c_{d}), & \text{if } \max\limits_{d\in\{1,2,\ldots,l\}}\left\{\overline{\mu}_{j,d}\right\}>\theta_{i},\\ f(X_{j}^{i})\ \text{computed by the fitness function}, & \text{otherwise},\end{cases} \tag{17}$$
where f(c_{d}) is the stored fitness of the most similar granule. Threshold θ_{i} increases as the best individual’s fitness in generation i increases. Hence, as the particle matures, the fuzzy fitness queue reaches the highest fitness valuation.
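The granule-based fitness lookup can be sketched in Python as follows. The sketch assumes Gaussian membership functions centered on each granule, a fixed width for newly created granules, and a life index that is incremented each time a granule is reused; these are illustrative assumptions, since the text does not specify them, and all names are hypothetical.

```python
import math


def similarity(x, center, sigma):
    """Average membership of x in a granule, assuming Gaussian MFs."""
    m = len(x)
    total = sum(math.exp(-((xr - cr) ** 2) / sigma ** 2)
                for xr, cr in zip(x, center))
    return total / m


def granular_fitness(x, queue, theta, exact_fitness, sigma=0.5):
    """Return (fitness, computed_exactly) for individual x.

    If the most similar granule in the queue exceeds the threshold theta,
    its stored fitness is reused; otherwise the exact fitness is computed
    and x is added to the queue as the center of a new granule.
    """
    best = max(queue,
               key=lambda g: similarity(x, g["center"], g["sigma"]),
               default=None)
    if best is not None and similarity(x, best["center"], best["sigma"]) > theta:
        best["life"] += 1  # assumption: reuse extends the granule's life index
        return best["fitness"], False
    value = exact_fitness(x)
    queue.append({"center": list(x), "sigma": sigma,
                  "life": 0, "fitness": value})
    return value, True
```

In an IDS setting, `exact_fitness` would be the expensive step (for example, running the classifier to obtain DR and FAR), so reusing a granule's stored fitness trades a small approximation error for fewer full evaluations.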
V. Analysis of Experimental Results
The live dataset summarized in Table 1 is used as the test data to evaluate the classifier. Training was performed using 30% of the training data, and the full dataset (see Table 1) was used for the testing phase to analyze the performance of the proposed system. Both the classifier and the Wiener filter, which is used as an error detector, are trained based on fuzzy rule generation. To make the proposed system more accurate, a fuzzy granule fitness function is used to generate various fitness values based on similarity, and an SSO optimizer is used to further optimize the intrusion detection. Table 3 shows the various parameters, such as DR and FAR, the training time, and the testing time.
The soft computing method (fuzzy logic) within the IDS, along with a Wiener filter, helps the IDS to detect attacks more accurately. The proposed work is also compared with several machine learning techniques in Table 4, and it is found that fuzzy rule generation with a Wiener filter gives a strong performance in detecting various attacks, under the categories mentioned in this paper, in computer networks. Furthermore, this method is flexible and well suited to this research work due to its automatic rule generation and adaptive fuzzy granule fitness queue. The DR and FAR are calculated based on the following:
$$\text{DR}=\frac{\text{Total number of correctly classified attacks}}{\text{Total number of instances}}\times 100, \tag{18}$$
$$\text{FAR}=\frac{\text{Total number of misclassified instances}}{\text{Total number of instances}}\times 100. \tag{19}$$
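The two measures can be computed directly from labeled predictions. The following Python sketch follows the DR and FAR definitions exactly as given above, with both measures normalized by the total number of instances; the label encoding (1 for attack, 0 for normal) and the function name are assumptions made for the example.

```python
def detection_metrics(actual, predicted):
    """Return (DR, FAR) in percent, following the definitions in the text.

    actual and predicted are equal-length sequences of labels,
    with 1 = attack and 0 = normal (assumed encoding).
    """
    total = len(actual)
    correct_attacks = sum(1 for a, p in zip(actual, predicted)
                          if a == 1 and p == 1)
    misclassified = sum(1 for a, p in zip(actual, predicted) if a != p)
    dr = 100.0 * correct_attacks / total
    far = 100.0 * misclassified / total
    return dr, far
```

For example, with `actual = [1, 1, 0, 0, 1]` and `predicted = [1, 0, 0, 1, 1]`, two attacks are correctly classified and two instances are misclassified, so both DR and FAR are 40%.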
Table 4. DR and FAR results for other machine learning techniques.

| Technique | DR | FAR |
|---|---|---|
| Fuzzy logic | 74.5% | 2.51% |
| SVM + fuzzy logic | 80.7% | 2.30% |
| Kalman filter + fuzzy logic | 81.4% | 1.97% |
| GA + fuzzy logic | 83.5% | 1.80% |
| Proposed fuzzy logic + Wiener filter | 88.76% | 1.34% |
Existing research on intrusion detection gives the best DR only in relation to DOS and Probe attacks rather than the other two categories; however, in real-time scenarios, it is U2R and R2L attacks that play the greater role. This research work focuses not only on DOS and Probe attacks but also on U2R and R2L attacks in terms of both accuracy and FAR.

Figure 4 shows a data sequence using normal and Wiener filters, where the x-axis represents the data sequence and the y-axis represents the data size.
MSE values for individual attacks were also measured in this paper, and Table 5 shows the individual MSE values both with and without the Wiener filter. As Table 5 shows, the MSE without the Wiener filter remains constant. When the Wiener filter is used, the MSE values are reduced for all attacks, which increases the accuracy of the proposed system. The calculated MSE values for dataflow in our system are plotted with the input data sequence along the x-axis and the data size along the y-axis, as shown in Fig. 5 (for individual attacks). Figure 5 indicates the dataflow values using the Wiener filter for a DOS attack.
The dataflow values reduce drastically for a Probe attack (see Fig. 6). The proposed method also works effectively and efficiently for U2R and R2L attacks. Figures 7 and 8 show the dataflow values for these attacks, respectively. From Figs. 7 and 8, it is clear from the yellow lines, which indicate the data flow when using a Wiener filter, that the MSE values are reduced for all attacks.
In this paper, automatic fuzzy rule generation combined with a Wiener filter is proposed for a live dataset; it efficiently extracts numerous good rules for classification. In addition, simplified swarm optimization (SSO) is used for optimization.
As an application of intrusion detection, an IDS was developed for real-time scenarios to detect attacks. The results obtained from the proposed method show that the use of fuzzy rule generation combined with a Wiener filter in our IDS, and also in our database, works efficiently for detection both in simulation and in real time.
For intrusion detection, DR and FAR are considered two important criteria for the developed IDS; as such, both are focused on in this research work along with training and testing times.
Initially, in the proposed system, the feature selection method, SSO-RF, reduces the number of attributes more effectively than the methods cited in [31], which makes the IDS more compact for detection. To assess the efficiency of the proposed system, we compared it against other machine learning techniques, and the results show that the use of the Wiener filter in fuzzy rule generation increases the accuracy of intrusion detection.
In our proposed method, SSO is used to optimize the results obtained from the Wiener filter. The main advantage of SSO over other techniques is that it requires no mutation or crossover operations, which may otherwise reduce time efficiency, when processing individual populations.
The important functions of the proposed system are that it efficiently extracts numerous fuzzy rules automatically and that the Wiener filter uses a statistical calculation based on a chi-squared test; together, these determine whether the data is normal or abnormal.
The advantage of our automatic rule generation method is that it generates numerous rules based on real-time attacks; moreover, it automatically calculates deviations for normal connections.
Our future work might focus on using various neural network techniques combined with fuzzy logic for detecting normal and intrusive data in a network.
BIO
Revathi Sujendran (revathisujendran86@gmail.com) received her MSc degree in computer science from St. Joseph College of Arts and Science, Tamilnadu, India, in 2008 and her MPhil degree in computer science from Bharathidasan University, Tamilnadu, India, in 2009. She is currently pursuing her PhD degree at PG and Research, Department of Computer Science, Government Arts College, affiliated to Bharathiar University, Tamilnadu, India. Her current research interests include network security, data mining, and computational intelligence.
Malathi Arunachalam (corresponding author, malathi.arunachalam@yahoo.com) received her MSc degree in computer science from Bharathidasan University, Tamilnadu, India, in 1991 and her MPhil degree in computer science from Bharathiar University, Tamilnadu, India, in 2002. She received her PhD in computer science from Bharathiar University, Tamilnadu, India, in 2012. She is currently serving as an assistant professor in PG and Research, Department of Computer Science, Government Arts College (Autonomous), Coimbatore, India. She has successfully completed one funded project sponsored by UGC. She has designed and developed many computer awareness programs for government school students. Her research interests include data mining, networking, and object-oriented programming.
References
Lazarevic A., "A Comparative Study of Anomaly Detection Schemes in Network Intrusion Detection," Proc. SIAM Conf. Data Mining, University of Minnesota, Minneapolis, MN, USA, 2003.
Han S.-J. and Cho S.-B., "Evolutionary Neural Networks for Anomaly Detection Based on the Behavior of a Program," IEEE Trans. Syst., Man, Cybern. Part B, vol. 36, no. 3, 2005, pp. 559–570. DOI: 10.1109/TSMCB.2005.860136
Lu W. and Traore I., "Detecting New Forms of Network Intrusion Using Genetic Programming," Comput. Intell., vol. 20, no. 3, 2004, pp. 475–494. DOI: 10.1111/j.0824-7935.2004.00247.x
Zadeh L.A., "Role of Soft Computing and Fuzzy Logic in the Conception, Design and Development of Information/Intelligent Systems," Comput. Intell.: Soft Comput. Fuzzy-Neuro Integr. Appl., Berlin, Germany: Springer Berlin, Heidelberg, 1998, pp. 1–9.
Gomez J. and Dasgupta D., "Evolving Fuzzy Classifiers for Intrusion Detection," Proc. IEEE Workshop Inf. Assurance, United States Military Academy, West Point, NY, US, 2001, pp. 68–75.
Li Y. and Guo L., "An Active Learning Based TCM-KNN Algorithm for Supervised Network Intrusion Detection," Comput. Security, vol. 26, no. 7, 2007, pp. 459–467. DOI: 10.1016/j.cose.2007.10.002
Zhang Z., "HIDE: A Hierarchical Network IDS Using Statistical Preprocessing and Neural Network Classification," Proc. IEEE Workshop Int. Assurance Security, West Point, NY, USA, 2001, pp. 85–90.
Powers S.T. and He J., "A Hybrid Artificial Immune System and Self Organizing Map for Network Intrusion Detection," Inf. Sci., vol. 178, no. 15, pp. 3024–3042. DOI: 10.1016/j.ins.2007.11.028
Sousa T., Silva A., and Neves A., "Particle Swarm Based Data Mining Algorithms for Classification Tasks," Parallel Comput., vol. 30, no. 5–6, 2004, pp. 767–783. DOI: 10.1016/j.parco.2003.12.015
Toosi A.N. and Kahani M., "A New Approach to Intrusion Detection Based on an Evolutionary Soft Computing Model Using Neuro-Fuzzy Classifiers," Comput. Commun., vol. 30, no. 10, pp. 2201–2212. DOI: 10.1016/j.comcom.2007.05.002
Mabu S., "An Intrusion-Detection Model Based on Fuzzy Class-Association-Rule Mining Using Genetic Network Programming," IEEE Trans. Syst., Man, Cybern. Part C: Appl. Rev., vol. 41, no. 1, 2011, pp. 130–139. DOI: 10.1109/TSMCC.2010.2050685
Zaman S., El-Abed M., and Karray F., "Features Selection Approaches for IDSs Based on Evolution Algorithms," Int. Conf. Ubiquitous Inf. Manag. Commun., Kota Kinabalu, Malaysia, Jan. 17–19, 2013.
@article{HJTODO_2015_v37n3_502,
  title={Hybrid Fuzzy Adaptive Wiener Filtering with Optimization for Intrusion Detection},
  author={Sujendran, Revathi and Arunachalam, Malathi},
  journal={ETRI Journal},
  volume={37},
  number={3},
  year={2015},
  month={Jun},
  publisher={Electronics and Telecommunications Research Institute},
  doi={10.4218/etrij.15.0114.0275},
  url={http://dx.doi.org/10.4218/etrij.15.0114.0275}
}