Proposing an Integrated Method based on Fuzzy Tuning and ICA Techniques to Identify the Most Influencing Features in Breast Cancer

AUTHORS

Irvan Masoudiasl 1 , Shaghayeh Vahdat 2 , Somayeh Hessam 2 , * , Shahaboddin Shamshirband 3 , 4 , ** , Hamid Alinejad-Rokny 5 , 6

1 Department of Healthcare Services Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran

2 Department of Health Services Administration, South Tehran Branch, Islamic Azad University, Tehran, Iran

3 Department for Management of Science and Technology Development, Ton Duc Thang University, Ho Chi Minh City, Vietnam

4 Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam

5 The Graduate School of Biomedical Engineering, UNSW Australia, 2052, Sydney, Australia

6 School of Computer Science and Engineering, UNSW Australia, 2052, Sydney, Australia

How to Cite: Masoudiasl I, Vahdat S , Hessam S, Shamshirband S, Alinejad-Rokny H . Proposing an Integrated Method based on Fuzzy Tuning and ICA Techniques to Identify the Most Influencing Features in Breast Cancer, Iran Red Crescent Med J. 2019 ; 21(9):e92077. doi: 10.5812/ircmj.92077.

ARTICLE INFORMATION

Iranian Red Crescent Medical Journal: 21 (9); e92077
Published Online: September 30, 2019
Article Type: Research Article
Accepted: August 17, 2019

Abstract

Background: Breast cancer is the most common cancer in women, and no complete cure is available yet. Traditional approaches have low accuracy for breast cancer detection. However, intelligent techniques have recently been used in medical research to accurately distinguish affected individuals from healthy ones.

Objectives: In this study, we aim to develop an ensemble machine learning (ML) method to distinguish tumor samples from healthy samples robustly.

Methods: We used an Imperialist Competitive Algorithm coupled with a Fuzzy System (ICA-Fuzzy-SR) to identify the most influencing features for recognizing tumor samples. To evaluate the proposed method, we used the publicly available Wisconsin Breast Cancer Dataset (WBCD).

Results: Benchmarking with the current existing leading methods indicates that our proposed method achieves 95.45% prediction accuracy, which is 3% better than those reported in previous studies.

Conclusions: These results are achieved while our model is significantly faster than previously proposed models for this problem.

Keywords

Algorithms Benchmarking Breast Neoplasms Fuzzy Tuning ICA Feature Selection Machine Learning Sparse Representation Wisconsin

Copyright © 2019, Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/) which permits copy and redistribute the material just in noncommercial usages, provided the original work is properly cited

1. Background

Breast cancer arises when abnormal cells grow in breast tissue. This type of cancer is common in both developing and developed countries. Lifestyle changes and hormone therapy are among the main reasons for breast cancer. Diagnosing breast cancer at an early stage allows less extensive treatment to be initiated (1). Most diagnostic methods are expensive and time-consuming, such as surgery and mammography screening. In developed countries, mammography screening is a conventional method (2). However, there is a need for new cost-effective approaches for early diagnosis. Unfortunately, the typical strategy for detecting breast cancer has failed to achieve high diagnostic accuracy and suffers from inaccurate alert rates. Therefore, artificial intelligence (AI) and, specifically, machine learning (ML) approaches are adopted to make smart decisions (3).

During the past decade, ML techniques have been widely employed in breast cancer diagnosis for tumor feature extraction in order to increase diagnostic accuracy (4). Fuzzy logic is considered a primary ML method for tackling this problem, and most of its applications in this field are in the classification and feature extraction of tumors. Schaefer et al. (5) employed a fuzzy logic method for the classification of breast cancer using bilateral differences between the right and left breast areas. Also, Chen et al. (6) implemented a fuzzy c-means (FCM) method for the segmentation of breast lesions in MR images. In general, the fuzzy logic method has been employed to make decisions on uncertain information. However, there is a need for dealing with the cognitive uncertainties of the learning mechanism.

Artificial Neural Network (ANN), an advanced technique in cognitive science and machine learning, has also been widely applied as a classification technique in cancer diagnosis. Khan et al. (7) developed an ANN to classify cancer for multiple diagnosis groups in the presence of their gene expression specifications.

Another important aspect of breast cancer research is selecting the most informative and relevant features for the identification process. Feature selection algorithms help the diagnostic algorithm by providing input features/attributes that carry more critical information. In this study, we use the Imperialist Competitive Algorithm (ICA) to select the more relevant features. ICA is an optimization method in AI that performs neighborhood movements and depends less on the initial solutions. It has a better convergence rate compared to other optimization algorithms. Recently, ICA has been used for optimization, dimensionality reduction, and feature selection in many research areas, such as medical research (8-10).

Weighting the selected features is a general method for highlighting the input features in diagnostic/detection applications. The fuzzy system is one of the weighting methods employed for automatically weighting uncertain features (11). The fuzzy system has proved practical for automated decision making when assigning a suitable weight to each element. On the other hand, manual weighting has proved unable to select the most informative features in comparison to fuzzy weighting (5).

The main objectives of this study are as follows:

● Investigating the state of the art of the ML methods applied in breast cancer diagnosis.

● Performing the ICA (Imperialist competitive algorithm) feature selection method.

● Fuzzy tuning of features.

● Performing a sparse representation method for classifying the breast cancer data.

● Analyzing the results of the proposed method.

To achieve the objectives, we first select the best features through ICA and then use fuzzy rules to tune their weights. We then perform the sparse representation algorithm to classify the outcome of the dataset. The sparse representation for classification (SRC) is employed when specific data transformation techniques are applied. In other words, the main advantage of the sparse algorithm is ignoring the Euclidean distance between the samples when learning the wrong description of a test instance in the given dictionary (12).

The rest of this article is organized as follows. Section 2 presents a brief review of previous studies (literature review), and Section 3 presents the applied methodology. The proposed methods of applying ICA feature selection, Fuzzy tuning of feature and sparse representation are described in Section 4. The experimental results are presented in Section 5, and finally, the conclusion is provided in Section 6.

2. Literature Review

Breast cancer, as one of the most common cancers, has a high mortality rate in both developing and developed countries. Of the estimated 1600 cured cases of breast cancer globally in 2012, 794 cases occurred in the more developed countries compared to 883 cases in the developing countries (13).

Image processing and mammogram screening are the two most common breast cancer diagnosis approaches. Computer-Aided Design (CAD) mammography aims to design a system that can read mammograms independently using a Convolutional Neural Network (CNN) (14).

With the advancement of computational methods, machine learning and data science are now considered as other significant contributors to medical science. For example, Rule-Based Fuzzy Cognitive Map (RBFCM) is considered as an important approach to classify the potential breast cancer threatening factors using human experiences and knowledge (15).

Williams et al. (16) employed C4.5 decision trees (J48 is the WEKA implementation of C4.5) and the naïve Bayes algorithm to predict breast cancer risks in Nigerian patients. In this study, C4.5 demonstrated better performance compared to naïve Bayes.

In another study (17), Random Forest was demonstrated to have the best classification performance for the majority of the tested group, while in the case of masses texture, Naive Bayes had the best performance.

In a different study conducted by Diz et al. (17), a data mining method has been presented for breast cancer diagnosis in the presence of two breast cancer datasets. The proposed algorithm was developed on WEKA to classify the texture features.

One of the primary time consuming and challenging tasks in this area is to filter all the information pertinent to support the clinical disease diagnosis. For this reason, a hybrid K-means and support vector machine (K-SVM) method was developed in (18) to extract useful information and diagnose the tumor. Using this model, they were able to significantly improve breast cancer detection (18).

The artificial immune recognition system (AIRS) has been successfully employed for diagnosing various diseases. In the study by Saybani et al. (19), AIRS was applied to classify tuberculosis. The proposed approach accurately classified the tuberculosis cells. Also, in another study by Saybani et al. (20), a hybrid of AIRS-SVM, fuzzy logic, and a real tournament selection mechanism was presented to detect the specified features of cancer cells in breast tissue. The results indicated promisingly high precision.

To the best of our knowledge, the ICA technique provides one of the highest accuracies among the techniques used for classification and feature extraction in breast cancer research. Also, it can be readily merged with Computational Intelligence (CI) techniques to make hybrid methods.

Yaghoobi et al. (8) diagnosed breast cancer using Gray Level Co-occurrence Matrix (GLCM) and cumulative histogram features. They used the texture attributes from GLCM and presented the ICA feature selection method, decision tree, and neural network for feature selection. They showed that ICA feature selection had better performance in terms of accuracy rate in diagnosing breast cancer. In (9), the authors proposed an algorithm that used a fuzzy expert system and the ICA as data mining methods to diagnose coronary artery disease (CAD). In this study, a fuzzy system and a decision tree were used to make correct rules, and the ICA was employed to modify the fuzzy membership functions. Another application of ICA in medical research was selecting the parameters of Gabor filters in retinal vessel segmentation (10). In this study, ICA reduced the parameters of the Gabor filter from 180 to 20 with the maximum accuracy for parameter selection. There are various dimensionality reduction and feature selection algorithms, such as PCA and Fisher (21). However, the results have shown that the ICA feature selection algorithm, which selects beneficial features over iterative epochs, performs better on low-dimensional datasets than PCA and LDA (22).

Like ICA and the fuzzy system, sparse representation for classification has extensive applications in medical research. Al-Shaikhli et al. (23) used a sparse representation of local and global image information in 3D liver segmentation. They used the K-SVD method for learning the dictionary from the dataset. In (12), the authors applied sparse representation to texture classification of CT lung images. They explored characterizing lung textures with sparse decomposition under different regularization techniques. Li et al. (24) proposed contour sparse representation methods and organ location determination for organ segmentation. They investigated the liver, kidney, and spleen.

Later, Jiang et al. (25) examined an image fusion strategy for medical images. They decomposed medical source images into low-frequency (LF) and high-frequency layers.

In this study, the WBCD benchmark was extracted from the UCI (University of California Irvine) machine learning repository and used for experimentation (Frank and Asuncion 2010). The WBCD dataset includes 699 samples, which were collected from needle aspirates of human breast cancer tissue. Each sample consists of nine features from fine-needle aspirates, each in the form of an integer value between 1 and 10. The features are Normal Nuclei, Uniformity of Cell Size, Single Epithelial Cell Size, Marginal Adhesion, Uniformity of Cell Shape, Mitoses, Bland Chromatin, Clump Thickness, and Bare Nuclei. In the employed dataset, 65.5% of the samples are benign, and 34.5% of the samples belong to the malignant class (26).

3. Research Methodology

Figure 1 shows the architecture of the proposed method for the detection of breast cancer based on the ICA feature selection, Fuzzy tuning, and Sparse Representation.

Figure 1. The general architecture of our proposed method for Detection of Breast Cancer

The steps of the proposed method in Figure 1 are described as follows:

1. Step 1: Splitting the dataset for train and test with leave one out cross-validation (LOOCV) or K-Fold cross-validation.

2. Step 2: Selecting the effective features on train-data via the ICA feature selection algorithm.

3. Step 3: Weighting the features by fuzzy tuning:

3.1. Weighting features of selected indexes on test-data.

3.2. Weighting features of selected indexes on train-data.

This weighting is performed via Fuzzy tuning of the feature.

4. Step 4: Performing sparse representation on dataset samples. This step represents the discriminative sparse coefficient for each sample.

5. Step 5: Classification using Euclidean distance to discriminate the new sparse representation of the dataset samples from a test sample of the dataset.

6. Step 6: Evaluation of the proposed method with several evaluation criteria such as Accuracy, RMSE, R, R2, Sensitivity, and Specificity.
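The data split in Step 1 can be illustrated with a short sketch. The helper name `kfold_indices`, the shuffling, and the seed are assumptions made for illustration and are not taken from the study. Setting `k` equal to the number of samples gives leave-one-out cross-validation; the sketch assumes `k` is at least 2.

```python
import numpy as np

def kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: with k == n_samples this is leave-one-out (LOOCV).
    """
    rng = np.random.default_rng(seed)
    # Shuffle the sample indices once, then cut them into k nearly equal folds.
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Example: 5-fold split of 10 samples; each test fold holds 2 samples.
for train_idx, test_idx in kfold_indices(10, 5):
    print(len(train_idx), len(test_idx))
```

Feature selection and fuzzy tuning would then be fitted on the training indices only and applied to the test indices, in line with Steps 2 and 3.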

4. Proposed Method

4.1. ICA Feature Selection

There are various dimensionality reduction and feature selection algorithms, such as Principal Component Analysis (PCA) and Fisher (21). However, the ICA feature selection algorithm has been shown to perform better on low-dimensional datasets (22). ICA and its mechanism are explained in detail in (27). In this study, ICA feature selection was applied to a breast cancer (BC) dataset to choose the most influencing features. The mechanism of the ICA is summarized in Algorithm 1 (Box 1).

Box 1. Imperialist Competitive Algorithm (ICA)
Procedure
1. Generate an initial population
2. Calculate the initial population’s cost function
3. Sort initial population based on cost function values
4. Select imperialist state
5. Divide colonies among imperialists based on their cost function
6. Check the cost of all colonies in each empire
7. Assimilation:
8. If there is a colony which has a lower cost than its imperialist, then
9. Exchange the position of the colony and the imperialist
10. End if
11. Revolution:
12. Update the position of the ith empire
13. Calculate the empire’s total cost
14. Find the weakest empire
15. Give one of its colony to the winner empire
16. Check the number of colonies in each empire
17. If there is an empire without a colony, then
18. Remove the empire and give its imperialist to the best empire
19. End if
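The procedure in Box 1 can be sketched for binary feature masks as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study: the function name `ica_select`, the bitwise assimilation rule, the single-bit revolution, the round-robin colony assignment, the `zeta` weighting of colony costs, and the toy cost function are all hypothetical choices, and lower cost is treated as better, as in steps 8 and 9 of Box 1.

```python
import random

def ica_select(cost, n_features, pop_size=40, n_imp=5,
               revolution_rate=0.3, epochs=30, zeta=0.1, seed=0):
    """Simplified Imperialist Competitive Algorithm over binary masks."""
    rng = random.Random(seed)
    # Steps 1-3: random countries, sorted by cost (lower is better here).
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    pop.sort(key=cost)
    # Steps 4-5: the best countries become imperialists; the rest are colonies.
    empires = [{"imp": c, "cols": []} for c in pop[:n_imp]]
    for i, c in enumerate(pop[n_imp:]):
        empires[i % n_imp]["cols"].append(c)  # assumption: round-robin assignment

    def total_cost(e):
        # Step 13: imperialist cost plus a fraction of the mean colony cost.
        mean_col = sum(map(cost, e["cols"])) / len(e["cols"]) if e["cols"] else 0.0
        return cost(e["imp"]) + zeta * mean_col

    for _ in range(epochs):
        for e in empires:
            for k, col in enumerate(e["cols"]):
                # Assimilation: copy each bit from the imperialist with probability 0.5.
                new = [ib if rng.random() < 0.5 else cb
                       for ib, cb in zip(e["imp"], col)]
                # Revolution: occasionally flip one random bit.
                if rng.random() < revolution_rate:
                    j = rng.randrange(n_features)
                    new[j] = 1 - new[j]
                e["cols"][k] = new
            # Steps 8-9: a colony that beats its imperialist takes its place.
            if e["cols"]:
                b = min(range(len(e["cols"])), key=lambda k: cost(e["cols"][k]))
                if cost(e["cols"][b]) < cost(e["imp"]):
                    e["imp"], e["cols"][b] = e["cols"][b], e["imp"]
        # Steps 14-18: the weakest empire loses a colony to the strongest one;
        # an empire left without colonies is removed.
        if len(empires) > 1:
            empires.sort(key=total_cost)
            strongest, weakest = empires[0], empires[-1]
            if weakest["cols"]:
                strongest["cols"].append(weakest["cols"].pop())
            if not weakest["cols"]:
                strongest["cols"].append(weakest["imp"])
                empires.pop()
    return min((e["imp"] for e in empires), key=cost)

# Toy cost (hypothetical): Hamming distance to a known target mask.
target = [1, 1, 0, 0, 0, 0, 1, 0, 1]
toy_cost = lambda mask: sum(a != b for a, b in zip(mask, target))
print(ica_select(toy_cost, n_features=9))
```

In a real setting the cost would be derived from classifier performance on the training data for the features switched on in the mask, which is far more expensive than the toy cost used here.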

According to the ICA procedure presented in Box 1, the best empire has the best cost, and the position of its imperialist encodes the best subset of features. For example, with the regularization parameters in Table 1, four of the nine features were selected in this study.

Table 1. The Result of the ICA Method on BC Dataset

Regularization parameters: population size = 200; revolution rate = 0.3; epoch = 50; number of initial IMP = 20; number of IMP after 50 epochs = 16

| Imperialist | Position | Cost | Number of Colonies |
|---|---|---|---|
| 1 | [1,1,1,1,1,0,0,1,0] | 0.6924 | 11 |
| 2 | [1,1,1,1,1,1,0,1,0] | 0.6967 | 11 |
| 3 | [1,1,0,0,1,0,0,0,0] | 0.6924 | 10 |
| 4 | [1,1,1,0,1,0,0,1,1] | 0.6967 | 7 |
| 5 | [1,1,0,1,1,0,1,1,1] | 0.6924 | 10 |
| 6 | [0,0,1,0,1,0,0,1,1] | 0.6924 | 10 |
| 7 | [1,1,1,0,1,1,1,1,0] | 0.6924 | 10 |
| 8 | [1,1,0,0,0,0,1,0,1] | 0.6981 | 23 |
| 9 | [1,1,0,1,0,0,1,1,0] | 0.6952 | 13 |
| 10 | [1,1,0,1,0,1,1,1,0] | 0.6909 | 9 |
| 11 | [1,0,0,1,1,0,1,1,0] | 0.6924 | 6 |
| 12 | [0,1,1,1,1,1,1,0,0] | 0.6952 | 12 |
| 13 | [0,0,1,1,0,0,1,0,0] | 0.6895 | 8 |
| 14 | [1,0,0,1,0,1,1,0,1] | 0.6924 | 17 |
| 15 | [1,0,1,0,1,1,0,1,1] | 0.6938 | 9 |
| 16 | [1,0,1,1,1,0,0,0,0] | 0.6952 | 10 |
| 17 | [1,1,0,1,0,0,0,0,0] | 0.6953 | 7 |

Table 1 shows the results of the ICA feature selection. The population size was 200, and 20 initial imperialists (IMP) were selected randomly from the population. In each epoch, the weakest empire is found and one of its colonies is given to the winning empire. The total number of IMP remaining after running the algorithm was 17. The best cost was obtained for IMP number 8, whose position was [1,1,0,0,0,0,1,0,1], where a value of 1 marks a feature selected by ICA and a value of 0 marks an unselected feature.
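As a small illustration of how a position vector is read, the sketch below decodes a binary position into feature labels. Only three rows of Table 1 are copied in, and the helper name `selected_features` is hypothetical. The best imperialist is taken as the one with the largest listed cost value, which matches the row that the text reports as best.

```python
# A few rows copied from Table 1: (imperialist number, position, cost).
rows = [
    (1, [1, 1, 1, 1, 1, 0, 0, 1, 0], 0.6924),
    (8, [1, 1, 0, 0, 0, 0, 1, 0, 1], 0.6981),
    (13, [0, 0, 1, 1, 0, 0, 1, 0, 0], 0.6895),
]

def selected_features(position):
    """Turn a binary position vector into 1-based feature labels."""
    return [f"F{i + 1}" for i, bit in enumerate(position) if bit == 1]

# Imperialist 8 has the largest cost value among the listed rows.
best = max(rows, key=lambda r: r[2])
print(best[0], selected_features(best[1]))  # 8 ['F1', 'F2', 'F7', 'F9']
```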

4.2. Tuning of the Attributes Using the Fuzzy System

The ICA feature selection method selected four features as the best candidates: F1 (normal nuclei), F2 (uniformity of cell size), F7 (bland chromatin), and F9 (bare nuclei). To increase the differences between the selected features, we used the fuzzy system to tune the weight of each attribute. The fuzzy system produces a weight as the output of the fuzzy rule-based system (28). How the fuzzy system works is explained in detail in (29). This weight was calculated through fuzzy rule-based inference with F1, F2, F7, and F9 as the inputs of the fuzzy system.

Before applying the fuzzy system, we analyzed the data statistically to determine the range of each input and assign a proper membership function. The statistical analysis of the BC sample features in the two classes (benign and malignant) showed that the feature values differ between the classes (see Figure 2). For example, for F2 (uniformity of cell size), values from 1 to 3 are more common in the benign class, and values from 4 to 10 are more common in the malignant class. The statistical analyses of the features normal nuclei, uniformity of cell size, bland chromatin, and bare nuclei are presented in Figures 2A, 2B, 2C, and 2D, respectively.

Figure 2. The histogram of frequency values and membership (MF) functions of each feature in the BC dataset in two classes (benign/malignant). A, The frequency values and MF of normal nuclei; B, The frequency values and MF of uniformity of cell size; C, The frequency values and MF of bland chromatin; D, The frequency values of bare nuclei; E, The membership function of the fuzzy output

Figure 2 shows the histogram of frequency values and the membership functions (MF) of each feature in the BC dataset in the two classes (benign/malignant). The membership function of the fuzzy output is also shown in Figure 2. The membership functions for the input data of the fuzzy system are determined based on the frequency of the data in the selected features. More details of the input and output of the fuzzy system are presented in Table 2.

Table 2. Input/Output of the Fuzzy System in the Tuning of the Selected Attributes

| Input/Output | Input Linguistic Variables | Selected by ICA | Range | Parameter Name | Type of Membership Function |
|---|---|---|---|---|---|
| Input | (Low), (High) | Yes | 1 - 10 | Feature 1: Clump Thickness | gbellmf |
| | (Low), (Moderate), (High) | Yes | 1 - 10 | Feature 2: Uniformity of Cell Size | |
| | - | No | 1 - 10 | Feature 3: Uniformity of Cell Shape | |
| | - | No | 1 - 10 | Feature 4: Marginal Adhesion | |
| | - | No | 1 - 10 | Feature 5: Single Epithelial Cell Size | |
| | - | No | 1 - 10 | Feature 6: Bare Nuclei | |
| | (Low), (Moderate), (High) | Yes | 1 - 10 | Feature 7: Bland Chromatin | |
| | - | No | 1 - 10 | Feature 8: Normal Nucleoli | |
| | (Low), (High) | Yes | 1 - 10 | Feature 9: Mitoses | |
| Output | (V-Low), (Low), (Moderate), (High), (V-High) | - | 1 - 10 | Fuzzy Weight | |

Table 2 shows the input features of the fuzzy system, their ranges, linguistic variables, and the type of their membership functions. It is worth noting that the defuzzification method in this fuzzy inference system (FIS) is the centroid.
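To make the roles of the membership functions and the centroid defuzzifier concrete, here is a minimal sketch of a two-rule Mamdani-style system with a single input. The generalized bell function follows its standard definition, 1 / (1 + |(x - c) / a|^(2b)). Everything else is an assumption for illustration: the helper names, all membership parameters, and the two rules are hypothetical and are not the rule base used in the study.

```python
import numpy as np

def gbellmf(x, a, b, c):
    """Generalized bell membership function: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def centroid(universe, membership):
    """Centroid defuzzification of a sampled output membership curve."""
    return float(np.sum(universe * membership) / np.sum(membership))

# Output universe for the fuzzy weight, sampled on the range 1 to 10.
w = np.linspace(1.0, 10.0, 181)

def fuzzy_weight(f1):
    """Two hypothetical rules on a single input feature value f1."""
    low = gbellmf(f1, 3.0, 2.0, 1.0)     # degree to which f1 is Low
    high = gbellmf(f1, 3.0, 2.0, 10.0)   # degree to which f1 is High
    # Rule 1: IF f1 is Low THEN weight is Low (output set clipped at the firing degree).
    out_low = np.minimum(low, gbellmf(w, 2.0, 2.0, 3.0))
    # Rule 2: IF f1 is High THEN weight is High.
    out_high = np.minimum(high, gbellmf(w, 2.0, 2.0, 8.0))
    # Aggregate the clipped sets with max, then defuzzify with the centroid.
    return centroid(w, np.maximum(out_low, out_high))

for value in (1.0, 5.0, 10.0):
    print(value, round(fuzzy_weight(value), 2))
```

The system described in Table 2 has four inputs and five output sets, so its rule base is larger, but each rule would be evaluated and aggregated in the same manner.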

4.3. Sparse Representation for Classification (SRC)

SRC is an active research field for classification purposes. A signal can potentially be represented as a linear combination of atoms in an over-complete dictionary. Therefore, SRC can be applied to various classification tasks. It is based on the hypothesis that a sample can be represented as a linear combination of a few training samples of the same class. A more technical description of SRC can be found in (30) and (31). Algorithm 2 (Box 2) shows the SRC method, in which the training data have two classes (malignant and benign, $\psi = [\psi_1\ \psi_2]$) and the testing data are $Y$. The relationship between the training and testing data is $Y = \alpha_{i,1}\psi_{i,1} + \alpha_{i,2}\psi_{i,2} + \dots + \alpha_{i,n_i}\psi_{i,n_i}$, where $i$ indexes the classes and $n_i$ is the number of training samples in class $i$. The main aim of the sparse algorithm in this study is to obtain the coefficient vector $\hat{A}$ of the minimization problem. As $\ell_0$ minimization is an NP-hard problem, it is replaced with $\ell_1$ minimization for solving this optimization problem.

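Box 2. Algorithm 2 - Sparse Representation for Classification (SRC)
Sparse Representation for Classification (SRC) Algorithm
1. Input: training data $\psi = [\psi_1\ \psi_2]$, where $\psi_1$ = training data for class malignant and $\psi_2$ = training data for class benign; testing data $Y$
2. Problem definition: $Y = \alpha_{i,1}\psi_{i,1} + \alpha_{i,2}\psi_{i,2} + \dots + \alpha_{i,n_i}\psi_{i,n_i}$, $i = 1, 2$ (number of classes), $n_i$ = number of training data in class $i$; $Y = A\psi$ ($Y = A\psi + z$, $z$ = noise)
3. Coefficient vector: $A = [0, \dots, 0, \alpha_{i,1}, \alpha_{i,2}, \dots, \alpha_{i,n_i}, 0, \dots, 0] \in \mathbb{R}^n$; $\hat{A} = [\hat{A}_1\ \hat{A}_2]$, $\hat{A}_i = \{\alpha_{i,j}\}_{j=1}^{n_i}$
4. $\ell_0$ minimization: $\arg\min \|\hat{A}\|_0$ subject to $\|Y - \hat{A}\psi\|_2 < \varepsilon$ (the $\ell_0$ norm problem is NP-hard and is converted to the $\ell_1$ norm)
5. $\ell_1$ minimization: $\arg\min \|\hat{A}\|_1$ subject to $\|Y - \hat{A}\psi\|_2 < \varepsilon$; $\arg\min \|\hat{A}\|_1 + \lambda \|Y - \hat{A}\psi\|_2^2$, $\lambda$ = regularization parameter
6. Calculate the residual $r_i(Y)$ for each class $i$
7. Output: estimated labels for testing data; $\arg\min_i r_i(Y)$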

There are various optimization methods for solving the minimization problem in the sparse representation algorithm, including Orthogonal Matching Pursuit (OMP), Order Recursive Matching Pursuit (ORMP), Lasso, FOCUSS, Basis Pursuit, and the pseudoinverse (Pinv) [19, 20]. More details about the parameters are reported in the next section.
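The following sketch shows one way the SRC idea can be combined with a greedy OMP solver. It is an illustration under assumptions rather than the study's implementation: the function names, the dictionary layout (one training sample per column of `D`), the unit-norm scaling, the sparsity level `n_nonzero`, and the class residual $r_i(Y) = \|Y - D\,\delta_i(\hat{A})\|_2$ (the standard SRC form, where $\delta_i$ keeps only the coefficients of class $i$) are choices made here.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy Orthogonal Matching Pursuit: approximate y with few columns of D."""
    residual, support = y.astype(float), []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Refit all selected atoms by least squares, then update the residual.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = y - D @ coef
    return coef

def src_predict(D, labels, y, n_nonzero=3):
    """Assign y to the class whose atoms leave the smallest residual."""
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    coef = omp(D, y, n_nonzero)
    residuals = {}
    for c in np.unique(labels):
        class_coef = np.where(labels == c, coef, 0.0)  # keep class-c coefficients only
        residuals[int(c)] = float(np.linalg.norm(y - D @ class_coef))
    return min(residuals, key=residuals.get)

# Toy dictionary (hypothetical): columns 0-1 belong to class 0, columns 2-3 to class 1.
D = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.0, 0.1, 0.9, 1.0],
              [0.1, 0.0, 0.0, 0.1]])
labels = np.array([0, 0, 1, 1])
print(src_predict(D, labels, np.array([1.0, 0.05, 0.1]), n_nonzero=2))
```

OMP selects atoms greedily instead of solving the $\ell_1$ problem directly, which is one reason it is usually fast on small dictionaries.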

5. Experimental Results

Here we use the WBCD dataset to evaluate the proposed method. As described in Section 4.1, the ICA feature selection method was applied to select the most effective features. Four influencing features (F1: normal nuclei, F2: uniformity of cell size, F7: bland chromatin, F9: bare nuclei) were selected among the nine features with the regularization parameters given in Table 3.

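Table 3. The Selected Feature via the ICA Feature Selection Algorithm and ICA Regularization Parameter

| Method | Parameter | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Without feature selection | - | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| ICA feature selection | Population size = 200; revolution rate = 0.3; epoch = 50; number of IMP = 20 | 1 | 1 | - | - | - | - | 1 | - | 1 |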

The value of 1 in Table 3 indicates the selected features, and the dash indicates the unselected features. It means that all of the features were selected in row 1 (without feature selection) of Table 3, and features 1, 2, 7 and 9 were selected based on the ICA feature selection method in row 2. As shown in Table 4, the accuracy rate of BC detection is evaluated as follows:

State 1: Without feature selection, all of the features (9 features) in the dataset were considered as usual (the weight of each feature is 1). In this state, the accuracy rate was 88.86%.

State 2: ICA feature selection: Four features were selected by the ICA feature selection method. An accuracy rate of 87.19% was achieved with only the four selected features (the weight of the selected features was 1, while that of the others was 0). In this state, the unselected features were eliminated, and just four features remained for diagnosis and classification.

State 3: Manual weighting for the selected features: In this state, the accuracy rates were obtained by weighting the selected features (F1, F2, F7, and F9). This weighting was applied to emphasize the effect of the selected features. It was performed manually in the range between 1 and 2 (the weight of the selected features was between 1 and 2, while that of the others was 1).

State 4: Fuzzy weighting for selected features: The manual weighting in state 3 does not generalize across datasets and requires repetition and experience to select the weight that achieves the best result. Therefore, the tuning of attributes was performed by a fuzzy system to weight the selected features. The fuzzy system produced a weight as the output of the fuzzy rule-based system. An accuracy of 95.45% was achieved when the fuzzy weight was applied to the selected features.

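Table 4. The Procedure and Weighting of Features for Accuracy in Breast Cancer Detection

| Method | Weight | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | Accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| State 1: Without feature selection | α = 1 | α × F1 | α × F2 | α × F3 | α × F4 | α × F5 | α × F6 | α × F7 | α × F8 | α × F9 | 88.86 |
| State 2: ICA feature selection | α = 0; β = 1 | β × F1 | β × F2 | α × F3 | α × F4 | α × F5 | α × F6 | β × F7 | α × F8 | β × F9 | 87.19 |
| State 3: Manual weighting for ICA selected features | α = 1; β = 1.6 | β × F1 | β × F2 | α × F3 | α × F4 | α × F5 | α × F6 | β × F7 | α × F8 | β × F9 | 89.17 |
| | α = 1; β = 1.7 | β × F1 | β × F2 | α × F3 | α × F4 | α × F5 | α × F6 | β × F7 | α × F8 | β × F9 | 89.38 |
| | α = 1; β = 1.8 | β × F1 | β × F2 | α × F3 | α × F4 | α × F5 | α × F6 | β × F7 | α × F8 | β × F9 | 89.79 |
| | α = 1; β = 1.9 | β × F1 | β × F2 | α × F3 | α × F4 | α × F5 | α × F6 | β × F7 | α × F8 | β × F9 | 89.9 |
| | α = 1; β = 2 | β × F1 | β × F2 | α × F3 | α × F4 | α × F5 | α × F6 | β × F7 | α × F8 | β × F9 | 90 |
| State 4: Fuzzy weighting for ICA selected features | α = 1; γ = fuzzy weight (Table 5) | γ × F1 | γ × F2 | α × F3 | α × F4 | α × F5 | α × F6 | γ × F7 | α × F8 | γ × F9 | 95.45 |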
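The weighting schemes of the four states in Table 4 differ only in the multiplier applied to the selected and unselected features, which a short sketch makes explicit. The helper name `weight_features` and the sample vector are hypothetical; the selected indices follow the F1, F2, F7, F9 pattern of Table 4.

```python
import numpy as np

SELECTED = [0, 1, 6, 8]  # zero-based positions of F1, F2, F7, and F9

def weight_features(x, weight, keep_unselected=True):
    """Scale the selected features by `weight`.

    keep_unselected=True leaves the other features at weight 1 (States 3 and 4);
    keep_unselected=False sets them to 0 (State 2).
    """
    x = np.asarray(x, dtype=float)
    out = x.copy() if keep_unselected else np.zeros_like(x)
    out[SELECTED] = weight * x[SELECTED]
    return out

sample = [5, 1, 1, 1, 2, 1, 3, 1, 1]  # hypothetical nine-feature sample
print(weight_features(sample, 1.0, keep_unselected=False))  # State 2
print(weight_features(sample, 2.0))                         # State 3 with beta = 2
```

In State 4 the same function would be called with a different weight for every sample, namely the output γ of the fuzzy system for that sample.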

According to State 4 in Table 4, γ is the fuzzy weight produced as the output of the fuzzy system. This weight was calculated through fuzzy rule-based inference with the F1, F2, F7, and F9 features as the inputs of the fuzzy system. The calculated fuzzy weights for several samples in the current dataset are shown in Table 5.

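Table 5. Calculated Weights as the Output of a Fuzzy System for Several Numbers of Samples in the Dataset

The Input columns are the inputs of the fuzzy system (features selected with the ICA feature selection algorithm), Weight (γ) is the output calculated with the fuzzy rules, and columns F1 to F9 are the γ × features data used for classification.

| Input F1 | Input F2 | Input F6 | Input F9 | Weight (γ) | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 3 | 1 | 7.44 | 7.4 | 7.4 | 1 | 1 | 2 | 22.3 | 3 | 1 | 7.4 |
| 8 | 7 | 9 | 4 | 3.21 | 25.7 | 22.4 | 5 | 10 | 7 | 28.9 | 5 | 5 | 12.8 |
| 7 | 4 | 1 | 1 | 5.12 | 35.8 | 20.5 | 6 | 4 | 6 | 5.12 | 4 | 3 | 5.12 |
| 4 | 1 | 1 | 1 | 9.36 | 37.4 | 9.36 | 1 | 1 | 2 | 9.36 | 2 | 1 | 9.36 |
| 4 | 1 | 1 | 1 | 9.36 | 37.4 | 9.36 | 1 | 1 | 2 | 9.36 | 3 | 1 | 9.36 |
| 10 | 7 | 10 | 2 | 3.19 | 31.9 | 22.3 | 7 | 6 | 4 | 31.9 | 4 | 1 | 6.38 |
| 6 | 1 | 1 | 1 | 7.79 | 46.74 | 7.79 | 1 | 1 | 2 | 7.79 | 3 | 1 | 7.79 |
| 7 | 3 | 10 | 4 | 4.64 | 32.5 | 1 | 13.92 | 10 | 5 | 46.4 | 5 | 4 | 18.5 |
| 10 | 5 | 7 | 1 | 6.25 | 62.5 | 8 | 31.25 | 3 | 6 | 43.8 | 7 | 10 | 6.25 |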

As shown in Table 5, the output of the fuzzy system was applied to the selected features, and the other features remained unchanged. After feature selection and tuning of the attributes using the fuzzy system, sparse representation was performed for each sample. Different sparse optimization methods, such as OMP, Pinv, FOCUSS, Basis Pursuit, and ORMP, were evaluated in this study. The evaluation measures, including RMSE, R, R2, accuracy, and the true/false detection percentages for malignant/benign samples, are summarized for these sparse optimization methods in Table 6.

Table 6. The Results of the Proposed Method for Weighting the Selected Features and Without Feature Selection

| Method | Sparse Optimization Method | True Positive (True Detect Malignant) | True Negative (True Detect Benign) | False Positive (False Detect Malignant) | False Negative (False Detect Benign) | RMSE | R | R2 | Accuracy |
|---|---|---|---|---|---|---|---|---|---|
| Without feature selection | OMP | 94.32 | 83.40 | 5.03 | 16.6 | 0.7157 | 0.1495 | 0.1910 | 88.86 |
| | Pinv | 53.49 | 99.17 | 46.51 | 0.83 | 0.7649 | 0.0871 | 0.2101 | 76.33 |
| | FOCUSS | 9.17 | 98.75 | 90.83 | 1.25 | 0.8032 | 0.0427 | 0.0572 | 53.96 |
| | Basis Pursuit | 8.73 | 98.75 | 91.27 | 1.25 | 0.8032 | 0.0398 | 0.0547 | 53.74 |
| | ORMP | 94.97 | 82.15 | 5.03 | 7.75 | 0.7116 | 0.1437 | 0.1882 | 88.56 |
| Weighting the selected features | OMP | 96.72 | 94.19 | 3.28 | 5.81 | 0.7285 | 0.1806 | 0.1940 | 95.45 |
| | Pinv | 56.33 | 95.85 | 43.67 | 4.15 | 0.7546 | 0.0718 | 0.2160 | 76.09 |
| | FOCUSS | 43.01 | 94.60 | 56.99 | 5.4 | 0.7640 | 0.0499 | 0.1926 | 68.80 |
| | Basis Pursuit | 26.63 | 93.36 | 73.37 | 6.64 | 0.7640 | 0.0499 | 0.1926 | 60 |
| | ORMP | 92.28 | 92.53 | 7.72 | 7.47 | 0.7246 | 0.1632 | 0.1926 | 94.40 |

The OMP (Orthogonal Matching Pursuit) sparse optimization method gives better results than the other optimization methods, both without feature selection and with weighting of the ICA-selected features. True positive, true negative, false positive, and false negative are the percentage of malignant samples correctly detected, the percentage of benign samples correctly detected, the percentage falsely detected as malignant, and the percentage falsely detected as benign, respectively.

Figure 3 compares the sparse optimization methods in terms of sensitivity and specificity. The best approach has the highest sensitivity and specificity. Both values range between 0 and 1.

Figure 3. Result of calculated sensitivity and specificity value on several sparse optimization methods
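For reference, sensitivity and specificity follow their standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below computes both from confusion-matrix counts; the function name and the example counts are hypothetical and are not values from the study.

```python
def sensitivity_specificity(tp, tn, fp, fn):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of malignant samples that are detected
    specificity = tn / (tn + fp)  # share of benign samples that are detected
    return sensitivity, specificity

# Hypothetical counts: 90 of 100 malignant and 80 of 100 benign samples detected.
print(sensitivity_specificity(tp=90, tn=80, fp=20, fn=10))
```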

As shown in Figure 3, OMP had the highest value in both sensitivity and specificity. Figure 4 shows the ROC curves for the OMP, ORMP, FOCUSS, Pinv, and Basis Pursuit sparse optimization methods on the classification of the selected features. The area under the ROC curve shows how well a method can distinguish between the malignant and benign classes.

Figure 4. ROC Curves for OMP, ORMP, FOCUSS, Pinv, and Basis Pursuit Sparse Optimization Methods on the classification of the selected feature with ICA feature selection algorithm
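An ROC curve and its area can be computed from classifier scores as sketched below. This is a generic illustration, assuming that a higher score means "more likely malignant" and that no two samples share the same score; the function name and the example scores are hypothetical.

```python
import numpy as np

def roc_auc(scores, labels):
    """Return (fpr, tpr, auc) for binary labels (1 = malignant, 0 = benign)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Sweep the decision threshold from the highest score to the lowest.
    hits = labels[np.argsort(-scores, kind="stable")]
    tpr = np.concatenate([[0.0], np.cumsum(hits) / hits.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(1 - hits) / (1 - hits).sum()])
    # Trapezoidal rule over the (fpr, tpr) points.
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc

fpr, tpr, auc = roc_auc([0.9, 0.8, 0.7, 0.1], [1, 0, 1, 0])
print(auc)  # 0.75: three of the four malignant/benign pairs are ranked correctly
```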

According to Figure 4, the ROC curve for OMP sparse optimization on the selected features indicates the highest accuracy rate. After OMP, ORMP demonstrates the best result compared to the other methods. Figure 5 shows the ROC curves for OMP sparse optimization with and without feature selection.

Figure 5. ROC Curves for OMP Sparse Optimization with and without feature selection

According to Figure 5, the ROC curves show better performance for OMP sparse optimization on the selected features with fuzzy tuning than for the case in which feature selection was not performed. Table 7 compares the performance of the proposed method with some basic techniques.

Table 7. Performance of the Proposed Method Compared with Some of Basic Methods

| Algorithm | Testing Accuracy | Running Time |
|---|---|---|
| MLP | 91.2 | 1.94 |
| Logistic regression | 92.5 | 0.23 |
| Proposed method | 95.45 | 0.0034 |

As shown in Table 7, our proposed method achieves an accuracy rate of 95.45%, which is the best among the compared basic techniques, namely Multi-layer Perceptron (MLP) and Logistic Regression (LR). It also has a shorter running time than MLP and LR.

6. Conclusions

This study introduced a new multi-phase feature selection method that uses a fuzzy system integrated with the ICA feature selection algorithm to robustly distinguish tumor samples from healthy samples. After choosing the most influencing features, it employs the sparse representation algorithm to classify breast cancer. Our results demonstrate the effectiveness of the proposed feature selection strategy. Among the sparse optimization methods evaluated, Orthogonal Matching Pursuit (OMP) was the best method in sparse representation with feature selection. The promising result of our proposed model suggests its possible use as a substitute for invasive medical detection procedures. The proposed hybrid ICA-Fuzzy-SR method, with 95.45% accuracy and an RMSE of 0.7285, demonstrated excellent performance compared to current methods.
