A Hybrid Method for Feature Selection Based on Multiobjective Optimization and Mutual Information
DOI: https://doi.org/10.26713/jims.v7i1.268
Keywords: Hybrid feature selection, Mutual information, Multiobjective optimization, Classification
Abstract
In this paper we propose a hybrid approach to the feature subset selection problem that combines mutual information and multi-objective optimization. The approach is hybrid because it chains a filter method and a wrapper method in order to take advantage of both: the filter method reduces the search space by keeping only subsets with good internal properties, and the wrapper method chooses among the remaining subsets according to a classification performance criterion. In the filter step, subsets are evaluated in a multi-objective way to ensure diversity within each subset. The evaluation relies on mutual information to estimate both the dependency between features and classes and the redundancy between features within the same subset. The non-dominated (Pareto-optimal) subsets are kept for the second step. In the wrapper step, selection is based on the stability of each subset's classification performance across a set of classifiers during the learning stage, so that the selected subsets are not specialized for a single classifier. The proposed hybrid approach is evaluated on a variety of reference data sets and compared with the classical feature selection methods FSDD and mRMR. The resulting algorithm outperforms both, and its computational complexity remains acceptable even though it is higher than that of these two fast selection methods.
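To make the two-stage idea concrete, the following is a minimal sketch of such a filter-then-wrapper pipeline, not the authors' exact algorithm: the candidate generation, the scikit-learn mutual-information estimator, the three example classifiers, and the stability score (mean minus standard deviation of cross-validated accuracy across classifiers) are all illustrative assumptions, and features are assumed discrete or discretised.

```python
# Hedged sketch of a hybrid filter/wrapper selection, assuming discretised
# features so that mutual_info_score gives a usable MI estimate.
from itertools import combinations

import numpy as np
from sklearn.metrics import mutual_info_score
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def subset_objectives(X, y, subset):
    """Two filter objectives for one candidate subset:
    relevance to the class (maximise) and internal redundancy (minimise)."""
    relevance = np.mean([mutual_info_score(X[:, j], y) for j in subset])
    if len(subset) > 1:
        redundancy = np.mean([mutual_info_score(X[:, i], X[:, j])
                              for i, j in combinations(subset, 2)])
    else:
        redundancy = 0.0
    return relevance, redundancy


def pareto_filter(X, y, candidates):
    """Filter step: keep the non-dominated (Pareto-optimal) subsets."""
    scores = [subset_objectives(X, y, s) for s in candidates]
    kept = []
    for i, (rel_i, red_i) in enumerate(scores):
        dominated = any(
            rel_j >= rel_i and red_j <= red_i and (rel_j > rel_i or red_j < red_i)
            for j, (rel_j, red_j) in enumerate(scores) if j != i
        )
        if not dominated:
            kept.append(candidates[i])
    return kept


def wrapper_select(X, y, pareto_subsets):
    """Wrapper step: favour subsets whose cross-validated accuracy is both
    high and stable across several different classifiers."""
    classifiers = [KNeighborsClassifier(), DecisionTreeClassifier(), GaussianNB()]
    best_subset, best_score = None, -np.inf
    for subset in pareto_subsets:
        accs = [cross_val_score(clf, X[:, list(subset)], y, cv=5).mean()
                for clf in classifiers]
        score = np.mean(accs) - np.std(accs)  # high mean, low spread
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset


def hybrid_feature_selection(X, y, max_size=3):
    """Exhaustive candidate generation, viable only for small dimensions;
    the point of the Pareto filter is to shrink the wrapper's search space."""
    candidates = [s for k in range(1, max_size + 1)
                  for s in combinations(range(X.shape[1]), k)]
    return wrapper_select(X, y, pareto_filter(X, y, candidates))
```

Calling `hybrid_feature_selection(X, y)` on a small discretised data set returns the retained subset; for higher-dimensional data the exhaustive enumeration above would have to be replaced by a multi-objective search such as the evolutionary scheme discussed in the paper.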
References
M. Abadi, E. Grandchamp, O. Alata, O. Olivier and M. Khoudeir, Information criteria performance for feature selection, in Proceedings of the 4th International Congress on Image and Signal Processing, Vol. 2, Shanghai, China, October 2011, pp. 919-923.
A. Al-Ani, M. Deriche and J. Chebil, A new mutual information based measure for feature selection, Intelligent Data Analysis, vol. 7, no. 1, pp. 43-57, 2003.
E. Cantu-Paz, Feature Subset Selection, Class Separability, and Genetic Algorithms, in Genetic and Evolutionary Computation, 2004, pp. 959-970.
H. Chouaib, O. Ramos-Terrades, S. Tabbone, F. Cloppet and N. Vincent, Feature selection combining genetic algorithm and Adaboost classifiers, in 19th International Conference on Pattern Recognition - ICPR, Tampa, USA, 2008.
S. Das, Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection, in Proceedings of the Eighteenth International Conference on Machine Learning, San Francisco, CA, USA, 2001, pp. 74-81.
L. Davis (Ed.), Handbook of Genetic Algorithms, Van Nostrand Reinhold, New York, 1991.
K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms, John Wiley and Sons, Chichester, 2001.
J. Dy, C. Brodley, A. Kak, L.S. Broderick and A.M. Aisen, Unsupervised feature selection applied to content-based retrieval of lung images, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 3, pp. 373-378, 2003.
C. Emmanouilidis, A. Hunter, and J. MacIntyre, A Multiobjective Evolutionary Setting for Feature Selection and a Commonality-Based Crossover Operator, in Proceedings of the Congress on Evolutionary Computation, California, July 2000, pp. 309-316.
C. Emmanouilidis, A. Hunter, and J. MacIntyre, A Multi-Objective Genetic Algorithm Approach to Feature Selection in Neural and Fuzzy Modeling, Evolutionary Optimization, vol. 3, no. 1, pp. 1-26, 2001.
G. Forman, An extensive empirical study of feature selection metrics for text classification, Journal of Machine Learning Research, vol. 3, no. 1, pp. 1289-1305, 2003.
J.Q. Gan, S.H. Bashar Awwad and C.S.L. Tsui, A Hybrid Approach to Feature Subset Selection for Brain-Computer Interface Design, in Intelligent Data Engineering and Automated Learning - IDEAL 2011, vol. 6936, 2011, pp. 279-286.
D. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, 1st ed., Addison-Wesley Longman, London, 1989.
G. Rudolph, Convergence Analysis of Canonical Genetic Algorithms, IEEE Transactions on Neural Networks, vol. 5, pp. 96-101, 1994.
I. Guyon and A. Elisseeff, An introduction to variable and feature selection, Journal of Machine Learning Research, vol. 3, no. 1, pp. 1157-1182, 2003.
I. Guyon et al. (Eds.), Feature Extraction: Foundations and Applications, 2006.
J. Handl and J. Knowles, Feature Subset Selection in Unsupervised Learning via Multiobjective Optimization, International Journal of Computational Intelligence Research, vol. 2, no. 3, pp. 217-238, 2006.
B.A.S. Hasan and J.Q. Gan, A Multi-objective particle swarm optimization for channel selection in brain-computer interfaces, in The UK Workshop on Computational Intelligence - UKCI, 2009.
B.A.S. Hasan, J.Q. Gan and Q. Zhang, Multi-objective evolutionary methods for channel selection in Brain-Computer Interfaces: Some preliminary experimental results, in IEEE Congress on Evolutionary Computation, Barcelona, Spain, July 2010, pp. 1-6.
M. Hilario and A. Kalousis, Approaches to dimensionality reduction in proteomic biomarker studies, Briefings in Bioinformatics, vol. 9, no. 2, pp. 102-118, 2008.
F. Hussein, R. Ward, and N. Kharma, Genetic algorithms for feature selection and weighting, a review and study, in International Conference on Document Analysis and Recognition, 2001, pp. 1240-1244.
L.B. Jack, Feature Selection for ANNs using Genetic Algorithms in Condition Monitoring, in European Symposium on Artificial Neural Networks - ESANN, Bruges, Belgium, 1999, pp. 313-318.
O.A. Jadaan, L. Rajamani, and C.R. Rao, Non-Dominated Ranked Genetic Algorithm for Solving Multi-Objective Optimization Problems: NRGA, Journal of Theoretical and Applied Information Technology, pp. 60-67, 2008.
A.K. Jain, R.P.W. Duin, and J. Mao, Statistical Pattern Recognition: A Review, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4-37, 2000.
J. Jarmulak and S. Craw, Genetic algorithms for feature selection and weighting, in Proceedings of the workshop on Automating the Construction of Case Based Reasoners, 1999, pp. 28-33.
R. Jensen, Performing Feature Selection with ACO, Studies in Computational Intelligence, vol. 34, pp. 45–73, 2006.
L. Jourdan, C. Dhaenens, and E-G. Talbi, A Genetic Algorithm for feature selection in Data-Mining for genetics, in Metaheuristic International Conference 2001, Porto, Portugal, July 2001, pp. 29-34.
A. Kalousis, J. Prados, and M. Hilario, Stability of feature selection algorithms: A study on high-dimensional spaces, Knowledge and Information Systems, vol. 12, no. 1, pp. 95-116, 2007.
R. Kohavi and G. John, Wrappers for Feature Subset Selection, Artificial Intelligence, vol. 97, no. 1-2, pp. 273-324, 1997.
I. Kononenko, Estimating attributes: Analysis and extensions of RELIEF, in Proceedings of ECML-94, 1994, pp. 171-182.
L.I. Kuncheva, A stability index for feature selection, in Proceedings of the 25th IASTED International Multi-Conference: Artificial Intelligence and Applications, Innsbruck, Austria, February 2007, pp. 390-395.
N. Kwak and C.H. Choi, Input Feature Selection by Mutual Information Based on Parzen Window, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 12, pp. 1667-1671, 2002.
J. Liang, S. Yang, and A-C. Winstanley, Invariant optimal feature selection: A distance discriminant and feature ranking based solution, Pattern Recognition, vol. 41, no. 5, pp. 1429-1439, 2008.
H. Liu and H. Motoda, Feature Selection for Knowledge Discovery and Data Mining. Norwell, MA, USA: Kluwer Academic Publishers, 1998.
H. Liu and L. Yu, Toward Integrating Feature Selection Algorithms for Classification and Clustering, IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491-502, 2005.
S. Loscalzo, L. Yu, and C. Ding, Consensus group stable feature selection, in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge discovery and data mining, Paris, France, 2009, pp. 567-576.
G.L. Pappa, A.A. Freitas, and C.A.A. Kaestner, Attribute Selection with a Multi-objective Genetic Algorithm, in Proceedings of the 16th Brazilian Symposium on Artificial Intelligence: Advances in Artificial Intelligence, London, UK, 2002, pp. 280-290.
G.L. Pappa et al., A multiobjective genetic algorithm for attribute selection, in Proceedings of the 4th International Conference on Recent Advances in Soft Computing, Nottingham Trent University, December 2002, pp. 116-121.
E. Parzen, On Estimation of a Probability Density Function and Mode, The Annals of Mathematical Statistics, vol. 33, no. 3, pp. 1065-1076, 1962.
H. Peng, F. Long, and C. Ding, Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1226-1238, 2005.
F. Pernkopf and P. O'Leary, Feature Selection for Classification Using Genetic Algorithms with a Novel Encoding, in Proceedings of the 9th International Conference on Computer Analysis of Images and Patterns, London, UK, 2001, pp. 161-168.
A. Porebski, N. Vandenbroucke, and L. Macaire, Comparison of feature selection schemes for color texture classification, in Proceedings of the International Conference on Image Processing Theory, Tools and Applications, 2010, pp. 32-37.
P. Pudil and J. Novovicova, Novel Methods for Subset Selection with Respect to Problem Knowledge, IEEE Intelligent Systems, vol. 13, no. 2, pp. 66-74, 1998.
S.J. Raudys, Feature over-selection, in Structural, Syntactic, and Statistical Pattern Recognition, Lecture Notes in Computer Science, vol. 4109, pp. 622-631, 2006.
Y. Saeys, T. Abeel, and Y. Peer, Robust Feature Selection Using Ensemble Feature Selection Techniques, in Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II, Antwerp, Belgium, 2008, pp. 313-325.
Y. Saeys, I. Inza, and P. Larranaga, A review of feature selection techniques in bioinformatics, Bioinformatics, vol. 23, no. 19, pp. 2507-2517, 2007.
Y. Sakamoto and H. Akaike, Analysis of cross classified data by AIC, Annals of the Institute of Statistical Mathematics, vol. 30, no. 1, pp. 185-197, 1978.
M. Sebban and R. Nock, A hybrid filter wrapper approach of feature selection using information theory, Pattern Recognition, vol. 35, no. 4, pp. 835-846, 2002.
J. Sheinvald, B. Dom, and W. Niblack, A modeling approach to feature selection, in Proceedings of the 10th International Conference on Pattern Recognition, Atlantic City, NJ, Jun 1990, pp. 535-539.
A. Solanas et al., Feature Selection and Outliers Detection with Genetic Algorithms and Neural Networks, in Proceedings of the Conference on Artificial Intelligence Research and Development, Amsterdam, The Netherlands, 2005, pp. 41-48.
P. Somol, J. Novovicova, and P. Pudil, Efficient Feature Subset Selection and Subset Size Optimization, in InTech-Open Access Publisher, vol. 56, 2010, pp. 1-24.
P. Somol, J. Novovicova, and P. Pudil, Flexible Hybrid sequential floating search in statistical feature selection, in Proceedings of the International Conference on Structural, Syntactic, and Statistical Pattern Recognition, joint IAPR, Hong Kong, China, 2006, pp. 632-639.
P. Somol, J. Novovicova, and P. Pudil, On the Over Fitting Problem of Complex Feature Selection Methods, in Proceedings of the 5th International Computer Engineering Conference, Cairo University, December 2009.
Y. Sun, S. Todorovic, and S. Goodison, Local-Learning-Based Feature Selection for High-Dimensional Data Analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1610-1626, 2010.
H. Vafaie and I.F. Imam, Feature selection methods: genetic algorithms vs. greedy-like search, in Proceedings of the International Conference on Fuzzy and Intelligent Control Systems, 1994.
P. Villar, A. Fern, and F. Herrera, A Genetic Algorithm for Feature Selection and Granularity Learning in Fuzzy Rule-Based Classification Systems for Highly Imbalanced Data-Sets, in IPMU (1), 2010, pp. 741-750.
D. Wettschereck and D.W. Aha, Weighting Features, in Proceedings of the First International Conference on Case-Based Reasoning, 1995, pp. 347-358.
Y.Y. Yao, Information-theoretic measures for knowledge discovery and data mining, in Entropy Measures, Maximum Entropy Principle and Emerging Applications, 2003, pp. 115-136.
H. Zhang and G. Sun, Feature selection using tabu search method, Pattern Recognition, vol. 35, pp. 701–711, 2002.
L. Zhuo, J. Zheng, F. Wang, X. Li, B. Ai and J. Qian, A Genetic Algorithm based Wrapper Feature selection method for Classification of Hyperspectral Images using Support Vector Machine, in The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. XXXVII, Part B7, Beijing, China, 2008, pp. 397-402.