Feature Selection With Discernibility and Independence Criteria
Article 2024 en
Authors
Juanying Xie
Mingzhao Wang
P.W. Grant
Abstract
Feature selection plays a significant role in data mining and machine learning, but it is difficult to determine how many features are needed to form an optimal feature subset. To address this challenge, a visual 2D feature selection framework is introduced in which two quantities are defined for each feature: its discernibility, which evaluates its capability for classification, and its independence, which evaluates its relevance to other features. All features are plotted in a 2D space with discernibility on the $x$-axis and independence on the $y$-axis. Features located in the upper right corner have both high discernibility and high independence, and so comprise the optimal feature subset. This framework gives rise to a family of feature selection algorithms, three of which are proposed in this paper: FSDIE, FSDIR, and FSDIS (Feature Selection based on the Discernibility and the Independence of Exponent, Reciprocal, and anti-Similarity, respectively). To speed up these three algorithms, a clustering-based feature preselection step first eliminates some unrelated and redundant features. Extensive experiments on UCI datasets, face datasets, and gene expression datasets demonstrate that these three 2D feature selection algorithms are superior to the state-of-the-art methods, indicating the power of our 2D feature selection framework.
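The 2D idea can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: discernibility is approximated by a Fisher-like ratio of between-class to within-class variance, independence by one minus the largest absolute Pearson correlation with any more discernible feature, and the "upper right corner" by the product of the two min-max scaled scores. The function names (`discernibility`, `independence`, `select_features`) are hypothetical, and none of this reproduces FSDIE, FSDIR, or FSDIS.

```python
import numpy as np


def discernibility(X, y):
    """Stand-in score (assumption): Fisher-like ratio per feature."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    return between / (within + 1e-12)


def independence(X, dis):
    """Stand-in score (assumption): 1 - max |r| with more discernible features.

    The most discernible feature has no such neighbor and is given 1.0.
    Assumes no constant columns, since their correlation is undefined.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    order = np.argsort(-dis)  # feature indices, most discernible first
    ind = np.ones(X.shape[1])
    for rank in range(1, len(order)):
        f = order[rank]
        ind[f] = 1.0 - corr[f, order[:rank]].max()
    return ind


def _minmax(v):
    """Scale a score vector to [0, 1] so the two axes are comparable."""
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.ones_like(v)


def select_features(X, y, k):
    """Return the k features closest to the upper right corner of the 2D plot,
    ranked here by the product of the scaled scores (an assumption)."""
    dis = discernibility(X, y)
    ind = independence(X, dis)
    score = _minmax(dis) * _minmax(ind)
    return np.argsort(-score)[:k], dis, ind
```

Plotting the returned `dis` against `ind` gives the 2D view described above. In this sketch a near-duplicate of a strong feature ends up with high discernibility but low independence, so it falls in the lower right rather than the upper right corner.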