Booster of an FS Algorithm on High Dimensional Data
Keywords:
GPS tracking, Reliability, Road network, Visualized map, Map-matching, P-median Model, Network density
Abstract
Classification problems in high-dimensional data with a small number of observations are becoming more common, particularly in microarray data. The growing volume of text on websites affects clustering analysis [1]. Text clustering is a useful analysis technique for partitioning a large amount of data into clusters. The main problem affecting text clustering is the presence of uninformative and sparse features in text documents. A broad class of boosting algorithms can be interpreted as performing coordinate-wise gradient descent to minimize some potential function of the margins of a data set [1]. This paper proposes a new evaluation measure, the Q-statistic, that incorporates the stability of the selected feature subset in addition to the prediction accuracy. We then propose the Booster of a feature selection (FS) algorithm, which boosts the value of the Q-statistic of the algorithm to which it is applied.
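The abstract does not define the Q-statistic or the Booster procedure, so the following Python sketch only illustrates the general idea of measuring how stable a selected feature subset is under resampling. Everything in it is an assumption made for illustration: the function names, the mean pairwise Jaccard similarity as the stability measure, the toy mean-gap selector, the stratified bootstrap, and the product used to combine accuracy with stability. None of these should be read as the paper's definitions.

```python
from itertools import combinations
import random


def jaccard(a, b):
    """Jaccard similarity of two feature-index sets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(subsets):
    """Mean pairwise Jaccard similarity across selected subsets.

    Assumed stability measure, not the paper's definition.
    """
    pairs = list(combinations(subsets, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def top_k_by_mean_gap(rows, labels, k):
    """Toy FS algorithm: keep the k features with the largest
    absolute difference between the class-1 and class-0 means."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]

    def gap(j):
        mean_pos = sum(r[j] for r in pos) / len(pos)
        mean_neg = sum(r[j] for r in neg) / len(neg)
        return abs(mean_pos - mean_neg)

    ranked = sorted(range(len(rows[0])), key=gap, reverse=True)
    return set(ranked[:k])


def resampled_selections(rows, labels, select, n_rounds=10, seed=0):
    """Run `select` on stratified bootstrap resamples of the data.

    Sampling within each class keeps every class present in each resample.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    subsets = []
    for _ in range(n_rounds):
        idx = [rng.choice(members)
               for members in by_class.values() for _ in members]
        subsets.append(select([rows[i] for i in idx],
                              [labels[i] for i in idx]))
    return subsets


def toy_combined_score(accuracy, stability):
    """Hypothetical way to fold stability into accuracy (a plain product)."""
    return accuracy * stability


if __name__ == "__main__":
    # Feature 0 separates the classes; features 1 and 2 carry no signal.
    rows = [[10.0, 0.0, 0.0], [10.0, 0.0, 0.0],
            [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    labels = [1, 1, 0, 0]
    subsets = resampled_selections(
        rows, labels, lambda r, y: top_k_by_mean_gap(r, y, 1))
    print(selection_stability(subsets))
```

With only a few observations and many features, the subset chosen by a selector can change noticeably from one resample to the next. A score that looks only at prediction accuracy would not reveal that, which is the motivation the abstract gives for adding a stability term.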
References
- I.H. Witten, E. Frank and M.A. Hall, Data mining: practical machine learning tools and techniques, Morgan Kaufmann, Burlington, 2011
- J. Han and M. Kamber, Data mining: concepts and techniques, Morgan Kaufmann, San Francisco, 2006
- T.J. Shan, H. Wei and Q. Yan, "Application of genetic algorithm in data mining", 1st Int Work Educ Technol Comput Sci, IEEE 2, 2009, pp. 353-356
- Z.Z. Shi, Knowledge discovery, Tsinghua University Press, Beijing, 2001
- D. Pyle, Data preparation for data mining, 1st Vol., Morgan Kaufmann publisher, San Francisco, 1999
- I. Guyon, N. Matic and V. Vapnik, "Discovering informative patterns and data cleaning", In: Fayyad UM, Piatetsky-Shapiro G, Smyth P and Uthurusamy R. (Eds.) Advances in knowledge discovery and data mining, AAAI/MIT Press, California, 1996, pp. 181-203
- E. Simoudis, B. Livezey and R. Kerber, "Integrating inductive and deductive reasoning for data mining", In: Fayyad UM, Piatetsky-Shapiro G, Smyth P and Uthurusamy R. (Eds.) Advances in knowledge discovery and data mining, AAAI/MIT Press, California, 1996, pp. 353-373
- B. Pfahringer, "Supervised and unsupervised discretization of continuous features", Proc. 12th Int. Conf. Machine Learning, 1995, pp. 456-463
- J. Catlett, "On changing continuous attributes into ordered discrete attributes", In: Y. Kodratoff (Ed.) Machine Learning—EWSL-91, Springer-Verlag, New York, 1991, pp. 164-178
- W. Daelemans, V. Hoste, F.D. Meulder and B. Naudts, "Combined Optimization of Feature Selection and Algorithm Parameter Interaction in Machine Learning of Language", Proceedings of the 14th European Conference on Machine Learning (ECML-2003), Lecture Notes in Computer Science 2837, Springer-Verlag, Cavtat-Dubrovnik, Croatia, 2003, pp. 84-95
- M.L. Raymer, W.F. Punch, E.D. Goodman, L.A. Kuhn and A.K. Jain, "Dimensionality Reduction Using Genetic Algorithms", IEEE Transactions On Evolutionary Computation, Vol. 4, No. 2, 2000
- Y. Saeys, I. Inza and P. Larranaga, "A review of feature selection techniques in bioinformatics", Bioinformatics, vol. 23, no. 19, 2007, pp. 2507-2517
- G.L. Pappa and A.A. Freitas, Automating the Design of Data Mining Algorithms. An Evolutionary Computation Approach, Natural Computing Series, Springer, 2010
- A. Darwiche, Modeling and Reasoning with Bayesian Networks, Cambridge University Press, 2009
- G.F. Cooper, P. Hennings-Yeomans, S. Visweswaran and M. Barmada, "An Efficient Bayesian Method for Predicting Clinical Outcomes from Genome-Wide Data", AMIA 2010 Symposium Proceedings, 2010, pp. 127-131
- M. Garofalakis, D. Hyun, R. Rastogi and K. Shim, "Building Decision Trees with Constraints", Data Mining and Knowledge Discovery, vol. 7, no. 2, 2003, pp. 187-214
- T.M. Mitchell, Machine Learning, McGraw-Hill Companies, USA, 1997
- Y. Singh and A.S. Chauhan, "Neural Networks in Data Mining", Journal of Theoretical and Applied Information Technology, 2005, pp. 37-42
License
Copyright (c) IJSRSET

This work is licensed under a Creative Commons Attribution 4.0 International License.