Document Categorization by using Weighted J48 Classifier


Sonali Suskar, Dr. S. D. Babar
Text categorization is currently a key research area in information retrieval. It selects entries from a set of predefined categories and assigns them to a document. Learning in a high-dimensional data space is a central challenge for text categorization: a large number of features can impose heavy computational overhead, and irrelevant and redundant features can degrade classifier performance. To mitigate this "curse of dimensionality" and to speed up classifier training, feature reduction is needed to shrink the feature set. This paper introduces a Bayesian classification approach and a WeightedJ48 classifier for automatic text categorization using class-specific features. For text classification, the proposed method selects a specific feature subset for each class. The system reconstructs the probability density function (PDF) in the raw data space from the class-specific PDFs in the low-dimensional feature space and builds a Bayes classification rule using Baggenstoss's PDF Projection Theorem. A notable advantage of this methodology is that it can accommodate many feature selection criteria. The WeightedJ48 classifier saves time and memory. The proposed system also uses term weighting for pre-processing. Together, these methods improve classification accuracy, the feature selection process, and overall system performance.
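The class-specific Bayes rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes multinomial term models, takes the pooled corpus as the reference hypothesis H0, ranks terms by their contribution to the KL divergence between the class and reference distributions (one possible selection criterion among many), and omits both term weighting and the WeightedJ48 classifier. The names `train`, `classify`, `k`, and `alpha` are hypothetical. Because the raw-space reference PDF is common to all classes, only the per-class likelihood ratio on the selected features enters the decision.

```python
import math
from collections import Counter

def train(docs, labels, k=2, alpha=1.0):
    """docs: list of token lists; labels: parallel list of class names."""
    vocab = sorted({t for d in docs for t in d})
    # Reference hypothesis H0 (assumption): all classes pooled together.
    pooled = Counter(t for d in docs for t in d)
    total0 = sum(pooled.values())
    p0 = {t: (pooled[t] + alpha) / (total0 + alpha * len(vocab)) for t in vocab}
    model = {}
    for c in sorted(set(labels)):
        counts = Counter(t for d, y in zip(docs, labels) if y == c for t in d)
        total = sum(counts.values())
        pc = {t: (counts[t] + alpha) / (total + alpha * len(vocab)) for t in vocab}
        # Selection criterion (assumption): per-term contribution to KL(p_c || p_0).
        ranked = sorted(vocab, key=lambda t: pc[t] * math.log(pc[t] / p0[t]),
                        reverse=True)
        selected = ranked[:k]  # class-specific feature subset
        # All unselected terms are merged into a single "rest" bucket.
        rest_c = 1.0 - sum(pc[t] for t in selected)
        rest_0 = 1.0 - sum(p0[t] for t in selected)
        model[c] = {
            "log_prior": math.log(labels.count(c) / len(labels)),
            "log_ratio": {t: math.log(pc[t] / p0[t]) for t in selected},
            "log_rest": math.log(rest_c / rest_0),
        }
    return model

def classify(model, doc):
    """Pick the class maximizing log P(c) + log[p(z_c | c) / p(z_c | H0)]."""
    scores = {}
    for c, m in model.items():
        score = m["log_prior"]
        for t in doc:
            # Terms outside the class's subset fall into the "rest" bucket.
            score += m["log_ratio"].get(t, m["log_rest"])
        scores[c] = score
    return max(scores, key=scores.get)

docs = [["goal", "match", "team"], ["team", "score", "goal"],
        ["cpu", "code", "bug"], ["code", "compile", "cpu"]]
labels = ["sports", "sports", "tech", "tech"]
model = train(docs, labels, k=2)
print(classify(model, ["goal", "team"]))
print(classify(model, ["code", "bug"]))
```

Each class keeps its own small feature subset, so the per-class model size is governed by `k` rather than by the full vocabulary.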

Text categorization, class-specific features, feature selection, PDF projection and estimation, dimension reduction, WeightedJ48, term weighting.

Volume 4 | Issue 9 | July-August - 2018
2018-07-30 | 2395-1990 | 2394-4099
51-58 | IJSRSET18495 | Technoscience Academy

Sonali Suskar, Dr. S. D. Babar, "Document Categorization by using Weighted J48 Classifier", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Volume 4, Issue 9, pp. 51-58, July-August 2018.
