Investigation of Performance Analysis of Classification Algorithm in Data Mining

Authors(2):

Dr. Mohd Ashraf, Dr. Zair Hussain
Data mining is now one of the most active fields of research. Extracting nuggets of information from data is becoming crucial, and one of its important techniques is classification, which groups data into predefined classes. Various classification techniques exist, each using a different algorithm, and each algorithm has its own areas of best and worst performance. This paper concentrates on the four best-known algorithms, namely Decision Tree, Naïve Bayes, K-Nearest Neighbour and Genetic Programming, and on how their running time and accuracy change when the number of instances is incrementally decreased. It also investigates how the results differ between binary-class and multiclass datasets and suggests which algorithms to use for particular kinds of dataset.
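The experimental idea in the abstract (measure accuracy and time while the number of instances shrinks) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' setup: the data is a synthetic two-class set of Gaussian blobs, only two of the four algorithms (K-Nearest Neighbour and a Gaussian Naïve Bayes) are implemented, and the function names, instance counts and the 70/30 split are hypothetical.

```python
import math
import random
import time


def make_dataset(n, seed=0):
    """Synthetic binary-class data: two Gaussian blobs in 2-D (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 3.0
        data.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    rng.shuffle(data)
    return data


def knn_predict(train, x, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def nb_fit(train):
    """Estimate a prior plus per-feature mean and variance for each class."""
    model = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        stats = []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / len(train), stats)
    return model


def nb_predict(model, x):
    """Pick the class with the highest Gaussian log-posterior."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        for v, (mean, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def evaluate(data, train_fraction=0.7):
    """Return {algorithm: (accuracy, seconds)} on a simple hold-out split."""
    split = int(len(data) * train_fraction)
    train, test = data[:split], data[split:]
    results = {}

    start = time.perf_counter()
    hits = sum(knn_predict(train, x) == y for x, y in test)
    results["kNN"] = (hits / len(test), time.perf_counter() - start)

    start = time.perf_counter()
    model = nb_fit(train)
    hits = sum(nb_predict(model, x) == y for x, y in test)
    results["NaiveBayes"] = (hits / len(test), time.perf_counter() - start)
    return results


def run_experiment(sizes=(400, 200, 100, 50)):
    """Re-evaluate both classifiers on progressively smaller subsets."""
    full = make_dataset(max(sizes))
    return {n: evaluate(full[:n]) for n in sizes}


if __name__ == "__main__":
    for n, res in run_experiment().items():
        for name, (acc, secs) in res.items():
            print(f"{n:4d} instances  {name:10s}  accuracy={acc:.3f}  time={secs:.4f}s")
```

Running the script prints one line per instance count and algorithm. The same loop could be pointed at a multiclass dataset to compare binary and multiclass behaviour, since both classifiers in this sketch accept any number of class labels.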

Decision Tree, Naïve Bayes, K-Nearest Neighbor, Genetic Programming, Accuracy

Publication Details

Published in: Volume 4 | Issue 4 | March-April 2018
Date of Publication: 2018-04-30
Print ISSN: 2395-1990
Online ISSN: 2394-4099
Page(s): 58-66
Manuscript Number: IJSRSET184425
Publisher: Technoscience Academy

Cite This Article

Dr. Mohd Ashraf, Dr. Zair Hussain, "Investigation of Performance Analysis of Classification Algorithm in Data Mining", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 4, Issue 4, pp. 58-66, March-April 2018.
URL : http://ijsrset.com/IJSRSET184425.php