A Comprehensive and Experimental Survey on Medical Data Classification and Pattern Recognition

Authors

  • R. Subathra Devi, Research Scholar, PG and Research Department of Computer Science, Presidency College, Chennai, Tamil Nadu, India

Keywords:

Medical support system, clinical support system, medical data classification, supervised classification, unsupervised classification, rule-based classification, DCT, Bayesian classification, PNN, artificial neural network, adaptive classifier, K-NN, K-means, machine learning, SVM, abdominal diseases.

Abstract

This paper compares and analyzes various types of medical data classification and pattern recognition methods. Medical data classification methods fall into three main categories: supervised, unsupervised, and semi-supervised classification. Pattern recognition and data classification are overlapping domains, both aimed at generating useful knowledge and predictions from training data. Medical diagnosis and clinical support systems need intelligent data classification and pattern recognition algorithms to improve the accuracy of clinical decision making. Supervised classification includes methods such as rule-based classification, decision-tree-based classification, Bayesian classification, KNN, probabilistic neural networks, and SVM. A combination of supervised classifiers is called an ensemble algorithm, and such combined algorithms can provide higher accuracy. Unsupervised classification, also called clustering, automatically groups unlabeled data; it includes methods such as K-means, deep learning methods, and hierarchical clustering. In this paper we analyze various classification algorithms using sample medical records from an upper abdomen diseases database. As many algorithms as possible are evaluated experimentally on the same training data, so that their performance and accuracy can be compared.
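The idea of evaluating several classifiers on the same training data can be illustrated with a minimal sketch. The records below are hypothetical toy values, not the upper abdomen diseases database used in the paper, and the function names are chosen only for illustration. K-NN is one of the methods named in the abstract; the nearest-centroid classifier is a simple stand-in for a second supervised method and is not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy records: (feature vector, class label).
# These values are assumptions for illustration, not data from the paper.
TRAIN = [
    ((1.0, 1.2), "A"), ((0.8, 1.0), "A"), ((1.2, 0.9), "A"), ((0.9, 1.1), "A"),
    ((3.0, 3.2), "B"), ((3.1, 2.9), "B"), ((2.8, 3.0), "B"), ((3.2, 3.1), "B"),
]
TEST = [
    ((1.1, 1.0), "A"), ((0.7, 1.3), "A"),
    ((2.9, 3.1), "B"), ((3.3, 2.8), "B"),
]


def knn_predict(train, x, k=3):
    """Classify x by majority vote among its k nearest training records."""
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def centroid_fit(train):
    """Compute the mean feature vector of each class."""
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in groups.items()
    }


def centroid_predict(centroids, x):
    """Classify x by the class whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


def accuracy(predict, test):
    """Fraction of test records whose predicted label matches the true label."""
    hits = sum(1 for features, label in test if predict(features) == label)
    return hits / len(test)


def compare(train, test):
    """Evaluate both classifiers on the same training and test records."""
    centroids = centroid_fit(train)
    return {
        "knn": accuracy(lambda x: knn_predict(train, x), test),
        "nearest_centroid": accuracy(lambda x: centroid_predict(centroids, x), test),
    }


if __name__ == "__main__":
    for name, acc in compare(TRAIN, TEST).items():
        print(f"{name}: {acc:.2f}")
```

Because every classifier sees the same training records and is scored on the same held-out records, differences in accuracy reflect the algorithms rather than the data split. A real comparison would use a much larger data set and repeated splits such as cross-validation.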

Published

2018-04-30

Section

Research Articles

How to Cite

[1]
R. Subathra Devi, "A Comprehensive and Experimental Survey on Medical Data Classification and Pattern Recognition", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 4, Issue 4, pp. 1521-1537, March-April 2018.