Machine Learning for the Diagnosis and Prognosis of Chronic Illnesses


  • Kajal, Assistant Professor, Department of Computer Science & Engineering, Khwaja Moinuddin Chishti Language University (KMCLU), Lucknow, U.P., India
  • Kanchan Saini, Assistant Professor, Department of Computer Science & Engineering, Khwaja Moinuddin Chishti Language University (KMCLU), Lucknow, U.P., India
  • Dr. Nikhat Akhtar, Associate Professor, Department of Information Technology, Goel Institute of Technology & Management, Lucknow, U.P., India
  • Prof. (Dr.) Devendra Agarwal, Dean (Academics), Goel Institute of Technology & Management, Lucknow, U.P., India
  • Ms. Sana Rabbani, Assistant Professor, Department of Information Technology, Goel Institute of Technology & Management, Lucknow, U.P., India
  • Dr. Yusuf Perwej, Professor, Department of Computer Science & Engineering, Goel Institute of Technology & Management, Lucknow, U.P., India



Symptoms, Data-Driven, Kaggle Dataset, Machine Learning, Disease Prediction, Healthcare, Disease Detection, ExtRa Trees


Disease prediction, which seeks to identify people at risk of developing particular diseases, is an essential part of healthcare. Because machine learning algorithms can sift through massive datasets and detect intricate patterns, they have recently become useful tools for disease prediction. The goal of this project is to make it easier for people to diagnose their own health problems using only their symptoms and accurate vital signs. Because of high medical costs, many people put off taking care of their health, which can result in worsening symptoms or even death. Medical expenses can be overwhelming for people without health insurance. Using machine learning methods such as ExtRa Trees (extremely randomized trees), the proposed system provides a general disease prediction based on patients' symptoms. Given the user's age, gender, and symptoms, the algorithm returns a possible diagnosis, indicating which illness the user may be experiencing. Depending on the severity of the condition, the system also suggests healthy eating and exercise routines to help lessen its impact. Finally, this article presents a comparative evaluation of the proposed system against several algorithms, including logistic regression, decision tree, and Naïve Bayes. The proposed model improves both the efficiency and the accuracy of disease prediction.
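To make the ExtRa Trees idea concrete, the following is a minimal, self-contained sketch of an extremely randomized trees classifier applied to symptom-style inputs. It is not the authors' implementation and does not use their Kaggle dataset: the feature names, the three labels, and the labelling rule in `make_patient` are hypothetical and exist only to give the example data. Each tree is grown on the full training set, a few randomly chosen features are tried at each node, and each candidate cut-point is drawn at random rather than optimized. The forest predicts by majority vote.

```python
import random
from collections import Counter

# Hypothetical feature layout: [age, gender, fever, cough, fatigue, thirst]

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, rng, k=3):
    """Grow one extremely randomized tree; leaves are class labels."""
    if len(set(labels)) == 1:
        return labels[0]
    # Only features that still vary in this node can be split on.
    candidates = [f for f in range(len(rows[0])) if len({r[f] for r in rows}) > 1]
    best = None
    for f in rng.sample(candidates, min(k, len(candidates))):
        values = [r[f] for r in rows]
        cut = rng.uniform(min(values), max(values))  # random cut-point
        left = [i for i, v in enumerate(values) if v < cut]
        right = [i for i, v in enumerate(values) if v >= cut]
        if not left or not right:
            continue
        score = (len(left) * gini([labels[i] for i in left])
                 + len(right) * gini([labels[i] for i in right])) / len(rows)
        if best is None or score < best[0]:
            best = (score, f, cut, left, right)
    if best is None:
        return majority(labels)
    _, f, cut, left, right = best
    return (f, cut,
            build_tree([rows[i] for i in left], [labels[i] for i in left], rng, k),
            build_tree([rows[i] for i in right], [labels[i] for i in right], rng, k))

def predict(forest, row):
    """Majority vote over all trees in the forest."""
    votes = []
    for node in forest:
        while isinstance(node, tuple):  # internal node: (feature, cut, left, right)
            f, cut, left, right = node
            node = left if row[f] < cut else right
        votes.append(node)
    return majority(votes)

def make_patient(rng):
    """Synthetic patient with a made-up labelling rule (assumption, not real data)."""
    age = rng.randint(18, 80)
    gender = rng.randint(0, 1)
    fever, cough, fatigue, thirst = (rng.randint(0, 1) for _ in range(4))
    if fever and cough:
        label = "flu-like illness"
    elif fatigue and thirst:
        label = "possible diabetes"
    else:
        label = "no condition flagged"
    return [age, gender, fever, cough, fatigue, thirst], label

def accuracy(forest, samples):
    """Fraction of (row, label) pairs the forest classifies correctly."""
    return sum(predict(forest, x) == y for x, y in samples) / len(samples)

rng = random.Random(0)
data = [make_patient(rng) for _ in range(200)]
train, test = data[:150], data[150:]
X_train = [x for x, _ in train]
y_train = [y for _, y in train]

# Extra-Trees uses the whole training set for every tree (no bootstrap sampling).
forest = [build_tree(X_train, y_train, rng) for _ in range(25)]

train_acc = accuracy(forest, train)
test_acc = accuracy(forest, test)
print(f"train accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

In practice a library implementation, for example scikit-learn's `ExtraTreesClassifier`, would normally be used instead of hand-written trees. The same held-out split could then be reused to compare it against logistic regression, decision tree, and Naïve Bayes baselines, as the comparison described above suggests.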









Research Articles

How to Cite

Kajal, Kanchan Saini, Dr. Nikhat Akhtar, Prof. (Dr.) Devendra Agarwal, Ms. Sana Rabbani, and Dr. Yusuf Perwej, “Machine Learning for the Diagnosis and Prognosis of Chronic Illnesses”, Int J Sci Res Sci Eng Technol, vol. 11, no. 3, pp. 112–122, May 2024, doi: 10.32628/IJSRSET24113100.
