Framework For Data Development in Mordern Situation Using Machine Learning Technology

Authors

  • Syed Aamir Bokhari  Department of Artificial Intelligence and Machine Learning, New Horizon College of Engineering, Bangalore, India
  • Progyajyoti Mukherjee  Department of Artificial Intelligence and Machine Learning, New Horizon College of Engineering, Bangalore, India
  • Suhail Shaik  Department of Artificial Intelligence and Machine Learning, New Horizon College of Engineering, Bangalore, India
  • Abhishek Samar Singh  Department of Artificial Intelligence and Machine Learning, New Horizon College of Engineering, Bangalore, India

DOI:

https://doi.org/10.32628/IJSRSET229626

Keywords:

Data Profiling, Cleaning, Integration, Transformation

Abstract

This paper focuses on data preparation and starts from a common observation: the adage that a data scientist spends about 80% of his or her time preparing data and only 20% actually working with it, particularly when it comes to visualization. The paper covers the most common issues in data preparation, their solutions, and recent developments. Data must be put into the proper form before it can be analyzed, which means manipulating and organizing it. Data preparation is the iterative process of transforming unstructured, chaotic raw data into a more organized, practical form that is ready for further analysis. Data profiling, cleaning, integration, and transformation are among the primary tasks that make up the preparation process.
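As a rough illustration of the four tasks named in the abstract, the sketch below runs profiling, cleaning, integration, and transformation over a few in-memory records. It is not the authors' framework: the records, the field names, the `regions` lookup, and the specific choices (dropping duplicates by `id`, mean imputation, min-max scaling) are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean

# Hypothetical raw records; field names and values are invented for illustration.
raw_orders = [
    {"id": 1, "customer": "alice", "amount": "10.5"},
    {"id": 2, "customer": "BOB ", "amount": None},
    {"id": 2, "customer": "BOB ", "amount": None},  # duplicate row
    {"id": 3, "customer": "carol", "amount": "4.5"},
]
regions = {"alice": "north", "bob": "south"}  # hypothetical second source


def profile(rows):
    """Profiling: count rows and missing values per field."""
    missing = Counter(k for r in rows for k, v in r.items() if v is None)
    return {"rows": len(rows), "missing": dict(missing)}


def clean(rows):
    """Cleaning: drop repeated ids, normalise names, fill missing amounts with the mean."""
    seen, out = set(), []
    for r in rows:
        if r["id"] in seen:
            continue
        seen.add(r["id"])
        out.append({**r, "customer": r["customer"].strip().lower()})
    known = [float(r["amount"]) for r in out if r["amount"] is not None]
    fill = mean(known)  # assumed imputation strategy
    return [
        {**r, "amount": float(r["amount"]) if r["amount"] is not None else fill}
        for r in out
    ]


def integrate(rows, lookup):
    """Integration: left-join a region from a second source; unmatched keys get None."""
    return [{**r, "region": lookup.get(r["customer"])} for r in rows]


def transform(rows):
    """Transformation: min-max scale the amount into [0, 1]."""
    lo = min(r["amount"] for r in rows)
    hi = max(r["amount"] for r in rows)
    span = (hi - lo) or 1.0  # avoid division by zero when all amounts are equal
    return [{**r, "amount_scaled": (r["amount"] - lo) / span} for r in rows]


report = profile(raw_orders)
prepared = transform(integrate(clean(raw_orders), regions))
```

In practice these steps are iterated: the profiling report (here, two missing `amount` values in four rows) informs which cleaning rules to apply, and the cleaned data is usually profiled again before integration and transformation.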

Published

2022-11-30

Section

Research Articles

How to Cite

[1]
Syed Aamir Bokhari, Progyajyoti Mukherjee, Suhail Shaik, Abhishek Samar Singh, "Framework For Data Development in Mordern Situation Using Machine Learning Technology," International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 9, Issue 6, pp. 229-234, November-December 2022. Available at doi: https://doi.org/10.32628/IJSRSET229626