Framework for Data Development in Modern Situations Using Machine Learning Technology
DOI:
https://doi.org/10.32628/IJSRSET229626
Keywords:
Data Profiling, Cleaning, Integration, Transformation
Abstract
This paper focuses on data preparation and starts from a common observation. A familiar adage holds that a data scientist spends about 80% of his or her time preparing data and only 20% actually working with it, particularly when it comes to visualization. The paper concentrates on data preparation, including its most common issues, solutions, and developments. Data must be put into the proper form before it can be analyzed, and preparing data for analysis involves manipulating and organizing it. Data preparation is the iterative process of transforming unstructured, messy raw data into a more organized, practical form that is ready for further analysis. Data profiling, cleaning, integration, and transformation are among the primary activities (or tasks) that make up the preparation process.
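The four activities named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the paper's method: the function names, the sample records, and the specific choices (mean imputation for missing values, an inner join for integration, min-max scaling for transformation) are all assumptions made here for illustration.

```python
from statistics import mean


def profile(rows):
    """Profiling: count missing (None) values per column."""
    columns = {key for row in rows for key in row}
    return {col: sum(1 for row in rows if row.get(col) is None)
            for col in sorted(columns)}


def clean(rows, column):
    """Cleaning: drop exact duplicate rows, then fill missing values
    in `column` with the mean of the observed values (an assumed policy)."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    fill = mean(r[column] for r in unique if r.get(column) is not None)
    for r in unique:
        if r.get(column) is None:
            r[column] = fill
    return unique


def integrate(left, right, key):
    """Integration: inner join of two record lists on a shared key."""
    lookup = {r[key]: r for r in right}
    return [{**l, **lookup[l[key]]} for l in left if l[key] in lookup]


def transform(rows, column):
    """Transformation: min-max scale `column` into the range [0, 1]."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [{**r, column: (r[column] - lo) / span} for r in rows]


# Hypothetical sample data: one missing value and one duplicate row.
readings = [
    {"id": 1, "temp": 20.0},
    {"id": 2, "temp": None},
    {"id": 3, "temp": 30.0},
    {"id": 3, "temp": 30.0},
]
sites = [
    {"id": 1, "site": "north"},
    {"id": 2, "site": "south"},
    {"id": 3, "site": "east"},
]

missing = profile(readings)
cleaned = clean(readings, "temp")
prepared = transform(integrate(cleaned, sites, "id"), "temp")
```

In practice these steps are repeated: profiling the output of one pass often reveals further cleaning or transformation work, which matches the iterative character of data preparation described above.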
License
Copyright (c) IJSRSET

This work is licensed under a Creative Commons Attribution 4.0 International License.