Adaptive Federated Data Cleaning with Explainability: A Basic Threshold-Driven Approach for Heterogeneous Data Environments

Authors

  • Iranna Shirol, Independent Researcher, India

DOI:

https://doi.org/10.32628/IJSRSET25122207

Keywords:

Federated Learning, Data Cleaning, Threshold-Based Algorithm, Explainable AI, Outlier Removal, Adaptive Systems

Abstract

Automated data cleaning is critical for ensuring data quality and robustness of machine learning models. However, modern data environments are increasingly decentralized and heterogeneous, making centralized cleaning methods less viable, particularly when privacy is a concern. In this paper, we propose a novel framework for adaptive data cleaning based on a simple threshold-driven algorithm within a federated learning context. Our approach removes outliers by employing statistical measures (mean and standard deviation) to identify anomalies across distributed nodes. Additionally, we integrate explainability features so that each cleaning decision is transparent to end users. Experimental evaluations on both synthetic and real-world datasets indicate that our method yields notable improvements in data quality while preserving user privacy. We discuss current limitations and outline future avenues for enhancing scalability and extending the framework to handle multimodal data.
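The abstract gives no pseudocode, so the following is only a minimal sketch of how a mean-and-standard-deviation threshold rule with per-decision explanations might look in a federated setting. It assumes that each node shares only aggregate statistics (count, sum, sum of squares) with a coordinator, and that a value is flagged when it lies more than k standard deviations from the pooled mean. The names `local_summary`, `pooled_mean_std`, `clean_node`, `Decision`, and the parameter `k` are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Decision:
    value: float
    kept: bool
    reason: str  # human-readable explanation of the cleaning decision


def local_summary(values):
    """Return (count, sum, sum of squares): the only data a node shares."""
    return len(values), sum(values), sum(v * v for v in values)


def pooled_mean_std(summaries):
    """Combine per-node summaries into a global mean and population std."""
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    mean = total / n
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)


def clean_node(values, mean, std, k=3.0):
    """Flag values lying more than k standard deviations from the mean.

    k is an assumed, tunable threshold; the paper's value is not stated here.
    """
    decisions = []
    for v in values:
        z = 0.0 if std == 0 else abs(v - mean) / std
        kept = z <= k
        reason = (
            f"|{v} - {mean:.2f}| / {std:.2f} = {z:.2f} "
            f"{'<=' if kept else '>'} k={k}"
        )
        decisions.append(Decision(v, kept, reason))
    return decisions


# Illustrative data only: two nodes, one of which holds an obvious outlier.
nodes = {
    "node_a": [10.0, 11.0, 9.0, 10.5, 9.5],
    "node_b": [10.2, 9.8, 10.1, 100.0, 9.9],
}
mean, std = pooled_mean_std([local_summary(v) for v in nodes.values()])
report = {name: clean_node(v, mean, std, k=2.0) for name, v in nodes.items()}
cleaned = {name: [d.value for d in ds if d.kept] for name, ds in report.items()}
```

In this toy run the threshold is set to k = 2.0 rather than 3.0 because, with only ten points, a single extreme value inflates the pooled standard deviation enough to mask itself at k = 3. Each `Decision.reason` string records the arithmetic behind the keep-or-remove outcome, which is one simple way to make cleaning decisions inspectable by end users.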

References

Côté, P.-O., Nikanjam, A., Ahmed, N., Humeniuk, D., & Khomh, F. (2023). Data Cleaning and Machine Learning: A Systematic Literature Review. arXiv preprint arXiv:2310.01765.

Lee, G. Y., Alzamil, L., Doskenov, B., & Termehchy, A. (2021). A Survey on Data Cleaning Methods for Improved Machine Learning Model Performance. arXiv preprint arXiv:2109.07127.

Additional literature on federated learning and explainable AI.

Published

30-04-2025

Section

Research Articles

How to Cite

[1]
Iranna Shirol, “Adaptive Federated Data Cleaning with Explainability: A Basic Threshold-Driven Approach for Heterogeneous Data Environments”, Int J Sci Res Sci Eng Technol, vol. 12, no. 2, pp. 828–832, Apr. 2025, doi: 10.32628/IJSRSET25122207.
