Enhancing Big Data Security in Hadoop using Machine Learning
DOI:
https://doi.org/10.32628/IJSRSET24116182
Keywords:
Big Data, Hadoop, Machine Learning, Security, Anomaly Detection, Pattern Recognition, Classification, Cyber security
Abstract
In the era of Big Data, where vast amounts of information are generated and analysed to extract valuable insights, ensuring the security of data has become paramount. Hadoop, a prominent framework for processing and analysing Big Data, presents unique security challenges because of its distributed and decentralized architecture. Traditional security mechanisms in Hadoop, such as authentication, authorization, and encryption, are essential but may not be sufficient to address evolving security threats. This research paper proposes an approach to enhance Big Data security in Hadoop using Machine Learning techniques. Machine Learning can detect anomalies, identify patterns, and classify data, which complements traditional security measures and provides proactive defence mechanisms against sophisticated attacks. The literature review highlights the limitations of existing security mechanisms in Hadoop and discusses the potential of Machine Learning in addressing these challenges. Various Machine Learning techniques, including anomaly detection, pattern recognition, and classification, are explored for their applicability to Big Data security. The proposed methodology integrates Machine Learning algorithms into the Hadoop ecosystem to analyse data access patterns, detect abnormal behaviour, and identify potential security breaches in real time. The experimental setup comprises the selection of relevant datasets, implementation details using appropriate tools and frameworks, and evaluation using established metrics. Results from the experiments demonstrate the effectiveness of the proposed approach in enhancing Big Data security in Hadoop. By leveraging Machine Learning, organizations can improve their ability to detect and mitigate security threats, thereby safeguarding sensitive data and preserving the integrity of their Big Data infrastructure.
The discussion section interprets the findings in the context of existing literature, highlighting the significance of the research and identifying avenues for further exploration. Ultimately, this research contributes to the advancement of Big Data security practices by leveraging Machine Learning techniques to fortify the defences of Hadoop-based systems against evolving cyber threats.
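As an illustration of what analysing data access patterns to detect abnormal behaviour might look like, the following is a minimal sketch and not the method evaluated in the paper. It assumes access events have already been aggregated into per-user counts for a time window (for example from HDFS audit logs), and it uses a modified z-score based on the median absolute deviation as a simple stand-in for an anomaly detector. The function name, the input format, the sample data, and the threshold of 3.5 are all assumptions made for this example.

```python
from statistics import median


def flag_anomalous_users(access_counts, threshold=3.5):
    """Return users whose access count is an outlier by modified z-score.

    access_counts: dict mapping user name -> number of file accesses in one
    time window. This input format is hypothetical; in practice it would be
    aggregated from audit logs by a separate job.
    """
    counts = list(access_counts.values())
    med = median(counts)
    # Median absolute deviation: a spread estimate that a few extreme
    # values barely affect, unlike the standard deviation.
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        # More than half the counts are identical, so the score is
        # undefined; this sketch flags nothing in that case.
        return []
    flagged = []
    for user, count in access_counts.items():
        # 0.6745 rescales the MAD so the score is comparable to a
        # standard z-score for normally distributed data.
        score = 0.6745 * (count - med) / mad
        if abs(score) > threshold:
            flagged.append(user)
    return sorted(flagged)


# Hypothetical access counts for one window; "mallory" reads far more
# files than the other users.
window = {"alice": 42, "bob": 38, "carol": 45, "dave": 40, "mallory": 910}
suspicious = flag_anomalous_users(window)
print(suspicious)
```

A robust statistic is used here because a single compromised account can inflate the mean and standard deviation enough to hide itself. A production system would more likely train a model over richer features (time of day, paths accessed, source host) rather than a single count.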
Copyright (c) 2024 International Journal of Scientific Research in Science, Engineering and Technology
This work is licensed under a Creative Commons Attribution 4.0 International License.