A Survey of Approaches for Hadoop with Clustering Techniques
Keywords:
Big Data, Clustering, Data Mining
Abstract
Data mining environments generate large volumes of data that must be analyzed, and patterns must be extracted from that data to gain knowledge. With the current explosion of both structured and unstructured data in genomics, meteorology, science, ecological research, and many other fields, it has become difficult to process, manage, and analyze patterns using traditional databases and architectures. A suitable architecture therefore needs to be understood in order to gain knowledge from Big Data. This paper presents a review of various algorithms needed for handling such large data sets. These algorithms define the structures and methods implemented to handle Big Data. The paper also lists various tools that were developed for analyzing such data.
License
Copyright (c) IJSRSET

This work is licensed under a Creative Commons Attribution 4.0 International License.