A Survey on Hadoop-Mapreduce Environment with Scheduling Algorithms in Big Data

Authors

  • Swathi Kiruthika V., Department of Computer Science, Gobi Arts & Science College, Gobichettipalayam, Erode, Tamil Nadu, India
  • Dr. Thiagarasu V., Department of Computer Science, Gobi Arts & Science College, Gobichettipalayam, Erode, Tamil Nadu, India

Keywords:

Big Data, Hadoop, MapReduce, Straggler, Data Skew, Job Scheduling

Abstract

Hadoop and MapReduce are among the most efficient tools for reducing the complexity of maintaining big data sets. MapReduce was introduced by Google, and Hadoop, its open source counterpart, focuses on parallelizing computation across large distributed clusters of commodity machines. As a parallel data processing tool, MapReduce has therefore been gaining significant momentum in both academia and industry. The objective of this survey is to study MapReduce with different algorithms for improving performance on large datasets.
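
As an illustration of the programming model the abstract refers to, the following is a minimal single-process sketch in Python of the map, shuffle, and reduce phases applied to a word count. It is not taken from the paper and does not use Hadoop's API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_job`) and the sample input are hypothetical. In a real cluster the map and reduce calls would run in parallel on different machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values of one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce sequentially (hypothetical driver)."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; each string stands for one input split.
    print(run_job(["big data hadoop", "hadoop mapreduce", "big data"]))
```

Because each `map_phase` call depends only on its own record, and each `reduce_phase` call only on its own key, both steps can be distributed independently; this is the property that schedulers for such jobs rely on.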

Published

2016-08-30

Section

Research Articles

How to Cite

[1]
Swathi Kiruthika V., Dr. Thiagarasu V., "A Survey on Hadoop-Mapreduce Environment with Scheduling Algorithms in Big Data", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 2, Issue 4, pp. 593-596, July-August 2016.