Critical Survey for Scheduling and Resource Allocation Methods in Hadoop-MapReduce on Clouds

Authors

  • Vaishali Sontakke  Department of Information Science and Engineering, East Point College of Engineering and Technology, Bangalore, Karnataka, India
  • Dr. Dayananda R B  Department of Computer Science and Engineering, KSIT, Bangalore, Karnataka, India

Keywords:

Hadoop, MapReduce, Survey, Scheduling, Resource Allocation

Abstract

Cloud computing is a computing platform that hosts services and applications for users and businesses. It gives users low-cost, easy access from any part of the world and works on a pay-as-you-go model. In a cloud environment, computing resources are provided on demand. Cloud computing builds upon developments in virtualization, with an emphasis on the cost of computing resources, resource scalability, and on-demand services. It permits businesses to scale their resources up and down based on requirements. Meanwhile, Hadoop, an open-source implementation of MapReduce, has become a widespread model for data-intensive applications with short jobs and low response times. In this paper, we study existing work on scheduling and resource allocation for matching the processing load. We provide a comparison of these works, covering their methodologies along with their shortcomings.

Published

2021-07-30

Section

Research Articles

How to Cite

[1]
Vaishali Sontakke, Dr. Dayananda R B, "Critical Survey for Scheduling and Resource Allocation Methods in Hadoop-MapReduce on Clouds," International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 9, Issue 4, pp. 121-129, July-August 2021.