Implementation of Aggregation of Map and Reduce Function for Performance Improvisation


Varsha B.Bobade
Big Data is a term that refers to data sets whose size (volume), complexity (variability), and rate of growth (velocity) make them difficult to capture, manage, process, or analyze. Hadoop, an open source software project that enables the distributed processing of large data sets across clusters of commodity servers, can be used to analyze data at this scale.

I propose a modified MapReduce architecture that allows data to be pipelined between operators. Pipelining reduces completion times and improves system utilization for batch jobs. I also present a modified version of the Hadoop MapReduce framework that supports online aggregation, which allows users to see early returns from a job as it is being computed. The objective of the proposed technique is to significantly improve the performance of Hadoop MapReduce for efficient Big Data processing.
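The idea of online aggregation with combiners can be illustrated with a minimal, single-process sketch. It is not the Hadoop API and not the implementation described in the paper. The function names (`map_words`, `combine`, `online_word_count`) and the word-count workload are assumptions chosen for illustration. Each input split is mapped lazily, pre-aggregated by a combiner, and merged into a running total, and a snapshot of that total is emitted after every split so that a user could inspect early results before the job finishes.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map step: lazily emit (word, 1) pairs so they can be pipelined downstream."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def combine(pairs: Iterable[Tuple[str, int]]) -> Counter:
    """Combiner: pre-aggregate the map output of one split before it is merged."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return local


def online_word_count(
    splits: List[List[str]],
) -> Iterator[Tuple[float, Dict[str, int]]]:
    """Reduce step that yields (fraction_processed, snapshot) after each split.

    Hypothetical helper: a real framework would run splits in parallel and
    stream map output over the network; here everything runs in one process.
    """
    running = Counter()
    for done, split in enumerate(splits, start=1):
        # Merge the combined counts of this split into the running total.
        running.update(combine(map_words(split)))
        # Emit an early result instead of waiting for the whole job to finish.
        yield done / len(splits), dict(running)


if __name__ == "__main__":
    demo_splits = [["big data hadoop"], ["hadoop map reduce"], ["big data"]]
    for progress, snapshot in online_word_count(demo_splits):
        print(f"{progress:.0%} processed: {snapshot}")
```

Each snapshot is an exact count over the data seen so far, not a statistical estimate of the final answer. Scaling a snapshot to the full data set, or attaching error bounds to it, would need additional assumptions about how the splits are sampled.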

Big Data, Hadoop Framework, Online Aggregation, Combiners.

Publication Details

Published in: Volume 2 | Issue 5 | September-October 2016
Date of Publication: 2016-10-30
Print ISSN: 2395-1990
Online ISSN: 2394-4099
Page(s): 196-201
Manuscript Number: IJSRSET162537
Publisher: Technoscience Academy

Cite This Article

Varsha B.Bobade, "Implementation of Aggregation of Map and Reduce Function for Performance Improvisation", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 2, Issue 5, pp. 196-201, September-October 2016.
URL: http://ijsrset.com/IJSRSET162537.php