Big Data Processing with Data Provenance Using HDM Framework


Rajat Bodankar, Roshani Talmale
Big Data applications are becoming more complex and experiencing frequent changes and updates. In practice, manual optimization of complex big data jobs is time-consuming and error-prone. Maintenance and management of evolving big data applications is a challenging task as well. We demonstrate HDM, Hierarchically Distributed Data Matrix, as a big data processing framework with built-in data flow optimizations and integrated maintenance of data provenance information that supports the management of continuously evolving big data applications. In HDM, the data flow of jobs is automatically optimized based on the functional DAG representation to improve performance during execution. Additionally, comprehensive meta-data related to explanation, execution and dependency updates of HDM applications is stored and maintained in order to facilitate the debugging, monitoring, tracing and reproducing of HDM jobs and programs.
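The two ideas in the abstract, optimizing a job's functional data flow and recording provenance during execution, can be illustrated with a small sketch. This is not the HDM API: `Op`, `fuse_maps`, `run`, and the record fields are hypothetical names chosen for illustration, the pipeline is a linear chain (the simplest case of a DAG), and fusing consecutive map operators is just one plausible example of a data flow optimization.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Op:
    """One functional operator in the plan (hypothetical, not an HDM class)."""
    kind: str                 # "map" or "filter"
    fn: Callable[[Any], Any]
    name: str


def fuse_maps(ops: List[Op]) -> List[Op]:
    """Merge each run of consecutive map operators into one composed map."""
    fused: List[Op] = []
    for op in ops:
        if fused and op.kind == "map" and fused[-1].kind == "map":
            prev = fused.pop()
            # Bind f and g as defaults so each lambda keeps its own functions.
            composed = lambda x, f=prev.fn, g=op.fn: g(f(x))
            fused.append(Op("map", composed, f"{prev.name}>>{op.name}"))
        else:
            fused.append(op)
    return fused


def run(ops: List[Op], data: List[Any], provenance: List[Dict]) -> List[Any]:
    """Execute the plan and append one provenance record per operator."""
    for op in ops:
        rows_in = len(data)
        if op.kind == "map":
            data = [op.fn(x) for x in data]
        else:
            data = [x for x in data if op.fn(x)]
        provenance.append(
            {"op": op.name, "kind": op.kind, "rows_in": rows_in, "rows_out": len(data)}
        )
    return data


plan = [
    Op("map", lambda x: x + 1, "inc"),
    Op("map", lambda x: x * 2, "double"),
    Op("filter", lambda x: x % 4 == 0, "div4"),
]
optimized = fuse_maps(plan)   # two operators instead of three
log: List[Dict] = []
result = run(optimized, [1, 2, 3, 4], log)
print(result)                 # [4, 8]
print(log)
```

The fused plan makes one pass over the data for the two maps instead of two, and the log keeps enough information to trace which operator produced which row counts. A real framework would also record versions and dependencies, which this sketch omits.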

Keywords : Big Data, Data Flow Optimization, Provenance Management

Publication Details

Published in : Volume 4 | Issue 6 | January-February - 2018
Date of Publication : 2018-02-28
Print ISSN : 2395-1990
Online ISSN : 2394-4099
Page(s) : 195-198
Manuscript Number : IJSRSET1848154
Publisher : Technoscience Academy

Cite This Article

Rajat Bodankar, Roshani Talmale, "Big Data Processing with Data Provenance Using HDM Framework", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN : 2395-1990, Online ISSN : 2394-4099, Volume 4, Issue 6, pp. 195-198, January-February 2018.
URL : http://ijsrset.com/IJSRSET1848154.php