Distributed Data Mining: Implementing Data Mining Jobs on Grid Environments

Authors(3):

Vishal Bhemwala, Bhavesh Patel, Dr. Ashok Patel

Abstract

Data mining (DM) technology does not consist only of efficient and effective algorithms executed as standalone kernels. Rather, it consists of complex applications built on nontrivial interactions among hardware and software components running on large-scale distributed environments. This distribution is both a cause and an effect of two things: the inherently distributed nature of data, and the spatiotemporal complexity that characterizes many DM applications. For a growing number of application fields, Distributed Data Mining (DDM) is therefore a critical technology. In this paper, after reviewing the open problems in DDM, we describe DM jobs on Grid environments and introduce the design of the Knowledge Grid system.

Keywords: Data Mining, Knowledge Grid, Distributed Data Mining

Publication Details

Published in: Volume 2 | Issue 1 | January-February 2016
Date of Publication: 2016-02-25
Print ISSN: 2395-1990
Online ISSN: 2394-4099
Page(s): 327-332
Manuscript Number: IJSRSET162168
Publisher: Technoscience Academy

Cite This Article

Vishal Bhemwala, Bhavesh Patel, Dr. Ashok Patel, "Distributed Data Mining: Implementing Data Mining Jobs on Grid Environments", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 2, Issue 1, pp. 327-332, January-February 2016.
URL : http://ijsrset.com/IJSRSET162168.php