Distributed Data Mining: Implementing Data Mining Jobs on Grid Environments

Authors: Vishal Bhemwala, Bhavesh Patel, Dr. Ashok Patel

Data mining technology consists of more than efficient and effective algorithms executed as standalone kernels. It comprises complex applications built on the nontrivial interaction of hardware and software components running in large-scale distributed environments. This distribution is both a cause and an effect of the inherently distributed nature of data, on the one hand, and of the spatiotemporal complexity that characterizes many data mining (DM) applications, on the other. For a growing number of application fields, Distributed Data Mining (DDM) is therefore a critical technology. In this paper, after reviewing the open problems in DDM, we describe how DM jobs are implemented on Grid environments and introduce the design of the Knowledge Grid system.

Authors and Affiliations

Vishal Bhemwala
Department of Computer Science, Hem. North Gujarat University, Patan, Gujarat, India
Bhavesh Patel
Department of Computer Science, Hem. North Gujarat University, Patan, Gujarat, India
Dr. Ashok Patel
Department of Computer Science, Hem. North Gujarat University, Patan, Gujarat, India

Keywords: Data Mining, Knowledge Grid, Distributed Data Mining

Publication Details

Published in : Volume 2 | Issue 1 | January-February 2016
Date of Publication : 2016-02-25
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 327-332
Manuscript Number : IJSRSET162168
Publisher : Technoscience Academy

Print ISSN : 2395-1990, Online ISSN : 2394-4099

Cite This Article :

Vishal Bhemwala, Bhavesh Patel, Dr. Ashok Patel, "Distributed Data Mining: Implementing Data Mining Jobs on Grid Environments", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN : 2395-1990, Online ISSN : 2394-4099, Volume 2, Issue 1, pp.327-332, January-February-2016. Available at doi : https://doi.org/10.32628/IJSRSET162168
Journal URL : https://ijsrset.com/IJSRSET162168