Issues and Challenges in Big Data Mining

Authors (1) : Sakshi

Big Data has fast become a major problem in recent years. Big data refers to datasets whose size and complexity are so large that they cannot be captured, stored, managed, and analyzed with typical database software tools. Data mining is a prominent buzzword used to describe the range of Big Data analytics, including collection, extraction, analysis, and statistics. Big Data mining involves extracting useful information from these huge data sets and data streams, a task made difficult by their volume, velocity, and variety. This paper presents an overview of Big Data mining, the problems related to mining, and the new opportunities it offers. The discussion covers platforms and frameworks for managing and processing large data sets. We also discuss the knowledge discovery process, data mining, and various open source tools, along with the current status, open issues, and a forecast of the future.

Authors and Affiliations

Sakshi
Assistant Professor, Department of Computer Science and Applications, Guru Nanak College, Ferozepur Cantt, Punjab, India

Keywords : Data mining, Big data, Big data mining, Big data management Issues and Challenges.


Publication Details

Published in : Volume 4 | Issue 7 | March-April 2018
Date of Publication : 2018-04-30
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 524-535
Manuscript Number : IJSRSET184871
Publisher : Technoscience Academy

Print ISSN : 2395-1990, Online ISSN : 2394-4099

Cite This Article :

Sakshi, "Issues and Challenges in Big Data Mining", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN : 2395-1990, Online ISSN : 2394-4099, Volume 4, Issue 7, pp. 524-535, March-April 2018.
Journal URL : http://ijsrset.com/IJSRSET184871
