Optimising Data Modeling Approaches for Scalable Data Warehousing Systems
DOI:
https://doi.org/10.32628/IJSRSET2358716
Keywords:
Data Warehousing, Scalability, Data Modeling, Schema Optimization, Data Partitioning, Hybrid Approaches, ETL
Abstract
Data warehousing has become a cornerstone of modern decision-support systems, enabling organizations to consolidate, store, and analyze large volumes of data for informed decision-making. With the exponential growth of data and increasing complexity in data sources, traditional approaches to data modeling and warehousing face challenges in scalability, real-time processing, and integration of diverse data types. This paper provides a comprehensive overview of data warehousing, highlighting its key components, traditional and modern approaches, and the evolving challenges in handling big data's volume, variety, and velocity. It emphasizes the importance of data modeling techniques, such as normalization, denormalization, and schema design strategies, to enhance scalability and performance. Furthermore, the paper identifies the critical need for innovative solutions to optimize data modeling for modern data warehouses, ensuring robust, adaptable, and efficient systems capable of supporting advanced analytics in diverse business environments.
License
Copyright (c) IJSRSET

This work is licensed under a Creative Commons Attribution 4.0 International License.