Multivariate Missing Data Handling with Iterative Bayesian Additive Lasso (IBAL) Multiple Imputation in Multicore Environment on Cloud

Authors: Lavanya. K, L. S. S. Reddy, B. Eswara Reddy

Multivariate analysis of missing data is difficult when the data are high dimensional, that is, when the number of variables p exceeds the number of samples n (p > n). Such data arise in many fields, mainly social science, economics, and medical studies; genomic data are a typical example, with far fewer samples than variables under study. The analysis combines large covariate vectors with response and non-response effects of unknown functional form related to the response variable of interest. This calls for regularized regression models with the effect of a smoothing parametric method; to that end, this work combines regularization with the incorporation of different types of covariates. Although regularization approaches fit this framework, they are computationally demanding in high-dimensional analysis and rely on penalized estimation. The solution adopted here is to implement regularization within iteration-based smoothing approaches. The proposed algorithm, Iterative Bayesian Additive Lasso (IBAL), is compared with standard methods in medical data analysis and is reported to produce unbiased results. All work was carried out in a multicore environment provided by the Microsoft Azure cloud service. Performance is assessed with Standard Error (SE), Mean Square Error (MSE), and Confidence Interval (CI).
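As a rough illustration of how SE, MSE, and CI might be obtained from a multiple-imputation analysis, the sketch below pools per-imputation estimates with Rubin's rules. This is an assumption on our part: the abstract does not state how the benchmarks were computed, and the function names `pool_rubin` and `mse` are hypothetical. The interval uses a normal approximation (z = 1.96) rather than the t reference distribution normally used with Rubin's rules, which keeps the example short.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances, z=1.96):
    """Pool m per-imputation estimates and their squared standard errors.

    Hypothetical helper: Rubin's rules with a normal-approximation interval.
    """
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    w = mean(variances)              # within-imputation variance
    b = variance(estimates)          # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b          # total variance
    se = math.sqrt(t)                # pooled standard error
    return {"estimate": q_bar, "se": se, "ci": (q_bar - z * se, q_bar + z * se)}


def mse(estimates, truth):
    """Mean square error of the estimates against a known true value."""
    return mean((e - truth) ** 2 for e in estimates)


# Illustrative numbers only, not results from the paper.
pooled = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
error = mse([1.0, 2.0, 3.0], truth=2.0)
```

In a simulation study the true parameter value is known, so `mse` can be evaluated directly; with real data only the pooled SE and CI are available.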

Authors and Affiliations

Lavanya. K
Research Scholar, Department of Computer Science and Engineering, JNTUA College of Engineering, Ananthapuramu, Andhra Pradesh, India
L. S. S. Reddy
Professor, Department of Computer Science and Engineering, KL University, Vaddeswaram, Guntur (Dt.), Andhra Pradesh, India
B. Eswara Reddy
Professor, Department of Computer Science and Engineering, JNTUA College of Engineering, Kalikiri, Chittoor (Dt.), Andhra Pradesh, India

Keywords: Multiple Imputation, Regularized Regression, Additive Lasso, High Dimensional, Multicore Environment

Publication Details

Published in : Volume 6 | Issue 3 | May-June 2019
Date of Publication : 2019-06-30
License : This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 194-200
Manuscript Number : IJSRSET196319
Publisher : Technoscience Academy

Print ISSN : 2395-1990, Online ISSN : 2394-4099

Cite This Article :

Lavanya. K, L. S. S. Reddy, B. Eswara Reddy, "Multivariate Missing Data Handling with Iterative Bayesian Additive Lasso (IBAL) Multiple Imputation in Multicore Environment on Cloud", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN : 2395-1990, Online ISSN : 2394-4099, Volume 6, Issue 3, pp. 194-200, May-June 2019. Available at doi : https://doi.org/10.32628/IJSRSET196319
Journal URL : http://ijsrset.com/IJSRSET196319
