Parallel Fractional Hot-Deck Imputation and Variance Estimation for Big Incomplete Data Curing


Fractional hot-deck imputation (FHDI) is a general-purpose, assumption-free method for handling multivariate missing data. It fills in each missing item with multiple observed values rather than artificially created ones. The corresponding R package, FHDI (J. Im, I. Cho, and J. K. Kim, "An R package for fractional hot deck imputation," R J., vol. 10, no. 1, pp. 140–154, 2018), is general and efficient, but its excessive memory requirements and long running time make it unsuitable for large incomplete datasets. As a first step toward applying FHDI to big incomplete data, we developed a parallel fractional hot-deck imputation program (named P-FHDI) suited to curing large incomplete datasets. When applied to datasets containing up to millions of instances or 10,000 variables, P-FHDI showed a favorable speedup. This paper explains the detailed parallel algorithms of P-FHDI for large-instance (big-n) or high-dimensional (big-p) datasets and confirms the favorable scalability of the proposed approach. The program inherits all the advantages of the serial FHDI and adds parallel variance estimation, which should benefit a wide range of users in science and engineering.
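The core idea of filling each missing item with several observed values can be illustrated with a toy sketch. This is not the FHDI or P-FHDI algorithm: the function names, the pre-assigned imputation cells, the equal fractional weights, and the "take the first few donors" rule are all simplifying assumptions made here for illustration only.

```python
from collections import defaultdict

def fractional_hot_deck(records, max_donors=2):
    """Toy fractional hot-deck imputation (illustrative assumption, not FHDI).

    records: list of (cell, y) pairs; y is None when missing.
    Returns a list of (row_index, value, fractional_weight) triples.
    """
    # Collect observed values (donors) within each imputation cell.
    donors_by_cell = defaultdict(list)
    for cell, y in records:
        if y is not None:
            donors_by_cell[cell].append(y)

    imputed = []
    for i, (cell, y) in enumerate(records):
        if y is not None:
            # Observed rows keep their value with full weight.
            imputed.append((i, y, 1.0))
            continue
        # Simplification: use the first max_donors observed values in the cell.
        donors = donors_by_cell[cell][:max_donors]
        if not donors:
            raise ValueError(f"no donor available in cell {cell!r}")
        # Each donor value receives an equal fractional weight summing to 1.
        weight = 1.0 / len(donors)
        for value in donors:
            imputed.append((i, value, weight))
    return imputed

def weighted_mean(imputed):
    """Mean of the fractionally imputed data, weighting each value."""
    total_weight = sum(w for _, _, w in imputed)
    return sum(v * w for _, v, w in imputed) / total_weight

records = [("a", 1.0), ("a", 3.0), ("a", None), ("b", 10.0), ("b", None)]
imputed = fractional_hot_deck(records)
print(weighted_mean(imputed))
```

In this sketch each recipient row is processed independently once the donor lists are built, which hints at why the imputation step is a natural candidate for parallel execution.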


