Speed Up Big Data Analytics by Unveiling the Storage Distribution of Sub-Datasets - 2018


In this project, we study the problem of sub-dataset analysis over distributed file systems, e.g., the Hadoop file system (HDFS). Our experiments show that the distribution of sub-datasets over HDFS blocks, which HDFS hides, can often cause the corresponding analyses to suffer from seriously imbalanced or inefficient parallel execution. Specifically, the content clustering of sub-datasets causes some computational nodes to carry out much more workload than others; it also makes sampling of sub-datasets inefficient, because analysis programs often read large amounts of irrelevant data. We conduct a comprehensive analysis of how imbalanced computing patterns and inefficient sampling occur. We then propose DataNet, a storage-distribution-aware method to optimize sub-dataset analysis over distributed storage systems. First, we propose an efficient algorithm to obtain the metadata of sub-dataset distributions. Second, we design an elastic storage structure called ElasticMap, based on the HashMap and BloomFilter techniques, to store the metadata. Third, we employ distribution-aware algorithms for sub-dataset applications to achieve balanced and efficient parallel execution. The proposed method can benefit different sub-dataset analyses with various computational requirements. Experiments conducted on PRObE's Marmot 128-node cluster testbed show the performance benefits of DataNet.
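
The three steps above can be illustrated with a minimal Python sketch. It is not the DataNet implementation: the abstract says only that ElasticMap builds on HashMap and BloomFilter techniques, so the split used here (exact sizes for large sub-datasets in a dict, membership only for small ones in a Bloom filter) is an assumption, as are the names `BlockMeta`, `assign_blocks`, `threshold`, and `small_guess`. The scheduler is a plain greedy heuristic (heaviest block first, to the least-loaded node) standing in for the distribution-aware algorithms.

```python
import hashlib
import heapq


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def __contains__(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class BlockMeta:
    """Hypothetical per-block metadata (assumed layout, not the paper's)."""

    def __init__(self, sizes, threshold):
        self.exact = {}             # large sub-datasets: exact byte counts
        self.small = BloomFilter()  # small sub-datasets: presence only
        for key, size in sizes.items():
            if size >= threshold:
                self.exact[key] = size
            else:
                self.small.add(key)

    def estimate(self, key, small_guess=1):
        """Estimated bytes of sub-dataset `key` stored in this block."""
        if key in self.exact:
            return self.exact[key]
        return small_guess if key in self.small else 0


def assign_blocks(block_metas, key, num_nodes):
    """Greedy balancing: heaviest block first, onto the least-loaded node."""
    weights = {bid: meta.estimate(key) for bid, meta in block_metas.items()}
    heap = [(0, node, []) for node in range(num_nodes)]
    heapq.heapify(heap)
    for bid in sorted(weights, key=weights.get, reverse=True):
        if weights[bid] == 0:
            continue  # block holds none of the sub-dataset: skip reading it
        load, node, blocks = heapq.heappop(heap)
        blocks.append(bid)
        heapq.heappush(heap, (load + weights[bid], node, blocks))
    return {node: (load, blocks) for load, node, blocks in heap}
```

Calling `assign_blocks(metas, "movieA", num_nodes=2)` on a dict of `BlockMeta` objects returns, per node, the estimated load and the list of blocks to process. Blocks whose metadata reports no data for the requested sub-dataset are never scheduled, which reflects the sampling inefficiency the abstract describes.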


