Speed Up Big Data Analytics by Unveiling the Storage Distribution of Sub-Datasets - 2018


In this project, we study the problem of sub-dataset analysis over distributed file systems, e.g., the Hadoop file system (HDFS). Our experiments show that the distribution of sub-datasets over HDFS blocks, which is hidden by HDFS, can often cause the corresponding analyses to suffer from seriously imbalanced or inefficient parallel execution. Specifically, the content clustering of sub-datasets results in some computational nodes carrying out much more workload than others; furthermore, it leads to inefficient sampling of sub-datasets, as analysis programs often read large amounts of irrelevant data. We conduct a comprehensive analysis of how imbalanced computing patterns and inefficient sampling occur. We then propose DataNet, a storage-distribution-aware method to optimize sub-dataset analysis over distributed storage systems. First, we propose an efficient algorithm to obtain the metadata of sub-dataset distributions. Second, we design an elastic storage structure called ElasticMap, based on the HashMap and BloomFilter techniques, to store this metadata. Third, we employ distribution-aware algorithms for sub-dataset applications to achieve balanced and efficient parallel execution. The proposed method can benefit different sub-dataset analyses with various computational requirements. Experiments are conducted on PRObE's Marmot 128-node cluster testbed, and the results show the performance benefits of DataNet.
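To make the idea concrete, the following is a minimal Python sketch of how per-block metadata built from a hash map and a Bloom filter could drive a balanced assignment of blocks to workers. It is an illustration under stated assumptions, not DataNet's actual implementation: the names `ElasticMapSketch`, `assign_blocks`, and `threshold_bytes`, the rule that small sub-datasets are recorded only in the Bloom filter, the placeholder size estimate for them, and the greedy scheduler (which ignores data locality) are all hypothetical choices made for this example.

```python
import hashlib
import heapq


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def __contains__(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class ElasticMapSketch:
    """Per-block metadata: exact sizes for large sub-datasets,
    membership only for small ones (an assumed split, see lead-in)."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes  # assumed cut-off
        self.exact = {}             # sub-dataset id -> bytes in this block
        self.small = BloomFilter()  # ids of sub-datasets below the cut-off

    def record(self, sub_id, size_bytes):
        if size_bytes >= self.threshold_bytes:
            self.exact[sub_id] = size_bytes
        else:
            self.small.add(sub_id)

    def estimate(self, sub_id):
        """Estimated bytes of sub_id stored in this block."""
        if sub_id in self.exact:
            return self.exact[sub_id]
        if sub_id in self.small:
            return self.threshold_bytes // 2  # assumed placeholder estimate
        return 0


def assign_blocks(block_maps, sub_id, num_workers):
    """Greedy balancing: heaviest block first, to the least-loaded worker.
    Blocks that hold none of sub_id are skipped entirely."""
    weights = {b: m.estimate(sub_id) for b, m in block_maps.items()}
    relevant = sorted((b for b, w in weights.items() if w > 0),
                      key=lambda b: weights[b], reverse=True)
    # Heap entries are (load, worker id, assigned blocks); worker ids are
    # unique, so tuple comparison never reaches the lists.
    heap = [(0, worker, []) for worker in range(num_workers)]
    heapq.heapify(heap)
    for block in relevant:
        load, worker, blocks = heapq.heappop(heap)
        blocks.append(block)
        heapq.heappush(heap, (load + weights[block], worker, blocks))
    return {worker: (load, blocks) for load, worker, blocks in heap}
```

In this sketch, a caller would build one `ElasticMapSketch` per block while scanning the data once, then call `assign_blocks(block_maps, sub_id, num_workers)` before launching an analysis. Skipping blocks whose estimate is zero reflects the sampling benefit described above, and weighting by estimated size reflects the load-balancing benefit.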


