RFHOC: A Random-Forest Approach to Auto-Tuning Hadoop's Configuration


Hadoop is a widely used implementation framework of the MapReduce programming model for large-scale data processing. Hadoop performance, however, is significantly affected by the settings of the Hadoop configuration parameters. Unfortunately, manually tuning these parameters is very time-consuming, if at all practical. This paper proposes an approach, called RFHOC, to automatically tune the Hadoop configuration parameters for optimized performance for a given application running on a given cluster. RFHOC constructs two ensembles of performance models using a random-forest approach, one for the map stage and one for the reduce stage. Leveraging these models, RFHOC employs a genetic algorithm to automatically search the Hadoop configuration space. The evaluation of RFHOC using five typical Hadoop programs, each with five different input data sets, shows that it achieves a performance speedup by a factor of 2.11 on average and up to 7.4 over the recently proposed cost-based optimization (CBO) approach. In addition, RFHOC's performance benefit increases with input data set size.
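The search step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two predictor functions are toy analytic stand-ins for trained map-stage and reduce-stage models (no random forest is trained here), the parameter ranges are assumptions chosen for illustration, and the genetic operators (uniform crossover, random-reset mutation, elitism) are one common choice among many.

```python
import random

# Assumed search space: parameter name -> (low, high) integer range.
# The ranges are illustrative, not recommendations.
SPACE = {
    "mapreduce.task.io.sort.mb": (50, 500),
    "mapreduce.job.reduces": (1, 64),
}

def predict_map_time(cfg):
    # Toy stand-in for a trained map-stage model (hypothetical formula).
    return abs(cfg["mapreduce.task.io.sort.mb"] - 300) / 10.0 + 20.0

def predict_reduce_time(cfg):
    # Toy stand-in for a trained reduce-stage model (hypothetical formula).
    return abs(cfg["mapreduce.job.reduces"] - 32) * 1.5 + 30.0

def predicted_time(cfg):
    # Fitness: predicted total time, lower is better.
    return predict_map_time(cfg) + predict_reduce_time(cfg)

def random_config(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SPACE.items()}

def crossover(a, b, rng):
    # Uniform crossover: each parameter is taken from either parent.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SPACE}

def mutate(cfg, rng, rate=0.2):
    # Random-reset mutation: occasionally resample a parameter in its range.
    out = dict(cfg)
    for k, (lo, hi) in SPACE.items():
        if rng.random() < rate:
            out[k] = rng.randint(lo, hi)
    return out

def genetic_search(generations=40, pop_size=30, elite=5, seed=0):
    rng = random.Random(seed)
    pop = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=predicted_time)
        parents = pop[:elite]  # elitism: the best configurations survive
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        pop = parents + children
    return min(pop, key=predicted_time)

best = genetic_search()
print(best, predicted_time(best))
```

Because the models are queried instead of running the job, each fitness evaluation is cheap, which is what makes a population-based search over the configuration space affordable.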
