Optimizing Big Data processing performance in the public cloud: opportunities and approaches


Today's lightning-fast data generation from massive sources calls for efficient big data processing, which imposes unprecedented demands on computing and networking infrastructures. State-of-the-art tools, most notably MapReduce, are generally run on dedicated server clusters to exploit data parallelism. For grassroots users or non-computing professionals, the cost of deploying and maintaining a large-scale dedicated server cluster can be prohibitively high, not to mention the technical skills involved. Public clouds, on the other hand, allow general users to rent virtual machines (VMs) and run their applications in a pay-as-you-go manner, with ultra-high scalability and minimal upfront costs. This new computing paradigm has gained tremendous success in recent years, becoming a highly attractive alternative to dedicated server clusters. This article discusses the critical challenges and opportunities that arise when big data meets the public cloud. We identify the key differences between running big data processing in a public cloud and in dedicated server clusters. We then present two important problems for efficient big data processing in the public cloud: resource provisioning (i.e., how to rent VMs) and VM-MapReduce job/task scheduling (i.e., how to run MapReduce once the VMs are created). Each of these two questions has a set of problems to solve. We present solution approaches for certain problems and offer optimized design guidelines for others. Finally, we discuss our implementation experiences.
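To make the resource provisioning question (how many VMs to rent) more concrete, the following is a minimal sketch and not the article's method. It assumes a perfectly parallel workload, identical VMs, and billing per started VM-hour; the function name `cheapest_vm_count` and all of its parameters are hypothetical and chosen only for illustration.

```python
import math


def cheapest_vm_count(total_work_hours, deadline_hours, price_per_vm_hour, max_vms):
    """Pick the VM count that meets a deadline at the lowest billed cost.

    Toy model (all assumptions, not taken from the article):
      - the job splits evenly across VMs (perfect parallelism),
      - every VM is identical and is rented for the whole run,
      - the provider bills each VM per started hour.
    Returns (vm_count, billed_cost), or None if no count meets the deadline.
    """
    best = None
    for n in range(1, max_vms + 1):
        runtime = total_work_hours / n          # wall-clock time with n VMs
        if runtime > deadline_hours:
            continue                            # too slow, deadline missed
        billed_hours = math.ceil(runtime)       # partial hours round up
        cost = n * billed_hours * price_per_vm_hour
        # Strict comparison keeps the smallest VM count on cost ties.
        if best is None or cost < best[1]:
            best = (n, cost)
    return best


if __name__ == "__main__":
    # 10 VM-hours of work, 3-hour deadline, 1.0 currency unit per VM-hour.
    print(cheapest_vm_count(10, 3, 1.0, 10))
```

Even in this simplified model, renting more VMs is not always more expensive: hourly rounding means that four VMs (2.5 hours each, billed as 3) cost more than five VMs (exactly 2 hours each). Real provisioning decisions would also have to account for imperfect parallelism, heterogeneous VM types, and data transfer, which this sketch ignores.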


PROJECT TITLE: Adaptive Lower-level Driven Compaction to Optimize LSM-Tree Key-Value Stores ABSTRACT: Log-structured merge (LSM) tree key-value stores have been widely implemented in many NoSQL and SQL systems. These stores
PROJECT TITLE: Optimizing Performance of Co-Existing Underlay Secondary Networks - 2018 ABSTRACT: In this project, we analyze total throughput and (asymptotic) total ergodic rate performance of two co-existing downlink
PROJECT TITLE: Optimizing Internet Transit Routing for Content Delivery Networks - 2018 ABSTRACT: Content delivery networks (CDNs) maintain multiple transit routes from content distribution servers to eyeball ISP networks that
PROJECT TITLE: A Ternary Unification Framework for Optimizing TCAM-Based Packet Classification Systems - 2018 ABSTRACT: Packet classification is the key mechanism for enabling many networking and security services. Ternary
PROJECT TITLE: Optimizing for Tail Sojourn Times of Cloud Clusters - 2018 ABSTRACT: A common pitfall when hosting applications in today's cloud environments is that virtual servers often experience varying execution speeds
