PROJECT TITLE:
HDM: A Composable Framework for Big Data Processing
Frameworks such as MapReduce and Spark have been established in recent years to make constructing big data programs and applications easier. However, jobs in these frameworks are only loosely specified and are bundled as executable JARs, with none of their functionality exposed or explained.
As a result, deployed jobs are not naturally composable or reusable for future development, and it is difficult to apply optimizations across the data flow of job sequences and pipelines. In this work, we discuss the Hierarchically Distributed Data Matrix (HDM), a functional, strongly-typed data representation for building composable big data applications.
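The idea of a functional, strongly-typed, composable data representation can be illustrated with a minimal local sketch. The class name `HDM` and its methods below are hypothetical stand-ins chosen for illustration, not the framework's actual API: each transformation returns a new typed node, so pipelines built by different authors can be composed directly and their dependency structure stays visible to an optimizer.

```java
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Illustrative sketch of an HDM-style node: a lazily evaluated, strongly-typed
// collection whose transformations compose into a visible dependency chain.
final class HDM<A> {
    private final Supplier<List<A>> compute;

    private HDM(Supplier<List<A>> compute) { this.compute = compute; }

    static <A> HDM<A> parallelize(List<A> data) {
        return new HDM<>(() -> data);
    }

    // Transformations return new HDM nodes rather than executing eagerly,
    // so independently written pipelines remain composable and reusable.
    <B> HDM<B> map(Function<A, B> f) {
        return new HDM<>(() -> compute.get().stream().map(f).collect(Collectors.toList()));
    }

    HDM<A> filter(Predicate<A> p) {
        return new HDM<>(() -> compute.get().stream().filter(p).collect(Collectors.toList()));
    }

    // Actions such as reduce trigger evaluation of the composed chain.
    A reduce(A identity, BinaryOperator<A> op) {
        return compute.get().stream().reduce(identity, op);
    }
}

public class Demo {
    public static void main(String[] args) {
        int total = HDM.parallelize(List.of(1, 2, 3, 4, 5))
                       .map(x -> x * x)          // 1, 4, 9, 16, 25
                       .filter(x -> x % 2 == 1)  // 1, 9, 25
                       .reduce(0, Integer::sum);
        System.out.println(total); // prints 35
    }
}
```

Because every intermediate value is a typed `HDM<B>` rather than an opaque packaged job, an optimizer can inspect and rewrite the chain (for example, fusing `map` and `filter`) before any data is processed.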
HDM is accompanied by a runtime framework that supports the execution, integration, and management of applications on distributed infrastructures. Based on the functional data dependency graph of HDM, multiple optimizations are applied to improve the performance of HDM jobs.
Compared with the current state of the art, Apache Spark, our experimental results show that these optimizations reduce job completion time by 10 to 40% across various types of applications.