On Fault Tolerance for Distributed Iterative Dataflow Processing - 2017


Large-scale graph and machine learning analytics widely use distributed iterative processing. Typically, these analytics are part of a comprehensive workflow that includes data preparation, model building, and model evaluation. General-purpose distributed dataflow frameworks execute all steps of such workflows holistically. This holistic view enables these systems to reason about and automatically optimize the entire pipeline. However, graph and machine learning analytics are known to incur long runtimes, since they require multiple passes over the data until convergence is reached. Fault tolerance and fast recovery from any intermittent failure are therefore important for efficient analysis. In this paper, we propose novel fault-tolerance mechanisms for graph and machine learning analytics that run on distributed dataflow systems. We seek to reduce checkpointing cost and shorten failure recovery times. For graph processing, instead of writing checkpoints that block downstream operators, our mechanism writes checkpoints in an unblocking manner that does not break pipelined tasks. In contrast to the typical approach to unblocking checkpointing (e.g., managing checkpoints independently for immutable datasets), we inject the checkpoints of mutable datasets into the iterative dataflow itself. Hence, our mechanism is iteration-aware by design. This simplifies the system architecture and facilitates coordinating checkpoint creation during iterative graph processing. Moreover, we achieve rapid restarts via confined recovery: by exploiting the fact that log files exist locally on healthy nodes, we avoid a complete recomputation from scratch. Furthermore, we propose replica recovery for machine learning algorithms, whereby we use a broadcast variable that enables us to recover quickly without introducing any checkpoints.
To evaluate our fault-tolerance strategies, we conduct both a theoretical study and experimental analyses on Apache Flink, and find that they outperform blocking checkpointing and complete recovery.
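As a rough illustration of the confined-recovery idea described in the abstract, the toy single-process simulation below iterates over partitioned state, checkpoints every k supersteps, and on a simulated partition failure rolls back and replays only the failed partition while healthy partitions keep their current state. This is an invented sketch, not the Apache Flink implementation; all names (e.g. `run_with_confined_recovery`) and the counter-based "computation" are assumptions made for the example.

```python
import copy

def run_with_confined_recovery(initial, step, iterations,
                               checkpoint_every, fail_at=None, fail_part=None):
    """Toy simulation of confined recovery: on failure, only the failed
    partition is restored from its last checkpoint and replays the lost
    supersteps; healthy partitions are not rolled back."""
    state = list(initial)
    checkpoints = {0: copy.deepcopy(state)}  # superstep -> full snapshot
    for it in range(1, iterations + 1):
        state = [step(p) for p in state]     # one superstep on every partition
        if it % checkpoint_every == 0:
            checkpoints[it] = copy.deepcopy(state)
        if fail_at == it:
            last = max(k for k in checkpoints if k <= it)
            # Confined recovery: restore only the failed partition ...
            recovered = checkpoints[last][fail_part]
            # ... and replay its lost supersteps (standing in for the
            # log-based replay on healthy nodes described in the paper).
            for _ in range(it - last):
                recovered = step(recovered)
            state[fail_part] = recovered
    return state

# Example: each partition's state is a counter incremented per superstep.
# A failure of partition 1 at superstep 7 is healed from the checkpoint
# taken at superstep 6, so the final result matches the failure-free run.
final = run_with_confined_recovery([0, 0, 0], lambda x: x + 1,
                                   iterations=10, checkpoint_every=2,
                                   fail_at=7, fail_part=1)
```

The point of the sketch is only that the recovered run converges to the same state as a failure-free run while recomputing a single partition, rather than restarting the whole iteration from scratch as complete recovery would.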


PROJECT TITLE: Enhancing Fault Tolerance and Resource Utilization in Unidirectional Quorum-Based Cycle Routing - 2018. ABSTRACT: Cycle-based optical network routing, whether using synchronous optical networking rings or p-cycles,
PROJECT TITLE: Faultprog: Testing the Accuracy of Binary-Level Software Fault Injection - 2018. ABSTRACT: Off-The-Shelf (OTS) software components are the cornerstone of modern systems, including safety-critical ones. However,
PROJECT TITLE: Symbolic Synthesis of Timed Models with Strict 2-Phase Fault Recovery - 2018. ABSTRACT: In this article, we focus on efficient synthesis of fault-tolerant timed models from their fault-intolerant versions.
PROJECT TITLE: Fault Space Transformation: A Generic Approach to Counter Differential Fault Analysis and Differential Fault Intensity Analysis on AES-like Block Ciphers - 2017. ABSTRACT: Classical fault attacks, such as differential
PROJECT TITLE: Fault Tolerant Logic Cell FPGA - 2017. ABSTRACT: A fault-tolerant logic cell for LUT-based FPGAs is proposed, according to the concept of the functionally complete tolerant (FCT) element. The FCT component (a logic element with
