An Iterative MapReduce Based Frequent Subgraph Mining Algorithm


Frequent subgraph mining (FSM) is an important task for exploratory data analysis on graph data. Over the years, many algorithms have been proposed to solve this task. These algorithms assume that the data structure of the mining task is small enough to fit in the main memory of a computer. However, as real-world graph data grows in both size and quantity, this assumption no longer holds. To overcome this, some graph database-centric methods have been proposed in recent years for solving FSM; however, a distributed solution using the MapReduce paradigm has not been explored extensively. Since MapReduce is becoming the de facto paradigm for computation on massive data, an efficient FSM algorithm on this paradigm is in high demand. In this work, we propose a frequent subgraph mining algorithm called FSM-H, which uses an iterative MapReduce-based framework. FSM-H is complete, as it returns all the frequent subgraphs for a given user-defined support, and it is efficient, as it applies all the optimizations that the latest FSM algorithms adopt. Our experiments with real-life and large synthetic datasets validate the effectiveness of FSM-H for mining frequent subgraphs from large graph datasets. The source code of FSM-H is available from software/
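
To make the idea of support counting over partitioned graph data concrete, the following is a minimal single-machine sketch, not the FSM-H implementation. It assumes a toy graph database of undirected, labeled edges and handles only the first step of a level-wise search (size-1 patterns, i.e. single edges); an iterative miner would presumably extend the surviving patterns by one edge per round. The names GRAPHS, canonical, map_phase, reduce_phase and frequent_edges are hypothetical and chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical toy graph database: graph id -> list of labeled edges
# (endpoint_label, edge_label, endpoint_label). Edges are undirected.
GRAPHS = {
    "g1": [("A", "x", "B"), ("B", "y", "C")],
    "g2": [("A", "x", "B"), ("A", "x", "B"), ("C", "y", "B")],
    "g3": [("B", "y", "C"), ("A", "z", "C")],
}


def canonical(edge):
    """Order the endpoint labels so each undirected edge has one key."""
    u, label, v = edge
    return (min(u, v), label, max(u, v))


def map_phase(partition):
    """Emit (pattern, graph_id) once per graph that contains the pattern."""
    for gid, edges in partition.items():
        for pattern in {canonical(edge) for edge in edges}:
            yield pattern, gid


def reduce_phase(pairs, min_support):
    """Group by pattern; keep patterns found in >= min_support graphs."""
    occurrences = defaultdict(set)
    for pattern, gid in pairs:
        occurrences[pattern].add(gid)
    return {
        pattern: len(gids)
        for pattern, gids in occurrences.items()
        if len(gids) >= min_support
    }


def frequent_edges(graphs, num_partitions, min_support):
    """Split the database, run the map step per partition, then reduce."""
    ids = sorted(graphs)
    partitions = [
        {gid: graphs[gid] for gid in ids[i::num_partitions]}
        for i in range(num_partitions)
    ]
    pairs = [pair for part in partitions for pair in map_phase(part)]
    return reduce_phase(pairs, min_support)


if __name__ == "__main__":
    print(frequent_edges(GRAPHS, num_partitions=2, min_support=2))
```

Support here is the number of graphs that contain a pattern, so the repeated edge in g2 is counted once. The result does not depend on how the graphs are split across partitions, because the reduce step merges graph ids per pattern.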


