PROJECT TITLE: Efficient Distributed Clustering Algorithms on Star-Schema Heterogeneous Graphs

ABSTRACT: Graphs are a useful modeling tool for a wide variety of datasets, including data collected from social media platforms and bibliographic databases. Clustering such graphs can provide valuable insights into the organization of the data. The quality of the clustering can be improved by taking node attributes into account, which gives rise to attributed graphs. In practice, existing methods for clustering attributed graphs tend to treat attribute similarity and structural similarity separately. In this paper, we represent attributed graphs as star-schema heterogeneous graphs, in which attributes are modeled as different types of graph nodes. This makes it possible to use personalized PageRank (PPR) as a unified distance measure that captures both the structural and the attribute similarity between two nodes. We cluster the data using DBSCAN and iteratively update the edge weights to balance the relative importance of the different attributes. The ever-increasing amounts of data available today challenge traditional clustering algorithms, so a distributed method is needed. We therefore use Blogel, a widely used distributed graph computing system. On top of this system, we develop four exact and approximate methods that enable efficient PPR score computation whenever edge weights are updated. To improve the efficiency of the clustering process, we propose a simple yet effective entropy-based method for updating the edge weights. In addition, we present a game-theory-based method that enables a trade-off between result quality and efficiency.
Extensive experiments on real-world datasets offer insights into the usefulness and practicality of our methods.
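
To make the modeling idea in the abstract more concrete, the following is a minimal single-machine Python sketch of how an attributed graph might be turned into a star-schema heterogeneous graph and how personalized PageRank could then be computed on it by plain power iteration. It is not the authors' implementation and does not use Blogel. The function names `build_star_schema` and `personalized_pagerank`, the restart probability of 0.15, the per-type edge weights, and the toy data are all assumptions made for illustration. DBSCAN and the entropy-based weight update are left out.

```python
from collections import defaultdict


def build_star_schema(edges, attributes, type_weights):
    """Build a weighted, undirected adjacency map (hypothetical helper).

    edges:        iterable of (u, v) structural links between entity nodes
    attributes:   dict entity -> {attribute_type: value}
    type_weights: dict edge_type -> positive weight; "link" is the
                  structural edge type, other keys are attribute types
    Each distinct (attribute_type, value) pair becomes its own node.
    """
    adj = defaultdict(dict)
    for u, v in edges:
        adj[u][v] = type_weights["link"]
        adj[v][u] = type_weights["link"]
    for entity, attrs in attributes.items():
        for attr_type, value in attrs.items():
            attr_node = (attr_type, value)
            adj[entity][attr_node] = type_weights[attr_type]
            adj[attr_node][entity] = type_weights[attr_type]
    return adj


def personalized_pagerank(adj, source, alpha=0.15, iterations=100):
    """Approximate PPR scores from `source` by power iteration.

    At each step the walker restarts at `source` with probability alpha,
    otherwise follows an edge with probability proportional to its weight.
    Assumes every node has at least one positively weighted edge.
    """
    scores = {node: 0.0 for node in adj}
    scores[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in adj}
        nxt[source] = alpha
        for u, neighbors in adj.items():
            total = sum(neighbors.values())
            for v, w in neighbors.items():
                nxt[v] += (1.0 - alpha) * scores[u] * w / total
        scores = nxt
    return scores


# Toy data (assumed): a-b and c-d are linked; a and c share a topic,
# as do b and d.
edges = [("a", "b"), ("c", "d")]
attributes = {
    "a": {"topic": "db"},
    "b": {"topic": "ml"},
    "c": {"topic": "db"},
    "d": {"topic": "ml"},
}

equal = personalized_pagerank(
    build_star_schema(edges, attributes, {"link": 1.0, "topic": 1.0}), "a"
)
heavy = personalized_pagerank(
    build_star_schema(edges, attributes, {"link": 1.0, "topic": 3.0}), "a"
)

# c has no direct link to a but shares its topic, so it scores above d.
print(equal["c"] > equal["d"])
# Raising the weight of "topic" edges moves more PPR mass from a to c.
print(heavy["c"] > equal["c"])
```

In this toy setup, changing a single per-type weight shifts how much attribute similarity counts relative to link structure, which is the kind of knob the iterative edge-weight update in the abstract adjusts. In a full pipeline the PPR scores would be converted into distances and passed to a density-based clusterer such as DBSCAN.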