PROJECT TITLE :

A Review for Weighted MinHash Algorithms

ABSTRACT:

Computing data similarity (or distance) is a fundamental research topic that underpins many high-level applications in Machine Learning and Data Mining that rely on similarity measures. In large-scale real-world scenarios, however, exact similarity computation has become challenging because of the "3V" nature of Big Data: volume, velocity, and variety. In this setting, hashing techniques have been shown, both in theory and in practice, to perform similarity estimation efficiently. MinHash is currently one of the most popular methods for quickly estimating the Jaccard similarity of binary sets, and weighted MinHash generalizes it to estimate the generalized Jaccard similarity of weighted sets. This review classifies and discusses existing work on weighted MinHash algorithms, grouping them into quantization-based approaches, "active index"-based approaches, and others. We also trace the development and inherent connections of the weighted MinHash algorithms, from the integer weighted MinHash algorithms to the real-valued ones. In addition, we have developed a Python toolbox for the algorithms and released it on our GitHub. Finally, we conduct a comprehensive experimental study of the standard MinHash algorithm and the weighted MinHash algorithms, evaluating them on similarity estimation error and on an information retrieval task.
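To make the quantities in the abstract concrete, the following is a minimal Python sketch, not code from the authors' toolbox. It computes the exact Jaccard and generalized Jaccard similarities, and illustrates the simplest quantization idea for integer weights: each element with weight w is expanded into w distinct sub-elements, after which standard MinHash applies. All function names are hypothetical, the weights are assumed to be non-negative integers, and the salted BLAKE2b hash is only a stand-in for a family of random hash functions.

```python
import hashlib
import random


def jaccard(a, b):
    """Exact Jaccard similarity of two binary sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def generalized_jaccard(s, t):
    """Exact generalized Jaccard: sum of min weights / sum of max weights."""
    keys = set(s) | set(t)
    num = sum(min(s.get(k, 0), t.get(k, 0)) for k in keys)
    den = sum(max(s.get(k, 0), t.get(k, 0)) for k in keys)
    return num / den if den else 1.0


def expand_integer_weights(weights):
    """Assumed quantization step: element k with integer weight w
    becomes the w sub-elements (k, 0), ..., (k, w - 1)."""
    return {(k, i) for k, w in weights.items() for i in range(int(w))}


def _hash(salt, item):
    """Deterministic 64-bit hash of (salt, item); a stand-in hash family."""
    data = repr((salt, item)).encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=256, seed=0):
    """Standard MinHash: keep the minimum hash value per hash function.
    Assumes `items` is non-empty."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(64) for _ in range(num_hashes)]
    return [min(_hash(salt, x) for x in items) for salt in salts]


def estimate_similarity(sig_a, sig_b):
    """Fraction of signature positions that collide."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    s = {"a": 2, "b": 1}
    t = {"a": 1, "b": 3}
    exact = generalized_jaccard(s, t)
    sig_s = minhash_signature(expand_integer_weights(s))
    sig_t = minhash_signature(expand_integer_weights(t))
    print("exact:", exact, "estimate:", estimate_similarity(sig_s, sig_t))
```

Expanding weights this way costs time proportional to the total weight, which is one reason more refined schemes, such as the "active index"-based approaches mentioned above, are of interest; those are not implemented here.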




