Gradual Machine Learning for Entity Resolution


Entity resolution (ER) is typically framed as a classification problem, and it remains difficult on real data because dirty values are prevalent. State-of-the-art ER solutions are built on a variety of learning models, most prominently deep neural networks, which require large amounts of accurately labeled training data. Unfortunately, high-quality labeled data usually demand expensive manual work and are therefore not readily available in many real-world scenarios. In this paper, we propose a new learning paradigm for ER, called gradual machine learning, that aims to enable efficient machine labeling without any manual labeling effort. It begins with the easy instances in a task, which the machine can label automatically with high accuracy, and then labels the more difficult instances through iterative factor graph inference. The difficult instances are labeled gradually, in small stages, based on the estimated evidential certainty provided by the easier instances that have already been labeled. Our extensive experiments on real data show that the proposed approach performs considerably better than its unsupervised alternatives and is highly competitive with state-of-the-art supervised techniques. Using ER as a test case, we show that gradual machine learning is a promising paradigm that could potentially be applied to other challenging classification tasks that require extensive labeling effort.
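
The easy-first, then gradual, labeling loop described in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation. It assumes each candidate record pair is summarized by a single similarity score in [0, 1], and it replaces iterative factor graph inference with a much simpler kernel-weighted vote over the pairs labeled so far. The function name `gradual_label`, the thresholds `easy_low` and `easy_high`, and the `bandwidth` parameter are all hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Tuple


def gradual_label(
    similarities: List[float],
    easy_low: float = 0.2,    # assumed threshold: at or below this is an easy non-match
    easy_high: float = 0.8,   # assumed threshold: at or above this is an easy match
    bandwidth: float = 0.1,   # assumed kernel width for weighting labeled evidence
) -> Tuple[Dict[int, int], List[int]]:
    """Label record pairs (1 = match, 0 = non-match) in easy-first order.

    Returns the final labels and the order in which hard pairs were labeled.
    """
    # Stage 1: easy instances are labeled directly from their similarity.
    labels: Dict[int, int] = {}
    for i, s in enumerate(similarities):
        if s >= easy_high:
            labels[i] = 1
        elif s <= easy_low:
            labels[i] = 0
    if not labels:
        raise ValueError("no easy instances found; adjust the thresholds")

    # Stage 2: label one hard instance per step, always picking the one whose
    # evidence from already-labeled instances is the most certain.
    # NOTE: this weighted vote is a stand-in for factor graph inference.
    unlabeled = set(range(len(similarities))) - set(labels)
    order: List[int] = []
    while unlabeled:
        best_idx, best_p, best_certainty = -1, 0.5, -1.0
        for i in sorted(unlabeled):
            weights = {
                j: math.exp(-abs(similarities[i] - similarities[j]) / bandwidth)
                for j in labels
            }
            total = sum(weights.values())
            p_match = sum(w for j, w in weights.items() if labels[j] == 1) / total
            certainty = abs(p_match - 0.5)
            if certainty > best_certainty:
                best_idx, best_p, best_certainty = i, p_match, certainty
        labels[best_idx] = 1 if best_p >= 0.5 else 0
        unlabeled.remove(best_idx)
        order.append(best_idx)  # newly labeled pairs become evidence for later steps
    return labels, order
```

Calling `gradual_label([0.95, 0.9, 0.1, 0.05, 0.75, 0.3])` labels the first four pairs directly in the easy stage and then resolves the remaining two one at a time, with each newly labeled pair contributing evidence to the next step. A real system would draw evidence from richer features than a single similarity score.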
