Cleaning Uncertain Data with Crowdsourcing - a General Model with Diverse Accuracy Rates


Data uncertainty has become a significant challenge for database management systems because errors are widespread across many applications. Probabilistic databases address this by storing uncertain data and by supporting queries that return answers with confidence values. However, when uncertainty propagates through a system, the results of a query or mining process may no longer be reliable. In this paper, we harness crowdsourcing to improve the quality of uncertain data by designing a set of Human Intelligence Tasks (HITs) that are posed to a crowd. In particular, we account for the fact that a crowd consists of workers whose accuracy rates differ. Our goal is to achieve the highest possible data quality while reducing the total number of HITs. This optimization is non-trivial: two challenges make choosing the best set of HITs computationally very expensive. First, the crowd may give incorrect answers, with probabilities that vary from worker to worker. Second, the HITs decomposed from uncertain data are often strongly correlated with one another. We address these challenges with an efficient approximation algorithm and an effective heuristic solution, designed specifically for crowds with diverse individual accuracy rates. To improve efficiency further, we derive tight lower and upper bounds that enable effective filtering and estimation. We evaluate our solutions through extensive experiments on a simulated crowd as well as on a real crowdsourcing platform.
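
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes each uncertain item is a yes/no fact with a prior belief `q`, that each worker answers correctly with a known accuracy, and that HITs are independent (the paper explicitly says they are correlated, which this sketch ignores). The function names, the entropy-based quality measure, and the greedy selection rule are all illustrative assumptions.

```python
import math


def entropy(q):
    """Binary entropy (in bits) of a belief q that an item is correct."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))


def posterior(q, accuracy, answer_yes):
    """Bayes update of belief q after one worker answer.

    Assumption: the worker reports the truth with probability `accuracy`.
    """
    like_true = accuracy if answer_yes else 1 - accuracy
    like_false = 1 - accuracy if answer_yes else accuracy
    evidence = q * like_true + (1 - q) * like_false
    return q * like_true / evidence if evidence > 0 else q


def expected_gain(q, accuracy):
    """Expected entropy reduction from asking one worker about one item."""
    p_yes = q * accuracy + (1 - q) * (1 - accuracy)
    after = (p_yes * entropy(posterior(q, accuracy, True))
             + (1 - p_yes) * entropy(posterior(q, accuracy, False)))
    return entropy(q) - after


def pick_hits(beliefs, accuracies, budget):
    """Greedily pick up to `budget` (item, worker) pairs, one per item.

    Hypothetical heuristic: rank all pairs by expected gain, treating
    HITs as independent, and take the best pair for each item.
    """
    scored = [(expected_gain(q, a), item, worker)
              for item, q in beliefs.items()
              for worker, a in accuracies.items()]
    scored.sort(key=lambda s: s[0], reverse=True)
    chosen, used = [], set()
    for _, item, worker in scored:
        if len(chosen) >= budget:
            break
        if item not in used:
            chosen.append((item, worker))
            used.add(item)
    return chosen
```

Under these assumptions, a worker with accuracy 0.5 contributes no expected gain, and the most uncertain items paired with the most accurate workers are selected first. For example, `pick_hits({"t1": 0.5, "t2": 0.95}, {"w1": 0.9, "w2": 0.6}, budget=1)` selects the pair `("t1", "w1")`.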


