Multi-Dimensional Randomized Response

PROJECT TITLE:

Multi-Dimensional Randomized Response

ABSTRACT:

In the data world, many controllers collect information on individual subjects, and not all of them can be trusted. To preserve her privacy and, more generally, her informational self-determination, the individual needs to be given control over her own data. Local anonymization provides maximum agency because it lets each person anonymize her own data before submitting them to a data controller. Randomized response (RR) is a local anonymization approach that can produce multi-dimensional full sets of anonymized microdata suitable for exploratory analysis and machine learning, because an unbiased estimate of the distribution of the individuals' true data can be obtained by pooling their randomized data. In addition, RR offers rigorous privacy guarantees. The main limitation of RR is the curse of dimensionality when it is applied to several attributes: the accuracy of the estimated true data distribution degrades quickly as the number of attributes grows. We propose several complementary methods to mitigate the dimensionality problem. First, we present two basic protocols, separate RR on each attribute and joint RR for all attributes, and discuss their limitations. Next, we introduce an algorithm that forms clusters of attributes so that attributes in different clusters can be treated as independent and joint RR can be performed within each cluster. After that, we present an adjustment algorithm for the randomized data set that repairs some of the accuracy lost by assuming independence between attributes when RR is applied separately to each attribute, or between clusters when cluster-wise RR is used. We also present empirical work to illustrate the proposed methods.
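To make the estimation step concrete, the following is a minimal sketch of RR on a single categorical attribute. It is not the authors' implementation. The function names (rr_matrix, randomize, estimate_distribution), the "keep the true value with probability p, otherwise report another category uniformly" design, and the toy distribution are all assumptions chosen for illustration. Each respondent randomizes her own value locally, and the analyst recovers an unbiased estimate of the true distribution by inverting the randomization matrix on the pooled reports.

```python
import numpy as np

def rr_matrix(k, p):
    """Hypothetical k x k RR matrix: report the true category with
    probability p, otherwise one of the other k-1 categories uniformly."""
    P = np.full((k, k), (1.0 - p) / (k - 1))
    np.fill_diagonal(P, p)
    return P

def randomize(values, P, rng):
    """Each respondent replaces her true value u with a draw from row P[u]."""
    k = P.shape[0]
    return np.array([rng.choice(k, p=P[u]) for u in values])

def estimate_distribution(reported, P):
    """Unbiased estimate of the true distribution pi.
    The reported distribution satisfies lambda = P^T pi, so we solve
    P^T pi = lambda_hat, where lambda_hat holds the observed frequencies.
    Individual entries can fall slightly outside [0, 1] on small samples."""
    k = P.shape[0]
    lam_hat = np.bincount(reported, minlength=k) / len(reported)
    return np.linalg.solve(P.T, lam_hat)

# Toy data (assumed for illustration): 3 categories, 20,000 respondents.
rng = np.random.default_rng(0)
true_pi = np.array([0.6, 0.3, 0.1])
true_values = rng.choice(3, size=20000, p=true_pi)

P = rr_matrix(3, 0.7)
reported = randomize(true_values, P, rng)
pi_hat = estimate_distribution(reported, P)
print("estimated distribution:", pi_hat)
```

Under this sketch, joint RR over several attributes amounts to treating each combination of attribute values as one category, so the number of categories is the product of the attributes' category counts. The randomization matrix grows accordingly and each cell receives fewer reports, which is one way to see why accuracy degrades as attributes are added.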
