Protecting User Data Through Local Differential Privacy Protocols
Disciplines
Computer Sciences | Databases and Information Systems | Information Security | Theory and Algorithms
Abstract (300 words maximum)
When we entered the internet age, few foresaw the immense struggle for data privacy it would bring. As social media has become more widespread, we have increasingly documented every part of our daily lives and nearly every bit of information about ourselves. Experts are constantly looking for new ways to protect this data. We need a way to deliver data to the groups that want it without disclosing users' true identities. This project explores a privacy-preserving technique for collecting data while protecting each user's privacy. Such techniques are known as Local Differential Privacy protocols, and in this project we implemented one of them, Optimized Local Hashing (OLH). The algorithm has three major steps: hashing, perturbation, and aggregation. Hashing maps each answer choice into a bucket, perturbation randomly moves the choice into another bucket, and aggregation computes and returns the expected values. We continue to run this process, analyze the data it produces, and compare it with data collected without any perturbation. The two sets of data are very similar overall. The perturbed data is more spread out; however, this does not mean it is inaccurate or a poor representation of the population. We are still working to interpret the data we receive. As studies continue and this technique is refined, we will be able to safely collect information from users to improve their experience while simultaneously protecting their identities.
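The three steps named in the abstract (hashing, perturbation, aggregation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the function names, the SHA-256-based hash, the privacy parameter `epsilon`, and the bucket count g = round(e^epsilon) + 1 are all choices made here for the example.

```python
import hashlib
import math
import random


def olh_hash(value, seed, g):
    """Hash `value` into one of g buckets using a per-user seed (assumed hash family)."""
    digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % g


def olh_params(epsilon):
    """Return (g, p): bucket count and probability of keeping the true bucket."""
    g = max(2, int(round(math.exp(epsilon))) + 1)
    p = math.exp(epsilon) / (math.exp(epsilon) + g - 1)
    return g, p


def olh_perturb(value, epsilon, rng):
    """Client side: hash the answer, then randomly move it to another bucket."""
    g, p = olh_params(epsilon)
    seed = rng.getrandbits(32)
    bucket = olh_hash(value, seed, g)
    if rng.random() >= p:
        # With probability 1 - p, report a different bucket chosen uniformly.
        other = rng.randrange(g - 1)
        bucket = other if other < bucket else other + 1
    return seed, bucket


def olh_aggregate(reports, domain, epsilon):
    """Server side: estimate how many users hold each value in `domain`."""
    g, p = olh_params(epsilon)
    n = len(reports)
    q = 1.0 / g  # chance that a report supports a value the user does not hold
    estimates = {}
    for v in domain:
        # A report "supports" v if v hashes to the reported bucket under that seed.
        support = sum(1 for seed, bucket in reports if olh_hash(v, seed, g) == bucket)
        estimates[v] = (support - n * q) / (p - q)
    return estimates
```

In this sketch, each user would call `olh_perturb` once and send only the `(seed, bucket)` pair; the collector then passes all pairs to `olh_aggregate`. Individual estimates are noisy and can even be negative for rare answers, which is consistent with the perturbed data appearing more spread out than the original.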
Academic department under which the project should be listed
CCSE - Computer Science
Primary Investigator (PI) Name
Xinyue Zhang