Abstract (300 words maximum)
Online survey platforms enable faster and more cost-effective data collection across public, private, and academic sectors, but they also introduce challenges related to fraud. The ever-evolving nature of fraud requires constant adaptation to preserve data quality. Our prior research examined data quality among the three leading pay-for-data collection platforms and found that only 10.2% of MTurk respondents provided acceptable-quality data. This study aims to demonstrate how follow-up verification surveys can improve the security and efficiency of incentive payouts by verifying respondent demographic information to reduce fraud.
A new survey was disseminated via MTurk with updated quality checks to refine fraud detection, yielding over 5,000 responses. The survey administrator received emails regarding survey compensation issues, prompting the creation of a standardized follow-up survey, sent in reply to those emails, designed to verify previously reported demographic information. Responses from the original survey were match-merged with the follow-up survey data, and descriptive and comparative analyses were conducted to identify inconsistencies.
Preliminary findings suggest that follow-up surveys can successfully identify fraud and potentially support a more rigorous compensation system. While few discrepancies in zip code, birth year, or email were found among respondents who completed the follow-up survey, substantial evidence of fraud was found among those who initiated contact but failed to complete it. After omitting duplicate emails and survey submissions, 28 emails were received regarding issues with incentives. Of those 28 respondents, only 14 (50%) completed the follow-up survey. Among completers, one had a zip code discrepancy, two had birth year discrepancies, and one had an email inconsistency. Of the 14 who did not complete the follow-up survey, nearly half of the distinct MTurk Worker IDs shared the same email address in the original survey. These findings have implications for clinicians and other professionals relying on online surveys for data collection.
Academic department under which the project should be listed
CCSE - Data Science and Analytics
Primary Investigator (PI) Name
Kevin Gittner
Think You Can Fake It? We'll Make You Verify It: Utilizing Follow-up Surveys to Detect and Prevent Fraud in Online Crowdsourced Data
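The match-merge and discrepancy check described in the abstract might be sketched as follows. This is a minimal illustration, not the study's actual code: the key and column names (`worker_id`, `zip_code`, `birth_year`, `email`) and the toy records are hypothetical placeholders.

```python
import pandas as pd

def flag_discrepancies(original: pd.DataFrame, followup: pd.DataFrame,
                       key: str = "worker_id",
                       fields: tuple = ("zip_code", "birth_year", "email")) -> pd.DataFrame:
    """Match-merge original and follow-up responses on a shared key,
    then flag any field whose follow-up value differs from the original."""
    merged = original.merge(followup, on=key, suffixes=("_orig", "_follow"))
    for f in fields:
        merged[f + "_mismatch"] = merged[f + "_orig"] != merged[f + "_follow"]
    merged["any_mismatch"] = merged[[f + "_mismatch" for f in fields]].any(axis=1)
    return merged

# Hypothetical toy data, not records from the study
orig = pd.DataFrame({
    "worker_id": ["A1", "A2", "A3"],
    "zip_code": ["30144", "30060", "30080"],
    "birth_year": [1990, 1985, 1978],
    "email": ["a@x.com", "b@x.com", "c@x.com"],
})
follow = pd.DataFrame({
    "worker_id": ["A1", "A2", "A3"],
    "zip_code": ["30144", "30060", "30080"],
    "birth_year": [1990, 1986, 1978],            # A2 reports a different birth year
    "email": ["a@x.com", "b@x.com", "c@y.com"],  # A3 reports a different email
})

flags = flag_discrepancies(orig, follow)
print(flags.loc[flags["any_mismatch"], "worker_id"].tolist())  # → ['A2', 'A3']
```

An inner merge like this only covers respondents who completed the follow-up survey; respondents who initiated contact but never completed it (where the study found the strongest fraud signal) would have to be examined separately, for example by grouping original-survey Worker IDs on shared email addresses.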