What I Learned About Sample Access, Bots, and Data Quality in Social Science Research with Limited Resources

After receiving around 700 responses in less than two weeks, I thought I had nailed it!

But when I reviewed the survey responses, I was disappointed. It turned out that over 80% of my responses were machine-generated. I thought activating bot prevention tools would be enough, but clearly it was not.

It was a tough lesson that showed me how challenging it is to do social science research with limited resources, especially funding and connections. This post is for anyone navigating similar situations, so that you can plan ahead and avoid the pitfalls I experienced.

One: Choosing a research topic when you lack networks
Even before my proposal was approved, I was very concerned about data accessibility, given resource constraints like cost, time, and a lack of networks. I remember being so indecisive that I kept changing my research topic. Part of the reason was sample accessibility. Questions like “Would I be able to access my sample easily and independently? How feasible is it for me to recruit my respondents?” also shaped my research direction.

For example, studying how farmers learn and accept smart farming technology sounds interesting, but do I have a network among those farmers here in the U.S.? Are they accessible to me? The answer was no, so I pivoted.

Tip 1: If you lack existing connections, prioritize a research question whose sample is reachable and feasible to recruit with the resources you have.

I ended up studying public acceptance of a government broadband program, targeting low-income households. With this target, I could imagine several recruitment strategies to reach participants. The population of low-income households in the U.S. is large, so I felt confident that the sample was possible to reach, even without strong connections.

Two: Survey & funding limitations
I had settled on a topic that I could realistically do. The next challenge was how to reach the sample for my survey. Ideally, research should be funded.

Research funding is important in social science research, unless we conduct low-cost studies such as secondary data analysis or a systematic review. However, if we aim to collect primary data to understand people’s behavior, experiences, or perceptions, whether through quantitative or qualitative analysis, funding is crucial. Recruitment barriers caused by financial constraints affect not only the quality of the data but also the speed of the research.

Tip 2: Look for dissertation funding. If you are doing dissertation research, check your university and external sources for awards and grants, and stay updated on new calls.

But what if we don’t get any funding? What if we have only limited funding? That was my case, so I had to rethink my sampling strategy and find cost-effective approaches. The goal was to reach the sample size I needed with minimal resources, without sacrificing data quality.

Initially, I was going to spend $1,500 on data gathering through platforms like Prolific or MTurk. But after thinking it over carefully, I pivoted and took the following approaches:
(1) Recruiting participants on social media platforms like Reddit. I posted on some relevant subreddits. Some posts were accepted; others were taken down by moderators. I also posted on a survey exchange subreddit.
(2) Posting in Facebook Groups, especially groups relevant to my research, such as those about Social Security programs, food programs, and local/community events.
(3) Offering incentives: free digital products with Master Resell Rights (MRR) and Private Label Rights (PLR), as well as a raffle in which 10 people could each win a $10 gift card.

The result? I received a lot of survey responses, around 700 in less than two weeks (too good to be true, right?). But to my dismay, the quality of the data was very low: over 80% of the responses turned out to be from bots.

Even though I used Qualtrics with CAPTCHA and bot prevention activated, it could not block them effectively. This is where I realized the importance of adding an attention check question to my survey, such as “Please select ‘Never’ to show that you have read this question attentively.” Bots that generate answers randomly will most likely fail to answer this question correctly.
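Here is a minimal Python sketch of how such an attention check could be applied to exported responses. It is only an illustration: the field name `attention_check`, the `id` field, and the sample rows are hypothetical, and it assumes each response has already been exported from the survey platform as a simple key-value record.

```python
# Hypothetical field name and expected answer; adjust them to your own survey export.
ATTENTION_FIELD = "attention_check"
EXPECTED_ANSWER = "Never"


def passed_attention_check(response: dict) -> bool:
    """Return True if the respondent selected the instructed answer."""
    answer = response.get(ATTENTION_FIELD) or ""  # treat a missing answer as empty
    return answer.strip().lower() == EXPECTED_ANSWER.lower()


def split_by_attention_check(responses: list) -> tuple:
    """Split responses into (passed, failed) based on the attention check."""
    passed = [r for r in responses if passed_attention_check(r)]
    failed = [r for r in responses if not passed_attention_check(r)]
    return passed, failed


# Made-up example rows, not real survey data.
sample = [
    {"id": "R1", "attention_check": "Never"},
    {"id": "R2", "attention_check": "Often"},
    {"id": "R3", "attention_check": ""},
]
passed, failed = split_by_attention_check(sample)
print("passed:", [r["id"] for r in passed])
print("failed:", [r["id"] for r in failed])
```

Passing the check does not prove a response is human, so it works best as one filter among several rather than the only one.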

Tip 3: If you offer incentives on your surveys, you will most likely attract bots. Prevent them on two layers: in the survey platform, such as Qualtrics (e.g., bot detection, CAPTCHA, and ballot box stuffing prevention), and in your survey questions (attention checks).

Three: Cleaning bots out of the data
Even though I had activated the bot prevention features, Qualtrics detected only around 50% of the bot responses. After manually reviewing the data, I found that around 80% of the responses were actually from machines.

Tip 4: If you run surveys on social media platforms, always recheck the quality of your responses yourself, even if you use a platform with bot filters.

Bot and other non-human responses create noise and threaten data validity. Therefore, removing bot responses from the data was an important step in my methodology.
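To make the independent recheck more concrete, here is a rough Python sketch that flags suspicious responses for manual review. The post does not spell out the exact review criteria, so the three signals used here (a failed attention check, an implausibly fast completion time, and an open-ended answer repeated across responses) are assumptions chosen for illustration. All field names (`attention_check`, `duration_seconds`, `open_text`) and the 120-second threshold are hypothetical.

```python
from collections import Counter

# All field names and thresholds are hypothetical; tune them to your own survey.
MIN_DURATION_SECONDS = 120   # assumed lower bound for a plausible completion time
EXPECTED_ATTENTION = "Never"


def normalize(text) -> str:
    """Lowercase and collapse whitespace so near-identical answers match."""
    return " ".join((text or "").lower().split())


def flag_response(response: dict, text_counts: Counter) -> list:
    """Return the reasons a response looks suspicious (empty list if none)."""
    reasons = []
    answer = (response.get("attention_check") or "").strip().lower()
    if answer != EXPECTED_ATTENTION.lower():
        reasons.append("failed_attention_check")
    if float(response.get("duration_seconds") or 0) < MIN_DURATION_SECONDS:
        reasons.append("too_fast")
    text = normalize(response.get("open_text"))
    if text and text_counts[text] > 1:
        reasons.append("duplicate_open_text")
    return reasons


def review_responses(responses: list) -> dict:
    """Map each response id to its list of suspicion reasons."""
    text_counts = Counter(normalize(r.get("open_text")) for r in responses)
    return {r["id"]: flag_response(r, text_counts) for r in responses}


# Made-up example rows, not real survey data.
sample = [
    {"id": "R1", "attention_check": "Never", "duration_seconds": 300,
     "open_text": "I need cheaper internet at home."},
    {"id": "R2", "attention_check": "Often", "duration_seconds": 40,
     "open_text": "Good program"},
    {"id": "R3", "attention_check": "Never", "duration_seconds": 200,
     "open_text": "good  program"},
]
for response_id, reasons in review_responses(sample).items():
    print(response_id, reasons or "no flags")
```

Flags like these only narrow down which responses to read closely. A flagged response is a candidate for manual review, not automatic proof of a bot.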

Okay, let me close on a positive note.

Indeed, you can do much more with research funding; it opens up many possibilities. But even with limited resources, you can still gather meaningful data with careful planning, creative strategies, and meticulous checking. Do not stop; keep going. Every challenge you encounter will make you stronger and help you understand both good and bad research practices. It prepares you for an amazing research journey ahead. But sure, don’t stop seeking research grants, lol. When you receive research funding later, you’ll have the capacity to unlock even greater possibilities.
