Are Social Bots Ruining (and Running) Discourse Online?

I consider myself a savvy consumer of news. I read widely, subscribing to organizations across the political spectrum, including the Wall Street Journal, the Economist, the Financial Times, the Washington Post, the New York Times, and a slew of Substacks. By doing so I attempt to better uncover the “truthiness” of an issue. 

In an era marked by increasing political polarization and self-reinforcing echo chambers, it’s paramount that we not only read widely but also engage with high-quality news that can withstand strict scrutiny. And we must push back when we encounter articles or assertions that may feel true but lack real evidence for their claims.

I’m not perfect — far from it. I, too, have latched onto an article, scoop, or series that seemed rigorously reported and undoubtedly reinforced my prior beliefs. I closed the browser tab and felt more secure in my position and worldview. But I was wrong. I didn’t do the legwork required — I trusted but didn’t verify.

This brings me to this post. Since 2016 I’ve read a few dozen pieces that asserted, directly or indirectly, that social media discourse — and by extension, a massive misinformation scheme — is being driven by social bots. I’ll adopt Allem and Ferrara’s (2018) definition of a social bot: 

“Social bots are automated accounts that use artificial intelligence to steer discussions and promote specific ideas or products on social media such as Twitter and Facebook. To typical social media users browsing their feeds, social bots may go unnoticed as they are designed to resemble the appearance of human users (e.g., showing a profile photo and listing a name or location) and behave online in a manner similar to humans (e.g., retweeting or quoting others’ posts and liking or endorsing others’ tweets).”

To put a finer point on it: Social bots are not the same thing as troll farms or simple automated posting tools. Troll farms exist without question, and automated posting tools are useful for disseminating information; major news organizations use them to automatically post their articles on Twitter, Facebook, and other social media platforms. These tools are also widely available for personal use: I’ve previously used IFTTT to effortlessly post across platforms. But such automation is not, by definition, the kind of social bot being discussed and heralded as the harbinger of doom for public discourse and civil society in America. (Troll farms, and any other orchestrated misinformation campaign by domestic or foreign actors, are a threat to genuine discourse, but I do not focus on them here.)

These articles entice readers to imagine that our discourse is directed or unduly influenced by true social bots. Let me state my position clearly: I believe social bots are harmful to society, broadly. I find articles asserting that half of the accounts tweeting about Covid-19 are bots to be frightening but also unfairly hyperbolic, and in particular I take issue with the current methodologies that researchers use to arrive at these findings. (If my high school teachers or college professors are reading this, I apologize for burying the lede this far down.)

These methodologies, one in particular, are the focus of this post. How are researchers arriving at these conclusions? Are their methods sound and defensible? Are the tools we use to discern bots from humans reliable? That is to say: Do they accurately discriminate between bots and humans? If the answer is no, then we must (1) work to develop methods that are robust, valid, and accurate, and (2) interrogate our priors about who is doing what on social media and why.

Finally, I commit to conducting my own analysis using the 2022 midterm elections to test the theory that social bots have an outsized influence on public discourse, specifically our online discussions about politics.

Methods for Detecting Social Bots

Three major methods for detecting social bots on social media platforms exist today: 

  • The Oxford Method

  • The Berkeley/Swansea Approach

  • USC/Indiana’s Botometer

I’ll focus on Botometer’s methodology and accuracy. Researchers built a tool that allows users to input a username or a list of handles and returns a score meant to discriminate social bots from humans. Without getting too far into the weeds, they employed a supervised learning method, specifically random forest classification. The classification is based on more than 1,000 features grouped into six main classes: network features, user features, friends features, temporal features, content features, and sentiment features.
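To make that concrete, here is a minimal sketch of the general technique, a supervised random forest trained on per-account features. This is not Botometer’s actual code: the feature names, training rows, and labels below are invented placeholders, and a real system like Botometer uses over a thousand features and a far larger labeled dataset.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical training table: one row per account, one column per feature,
# plus a human-assigned label (1 = bot, 0 = human).
accounts = pd.DataFrame({
    "followers_count":      [12, 4500, 8, 900, 25, 3100],
    "friends_to_followers": [35.0, 0.4, 60.0, 1.1, 28.0, 0.7],
    "tweets_per_day":       [180, 6, 240, 12, 210, 9],
    "mean_sentiment":       [0.10, 0.35, 0.05, 0.40, 0.08, 0.30],
    "label":                [1, 0, 1, 0, 1, 0],
})

X = accounts.drop(columns="label")
y = accounts["label"]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

# Score a new, hypothetical account. predict_proba returns a bot-likelihood
# rather than a hard yes/no, which is roughly how Botometer reports results.
new_account = pd.DataFrame([{
    "followers_count": 30,
    "friends_to_followers": 20.0,
    "tweets_per_day": 150,
    "mean_sentiment": 0.1,
}])
print(clf.predict_proba(new_account)[0, 1])
```

The key point of a setup like this is that the classifier can only be as good as its labeled training data; accounts that look unlike anything in the training set get scored according to whatever the model has already absorbed.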

Before April 18, 2018, a list of Congress members’ Twitter handles was fed into Botometer. The result? A roughly Gaussian distribution of bot scores: the tool reported (and, based on its prior training, believed) that nearly one-half of Congress members were bots. While I disagree with many in Congress on most things, they are in fact not bots. A year later, Botometer performed remarkably better on the same list of Congress members, dropping the false positive rate from 47% to 0.4%. You may wonder how. They were manually added to the training data and tagged as human users. The tool did not suddenly become more accurate writ large; the authors patched over the issue by hand. Botometer’s performance in real-world scenarios likely remains just as poor.
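For reference, the false positive rate in that kind of audit is simple arithmetic: score a benchmark of accounts known to be human and count how many the tool calls bots. The scores and the 0.5 threshold below are invented for illustration; they are not the researchers’ actual numbers.

```python
# Hypothetical Botometer-style scores for accounts we already know are human.
known_human_scores = [0.62, 0.18, 0.55, 0.07, 0.71, 0.30]

threshold = 0.5  # accounts scoring at or above this are classified as bots
false_positives = sum(score >= threshold for score in known_human_scores)
false_positive_rate = false_positives / len(known_human_scores)

# Every account in this list is human, so every "bot" call is a false positive.
print(f"False positive rate on known humans: {false_positive_rate:.1%}")  # 50.0%
```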

At issue here is how researchers coded and trained Botometer. By design, Botometer incorporates a prior, an estimate of the baseline probability that a substantial number of social bots exist, and the previous build (April 2018) demonstrated as much.

In fact, if you navigate to Botometer and input President Biden’s Twitter account, the tool thinks he’s a bot. Sure, he has a team that manages his social media account, unlike President Trump, but that does not a social bot make.
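Anyone can reproduce this kind of single-account check. Below is a minimal sketch using the botometer Python package; the RapidAPI and Twitter credentials are placeholders, and the exact fields in the response (I print the “cap” scores) may vary by Botometer version.

```python
import botometer

# Placeholder credentials: a RapidAPI key for the Botometer endpoint plus
# standard Twitter app credentials.
rapidapi_key = "RAPIDAPI_KEY"
twitter_app_auth = {
    "consumer_key": "CONSUMER_KEY",
    "consumer_secret": "CONSUMER_SECRET",
    "access_token": "ACCESS_TOKEN",
    "access_token_secret": "ACCESS_TOKEN_SECRET",
}

bom = botometer.Botometer(wait_on_ratelimit=True,
                          rapidapi_key=rapidapi_key,
                          **twitter_app_auth)

# Check a single handle; swap in whichever account you want to score.
result = bom.check_account("@JoeBiden")

# The "complete automation probability" scores are one summary of how
# bot-like the tool believes the account to be.
print(result["cap"])
```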

Using the 2022 Midterm Election as a Case Study 

Working from the assumption that social bots are substantially affecting our public discourse, specifically how we engage with political issues online, I will use the 2022 midterm cycle as a natural experiment. On the day that primaries are held and winners declared later this year, I will download a list of all followers for each winning primary candidate in the 435 House and 34 Senate races. I will pull a similar list of followers on election night this November.
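Here is a sketch of what that collection step might look like, assuming the tweepy library, placeholder Twitter API credentials, and a hand-maintained list of winning candidates’ handles. The goal is simply to write a date-stamped snapshot of each candidate’s follower IDs on primary day and again on election night.

```python
import datetime
import json

import tweepy

# Placeholder Twitter credentials.
auth = tweepy.OAuth1UserHandler("CONSUMER_KEY", "CONSUMER_SECRET",
                                "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

# Hypothetical handles for winning primary candidates.
candidate_handles = ["house_candidate_one", "senate_candidate_two"]

snapshot = {}
for handle in candidate_handles:
    follower_ids = []
    # get_follower_ids is cursor-paginated; Cursor walks the pages for us.
    for page in tweepy.Cursor(api.get_follower_ids, screen_name=handle).pages():
        follower_ids.extend(page)
    snapshot[handle] = follower_ids

# Write one date-stamped snapshot per collection day (primary day, election night).
stamp = datetime.date.today().isoformat()
with open(f"followers_{stamp}.json", "w") as f:
    json.dump(snapshot, f)
```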

I’ll analyze the net new followers and score them using Botometer. With a list of users and their “how like a bot is this account” scores, I’ll sample the pool of accounts and manually analyze them to determine their humanity. Do they take selfies? Do they post about their friends? Do they tweet about innocuous things that are local to their communities (a snowstorm that brings I-95 to a halt, or a noisy parade of motorcycles going through an otherwise quiet North Dakota town)? Do they link to their LinkedIn, work, or other online presence that supports their humanity? Did they live-tweet their tears while listening to Adele or a new Taylor Swift album?
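And a sketch of that analysis step, under the same assumptions: the two snapshot files come from the collection script above (the file names here are placeholders), the Botometer client is configured as before with placeholder credentials, and the sample size of 50 for manual review is arbitrary.

```python
import json
import random

import botometer

# Load the primary-day and election-night snapshots produced earlier.
with open("followers_primary_day.json") as f:
    primary = json.load(f)
with open("followers_election_night.json") as f:
    general = json.load(f)

bom = botometer.Botometer(wait_on_ratelimit=True,
                          rapidapi_key="RAPIDAPI_KEY",
                          consumer_key="CONSUMER_KEY",
                          consumer_secret="CONSUMER_SECRET",
                          access_token="ACCESS_TOKEN",
                          access_token_secret="ACCESS_TOKEN_SECRET")

scores = {}
for handle, election_night_ids in general.items():
    # Net new followers: accounts present on election night but not on primary day.
    net_new = set(election_night_ids) - set(primary.get(handle, []))
    # check_accounts_in yields (account, result) pairs for a batch of accounts;
    # the exact keys in each result may differ across Botometer versions.
    for account_id, result in bom.check_accounts_in(sorted(net_new)):
        if "error" not in result:
            scores[account_id] = result["cap"]["universal"]

# Draw a manageable random sample of scored accounts for manual review.
sample = random.sample(sorted(scores), k=min(50, len(scores)))
print(sample)
```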

In short: Are the automated methods used to discriminate humans from social bots accurate and valid when deployed at scale, often a scale that makes manual checks an insurmountable lift? I’m not sold that half of Twitter is driven by bots pretending to be humans, even though some bots and trolls certainly do exist on the platform.

And if it’s not true, if social bots are not running amok on social media platforms today, what does that say about public discourse online? Folks who do little but tweet and retweet hundreds of times a week are affecting our online discussions. But that’s their prerogative.

Me? I’d prefer they engage with higher-quality content: fewer QAnon conspiracy theories, fewer Tucker Carlson clips, less vaccine misinformation. But if the issue is that people, real flesh-and-blood Americans, are the purveyors of this content, then that’s another question entirely. And it requires a different set of tools to remedy.

I’ll start with me and what I can do. I’ll inspect the automated tools that are proffered as the panacea to our online woes. I’ll keep you updated on what I find. Perhaps I’m wrong.