The Problem With a Crowd of New Online Polls

They’ve become cheap to produce but they underperform the competition, falling short of their original promise.

Anyone can find a group of people. But will it be representative?
Credit: Yeong-Ung Yang for The New York Times

The polls were one of the big winners of the 2012 presidential election. They showed Barack Obama ahead, even though many believed a weak economy would propel Mitt Romney to victory.

The polls conducted online were among the biggest winners of all.

The most prominent online pollsters — Google Consumer Surveys, Reuters/Ipsos and YouGov — all produced good or excellent results. With the right statistical adjustments, even a poll of Xbox users fared well.

These successes seemed to herald the dawn of a new era of public opinion research, one in which pollsters could produce accurate surveys cheaply, by marrying online polls with big data and advanced statistical techniques.

A decade later, the new era has arrived — and has fallen far short of its promise. Ever since their 2012 breakout performance, the public polls relying exclusively on data from so-called online opt-in panels have underperformed the competition.

Only YouGov, long at the cutting edge of this kind of polling, is still producing reasonably accurate results with these panels.

Many of the online pollsters who excelled in 2012 have left public polling altogether:

  • Google Consumer Surveys — by 538’s reckoning perhaps the best poll of 2012 — was arguably the single worst pollster in the 2016 election, and it has stepped out of the political polling game.
  • The Xbox poll did not return. The researchers behind it used a different online survey platform, Pollfish, to predict Hillary Clinton victories in Texas and Ohio in 2016, both states she went on to lose.
  • And last year, Reuters/Ipsos abandoned opt-in, or nonprobability, polling. There are still Reuters/Ipsos polls, but they’re now traditional surveys with panelists recruited by mail.

Nonetheless, a majority of polls are now conducted in exactly this way: fielded online using people who joined (that is, opted into) a panel to take a survey, whether by clicking on a banner ad or via an app. Traditional polling, in contrast, attempts to take a random sample of the population, whether by calling random phone numbers or mailing random addresses.

The newer opt-in pollsters haven’t fared any better. SurveyMonkey and Morning Consult, two of the most prolific opt-in pollsters to pop up since 2012, have posted well-below-average results despite having established pollsters and political scientists leading their political polling.

More recently, a whole new wave of online pollsters has popped up. In just the last few months, pollsters like SoCal Strategies, Quantus Polls, ActiVote and Outward Intelligence have published dozens of polls, often with scant methodological detail. Maybe one of these firms is a diamond in the rough, but history offers no reason to expect it.

Online opt-in pollsters have fared so poorly in recent cycles that they receive less weight than other surveys in our polling averages.

Why are these polls faring poorly? The core challenge was always obvious: how to find a representative sample without the benefit of random sampling, in which everyone has an equal chance of being selected for a poll. Over the last decade, this has gotten harder and harder. Even the best firms have struggled to keep up; for the rest, it’s hard to tell how much they’re even trying.

Declining data quality

A decade ago, the Reuters/Ipsos poll was one of those online opt-in polls that seemed to be on the rise. Now, it has stopped nonprobability polling, largely because of declining data quality.

“The difficulty of getting a quality sample continues to increase,” said Chris Jackson, who heads Ipsos public opinion research in the United States. “The people who you get into these panels aren’t representative of the full population, or they’re people trying to game the system.”

What has gotten more difficult? Just about everything.

First, there’s the growing compartmentalization of the internet. It’s hard to remember now, but before smartphones were ubiquitous, most people accessed the internet on a desktop computer; maybe they opened a web browser with a landing page like Yahoo or MSN. A banner ad on the right websites could plausibly reach a huge — and broadly representative — swath of the population. To Mr. Jackson, this was nonprobability polling at its zenith: a time when huge numbers of sample respondents were available and data quality was good.

Now, the internet is segmented. Most people access the internet through apps on their phone. Even the widely used social media networks, like Instagram, TikTok, Facebook or Snapchat, attract very different and unrepresentative audiences. The online panel providers try to recruit panelists from a mix of websites and apps who — hopefully — can be cobbled together to yield something plausibly representative. Unfortunately, “none of those really provide a good cross-section of the population all at once,” Mr. Jackson said.

Then there’s what Pew Research calls “bogus respondents.” Many online panel providers offer financial incentives to encourage people to take surveys. Unfortunately, that’s also an incentive for people — including people outside the United States — to try to collect gift cards and cash rewards by taking many surveys or even by programming A.I. bots to take polls.

Horror stories abound. I’ve talked to pollsters who remove as many as half of their opt-in respondents on data quality checks intended to identify bogus respondents, like removing respondents who answer questions too quickly or who provide nonsensical answers in a text box. Other pollsters, especially partisan pollsters, rely on matching respondents to a voter file to help establish that a respondent is valid — a helpful tool, but one that significantly culls the pool of available panelists. As a result, voter-file-matched state polls may be impossible for many pollsters; the national panelists who remain may be inundated by political content, like campaign ad testing.
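To make those checks concrete, here is a minimal sketch of how a pollster might screen for speeders, gibberish and straight-lining. The field names, thresholds and sample responses are invented for illustration; they are not any pollster's actual rules.

```python
import re

# A hypothetical sketch of basic data quality checks. The field names
# ("seconds_elapsed", "open_text", "grid_answers") and the thresholds are
# illustrative assumptions, not any real pollster's screening criteria.

MIN_SECONDS = 120        # flag "speeders" who finish implausibly fast
MIN_DISTINCT_WORDS = 3   # flag empty or nonsensical open-ended answers

def looks_bogus(resp: dict) -> bool:
    """Return True if a respondent trips a basic quality flag."""
    # Speeding check: finishing a long questionnaire in under two minutes.
    if resp["seconds_elapsed"] < MIN_SECONDS:
        return True

    # Open-text check: answers with almost no distinct words read as gibberish.
    words = re.findall(r"[a-z']+", resp.get("open_text", "").lower())
    if len(set(words)) < MIN_DISTINCT_WORDS:
        return True

    # Straight-lining check: identical answers to every question in a grid.
    grid = resp.get("grid_answers", [])
    if grid and len(set(grid)) == 1:
        return True

    return False

respondents = [
    {"seconds_elapsed": 45, "open_text": "asdf asdf", "grid_answers": [5, 5, 5, 5]},
    {"seconds_elapsed": 480, "open_text": "Mostly worried about prices and housing.",
     "grid_answers": [4, 2, 5, 3]},
]
kept = [r for r in respondents if not looks_bogus(r)]
print(f"kept {len(kept)} of {len(respondents)} respondents")
```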

Excluding respondents is an art, not a science. The pollsters almost certainly exclude some number of valid respondents, and they almost certainly do not exclude everything problematic. And as soon as pollsters figure out something that works, the people trying to game the system try to adapt. A.I. makes this only more challenging.

The actual humans earnestly participating in the panels may not necessarily be the most representative, either. Even beyond the professional survey takers in it for the gift cards, the people who opt in to online panels are naturally much more politically engaged. That’s especially true of those who stay in the panels for the long term and take surveys over and over.

It’s hard to measure the accumulated effect of all of these biases, but it adds up to a serious challenge. A variety of Pew Research studies found that opt-in online panel data produces highly misleading results for young and Hispanic adults, while bogus respondents can skew survey results toward answers like “yes.”

Declining pollster quality

Online polls don’t have the benefit of random sampling, the foundation of modern survey research. There is no comprehensive list of email addresses, as there is for phone numbers or home addresses. Instead, people’s chances of seeing an online advertisement to take a poll — let alone deciding to participate — depend on their online habits.

Researchers took the problem seriously, employing sophisticated methods like YouGov’s synthetic sampling and matching or the statistical modeling behind the Xbox poll. The arcane details of these methods aren’t as important as the implicit assumption behind them: It wasn’t enough to take a sample of online panelists and weight them to match the demographics of the population.

Yet that’s exactly what most nonprobability pollsters have done over the last decade. The typical opt-in poll today is simply weighted by age, race, gender, education and party or recalled vote in the last presidential election, as if it were any other poll. It’s often weighted with less sophistication than a rigorous traditional survey, like a New York Times/Siena College or Pew Research poll, even though the underlying data quality is usually far worse.
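For readers curious what that standard weighting step looks like, here is a minimal sketch of raking (iterative proportional fitting) to a few population margins. The respondents, categories and target margins are invented for illustration; real polls rake on more variables with finer categories.

```python
from collections import defaultdict

# A toy raking example: adjust respondent weights until the sample matches
# each target margin. Sample and targets are invented for illustration.

sample = [
    {"age": "18-49", "educ": "no_degree", "recalled_2020": "Biden"},
    {"age": "18-49", "educ": "degree",    "recalled_2020": "Biden"},
    {"age": "18-49", "educ": "no_degree", "recalled_2020": "Trump"},
    {"age": "50+",   "educ": "degree",    "recalled_2020": "Trump"},
    {"age": "50+",   "educ": "no_degree", "recalled_2020": "Biden"},
    {"age": "50+",   "educ": "degree",    "recalled_2020": "Trump"},
]

targets = {
    "age":           {"18-49": 0.52, "50+": 0.48},
    "educ":          {"no_degree": 0.60, "degree": 0.40},
    "recalled_2020": {"Biden": 0.52, "Trump": 0.48},  # roughly the 2020 two-party split
}

weights = [1.0] * len(sample)

for _ in range(50):  # iterate until the margins settle down
    for var, target in targets.items():
        # Current weighted share of each category of this variable.
        totals = defaultdict(float)
        for w, row in zip(weights, sample):
            totals[row[var]] += w
        grand = sum(totals.values())
        # Rescale every respondent so this variable hits its target margin.
        for i, row in enumerate(sample):
            weights[i] *= target[row[var]] / (totals[row[var]] / grand)

biden = sum(w for w, r in zip(weights, sample) if r["recalled_2020"] == "Biden")
print(f"weighted Biden recall share: {biden / sum(weights):.2f}")  # ~0.52
```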

Why did nonprobability polling regress?

One possibility: The new entrants, coming up in a world where nonprobability polling was common, forgot there was a distinct challenge in the first place.

Another possibility is the rise of recalled vote weighting, where the pollster asks how respondents voted in the last election and then adjusts the sample to match the actual result of that election. This month, more than 80 percent of online opt-in polls weighted on recalled vote, compared with about 40 percent of other kinds of polls. There’s a reason pollsters rely on it: It’s a bludgeon, allowing them to hammer even the worst sample into the ballpark of a plausible result.
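A toy calculation, with invented numbers, shows why the technique is so forgiving: even a raw sample that skews far too Democratic lands near a plausible topline once it is forced to match the 2020 outcome.

```python
# Invented numbers illustrating the "bludgeon" effect of recalled-vote weighting.

# Suppose a low-quality panel is badly skewed: 65% recall voting for Biden in
# 2020 and 35% for Trump, versus an actual two-party result of roughly 52-48.
raw_share = {"Biden": 0.65, "Trump": 0.35}
actual_2020 = {"Biden": 0.52, "Trump": 0.48}

# Recalled-vote weighting scales each group to the 2020 result.
weight = {k: actual_2020[k] / raw_share[k] for k in raw_share}

# Assume (again, invented) current support for the Democratic candidate
# within each recalled-vote group.
support_dem = {"Biden": 0.92, "Trump": 0.06}

unweighted = sum(raw_share[k] * support_dem[k] for k in raw_share)
weighted = sum(raw_share[k] * weight[k] * support_dem[k] for k in raw_share)

print(f"unweighted Democratic support: {unweighted:.1%}")  # ~62%
print(f"weighted Democratic support:   {weighted:.1%}")    # ~51%
```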

A final factor: It’s never been easier to field an online poll. One of this year’s pop-up pollsters says it conducted a poll of Michigan for $475. At that cost, just about anyone can jump into the game.

No clear standard

It’s harder than ever to do high-quality research, but that doesn’t mean it’s impossible. There’s no reason the right panel recruitment and management, sample selection, data quality checks and weighting couldn’t still do the trick.

Unfortunately, it’s almost impossible to tell whether pollsters are pulling it off.

Most online pollsters don’t disclose much about how they’re addressing these kinds of problems. Many say nothing more than “online panel” and “weighted by race, age, gender, education and recalled vote in the 2020 election.” Sometimes there are references to data quality checks, but rarely with much detail. Any serious discussion of the panel itself is even rarer — and may not even be known to the pollster, if it is using another company’s panel.

And even if pollsters did disclose more, it’s not necessarily clear whether it would tell us much. There’s no set of standards that practitioners agree yields high-quality nonprobability research. As a result, even a detailed methodology statement doesn’t necessarily offer any conclusions about the poll. Instead, it mostly tells us something about the pollster: whether it’s serious and conscientious about the problems at hand.

Which poll is better: one that says it didn’t remove any respondents on data quality checks, or one that removed 50 percent? Neither is especially reassuring.

Is anything trustworthy?

With these challenges, it’s increasingly difficult to say whether and when opt-in polls are worthy of consideration, especially when alternative, traditionally collected data is available.

YouGov, as mentioned, is one exception. Across all plausible criteria for evaluation, it stands apart. Over the years, it has released a collection of detailed methodology statements revealing a sophisticated sample selection process and the use of its own proprietary panel. Its data has been analyzed by third parties, used by academics, and found to outperform other nonprobability data. It has also amassed a decade-long record of solid results in election polling.

The other exception: the (often partisan) pollsters specifying that they match their panelists to the voter file. This isn’t necessarily sufficient, and it has its own challenges. But it’s a basis for verifying the authenticity of the respondent, and it adds information that can be used for weighting.

And, well, it’s a sign of both real effort and familiarity with political polling, something we seemingly can’t take for granted across much of the rest of online polling.