Why the Polls Got It Wrong: Deliberate Versus Accidental Survey Bias

Wed, Nov 23, 2016 - 7:45am

To understand “deliberate bias” in censuses and surveys, we can look all the way back to the Jewish Bible.

The Book of Numbers in the Jewish Bible is so named because it begins with a partial census of the Israelites. It is partial because Moses tallies up only fighting men aged 20 and over while ignoring women and children. This was deliberate bias.

Now, let’s look at “accidental bias.” We can look at major events like “Brexit” and the recent Trump victory in the U.S. election.

The consensus was overwhelmingly wrong because of accidental survey bias.

It’s the difference between theory and reality, and the forecasts get tripped up in the basics of methodology. Current forecasting follows a wisdom-of-the-crowds approach: multiple predictions are gathered, the most divergent results are discarded, and the most common result is used. We see the same thing at the Olympics, where a competitor's lowest and highest scores are dropped.
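The drop-the-extremes idea can be sketched in a few lines of Python. This is only an illustration: the function name `trimmed_mean` and the forecast numbers are made up for the example, not taken from any real pollster or betting site.

```python
def trimmed_mean(values, drop=1):
    """Average the values after discarding the `drop` lowest and `drop` highest."""
    if len(values) <= 2 * drop:
        raise ValueError("not enough values left after trimming")
    kept = sorted(values)[drop:len(values) - drop]
    return sum(kept) / len(kept)


# Hypothetical forecasts of one side's vote share, in percent (invented numbers).
forecasts = [44.0, 45.0, 45.5, 46.0, 46.5, 47.0, 52.0]
consensus = trimmed_mean(forecasts)

# If every forecaster shares the same blind spot, every forecast shifts together.
# Here we assume, for illustration, that all of them miss by the same 3 points.
shared_miss = 3.0
corrected = trimmed_mean([f + shared_miss for f in forecasts])

print(f"consensus: {consensus:.1f}%  with the shared miss removed: {corrected:.1f}%")
```

Trimming removes the outliers, but the gap between the two numbers is still the full 3 points. Dropping extreme predictions does nothing about a bias that all of the predictions have in common.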

But it keeps failing:

  • Polls missed Brexit
  • Polls missed the Trump win
  • Betting websites missed both

The common problem: the surveys are biased in favor of white collar workers.

When it came to both Brexit and Trump, the masses were the deciding factor.

And polls, surveys, and betting websites tend to exclude the masses.

Let’s start with the betting websites because they missed both.

Studies by the Brookings Institution and the American Casino Association show a common theme: low-income gamblers play the lottery, while middle- and upper-income gamblers hit the casinos and betting websites.

Entry price is one of the reasons (lottery tickets are $1).

Access is the other: website gambling requires advance payments, direct bank account access and internet access (any of which can be problematic for low-wage earners).

Now add in the exotic aspect of betting on political events, and representation of lower-income gamblers is even less likely. (Not that they aren't interested in politics, but betting on an election is more obscure as a wager than a sports game, for example.)

Political polls and surveys also tend to exclude low-income workers.

Pollsters are calling home landlines. The U.S. now has more people than ever holding down two or more jobs, so pollsters are calling when lower-income workers are away at work.
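A back-of-the-envelope sketch shows how much this kind of uneven reach can move a poll. Every number below is an assumption chosen for illustration (two equal-sized groups, different odds of answering the phone, different levels of support), and the function names are hypothetical.

```python
def true_support(groups):
    """Support across the whole population: each group counted at its real size."""
    return sum(share * support for share, _reach, support in groups)


def polled_support(groups):
    """Support among the people the pollster actually reaches."""
    reached = sum(share * reach for share, reach, _support in groups)
    supporters = sum(share * reach * support for share, reach, support in groups)
    return supporters / reached


# (share of population, chance a call reaches them, support for the candidate)
# All values are invented for the example.
groups = [
    (0.5, 0.8, 0.40),  # white-collar workers: usually home when the pollster calls
    (0.5, 0.2, 0.60),  # workers with two or more jobs: rarely home
]

print(f"true support:   {true_support(groups):.0%}")
print(f"polled support: {polled_support(groups):.0%}")
```

Under these assumptions the candidate really has 50% support, but the poll reports 44%, because the hard-to-reach group makes up only a fifth of the completed calls instead of half.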

Conventional Macro Data Points Are Biased Against Small Businesses and Lower-Income Workers

One recent example of “conventional macroeconomic data” missing major real-world trends is government retail data.

Recent data showed a surge in food services. But we know for a fact that consumers are spending much less on dining out: restaurants are reporting falling sales and the onset of a restaurant recession (we’ve reported on it here).

The retail data is one of those popular data points that is more model than data. That is, the statisticians have a model that predicts spending, and they fine-tune it based on results from a handful of surveys. In other words, the model says that we should see x% growth unless some new data says otherwise. The model then gets revised every spring, and history gets rewritten.

The net result is that the retail data is frequently changed and rewritten because the underlying data is incomplete at best. And when reality collides with the model (i.e., restaurants report falling revenues but the model says otherwise), the model trumps reality.
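To make the "more model than data" point concrete, here is a toy sketch of a published figure that leans mostly on a model and only lightly on survey returns. It is not the government's actual procedure; the blending rule, the function name `blended_estimate`, the weight, and all numbers are assumptions for illustration.

```python
def blended_estimate(model_growth, survey_growth, survey_weight):
    """Mix a model's predicted growth with a survey reading.

    survey_weight is the share of the final figure that comes from the survey;
    the remainder comes from the model. This rule is an assumption of the sketch.
    """
    if not 0.0 <= survey_weight <= 1.0:
        raise ValueError("survey_weight must be between 0 and 1")
    return (1.0 - survey_weight) * model_growth + survey_weight * survey_growth


model_growth = 3.0    # hypothetical: the model expects spending to grow 3%
survey_growth = -2.0  # hypothetical: the businesses surveyed report a 2% drop

published = blended_estimate(model_growth, survey_growth, survey_weight=0.2)
print(f"published growth: {published:+.1f}%")
```

With a survey weight of only 0.2, the published number still shows growth of about 2% even though every surveyed business in this example reported a decline. The thinner the survey sample, the more the model's expectation dominates the result.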

Again, accidental bias caused by methodology.

Now let’s move on to U.S. payrolls (which turn into wages).

Payrolls are calculated based on a survey form sent to companies.

Large companies have dedicated human resource staff to handle the inquiries.

Small businesses like the local dry cleaner don't have the time or inclination to participate in these surveys. (Moneyball data fixes this problem by harvesting two types of data: hiring data for public companies in the Russell 1000, and local job postings that focus on small, private companies.)

By default, government-issued datasets built on surveys like this will under-represent small businesses.

(Although with retail food services spending, major companies were reporting to the government a falloff in spending, so the disconnect points to even bigger flaws with the models.)

When it comes to inflection points, conventional data is late to the game.

Official data lags reality not only because it gets released late but also because it frequently excludes big chunks of the broader economy.

Instead, focus on primary data like Moneyball’s hiring data, capital goods data and vice spending data, which have a proven track record of predicting economic shifts (I was recently ranked the 4th most accurate economic forecaster by Bloomberg).
