WHY Polling is Dead, Dead, Dead.

My major in college, back in the 60’s, was “public policy”, which meant polling. In those days it was considered highly predictive of reality, IF done correctly. We learned about all the techniques, all the pitfalls, all the possible mistakes that had been made, or could possibly be made, and how to avoid them. I spent two years learning to “do it correctly”, and my senior thesis was a survey which I designed, administered, analyzed, and wrote up as a 90-page report.

Then I graduated, and held three successive jobs in various aspects of that industry. I thought I was going to spend my life in polling. Luckily for me, I found that other facets of life attracted me more. Yet I remained interested, and from time to time have conducted small-scale surveys for organizations to which I’ve belonged.

Why was that lucky?

It was lucky for me, because if I had stayed in that industry I would have been part of its slow deterioration, though by now I would probably have retired and have perhaps missed the nose dive of the past several years.

There are people right on this site who still believe in the validity of polling—or at least they believe it when it tells them something they want to hear. Most recently, the ones who don’t like Biden have been latching onto poll results that claim he’s losing to Trump. Previously there were other issues.

In fact, none of these poll results, whether we like them or not, have any relationship to reality, and here’s why.

The bedrock of a valid poll is to collect a true random sample of the population you want to study. This is easy enough to do when it’s a small population. I can sample my church by going down the Directory and marking every 20th name, and I’ll have a true random 5% sample of that population—at least if I’ve made sure the Directory is up to date. But if you want to study a larger population, it becomes progressively more difficult. Studying, let’s say, likely voters in all 50 states, becomes much more difficult because you can’t begin by making a true list of all of them, and then using some mathematical method to pick a random sample.
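To make the church example concrete, here is a minimal sketch of the every-20th-name procedure. It assumes the Directory is just a Python list; the member names and the helper `every_nth` are invented for illustration. (Statisticians usually call this a systematic sample, and a common refinement is to pick the starting position at random.)

```python
def every_nth(directory, step=20, start=0):
    """Return every `step`-th entry of `directory`, beginning at index `start`."""
    return directory[start::step]

# Hypothetical directory of 1,000 members (made-up names, for illustration only).
directory = [f"Member {i:04d}" for i in range(1, 1001)]

# Index 19 is the 20th name, so this marks the 20th, 40th, 60th, ... entries.
sample = every_nth(directory, step=20, start=19)

print(len(sample), "names selected out of", len(directory))
```

With 1,000 names and a step of 20, the list that comes back holds 50 names, which is the 5% sample described above.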

But let’s say you COULD get an excellent random sample of that group. The next pitfall is much more daunting, though at one time that was not the case. You have to get a high response rate from your sample. 75% would have been considered a good rate back in the day. (Now a “good rate” can be as little as 5%. Keep that in mind.) That means you have made a list of names of those whose responses you want, and 75% of those people will complete your questionnaire. So first you have to contact each of them in some way, and then you have to convince them to take the time to answer all the questions, either verbally, in writing, or on line, depending on the method of data collection you’re using.

What happens if you don’t get a high response rate? In that case you have what’s called a “self-selected sample”. That is, some people chose to answer while others didn’t, but you have no way of knowing on what basis they made that choice. Maybe it’s a factor irrelevant to the subject of your study, or maybe not. If it’s a political study, maybe only the red-haired people answered, or maybe only the Southerners answered, or maybe only meat-eaters answered. Are those factors relevant? Can you possibly know?
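A small simulation can show how much damage self-selection might do. This is only a sketch under invented assumptions: a population split exactly 50/50 on some yes/no question, and "yes" people who are twice as likely to respond as "no" people (8% versus 4%). None of these numbers come from a real poll, and `simulate_poll` is a hypothetical helper.

```python
import random

def simulate_poll(rng, sample_size=10_000, rate_yes=0.08, rate_no=0.04):
    """Draw a random sample, then let each person decide whether to respond."""
    # Population: exactly half hold opinion "yes" (True), half "no" (False).
    population = [True] * 50_000 + [False] * 50_000
    drawn = rng.sample(population, sample_size)  # a genuine random sample
    # The chance of responding depends on the person's opinion (the assumption).
    respondents = [p for p in drawn
                   if rng.random() < (rate_yes if p else rate_no)]
    share_drawn = sum(drawn) / len(drawn)
    share_resp = sum(respondents) / len(respondents)
    return share_drawn, share_resp, len(respondents) / len(drawn)

share_drawn, share_resp, response_rate = simulate_poll(random.Random(0))
print(f"people drawn:  {share_drawn:.1%} yes")
print(f"people who answered: {share_resp:.1%} yes "
      f"(response rate {response_rate:.1%})")
```

Under these made-up rates, the people drawn look like the population, but the people who actually answer lean heavily toward "yes", even though the sample itself was drawn correctly.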

Actually, sometimes you can, if you’ve asked the right questions. That’s why questionnaires typically have a demographic section at the end, where the respondent is asked about sex, age, race, level of education, etc. These can get quite detailed, and the answers can be worthwhile, in fact absolutely vital. But they’re at the end for a reason: they may alienate the respondent, who may then fail to complete the questionnaire. Moreover, the more demographic questions there are, the more likely they are to produce vital information, and the more likely they are to cause termination before finishing. Ethical pollsters will throw out all questionnaires where every single question was not answered, but there is a huge temptation to retain them to bolster the perceived size of the sample.

Now there’s still another factor to keep in mind: did respondents answer honestly? In the past this was not an issue, but it seems to be one now. There are rumors (and how reliable are they?) that some people will deliberately lie to pollsters, for reasons unknown. This is still another factor to take into account.

In the past, getting the required response rate was in most cases not a concern, although possibly a question of cost. When I was employed as a door-to-door interviewer (my first job after college), the sample had already been drawn. I would be sent to the selected addresses to interview the selected respondents. If no one was home, I’d be sent back, more than once if necessary. I was paid hourly plus mileage, but to get the necessary high response rate, every effort would be made.

It seems impossible now to get a response rate that ought to be considered even minimally qualifying. In the past it was assumed that most people had a phone, and that someone would be home to answer it, and that that person would be willing to talk to an interviewer, and would answer questions honestly. Or, in the case of a personally administered questionnaire, that someone would be home to answer the door, would be willing to do so, and would agree to answer the questions—and honestly—even if it would take some time. (I was often invited in, given refreshments, and maybe a baby to hold.)

None of that can now be assumed. In fact, from what people say, the opposite should be considered the norm. They say, “I never answer my phone unless I know who’s calling.” They say, “I wouldn’t answer the door unless I was expecting someone I know.” They say, “I’m never home!”, and I believe that, after trying to canvass as far back as the Obama campaigns. I’ve heard that some people brag about lying to pollsters, though no one has ever said that to me directly.

That’s obviously why a very low response rate is now considered “good”—because it’s probably as good as can be obtained. But is it really good enough to create any sort of validity in the results?

Then there are on-line polls. How these can possibly be thought to have the slightest validity boggles my mind. People can represent themselves as anything they like, on line. They can tell the truth or lie, about anything whatsoever. The 27-year-old New York Catholic with two PhDs who voted for Biden can in reality be a 60-year-old guy in Idaho who never graduated from HS, hasn’t been employed in ten years, and spends his time on line bragging to his MAGA friends about how he had fun fooling some pollster while waiting to be raptured. Or maybe not. And on top of that, you’re just “polling” people who want to be polled, which is automatically a serious fault in sampling. It’s the mother of all self-selected samples.

So, what they apparently rely on now is “weighting”. Weighting is nothing new. It has been used in polling forever to cure some faults in sampling. If you find that your sample is under- or over-supplied with people who fall into some known demographic category, you correct for that in the analysis phase. Up to a point, that can work.
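For readers who haven’t seen it done, here is a minimal sketch of weighting on a single demographic factor. Everything in it is assumed for illustration: the 100 respondents, the 70/30 split in the sample, the 40/60 split in the population, and the helper `weighted_share`. Real polling operations weight on many factors at once, and this is not a description of any particular pollster’s method.

```python
def weighted_share(respondents, population_share):
    """respondents: list of (group, answer) pairs, answer is 1 (yes) or 0 (no).
    population_share: assumed true share of each group in the population."""
    n = len(respondents)
    counts = {}
    for group, _ in respondents:
        counts[group] = counts.get(group, 0) + 1
    # Weight for a group = its share of the population / its share of the sample.
    weights = {g: population_share[g] / (c / n) for g, c in counts.items()}
    total_weight = sum(weights[g] for g, _ in respondents)
    return sum(weights[g] * a for g, a in respondents) / total_weight

# Made-up sample: 70 college graduates (42 say yes), 30 non-graduates (12 say yes).
respondents = ([("college", 1)] * 42 + [("college", 0)] * 28 +
               [("no_college", 1)] * 12 + [("no_college", 0)] * 18)

raw = sum(a for _, a in respondents) / len(respondents)
adjusted = weighted_share(respondents, {"college": 0.4, "no_college": 0.6})
print(f"raw: {raw:.0%} yes, weighted: {adjusted:.0%} yes")  # raw: 54% yes, weighted: 48% yes
```

Notice that the correction is only as good as the population shares fed into it. If the 40/60 figure is a guess, the weighted answer is a guess too.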

Up to a point. But from what is leaking out of polling organizations now, that point has long been passed. According to an insider who made a comment here some months ago, now it’s basically ALL weighting. They try to line up the results according to a predetermined idea of what would likely be said by a truly random sample, by aligning as many demographic factors as they can with an abstract idea of how those measurements would line up in the real world.

Yet they can have less and less idea, as time goes by, of how the population actually falls according to any of these factors. There are no real base lines. They can say, “We don’t have enough Catholics”, or “We have too many people with college degrees”, and then try to align the results accordingly. Yet they are further and further from actually knowing how many Catholics there really are, or even how many people with college degrees live in given areas. On some factors they can try to align with the most recent Federal Census, but that’s extremely limited because the Census no longer collects any more than the most basic, meager information. On other possibly relevant factors there’s no data at all. In fact it’s reached the point where it’s not actually possible to know about many factors that might be relevant—new ones that have cropped up. It’s just too long now since there was any reliable base data.

So they may try to correct for a multitude of factors, without having any sort of base-line for those factors. Then on top of that, according to the person who wrote that comment, there’s a final “hand-waving” stage where they say in effect, “That just doesn’t look right”, and try to adjust the figures until it does.

After they do all that, the published results may or may not “smell right” to people like us, depending on how skillfully they did it. Sometimes we all agree that something is off, if they did it badly. But either way, the figures are in fact equally meaningless. After reading this screed you can see why—or if you can’t, that means I’ve done a poor job of explaining it.

--30--

(And a comment from the original story)

It is a very good diary. But the one thing nobody ever wants to say out loud is that the underlying “science” was always BS. To take the author’s example of the church, let’s say there are 1000 members in the directory, and 50 are selected “at random”. And of those, only ¾ participate. That’s 38 people representing the views of 1000. In the case of a church, if the questionnaire relates to church business, that may not be too bad because one would expect an extremely high level of uniformity of opinion because the very nature of a church is indoctrination and uniformity of thought. That would never be nearly a large enough sample to represent a diverse population.

Fine. But now let’s look at the voting for town council. Let’s say that 5,000 votes are normally cast per council district. To have that same level of representation, you would need about 200 survey results. But wait a minute. Even the most monolithic of districts will never be nearly as uniform as a church congregation, so you would actually need more like 500, considering that the base of POTENTIAL voters is probably 10,000 or more.

Nobody polls at that level. But it gets worse. Let’s say you are polling for Senator of a medium-sized state. Now we are talking 2 million votes. Using the church ratio, that would be about 75,000 poll responses. Nobody has ever done a poll like that. Pollsters have always used pseudo-science to “prove” that it is “scientific” to pronounce results with a tiny fraction of that sample size. There was never any legitimate statistics behind that. However, pollsters got away with that because usually the questions were simple and binary, with opinions well solidified in advance. In addition, all pollsters came up with their “secret sauce” to manipulate the totals based on other factors that were more intuition than science. And at the same time, databases evolved that could do a decent job of characterizing people based on where they lived. If you lived in Archie Bunker’s neighborhood, chances were good that you were pretty racist, for example.
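The figures in this comment can be checked with a few lines of arithmetic. This sketch simply applies the comment’s own rule, keeping the same fraction of respondents as in the church example; the helper name `proportional_responses` is made up, and the code takes no position on whether that rule is the right one.

```python
import math

members = 1000                              # names in the church directory
selected = 50                               # the 5% sample
respondents = math.ceil(selected * 3 / 4)   # three quarters respond: 37.5, call it 38

def proportional_responses(votes):
    """Responses needed to keep the church ratio (38 per 1,000), whole numbers only."""
    return votes * respondents // members

print(respondents)                          # 38
print(proportional_responses(5_000))        # 190, the comment's "about 200"
print(proportional_responses(2_000_000))    # 76000, the comment's "about 75,000"
```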

But all of that has broken down. There is less red-lining of neighborhoods today and greater mobility. And it is no longer a mostly monolithic white population versus a mostly monolithic black population. The melting pot has melted a lot more in the past generation or two. Location-based assumptions aren’t as good as they used to be. Basically, this has all exposed the fact that there was no real science in the first place.

--30--

Written by Mercy Ormont. Cross-posted from Daily Kos.