Monday, October 29, 2012

Rasmussen, Systemic Polling Failure, and "Truth"

There's quite a brouhaha spinning around the interwebs with conservatives complaining that Nate Silver is cooking the books in his polling model because he's a liberal shill. Here's the argument in its original full form. It is pure drivel. Silver himself had this brilliant retort today ...

But, anyway, the whole thing got me to thinking about the state of the polls and the state of the race. Conservatives are reading Rasmussen and Gravis and Gallup and celebrating what is sure to be a big victory for Romney next Tuesday. Liberals are reading ... well, everything else and are sweating the national vote polls that have it close (actually, if you take Gallup and Rasmussen out of the national poll averages, Obama is very slightly ahead) but feeling pretty comfortable about the state-level polls that show Obama leading in more than enough states to win a majority in the Electoral College.

What is the state of the race and whose polls are right? The truth is we don't really know for sure which set of polls is right. But, if I had to bet, I'd say Rasmussen and Gallup are wrong (for different reasons). Let me lay out the case for each side and explain why I think Rasmussen has it wrong but it is important to acknowledge that ...

Rasmussen could be right. Rasmussen has been identified as a pollster that has a "Republican lean" but this is really a mislabeling of what they are doing. It suggests that the pollster is just putting his thumb on the scale and cooking the books. That's not what Rasmussen is doing. At least I'm pretty sure they're not. What Rasmussen does is to use "dynamic party weighting." They weight their sample of voters so that they get to a designated number of Republicans and Democrats in their sample. The party weighting is "dynamic" because Rasmussen comes up with the relative weights of Republicans and Democrats based on some recent previous period of polling. Effectively, Rasmussen is making an educated guess about what the partisan makeup of the electorate will be (nationally or in a state) and then they make sure their sample looks like that.

So is it possible Rasmussen is right? Absolutely, it is possible. If their dynamic weights are accurate, then their polls will be accurate. And the flipside is also true. If their weights are wrong, they'll be wrong.
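To see how much rides on those weights, here is a minimal sketch in Python. It is not Rasmussen's actual procedure (which I don't have access to), and every number in it is made up for illustration: the within-party support figures and both guesses about the partisan makeup of the electorate are assumptions, not poll data.

```python
def weighted_topline(support_by_party, party_targets):
    """Candidate's overall share once each party group is weighted to its target share."""
    total = sum(party_targets.values())
    return sum(
        support_by_party[party] * share / total
        for party, share in party_targets.items()
    )

# Hypothetical Romney support within each party group (illustrative only).
romney_support = {"R": 0.92, "D": 0.07, "I": 0.50}

# Two hypothetical guesses about who shows up on Election Day.
even_electorate = {"R": 0.36, "D": 0.36, "I": 0.28}
dem_plus_six = {"R": 0.32, "D": 0.38, "I": 0.30}

even = weighted_topline(romney_support, even_electorate)
d6 = weighted_topline(romney_support, dem_plus_six)

print(f"Even party split: Romney at {even:.1%}")
print(f"Democrats +6:     Romney at {d6:.1%}")
```

The respondents give identical answers in both cases. Only the assumed party mix changes, and that alone moves the topline by a couple of points, which is about the size of the gap people are arguing over.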

So what's the case against Rasmussen? A couple of days ago, Nate Cohn wrote that, if the polls stay where they are up to Election Day, Mitt Romney's only hope will be "systemic polling failure." This means that the vast majority of polls must have done something (question wording, poor sampling, etc.) that led them all to the wrong answer. Nate Cohn is right about this.

Systemic polling failure does happen. To understand why it is unlikely, we have to understand what other pollsters are doing differently than Rasmussen to get such systematically different results. First, other pollsters are not weighting by party ID at all. To most pollsters, party ID is something to be discovered in a poll, not something to use to weight the sample. This is because party ID is a fluid construct. People tend to change their response to a party ID question based on how they feel about the competing presidential candidates or based on how enthused they are to vote for their preferred candidate. Other pollsters simply ask respondents who they are planning to vote for and then use likely voter screens (of different varieties) to determine who is likely to vote and who is a non-voter.

It is entirely possible that this methodology will lead to a sample that is not reflective of the population as a whole. One reason is simply statistical variation. Maybe you just happened to get a disproportionate number of Romney supporters in your sample. This happens, but we know the precise likelihood that it will, and the various polling averages out there correct for it. Only a very small number of polls will be outliers for this reason and, when they are averaged in with the rest, they will not have a substantial effect on the overall average. A second reason is some kind of flaw in the way we are sampling. One example is the undersampling of certain populations, like cell phone users or Hispanic voters who do not speak English, either of which may go uncalled by pollsters for different reasons. Some pollsters do call cell phones and some call Spanish-only voters. Here's Stan Greenberg talking about why undersampling of cell phone voters is causing Obama voters to be under-represented in the public polls we see:

Other pollsters who are good at what they do use some kind of weighting mechanism to try to get a representative sample without calling these groups. This is increasingly problematic but is done well by some pollsters (like PPP in my view).

The point is that these pollsters are either getting a truly representative sample or they are using demographic weighting (not partisan ID weighting) to make sure they don't have a problem with their sample. The fact that so many pollsters are coming to basically the same conclusions (for instance, that Barack Obama has a small lead in Ohio) is evidence that they are likely right.
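The point about statistical variation and averaging can be put in rough numbers. The sketch below assumes a simple random sample and uses the textbook 95% margin-of-error formula; the list of poll leads is invented for illustration and is not real Ohio data.

```python
import math
from statistics import mean

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a share p from a simple random sample of n."""
    return z * math.sqrt(p * (1 - p) / n)

# A candidate at 50% in a poll of 1,000 respondents.
moe = margin_of_error(0.5, 1000)
print(f"Margin of error: +/- {moe:.1%}")

# Hypothetical Obama leads (in points) from nine polls, plus one outlier
# showing Romney ahead by four.
polls = [2.0, 3.0, 1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0]
with_outlier = polls + [-4.0]

avg = mean(polls)
avg_out = mean(with_outlier)
print(f"Average without outlier: {avg:+.1f}")
print(f"Average with outlier:    {avg_out:+.1f}")
```

One stray poll nudges the average but does not flip it. To flip the average, most of the polls would have to be off in the same direction, which is a sampling or methodology problem, not random noise.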

Could Ohio (and other states) still shift? Yes, but it is getting very late.

Could Rasmussen be right and the other polls wrong? Yes. I don't think so but it could happen and, if it did, it would be exactly the kind of "systemic polling failure" Nate Cohn was talking about.

We'll know next Wednesday morning.

1 comment:

Thomas Hartman said...

Great post for someone as under the weather as you are.