Generalizing Statistical Results to the Entire Population

Statistics All-in-One For Dummies

Drawing conclusions about a much broader population than your sample actually represents is one of the biggest no-no's in statistics. This kind of problem is called generalization, and it occurs more often than you might think. People want their results instantly, so well-planned surveys and experiments take a back seat to instant Web surveys and convenience samples.

For example, a researcher wants to know how cable news channels have influenced the way Americans get their news. He also happens to be a statistics professor at a large research institution and has 1,000 students in his classes. He decides that instead of taking a random sample of Americans, which would be difficult, time-consuming, and expensive, he will just put a question on his final exam to get his students' answers. His data analysis shows that only about 5 percent of his students still read the newspaper and/or watch network news programs; the rest watch cable news. In his classes, students who watch cable news exclusively outnumber those who don't by roughly 20 to 1. The professor reports this and sends out a press release about it. The cable news channels pick up the story, and the next day they are reporting, "Americans choose cable news channels over newspapers and network news by a 20-to-1 margin!"

Do you see what's wrong with this picture? The problem is that the professor's conclusions go way beyond what his study can support. He used the students in his statistics classes to obtain the data that serve as the basis for his entire report and the resulting headline, yet the results are reported as if they describe all Americans. It's safe to say that a sample of 1,000 college students taking statistics at the same time at the same college doesn't represent a cross section of America.

If the professor wants to draw conclusions about America, he has to select a random sample of Americans to take his survey. If he uses the 1,000 students from his classes, then his conclusions apply only to those students and no one else.

To avoid or detect generalization, identify the population that you intend to draw conclusions about and make sure the selected sample represents that population. If the sample represents only a smaller group within that population, then the conclusions have to be downsized in scope as well.
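To see how far a convenience sample can drift from the population it is supposed to describe, here is a small Python sketch. Every number in it is an assumption made up for illustration: a population of 100,000 people, of whom 1,000 are the professor's students (95 percent watching only cable news), with everyone else assumed to split evenly. The names build_population and cable_share are hypothetical helpers, not part of any library.

```python
import random

# Hypothetical numbers chosen only for illustration; none come from a real survey.
POPULATION_SIZE = 100_000
CLASS_SIZE = 1_000

def build_population():
    """Return a list of (is_student, watches_cable_only) pairs."""
    people = []
    # The professor's students: 950 of 1,000 (95 percent) watch only cable news.
    for i in range(CLASS_SIZE):
        people.append((True, i < 950))
    # Everyone else: assumed to split evenly, purely for this sketch.
    for i in range(POPULATION_SIZE - CLASS_SIZE):
        people.append((False, i % 2 == 0))
    return people

def cable_share(people):
    """Fraction of the given people who watch only cable news."""
    return sum(watches for _, watches in people) / len(people)

population = build_population()
true_share = cable_share(population)

# Convenience sample: only the professor's own students.
convenience_sample = [p for p in population if p[0]]
convenience_share = cable_share(convenience_sample)

# Simple random sample of the same size, drawn from the whole population.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, CLASS_SIZE)
random_share = cable_share(random_sample)

print(f"Whole population:   {true_share:.3f}")
print(f"Convenience sample: {convenience_share:.3f}")
print(f"Random sample:      {random_share:.3f}")
```

Under these made-up assumptions, the convenience sample describes the students accurately but lands far from the share for the whole population, while a random sample of the same size should come out close to it. Both samples have 1,000 people; what differs is whom they represent.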