How to Test Data Proportions with R

By Andrie de Vries, Joris Meys

Let’s look at an example to illustrate the basic R tests for data proportions. The following example is based on real research, published by Robert Rutledge, MD, and his colleagues in the Annals of Surgery (1993).

In a hospital in North Carolina, the doctors registered the patients who were involved in a car accident and whether they used seat belts. The following matrix represents the number of survivors and deceased patients in each group:

> survivors <- matrix(c(1781,1443,135,47), ncol=2)
> colnames(survivors) <- c('survived','died')
> rownames(survivors) <- c('no seat belt','seat belt')
> survivors
             survived died
no seat belt     1781  135
seat belt        1443   47

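To know whether seat belts made a difference in the chances of surviving, you can carry out a proportion test. This test starts from the assumption that both proportions are the same and tells you how probable it is to see a difference at least as large as the observed one under that assumption. A low p-value tells you that the data are hard to reconcile with equal proportions, so the proportions probably differ from each other. To test this in R, you can use the prop.test() function on the preceding matrix: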

> result.prop <- prop.test(survivors)

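You also can use the prop.test() function on tables or vectors. If you use it with a matrix, the first column has to contain the successes and the second column the failures. If you use it with vectors, remember that the first vector has to be the number of successes, and the second vector has to be the total number of cases.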
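As a minimal sketch of the vector form, the counts from the matrix above can be passed as two vectors. The object names survived, total, and result.vec are made up for this illustration; the totals are simply the row sums of the matrix (survivors plus deceased in each group).

```r
# Successes: number of survivors per group (no seat belt, seat belt)
survived <- c(1781, 1443)

# Totals: all patients per group, survivors plus deceased
total <- c(1781 + 135, 1443 + 47)

# Same test as prop.test(survivors), specified with vectors
result.vec <- prop.test(x = survived, n = total)
result.vec$p.value
```

This call should give the same test statistic and p-value as the matrix version, because the matrix version derives the same successes and totals from its columns.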

The prop.test() function then gives you the following output:

> result.prop
 2-sample test for equality of proportions with continuity correction
data: survivors
X-squared = 24.3328, df = 1, p-value = 8.105e-07
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.05400606 -0.02382527
sample estimates:
   prop 1    prop 2
0.9295407 0.9684564

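This test report is almost identical to the one from t.test() and contains essentially the same information. At the bottom, R prints for you the proportion of successes in each group. Because the first column of the matrix holds the survivors, these are the proportions of people who survived: about 93 percent without a seat belt and about 97 percent with one. The p-value tells you how likely it is to see a difference at least this large if both proportions were in fact equal.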
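If you want to check what the sample estimates represent, you can recompute them from the matrix. The following is a small sketch; the name by.hand is invented for this example, and the code rebuilds the matrix so that it runs on its own.

```r
# Rebuild the matrix of counts from the example
survivors <- matrix(c(1781, 1443, 135, 47), ncol = 2,
                    dimnames = list(c('no seat belt', 'seat belt'),
                                    c('survived', 'died')))
result.prop <- prop.test(survivors)

# Proportion of survivors per group, computed by hand
by.hand <- survivors[, 'survived'] / rowSums(survivors)
by.hand

# The same proportions as stored in the test result
result.prop$estimate

# Test statistic and p-value as separate components
result.prop$statistic
result.prop$p.value
```

The hand-computed values match the sample estimates in the report, which confirms that prop 1 and prop 2 are survival proportions.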

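So, you see that the chance of dying in a hospital after a crash is lower if you’re wearing a seat belt at the time of the crash: about 3.2 percent of the patients who wore a seat belt died, compared with about 7.0 percent of those who didn’t. R also reports the confidence interval of the difference between the proportions. Here the difference is the survival proportion without a seat belt minus the one with a seat belt, and the whole interval lies below zero.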
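The confidence interval and the death rates can also be pulled out of the result object. This is a sketch under the same setup as above; diff.surv, ci, and died are names chosen only for this illustration.

```r
# Rebuild the matrix of counts and rerun the test
survivors <- matrix(c(1781, 1443, 135, 47), ncol = 2,
                    dimnames = list(c('no seat belt', 'seat belt'),
                                    c('survived', 'died')))
result.prop <- prop.test(survivors)

# Difference in survival proportions: no seat belt minus seat belt
diff.surv <- unname(result.prop$estimate[1] - result.prop$estimate[2])
diff.surv

# 95 percent confidence interval for that difference
ci <- result.prop$conf.int
ci

# Proportion of patients who died in each group
died <- 1 - result.prop$estimate
names(died) <- rownames(survivors)
died
```

Because both ends of the interval are negative, the data point to a lower survival proportion in the group without seat belts.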