Breaking Down Statistical Formulas

Formulas abound in statistics problems — there’s just no getting around them. However, there’s typically a method to the madness if you can break the formulas into pieces. Here are some helpful tips:

• Formulas for descriptive statistics basically take the values in the data set and apply arithmetic operations. Often, the formulas look worse than the process itself. The key: If you can explain to your friend how to calculate a standard deviation, for example, the formula is more of an afterthought.

• Formulas for the regression line have a basis in algebra. Instead of the typical y = mx + b format everyone learns in school, statisticians use y = a + bx.

• The slope, b, is the coefficient of the x variable.

• The y-intercept, a, is where the regression line crosses the y-axis.

The formulas for finding a and b involve five statistics: the mean of the x-values, the mean of the y-values, the standard deviation of the x-values, the standard deviation of the y-values, and the correlation between x and y.
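As a rough sketch of how those five statistics fit together, the Python below uses the standard textbook relationships: slope b = r times (s_y / s_x), and intercept a = (mean of y) minus b times (mean of x). The text above alludes to these formulas without spelling them out, so treat the function names and the sample data as assumptions made purely for illustration. Notice that the standard deviation, scary as its formula looks, is only a handful of arithmetic steps.

```python
import math

def mean(values):
    """Add up the values and divide by how many there are."""
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation, built from plain arithmetic steps."""
    m = mean(values)
    squared_deviations = [(v - m) ** 2 for v in values]
    return math.sqrt(sum(squared_deviations) / (len(values) - 1))

def correlation(xs, ys):
    """Pearson correlation r between paired x- and y-values."""
    x_bar, y_bar = mean(xs), mean(ys)
    products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return products / ((len(xs) - 1) * sample_sd(xs) * sample_sd(ys))

def regression_line(xs, ys):
    """Return (a, b) for the line y = a + bx."""
    b = correlation(xs, ys) * sample_sd(ys) / sample_sd(xs)  # slope
    a = mean(ys) - b * mean(xs)                              # y-intercept
    return a, b

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = regression_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")
```

Each helper mirrors one of the five statistics, so you can check any single piece by hand before trusting the final line.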

• All the various confidence interval formulas, when made into a list, can look like a hodge-podge of notation. However, they all have the same structure: a descriptive statistic (from your sample) plus or minus a margin of error. The margin of error involves a z*-value (from the Z-distribution) or t*-value (from the t-distribution) times the standard error. The parts you need for standard error are generally provided in the problem, and the z*- or t*-values come from tables.

• Hypothesis tests also have a common structure. Although each one involves a series of steps to carry out, they all boil down to one thing: the test statistic. A test statistic measures how far your data is from what the population supposedly looks like. It takes the difference between your sample statistic and the (claimed) population parameter and standardizes it, typically by dividing by the standard error, so you can look it up in a common table and make a decision.
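The shared structure of confidence intervals and test statistics can be sketched in a few lines of Python. This is a minimal illustration of one case only: a population mean with a known population standard deviation, so the standard error is sigma / sqrt(n) and a z*-value applies. The function names, the small z* lookup, and the numbers are all assumptions made for the example; other situations use different standard errors or t*-values.

```python
import math

# A few commonly tabulated z*-values, keyed by confidence level.
Z_STAR = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def standard_error(sigma, n):
    """Standard error of a sample mean when sigma is known."""
    return sigma / math.sqrt(n)

def confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Descriptive statistic plus or minus the margin of error."""
    margin_of_error = Z_STAR[confidence] * standard_error(sigma, n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error

def z_test_statistic(sample_mean, claimed_mean, sigma, n):
    """Standardized distance between the sample mean and the claimed mean."""
    return (sample_mean - claimed_mean) / standard_error(sigma, n)

# Hypothetical problem: sample mean 52 from n = 100, sigma = 10, claimed mean 50.
low, high = confidence_interval(52, 10, 100)
z = z_test_statistic(52, 50, 10, 100)
print(f"95% CI: ({low:.2f}, {high:.2f}); test statistic z = {z:.2f}")
```

Both functions lean on the same standard error, which is why the two families of formulas end up looking so alike once you break them into pieces.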