# U Can: Statistics For Dummies Cheat Sheet

Details explaining what to do with data in every conceivable study or experiment can’t be contained on a single page. However, this Cheat Sheet summarizes many of the basic methods and formulas used in data analysis.

## Seeing What Statistical Symbols Stand For

Symbols (or notation) found in statistics problems fall into three main categories: math symbols, symbols referring to a population, and symbols referring to a sample.

• Math symbols are easy enough to decipher with a simple review of algebra. They involve items such as square root signs, equations of a line, and combinations of math operations.

• Population symbols are almost always lowercase Greek letters. They refer to the unknown population values that you’re trying to estimate.

• Sample symbols are almost always lowercase English letters with accents. They refer to the known statistics that are calculated from data.

## Getting Familiar with Common Statistics

After data has been collected, the first step in analyzing it is to crunch some descriptive statistics to get an initial feeling for the data. For example:

• Where is the center of the data located?

• How spread out are the data?

• How correlated are the data from two variables?

The most common descriptive statistics are in the following table, along with their formulas and a short description of what each one measures.
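As a rough illustration of the three questions above (center, spread, correlation), the following minimal sketch computes a mean, a median, a sample standard deviation, and a sample correlation with Python's standard library. The data values and the helper name `pearson_r` are made up for illustration and do not come from the original table.

```python
import statistics

# Hypothetical paired data (made up for illustration only).
x = [2, 4, 6, 8, 10]
y = [1, 3, 7, 9, 10]

def pearson_r(xs, ys):
    """Sample correlation r between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    n = len(xs)
    cross = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return cross / ((n - 1) * sx * sy)

center_mean = statistics.mean(x)      # center: arithmetic average
center_median = statistics.median(x)  # center: middle value of the sorted data
spread_sd = statistics.stdev(x)       # spread: sample standard deviation (n - 1 divisor)
r = pearson_r(x, y)                   # linear association, always between -1 and 1

print(center_mean, center_median, round(spread_sd, 3), round(r, 3))
```

The mean and median answer the "center" question, the standard deviation answers the "spread" question, and r answers the "correlation" question for two variables.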

## Figuring Sample Size in Statistics

When designing a study, the sample size is an important consideration because the larger the sample size, the more data you have and the more precise your results will be (assuming high-quality data). If you know the level of precision you want (that is, your desired margin of error), you can calculate the sample size needed to achieve it.

To find the sample size needed to estimate a population mean (μ) or a population proportion (p), use the following formulas:

$$n = \left(\frac{z^{*}\,\sigma}{\text{MOE}}\right)^{2} \quad \text{(mean)}, \qquad n = \frac{(z^{*})^{2}\,p(1-p)}{\text{MOE}^{2}} \quad \text{(proportion)}$$

where z* is the critical value for the confidence level you need; MOE represents the desired margin of error; and σ represents the population standard deviation.

If σ is unknown (or, when estimating a proportion, p is unknown),

• When looking for μ, estimate σ with the sample standard deviation, s, from a pilot study.

• When looking for p, estimate p(1 – p) with p0(1 – p0), where p0 is some initial guess (usually 0.50) at p.
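The sample size calculation can be sketched in a few lines of Python. This is a minimal sketch that assumes the standard formulas n = (z*·σ/MOE)² for a mean and n = (z*)²·p0(1 – p0)/MOE² for a proportion; the function names and the example inputs are hypothetical. The result is rounded up, because a fractional sample size is not possible and rounding down would miss the target margin of error.

```python
import math

def sample_size_mean(z_star, sigma, moe):
    """Sample size needed to estimate a mean to within +/- moe.

    sigma may be a pilot-study estimate (s) when the true value is unknown.
    """
    return math.ceil((z_star * sigma / moe) ** 2)

def sample_size_proportion(z_star, moe, p0=0.50):
    """Sample size needed to estimate a proportion to within +/- moe.

    p0 is an initial guess at p; 0.50 gives the largest (most cautious) n.
    """
    return math.ceil(z_star ** 2 * p0 * (1 - p0) / moe ** 2)

# Hypothetical inputs: 95% confidence (z* = 1.96).
n_mean = sample_size_mean(z_star=1.96, sigma=15, moe=2)
n_prop = sample_size_proportion(z_star=1.96, moe=0.03)
print(n_mean, n_prop)
```

Using p0 = 0.50 is the conservative choice because p0(1 – p0) is largest at 0.50, so any other guess gives a smaller required sample size.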

## Checking Out Formulas for Confidence Intervals

In statistics, a confidence interval gives a range of plausible values for some unknown population characteristic. It contains an initial estimate plus or minus a margin of error (the amount by which you expect your results to vary if other samples were taken). The following table shows formulas for the components of the most common confidence intervals and keys for when to use them.
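The "estimate plus or minus a margin of error" structure can be sketched in code. The example below assumes two common cases, a z-interval for a mean with known σ and a large-sample z-interval for a proportion; these may not match every row of the table, and the function names and inputs are hypothetical.

```python
import math

def z_interval_mean(xbar, sigma, n, z_star=1.96):
    """Interval for a population mean: xbar +/- z* * sigma / sqrt(n)."""
    moe = z_star * sigma / math.sqrt(n)
    return xbar - moe, xbar + moe

def z_interval_proportion(p_hat, n, z_star=1.96):
    """Interval for a population proportion: p_hat +/- z* * sqrt(p_hat(1 - p_hat)/n)."""
    moe = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe

# Hypothetical summary statistics, 95% confidence (z* = 1.96).
mean_low, mean_high = z_interval_mean(xbar=50, sigma=10, n=100)
prop_low, prop_high = z_interval_proportion(p_hat=0.5, n=400)
print(mean_low, mean_high, prop_low, prop_high)
```

In both functions the margin of error is the critical value times the standard error, so a larger sample or a lower confidence level produces a narrower interval.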

## Looking at Confidence Interval Critical Values

Critical values (z*-values) are an important component of confidence intervals (the statistical technique for estimating population parameters). The z*-value, which appears in the margin of error formula, measures the number of standard errors to be added and subtracted in order to achieve your desired confidence level (the percentage confidence you want).

The following table shows common confidence levels and their corresponding z*-values.

| Confidence Level | z*-value |
|---|---|
| 80% | 1.28 |
| 85% | 1.44 |
| 90% | 1.645 |
| 95% | 1.96 |
| 98% | 2.33 |
| 99% | 2.58 |
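The tabled values can be reproduced from the standard normal distribution. This minimal sketch uses `statistics.NormalDist` from Python's standard library (Python 3.8 or later); the function name `z_star` is a hypothetical helper. It assumes a two-sided interval, so the leftover probability is split evenly between the two tails.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Critical value that leaves (1 - confidence_level) / 2 in each tail."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

# Confidence levels taken from the table above.
for level in (0.80, 0.85, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: {z_star(level):.3f}")
```

The printed values carry more decimal places than the table, which rounds most entries to two places.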

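You can also use these critical z*-values for two-sided hypothesis tests in which the test statistic follows a Z-distribution. If the absolute value of the test statistic is greater than the z*-value for your chosen confidence level, then reject the null hypothesis at the matching significance level (for example, use z* = 1.96 for a 5 percent significance level, which corresponds to 95% confidence).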

## Evaluating Claims with Hypothesis Tests

You use hypothesis tests to challenge whether some claim about a population is true (for example, a claim that 90 percent of Americans own a cellphone). To test a statistical hypothesis, you take a sample, collect data, form a statistic, standardize it to form a test statistic, and decide whether the test statistic refutes the claim. The following table lays out the important details for hypothesis tests.

Note that for the tests involving the difference of two population values (for example, two means or two proportions), it’s typical that the hypothesized value of that difference is 0.
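The steps described in this section (take a sample, form a statistic, standardize it, and decide) can be sketched for the cellphone claim. This is a minimal sketch of a one-proportion z-test, assuming a two-sided test at the 5 percent level; the survey counts (435 owners out of 500) and the function name are invented for illustration.

```python
import math

def one_proportion_z(successes, n, p0):
    """Test statistic for a claim that the population proportion equals p0."""
    p_hat = successes / n                 # sample statistic
    se = math.sqrt(p0 * (1 - p0) / n)     # standard error under the claim
    return (p_hat - p0) / se              # standardized test statistic

# Claim: 90 percent of Americans own a cellphone (p0 = 0.90).
# Hypothetical sample: 435 of 500 respondents own one.
z = one_proportion_z(successes=435, n=500, p0=0.90)

# Two-sided decision at the 5 percent level uses z* = 1.96.
reject_claim = abs(z) > 1.96
print(round(z, 3), reject_claim)
```

Note that the standard error uses the claimed value p0 rather than the sample proportion, because the test statistic is computed assuming the claim is true.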
