Cheat Sheet
Excel 2007 Data Analysis For Dummies Cheat Sheet
Excel 2007 has great data analysis tools. You can create any statistic you want; just make sure that you know what the statistic means. For that, you need to know the statistical measures available, the Boolean expressions Excel 2007 uses, and the Excel 2007 database functions.
Excel 2007 Database Functions
Excel 2007’s data analysis tools include a set of database functions for making statistical calculations using information from lists. The following table shows these functions and explains what each does:
Function  Description 

DAVERAGE  Calculates arithmetic mean 
DCOUNT  Counts the number of cells with values 
DCOUNTA  Counts the number of cells that aren’t empty 
DGET  Returns a value from a database list 
DMAX  Finds the largest value in a list 
DMIN  Finds the smallest value in a list 
DPRODUCT  Calculates the product of values matching criteria 
DSTDEV  Calculates the standard deviation of a sample 
DSTDEVP  Calculates the standard deviation of a population 
DSUM  Calculates the sum of values matching criteria 
DVAR  Calculates the variance of a sample 
DVARP  Calculates the variance of a population 
All these database functions use a standard threeargument syntax. For example, the DAVERAGE function looks like this: =DAVERAGE(database,field,criteria)where database is a range reference to the Excel list that holds the value you want to examine, field tells Excel which column in the database to examine, and criteria is a range reference that identifies the fields and values used to define your selection criteria. The field argument can be a cell reference holding the field name, the field name enclosed in quotation marks, or a number that identifies the column (1 for the first column, 2 for the second column, and so on).
Excel 2007’s Quick Data Analysis from the Status Bar
Excel 2007 can tell you quite a bit about a selected range of cells just from the status bar. When you select cells that have data in them, Excel’s status bar can tell you some useful information about those cells. When you rightclick the status bar, Excel displays a popup menu that includes several statistical measures that you can make on the selected range; they’re shown in the following table:
Statistical Measures Option  What It Does  Statistical Measures Option  What It Does 

[None]  Tells Excel that you don’t want it to calculate and then show a statistic on the status bar. 
Maximum  Finds the largest value in the selected range 
Average  Finds the mean of values in selected range.  Minimum  Finds the smallest value in the selected range 
Count  Tallies the cells that hold labels, values, or formulas. Use this when you want to count the number of cells that aren’t empty. 
Sum  Adds the values in the selected range 
Numerical Count  Tallies the number of cells in a selected range that hold values or formulas. 
Using Boolean Expressions in Data Analysis with Excel 2007
Excel 2007 data analysis tools include Boolean expressions, of course. To construct a Boolean expression, such as when you filter criteria, use a comparison operator and then a value in the comparison (>5, for example). The following table shows the Excel comparision operators and what they represent:
Comparison Operator  What It Means 

=  Equals 
>  Greater than 
>=  Greater than or equal to 
<  Less than 
<=  Less than or equal to 
<>  Not equal to 
Statistics Terms to Know when Using Excel 2007 Data Analysis Tools
With the data analysis tools available in Excel 2007, you can create spreadsheets that show the details of any statistic you can create a formula to find — and you can find any number. It helps to know what you’re looking for and what to expect, and the terms in the following list help you understand what kinds of statistics you can produce.

average: Typically, an average is the arithmetic mean for a set of values. Excel supplies several average functions.

chisquare: Use chisquares to compare observed values with expected values, returning the level of significance, or probability (also called a pvalue). A pvalue helps you to assess whether differences between the observed and expected values represent chance.

crosstabulation: This is an analysis technique that summarizes data in two or more ways. Summarizing sales information both by customer and product is a crosstabulation.

descriptive statistics: Descriptive statistics just describe the values in a set. For example, if you sum a set of values, that sum is a descriptive statistic. Finding the biggest value or the smallest value in a set of numbers is also a descriptive statistic.

exponential smoothing: Exponential smoothing calculates the moving average but weights the values included in the moving average calculations so that more recent values have a bigger effect.

inferential statistics: Inferential statistics are based on the very useful, intuitive idea that if you look at a sample of values from a population and the sample is representative and large enough, you can draw conclusions about the population based on characteristics of the sample.

kurtosis: This is a measure of the tails in a distribution of values.

median: The median is the middle value in a set of values. Half of the values fall below the median, and half of the values fall above the median.

mode: Mode is the most common value in a set.

moving average: A moving average is calculated using only a specified set of values, such as an average based on just the last three values.

normal distribution: Also known as a Gaussian distribution, normal distribution is the infamous bell curve.

pvalue: A pvalue is the level of significance, or probability.

regression analysis: Regression analysis involves plotting pairs of independent and dependent variables in an XY chart and then finding a linear or exponential equation that best describes the plotted data.

skewness: This is a measure of the symmetry of a distribution of values.

standard deviation: A standard deviation describes dispersion about the data set’s mean. You can kind of think of a standard deviation as an average deviation from the mean.

variance: A variance describes dispersion about the data set’s mean. The variance is the square of the standard deviation; the standard deviation is the square root of the variance.

zvalue: This is the distance between a value and the mean in terms of standard deviations.