# Excel 2007 Data Analysis For Dummies

From Excel 2007 Data Analysis For Dummies by Stephen L. Nelson

Excel 2007 has great data analysis tools. You can create any statistic you want; just make sure that you know what the statistic means. For that, you need to know the statistical measures available, the Boolean expressions Excel 2007 uses, and the Excel 2007 database functions.

## Excel 2007 Database Functions

Excel 2007's data analysis tools include a set of database functions for making statistical calculations using information from lists. The following table shows these functions and explains what each does:

Function Description
DAVERAGE Calculates arithmetic mean
DCOUNT Counts the number of cells with values
DCOUNTA Counts the number of cells that aren’t empty
DGET Returns a value from a database list
DMAX Finds the largest value in a list
DMIN Finds the smallest value in a list
DPRODUCT Calculates the product of values matching criteria
DSTDEV Calculates the standard deviation of a sample
DSTDEVP Calculates the standard deviation of a population
DSUM Calculates the sum of values matching criteria
DVAR Calculates the variance of a sample
DVARP Calculates the variance of a population

All these database functions use a standard three-argument syntax. For example, the DAVERAGE function looks like this: =DAVERAGE(database,field,criteria)where database is a range reference to the Excel list that holds the value you want to examine, field tells Excel which column in the database to examine, and criteria is a range reference that identifies the fields and values used to define your selection criteria. The field argument can be a cell reference holding the field name, the field name enclosed in quotation marks, or a number that identifies the column (1 for the first column, 2 for the second column, and so on).

## Excel 2007's Quick Data Analysis from the Status Bar

Excel 2007 can tell you quite a bit about a selected range of cells just from the status bar. When you select cells that have data in them, Excel's status bar can tell you some useful information about those cells. When you right-click the status bar, Excel displays a pop-up menu that includes several statistical measures that you can make on the selected range; they’re shown in the following table:

Statistical Measures Option What It Does Statistical Measures Option What It Does
[None] Tells Excel that you don't want it to calculate and then show a statistic on the status bar. Maximum Finds the largest value in the selected range
Average Finds the mean of values in selected range. Minimum Finds the smallest value in the selected range
Count Tallies the cells that hold labels, values, or formulas. Use this when you want to count the number of cells that aren’t empty. Sum Adds the values in the selected range
Numerical Count Tallies the number of cells in a selected range that hold values or formulas.

## Using Boolean Expressions in Data Analysis with Excel 2007

Excel 2007 data analysis tools include Boolean expressions, of course. To construct a Boolean expression, such as when you filter criteria, use a comparison operator and then a value in the comparison (>5, for example). The following table shows the Excel comparision operators and what they represent:

Comparison Operator What It Means
= Equals
> Greater than
>= Greater than or equal to
< Less than
<= Less than or equal to
<> Not equal to

## Statistics Terms to Know when Using Excel 2007 Data Analysis Tools

With the data analysis tools available in Excel 2007, you can create spreadsheets that show the details of any statistic you can create a formula to find — and you can find any number. It helps to know what you’re looking for and what to expect, and the terms in the following list help you understand what kinds of statistics you can produce.

• average: Typically, an average is the arithmetic mean for a set of values. Excel supplies several average functions.

• chi-square: Use chi-squares to compare observed values with expected values, returning the level of significance, or probability (also called a p-value). A p-value helps you to assess whether differences between the observed and expected values represent chance.

• cross-tabulation: This is an analysis technique that summarizes data in two or more ways. Summarizing sales information both by customer and product is a cross-tabulation.

• descriptive statistics: Descriptive statistics just describe the values in a set. For example, if you sum a set of values, that sum is a descriptive statistic. Finding the biggest value or the smallest value in a set of numbers is also a descriptive statistic.

• exponential smoothing: Exponential smoothing calculates the moving average but weights the values included in the moving average calculations so that more recent values have a bigger effect.

• inferential statistics: Inferential statistics are based on the very useful, intuitive idea that if you look at a sample of values from a population and the sample is representative and large enough, you can draw conclusions about the population based on characteristics of the sample.

• kurtosis: This is a measure of the tails in a distribution of values.

• median: The median is the middle value in a set of values. Half of the values fall below the median, and half of the values fall above the median.

• mode: Mode is the most common value in a set.

• moving average: A moving average is calculated using only a specified set of values, such as an average based on just the last three values.

• normal distribution: Also known as a Gaussian distribution, normal distribution is the infamous bell curve.

• p-value: A p-value is the level of significance, or probability.

• regression analysis: Regression analysis involves plotting pairs of independent and dependent variables in an XY chart and then finding a linear or exponential equation that best describes the plotted data.

• skewness: This is a measure of the symmetry of a distribution of values.

• standard deviation: A standard deviation describes dispersion about the data set’s mean. You can kind of think of a standard deviation as an average deviation from the mean.

• variance: A variance describes dispersion about the data set’s mean. The variance is the square of the standard deviation; the standard deviation is the square root of the variance.

• z-value: This is the distance between a value and the mean in terms of standard deviations.