Statistical Data Used in Data Driven Marketing

By David Semmelroth

Part of Data Driven Marketing For Dummies Cheat Sheet

Anybody who’s ever used a spreadsheet is familiar with the idea of data types. Data comes in two basic flavors: numerical and character — numbers and text. Character data isn’t involved in statistical analysis. Numerical data breaks down into integer data and decimal data and can be formatted in various ways.

But for statistical analysis, a different set of distinctions matters. Not all data is created equal when it comes to calculating statistics.

Following are the basic data types along with a brief description of the kinds of statistics you can meaningfully calculate with them. Note that each data type in this list supports the calculations described for all the preceding types:

  • Categorical data: From a statistical point of view, this data is non-numeric, even when it's written with digits. It simply classifies records by categories. The numbers on football jerseys are an example. With this type of data, the only meaningful statistic is the number of records in each category.

  • Ordinal data: This type of data simply indicates some sort of order in which records fall. A typical example is a survey question that asks responders to rate something on a scale of 1 to 10. This sort of data supports the calculation of percentiles. The notion of median is also meaningful here. Averages, however, are not meaningful with ordinal data, because the gap between one rating and the next isn't necessarily the same size everywhere on the scale.

  • Interval data: Interval data supports comparisons of intervals. Dollar amounts, age, and temperature all have this property. For example, the difference between $1 and $2 is exactly the same as the difference between $100 and $101. This type of data supports most common statistical calculations such as means and standard deviations.

  • Ratio data: Ratio data is the most robust data type. It’s characterized by allowing comparisons of ratios. Ten years is twice as long as five years, for example. This type of data supports virtually every statistical calculation imaginable, including the coefficient of variation as well as more esoteric means like the geometric mean.
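The four data types above can be sketched in a few lines of Python. This is only an illustration: the sample values and variable names are made up for the example, and it assumes Python 3.8 or later, where the standard library's statistics module includes geometric_mean. Each step uses only the statistic that the matching data type supports.

```python
from collections import Counter
import statistics

# Hypothetical sample data, invented purely for illustration.
jersey_numbers = [12, 7, 12, 23, 7, 12]   # categorical: digits used as labels
survey_scores = [3, 7, 8, 8, 10]          # ordinal: ratings on a 1-to-10 scale
temperatures_f = [60.0, 70.0, 80.0]       # interval: differences are comparable
customer_years = [1.0, 2.0, 4.0]          # ratio: has a true zero

# Categorical: the only meaningful statistic is a count per category.
category_counts = Counter(jersey_numbers)

# Ordinal: order-based statistics such as the median are meaningful.
median_score = statistics.median(survey_scores)

# Interval: means and standard deviations become meaningful.
mean_temp = statistics.mean(temperatures_f)
stdev_temp = statistics.stdev(temperatures_f)  # sample standard deviation

# Ratio: ratio-based statistics such as the coefficient of variation
# (standard deviation divided by mean) and the geometric mean are meaningful.
cv_years = statistics.stdev(customer_years) / statistics.mean(customer_years)
geo_mean_years = statistics.geometric_mean(customer_years)

print(category_counts)
print(median_score, mean_temp, stdev_temp)
print(cv_years, geo_mean_years)
```

Nothing in Python stops you from calling statistics.mean on the jersey numbers. The result just wouldn't mean anything, which is why the data type has to be decided by the analyst rather than by the software.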