Not all data are created equal. As you begin your Six Sigma quest to organize your data, you first need to know what type of performance data you have. Just as knowing what the fish are biting tells you which lure to use, knowing what kind of data you’re dealing with tells you which tools to use. The two important data categories are attribute and continuous.

| Data Type | Description | Examples |
|---|---|---|
| Attribute/category | Data observations fall into discrete, named value categories. No mathematical operations can be performed on the raw data. You can count the number of occurrences you see of each category. | Eye color: brown, blue, green; Location: Factory 1, Factory 2, Factory 3; Inspection result: pass, fail; Size: large, medium, small; Fit check: go, no-go; Questionnaire response: yes, no; Attendance: present, absent; Employee: Fred, Suzanne, Holly; Processing: Treatment A, Treatment B |
| Continuous | Data observations can take on numerical value and aren’t confined to nominal categories. Any two data values can be meaningfully added and subtracted. | Bank account balance: dollars; Length: meters; Time: seconds; Electrical current: amps; Survey response: 1 = disagree, 2 = neutral, 3 = agree |

## Attribute (category) data

Some data consist of measurements that describe an attribute of the characteristic or process. These data are called *attribute* or *category* data.

Attribute data are all around you:

- Telephone area codes
- S, M, L, XL, XXL clothing sizes
- “Pass” or “fail” judgments pronounced on just-assembled products
- “Good” or “bad” assessments of the output from a process

How do you know whether you’re working with attribute data? The telltale test is to ask yourself, “Can I meaningfully add or subtract values of this data?”

If the answer is “no,” what you have is attribute data. For example, what do you get when you add an S-sized shirt to an M-sized shirt? Nothing meaningful. Or, if you subtract telephone area code 213 from area code 415, does the resulting area code of 202 mean something? Of course not! And so you know that you’re dealing with attribute data.

What you can do with attribute data is count how many times each category or attribute appears. For example, you may find that a process produces 152 “good” items and 28 “bad” items over a given period of time. You use the results of these types of category counting studies as the starting point for many Six Sigma analyses.
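As a minimal sketch of this kind of category counting, the Python snippet below tallies the 152 “good” and 28 “bad” items from the example. The list `inspection_results` is a hypothetical stand-in for a real inspection log, and the fraction of bad items is just one illustrative summary you might compute from the counts.

```python
from collections import Counter

# Hypothetical inspection log matching the example: 152 "good" and 28 "bad" items.
inspection_results = ["good"] * 152 + ["bad"] * 28

# Counting occurrences is the one operation attribute data always supports.
counts = Counter(inspection_results)

total = sum(counts.values())
fraction_bad = counts["bad"] / total

print(counts["good"], counts["bad"])  # 152 28
print(f"{fraction_bad:.1%}")          # 15.6%
```

Notice that the arithmetic happens on the counts, not on the “good” and “bad” labels themselves.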

A subset of attribute data that provides a little more horsepower is called *ordinal data* (also known as *rank order data*). Ordinal data are attribute data that can be logically placed in an order from smallest to greatest or in an order of time, such as the months of the year: January, February, March, and so on.

If you have “month” data on a set of last year’s invoices, you can sort them into buckets of occurrence starting with January and moving throughout the year. Or you may not have actual completion times, but you may have data about which employees finished a task first, second, third, and so on. Either way, you have a powerful set of ordinal data that you can use to begin analysis and improvement.
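The invoice example can be sketched in a few lines of Python. The `invoice_months` list is made-up data, and `MONTH_ORDER` is an assumed helper that supplies the calendar order; the category labels alone don’t carry that order, so you have to provide it yourself.

```python
from collections import Counter

# The natural order of the categories; this is what makes the data ordinal.
MONTH_ORDER = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Hypothetical "month" field pulled from a handful of invoices.
invoice_months = ["March", "January", "March", "February", "January", "March"]

# Count each category, then report the buckets in calendar order.
counts = Counter(invoice_months)
ordered_buckets = [(month, counts[month]) for month in MONTH_ORDER if counts[month] > 0]

print(ordered_buckets)  # [('January', 2), ('February', 1), ('March', 3)]
```

You still can’t add January to March, but you can say that one comes before the other, and that is the extra horsepower ordinal data provide.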

## Continuous (variable) data

If you find that you can meaningfully add or subtract any two values of your data, you’re working with *continuous* (or *variable*) data rather than attribute data.

When testing whether data are attribute or continuous, be sure to apply the “meaningfully add or subtract the values” question to the raw data and not to any summarized counts of the data.

For example, the fact that you can subtract five M-sized shirts from seven L-sized shirts to get a two-shirt difference doesn’t indicate that you have continuous data. You have to apply the question to the raw data: an L-sized shirt minus an M-sized shirt has no meaningful answer.
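One way to picture the difference between raw data and summarized counts is the Python sketch below. The `shirts` list is hypothetical, and Python’s refusal to subtract two text labels is only an analogy for “no meaningful answer,” not a formal test of data type.

```python
from collections import Counter

# Hypothetical raw observations: seven L-sized and five M-sized shirts.
shirts = ["L"] * 7 + ["M"] * 5

# The summarized counts are numbers, so subtracting them works...
counts = Counter(shirts)
count_difference = counts["L"] - counts["M"]

# ...but the raw observations are labels, and subtracting labels is undefined.
try:
    raw_difference = shirts[0] - shirts[-1]  # "L" - "M"
except TypeError:
    raw_difference = None

print(count_difference)  # 2
print(raw_difference)    # None
```

The two-shirt difference comes from the counts; the raw data are still attribute data.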

Both *continuous* and *variable* are poor names for this type of data, but for whatever reason, these are the names that have stuck. The name “continuous” is meant to convey the idea that this data type can have any value from a continuous scale, like the reading on a mercury thermometer.

“Variable” is an attempt to say the same thing — that the measured values can vary anywhere along a given scale. You can get 98.23 degrees Fahrenheit or 98.25 degrees Fahrenheit or 98.37 degrees Fahrenheit.

The problem is that no matter how continuous or variable you think your measurement scale is, as soon as you record a measurement, you always truncate its reading to some fixed length, making it no longer continuous. But the powers that be want you to use the names *continuous* and *variable,* so go ahead and use them anyway.
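To see the truncation in action, here is a small Python sketch. The “true” reading of 98.2347 degrees and the recording resolution of hundredths of a degree are both assumptions chosen purely for illustration.

```python
from decimal import Decimal, ROUND_DOWN

# Hypothetical "true" temperature on a continuous scale.
true_reading = Decimal("98.2347")

# Assumed recording resolution: hundredths of a degree.
resolution = Decimal("0.01")

# Truncate (round toward zero) to the recording resolution.
recorded = true_reading.quantize(resolution, rounding=ROUND_DOWN)

print(recorded)  # 98.23
```

Every recorded value lands on a grid of 0.01-degree steps, so the recorded data are no longer strictly continuous even though the underlying scale is.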

A few more examples of continuous data include

- A numbered GPA scale representing letter grades at school
- The temperature in your oven
- The amount of money you spend on groceries
- The time it takes to complete a process task
- The gas mileage of your car

Any two values of continuous or variable data can always be meaningfully added or subtracted, even when the measurement scale itself isn’t truly continuous. For example, a count of the number of children in each household can only occur in integer values (you can’t physically have 2.3 children), so the scale of measure of children in a household isn’t continuous at all.

But you can take the integer measurement from each household and perform mathematical operations to calculate a meaningful average or standard deviation. Being able to mathematically operate on any two values of continuous data is what sets it apart from attribute data.
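As a rough illustration, the Python sketch below computes those statistics with the standard `statistics` module. The household counts are made-up numbers, and the sample standard deviation (`statistics.stdev`) is just one reasonable choice of spread measure.

```python
import statistics

# Hypothetical number of children in each of ten households (integers only).
children_per_household = [2, 3, 1, 4, 2, 0, 3, 2, 3, 3]

# The raw values can be meaningfully added, so averages and spreads make sense.
mean_children = statistics.mean(children_per_household)
stdev_children = statistics.stdev(children_per_household)  # sample standard deviation

print(mean_children)             # 2.3
print(round(stdev_children, 2))  # 1.16
```

No single household has 2.3 children, but 2.3 is a perfectly meaningful average for the group.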