How to Use Different Types of Data for Six Sigma
Not all data are created equal. As you begin your Six Sigma quest to organize your data, you first need to know what type of performance data you have. Just as knowing what the fish are biting tells you which lure to use, knowing what kind of data you’re dealing with tells you which tools to use. There are two important data categories: attribute and continuous.
|Data type|Description|Examples|
|---|---|---|
|Attribute/category|Data observations fall into discrete, named value categories. No mathematical operations can be performed on the raw data. You can count the number of occurrences you see of each category.|Eye color: brown, blue, green; Location: Factory 1, Factory 2, Factory 3; Inspection result: pass, fail; Size: large, medium, small; Fit check: go, no-go; Questionnaire response: yes, no; Attendance: present, absent; Employee: Fred, Suzanne, Holly; Processing: Treatment A, Treatment B|
|Continuous|Data observations can take on numerical value and aren’t confined to nominal categories. Any two data values can be meaningfully added and subtracted.|Bank account balance: dollars; Electrical current: amps; Survey response: 1 = disagree, 2 = neutral, 3 = agree|
Attribute (category) data
Some data consist of measurements that describe an attribute of the characteristic or process. These data are called attribute or category data.
Attribute data are all around you:
Telephone area codes
S, M, L, XL, XXL clothing sizes
Pass or fail judgments pronounced on just-assembled products
Good or bad assessments of the output from a process
How do you know whether you’re working with attribute data? The telltale test is to ask yourself, “Can I meaningfully add or subtract values of this data?”
If the answer is no, what you have is attribute data. For example, what do you get when you add an S-sized shirt to an M-sized shirt? Nothing meaningful. Or, if you subtract telephone area code 213 from area code 415, does the resulting area code of 202 mean something? Of course not! And so you know that you’re dealing with attribute data.
What you can do with attribute data is count how many times each category or attribute appears. For example, you may find that a process produces 152 good items and 28 bad items over a given period of time. You use the results of these types of category counting studies as the starting point for many Six Sigma analyses.
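As a minimal sketch of this kind of category counting, the Python snippet below tallies inspection results with the standard library’s `collections.Counter`. The list `inspection_results` is made-up illustration data, sized to match the 152 good and 28 bad items mentioned above.

```python
from collections import Counter

# Hypothetical raw attribute data: one label per inspected item.
inspection_results = ["good"] * 152 + ["bad"] * 28

# Counting occurrences is the operation attribute data supports.
counts = Counter(inspection_results)
total = sum(counts.values())

# Report each category with its count and share of the total.
for category, n in counts.most_common():
    print(f"{category}: {n} ({n / total:.1%})")
```

The counts, not the labels themselves, are what feed into later analysis.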
A subset category of attribute data that provides a little more horsepower is called ordinal data (also known as rank order data). Ordinal data are attribute data that can be logically placed in an order from smallest to greatest or in an order of time, such as the months of the year: January, February, March, and so on.
If you have month data on a set of last year’s invoices, you can sort them into buckets of occurrence starting with January and moving throughout the year. Or you may not have actual completion times, but you may have data about which employees finished a task first, second, third, and so on. Either way, you have a powerful set of ordinal data that you can use to begin analysis and improvement.
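The invoice example can be sketched in Python as follows. The `invoice_months` list is hypothetical data, and `MONTH_ORDER` is an explicit ordering written out for illustration; the point is only that ordinal categories are counted like any attribute data but reported in their natural order.

```python
from collections import Counter

# The natural (ordinal) order of the categories.
MONTH_ORDER = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Hypothetical month labels taken from a set of invoices.
invoice_months = ["March", "January", "March", "February", "January", "March"]

# Count occurrences per month, as with any attribute data.
counts = Counter(invoice_months)

# List the non-empty buckets in calendar order, not alphabetical order.
buckets = [(month, counts[month]) for month in MONTH_ORDER if counts[month] > 0]
print(buckets)
```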
Continuous (variable) data
If you find that you can meaningfully add or subtract any two values of your data, you’re working with continuous (or variable) data rather than attribute data.
When testing whether data are attribute or continuous, be sure to apply the “Can I meaningfully add or subtract the values?” question to the raw data and not to any summarized counts of the data.
For example, the fact that you can subtract five M-sized shirts from seven L-sized shirts to get a two-shirt difference doesn’t indicate that you have continuous data. You have to apply the question to the raw data: an L-sized shirt minus an M-sized shirt has no meaningful answer.
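One rough way to see the difference between raw values and summarized counts is the Python sketch below. It assumes the shirt sizes are stored as text labels (the `shirts` list is made up). Python refusing to subtract two strings is only an analogy for “no meaningful answer,” not a formal test of data type.

```python
from collections import Counter

# Hypothetical raw data: one size label per shirt.
shirts = ["L"] * 7 + ["M"] * 5

# Summarized counts are numbers, so subtracting them works.
counts = Counter(shirts)
difference_in_counts = counts["L"] - counts["M"]

# The raw values are labels; subtracting them is undefined.
try:
    "L" - "M"
    raw_subtraction_works = True
except TypeError:
    raw_subtraction_works = False

print(difference_in_counts, raw_subtraction_works)
```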
Both continuous and variable are poor names for this type of data, but for whatever reason, these are the names that have stuck. The name continuous is meant to convey the idea that this data type can have any value from a continuous scale, like the reading on a mercury thermometer.
Variable is an attempt to say the same thing — that the measured values can vary anywhere along a given scale. You can get 98.23 degrees Fahrenheit or 98.25 degrees Fahrenheit or 98.37 degrees Fahrenheit.
The problem is that no matter how continuous or variable you think your measurement scale is, as soon as you record a measurement, you always truncate its reading to some fixed length, making it no longer continuous. But the powers that be want you to use the names continuous and variable, so go ahead and use them anyway.
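To illustrate the point about fixed-length readings, here is a small Python sketch. The value `true_temperature_f` is an invented number, and rounding to two decimal places stands in for whatever resolution a real instrument or data sheet would impose.

```python
# Hypothetical "true" temperature with more digits than anyone records.
true_temperature_f = 98.2345678

# Recording to two decimal places limits readings to steps of 0.01 degrees,
# so the recorded values form a fine-grained but discrete scale.
recorded = round(true_temperature_f, 2)
print(recorded)
```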
A few more examples of continuous data include
A numbered GPA scale representing letter grades at school
The temperature in your oven
The amount of money you spend on groceries
The time it takes to complete a process task
The gas mileage of your car
Continuous data don’t have to come from a truly continuous scale; what matters is that any two values can be meaningfully added or subtracted. For example, a count of the number of children in each household can occur only in integer values (you can’t physically have 2.3 children), so the scale of measure for children in a household isn’t continuous at all.
But you can take the integer measurement from each household and perform mathematical operations to calculate a meaningful average or standard deviation. Being able to mathematically operate on any two values of continuous data is what sets it apart from attribute data.
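As a minimal sketch of that calculation, the Python snippet below uses the standard library’s `statistics` module on a made-up list of household counts (`children_per_household` is illustration data, not from any study).

```python
import statistics

# Hypothetical counts of children in eight households (integers only).
children_per_household = [0, 1, 2, 2, 3, 1, 4, 2]

# Even though each value is an integer, the average need not be.
mean_children = statistics.mean(children_per_household)

# statistics.stdev computes the sample standard deviation.
stdev_children = statistics.stdev(children_per_household)

print(f"mean: {mean_children}, standard deviation: {stdev_children:.2f}")
```

With this sample the mean is 1.875 children per household. No single household can have that value, but it is perfectly meaningful as a summary.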