Levels of Measurement for Biostatistics Data - dummies

By John Pezzullo

Around the middle of the 20th century, the idea of levels of measurement caught the attention of biological and social-science researchers, particularly psychologists. One classification scheme, now very widely used (at least in statistics textbooks), recognizes four levels at which variables can be measured: nominal, ordinal, interval, and ratio:

  • Nominal variables are expressed as mutually exclusive categories, like gender (male or female), race (white, black, Asian, and so forth), and type of bacteria (such as coccus, bacillus, rickettsia, mycoplasma, or spirillum), where the sequence in which you list a variable’s different categories is purely arbitrary.

    For example, listing a choice of races as black, Asian, and white is no more or less “natural” than listing them as white, black, and Asian.

  • Ordinal data has categorical values (or levels) that fall naturally into a logical sequence, like the severity of an adverse event (slight, moderate, or severe), or an agreement scale, often called a Likert scale (strongly disagree, disagree, no opinion, agree, or strongly agree). Note that the levels are not necessarily “equally spaced”: the conceptual difference between slight and moderate need not be the same as the difference between moderate and severe.

  • Interval data is a numerical measurement where, unlike ordinal data, the difference (or interval) between two numbers is a meaningful measure of the amount of difference in what the variable represents, but the zero point is completely arbitrary and does not denote the complete absence of what you’re measuring.

    An example of this concept is the Celsius temperature scale. A change from 20 to 25 degrees Celsius represents the same amount of temperature increase as a change from 120 to 125 degrees Celsius. But 0 degrees Celsius is purely arbitrary. It does not represent the total absence of temperature; it’s simply the temperature at which water freezes (or, if you prefer, ice melts). Interval data can usually have numerical values that are positive, negative, or zero.

  • Ratio data, unlike interval data, does have a true zero point. The numerical value of a ratio variable is directly proportional to how much there is of what you’re measuring, and a value of zero means there’s nothing at all. Generally, ratio data cannot have negative values, because that would indicate less than nothing.

    Mass is a ratio measurement, as is temperature on the Kelvin scale, which starts at the absolute zero of temperature (about 273 degrees below zero on the Celsius scale), where there is no thermal energy at all.
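
The difference between interval and ratio data can be checked with a little arithmetic. The following Python sketch is only an illustration: the two temperatures are made-up values, and the helper name celsius_to_kelvin is an assumption introduced here. It shows that differences agree on both temperature scales, while a ratio is meaningful only on the Kelvin scale, which has a true zero.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvins (hypothetical helper for this sketch)."""
    return celsius + 273.15


# Two made-up temperatures, chosen only for illustration.
cold_c, warm_c = 10.0, 20.0
cold_k, warm_k = celsius_to_kelvin(cold_c), celsius_to_kelvin(warm_c)

# Differences are meaningful on an interval scale, and they match
# on both scales: a 10-degree rise is a 10-kelvin rise.
diff_c = warm_c - cold_c
diff_k = warm_k - cold_k

# Ratios are another matter. On the Celsius scale the arithmetic gives 2.0,
# but 20 degrees Celsius is not "twice as hot" as 10, because 0 degrees
# Celsius is an arbitrary zero point.
ratio_c = warm_c / cold_c

# On the Kelvin scale (a ratio scale) the quotient is meaningful:
# the warmer temperature is only about 3.5 percent higher.
ratio_k = warm_k / cold_k

print(f"Difference: {diff_c:.1f} degrees C, {diff_k:.1f} K")
print(f"Celsius ratio (not meaningful): {ratio_c:.3f}")
print(f"Kelvin ratio (meaningful): {ratio_k:.3f}")
```

The same reasoning applies to any interval variable: shifting the zero point leaves differences unchanged but changes every ratio.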

Statisticians tend to beat this topic to death — they love to point out cases that don’t fall neatly into one of the four levels and to bring up various counterexamples. But you need to be aware of the concepts and terminology in the preceding list because you’ll see them in statistics textbooks and articles, and because teachers love to include them on tests.

And, more practically, knowing the level of measurement of a variable can often help you choose the most appropriate way to analyze that variable.
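
As a rough illustration of that last point, the Python sketch below pairs each level of measurement with a summary that is commonly considered appropriate for it. The function name summarize, the sample values, and the particular choice of statistics are assumptions made for this example, not rules taken from the article; treat it as one possible convention rather than a definitive guide.

```python
import statistics

# Category order for the ordinal example, taken from the severity scale above.
SEVERITY_ORDER = ["slight", "moderate", "severe"]


def summarize(values, level, order=None):
    """Return a simple summary suited to the stated level of measurement.

    Hypothetical helper: `level` is one of "nominal", "ordinal",
    "interval", or "ratio"; `order` lists ordinal categories from low to high.
    """
    if level == "nominal":
        # Categories have no order, so report only the most common one.
        return {"mode": statistics.mode(values)}
    if level == "ordinal":
        # Order is meaningful but spacing is not, so report the median category.
        ranks = [order.index(v) for v in values]
        return {"median": order[statistics.median_low(ranks)]}
    summary = {"mean": statistics.mean(values), "sd": statistics.stdev(values)}
    if level == "ratio":
        # Dividing by the mean makes sense only when zero is a true zero.
        summary["cv"] = summary["sd"] / summary["mean"]
    return summary


# Made-up sample values, for illustration only.
print(summarize(["coccus", "bacillus", "coccus"], "nominal"))
print(summarize(["slight", "severe", "moderate", "slight", "moderate"],
                "ordinal", SEVERITY_ORDER))
print(summarize([36.5, 37.0, 37.5], "interval"))  # temperatures in Celsius
print(summarize([60.0, 70.0, 80.0], "ratio"))     # masses in kilograms
```

The coefficient of variation appears only for ratio data because it divides by the mean, and that quotient changes if the zero point is shifted.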