The Building Blocks of Mathematical Formulas for Biostatistics - dummies

By John Pezzullo

No matter how they’re written, mathematical formulas are just concise “recipes” that tell you how to calculate something or how something is defined. You just have to know how to read the recipe. To start, look at the building blocks from which formulas are constructed: constants (whose values never change) and variables (names that stand for quantities that can take on different values at different times).


Constants can be represented explicitly (using the numerals 0–9, with or without a decimal point) or symbolically (using a letter in the Greek or Roman alphabet to stand for a value that’s especially important in mathematics, physics, or some other discipline). For example:

  • The Greek letter π (spelled pi and pronounced pie) almost always represents 3.14159 (plus a zillion more digits), which is the ratio of the circumference of any circle to its diameter.

  • The strange number 2.71828 (plus a zillion more digits) is called e (usually italicized). You see e in statistical formulas throughout this book and in almost every other mathematical and statistical textbook. Whenever you see an italicized e, it refers to the number 2.718 unless stated otherwise.

    The official mathematical definition of e is the value that the expression (1 + 1/n)^n approaches as n gets larger and larger (approaching infinity). Unlike pi, e has no simple geometrical interpretation, but one (somewhat far-fetched) example of where e pops up is this: Assume you put exactly one dollar in a bank account that’s paying 100 percent annual interest, compounded continuously. After exactly one year, your account will have e dollars in it. The interest earned on your original dollar, plus all the interest earned on the interest, would come to about $1.72 (rounded to the nearest penny), for a total of $2.72 in your account. Start saving for that summer home!

Mathematicians and scientists use lots of other special Greek and Roman letters as symbols for special numerical constants, but you need only a few of them in your biostatistics work. Pi and e are the most common.
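
As a rough illustration of these two constants, here is a minimal Python sketch. It assumes Python's standard math module, and the function name compound_one_dollar is hypothetical, introduced only for this example. It shows (1 + 1/n)^n creeping toward e as n grows, which is the same as compounding one dollar at 100 percent annual interest more and more often.

```python
import math

def compound_one_dollar(n):
    """Value of $1 after one year at 100% annual interest, compounded n times."""
    return (1 + 1 / n) ** n

# The more often interest is compounded, the closer the balance gets to e.
for n in (1, 12, 365, 1_000_000):
    print(f"n = {n:>9,}: {compound_one_dollar(n):.5f}")

print(f"math.e  = {math.e:.5f}")   # the limiting value of (1 + 1/n)^n
print(f"math.pi = {math.pi:.5f}")  # circumference divided by diameter

# Continuous compounding, rounded to the nearest penny.
balance = round(math.e, 2)       # total in the account after one year
interest = round(math.e - 1, 2)  # interest earned on the original dollar
print(f"balance = ${balance}, interest = ${interest}")
```

Yearly, monthly, and daily compounding all fall short of e; only in the limit of continuous compounding does the balance reach it.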


The term variable has several slightly different meanings in different fields:

  • In mathematics and the sciences, a variable is a symbol (usually a letter of the alphabet) that represents some quantity in a formula. You see variables like x and y in algebra, for example.

  • In computer science, a variable is a name (usually made up of one or more letters and perhaps also numeric digits) that refers to a place in the computer’s memory where one or more numbers (or other kinds of data) can be stored and manipulated.

    For example, a computer programmer writing a statistical software program may use a variable called SumXY to stand for a quantity that’s used in the computation of a correlation coefficient.

  • In statistics, a variable is the data element you collect (by counting, measuring, or calculating) and store in a file for analysis. This data doesn’t have to be numerical; it can also be categorical or textual. So the variables Name, ID, Gender, Birthdate, and Weight refer to the data that you acquire on subjects.
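
To make the computer-science sense of the word concrete, here is a minimal Python sketch of how a variable like SumXY might appear when computing a Pearson correlation coefficient. The function name pearson_r, the other variable names, and the sample data are all assumptions made for illustration; real statistical software may organize the calculation differently.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from sums of the data."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # plays the role of SumXY
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Made-up heights (cm) and weights (kg) for five hypothetical subjects.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 77, 90]
r = pearson_r(heights, weights)
print(f"r = {r:.3f}")
```

Here heights and weights are variables in the statistical sense (data collected on subjects), while sum_xy and its neighbors are variables in the programming sense (named storage for intermediate results).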

Variable names may be written in uppercase or lowercase letters, depending on typographic conventions or preferences, or on the requirements of the software being used.

In typeset formulas, variables are conventionally italicized; in plain-text formulas, they’re not.