##### Biostatistics For Dummies

Biostatistics, in its present form, is the cumulative result of four centuries of contributions from many mathematicians and scientists. Some are well known, and some are obscure; some are famous people you never would’ve suspected of being statisticians, and some are downright eccentric and unsavory characters. This list gives you some highlights of the contributions of a few of the many people who made statistics (and therefore biostatistics) what it is today.

• Thomas Bayes (ca. 1701–1761). A Presbyterian minister and amateur mathematician, Bayes lived long before the field of statistics (as we know it) existed; people were still struggling to work out the basic laws of probability. Bayes dabbled with the “inverse probability” problem (figuring out what a population must be like, based on observing a sample from that population), but he never bothered to publish his work. Nevertheless, a formula he developed eventually became the foundation for Bayesian statistics, one of the two major branches of statistical theory (the other being frequentist statistics). Bayesian statistics wasn’t widely used to solve real-world problems until more than two centuries after the death of its creator. For more information, check out the Thomas Bayes entries on Encyclopedia.com and Wikipedia.org.

• Pierre-Simon Laplace (1749–1827). While Laplace did most of his work in astronomy (one of his driving ambitions was to prove that the solar system wouldn’t fly apart), he also made fundamental discoveries in mathematics. He helped put Bayesian statistics on a firm theoretical foundation, and he helped formulate the least-squares criterion for estimating population parameters. He was also one of the first scientists to suggest the existence of black holes due to gravitational collapse (and you thought that was a “modern” concept)! For more information, check out Pierre-Simon Laplace on Encyclopedia.com and Wikipedia.org.

• Carl Friedrich Gauss (1777–1855). Sometimes called “the Prince of Mathematicians,” Gauss made contributions that ranged from the most abstract and theoretical to the most practical. He developed nonlinear least-squares regression, found efficient ways to solve simultaneous equations, and discovered what’s now called the “fast Fourier transform” (FFT), without which the creation of CAT scans and MRI images would be hopelessly time-consuming. The normal distribution, with its bell-shaped curve, is often called a Gaussian distribution in his honor. For more information, check out Encyclopedia.com and Wikipedia.org coverage of Carl Friedrich Gauss.

• John Snow (1813–1858). A London physician, Snow was investigating a cholera outbreak and noticed that all the victims had been using a recently dug, public water pump, located three feet from an old, leaking cesspool. After he convinced skeptical local officials to remove the pump handle, the epidemic quickly petered out (after which, the officials promptly reinstalled the handle). Snow’s study marks the birth of the science of epidemiology (closely related to biostatistics), which studies the patterns, causes, and effects of health and disease conditions in specific populations. Snow also played a major role in popularizing the use of anesthesia in surgical and obstetrical procedures (helped by his giving chloroform to Queen Victoria during the deliveries of the last two of her nine children). For more information, check out Encyclopedia.com and Wikipedia.org.

• Florence Nightingale (1820–1910). Who would think that the famous “Lady with the Lamp” from the Crimean War, the founder of professional nursing, was also a statistician? But she was. She could convey complicated ideas in simple English and summarize data with easily understood graphs, including a special kind of pie chart she invented, called a polar area diagram. With the help of graphics that even politicians could understand, she was able to bring about profound improvements in medical care and public health. For more information, check out Encyclopedia.com and Wikipedia.org.

• Karl Pearson (1857–1936). The “founder of mathematical statistics” was an interesting character, to say the least: anti-Semitic, socialist, and an ardent eugenicist whose extreme views were part of the philosophical underpinnings of the Third Reich’s Holocaust. But his influence on the development of statistics was enormous. It included the concept of statistical hypothesis testing, the correlation coefficient, the chi-square test, the p value, and factor analysis (to mention only a few), all of which he developed to further the scientific credibility of his outlandish views. For more information, check out Encyclopedia.com and Wikipedia.org.

• William S. Gosset (1876–1937). Gosset worked for the Guinness Brewery in Dublin, where he encountered the problem of comparing the means of small samples. With some help from Karl Pearson, Gosset came up with the correct solution. Not being a high-powered mathematician, he relied on brilliant intuition to come up with a guess at the answer, which he then confirmed by painstaking and time-consuming simulations conducted entirely by hand (computers hadn’t been invented yet). Guinness wouldn’t let him publish his results under his real name; they made him use the pen name “Student” instead, forever depriving him of the name recognition he truly deserved. What everyone calls the Student t test and the Student t distribution should really have been the Gosset t test and the Gosset t distribution. A pity indeed. For more information, check out Encyclopedia.com and Wikipedia.org.

• Ronald A. Fisher (1890–1962). Perhaps the most towering figure in the development of statistical techniques in use today, Fisher invented the analysis of variance and the Fisher exact test for analyzing cross-tabulated data (the chi-square test was only approximate). Like Karl Pearson, Fisher was a rabid eugenicist and racist and (in retrospect) was on the wrong side of other important issues — he argued against the idea that smoking caused lung cancer. And his opposition to Bayesian statistics may be partly responsible for the subordinate role of Bayesian methods during most of the 20th century. For more information, check out Encyclopedia.com and Wikipedia.org.

• John W. Tukey (1915–2000). A pioneer in promoting exploratory data analysis (carefully examining what the data are trying to say before jumping into formal statistical testing), Tukey invented the box-and-whiskers plot and the stem-and-leaf plot as aids to visualizing how a set of numbers is distributed. He also developed one of the best so-called post-hoc tests to determine which pairs of groups of numbers are significantly different from which others. A true computer scientist, he coined the term bit as a nickname for “binary digit” and was either the first or second person to use the term software in print. For more information, check out Wikipedia.org.

• David R. Cox (1924–2022). A very productive, “modern” statistician, Cox made pioneering contributions to many areas of statistics, including the design of experiments. He’s most famous for developing a way to apply regression analysis to survival data when the general shape of the survival curve can’t be represented by a mathematical formula. His original paper describing this proportional-hazards model (now usually referred to simply as Cox regression) is one of the most frequently cited articles in all medical literature. For more information, check out Wikipedia.org.