Simplifying Excess Statistical Data in a Time Chart

By Deborah J. Rumsey

If a time chart includes too much statistical data, the result can be so complex that the data become impossible to interpret. Reducing the amount of data makes it easier to see patterns emerge.

A chart of the time between eruptions for Old Faithful geyser in Yellowstone Park is shown in the following figure. You see 222 dots on this graph; each one represents the time between one eruption and the next, for every eruption during a 16-day period.

This figure looks very complex: data are everywhere, there are too many points to really see anything, and you can’t see the forest for the trees. There is such a thing as having too much data, especially now that advanced technology lets you record measurements continuously and meticulously.

To get a clearer picture of the Old Faithful data, you can combine all the observations from a single day and find their mean. The following time chart is the result of making this calculation for each of the 16 days and then plotting the means in order. This reduces the data from 222 points to 16 points.
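The daily-mean reduction described above can be sketched in a few lines of Python. This is only an illustration: the function name `daily_means`, the `(day, minutes)` input layout, and the sample values are assumptions made for the example, and the numbers are made up rather than taken from the Old Faithful measurements.

```python
from collections import defaultdict
from statistics import mean


def daily_means(observations):
    """Collapse (day, minutes) pairs into one (day, mean minutes) pair per day.

    The result is sorted by day so it can be plotted in time order.
    """
    by_day = defaultdict(list)
    for day, minutes in observations:
        by_day[day].append(minutes)
    return [(day, mean(values)) for day, values in sorted(by_day.items())]


# Made-up illustrative values, NOT the actual Old Faithful data.
# Each pair is (day number, minutes between eruptions).
sample = [
    (1, 60.0), (1, 80.0), (1, 70.0),
    (2, 50.0), (2, 90.0),
    (3, 75.0),
]

for day, avg in daily_means(sample):
    print(f"Day {day}: mean time between eruptions = {avg:.1f} minutes")
```

Applied to the real data, the same grouping would turn the 222 individual observations into 16 daily means, one point per day on the simplified time chart.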

From the time chart of daily means, you can see a hint of a cyclical pattern: every day or two, the data appear to shift from shorter times between eruptions to longer ones. While this pattern is not definitive, it does provide important information for scientists to follow up on when studying the behavior of geysers like Old Faithful.