Statistical Analysis with R For Dummies
The R function for calculating standard scores is called scale(). Supply a vector of scores, and scale() returns the z-scores as a one-column matrix along with, helpfully, the mean and the standard deviation, which it attaches as attributes.
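As a minimal sketch of the arithmetic involved, the snippet below standardizes a small made-up vector by hand and compares the result with scale(). The name scores and its values are hypothetical, not from the book, and the comparison assumes scale()'s default arguments, which center on the mean and divide by the sample standard deviation that sd() computes.

```r
# Hypothetical scores, chosen only for illustration
scores <- c(10, 20, 30, 40, 50)

# A z-score is the distance from the mean in standard-deviation units
z.by.hand <- (scores - mean(scores)) / sd(scores)

# scale() returns a one-column matrix; as.vector() drops the matrix
# shape and the attached attributes so the two results can be compared
z.from.scale <- as.vector(scale(scores))

# Should report that the two sets of z-scores agree
print(all.equal(z.by.hand, z.from.scale))
```

Whatever the input, z-scores computed this way have a mean of 0 and a sample standard deviation of 1.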

To show scale() in action, isolate a subset of the Cars93 data frame. (It's in the MASS package. On the Packages tab, check the box next to MASS if it's unchecked.)

Specifically, create a vector of the horsepowers of 8-cylinder cars from the USA:

> Horsepower.USA.Eight <- Cars93$Horsepower[Cars93$Origin == "USA" & Cars93$Cylinders == 8]

> Horsepower.USA.Eight
[1] 200 295 170 300 190 210
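Before standardizing, it can help to look at the two quantities scale() works from. The sketch below re-enters the six horsepower values by hand, so that it runs without the Cars93 data frame, and computes their mean and sample standard deviation. The names hp.mean and hp.sd are introduced here for illustration only.

```r
# Horsepower values copied from the output above
Horsepower.USA.Eight <- c(200, 295, 170, 300, 190, 210)

# The center that scale() subtracts by default
hp.mean <- mean(Horsepower.USA.Eight)

# The sample standard deviation (N - 1 in the denominator),
# which scale() divides by when applied to a single vector
hp.sd <- sd(Horsepower.USA.Eight)

print(hp.mean)
print(hp.sd)
```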

And now for the z-scores:

> scale(Horsepower.USA.Eight)
           [,1]
[1,] -0.4925263
[2,]  1.2089283
[3,] -1.0298278
[4,]  1.2984785
[5,] -0.6716268
[6,] -0.3134259
attr(,"scaled:center")
[1] 227.5
attr(,"scaled:scale")
[1] 55.83458

That last value is s, the sample standard deviation, not σ, the population standard deviation. If you have to base your z-scores on σ, divide each element in the vector by the square root of (N-1)/N:

> N <- length(Horsepower.USA.Eight)
> scale(Horsepower.USA.Eight)/sqrt((N-1)/N)
           [,1]
[1,] -0.5395356
[2,]  1.3243146
[3,] -1.1281198
[4,]  1.4224120
[5,] -0.7357303
[6,] -0.3433408
attr(,"scaled:center")
[1] 227.5
attr(,"scaled:scale")
[1] 55.83458

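Notice that scale() still returns s: the division changes the z-scores, but the scaled:scale attribute carries over unchanged from the original call.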
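If you would rather have the reported value match the divisor, one alternative (a sketch, not the book's method) is to compute the population standard deviation yourself and pass it to scale() through its scale argument. The variable names sigma, z.sigma, and z.divided are introduced here for illustration, and the horsepower values are re-entered by hand so the snippet runs on its own.

```r
# Horsepower values copied from the earlier output
Horsepower.USA.Eight <- c(200, 295, 170, 300, 190, 210)
N <- length(Horsepower.USA.Eight)

# Population standard deviation: N in the denominator instead of N - 1
deviations <- Horsepower.USA.Eight - mean(Horsepower.USA.Eight)
sigma <- sqrt(sum(deviations^2) / N)

# Center on the mean as usual, but divide by sigma rather than s
z.sigma <- scale(Horsepower.USA.Eight, center = TRUE, scale = sigma)

# The division approach from the text, for comparison
z.divided <- scale(Horsepower.USA.Eight) / sqrt((N - 1) / N)

# The z-scores agree; only the attached attribute differs
print(all.equal(as.vector(z.sigma), as.vector(z.divided)))
print(attr(z.sigma, "scaled:scale"))
```

With this approach the scaled:scale attribute holds the value of sigma that was passed in, so the output documents which standard deviation the z-scores are based on.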

About This Article

About the book author:

Joseph Schmuller, PhD, has taught undergraduate and graduate statistics, and has 25 years of IT experience. The author of four editions of Statistical Analysis with Excel For Dummies and three editions of Teach Yourself UML in 24 Hours (SAMS), he has created online coursework for and is a former Editor in Chief of PC AI magazine. He is a Research Scholar at the University of North Florida.
