# How to Calculate Parameters and Estimators

In econometrics, when you collect a random sample of data and calculate a statistic with that data, you’re producing a *point estimate*, which is a single estimate of a population parameter.

*Descriptive statistics* are measurements that can be used to summarize your sample data and, subsequently, make predictions about your population of interest. When descriptive measures are calculated using population data, those values are called *parameters*. When you calculate descriptive measures using sample data, the values are called *estimators* (or *statistics*).

You could estimate many population parameters with sample data, but here you calculate the most popular statistics: mean, variance, standard deviation, covariance, and correlation. The following list indicates how each parameter and its corresponding estimator is calculated.

**Mean (average):** The *mean* is the simple average of the random variable, *X*. The population mean for *X* is

$$\mu_X = \frac{1}{N}\sum_{i=1}^{N} X_i$$

where $X_i$ represents the individual measurements and *N* is the size of the population. The sample mean is

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

The difference between the sample and population mean is that the sample mean uses the sample size *n* instead of the population size *N*.

**Variance:** The *variance* is the average of the squared differences from the mean. The population variance for a random variable *X* is

$$\sigma_X^2 = \frac{1}{N}\sum_{i=1}^{N} \left(X_i - \mu_X\right)^2$$

where $X_i$ represents the individual measurements, $\mu_X$ is the population mean, and *N* is the size of the population. The sample variance is

$$s_X^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2$$

Note that the denominator for the sample variance not only uses the sample size *n* but also subtracts 1 from that number. This change is known as a *degrees of freedom* adjustment. Degrees of freedom adjustments are usually important in proving that estimators are unbiased.

**Standard deviation:** The *standard deviation* measures how spread out the random variable is, on average, from the mean. The standard deviation is the square root of the variance, so the population standard deviation for random variable *X* is

$$\sigma_X = \sqrt{\sigma_X^2}$$

and the sample standard deviation is

$$s_X = \sqrt{s_X^2}$$

**Covariance:** The *covariance* measures how much two random variables change together. The population covariance between two random variables *X* and *Y* is

$$\sigma_{XY} = \frac{1}{N}\sum_{i=1}^{N} \left(X_i - \mu_X\right)\left(Y_i - \mu_Y\right)$$

where $X_i$ represents the individual *X* values, $Y_i$ represents the individual *Y* values, $\mu_X$ and $\mu_Y$ are the population means, and *N* is the total number of measurements in the population. The sample covariance is

$$s_{XY} = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)$$

where $\bar{X}$ is the sample mean of *X*, $\bar{Y}$ is the sample mean of *Y*, and *n* is the sample size.

**Correlation:** The *correlation* refers to the relationship between two random variables or sets of data. The population correlation coefficient between two random variables *X* and *Y* is

$$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$

where $\sigma_{XY}$ is the population covariance, $\sigma_X$ is the population standard deviation of *X*, and $\sigma_Y$ is the population standard deviation of *Y*. The sample correlation coefficient is

$$r_{XY} = \frac{s_{XY}}{s_X s_Y}$$

where $s_{XY}$ is the sample covariance, $s_X$ is the sample standard deviation of *X*, and $s_Y$ is the sample standard deviation of *Y*.
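The five measures described above can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the function names, the `ddof` argument, and the toy data are all assumptions made for the example. Here `ddof=0` divides by *N* (the population formulas) and `ddof=1` divides by *n* - 1 (the sample formulas with the degrees of freedom adjustment).

```python
import math


def mean(values):
    """Simple average: the sum of the values divided by their count."""
    return sum(values) / len(values)


def variance(values, ddof=1):
    """Average squared deviation from the mean.

    ddof=1 divides by n - 1 (sample formula);
    ddof=0 divides by N (population formula).
    """
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - ddof)


def std_dev(values, ddof=1):
    """Square root of the variance."""
    return math.sqrt(variance(values, ddof))


def covariance(xs, ys, ddof=1):
    """Average product of paired deviations from the two means."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mx, my = mean(xs), mean(ys)
    total = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return total / (len(xs) - ddof)


def correlation(xs, ys, ddof=1):
    """Covariance scaled by the product of the two standard deviations."""
    return covariance(xs, ys, ddof) / (std_dev(xs, ddof) * std_dev(ys, ddof))


# Hypothetical toy data, used only to exercise the functions
toy_x = [2, 4, 4, 4, 5, 5, 7, 9]
toy_y = [1, 3, 2, 5, 4, 6, 8, 9]

print(mean(toy_x))              # 5.0
print(variance(toy_x, ddof=0))  # 4.0 (divides by N = 8)
print(variance(toy_x, ddof=1))  # 32 / 7, slightly larger (divides by n - 1 = 7)
print(correlation(toy_x, toy_y))
```

Because the same denominator appears in the covariance and in both standard deviations, it cancels in the correlation, so the population and sample versions of the correlation give the same number for the same data.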

Now, try working with some numbers. The table shows five observations of hamburger sales and prices. Use the formulas to calculate the mean, variance, standard deviation, covariance, and correlation.

| Hamburger Sales (in units), Y | Hamburger Price (in dollars), X |
|---|---|
| 100 | 1 |
| 80 | 2 |
| 63 | 3 |
| 45 | 4 |
| 21 | 5 |
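One way to work through the table is sketched below in Python, treating the five observations as a sample (so every denominator is *n* - 1 = 4). The variable names are chosen for this example only. The script follows the same order as a hand calculation: means first, then deviations from the means, then the sums of squares and cross-products.

```python
import math

# Data from the table: Y is hamburger sales (units), X is price (dollars)
sales = [100, 80, 63, 45, 21]
price = [1, 2, 3, 4, 5]
n = len(sales)

# Step 1: sample means
mean_sales = sum(sales) / n
mean_price = sum(price) / n

# Step 2: deviations of each observation from its sample mean
dev_sales = [y - mean_sales for y in sales]
dev_price = [x - mean_price for x in price]

# Step 3: sample variances and standard deviations (divide by n - 1)
var_sales = sum(d ** 2 for d in dev_sales) / (n - 1)
var_price = sum(d ** 2 for d in dev_price) / (n - 1)
sd_sales = math.sqrt(var_sales)
sd_price = math.sqrt(var_price)

# Step 4: sample covariance and correlation
cov = sum(dx * dy for dx, dy in zip(dev_price, dev_sales)) / (n - 1)
corr = cov / (sd_price * sd_sales)

print(f"mean:        sales = {mean_sales:.2f}, price = {mean_price:.2f}")
print(f"variance:    sales = {var_sales:.2f}, price = {var_price:.2f}")
print(f"std. dev.:   sales = {sd_sales:.2f}, price = {sd_price:.2f}")
print(f"covariance:  {cov:.2f}")
print(f"correlation: {corr:.3f}")
# mean:        sales = 61.80, price = 3.00
# variance:    sales = 934.70, price = 2.50
# std. dev.:   sales = 30.57, price = 1.58
# covariance:  -48.25
# correlation: -0.998
```

The covariance is negative and the correlation is close to -1, which matches what the table shows at a glance: sales fall steadily as the price rises.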

You can use computer software, such as Stata, to calculate descriptive statistics from the data. By typing “sum” (short for “summarize”) on the command line, you get the descriptive statistics for all the variables in your dataset. If you want the correlation between two variables, select **Statistics→Summaries, tables, and tests→Summary and descriptive statistics→Correlations and covariances** from the menu bar.

Or you can enter “corr *variable1* *variable2*” on the command line. In your command, replace *variable1* and *variable2* with the actual names you’ve given the variables in your dataset. You can get the covariance by adding an option to the correlation command; type “corr *variable1* *variable2*, cov” on the command line.

You should verify that your manual calculations of these measures are consistent with Stata’s output.
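The same kind of consistency check can be sketched without the statistics package at hand, by comparing a hand-coded formula against an independent implementation. The example below is an assumption-laden illustration: it uses Python's standard `statistics` module (whose `variance` and `stdev` functions use the *n* - 1 denominator) as a stand-in for the software output, and the helper `manual_sample_variance` is a name invented here.

```python
import math
import statistics

# Data from the hamburger table
sales = [100, 80, 63, 45, 21]
price = [1, 2, 3, 4, 5]


def manual_sample_variance(values):
    """Sample variance written out from the formula (n - 1 denominator)."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)


# Each entry pairs a manual result with the library's result
checks = {
    "variance of sales": (manual_sample_variance(sales), statistics.variance(sales)),
    "variance of price": (manual_sample_variance(price), statistics.variance(price)),
    "std. dev. of sales": (math.sqrt(manual_sample_variance(sales)), statistics.stdev(sales)),
    "std. dev. of price": (math.sqrt(manual_sample_variance(price)), statistics.stdev(price)),
}

results = {}
for label, (manual, library) in checks.items():
    # Compare with a tolerance rather than ==, since floating-point
    # rounding can differ between two correct implementations
    results[label] = math.isclose(manual, library, rel_tol=1e-9)
    print(f"{label}: manual = {manual:.4f}, library = {library:.4f}, agree = {results[label]}")
```

Comparing with a small tolerance instead of exact equality is a deliberate choice: software output is usually rounded for display, so a hand calculation should match it to the printed number of digits rather than bit for bit.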

Summarizing data with descriptive statistics is a relatively simple procedure, but make sure you examine the values carefully. You can use descriptive measures to ensure that your sample contains measurements that are realistic. For example, if your population of interest is college graduates, you wouldn’t expect your random sample from that group to have an average age of 21.
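A plausibility check of this kind can be automated. The sketch below is purely illustrative: the ages are made-up data, and the acceptable range for the sample mean is an assumption that you would have to set from your own knowledge of the population.

```python
def check_mean_in_range(values, low, high):
    """Return the sample mean and whether it falls inside [low, high]."""
    m = sum(values) / len(values)
    return m, low <= m <= high


# Hypothetical sample of ages drawn from a population of college graduates
ages = [21, 20, 22, 21, 21]

# Assumed plausible range for the average age of college graduates;
# both bounds are illustrative, not taken from any source
mean_age, plausible = check_mean_in_range(ages, low=25, high=70)

if not plausible:
    print(f"Sample mean age {mean_age:.1f} looks unrealistic; re-examine the data.")
```

With these made-up numbers the sample mean is 21, so the check flags the sample for a closer look.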

Careful attention to these details lends more credibility to your data and the subsequent inferences you make.