# How to Predict the Future with Conditional Probability Density

Prediction in econometrics involves some prior knowledge. For example, you may attempt to predict how many “likes” your status update will get on Facebook given the number of “friends” you have and the time of day you posted. In order to do so, you’ll want to be familiar with conditional probabilities.

*Conditional probabilities* give the chance that a random variable takes a specific value *given* that another random variable has already taken a value.

## Calculating conditional probability density

Conditional probabilities use two variables, so you’ll need the *joint* and *marginal* probabilities. Typically, this information is displayed in a table. The joint probabilities for random variables *X* and *Y* are shown in the middle rows and columns, and the marginal probabilities are on the outside row for variable *X* and outside column for variable *Y*.

| *Y* \ *X* | 1 | 2 | 3 | *f*(*Y*) |
|---|---|---|---|---|
| 1 | 0.25 | 0 | 0.10 | 0.35 |
| 2 | 0.05 | 0.05 | 0.10 | 0.20 |
| 3 | 0 | 0.05 | 0.20 | 0.25 |
| 4 | 0 | 0 | 0.20 | 0.20 |
| *f*(*X*) | 0.30 | 0.10 | 0.60 | 1.00 |
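The table above can be sketched in code. The following is a minimal illustration (the dictionary keyed by (*y*, *x*) is an implementation choice, not part of the original text) that stores the joint probabilities and recovers the marginals by summing across a row or down a column:

```python
# Joint probability table f(X, Y) from the article's table,
# keyed by (y, x): rows are Y = 1..4, columns are X = 1..3.
joint = {
    (1, 1): 0.25, (1, 2): 0.00, (1, 3): 0.10,
    (2, 1): 0.05, (2, 2): 0.05, (2, 3): 0.10,
    (3, 1): 0.00, (3, 2): 0.05, (3, 3): 0.20,
    (4, 1): 0.00, (4, 2): 0.00, (4, 3): 0.20,
}

def marginal_x(x):
    """Marginal f(X = x): sum the joint probabilities down column x."""
    return sum(p for (yi, xi), p in joint.items() if xi == x)

def marginal_y(y):
    """Marginal f(Y = y): sum the joint probabilities across row y."""
    return sum(p for (yi, xi), p in joint.items() if yi == y)

print(round(marginal_x(3), 2))       # 0.6, the bottom-row entry for X = 3
print(round(marginal_y(1), 2))       # 0.35, the right-column entry for Y = 1
print(round(sum(joint.values()), 2)) # 1.0, all joint probabilities sum to one
```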

You can calculate conditional probabilities using the following formula:

*f*(*Y* | *X*) = *f*(*X*, *Y*) / *f*(*X*)

It reads: the *probability of Y given X* equals the *probability of Y and X* divided by the *probability of X*.

Suppose you’re interested in calculating a specific conditional probability using the table: the probability that *Y* equals 1 given that *X* equals 3. Using this formula and plugging in the probabilities from the table, your answer would be

*f*(*Y* = 1 | *X* = 3) = *f*(*X* = 3, *Y* = 1) / *f*(*X* = 3) = 0.10 / 0.60 ≈ 0.167

The numerator in your calculation of a conditional probability is a joint probability, so it doesn’t matter if you write it as *Y* and *X* or *X* and *Y*.
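The formula can be sketched as a short function. This reuses the dictionary layout keyed by (*y*, *x*) (an illustrative choice, not part of the original text): the numerator is a table lookup and the denominator is the marginal of *X*.

```python
# Joint probabilities f(X, Y) from the table, keyed by (y, x).
joint = {
    (1, 1): 0.25, (1, 2): 0.00, (1, 3): 0.10,
    (2, 1): 0.05, (2, 2): 0.05, (2, 3): 0.10,
    (3, 1): 0.00, (3, 2): 0.05, (3, 3): 0.20,
    (4, 1): 0.00, (4, 2): 0.00, (4, 3): 0.20,
}

def conditional_y_given_x(y, x):
    """f(Y = y | X = x) = f(X = x, Y = y) / f(X = x)."""
    f_x = sum(p for (yi, xi), p in joint.items() if xi == x)  # marginal f(X = x)
    return joint[(y, x)] / f_x

# The worked example from the text: f(Y = 1 | X = 3) = 0.10 / 0.60
print(round(conditional_y_given_x(1, 3), 3))  # 0.167
```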

## Checking for statistical independence

Regardless of the strength of your theory and the appeal of your common sense, in econometrics you’ll ultimately want to examine the statistical relationship between variables. You may first want to determine if any relationship exists at all.

Events are said to be *independent* if one event has no statistical relationship with the other event. One way you can determine statistical independence is by observing that the probability of one event is unaffected by the occurrence of another event.

If *f*(*Y* | *X*) = *f*(*Y*), then the events are statistically independent; that is, the events are independent if the conditional and unconditional probabilities are equal. If

*f*(*Y* | *X*) ≠ *f*(*Y*)

(meaning the conditional and unconditional probabilities are not equal), then they are dependent.

You can calculate the probability that *Y* equals 4 given that *X* equals 3 as follows:

*f*(*Y* = 4 | *X* = 3) = *f*(*X* = 3, *Y* = 4) / *f*(*X* = 3) = 0.20 / 0.60 ≈ 0.333

You can also calculate the probability that *Y* equals 4 by summing the values in row 4: *f*(*Y* = 4) = 0 + 0 + 0.20 = 0.20.

Because the two values (the conditional and unconditional probabilities) are unequal, you can conclude that *X* and *Y* are *dependent*.
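The independence check above can be sketched as code. This minimal illustration (the dictionary layout and the function name `independent_at` are assumptions, not from the original text) compares the conditional probability against the marginal, allowing a small tolerance for floating-point rounding:

```python
# Joint probabilities f(X, Y) from the table, keyed by (y, x).
joint = {
    (1, 1): 0.25, (1, 2): 0.00, (1, 3): 0.10,
    (2, 1): 0.05, (2, 2): 0.05, (2, 3): 0.10,
    (3, 1): 0.00, (3, 2): 0.05, (3, 3): 0.20,
    (4, 1): 0.00, (4, 2): 0.00, (4, 3): 0.20,
}

def independent_at(y, x, tol=1e-9):
    """True when f(Y = y | X = x) equals f(Y = y), the independence condition."""
    f_x = sum(p for (yi, xi), p in joint.items() if xi == x)  # marginal f(X = x)
    f_y = sum(p for (yi, xi), p in joint.items() if yi == y)  # marginal f(Y = y)
    conditional = joint[(y, x)] / f_x                         # f(Y = y | X = x)
    return abs(conditional - f_y) < tol

# f(Y = 4 | X = 3) = 0.20 / 0.60, while f(Y = 4) = 0.20,
# so the check reports dependence.
print(independent_at(4, 3))  # False
```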