The goal of regression is to look at past data to determine whether any variables are influencing financial movements. This process now typically relies on computer programs, such as analytics software and databases, to perform something called data mining.

Basically, data mining works by including all the data you can possibly get your hands on and letting a computer program figure out whether any correlation exists between the thing you’re trying to forecast and other variables. You can do data mining on your own, but unless you already have some idea of what you may be looking for, it’s just guess-and-check, which stinks.

For example, you may find that your corporation’s costs increase with the temperature outside. As the temperature increases, so do total costs; as the temperature decreases, the corporate costs also decrease. You may even find that, on average, costs change by 1 percent for every 3 percent change in temperature. This relationship is called a correlation.

Note that a correlation doesn’t mean that the temperature is causing a cost increase, just that the two are related. You can think of a correlation like this: If all relationships were causal, you could say that Bono from U2 kills people because a high correlation exists between short life expectancies and countries Bono has visited. A correlation does exist, but both the short life expectancy and the visits by Bono are caused by poverty.

So, in this example, if temperature and cost are correlated, the relationship may look something like this:

The little dots are the actual values included. You plot them as you would on any graph: Find the correct spot on the horizontal axis (temperature), move up to the correct spot on the vertical axis (cost), and place the dot where the two intersect. The line going through them illustrates the proportion of the relationship. (In this case, a one-third slope indicates that for every 3-unit increase in temperature, cost increases by 1 unit.)

Consider the following about the figure:

• As one factor increases, the other increases as well. That’s called a positive correlation.

• If one factor decreases as the other increases, it’s called a negative correlation.

• The closer the dots are to the line, the stronger the relationship is. If the dots are far away from the line and don’t look like they’re in a pattern, the relationship is very weak. In this figure, the relationship is fairly strong because you can see the pattern even without the line present.
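
If you want to put a number on how closely the dots hug the line, one common measure is the Pearson correlation coefficient. The following is a minimal sketch in Python, not a definitive implementation. The function name `pearson_r` and the temperature and cost readings are made up for illustration; they don't come from any real company.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together, and how much each moves on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list never changes")
    return sxy / sqrt(sxx * syy)

# Hypothetical daily readings, invented for illustration only.
temps = [60, 66, 72, 78, 84, 90]
costs = [1000, 1030, 1050, 1085, 1100, 1130]

r = pearson_r(temps, costs)
print(f"correlation: {r:.3f}")  # positive and close to 1 for this made-up data
```

A result near 1 or -1 means the dots sit close to a line, a result near 0 means no clear linear pattern, and the sign tells you whether the correlation is positive or negative.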

## What to do with correlations

Ideally, if you can find a relationship, then you want to be able to use that relationship to make financial predictions. For example, if it’s possible to determine what your costs will look like next week by measuring the temperature today, then temperature is a good thing to know. If the weather report says it will be 90 degrees next week, can you use that to predict your corporate costs?
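
To see how a forecast like that might work, here is a rough sketch in Python that fits a straight line to past data and then reads off a predicted cost for a 90-degree day. It assumes the relationship really is a straight line. The function name `fit_line` and the numbers are hypothetical, and the data was chosen to fall exactly on a line with a one-third slope so that the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when every x value is the same")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical past data, invented for illustration only.
temps = [60, 63, 66, 69, 72, 75]
costs = [100, 101, 102, 103, 104, 105]

slope, intercept = fit_line(temps, costs)
forecast = intercept + slope * 90  # predicted cost if next week hits 90 degrees
print(f"slope: {slope:.3f}, forecast at 90 degrees: {forecast:.1f}")
```

Keep in mind that 90 degrees is hotter than any day in this made-up data set. A forecast outside the range you've actually observed is less reliable than one inside it.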

In investing especially, any correlation that allows you to predict the movement in the price of a stock is highly prized.

You can also use multiple variables to create more accurate models.

These multivariate regressions attempt to show how much influence each variable has on the thing you’re measuring. Used together, the variables can give you a more accurate model that shows not only which variables are related to changes in the thing you’re measuring, but also how much of a role each one plays and how you can use that to predict what will happen in the future.
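
As an illustration, the sketch below fits a regression with two variables at once using ordinary least squares. Everything in it is an assumption made for the example: the helper names `solve` and `fit_multiple`, the second variable (units produced), and the data itself, which was generated from a known formula so you can check that the fit recovers it.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's lists are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("variables are collinear; no unique answer exists")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_multiple(rows, ys):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + ...; returns [b0, b1, ...]."""
    design = [[1.0] + list(row) for row in rows]  # leading 1.0 is the intercept
    k = len(design[0])
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: (temperature, units produced) for each day.
days = [(60, 10), (65, 14), (70, 9), (75, 16), (80, 12), (85, 18)]
# Costs generated as 50 + 0.5 * temperature + 2 * units, purely for checking.
costs = [50 + 0.5 * t + 2 * u for t, u in days]

intercept, per_degree, per_unit = fit_multiple(days, costs)
print(f"base: {intercept:.2f}, per degree: {per_degree:.2f}, per unit: {per_unit:.2f}")
```

Each coefficient estimates how much cost changes when that one variable changes by 1 unit and the other stays the same. With real data the fit won't be exact, and for larger problems a statistics package is a better choice than hand-rolled code like this.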

## How to do a regression analysis

You can do a regression analysis using Microsoft Excel:

1. In cells A1 and B1, title each column with the label of the type of data that will be used in each.

For example, you can use labels such as “Temp” and “Costs.”

2. In column A, below the title, start inputting the appropriate data.

For example, you can include the temperature on a given day with a new value in each cell.

3. In column B, below the title, input the proper data there as well.

Be very careful to match the proper data together. For example, if you’re putting the cost for a particular day in a cell of column B, make sure that it’s next to the correct temperature for the same day.

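4. Use the Excel function LINEST to fit the line.

For “Known_y’s,” select the data in the column you’re trying to forecast (Costs). For “Known_x’s,” select the data in the other column (Temp). Leave the titles out of both ranges, because the function works only on numbers.

5. Press Enter to get a decimal value.

LINEST returns the slope of the line: how much costs change for each 1-unit change in temperature. To measure how strong the relationship is, use the CORREL function on the same two ranges. CORREL returns a value between -1 and 1. The closer that value is to 1 or -1, the stronger the relationship. The closer to 0 that value is, the weaker the relationship. A value of 1 or -1 means that a perfect correlation exists, while a value of 0 means no linear correlation exists at all. If the number is positive, it’s a positive correlation; if it’s negative, you have a negative correlation.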

As a side note, if you can identify the influences on your finances, then you can manage those influences to make them work in your favor. You are empowered to change your financial future.