How to Perform a Regression Analysis in Excel

By Stephen L. Nelson, E. C. Nelson

Excel’s regression functions let you perform regression analysis. In a nutshell, regression analysis involves plotting pairs of independent and dependent variables in an XY chart and then finding a linear or exponential equation that describes the plotted data.

FORECAST: Forecast dependent variables using a best-fit line

The FORECAST function finds the y-value of a point on a best-fit line produced by a set of x- and y-values given the x-value. The function uses the syntax

=FORECAST(x,known_y's,known_x's)

where x is the independent variable value, known_y’s is the worksheet range holding the dependent variables, and known_x’s is the worksheet range holding the independent variables.

The FORECAST function uses the known_y’s and known_x’s values that you supply as arguments to calculate the y=mx+b equation that describes the best-fit straight line for the data. The function then solves that equation using the x argument that you supply to the function.

To use the linear regression functions such as the FORECAST function, remember the equation for a line is y=mx+b. y is the dependent variable, b is the y-intercept or constant, m is the slope, and x gives the value of the independent variable.

INTERCEPT: y-axis intercept of a line

The INTERCEPT function finds the point where the best-fit line produced by a set of x- and y-values intersects the y-axis. The function uses the syntax

=INTERCEPT(known_y's,known_x's)

where known_y’s is the worksheet range holding the dependent variables and known_x’s is the worksheet range holding the independent variables.

If you’ve ever plotted pairs of data points on an XY graph, the way the INTERCEPT function works is pretty familiar. The INTERCEPT function uses the known_y’s and known_x’s values that you supply as arguments to calculate the best-fit straight line for the data — essentially figuring out the y=mx+b equation for the line.

The function then returns the b value because that’s the value of the equation when the independent, or x, variable equals zero.

LINEST

The LINEST function finds the m and b values for a line based on sets of and variables. The function uses the syntax

=LINEST(known_y's,[known_x's],[const],[stats])

where known_y’s equals the array of y-values that you already know, known_x’s supplies the array of x-values that you may already know, const is a switch set to either TRUE (which means the constant b equals 0) or to FALSE (which means the constant b is calculated), and is another switch set to either TRUE (meaning the function returns a bunch of other regression statistics) or FALSE (meaning enough already).

SLOPE: Slope of a regression line

The SLOPE function calculates the slope of a regression line using the x- and y-values. The function uses the syntax

=SLOPE(known_y's,known_x's)

An upward slope indicates that the independent, or x, variable positively affects the dependent, or y, variable. In other words, an increase in x produces an increase in y. A downward slope indicates that the independent, or x, variable negatively affects the dependent, or y, variable. The steeper the slope, the greater the effect of the independent variable on the dependent variable.

STEYX: Standard error

The STEYX function finds the standard error of the predicted y-value of each of the x-values in a regression. The function uses the syntax

=STEYX(known_y's,known_x's)

TREND

The TREND function finds values along a trend line, which the function constructs using the method of least squares. The syntax looks like this:

=TREND(known_y's,[known_x's],[new_x's],[const])

LOGEST: Exponential regression

The LOGEST function returns an array that describes an exponential curve that best fits your data. The function uses the syntax

=LOGEST(known_y's,[known_x's],[const],[stats])

where known_y’s is the set of y-values, known_x’s is the set of x-values, const is a switch set to either TRUE (meaning that b is calculated normally) or FALSE (meaning that b is forced to equal 1), and stats is a switch that’s set to either TRUE (in which case, the LOGEST function returns additional regression statistics) or FALSE (which tells the function to skip returning extra information).

In an exponential regression, Excel returns an equation that takes the form y=abx that best fits your data set.

GROWTH: Exponential growth

The GROWTH function calculates exponential growth for a series of new x-values based on existing x-values and y-values. The function uses the syntax

=GROWTH(known_y's,[known_x's],[new_x's],[const])

where known_y’s is the set of y-values, known_x’s is the set of x-values, is the set of x-values for which you want to calculate new y-values, and const is a switch set to either TRUE (which means that b is calculated normally) or FALSE (which means that b is forced to equal 1).