What is Linear Regression?

Linear Regression is the modelling of the relationship of a dependent variable to one or more independent variables. You want to predict the value of a dependent variable given other independent variable values.

The aim of linear regression is to find the line with the best fit for the given data.

Simple and Multiple Linear Regression

Simple linear regression is the case of only one independent variable. The equation is written as:

`y = b + mx`

Multiple linear regression has many independent variables and the equation is written as:

`y = b + m_1x_1 + m_2x_2 + m_3x_3 + ...`

where:
`y` is the dependent variable
`x_i` are the independent variables
`m_i` are the coefficient for the corresponding `x_i` variables
`b` is a constant, sometimes known as error or offset

`b` also has a special property. It is the y-intersect when `x_i = 0` for all `i`.

For simple linear regression `m` and `b` are calculated using the formula:

`m = sum_((x - barx)(y - bary))/sum_((x - barx)^2)`

`b = bary - m barx`

For multiple linear regression the `m_i` and `b` are calculated by solving a matrix equation shown here.

LINEST and TREND Function

Thankfully Excel's has LINEST function. It is very easy to use and saves you and me a lot of trouble. It assumes the training data known-y values and known-x values are in single columns. For multiple linear regression, it assumes the x-knowns are in single columns adjacent to each other.

Take note that the return m-coefficients and the constant are returned in the order

`m_n, m_(n-1),..., m_1, b`

You can then use these values to calculate the predicted value of y given a set of x values.

If you wish to simply the plug the values in to find the predicted value(s), Excel also has TREND function.

In my next blog, I will explain why and reinvent both these functions using LAMBDA.

In the meantime try out LINEST and TREND in your DC-DEN!

Comments