Posts

Showing posts with the label Linear Regression

Linear Discriminant Analysis (LDA) - Using Linear regression for classification

Linear Discriminant Analysis (LDA) uses linear regression to perform supervised classification of data. Essentially, you assign each class a numerical value, then use the linear regression method to calculate the projection of your observations onto the assigned numerical values, and finally calculate the thresholds that distinguish between the classes. In other words, LDA attempts to find the best linear function that separates your data points into distinct classes. The diagram in the post illustrates this idea.

Implementing LDA using LAMBDA

Steps in implementing LDA's Fit:
1. Find the distinct classes and assign each an arbitrary value - UNIQUE and SEQUENCE.
2. Designate each observation with the arbitrarily assigned value depending on its class - XLOOKUP.
3. Find the linear regression coefficients for these observations - dcrML.Linear.Fit.
4. Project each observation onto the linear regression - dcrML.Linear.Predict.
5. Find the threshold of each class - classCutOff - from the spread of each re…
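For readers who want to see the five Fit steps end to end, here is a minimal sketch in Python/NumPy rather than Excel LAMBDA. It is not the dcrML implementation: the function name `lda_fit`, the label-to-value mapping, and the midpoint cut-off rule in step 5 are my own assumptions, standing in for UNIQUE/SEQUENCE, XLOOKUP, dcrML.Linear.Fit, dcrML.Linear.Predict and classCutOff.

```python
import numpy as np

def lda_fit(X, labels):
    # 1. Find the distinct classes and assign each an arbitrary value
    #    (the Excel version uses UNIQUE and SEQUENCE).
    classes = np.unique(labels)
    class_values = {c: i + 1 for i, c in enumerate(classes)}

    # 2. Designate each observation with its class's assigned value (XLOOKUP).
    y = np.array([class_values[c] for c in labels], dtype=float)

    # 3. Fit linear regression coefficients: y ~ b + m1*x1 + m2*x2 + ...
    A = np.column_stack([np.ones(len(X)), X])        # prepend the constant term
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)

    # 4. Project each observation onto the fitted regression.
    projections = A @ coeffs

    # 5. Thresholds between classes: here, midpoints between the mean
    #    projections of consecutive classes (one plausible cut-off rule).
    means = [projections[np.array(labels) == c].mean() for c in classes]
    cutoffs = [(lo + hi) / 2 for lo, hi in zip(means[:-1], means[1:])]
    return coeffs, classes, cutoffs
```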

Linear Regression: Why you should reinvent Excel's LINEST?

In the previous article on Linear Regression, I mentioned Excel's LINEST function. But if you tried using the returned coefficients, you may have noticed something peculiar: the returned linear coefficients are in the reverse order of the input data. The LINEST documentation states:

The equation for the line is: `y = m_1x_1 + m_2x_2 + ... + m_nx_n + b` if there are multiple ranges of x-values, where the dependent y-values are a function of the independent x-values. The m-values are coefficients corresponding to each x-value, and b is a constant value. Note that y, x, and m can be vectors. The array that the LINEST function returns is `{m_n, m_(n-1), ..., m_1, b}`.

The inputs are in the order 1st, 2nd, 3rd, ..., but the returned coefficients are in reverse. If you were to use the coefficients to predict y for a given `x_1, x_2, x_3, ...`, you would have to reverse either the x values or the coefficients. This isn't intuitive. For this reason you should reinvent LINEST. The inten…
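To make the ordering point concrete, here is a small sketch using NumPy's ordinary least squares rather than LINEST: the coefficients come back in the same order as the input columns, so prediction is a plain weighted sum with no reversing. The data values are made up for illustration.

```python
import numpy as np

# Made-up data generated from y = 1 + 2*x_1 + 3*x_2.
X = np.array([[1.0, 2.0],   # columns are x_1, x_2
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
y = np.array([9.0, 8.0, 19.0, 18.0])

A = np.column_stack([np.ones(len(X)), X])       # [1, x_1, x_2]
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)  # returns [b, m_1, m_2] in input order

x_new = np.array([5.0, 6.0])
y_pred = coeffs[0] + x_new @ coeffs[1:]         # b + m_1*x_1 + m_2*x_2, no reordering
print(coeffs, y_pred)                           # approx [1. 2. 3.] and 29.0
```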

What is Linear Regression?

Linear Regression is the modelling of the relationship between a dependent variable and one or more independent variables: you want to predict the value of the dependent variable given the values of the independent variables. The aim of linear regression is to find the line that best fits the given data.

Simple and Multiple Linear Regression

Simple linear regression is the case of only one independent variable. The equation is written as: `y = b + mx`

Multiple linear regression has many independent variables and the equation is written as: `y = b + m_1x_1 + m_2x_2 + m_3x_3 + ...`

where:
`y` is the dependent variable
`x_i` are the independent variables
`m_i` are the coefficients for the corresponding `x_i` variables
`b` is a constant, sometimes known as the offset or intercept

`b` also has a special property: it is the y-intercept, the value of `y` when `x_i = 0` for all `i`.

For simple linear regression, `m` and `b` are calculated using the formulas:

`m = sum_((x - barx)(y - bary))/sum_((x - barx)^2)`

`b = bary - m barx`
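As a quick check of the two formulas above, here is a minimal Python sketch that computes `m` and `b` directly from them; the data points and variable names are invented for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.1, 5.9, 8.2, 9.9])   # roughly y = 2x

x_bar, y_bar = x.mean(), y.mean()
m = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)   # slope
b = y_bar - m * x_bar                                              # intercept

print(m, b)   # m is close to 2, b is close to 0
```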