Next: Standard Deviation, Previous: Common Calculations, Up: Common Calculations [Index]

The “least squares” or “linear regression” algorithm produces a
best-fitting straight line through the middle of a set of N data points
*x1,y1, ..., xN,yN*. In Chart this means a set of prices Y and dates X
(with non-trading days collapsed out).

For a possible fitted line *L(X)= a + b*X*, the vertical distance
from the line to each point is squared, and a total deviation formed.

SumSquares = (y1 - L(x1))^2 + ... + (yN - L(xN))^2

The line parameters *a* and *b* are then chosen to make SumSquares
as small as possible (hence the name “least squares”), and there’s just one
line with that smallest SumSquares. The calculation is made easier if the X
coordinates are shifted so that *Mean(X)=0*. With that the formulas for
*a* and *b* are

                     y1 + ... + yN
    a  =  Mean Y  =  -------------
                           N

          x1*y1 + ... + xN*yN
    b  =  -------------------
           x1^2 + ... + xN^2
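As a concrete sketch of the calculation (illustrative Python, not Chart's actual code), the X values are shifted to zero mean and the two formulas applied directly:

```python
# Least squares fit as described above: shift X so Mean(X) = 0, then
# a = Mean(Y) and b = Sum(x*y) / Sum(x^2).  Illustrative sketch only,
# not Chart's actual implementation.

def linear_regression(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    shifted = [x - mean_x for x in xs]        # make Mean(X) = 0
    a = sum(ys) / n                           # a = Mean(Y)
    b = (sum(x * y for x, y in zip(shifted, ys))
         / sum(x * x for x in shifted))
    return a, b                               # fitted line L(X) = a + b*X

# Points lying exactly on y = 3 + 2*x (with Mean(X) already 0):
a, b = linear_regression([-2, -1, 0, 1, 2], [-1, 1, 3, 5, 7])
# a = 3.0, b = 2.0
```

Note the returned *a* is the line value at the X mean, matching the shifted-coordinate formulas above.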

A least squares fit is “best” under certain mathematical assumptions: basically that the data points lie on a straight line to which normally distributed random amounts (positive or negative) have been added. Of course an underlying straight line is unlikely in market price data, or in economics generally, and in particular any cyclical component invalidates the assumptions. Even so, the algorithm is quite widely used because it offers an objective basis for fitting a line.

The slope of the linear regression line, the *b* above, is sometimes
called the *regression coefficient*. This is available as an indicator
(Linear Regression Slope), to show how steep the fitted trend line is. The
units are price change per day, which is negative for a downward sloping line.
This may or may not be particularly useful so it’s under “Low Priority” in
the indicator lists.

Standard error (stderr) is a statistical measure of how much values differ from an assumed underlying curve. It’s calculated as the quadratic mean of the vertical distances from each point to the curve.

Standard error from a linear regression line *y=a+bx* is

                    /  (y1 - (a+b*x1))^2 + ... + (yN - (a+b*xN))^2  \
    Stderr = sqrt  |   -------------------------------------------   |
                    \                       N                       /

Notice the numerator is the same SumSquares which was minimized above.
Standard error is similar to standard deviation (see Standard Deviation);
but where stddev takes differences from a horizontal line (the *Y* mean),
stderr here goes from the sloping linear regression line.
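As a small illustration (hypothetical Python, not taken from Chart), stderr is the quadratic mean of the residuals against the fitted line:

```python
from math import sqrt

# Standard error from a fitted line y = a + b*x: the quadratic mean
# (root mean square) of the vertical distances to the line.
# Illustrative sketch, not Chart's actual code.

def stderr_from_line(xs, ys, a, b):
    n = len(xs)
    sum_squares = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sum_squares / n)

# Points lying exactly on the line give stderr 0:
stderr_from_line([-2, -1, 0, 1, 2], [-1, 1, 3, 5, 7], 3, 2)   # -> 0.0
```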

For reference, there’s no need to actually calculate the linear regression
*a* and *b*; the stderr can be formed directly as

                    /                Covariance(X,Y)^2  \
    Stderr = sqrt  |   Variance(Y) - -----------------   |
                    \                    Variance(X)    /

where variance and covariance are as follows (and notice they simplify if
*X* values are chosen to make *Mean(X)* zero),

    Covariance X,Y  =  Mean(X*Y) - (Mean X)*(Mean Y)

    Variance X  =  Mean(X^2) - (Mean X)^2
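As a sketch of this shortcut (illustrative Python, not Chart's code), stderr can be formed from the moments alone, with no explicit *a* and *b*:

```python
from math import sqrt

# Direct stderr from variance and covariance, per the formula above.
# Illustrative sketch, not Chart's actual implementation.

def mean(vs):
    return sum(vs) / len(vs)

def stderr_direct(xs, ys):
    cov   = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)
    var_x = mean([x * x for x in xs]) - mean(xs) ** 2
    var_y = mean([y * y for y in ys]) - mean(ys) ** 2
    return sqrt(var_y - cov ** 2 / var_x)
```

This gives the same value as squaring residuals against the fitted line, but needs only running sums of X, Y, X*Y, X^2 and Y^2.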

Standard error from a linear regression like this is used as a channel width in Kirshenbaum Bands (see Kirshenbaum Bands). It can also be viewed directly as an indicator, but this is probably of limited use and for that reason is under “Low Priority” in the indicator lists.

- http://mathworld.wolfram.com/LeastSquaresFitting.html – on calculating stderr without the a,b parameters


Copyright 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2014, 2015, 2016, 2017 Kevin Ryde

Chart is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version.