Regression - Assumptions
As with any other method, linear regression is based on assumptions
which have to be fulfilled for correct results:
- The expected relationship between X and Y is linear: one should carefully distinguish linear, curvilinear and non-linear relationships. While curvilinear relationships can be transformed into linear ones, non-linear relationships cannot.
- All measurements are independent of each other; any trend over time, or any common correlation to a third variable, must be avoided.
- For each X, the Y values are distributed normally.
- For each X, the Y-distribution has the same variance (homoscedastic data). This requirement is often not met, especially with data covering a large range (several orders of magnitude).
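To illustrate the first assumption, the following minimal sketch shows how one curvilinear relationship can be made linear by a transformation. It assumes a hypothetical exponential model Y = a * exp(b * X) and noise-free, made-up data; the helper fit_line and all numbers are illustrative assumptions, not part of the original text. Note that a transformation also changes the error structure, so the remaining assumptions have to be checked on the transformed scale.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical, noise-free data following Y = 2 * exp(0.5 * X)
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# Taking logarithms gives ln(Y) = ln(a) + b * X, which is linear in X
log_y = [math.log(yi) for yi in y]
intercept, slope = fit_line(x, log_y)

# Back-transform the intercept to recover the original parameter a
a_hat = math.exp(intercept)
b_hat = slope
print(a_hat, b_hat)
```

With real, noisy data the recovered parameters would only approximate the true values, and the fit minimizes the squared error in ln(Y) rather than in Y.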
These assumptions should be checked by inspecting the data and the residuals. One should always look at the X-Y plot, at the histogram of the residuals, and at the residuals plotted against X.
It is also a good idea to check whether the residuals are uncorrelated (e.g. using the Durbin-Watson test), because the confidence intervals of the parameters will be wrong if the residuals are serially correlated.
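The residual checks described above can be sketched as follows. This is a minimal example under stated assumptions: the data are made up, and the helpers fit_line, residuals and durbin_watson are hypothetical names written for this illustration using only the Python standard library. The Durbin-Watson statistic is computed as the sum of squared differences between successive residuals divided by the sum of squared residuals; it lies between 0 and 4, and values near 2 suggest no first-order serial correlation.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residuals(x, y, intercept, slope):
    """Observed Y minus fitted Y for every data point."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def durbin_watson(res):
    """Durbin-Watson statistic of residuals given in observation order."""
    num = sum((res[t] - res[t - 1]) ** 2 for t in range(1, len(res)))
    den = sum(e ** 2 for e in res)
    return num / den

# Hypothetical data: a straight line plus small made-up deviations
x = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.1]
y = [1.0 + 2.0 * xi + e for xi, e in zip(x, noise)]

intercept, slope = fit_line(x, y)
res = residuals(x, y, intercept, slope)
dw = durbin_watson(res)

# The (x, res) pairs are what one would plot to inspect the residuals
for xi, ri in zip(x, res):
    print(f"{xi:4.1f}  {ri:+.3f}")
print("Durbin-Watson statistic:", round(dw, 3))
```

The list res can be passed to any plotting tool to draw the histogram of the residuals and the plot of residuals against X. The statistic only makes sense when the residuals are kept in the order in which the measurements were taken.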