Appendix H — Overview of General Linear Models
H.1 Overview
This document presents basic concepts and assumptions regarding general linear models. Its main goal is to provide a reference for the model diagnostics performed in the other sections of the thesis supplementary material.
H.2 Definitions
Definition H.1 (Response/Predictor/Regression) The variables \(X_{1}, \dots, X_{k}\) are called predictors, and the random variable \(Y\) is called the response. The conditional expectation of \(Y\) for given values \(x_{1}, \dots, x_{k}\) of \(X_{1}, \dots, X_{k}\), written \(E(Y|x_{1}, \dots, x_{k})\), is called the regression function of \(Y\) on \(X_{1}, \dots, X_{k}\), or simply the regression of \(Y\) on \(X_{1}, \dots, X_{k}\) (DeGroot & Schervish, 2012, p. 699).
Definition H.2 (Linear Regression) A regression function \(E(Y|x_{1}, \dots, x_{k})\) that is a linear function of the parameters \(\beta_{0}, \dots, \beta_{k}\), i.e., one having the following form (DeGroot & Schervish, 2012, pp. 699–700):
\[ E(Y|x_{1}, \dots, x_{k}) = \beta_{0} + \beta_{1} x_{1} + \dots + \beta_{k} x_{k} \tag{H.1}\]
Definition H.3 (Regression Coefficients) The coefficients \(\beta_{0}, \dots, \beta_{k}\) in Equation H.1 are called regression coefficients (DeGroot & Schervish, 2012, pp. 699–700).
Definition H.4 (Simple Linear Regression) A regression of \(Y\) on just a single variable \(X\). For each value \(X = x\), the random variable \(Y\) can be represented in the form \(Y = \beta_{0} + \beta_{1}x + \epsilon\), where \(\epsilon\) is a random variable that has the normal distribution with mean \(0\) and variance \(\sigma^{2}\). It follows from this assumption that the conditional distribution of \(Y\) given \(X = x\) is the normal distribution with mean \(\beta_{0} + \beta_{1}x\) and variance \(\sigma^{2}\) (DeGroot & Schervish, 2012, p. 700).
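To make Definition H.4 concrete, the following Python/NumPy sketch simulates data from \(Y = \beta_{0} + \beta_{1}x + \epsilon\) and recovers the coefficients by least squares. The parameter values, sample size, and seed are arbitrary choices for illustration, not values taken from the cited sources.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical parameter values chosen for illustration only
beta_0, beta_1, sigma = 2.0, 0.5, 1.0
n = 100

x = rng.uniform(0, 10, size=n)            # values of the single predictor X
epsilon = rng.normal(0, sigma, size=n)    # error term with distribution N(0, sigma^2)
y = beta_0 + beta_1 * x + epsilon         # Y = beta_0 + beta_1 * x + epsilon

# Least-squares estimates of beta_0 and beta_1
X = np.column_stack([np.ones(n), x])      # design matrix with an intercept column
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                           # expected to be close to [2.0, 0.5]
```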
Definition H.5 (Multiple Linear Regression) A linear regression of \(Y\) on \(k\) variables \(X_{1}, \dots, X_{k}\), rather than on just a single variable \(X\). In a problem of multiple linear regression, we obtain \(n\) vectors of observations \((x_{i1}, \dots, x_{ik}, Y_{i})\), for \(i = 1, \dots, n\). Here \(x_{ij}\) is the observed value of the variable \(X_{j}\) for the \(i\)th observation. The expectation \(E(Y_{i})\) is given by the relation (DeGroot & Schervish, 2012, p. 738):
\[ E(Y_{i}) = \beta_{0} + \beta_{1} x_{i1} + \dots + \beta_{k} x_{ik} \tag{H.2}\]
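A similar sketch of Equation H.2 for \(k = 2\) predictors, again in Python/NumPy with hypothetical coefficients, illustrates the \(n \times (k + 1)\) design matrix whose \(i\)th row is \((1, x_{i1}, x_{i2})\):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coefficients (beta_0, beta_1, beta_2) for illustration only
beta = np.array([1.0, 2.0, -3.0])
n = 200

x1 = rng.normal(size=n)                   # observed values of X_1
x2 = rng.normal(size=n)                   # observed values of X_2

# Design matrix: row i holds (1, x_i1, x_i2)
X = np.column_stack([np.ones(n), x1, x2])

# E(Y_i) = beta_0 + beta_1 * x_i1 + beta_2 * x_i2  (Equation H.2 with k = 2)
mean_y = X @ beta
y = mean_y + rng.normal(scale=0.5, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                           # expected to be close to [1.0, 2.0, -3.0]
```

The intercept is handled by a constant first column of ones, which plays the role of the \(z_{i0}\) component (commonly fixed at 1) in the general linear model discussed below.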
Definition H.6 (Residuals/Fitted Values) For \(i = 1, \dots, n\), the observed values of \(\hat{y}_{i} = \hat{\beta}_{0} + \hat{\beta}_{1} x_{i}\) are called fitted values. For \(i = 1, \dots, n\), the observed values of \(e_{i} = y_{i} - \hat{y}_{i}\) are called residuals (DeGroot & Schervish, 2012, p. 717).
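A minimal sketch of Definition H.6, using a tiny made-up data set, computes the fitted values \(\hat{y}_{i}\) and the residuals \(e_{i} = y_{i} - \hat{y}_{i}\) from the least-squares estimates:

```python
import numpy as np

# Tiny made-up data set (illustrative values only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

# Least-squares estimates of beta_0 and beta_1
X = np.column_stack([np.ones_like(x), x])
(beta0_hat, beta1_hat), *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = beta0_hat + beta1_hat * x   # fitted values
e = y - y_hat                       # residuals e_i = y_i - y_hat_i
print(y_hat)
print(e)
```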
Definition H.7 (General Linear Model) The statistical model in which the observations \(Y_{1}, \dots, Y_{n}\) satisfy the following assumptions (DeGroot & Schervish, 2012, p. 738).
H.3 Assumptions
Here we shall assume that each of the observations \(Y_{1}, \dots, Y_{n}\) has a normal distribution, that these observations are independent, and that they all have the same variance \(\sigma^{2}\). Instead of a single predictor being associated with each \(Y_{i}\), we assume that a \(p\)-dimensional vector \(z_{i} = (z_{i0}, \dots, z_{ip - 1})\) is associated with each \(Y_{i}\) (DeGroot & Schervish, 2012, p. 736).
- Assumption 1
- Predictor is known. Either the vectors \(z_{1}, \dots , z_{n}\) are known ahead of time, or they are the observed values of random vectors \(Z_{1}, \dots , Z_{n}\) on whose values we condition before computing the joint distribution of \((Y_{1}, \dots , Y_{n})\) (DeGroot & Schervish, 2012, p. 736).
- Assumption 2
- Normality. For \(i = 1, \dots, n\), the conditional distribution of \(Y_{i}\) given the vectors \(z_{1}, \dots , z_{n}\) is a normal distribution (DeGroot & Schervish, 2012, p. 737).
(Normality of the error term distribution (Hair, 2019, p. 287))
- Assumption 3
- Linear mean. There is a vector of parameters \(\beta = (\beta_{0}, \dots, \beta_{p - 1})\) such that the conditional mean of \(Y_{i}\) given the values \(z_{1}, \dots , z_{n}\) has the form
\[ z_{i0} \beta_{0} + z_{i1} \beta_{1} + \cdots + z_{ip - 1} \beta_{p - 1} \]
for \(i = 1, \dots, n\) (DeGroot & Schervish, 2012, p. 737).
(Linearity of the phenomenon measured (Hair, 2019, p. 287))
It is important to clarify that the linearity assumption pertains to linearity in the parameters or, equivalently, linearity in the coefficients: each predictor is simply multiplied by its corresponding regression coefficient. This does not imply that the relationship between the predictors and the response variable must itself be linear. In fact, a linear model can still capture non-linear relationships between the predictors and the response by using transformations of the predictors (Cohen et al., 2002, p. 195).
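As a brief illustration of this point (a Python/NumPy sketch with hypothetical coefficients), a quadratic relationship between the predictor and the response can be captured by a model that remains linear in its coefficients, simply by including the transformed predictor \(x^{2}\) as an additional column:

```python
import numpy as np

rng = np.random.default_rng(1)

n = 100
x = rng.uniform(-3, 3, size=n)

# A non-linear relationship between x and y (hypothetical coefficients)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.3, size=n)

# Still a linear model: linear in the coefficients,
# with the transformed predictor x**2 included as an extra column
Z = np.column_stack([np.ones(n), x, x**2])
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(beta_hat)   # expected to be close to [1.0, 2.0, -0.5]
```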
- Assumption 4
- Common variance (homoscedasticity). There is a parameter \(\sigma^{2}\) such that the conditional variance of \(Y_{i}\) given the values \(z_{1}, \dots , z_{n}\) is \(\sigma^{2}\) for \(i = 1, \dots, n\) (DeGroot & Schervish, 2012, p. 737).
(Constant variance of the error terms (Hair, 2019, p. 287))
- Assumption 5
- Independence. The random variables \(Y_{1}, \dots , Y_{n}\) are independent given the observed \(z_{1}, \dots , z_{n}\) (DeGroot & Schervish, 2012, p. 737).
(Independence of the error terms (Hair, 2019, p. 287))
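Because this appendix serves as a reference for the model diagnostics performed elsewhere in the supplementary material, the following Python/NumPy sketch computes a few residual summaries that are commonly inspected against Assumptions 2, 4, and 5. The data are simulated and the specific summaries are illustrative choices, not procedures prescribed by the cited sources.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data that satisfies the assumptions (illustrative only)
n = 200
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.8 * x + rng.normal(scale=1.0, size=n)

# Fit the model and compute residuals
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
e = y - y_hat

# Assumption 4 (homoscedasticity): |residuals| should not trend with fitted values
print("corr(|e|, y_hat):", np.corrcoef(np.abs(e), y_hat)[0, 1])

# Assumption 5 (independence): lag-1 autocorrelation of residuals should be near 0
print("lag-1 autocorrelation:", np.corrcoef(e[:-1], e[1:])[0, 1])

# Assumption 2 (normality): standardized residual skewness and excess kurtosis near 0
z = (e - e.mean()) / e.std()
print("skewness:", np.mean(z**3), "excess kurtosis:", np.mean(z**4) - 3)
```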