Appendix H — Overview of General Linear Models
H.1 Overview
This document presents basic concepts and assumptions regarding general linear models. Its main goal is to provide a reference for the model diagnostics performed in the other sections of the thesis supplementary material.
H.2 Definitions
Definition H.1 (Response/Predictor/Regression) The variables \(X_{1}, \dots, X_{k}\) are called predictors, and the random variable \(Y\) is called the response. The conditional expectation of \(Y\) for given values \(x_{1}, \dots, x_{k}\) of \(X_{1}, \dots, X_{k}\), written \(E(Y|x_{1}, \dots, x_{k})\), is called the regression function of \(Y\) on \(X_{1}, \dots, X_{k}\), or simply the regression of \(Y\) on \(X_{1}, \dots, X_{k}\) (DeGroot & Schervish, 2012, p. 699).
Definition H.2 (Linear Regression) A regression function \(E(Y|x_{1}, \dots, x_{k})\) that is a linear function of the parameters \(\beta_{0}, \dots, \beta_{k}\), i.e., one having the following form (DeGroot & Schervish, 2012, pp. 699–700):
\[ E(Y|x_{1}, \dots, x_{k}) = \beta_{0} + \beta_{1} x_{1} + \dots + \beta_{k} x_{k} \tag{H.1}\]
Definition H.3 (Regression Coefficients) The coefficients \(\beta_{0}, \dots, \beta_{k}\) in Equation H.1 are called regression coefficients (DeGroot & Schervish, 2012, pp. 699–700).
Definition H.4 (Simple Linear Regression) A regression of \(Y\) on just a single variable \(X\). For each value \(X = x\), the random variable \(Y\) can be represented in the form \(Y = \beta_{0} + \beta_{1}x + \epsilon\), where \(\epsilon\) is a random variable that has the normal distribution with mean \(0\) and variance \(\sigma^{2}\). It follows from this assumption that the conditional distribution of \(Y\) given \(X = x\) is the normal distribution with mean \(\beta_{0} + \beta_{1}x\) and variance \(\sigma^{2}\) (DeGroot & Schervish, 2012, p. 700).
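To make Definition H.4 concrete, the following Python/NumPy sketch simulates data from \(Y = \beta_{0} + \beta_{1}x + \epsilon\) and recovers the coefficients by least squares. The parameter values, sample size, and seed are arbitrary choices for illustration, not values taken from the cited sources.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical parameter values chosen for illustration only
beta_0, beta_1, sigma = 2.0, 0.5, 1.0
n = 100

x = rng.uniform(0, 10, size=n)            # values of the single predictor X
epsilon = rng.normal(0, sigma, size=n)    # error term with distribution N(0, sigma^2)
y = beta_0 + beta_1 * x + epsilon         # Y = beta_0 + beta_1 * x + epsilon

# Least-squares estimates of beta_0 and beta_1
X = np.column_stack([np.ones(n), x])      # design matrix with an intercept column
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                           # expected to be close to [2.0, 0.5]
```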
Definition H.5 (Multiple Linear Regression) A linear regression of \(Y\) on \(k\) variables \(X_{1}, \dots, X_{k}\), rather than on just a single variable \(X\). In a problem of multiple linear regression, we obtain \(n\) vectors of observations \((x_{i1}, \dots, x_{ik}, Y_{i})\), for \(i = 1, \dots, n\). Here \(x_{ij}\) is the observed value of the variable \(X_{j}\) for the \(i\)th observation. The expectation \(E(Y_{i})\) is given by the relation (DeGroot & Schervish, 2012, p. 738):
\[ E(Y_{i}) = \beta_{0} + \beta_{1} x_{i1} + \dots + \beta_{k} x_{ik} \tag{H.2}\]
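A similar sketch of Equation H.2 for \(k = 2\) predictors, again in Python/NumPy with hypothetical coefficients, illustrates the \(n \times (k + 1)\) design matrix whose \(i\)th row is \((1, x_{i1}, x_{i2})\):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coefficients (beta_0, beta_1, beta_2) for illustration only
beta = np.array([1.0, 2.0, -3.0])
n = 200

x1 = rng.normal(size=n)                   # observed values of X_1
x2 = rng.normal(size=n)                   # observed values of X_2

# Design matrix: row i holds (1, x_i1, x_i2)
X = np.column_stack([np.ones(n), x1, x2])

# E(Y_i) = beta_0 + beta_1 * x_i1 + beta_2 * x_i2  (Equation H.2 with k = 2)
mean_y = X @ beta
y = mean_y + rng.normal(scale=0.5, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                           # expected to be close to [1.0, 2.0, -3.0]
```

The intercept is handled by a constant first column of ones, which plays the role of the \(z_{i0}\) component (commonly fixed at 1) in the general linear model discussed below.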
Definition H.6 (Residuals/Fitted Values) For \(i = 1, \dots, n\), the observed values of \(\hat{y}_{i} = \hat{\beta}_{0} + \hat{\beta}_{1} x_{i}\) are called fitted values. For \(i = 1, \dots, n\), the observed values of \(e_{i} = y_{i} - \hat{y}_{i}\) are called residuals (DeGroot & Schervish, 2012, p. 717).
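A minimal sketch of Definition H.6, using a tiny made-up data set, computes the fitted values \(\hat{y}_{i}\) and the residuals \(e_{i} = y_{i} - \hat{y}_{i}\) from the least-squares estimates:

```python
import numpy as np

# Tiny made-up data set (illustrative values only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

# Least-squares estimates of beta_0 and beta_1
X = np.column_stack([np.ones_like(x), x])
(beta0_hat, beta1_hat), *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = beta0_hat + beta1_hat * x   # fitted values
e = y - y_hat                       # residuals e_i = y_i - y_hat_i
print(y_hat)
print(e)
```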
Definition H.7 (General Linear Model) The statistical model in which the observations \(Y_{1}, \dots, Y_{n}\) satisfy the following assumptions (DeGroot & Schervish, 2012, p. 738).
H.3 Assumptions
Here we shall assume that each of the observations \(Y_{1}, \dots, Y_{n}\) has a normal distribution, that these observations are independent, and that they all have the same variance \(\sigma^{2}\). Instead of a single predictor being associated with each \(Y_{i}\), we assume that a \(p\)-dimensional vector \(z_{i} = (z_{i0}, \dots, z_{ip - 1})\) is associated with each \(Y_{i}\) (DeGroot & Schervish, 2012, p. 736).
- Assumption 1
- Predictor is known. Either the vectors \(z_{1}, \dots , z_{n}\) are known ahead of time, or they are the observed values of random vectors \(Z_{1}, \dots , Z_{n}\) on whose values we condition before computing the joint distribution of \((Y_{1}, \dots , Y_{n})\) (DeGroot & Schervish, 2012, p. 736).
- Assumption 2
- Normality. For \(i = 1, \dots, n\), the conditional distribution of \(Y_{i}\) given the vectors \(z_{1}, \dots , z_{n}\) is a normal distribution (DeGroot & Schervish, 2012, p. 737).
(Normality of the error term distribution (Hair, 2019, p. 287))
- Assumption 3
- Linear mean. There is a vector of parameters \(\beta = (\beta_{0}, \dots, \beta_{p - 1})\) such that the conditional mean of \(Y_{i}\) given the values \(z_{1}, \dots , z_{n}\) has the form
\[ z_{i0} \beta_{0} + z_{i1} \beta_{1} + \cdots + z_{ip - 1} \beta_{p - 1} \]
for \(i = 1, \dots, n\) (DeGroot & Schervish, 2012, p. 737).
(Linearity of the phenomenon measured (Hair, 2019, p. 287))
It is important to clarify that the linearity assumption pertains to linearity in the parameters or, equivalently, linearity in the coefficients: each predictor is simply multiplied by its corresponding regression coefficient. This does not imply that the relationship between the predictors and the response variable must itself be linear. In fact, a linear model can still capture non-linear relationships between the predictors and the response by using transformations of the predictors (Cohen et al., 2002, p. 195).
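As a brief illustration of this point (a Python/NumPy sketch with hypothetical coefficients), a quadratic relationship between the predictor and the response can be captured by a model that remains linear in its coefficients, simply by including the transformed predictor \(x^{2}\) as an additional column:

```python
import numpy as np

rng = np.random.default_rng(1)

n = 100
x = rng.uniform(-3, 3, size=n)

# A non-linear relationship between x and y (hypothetical coefficients)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.3, size=n)

# Still a linear model: linear in the coefficients,
# with the transformed predictor x**2 included as an extra column
Z = np.column_stack([np.ones(n), x, x**2])
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(beta_hat)   # expected to be close to [1.0, 2.0, -0.5]
```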
- Assumption 4
- Common variance (homoscedasticity). There is a parameter \(\sigma^{2}\) such that the conditional variance of \(Y_{i}\) given the values \(z_{1}, \dots , z_{n}\) is \(\sigma^{2}\) for \(i = 1, \dots, n\) (DeGroot & Schervish, 2012, p. 737).
(Constant variance of the error terms (Hair, 2019, p. 287))
- Assumption 5
- Independence. The random variables \(Y_{1}, \dots , Y_{n}\) are independent given the observed \(z_{1}, \dots , z_{n}\) (DeGroot & Schervish, 2012, p. 737).
(Independence of the error terms (Hair, 2019, p. 287))
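Because this appendix serves as a reference for the model diagnostics performed elsewhere in the supplementary material, the following Python/NumPy sketch computes a few residual summaries that are commonly inspected against Assumptions 2, 4, and 5. The data are simulated and the specific summaries are illustrative choices, not procedures prescribed by the cited sources.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data that satisfies the assumptions (illustrative only)
n = 200
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.8 * x + rng.normal(scale=1.0, size=n)

# Fit the model and compute residuals
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
e = y - y_hat

# Assumption 4 (homoscedasticity): |residuals| should not trend with fitted values
print("corr(|e|, y_hat):", np.corrcoef(np.abs(e), y_hat)[0, 1])

# Assumption 5 (independence): lag-1 autocorrelation of residuals should be near 0
print("lag-1 autocorrelation:", np.corrcoef(e[:-1], e[1:])[0, 1])

# Assumption 2 (normality): standardized residual skewness and excess kurtosis near 0
z = (e - e.mean()) / e.std()
print("skewness:", np.mean(z**3), "excess kurtosis:", np.mean(z**4) - 3)
```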