Sample size estimation

Author

Daniel Vartanian

Published

2024-06-05

Project Status: Inactive – The project has reached a stable, usable state but is no longer being actively developed; support/maintenance will be provided as time allows. License: MIT

Overview

The aim of this analysis is to estimate the sample size for a study on the effects of sleep disruption and sleep duration on labor duration (the pregnancy project), with the goal of replicating the findings of Lee & Gay (2004).

Effect size estimation (Cohen’s f)

See Cohen (1988) and Maxwell et al. (2018) to learn more.

Code
# See Maxwell et al. (2018, p. 150)
f_unb <- function(k, n_total, f_test) {
  checkmate::assert_int(k, lower = 2)
  checkmate::assert_int(n_total, lower = 2)
  checkmate::assert_number(f_test)
  
  ((k - 1) / n_total) %>%
  `*`(((n_total - k - 2) / (n_total - k)) * (f_test - 1)) |> 
  sqrt()
}

Parameters

The parameters for the effect size were extracted from Lee & Gay (2004), as shown below.

Controlling for infant birth weight, ANCOVA indicated that the degree of prenatal sleep disruption [WASO] had a significant effect on labor duration (\(\text{F}[2,125] = 6.93\), \(\text{P} = .001\)). (Lee & Gay, 2004, p. 2044)

Controlling for infant birth weight, ANCOVA indicated that prenatal sleep quantity (TST) also had a significant effect on labor duration (\(\text{F}[2,125] = 5.54\), \(\text{P} = .005\)). (Lee & Gay, 2004, p. 2044)

WASO = Wakefulness after sleep onset. TST = Total sleep time.

Table 1: Differences in labor duration by total sleep time and wake after sleep onset (\(n = 131\))
\(n\) Labor duration\(^{\ast}\) (mean \(\pm\) SD)
WASO
\(15\%+\) \(41\) \(26.0 \pm 11.7^{\dagger}\)
\(10\%-14.9\%\) \(32\) \(16.3 \pm 12.9\)
\(<10\%\) \(58\) \(18.3 \pm 14.6\)
TST
\(<6 \text{h}\) \(19\) \(29.0 \pm 12.5^{\dagger}\)
\(6-6.9 \text{h}\) \(38\) \(20.5 \pm 11.3\)
\(7+ \text{h}\) \(74\) \(17.7 \pm 15.6\)

\(^{\ast}\)Adjusted for infant birth weight. \(^{\dagger}\)Significantly longer labor than the other 2 groups.

Source: Lee & Gay (2004, p. 2044).

WASO

Code
k <- 3
num_df <- 2
den_df <- 125
n_total_waso <- den_df + k
f_test <- 6.93

f_unb_waso <- f_unb(k, n_total_waso, f_test)

f_unb_waso
#> [1] 0.3019499

TST

Code
k <- 3
num_df <- 2
den_df <- 125
n_total_tst <- den_df + k
f_test <- 5.54

f_unb_tst <- f_unb(k, n_total_tst, f_test)

f_unb_tst
#> [1] 0.2642016

Sample size estimation

Parameters

Code
alpha <- 0.05
power <- 0.8
k <- 3 # Number of groups

Using the pwr R package

It’s important to note that, according to Cohen (1988, p. 403):

[…] sample size estimation in the analysis of covariance proceeds in exactly the same way as in analogous analysis of variance designs.

WASO

Code
test <- pwr::pwr.anova.test(
  k = k, 
  f = f_unb_waso, 
  sig.level = alpha, 
  power = power
)

test
#> 
#>      Balanced one-way analysis of variance power calculation 
#> 
#>               k = 3
#>               n = 36.24212
#>               f = 0.3019499
#>       sig.level = 0.05
#>           power = 0.8
#> 
#> NOTE: n is number in each group
Code
pwr::plot.power.htest(test, type = "b", main = "WASO")

Code
ceiling(test$n * k) # Total sample size
#> [1] 109

TST

Code
test <- pwr::pwr.anova.test(
  k = k, 
  f = f_unb_tst, 
  sig.level = alpha, 
  power = power
)

test
#> 
#>      Balanced one-way analysis of variance power calculation 
#> 
#>               k = 3
#>               n = 47.02238
#>               f = 0.2642016
#>       sig.level = 0.05
#>           power = 0.8
#> 
#> NOTE: n is number in each group
Code
pwr::plot.power.htest(test, type = "b", main = "TST")

Code
ceiling(test$n * k) # Total sample size
#> [1] 142

Using the G*Power software

See Faul et al. (2007) to learn more.

WASO

Code
F tests - ANCOVA: Fixed effects, main effects and interactions
Analysis:   A priori: Compute required sample size 
Input:  Effect size f   =   0.3019499
    α err prob  =   0.05
    Power (1-β err prob)    =   0.8
    Numerator df    =   2
    Number of groups    =   3
    Number of covariates    =   1
Output: Noncentrality parameter λ   =   9.9379379
    Critical F  =   3.0828520
    Denominator df  =   105
    Total sample size   =   109
    Actual power    =   0.8009725

TST

Code
F tests - ANCOVA: Fixed effects, main effects and interactions
Analysis:   A priori: Compute required sample size 
Input:  Effect size f   =   0.2642016
    α err prob  =   0.05
    Power (1-β err prob)    =   0.8
    Numerator df    =   2
    Number of groups    =   3
    Number of covariates    =   1
Output: Noncentrality parameter λ   =   9.9119529
    Critical F  =   3.0617157
    Denominator df  =   138
    Total sample size   =   142
    Actual power    =   0.8027593

Using Shieh’s method

See Shieh (2020) to learn more.

Code
#' Compute a power analysis and sample size for an ANCOVA design
#'
#' @description
#'
#' `ancova_muhtpo()` computes the __power__ and  `ancova_muhtnp()` the
#' __sample size__ for a generalized linear hypothesis tests of treatment
#' effects.
#'
#' These functions were adapted from "Program C" and "Program D" algorithms
#' found in Shieh (2020) supplementary material.
#'
#' @details
#'
#' ## `contrast` parameter
#'
#' If `contrast = NULL`, `ancova_muhtnp()` will use the following contrast
#' matrix:
#'
#' `cbind(matrix(rep(1, group_n - 1), group_n - 1, 1), - diag(group_n - 1))`
#'
#' * Example:
#'
#' ```{r}
#' group_n <- 3
#'
#' cbind(matrix(rep(1, group_n - 1), group_n - 1, 1), - diag(group_n - 1))
#' ```
#'
#' Learn more about "contrast matrices" at
#' \url{https://en.wikipedia.org/wiki/Contrast_(statistics)}.
#'
#' @param group_n A number, greater than 1, indicating the number of groups.
#' @param group_ssize A [numeric][numeric()] object indicating the groups sample
#'   sizes.
#' @param group_sratio A [numeric][numeric()] object indicating the groups
#'   sample ratios.
#' @param group_amean A [numeric][numeric()] object indicating the adjusted
#'   group means.
#' @param var A number indicating the variance.
#' @param cov_n A number, greater than 0, indicating the number of covariates.
#' @param alpha (optional) a number, from `0` to `1`, indicating the designated
#'   significant level (\eqn{\alpha}{alpha}) (default: `0.05`).
#' @param power (optional) a number, between 0 and 1, indicating the nominal
#'   power (default: `0.8`).
#' @param contrast (optional) a [matrix][matrix()] object with the contrast
#' coefficients. See the Details section to learn more (default: `NULL`).
#'
#' @return A [list][list()] object with the following elements:
#'
#' * `group_n`: a number indicating the number of groups.
#' * `cov_n`: a number indicating the number of covariates.
#' * `alpha`: a number indicating the designated significant level
#' (\eqn{\alpha}{alpha}) used.
#' * `power`: a number indicating the nominal power used (only for
#' `ancova_muhtnp()`).
#' * `lgamsq`: a [matrix][matrix()] object indicating `?`.
#' * `ssize_group`: a [numeric][numeric()] object indicating the sample size for
#' each group.
#' * `ssize_total`: a number indicating the total sample size.
#' * `epower`: a number indicating the estimated statistical power for the
#' sample size (\eqn{1 - \beta}).
#'
#' @references
#' 
#' Shieh, G. (2020). Power analysis and sample size planning in ANCOVA designs.
#' _Psychometrika_, _85_(1), 101–120. \doi{10.1007/s11336-019-09692-3}
#'
#' @examples
#' ancova_muhtpo(group_n = 3, group_ssize = c(10, 10, 10),
#'               group_amean = c(7.5366, 11.9849, 13.9785),
#'               var = 29.0898, cov_n = 1, alpha = 0.05, contrast = NULL)
#'
#' ancova_muhtnp(group_n = 3, group_sratio = c(1, 1, 1),
#'               group_amean = c(7.5366, 11.9849, 13.9785),
#'               var = 29.0898, cov_n = 1, alpha = 0.05, power = 0.8,
#'               contrast = NULL)
ancova_muhtpo <- function(group_n, group_ssize, group_amean, var, cov_n,
                          alpha = 0.05, contrast = NULL) {
  checkmate::assert_int(group_n, lower = 2)
  checkmate::assert_numeric(group_ssize, len = group_n)
  checkmate::assert_numeric(group_amean, len = group_n)
  checkmate::assert_number(var)
  checkmate::assert_int(cov_n, lower = 1)
  checkmate::assert_number(alpha, lower = 0, upper = 1)
  checkmate::assert_matrix(contrast, max.cols = group_n, null.ok = TRUE)

  if (is.null(contrast)) {
    contrast <- cbind(matrix(rep(1, group_n - 1),
                             group_n - 1, 1), - diag(group_n - 1))
  }

  checkmate::assert_true(all(rowSums(contrast) == 0))

  numint <- 2000
  dd <- 1e-5
  coevec <- c(1, rep(c(4, 2), numint / 2 - 1), 4, 1)
  bl <- dd
  bu <- 1 - dd
  intl <- (bu - bl) / numint
  bvec <- bl + intl * (0:numint)
  cmu <- contrast %*% matrix(group_amean, group_n, 1)
  df1 <- nrow(contrast)
  ssize_total <- sum(group_ssize)
  qmat <- diag(ssize_total / group_ssize)
  lgamsq <- t(cmu) %*% solve(contrast %*% qmat %*% t(contrast)) %*% cmu / var

  epowerf <- function() {
    ssize_total <- sum(group_ssize)
    df2 <- ssize_total - group_n - cov_n
    dfx <- df2 + 1
    b <- cov_n / dfx
    fcrit <- stats::qf(1 - alpha, df1, df2)

    if (cov_n == 1) {
      tl <- stats::qt(dd, dfx)
      tu <- stats::qt(1 - dd, dfx)
      intl <- (tu - tl) / numint
      tvec <- tl + intl * (0:numint)
      wtpdf <- (intl / 3) * coevec * stats::dt(tvec, dfx)

      epower <- sum(wtpdf * stats::pf( # nolint
        fcrit, df1, df2, ssize_total * c(lgamsq) / (1 + b * tvec^2),
        lower.tail = FALSE)
      )
    } else {
      wbpdf <- (intl / 3) * coevec * stats::dbeta(bvec, dfx / 2,
                                                  cov_n / 2)

      epower <- sum(wbpdf * stats::pf(
        fcrit, df1, df2, ssize_total * lgamsq * bvec,
        lower.tail = FALSE)
      )
    }
  }

  epower <- epowerf()

  list(group_n = group_n, cov_n = cov_n, alpha = alpha, lgamsq = lgamsq,
       ssize_group = group_ssize, ssize_total = ssize_total, epower = epower)
}

ancova_muhtnp <- function(group_n, group_sratio, group_amean, var, cov_n,
                          alpha = 0.05, power = 0.8, contrast = NULL) {
  checkmate::assert_int(group_n, lower = 2)
  checkmate::assert_numeric(group_sratio, len = group_n)
  checkmate::assert_numeric(group_amean, len = group_n)
  checkmate::assert_number(var)
  checkmate::assert_int(cov_n, lower = 1)
  checkmate::assert_number(alpha, lower = 0, upper = 1)
  checkmate::assert_number(power, lower = 0, upper = 1)
  checkmate::assert_matrix(contrast, max.cols = group_n, null.ok = TRUE)

  if (is.null(contrast)) {
    contrast <- cbind(matrix(rep(1, group_n - 1),
                             group_n - 1, 1), - diag(group_n - 1))
  }

  checkmate::assert_true(all(rowSums(contrast) == 0))

  numint <- 2000
  dd <- 1e-5
  coevec <- c(1, rep(c(4, 2), numint / 2 - 1), 4, 1)
  bl <- dd
  bu <- 1 - dd
  intl <- (bu - bl) / numint
  bvec <- bl + intl * (0:numint)
  cmu <- contrast %*% matrix(group_amean, group_n, 1)
  df1 <- nrow(contrast)
  qmat <- diag(sum(group_sratio) / group_sratio)
  lgamsq <- t(cmu) %*% solve(contrast %*% qmat %*% t(contrast)) %*%
    cmu / var

  epowerf <-  function() {
    ssize_total <- sum(ssize_group)
    df2 <- ssize_total - group_n - cov_n
    dfx <- df2 + 1
    b <- cov_n / dfx
    fcrit <- stats::qf(1 - alpha, df1, df2)

    if (cov_n == 1) {
      tl <- stats::qt(dd, dfx)
      tu <- stats::qt(1 - dd, dfx)
      intl <- (tu - tl) / numint
      tvec <- tl + intl * (0:numint)
      wtpdf <- (intl / 3) * coevec * stats::dt(tvec, dfx)
      ncp <- ssize_total * c(lgamsq) / (1 + b * tvec^2)

      epower <- sum(wtpdf * stats::pf( # nolint
        fcrit, df1, df2, ncp, lower.tail = FALSE)
      )
    } else {
      wbpdf <- (intl / 3) * coevec * stats::dbeta(bvec, dfx / 2,
                                                  cov_n / 2)
      epower <- sum(wbpdf * stats::pf(
        fcrit, df1, df2, ssize_total * lgamsq * bvec,
        lower.tail = FALSE)
      )
    }
  }

  n <- 5
  epower <- 0

  while (epower < power && n < 1000) {
    n <- n + 1
    ssize_group <- n * group_sratio
    epower <- epowerf()
  }

  ssize_group < -n * group_sratio
  ssize_total <- sum(ssize_group)

  list(group_n = group_n, cov_n = cov_n, alpha = alpha, power = power,
       lgamsq = lgamsq, ssize_group = ssize_group, ssize_total = ssize_total,
       epower = epower)
}

WASO

Code
# Group 1: 15+
n_1_waso <- 41
mean_1_adj_waso <- 26
sd_1_waso <- 11.7

# Group 2: 10%-14.9%
n_2_waso <- 32
mean_2_adj_waso <- 16.3
sd_2_waso <- 12.9

# Group 3: <10%
n_3_waso <- 58
mean_3_adj_waso <- 18.3
sd_3_waso <- 14.6

group_sratio <- c(
  n_1_waso / n_total_waso, 
  n_2_waso / n_total_waso, 
  n_3_waso / n_total_waso
)

group_amean <- c(mean_1_adj_waso, mean_2_adj_waso, mean_3_adj_waso)

sd_pooled <- 
  (c(n_1_waso, n_2_waso, n_3_waso) - 1) %>%
  `*`(c(sd_1_waso, sd_2_waso, sd_3_waso)^2) |>
  sum() %>% 
  `/`(sum((c(n_1_waso, n_2_waso, n_3_waso) - 1))) |>
  sqrt()

var_pooled <- sd_pooled^2

ancova_muhtnp(
  group_n = k, 
  group_sratio = group_sratio, 
  group_amean = group_amean, 
  var = var_pooled, 
  cov_n = 1, 
  alpha = alpha, 
  power = power, 
  contrast = NULL
)
#> $group_n
#> [1] 3
#> 
#> $cov_n
#> [1] 1
#> 
#> $alpha
#> [1] 0.05
#> 
#> $power
#> [1] 0.8
#> 
#> $lgamsq
#>            [,1]
#> [1,] 0.08899738
#> 
#> $ssize_group
#> [1] 35.23438 27.50000 49.84375
#> 
#> $ssize_total
#> [1] 112.5781
#> 
#> $epower
#> [1] 0.8008998

TST

Code
# Group 1: <6 h
n_1_tst <- 19
mean_1_adj_tst <- 29
sd_1_tst <- 12.5

# Group 2: 6-6.9 h
n_2_tst <- 38
mean_2_adj_tst <- 20.5
sd_2_tst <- 11.3

# Group 3: 7+ h
n_3_tst <- 74
mean_3_adj_tst <- 17.7
sd_3_tst <- 15.6

group_sratio <- c(
  n_1_tst / n_total_tst, 
  n_2_tst / n_total_tst, 
  n_3_tst / n_total_tst
)

group_amean <- c(mean_1_adj_tst, mean_2_adj_tst, mean_3_adj_tst)

sd_pooled <- 
  (c(n_1_tst, n_2_tst, n_3_tst) - 1) %>%
  `*`(c(sd_1_tst, sd_2_tst, sd_3_tst)^2) |>
  sum() %>% 
  `/`(sum((c(n_1_tst, n_2_tst, n_3_tst) - 1))) |>
  sqrt()

var_pooled <- sd_pooled^2

ancova_muhtnp(
  group_n = k, 
  group_sratio = group_sratio, 
  group_amean = group_amean, 
  var = var_pooled, 
  cov_n = 1, 
  alpha = alpha, 
  power = power, 
  contrast = NULL
)
#> $group_n
#> [1] 3
#> 
#> $cov_n
#> [1] 1
#> 
#> $alpha
#> [1] 0.05
#> 
#> $power
#> [1] 0.8
#> 
#> $lgamsq
#>            [,1]
#> [1,] 0.07479986
#> 
#> $ssize_group
#> [1] 19.29688 38.59375 75.15625
#> 
#> $ssize_total
#> [1] 133.0469
#> 
#> $epower
#> [1] 0.8005404

References

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191. https://doi.org/10.3758/BF03193146
Lee, K. A., & Gay, C. L. (2004). Sleep in late pregnancy predicts length of labor and type of delivery. American Journal of Obstetrics and Gynecology, 191(6), 2041–2046. https://doi.org/10.1016/j.ajog.2004.05.086
Maxwell, S. E., Delaney, H. D., & Kelley, K. (2018). Designing experiments and analyzing data: A model comparison perspective (3rd ed.). Routledge.
Shieh, G. (2020). Power analysis and sample size planning in ANCOVA designs. Psychometrika, 85(1), 101–120. https://doi.org/10.1007/s11336-019-09692-3