Causal Mediation in R: A Practical Guide

Understanding the mechanisms or pathways that connect variables is fundamental for better theories and interventions. Mediation models are a way to investigate such pathways. For example, education may be associated with mental health partly because it changes income. Partnership status may matter because it changes later mental health. While path analysis can help estimate such effects, causal mediation promises to get us closer to causal relationships and treatment effects.

From path analysis to causal mediation

The diagram below shows a basic mediation model. In this notation, path a links the predictor, X, to the mediator, M. Path b links the mediator to the outcome, Y. The direct path, often labelled c', represents the part of the predictor-outcome association that remains after the mediator is included.

Conceptual path diagram showing predictor X, mediator M, outcome Y, indirect paths a and b, and direct path c prime.

That path-analysis approach is a useful way to represent a theoretical mechanism. It tells us whether the estimated path through the mediator is large or small, and whether the direct path remains after the mediator is included. However, the interpretation is often framed as a decomposition of regression paths rather than as a direct answer to a causal question.
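As a rough illustration of that decomposition, the sketch below fits the three regressions behind the diagram on simulated data. The variable names and coefficients are invented for the example and are not taken from the Understanding Society data.

```r
# Simulated data: every name and coefficient here is made up for illustration.
set.seed(1)
n <- 500
x <- rbinom(n, 1, 0.5)             # predictor X
m <- 0.5 * x + rnorm(n)            # mediator M
y <- 0.3 * x + 0.4 * m + rnorm(n)  # outcome Y

fit_m     <- lm(m ~ x)      # gives path a
fit_y     <- lm(y ~ x + m)  # gives path b and the direct path c'
fit_total <- lm(y ~ x)      # gives the total association c

a       <- unname(coef(fit_m)["x"])
b       <- unname(coef(fit_y)["m"])
c_prime <- unname(coef(fit_y)["x"])
c_total <- unname(coef(fit_total)["x"])

indirect <- a * b
c(indirect = indirect, direct = c_prime, total = c_total)

# With OLS fitted on the same rows, the total path equals direct + indirect.
all.equal(c_total, c_prime + indirect)
```

The identity in the last line is specific to linear models fitted on one common sample, which is part of why the product a*b is less straightforward once a model is nonlinear.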

Causal mediation investigates the same types of relationships but changes the focus of interpretation. It estimates an expected treatment effect and calculates how much of the treatment-outcome relationship would operate through the mediator if the treatment changed.

This approach has three main benefits over traditional path analysis. First, it makes the assumptions of the mediation model, and the conditions under which it can be interpreted as causal, more explicit; see the original paper for a discussion of this. Second, it can facilitate interpretation when the mediator or outcome is categorical. Third, it introduces tools for sensitivity analysis of the results.

Preparing the data for causal mediation

The examples use the wide version of the synthetic Understanding Society data. Each respondent is a row, and repeated measures are labelled with suffixes such as _1, _2, and _3. The real Understanding Society data and documentation are available on the study website, via the UK Data Service catalogue entry.

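# Packages used in this guide. select() is called as dplyr::select() below
# to avoid clashes with select() functions from other attached packages.
library(mediation)  # mediate()
library(dplyr)      # mutate(), select()
library(tidyr)      # drop_na()

load("../data/us_clean_syn.RData")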

dat0 <- usw |>
  mutate(
    age_num = as.numeric(haven::zap_labels(age)),
    age10 = (age_num - mean(age_num, na.rm = TRUE)) / 10,
    female = ifelse(gndr.fct == "Female", 1, 0),
    degree = ifelse(degree.fct == "Degree", 1, 0),
    single1 = as.numeric(haven::zap_labels(single_1)),
    single2 = as.numeric(haven::zap_labels(single_2))
  )

Age is centred and divided by ten, so a one-unit change means ten years. The models use gender, age, degree status, income, and baseline variables as controls where needed.
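A small sketch of what the rescaling does to a regression coefficient, using simulated ages rather than the study data (the numbers are invented for the example):

```r
# Simulated example: a slope per 10 years is ten times the slope per year.
set.seed(2)
age   <- sample(20:80, 200, replace = TRUE)
y     <- 50 + 0.05 * age + rnorm(200)
age10 <- (age - mean(age)) / 10  # centred, in units of ten years

slope_per_year   <- unname(coef(lm(y ~ age))["age"])
slope_per_decade <- unname(coef(lm(y ~ age10))["age10"])

c(per_year_times_10 = slope_per_year * 10, per_decade = slope_per_decade)
```

Centring changes only the intercept, which then refers to a respondent of average age.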

Causal mediation in a simple continuous example

We start with the example closest to a standard mediation using path models. Degree is the treatment, log income is the mediator, and mental health is the outcome. The mediator and outcome are continuous, so the model is easy to compare with a standard path-analysis approach.

The mediator and outcome models must be fitted on the same analysis sample. Therefore, it is safer to create the complete-case dataset first, rather than allowing each model to silently drop a different set of rows.
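The problem is easy to reproduce on a toy data frame. In this sketch the variable names and the missingness pattern are invented; the point is only that lm() drops incomplete rows separately for each model.

```r
# Toy data with different missingness in the mediator and the outcome.
set.seed(3)
n <- 100
toy <- data.frame(treat = rbinom(n, 1, 0.5), med = rnorm(n), out = rnorm(n))
toy$med[1:10]  <- NA
toy$out[11:30] <- NA

# Fitted on the raw data, the two models use different samples.
n_m_raw <- nobs(lm(med ~ treat, data = toy))        # drops rows with missing med
n_y_raw <- nobs(lm(out ~ treat + med, data = toy))  # drops rows missing med or out

# Restricting to complete cases first gives both models the same rows.
toy_cc <- toy[complete.cases(toy), ]
n_m_cc <- nobs(lm(med ~ treat, data = toy_cc))
n_y_cc <- nobs(lm(out ~ treat + med, data = toy_cc))

c(n_m_raw = n_m_raw, n_y_raw = n_y_raw, n_m_cc = n_m_cc, n_y_cc = n_y_cc)
```

The examples in this guide get the same effect with dplyr::select() followed by drop_na().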

To estimate the model, we fit the mediator and outcome regressions separately and then pass both fitted models to mediate().

dat_ex1 <- dat0 |>
  dplyr::select(degree, logincome_1, sf12mcs_1, age10, female) |>
  drop_na()

model_m1 <- lm(
  logincome_1 ~ degree + age10 + female,
  data = dat_ex1
)

model_y1 <- lm(
  sf12mcs_1 ~ degree + logincome_1 + age10 + female,
  data = dat_ex1
)

med_1 <- mediate(
  model_m1,
  model_y1,
  treat = "degree",
  mediator = "logincome_1",
  sims = 200
)

Before using a cleaned table, it is worth looking at the output that R gives. The four rows to read first are ACME, ADE, Total Effect, and Prop. Mediated.

summary(med_1)
## 
## Causal Mediation Analysis 
## 
## Quasi-Bayesian Confidence Intervals
## 
##                 Estimate 95% CI Lower 95% CI Upper p-value    
## ACME            0.019885    -0.029935     0.066655    0.38    
## ADE             1.197903     1.001742     1.388387  <2e-16 ***
## Total Effect    1.217788     1.015919     1.395795  <2e-16 ***
## Prop. Mediated  0.016447    -0.023193     0.053365    0.38    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Sample Size Used: 47350 
## 
## 
## Simulations: 200

ACME means average causal mediation effect. It is the estimated part of the treatment-outcome relationship that operates through the mediator, averaged over the people in the analysis sample. In this example, it is the pathway from degree status to mental health through income.

ADE means average direct effect. It is the part of the fitted treatment effect that does not operate through the mediator. The Total Effect row combines the mediated and direct parts, while Prop. Mediated reports the estimated share carried by the mediator.
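For readers who prefer notation, these quantities are usually written with potential outcomes. The symbols below follow the common convention in the causal mediation literature and are an addition to this guide, not part of the R output: Y_i(t, m) is the outcome person i would have under treatment t and mediator value m, and M_i(t) is the mediator value they would have under treatment t.

```latex
\begin{align*}
\text{ACME:}\quad \delta(t) &= E\big[\, Y_i(t, M_i(1)) - Y_i(t, M_i(0)) \,\big] \\
\text{ADE:}\quad  \zeta(t)  &= E\big[\, Y_i(1, M_i(t)) - Y_i(0, M_i(t)) \,\big] \\
\text{Total effect:}\quad \tau &= E\big[\, Y_i(1, M_i(1)) - Y_i(0, M_i(0)) \,\big]
  = \delta(t) + \zeta(1 - t), \qquad t \in \{0, 1\} \\
\text{Prop. mediated:}\quad & \delta(t) \,/\, \tau
\end{align*}
```

When the outcome model has no treatment-mediator interaction, as in the examples here, the ACME and ADE do not depend on t, which is why the summary shows a single row for each.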

In this case, the indirect effect through income is small and not statistically distinguishable from zero. The ACME is 0.020 (95% CI -0.030 to 0.067), while the ADE is 1.198. Therefore, most of the estimated association between degree status and mental health does not run through income.
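The numbers in the summary can also be pulled out of the fitted object to build a compact table. The sketch below uses simulated data, and mediate_table() is a hypothetical helper written for this example, not a function from the mediation package. It assumes no treatment-mediator interaction, so the control-condition estimates (d0, z0, n0) are the ones reported.

```r
library(mediation)

# Simulated data: names and coefficients are made up for illustration.
set.seed(4)
n <- 400
toy <- data.frame(treat = rbinom(n, 1, 0.5))
toy$med <- 0.5 * toy$treat + rnorm(n)
toy$out <- 0.3 * toy$treat + 0.4 * toy$med + rnorm(n)

fit_m <- lm(med ~ treat, data = toy)
fit_y <- lm(out ~ treat + med, data = toy)
med_toy <- mediate(fit_m, fit_y, treat = "treat", mediator = "med", sims = 200)

# Hypothetical helper: collect estimates, intervals, and p-values in one table.
mediate_table <- function(x) {
  data.frame(
    quantity = c("ACME: indirect effect", "ADE: direct effect",
                 "Total effect", "Proportion mediated"),
    estimate = unname(c(x$d0, x$z0, x$tau.coef, x$n0)),
    ci_low   = unname(c(x$d0.ci[1], x$z0.ci[1], x$tau.ci[1], x$n0.ci[1])),
    ci_high  = unname(c(x$d0.ci[2], x$z0.ci[2], x$tau.ci[2], x$n0.ci[2])),
    p_value  = unname(c(x$d0.p, x$z0.p, x$tau.p, x$n0.p)),
    n        = x$nobs
  )
}

tab <- mediate_table(med_toy)
print(tab, digits = 3)
```

Because mediate() relies on simulation, the interval limits change slightly between runs unless a seed is set.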

Path diagram showing causal mediation from degree to mental health through income.

For a linear model like this one, a path analysis would usually give a similar point estimate for the indirect effect. The approach becomes more useful when we move away from a fully linear model. The next example does that by making the mediator binary.

Causal mediation with a binary mediator

Suppose the mediator is not continuous. In this example, degree status is the treatment, being single at wave 1 is the mediator, and mental health at wave 1 is the outcome. The mediator model is a probit regression.

This example is close to the kind of problem that motivates this method in practice. A simple product a*b is harder to interpret when one path comes from a probit model and the other path comes from a linear model, because the two coefficients are on different scales. In this setting, the ACME and ADE estimated by mediate() offer a better way to interpret the results.
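To make the scale problem concrete, the sketch below compares the naive product a*b with a hand-computed indirect effect on simulated data. It assumes a linear outcome model with no treatment-mediator interaction, in which case the indirect effect is b times the change in the predicted probability of the mediator. All names and coefficients are invented, and this gives only a point estimate; mediate() adds the simulation-based uncertainty.

```r
# Simulated data with a binary mediator.
set.seed(5)
n <- 2000
treat <- rbinom(n, 1, 0.5)
med   <- rbinom(n, 1, pnorm(-0.3 + 0.6 * treat))
out   <- 1.0 * treat + 0.8 * med + rnorm(n)
toy   <- data.frame(treat, med, out)

fit_m <- glm(med ~ treat, data = toy, family = binomial("probit"))
fit_y <- lm(out ~ treat + med, data = toy)

a <- unname(coef(fit_m)["treat"])  # on the probit (latent) scale
b <- unname(coef(fit_y)["med"])    # in outcome units

# Predicted probability of the mediator if everyone were treated / untreated.
p1 <- predict(fit_m, newdata = transform(toy, treat = 1), type = "response")
p0 <- predict(fit_m, newdata = transform(toy, treat = 0), type = "response")

# Indirect effect in outcome units: b times the change in P(mediator = 1).
acme_manual <- b * mean(p1 - p0)

c(product_ab = a * b, acme_manual = acme_manual)
```

The two numbers differ because a is a shift on the latent probit scale, while the indirect effect depends on how much the probability of the mediator actually changes.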

dat_ex3 <- dat0 |>
  dplyr::select(degree, single1, sf12mcs_1, logincome_1, age10, female) |>
  drop_na()

model_m3 <- glm(
  single1 ~ degree + logincome_1 + age10 + female,
  data = dat_ex3,
  family = binomial("probit")
)

model_y3 <- lm(
  sf12mcs_1 ~ degree + single1 + logincome_1 + age10 + female,
  data = dat_ex3
)

med_3 <- mediate(
  model_m3,
  model_y3,
  treat = "degree",
  mediator = "single1",
  sims = 200
)

Quantity                 Estimate   CI low   CI high   p-value   N
ACME: indirect effect    0.094      0.073    0.112     < .001    47350
ADE: direct effect       1.112      0.928    1.305     < .001    47350
Total effect             1.206      1.029    1.398     < .001    47350
Proportion mediated      0.078      0.060    0.098     < .001    47350

Causal mediation results with a binary mediator

Here, the ACME is 0.094, while the ADE is 1.112. The proportion mediated is 0.078, so roughly 8% of the total effect runs through single status. The indirect pathway is statistically distinguishable from zero but modest.

This result should be read in relation to the direct effect. The ADE is much larger than the ACME, so single status carries only a small part of the degree-mental health association.

Path diagram showing causal mediation from degree to mental health through single status as a binary mediator.

Longitudinal causal mediation with temporal ordering

Cross-sectional mediation is often problematic because treatment, mediator, and outcome are measured at the same time. A longitudinal design can improve the argument by ordering the variables. It does not remove all confounding, but it makes reverse causation less plausible.

Here, single status at wave 1 is the treatment. Mental health at wave 2 is the mediator, and life satisfaction at wave 3 is the outcome. Both models also control for baseline mental health and baseline life satisfaction, along with income, age, gender, and degree status.

dat_ex5 <- dat0 |>
  dplyr::select(
    single1, sf12mcs_1, sf12mcs_2, sati_1, sati_3,
    logincome_1, age10, female, degree
  ) |>
  drop_na()

model_m5 <- lm(
  sf12mcs_2 ~ single1 + sf12mcs_1 + sati_1 +
    logincome_1 + age10 + female + degree,
  data = dat_ex5
)

model_y5 <- lm(
  sati_3 ~ single1 + sf12mcs_2 + sf12mcs_1 + sati_1 +
    logincome_1 + age10 + female + degree,
  data = dat_ex5
)

med_5 <- mediate(
  model_m5,
  model_y5,
  treat = "single1",
  mediator = "sf12mcs_2",
  sims = 200
)

Quantity                 Estimate   CI low   CI high   p-value   N
ACME: indirect effect    -0.005     -0.011   0.002     0.140     18241
ADE: direct effect       -0.011     -0.054   0.031     0.690     18241
Total effect             -0.015     -0.059   0.027     0.510     18241
Proportion mediated      0.134      -1.599   3.314     0.590     18241

Longitudinal causal mediation results

The ACME is -0.005, which is small, and its confidence interval includes zero. For this example, there is little evidence that wave-2 mental health carries much of the association between wave-1 single status and wave-3 life satisfaction. The proportion mediated is also very imprecise (the interval runs from -1.599 to 3.314) because the total effect itself is close to zero. These statements are conditional on controlling for baseline mental health and baseline life satisfaction.

Path diagram showing longitudinal causal mediation from single status at wave 1 to life satisfaction at wave 3 through mental health at wave 2.

Learning the mediation workflow

One useful extension to this method is sensitivity analysis, which investigates how strong unmeasured confounding would need to be to change the conclusions. This is a useful check on model results and is rarely done in path analysis and structural equation models.
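As a minimal sketch of what this looks like in practice, the mediation package provides medsens(), which takes a fitted mediate() object. The example uses simulated data with invented names and coefficients, and the small sims values are chosen only to keep it quick. It varies rho, the correlation between the error terms of the mediator and outcome models, and reports how the ACME changes.

```r
library(mediation)

# Simulated data: names and coefficients are made up for illustration.
set.seed(6)
n <- 400
toy <- data.frame(treat = rbinom(n, 1, 0.5), covar = rnorm(n))
toy$med <- 0.5 * toy$treat + 0.3 * toy$covar + rnorm(n)
toy$out <- 0.3 * toy$treat + 0.4 * toy$med + 0.2 * toy$covar + rnorm(n)

fit_m <- lm(med ~ treat + covar, data = toy)
fit_y <- lm(out ~ treat + med + covar, data = toy)
med_toy <- mediate(fit_m, fit_y, treat = "treat", mediator = "med", sims = 200)

# Sensitivity of the indirect effect to correlated errors (rho).
sens_toy <- medsens(med_toy, rho.by = 0.1, effect.type = "indirect", sims = 100)
summary(sens_toy)

# Plot the ACME as a function of rho.
plot(sens_toy, sens.par = "rho")
```

The value of rho at which the ACME crosses zero gives a sense of how robust the mediated effect is: the further from zero it lies, the stronger the unmeasured confounding would need to be.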

If you want a more structured introduction to doing mediation analysis, check out our online course: Mediation and Moderation Analysis Using R. It covers path analysis, moderation, mediated moderation, causal mediation, and interpretation in R.

