Representing complex social relationships using statistical models is often a challenge. Popular techniques like regression analysis do not really capture the full complexity of relationships between variables. One approach to moving beyond this simplistic model is to use techniques such as mediation analysis. This allows for the inclusion of multiple pathways to better represent the relationship between variables. In a previous post, we introduced mediation analysis, and we looked at some relatively simple examples. Here, we extend this to examine what happens in a more complex situation when using a multiple mediation model.
As an example, let’s consider the association between being single and life satisfaction. Physical health, mental health, and income could each contribute to that association. Estimating three separate single-mediator models would only give us a limited and bias image of the relationships in the data. To better understand these relationships we need a multiple mediation model.
What is a multiple mediation model?
A standard mediation model contains one mediator between a predictor and an outcome. A multiple mediation model contains two or more mediators. The mediators can operate in parallel, in sequence, or through a mixture of both.
Parallel mediation asks whether several mechanisms explain the a relationship at the same time. Serial mediation makes a stronger claim: one mediator changes another mediator, which then changes the outcome. The distinction matters because the indirect effect is calculated from different paths.
In the R package lavaan, an indirect effect is defined by labelling regression coefficients and multiplying them. For a pathway from X to M to Y, the indirect effect is a*b:
model <- ' M ~ a*X Y ~ b*M + c*X indirect := a*b total := c + indirect '
For a refresher on the single-mediator case, see the practical introduction to path and mediation analysis in R. The official lavaan mediation tutorial also explains parameter labels and defined effects.
How to fit a multiple mediation model in R
The examples use usw, which is the wide data from Understanding Society, where each respondent represents one row and repeated measurements have wave suffixes such as _1, _2, and _3.
load("../data/us_clean_syn.RData")
dat <- usw |>
mutate(
age_num = as.numeric(haven::zap_labels(age)),
age10 = (age_num - mean(age_num, na.rm = TRUE)) / 10,
female = ifelse(gndr.fct == "Female", 1, 0),
degree = ifelse(degree.fct == "Degree", 1, 0),
single1 = as.numeric(haven::zap_labels(single_1))
)
dat |>
select(single1, degree, age10, female,
logincome_1, sati_1,
sf12pcs_1, sf12mcs_1,
sf12pcs_2, sf12mcs_2, sati_3) |>
summary()## single1 degree age10 female
## Min. :0.0000 Min. :0.0000 Min. :-2.9534 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:-1.4534 1st Qu.:0.0000
## Median :0.0000 Median :0.0000 Median :-0.1534 Median :1.0000
## Mean :0.3864 Mean :0.3251 Mean : 0.0000 Mean :0.5473
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.: 1.3466 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.0000 Max. : 5.5466 Max. :1.0000
## NA's :111
## logincome_1 sati_1 sf12pcs_1 sf12mcs_1
## Min. :2.303 Min. :1.000 Min. : 4.41 Min. : 0.00
## 1st Qu.:6.267 1st Qu.:5.000 1st Qu.:45.08 1st Qu.:45.91
## Median :7.012 Median :6.000 Median :53.46 Median :52.99
## Mean :6.633 Mean :5.248 Mean :49.49 Mean :50.44
## 3rd Qu.:7.581 3rd Qu.:6.000 3rd Qu.:57.20 3rd Qu.:57.16
## Max. :9.211 Max. :7.000 Max. :74.90 Max. :76.35
## NA's :42 NA's :11581 NA's :3593 NA's :3593
## sf12pcs_2 sf12mcs_2 sati_3
## Min. : 5.06 Min. : 0.00 Min. :1.000
## 1st Qu.:44.66 1st Qu.:44.62 1st Qu.:4.000
## Median :53.56 Median :51.83 Median :6.000
## Mean :49.76 Mean :49.84 Mean :5.115
## 3rd Qu.:57.49 3rd Qu.:57.09 3rd Qu.:6.000
## Max. :71.95 Max. :77.09 Max. :7.000
## NA's :23293 NA's :23293 NA's :23641Age is centred and divided by ten, so a one-unit increase means ten years. The models adjust for age and gender to control for their possible confounding.
Example 1: Two parallel mediators
In the first example we explore whether being single is associated with life satisfaction through physical and mental health. The two mediators are estimated together and their residuals are allowed to correlate to account for any common variance.
model_1 <- '
sf12pcs_1 ~ a1*single1 + age10 + female
sf12mcs_1 ~ a2*single1 + age10 + female
sati_1 ~ b1*sf12pcs_1 + b2*sf12mcs_1 +
c*single1 + age10 + female
sf12pcs_1 ~~ sf12mcs_1
ind_physical := a1*b1
ind_mental := a2*b2
ind_total := ind_physical + ind_mental
total := c + ind_total
'
fit_1 <- fit_mediation(model_1)
Blue arrows are positive and red arrows are negative. The labels show standardised coefficients. Age, gender, and the mediator residual correlation remain in the fitted model but are omitted from the diagram.
Here are the key coefficients of interest:
| Effect | Estimate | SE | p | 95% CI | Standardized |
|---|---|---|---|---|---|
| ind_physical | -0.049 | 0.003 | <.001 | [-0.055, -0.044] | -0.016 |
| ind_mental | -0.076 | 0.006 | <.001 | [-0.087, -0.065] | -0.025 |
| ind_total | -0.125 | 0.006 | <.001 | [-0.138, -0.113] | -0.042 |
| total | -0.266 | 0.016 | <.001 | [-0.297, -0.236] | -0.089 |
Being single predicts poorer physical and mental health. Both health measures predict higher life satisfaction, so both indirect effects are negative. The indirect effect through physical health is -0.049; the effect through mental health is -0.076. Together they account for an indirect effect of -0.125 points on the satisfaction scale.
Mental health provides a stronger pathway, but both effects are small. With more than 50,000 respondents, small coefficients can be estimated precisely, so statistical significance should not be mistaken for practical importance.
Example 2: Three mediators in a multiple mediation model
A multiple mediation model can include more than two mediators. Here, we add log income to the physical- and mental-health pathways.
model_2 <- '
sf12pcs_1 ~ a1*single1 + age10 + female
sf12mcs_1 ~ a2*single1 + age10 + female
logincome_1 ~ a3*single1 + age10 + female
sati_1 ~ b1*sf12pcs_1 + b2*sf12mcs_1 + b3*logincome_1 +
c*single1 + age10 + female
sf12pcs_1 ~~ sf12mcs_1 + logincome_1
sf12mcs_1 ~~ logincome_1
ind_physical := a1*b1
ind_mental := a2*b2
ind_income := a3*b3
ind_total := ind_physical + ind_mental + ind_income
total := c + ind_total
'
fit_2 <- fit_mediation(model_2)Here is a visual representation of the results and the main coefficients of interest:

| Effect | Estimate | SE | p | 95% CI | Standardized |
|---|---|---|---|---|---|
| ind_physical | -0.051 | 0.003 | <.001 | [-0.057, -0.045] | -0.017 |
| ind_mental | -0.077 | 0.006 | <.001 | [-0.088, -0.065] | -0.025 |
| ind_income | 0.015 | 0.002 | <.001 | [0.011, 0.019] | 0.005 |
| ind_total | -0.113 | 0.007 | <.001 | [-0.126, -0.099] | -0.037 |
| total | -0.266 | 0.016 | <.001 | [-0.297, -0.236] | -0.089 |
The physical-health (-0.051) and mental-health (-0.077) pathways are negative. The income pathway is small and positive (0.015). It partially offsets the health pathways rather than reinforcing them.
This is exactly why separate mediation models are a poor substitute for one multiple-mediator model. Separate models would not show how the pathways compete after accounting for the same controls and the other mediators.
Example 3: Parallel and serial pathways together
The third model uses degree status as the predictor and mental health as the outcome. Education can operate through income alone, satisfaction alone, or income followed by satisfaction.
model_3 <- '
logincome_1 ~ a*degree + age10 + female
sati_1 ~ d*logincome_1 + e*degree + age10 + female
sf12mcs_1 ~ b*sati_1 + f*logincome_1 +
c*degree + age10 + female
ind_income := a*f
ind_satisfaction := e*b
ind_serial := a*d*b
ind_total := ind_income + ind_satisfaction + ind_serial
total := c + ind_total
'
fit_3 <- fit_mediation(model_3)Again, here is the visual representation and main coefficients.

| Effect | Estimate | SE | p | 95% CI | Standardized |
|---|---|---|---|---|---|
| ind_income | 0.056 | 0.024 | 0.019 | [0.009, 0.103] | 0.003 |
| ind_satisfaction | 0.467 | 0.042 | <.001 | [0.384, 0.549] | 0.022 |
| ind_serial | -0.032 | 0.011 | 0.003 | [-0.054, -0.011] | -0.001 |
| ind_total | 0.490 | 0.047 | <.001 | [0.399, 0.582] | 0.023 |
| total | 1.222 | 0.093 | <.001 | [1.04, 1.404] | 0.057 |
The satisfaction-only pathway is the largest (0.467). By comparison, the income-only pathway is positive but small (0.056). A negative serial pathway (-0.032) appears because, conditional on education and the other controls, income has a small negative association with satisfaction in this synthetic dataset.
Example 4: Longitudinal parallel mediation
The previous examples use only cross-sectional data. Their paths make assumptions about causal order but the variables are measured at the same wave. This means that we are just assuming the relationships go in that particular direction. When using longitudinal data we can use the temporal order to help us decide on the causal direction.
In this example, age at the first wave predicts physical and mental health at wave 2. Those mediators predict life satisfaction at wave 3. The model controls wave-1 physical health, mental health, and satisfaction in both the mediator and outcome equations.
model_4 <- '
sf12pcs_2 ~ a1*age10 + sf12pcs_1 + sf12mcs_1 + sati_1 + female
sf12mcs_2 ~ a2*age10 + sf12pcs_1 + sf12mcs_1 + sati_1 + female
sati_3 ~ b1*sf12pcs_2 + b2*sf12mcs_2 + c*age10 +
sati_1 + sf12pcs_1 + sf12mcs_1 + female
sf12pcs_2 ~~ sf12mcs_2
ind_physical := a1*b1
ind_mental := a2*b2
ind_total := ind_physical + ind_mental
total := c + ind_total
'
fit_4 <- fit_mediation(model_4)
Baseline controls are omitted from the diagram but remain in the model and are shown below.
| Effect | Estimate | SE | p | 95% CI | Standardized |
|---|---|---|---|---|---|
| ind_physical | -0.013 | 0.002 | <.001 | [-0.017, -0.009] | -0.015 |
| ind_mental | 0.011 | 0.001 | <.001 | [0.009, 0.013] | 0.013 |
| ind_total | -0.002 | 0.002 | 0.365 | [-0.006, 0.002] | -0.002 |
| total | 0.039 | 0.006 | <.001 | [0.027, 0.05] | 0.046 |
| Outcome | Predictor | Estimate | p | Standardized |
|---|---|---|---|---|
| sf12pcs_2 | age10 | -1.371 | <.001 | -0.226 |
| sf12pcs_2 | sf12pcs_1 | 0.487 | <.001 | 0.500 |
| sf12pcs_2 | sf12mcs_1 | 0.092 | <.001 | 0.084 |
| sf12pcs_2 | sati_1 | 0.317 | <.001 | 0.043 |
| sf12pcs_2 | female | 0.029 | 0.791 | 0.001 |
| sf12mcs_2 | age10 | 0.443 | <.001 | 0.083 |
| sf12mcs_2 | sf12pcs_1 | 0.029 | <.001 | 0.034 |
| sf12mcs_2 | sf12mcs_1 | 0.169 | <.001 | 0.175 |
| sf12mcs_2 | sati_1 | 1.568 | <.001 | 0.238 |
| sf12mcs_2 | female | -0.896 | <.001 | -0.046 |
| sati_3 | sf12pcs_2 | 0.009 | <.001 | 0.068 |
| sati_3 | sf12mcs_2 | 0.025 | <.001 | 0.157 |
| sati_3 | age10 | 0.041 | <.001 | 0.048 |
| sati_3 | sati_1 | 0.204 | <.001 | 0.198 |
| sati_3 | sf12pcs_1 | 0.001 | 0.253 | 0.010 |
| sati_3 | sf12mcs_1 | 0.007 | <.001 | 0.044 |
| sati_3 | female | 0.002 | 0.914 | 0.001 |
The physical-health indirect effect is negative (-0.013), while the mental-health indirect effect is positive (0.011). Both are different from zero. Their sum is only -0.002 and is not statistically significant.
While longitudinal mediation improves our understanding of the relationships in the data it does come with assumptions. Time-varying events between waves could affect both health and satisfaction, and the spacing of the waves may not match the speed of the process. The model could be extended to include more controls and pathways to get closer the real world mechanisms.
For a fuller treatment of temporal ordering, see the guide to longitudinal mediation in R.
Choosing a multiple mediation model
What type of mediation processes should you use in your research? The main guide should be grounded in theory or prior empirical findings. Usually, the parallel mediation is a good starting point. Use a serial model only when the sequence is supported by theory or the timing of the data collection. If repeated waves are available, a longitudinal model with baseline controls is usually more credible, although it does not remove all time-varying confounding.
Where to go next
Mediation models (and SEM more generally) force users to put their theoretical expectations on paper. While this makes it more challenging, it often makes an important contribution to our understanding of the mechanisms underlying the relationships of interest.
If you want a structured treatment of mediation, moderation, assumptions, and interpretation in R, see the mediation and moderation course. For another worked example now, read the longitudinal mediation guide with R code.
Was the information useful?
Consider supporting the site: