Longitudinal Mediation in R: A Hands-On Guide

Mediation analysis has become one of the most popular tools in the social and behavioral sciences for understanding the pathways through which variables influence each other. For example, rather than simply asking whether income affects health, mediation models allow us to explore the mechanisms—such as whether higher income leads to greater life satisfaction, which in turn improves health outcomes. This ability to go beyond correlation and reveal potential causal pathways makes mediation indispensable in applied research, spanning psychology, sociology, public health, and economics.

However, using cross-sectional data to investigate mediation is problematic: causal direction is often unclear, and effects may not occur at the same point in time. Longitudinal mediation tackles these issues by combining the logic of mediation with repeated observations, which lets researchers establish temporal order and account for lagged effects. For example, rather than assuming that income and life satisfaction are related at a single point in time, a longitudinal mediation model can test whether income increases precede increases in satisfaction, which in turn translate into better health. Using the temporal order of the data rules out the reverse direction, since a later value cannot cause an earlier one. If we expect a time lag between the increase in income and the change in satisfaction, the model can accommodate that as well.

Access the code used here.

Access the data here.

In this guide, we will build on path analysis as a method for estimating longitudinal mediation models using data from the Understanding Society survey. As an illustration, we will explore how income affects mental health through life satisfaction over three waves of data.

Conceptual presentation of longitudinal mediation

The figure below shows the basic structure of a mediation model, a simple yet powerful framework often presented using structural equation model (SEM) diagrams. Here, a predictor variable X influences an outcome Y both directly (path c) and indirectly through a mediator M (paths a and b). The indirect effect, computed as the product a×b, represents the portion of the relationship between X and Y that operates through the mediator.

For example, in social science research, X might represent income, M might represent life satisfaction, and Y might represent health. In this case, mediation analysis helps us understand whether higher income leads to better health because it raises life satisfaction, which in turn improves health. As discussed in this earlier post, mediation models can be estimated using path analysis or SEM, allowing researchers to decompose total effects into their direct and indirect components and test hypotheses about underlying mechanisms.

Figure: cross-sectional mediation model as an SEM diagram
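To make the decomposition concrete, here is a minimal sketch in base R using simulated data. The variable names (`x`, `m`, `y`) and the effect sizes are illustrative assumptions, not values from any real dataset. It estimates the three paths with ordinary regressions and checks that the total effect equals the direct effect plus the product a×b.

```r
# Simulated data; names and effect sizes are illustrative assumptions
set.seed(123)
n <- 1000
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)            # path a = 0.5
y <- 0.3 * x + 0.4 * m + rnorm(n)  # direct path c = 0.3, path b = 0.4

fit_m     <- lm(m ~ x)      # mediator model: gives a
fit_y     <- lm(y ~ x + m)  # outcome model: gives b and the direct effect
fit_total <- lm(y ~ x)      # outcome on predictor only: gives the total effect

a        <- unname(coef(fit_m)["x"])
b        <- unname(coef(fit_y)["m"])
c_direct <- unname(coef(fit_y)["x"])

indirect <- a * b
total    <- unname(coef(fit_total)["x"])

# For linear models fitted by least squares the decomposition is exact
all.equal(total, c_direct + indirect)  # TRUE
```

The identity holds exactly only for linear models fitted on the same sample; with nonlinear models or different samples per equation it is an approximation at best.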

While mediation analysis is a powerful framework for uncovering potential mechanisms, it also comes with important limitations. First, the assumed causal direction between variables, such as X → M → Y, is often theoretical rather than tested empirically. In many cases, it is equally plausible that the mediator influences the predictor or that the relationship is reciprocal. Second, cross-sectional mediation assumes that effects occur instantaneously, but in reality, the influence of X on M or Y may take time to manifest. Without longitudinal data, it is difficult to capture such delayed or dynamic processes. Finally, mediation models based on observational data rely on the strong assumption that there is no unmeasured confounding, for example no omitted variables that simultaneously affect both the mediator and the outcome. Violations of this assumption can bias estimates of the indirect effect and lead to misleading conclusions.

Longitudinal mediation addresses some of the limitations of traditional, cross-sectional mediation models by explicitly modeling change over time and the temporal order of variables. Because data are collected at multiple time points, researchers can assess the causal direction between variables more credibly, testing, for example, whether changes in X precede changes in the mediator M, which in turn lead to later changes in Y. This temporal separation helps reduce the risk of reverse causality and provides a framework to study delayed effects, where the influence of a predictor may take time to manifest. Moreover, by controlling for prior levels of each variable, longitudinal models help to account for some of the unobserved confounding.

The figure below shows how the simple mediation model can be extended to a three-wave longitudinal design. Each variable—X, M, and Y—is measured at three time points (subscripts 1–3). The horizontal arrows represent stability paths, showing how each construct at an earlier time predicts itself at a later time (e.g., X1→X2→X3). The diagonal arrows labeled a and b represent the mediated effects from X to M and from M to Y across waves, while the dashed arrow (c) captures the direct effect of earlier X values on later Y outcomes. The circles (ϵ) denote residuals or unexplained variance for each variable at each time point.

Figure: longitudinal mediation model as an SEM diagram

By combining the logic of mediation with the structure of longitudinal data, this model builds temporal order into the causal direction (e.g., M at time point 2 cannot cause X at time point 1). Furthermore, by controlling for past values of the variables, we can reduce, though not eliminate, the influence of confounding variables.
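A model of this kind treats each wave's measurement as a separate variable, so the data need to be in wide format, with one row per respondent and one column per variable and wave. Panel data often arrive in long format instead. The sketch below shows one way to reshape them with base R. The toy values and the `pid` and `wave` column names are assumptions for illustration, not the actual Understanding Society layout.

```r
# Hypothetical long-format panel: one row per person per wave (toy values)
long <- data.frame(
  pid       = rep(1:3, each = 3),
  wave      = rep(1:3, times = 3),
  logincome = c(7.1, 7.3, 7.4, 6.8, 6.9, 7.0, 7.5, 7.5, 7.8),
  sati      = c(5, 5, 6, 4, 5, 5, 6, 6, 7),
  sf12mcs   = c(48, 50, 51, 42, 45, 44, 55, 53, 56)
)

# Reshape to wide: one row per person, columns such as logincome_1, sati_2, ...
wide <- reshape(
  long,
  idvar     = "pid",
  timevar   = "wave",
  direction = "wide",
  sep       = "_"
)

names(wide)
```

The resulting column names follow the `variable_wave` pattern used in the model syntax throughout this guide.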

Running longitudinal mediation in R

To illustrate how longitudinal mediation can be estimated, we will use the lavaan package in R. The model below extends the traditional mediation framework to a three-wave panel, where income (`logincome`), life satisfaction (`sati`), and mental health (`sf12mcs`) are each measured at three points in time. The goal is to test whether earlier levels of income influence later mental health indirectly, through changes in life satisfaction. To do this, we include autoregressive paths (e.g., `logincome_1 → logincome_2 → logincome_3`) to account for stability over time, and cross-lagged paths that represent the hypothesized mediation process (e.g., `logincome_1 → sati_2 → sf12mcs_3`). The R code below specifies this longitudinal mediation model:

library(lavaan)

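model <- '
  # Wave 2 outcomes: autoregressive paths plus lagged effects from wave 1
  logincome_2 ~ logincome_1
  sati_2 ~ logincome_1 + sati_1
  sf12mcs_2 ~ sati_1 + sf12mcs_1

  # Wave 3 outcomes: the mediated path runs logincome_1 -> sati_2 -> sf12mcs_3
  logincome_3 ~ logincome_2
  sati_3 ~ logincome_2 + sati_2
  sf12mcs_3 ~ sati_2 + logincome_1 + sf12mcs_2
'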

med1 <- sem(model, data = usw)

summary(med1, standardized = TRUE)
## lavaan 0.6-19 ended normally after 33 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        20
## 
##                                                   Used       Total
##   Number of observations                         18008       51007
## 
## Model Test User Model:
##                                                       
##   Test statistic                              6977.809
##   Degrees of freedom                                19
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   logincome_2 ~                                                         
##     logincome_1       0.583    0.005  117.620    0.000    0.583    0.659
##   sati_2 ~                                                              
##     logincome_1       0.013    0.007    1.952    0.051    0.013    0.013
##     sati_1            0.478    0.007   69.996    0.000    0.478    0.462
##   sf12mcs_2 ~                                                           
##     sati_1            1.620    0.052   31.407    0.000    1.620    0.236
##     sf12mcs_1         0.184    0.008   24.239    0.000    0.184    0.182
##   logincome_3 ~                                                         
##     logincome_2       0.454    0.006   72.364    0.000    0.454    0.475
##   sati_3 ~                                                              
##     logincome_2       0.026    0.008    3.076    0.002    0.026    0.021
##     sati_2            0.344    0.007   46.780    0.000    0.344    0.329
##   sf12mcs_3 ~                                                           
##     sati_2            1.095    0.044   24.742    0.000    1.095    0.170
##     logincome_1       0.105    0.044    2.387    0.017    0.105    0.016
##     sf12mcs_2         0.332    0.007   51.006    0.000    0.332    0.342
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .logincome_3 ~~                                                        
##    .sati_3            0.033    0.011    3.064    0.002    0.033    0.023
##    .sf12mcs_3         0.140    0.064    2.195    0.028    0.140    0.016
##  .sati_3 ~~                                                             
##    .sf12mcs_3         2.588    0.088   29.323    0.000    2.588    0.224
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .logincome_2       0.854    0.009   94.889    0.000    0.854    0.566
##    .sati_2            1.563    0.016   94.889    0.000    1.563    0.786
##    .sf12mcs_2        76.972    0.811   94.889    0.000   76.972    0.879
##    .logincome_3       1.070    0.011   94.889    0.000    1.070    0.775
##    .sati_3            1.940    0.020   94.889    0.000    1.940    0.891
##    .sf12mcs_3        68.831    0.725   94.889    0.000   68.831    0.837

When interpreting longitudinal mediation results, it is useful to focus on the standardized coefficients (Std.all column), as they express relationships in comparable units across variables with different scales. The “a” path, the effect of income in wave 1 on satisfaction in wave 2, is 0.013 and falls just short of conventional significance (p = 0.051). Higher income is thus associated with only a marginally higher satisfaction score. The “b” path, linking satisfaction in wave 2 to mental health in wave 3 (0.170), is much stronger, indicating that higher satisfaction predicts better mental health. The direct effect of income on mental health is 0.016, a small effect after controlling for the indirect path and previous values of health.

The horizontal paths—such as logincome_1 → logincome_2 (0.659) or sf12mcs_2 → sf12mcs_3 (0.342)—represent stability, or how much the ranking of individuals persists across waves. These paths serve as strong controls, indicating that a significant portion of the variation in each construct is explained by its previous level.
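To see what a standardized coefficient represents, the sketch below uses simulated data and a single regression in base R. The variable names and values are assumptions for illustration. In this simple case, the standardized slope is the raw slope multiplied by the ratio of the predictor's standard deviation to the outcome's, which matches what we get from refitting the model on z-scored variables.

```r
# Simulated data; names and values are illustrative assumptions
set.seed(42)
n <- 500
d <- data.frame(
  income_1 = rnorm(n, mean = 7, sd = 1.2),  # hypothetical log income
  sati_1   = rnorm(n, mean = 5, sd = 1.4)   # hypothetical satisfaction score
)
d$health_2 <- 40 + 0.5 * d$income_1 + 1.5 * d$sati_1 + rnorm(n, sd = 8)

raw <- lm(health_2 ~ income_1 + sati_1, data = d)

# Standardized slope = raw slope * sd(predictor) / sd(outcome)
std_by_hand <- coef(raw)[-1] * c(sd(d$income_1), sd(d$sati_1)) / sd(d$health_2)

# Same quantity from refitting on z-scored variables
dz <- as.data.frame(scale(d))
std_by_refit <- coef(lm(health_2 ~ income_1 + sati_1, data = dz))[-1]

round(cbind(raw = coef(raw)[-1], std_by_hand, std_by_refit), 3)
```

This is why standardized values can be compared across income, satisfaction, and health even though the raw scales differ widely.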

We can visualize the main coefficients of interest using an SEM plot:

Figure: results from the longitudinal mediation model in R

Calculating indirect and total effects

To better understand how income influences health, it is beneficial to distinguish between its direct and indirect effects. The direct effect represents the portion of the relationship between income and health that is not explained by life satisfaction. In contrast, the indirect effect captures the pathway through which income affects satisfaction, which in turn influences mental health. Estimating both helps clarify whether the association between income and health is primarily due to material factors or operates through psychological mechanisms.

In the updated model, we label the path from income to satisfaction as `a`, the path from satisfaction to health as `b`, and the direct path from income to health as `c`. These labels allow us to define new parameters that quantify the indirect and total effects: `indirect := a * b` computes the mediated pathway, and `total := c + (a * b)` sums the direct and indirect effects to capture the overall impact of income on mental health.

model <- '
  logincome_2 ~ logincome_1
  sati_2 ~ a*logincome_1 + sati_1
  sf12mcs_2 ~ sati_1 + sf12mcs_1

  logincome_3 ~ logincome_2
  sati_3 ~ logincome_2 + sati_2
  sf12mcs_3 ~ b*sati_2 + c*logincome_1 + sf12mcs_2

  # Indirect effect
  indirect := a * b

  # Total effect
  total := c + (a * b)
'

med1b <- sem(model, data = usw)

summary(med1b, standardized = TRUE)
## lavaan 0.6-19 ended normally after 33 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        20
## 
##                                                   Used       Total
##   Number of observations                         18008       51007
## 
## Model Test User Model:
##                                                       
##   Test statistic                              6977.809
##   Degrees of freedom                                19
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   logincome_2 ~                                                         
##     logincom_1        0.583    0.005  117.620    0.000    0.583    0.659
##   sati_2 ~                                                              
##     logincom_1 (a)    0.013    0.007    1.952    0.051    0.013    0.013
##     sati_1            0.478    0.007   69.996    0.000    0.478    0.462
##   sf12mcs_2 ~                                                           
##     sati_1            1.620    0.052   31.407    0.000    1.620    0.236
##     sf12mcs_1         0.184    0.008   24.239    0.000    0.184    0.182
##   logincome_3 ~                                                         
##     logincom_2        0.454    0.006   72.364    0.000    0.454    0.475
##   sati_3 ~                                                              
##     logincom_2        0.026    0.008    3.076    0.002    0.026    0.021
##     sati_2            0.344    0.007   46.780    0.000    0.344    0.329
##   sf12mcs_3 ~                                                           
##     sati_2     (b)    1.095    0.044   24.742    0.000    1.095    0.170
##     logincom_1 (c)    0.105    0.044    2.387    0.017    0.105    0.016
##     sf12mcs_2         0.332    0.007   51.006    0.000    0.332    0.342
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .logincome_3 ~~                                                        
##    .sati_3            0.033    0.011    3.064    0.002    0.033    0.023
##    .sf12mcs_3         0.140    0.064    2.195    0.028    0.140    0.016
##  .sati_3 ~~                                                             
##    .sf12mcs_3         2.588    0.088   29.323    0.000    2.588    0.224
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .logincome_2       0.854    0.009   94.889    0.000    0.854    0.566
##    .sati_2            1.563    0.016   94.889    0.000    1.563    0.786
##    .sf12mcs_2        76.972    0.811   94.889    0.000   76.972    0.879
##    .logincome_3       1.070    0.011   94.889    0.000    1.070    0.775
##    .sati_3            1.940    0.020   94.889    0.000    1.940    0.891
##    .sf12mcs_3        68.831    0.725   94.889    0.000   68.831    0.837
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     indirect          0.014    0.007    1.946    0.052    0.014    0.002
##     total             0.119    0.044    2.677    0.007    0.119    0.018

The results indicate that the indirect effect of income on later mental health, mediated by life satisfaction, is small (standardized coefficient = 0.002) and not statistically significant at the 5% level (p = 0.052), suggesting a weak mediated pathway. In contrast, the total effect (standardized coefficient = 0.018) is larger and statistically significant (p = 0.007), indicating that higher income in wave 1 is associated with better mental health in wave 3. This pattern implies that while part of the relationship may operate through increased satisfaction, most of it is carried by the direct path, which may itself reflect other mechanisms.
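The defined parameters can be checked by hand from the regression table. The sketch below uses the rounded unstandardized estimates printed in the output above, so the last digit may differ slightly from what lavaan computes internally. The R variable names are ours.

```r
# Unstandardized estimates copied from the summary output above
a      <- 0.013  # logincome_1 -> sati_2
b      <- 1.095  # sati_2 -> sf12mcs_3
direct <- 0.105  # logincome_1 -> sf12mcs_3 (labelled c in the model)

indirect <- a * b
total    <- direct + indirect

# Compare with the "Defined Parameters" block of the output
round(c(indirect = indirect, total = total), 3)

# Rough share of the total effect that runs through satisfaction
round(indirect / total, 2)
```

The share in the last line is only a rough guide, since it is built from rounded estimates and from an indirect effect that is itself imprecisely estimated.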

Including control variables

Although the longitudinal mediation model controls for information from previous waves, it is still important to include control variables to account for potential confounders. Factors such as education, gender, and marital status are often correlated with income, satisfaction, and health outcomes. By including them as covariates, we ensure that the estimated paths reflect the unique relationships among the key variables of interest rather than spurious associations driven by background characteristics.

As an example, we include three variables as controls: degree, gender, and relationship status (single). The first two are treated as time-constant, while the last is time-varying, so it enters each equation with the value from the same wave as the outcome (`single_2` for `sati_2`, `single_3` for `sf12mcs_3`). Their inclusion in the key equations means that the mediation coefficients are adjusted for their possible confounding effects.
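Covariates entered this way get a single coefficient each, so categorical variables generally need to be coded as numeric indicators before the model is fitted. The sketch below shows one way to build 0/1 indicators in base R. The raw column names, the category labels, and the choice of which category is coded 1 are all assumptions for illustration; the coding actually used in `usw` may differ.

```r
# Hypothetical raw covariates; the real data may be coded differently
raw <- data.frame(
  degree    = factor(c("degree", "no degree", "degree", "no degree")),
  gender    = factor(c("female", "male", "male", "female")),
  marital_2 = factor(c("single", "married", "single", "cohabiting"))
)

# 0/1 indicators named to match the model syntax (coding direction is assumed)
covs <- data.frame(
  degree.fct = as.integer(raw$degree == "degree"),
  gndr.fct   = as.integer(raw$gender == "female"),
  single_2   = as.integer(raw$marital_2 == "single")
)

covs
```

Whichever category is coded 1 determines the sign of the coefficient, so it is worth checking the coding before interpreting the direction of a covariate's effect.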

model <- '
  logincome_2 ~ logincome_1
  sati_2 ~ a*logincome_1 + sati_1 + degree.fct + gndr.fct + single_2
  sf12mcs_2 ~ sati_1 + sf12mcs_1

  logincome_3 ~ logincome_2
  sati_3 ~ logincome_2 + sati_2
  sf12mcs_3 ~ b*sati_2 + c*logincome_1 + sf12mcs_2 + 
                degree.fct + gndr.fct + single_3

  # Indirect effect
  indirect := a * b

  # Total effect
  total := c + (a * b)
'

med2 <- sem(model, data = usw)

summary(med2, standardized = TRUE)
## lavaan 0.6-19 ended normally after 49 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        26
## 
##                                                   Used       Total
##   Number of observations                         18005       51007
## 
## Model Test User Model:
##                                                       
##   Test statistic                              8015.186
##   Degrees of freedom                                37
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   logincome_2 ~                                                         
##     logincom_1        0.582    0.005  117.593    0.000    0.582    0.659
##   sati_2 ~                                                              
##     logincom_1 (a)   -0.001    0.007   -0.167    0.868   -0.001   -0.001
##     sati_1            0.474    0.007   69.021    0.000    0.474    0.459
##     degree.fct       -0.075    0.020   -3.808    0.000   -0.075   -0.026
##     gndr.fct         -0.052    0.019   -2.744    0.006   -0.052   -0.018
##     single_2         -0.093    0.020   -4.588    0.000   -0.093   -0.031
##   sf12mcs_2 ~                                                           
##     sati_1            1.620    0.052   31.409    0.000    1.620    0.236
##     sf12mcs_1         0.184    0.008   24.234    0.000    0.184    0.182
##   logincome_3 ~                                                         
##     logincom_2        0.454    0.006   72.364    0.000    0.454    0.475
##   sati_3 ~                                                              
##     logincom_2        0.026    0.008    3.127    0.002    0.026    0.022
##     sati_2            0.344    0.007   46.779    0.000    0.344    0.329
##   sf12mcs_3 ~                                                           
##     sati_2     (b)    1.094    0.044   24.655    0.000    1.094    0.170
##     logincom_1 (c)    0.082    0.046    1.791    0.073    0.082    0.013
##     sf12mcs_2         0.329    0.006   50.564    0.000    0.329    0.339
##     degree.fct        0.391    0.128    3.050    0.002    0.391    0.021
##     gndr.fct         -0.639    0.123   -5.184    0.000   -0.639   -0.035
##     single_3         -0.478    0.131   -3.666    0.000   -0.478   -0.025
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .logincome_3 ~~                                                        
##    .sati_3            0.033    0.011    3.070    0.002    0.033    0.023
##    .sf12mcs_3         0.134    0.064    2.105    0.035    0.134    0.016
##  .sati_3 ~~                                                             
##    .sf12mcs_3         2.589    0.088   29.368    0.000    2.589    0.224
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .logincome_2       0.854    0.009   94.882    0.000    0.854    0.566
##    .sati_2            1.559    0.016   94.882    0.000    1.559    0.784
##    .sf12mcs_2        76.976    0.811   94.882    0.000   76.976    0.879
##    .logincome_3       1.070    0.011   94.882    0.000    1.070    0.775
##    .sati_3            1.940    0.020   94.882    0.000    1.940    0.891
##    .sf12mcs_3        68.657    0.724   94.882    0.000   68.657    0.836
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     indirect         -0.001    0.008   -0.167    0.868   -0.001   -0.000
##     total             0.081    0.047    1.739    0.082    0.081    0.012

After including the control variables for education, gender, and relationship status, the mediation effects changed noticeably. The “a” path from income to life satisfaction, which was previously small but positive, is now essentially zero, suggesting that once these background characteristics are accounted for, income no longer predicts satisfaction. The “b” path from satisfaction to later mental health remains large (0.170), indicating that satisfaction continues to be an important predictor of well-being. The “c” path, representing the direct effect of income on later health, is slightly smaller (0.013) and no longer statistically significant, implying that any remaining influence of income is weak once both satisfaction and sociodemographic factors are controlled. Correspondingly, the indirect and total effects decrease in size. Overall, this pattern suggests that much of the previously observed relationship between income, satisfaction, and health may be attributed to differences in education, gender, and relationship status, rather than to a genuine longitudinal mediation process.
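The p-values reported for the defined parameters rely on a normal approximation, which can be poor for a product of two coefficients. A common complement is a percentile bootstrap of the indirect effect. The sketch below shows the mechanics on simulated data with plain `lm()` fits. The data, the effect sizes, and the helper `indirect_effect()` are illustrative assumptions and not part of the analysis above. lavaan can also bootstrap standard errors itself through the `se = "bootstrap"` option of `sem()`.

```r
# Simulated data; names and effect sizes are illustrative assumptions
set.seed(2024)
n <- 400
x <- rnorm(n)
m <- 0.3 * x + rnorm(n)
y <- 0.2 * x + 0.4 * m + rnorm(n)
d <- data.frame(x, m, y)

# Hypothetical helper: product-of-coefficients estimate of the indirect effect
indirect_effect <- function(data) {
  a <- coef(lm(m ~ x, data = data))["x"]
  b <- coef(lm(y ~ x + m, data = data))["m"]
  unname(a * b)
}

# Resample rows with replacement and re-estimate the indirect effect each time
boot_est <- replicate(500, {
  resampled <- d[sample(nrow(d), replace = TRUE), ]
  indirect_effect(resampled)
})

estimate <- indirect_effect(d)
ci <- quantile(boot_est, probs = c(0.025, 0.975))  # percentile interval

estimate
ci
```

If the interval excludes zero, that is usually taken as evidence of an indirect effect. With a sample as large as the one used in this guide, bootstrap and normal-theory results will often be close.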

An alternative specification of longitudinal mediation

An alternative approach to specifying longitudinal mediation, illustrated in the figure below, focuses on estimating wave-to-wave effects rather than decomposing them into a single pathway across three waves. In this approach, each variable—X, M, and Y—is modeled as influencing its future values and the corresponding measures of the other variables in the next wave (e.g., X1→M2, M1→Y2). This setup enables researchers to focus on lag-1 effects, specifically whether values at one time point influence those at the next time point. One of the key strengths of this model is that it can be applied even when only two waves of data are available. It is also better suited for situations where the mediator and outcome are expected to be affected relatively quickly by changes in the predictor.

Figure: alternative specification of the longitudinal mediation model
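Because every path in this specification spans a single wave, the model syntax for a two-wave dataset is simply the first set of equations. A minimal sketch, reusing the variable names from this guide; the object names `model_two_wave` and `med_two_wave` are ours:

```r
# Two-wave version of the wave-to-wave specification: lag-1 paths only
model_two_wave <- '
  logincome_2 ~ logincome_1
  sati_2      ~ a1*logincome_1 + sati_1
  sf12mcs_2   ~ b1*sati_1 + c1*logincome_1 + sf12mcs_1
'

# With lavaan loaded and the data available, this could be fitted with:
# med_two_wave <- sem(model_two_wave, data = usw)
cat(model_two_wave)
```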

However, this specification has some limitations. Because the model estimates separate paths for each time interval, we cannot formally compute overall direct and indirect effects across waves. The mediation process is implied by the sequence of coefficients (e.g., a1 and b1), but cannot be summarized in a single pathway. This makes it less straightforward to quantify how much of the total effect of X on Y is transmitted through M. Additionally, this structure can become complex to interpret as the number of waves increases.

To illustrate the model, we adapt the code to our data. Each time interval now has its own set of parameters: a1, b1, and c1 for the paths from Wave 1 to Wave 2, and a2, b2, and c2 for the paths from Wave 2 to Wave 3.

model <- '
  logincome_2 ~ logincome_1
  sati_2 ~ a1*logincome_1 + sati_1
  sf12mcs_2 ~ b1*sati_1 + c1*logincome_1 + sf12mcs_1

  logincome_3 ~ logincome_2
  sati_3 ~ a2*logincome_2 + sati_2
  sf12mcs_3 ~ b2*sati_2 + c2*logincome_2 + sf12mcs_2
'

med3 <- sem(model, data = usw)

summary(med3, standardized = TRUE)
## lavaan 0.6-19 ended normally after 33 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        21
## 
##                                                   Used       Total
##   Number of observations                         18008       51007
## 
## Model Test User Model:
##                                                       
##   Test statistic                              6942.089
##   Degrees of freedom                                18
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   logincome_2 ~                                                         
##     logincm_1         0.583    0.005  117.620    0.000    0.583    0.659
##   sati_2 ~                                                              
##     logincm_1 (a1)    0.013    0.007    1.952    0.051    0.013    0.013
##     sati_1            0.478    0.007   69.996    0.000    0.478    0.462
##   sf12mcs_2 ~                                                           
##     sati_1    (b1)    1.625    0.052   31.523    0.000    1.625    0.237
##     logincm_1 (c1)    0.267    0.047    5.672    0.000    0.267    0.040
##     sf12mcs_1         0.182    0.008   23.998    0.000    0.182    0.181
##   logincome_3 ~                                                         
##     logincm_2         0.454    0.006   72.381    0.000    0.454    0.475
##   sati_3 ~                                                              
##     logincm_2 (a2)    0.028    0.008    3.346    0.001    0.028    0.024
##     sati_2            0.344    0.007   46.775    0.000    0.344    0.329
##   sf12mcs_3 ~                                                           
##     sati_2    (b2)    1.094    0.044   24.726    0.000    1.094    0.170
##     logincm_2 (c2)    0.149    0.050    2.956    0.003    0.149    0.020
##     sf12mcs_2         0.331    0.007   50.939    0.000    0.331    0.342
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .logincome_3 ~~                                                        
##    .sati_3            0.033    0.011    3.064    0.002    0.033    0.023
##    .sf12mcs_3         0.186    0.064    2.909    0.004    0.186    0.022
##  .sati_3 ~~                                                             
##    .sf12mcs_3         2.589    0.088   29.341    0.000    2.589    0.224
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .logincome_2       0.854    0.009   94.889    0.000    0.854    0.566
##    .sati_2            1.563    0.016   94.889    0.000    1.563    0.786
##    .sf12mcs_2        76.834    0.810   94.889    0.000   76.834    0.877
##    .logincome_3       1.070    0.011   94.889    0.000    1.070    0.775
##    .sati_3            1.940    0.020   94.889    0.000    1.940    0.891
##    .sf12mcs_3        68.833    0.725   94.889    0.000   68.833    0.837

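The “a” and “b” paths are similar to those previously estimated. The direct path from wave 1 to wave 2 (c1, standardized 0.040) is larger than the direct effect in the previous model (0.016), which ran from wave 1 to wave 3, while the path from wave 2 to wave 3 (c2, standardized 0.020) is similar.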

We can visualize the results using an SEM diagram as well:

Figure: results from the alternative longitudinal mediation specification

Testing whether coefficients are equal over time in longitudinal mediation

In the context of longitudinal mediation, it can be interesting to test whether the mediation effects remain stable over time. Real-world processes, such as the relationship between income, satisfaction, and health, may evolve as individuals age or as broader social and economic conditions shift. For example, income increases may have a stronger effect on satisfaction in earlier periods of life, while later on, satisfaction may become more important for maintaining good mental health.

To formally test whether the direct effect of income on health is constant across waves, we can constrain the relevant parameters to be equal in the model. In the code below, both direct paths—from logincome_1 to sf12mcs_2 and from logincome_2 to sf12mcs_3—are labeled with the same parameter name c. By using an identical label, the model forces these coefficients to take the same value. If this constraint significantly worsens model fit compared to a freely estimated model, it suggests that the direct effect changes over time; if not, it supports the idea of temporal stability in this relationship.

model <- '
  logincome_2 ~ logincome_1
  sati_2 ~ a1*logincome_1 + sati_1
  sf12mcs_2 ~ b1*sati_1 + c*logincome_1 + sf12mcs_1

  logincome_3 ~ logincome_2
  sati_3 ~ a2*logincome_2 + sati_2
  sf12mcs_3 ~ b2*sati_2 + c*logincome_2 + sf12mcs_2
'

med3b <- sem(model, data = usw)

summary(med3b, standardized = TRUE)
## lavaan 0.6-19 ended normally after 33 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        21
##   Number of equality constraints                     1
## 
##                                                   Used       Total
##   Number of observations                         18008       51007
## 
## Model Test User Model:
##                                                       
##   Test statistic                              6945.017
##   Degrees of freedom                                19
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   logincome_2 ~                                                         
##     logincm_1         0.583    0.005  117.620    0.000    0.583    0.659
##   sati_2 ~                                                              
##     logincm_1 (a1)    0.013    0.007    1.952    0.051    0.013    0.013
##     sati_1            0.478    0.007   69.996    0.000    0.478    0.462
##   sf12mcs_2 ~                                                           
##     sati_1    (b1)    1.624    0.052   31.504    0.000    1.624    0.237
##     logincm_1  (c)    0.212    0.034    6.163    0.000    0.212    0.031
##     sf12mcs_1         0.182    0.008   24.057    0.000    0.182    0.181
##   logincome_3 ~                                                         
##     logincm_2         0.454    0.006   72.417    0.000    0.454    0.475
##   sati_3 ~                                                              
##     logincm_2 (a2)    0.031    0.008    3.675    0.000    0.031    0.026
##     sati_2            0.344    0.007   46.771    0.000    0.344    0.329
##   sf12mcs_3 ~                                                           
##     sati_2    (b2)    1.095    0.044   24.731    0.000    1.095    0.170
##     logincm_2  (c)    0.212    0.034    6.163    0.000    0.212    0.029
##     sf12mcs_2         0.331    0.007   50.876    0.000    0.331    0.341
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .logincome_3 ~~                                                        
##    .sati_3            0.033    0.011    3.064    0.002    0.033    0.023
##    .sf12mcs_3         0.186    0.064    2.911    0.004    0.186    0.022
##  .sati_3 ~~                                                             
##    .sf12mcs_3         2.590    0.088   29.347    0.000    2.590    0.224
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .logincome_2       0.854    0.009   94.889    0.000    0.854    0.566
##    .sati_2            1.563    0.016   94.889    0.000    1.563    0.786
##    .sf12mcs_2        76.840    0.810   94.889    0.000   76.840    0.878
##    .logincome_3       1.070    0.011   94.889    0.000    1.070    0.775
##    .sati_3            1.940    0.020   94.889    0.000    1.940    0.891
##    .sf12mcs_3        68.841    0.725   94.889    0.000   68.841    0.837

To assess whether constraining the direct effects to be equal across time significantly worsens model fit, we can compare the unconstrained and constrained models using the anova() function in R. This performs a chi-squared difference test, where the null hypothesis is that the simpler (constrained) model fits the data as well as the more complex one. A significant p-value would indicate that the constraint leads to a poorer fit, suggesting that the effects differ over time.

anova(med3, med3b)
## 
## Chi-Squared Difference Test
## 
##       Df    AIC    BIC  Chisq Chisq diff    RMSEA Df diff Pr(>Chisq)  
## med3  18 478458 478622 6942.1                                         
## med3b 19 478459 478615 6945.0     2.9283 0.010348       1    0.08704 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In this case, the chi-squared difference test yields a value of 2.93 with 1 degree of freedom and a p-value of 0.087. Since this is above the conventional 0.05 threshold, we do not reject the null hypothesis: constraining the direct effect of income on mental health to be equal across waves does not significantly reduce model fit. The information criteria point the same way (smaller values indicate a better fit): the AIC is practically unchanged (478458 vs. 478459), while the BIC is slightly lower for the constrained model (478615 vs. 478622). In other words, there is no strong evidence that the direct effect changes over time, and the direct relationship between income and mental health appears to be relatively stable across the two time intervals.
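The numbers in the anova() table can also be reproduced by hand, which makes it clearer what the test is doing. The following is a minimal base-R sketch that uses only the values printed above. The variable names are illustrative, and the AIC and BIC relationships assume that both models were fitted by maximum likelihood to the same 18,008 cases.

```r
# Values copied from the anova(med3, med3b) output above
chisq_diff <- 2.9283   # difference in the chi-squared test statistics
df_diff    <- 19 - 18  # one extra degree of freedom from the equality constraint
n_obs      <- 18008    # number of observations used

# p-value of the chi-squared difference test (upper tail)
p_value <- pchisq(chisq_diff, df = df_diff, lower.tail = FALSE)
print(round(p_value, 5))

# Each constraint removes one free parameter, so the penalty terms shrink:
#   delta AIC = chisq_diff - 2 * df_diff
#   delta BIC = chisq_diff - log(n) * df_diff
# Positive values favour the unconstrained model, negative the constrained one.
delta_aic <- chisq_diff - 2 * df_diff
delta_bic <- chisq_diff - log(n_obs) * df_diff
print(round(c(delta_aic = delta_aic, delta_bic = delta_bic), 2))
```

The p-value should agree with the one reported by anova(), and the small positive AIC difference and negative BIC difference of about 7 points are consistent with the rounded AIC and BIC columns in the table. BIC penalises each parameter by log(n), which is large with a sample of this size, so it favours the more parsimonious model more strongly than AIC does.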

In addition to testing whether the direct effect remains stable over time, we can also examine whether the a and b paths (the effects of income on satisfaction and of satisfaction on mental health) remain constant across waves. To do this, we assign the same parameter labels (a and b) to the corresponding paths in both time intervals, while keeping the direct effect c constrained as before.

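# Reusing a label (a, b, c) on two paths constrains them to be equal
model <- '
  # wave 1 -> wave 2
  logincome_2 ~ logincome_1
  sati_2 ~ a*logincome_1 + sati_1
  sf12mcs_2 ~ b*sati_1 + c*logincome_1 + sf12mcs_1

  # wave 2 -> wave 3
  logincome_3 ~ logincome_2
  sati_3 ~ a*logincome_2 + sati_2
  sf12mcs_3 ~ b*sati_2 + c*logincome_2 + sf12mcs_2
'

# Fit the model with a, b and c all held equal over time
med3c <- sem(model, data = usw)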

anova(med3b, med3c)
## 
## Chi-Squared Difference Test
## 
##       Df    AIC    BIC  Chisq Chisq diff    RMSEA Df diff Pr(>Chisq)    
## med3b 19 478459 478615 6945.0                                           
## med3c 21 478514 478655 7003.8     58.811 0.039716       2  1.696e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The chi-squared difference test comparing these two models shows a large and highly significant difference (Δχ² = 58.81, df = 2, p < 0.001). This indicates that forcing the a and b coefficients to be equal substantially worsens model fit. In other words, the effects of income on satisfaction and of satisfaction on mental health are not both constant over time. The estimates from the med3b model show the pattern: the a path rises from 0.013 to 0.031, while the b path falls from 1.624 to 1.095. Because the test evaluates both constraints jointly, it does not tell us which of the two paths is responsible. Still, the strength of these relationships likely changes between waves, suggesting that the mediation process itself evolves, perhaps becoming weaker or stronger as people’s circumstances change.
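Since the difference test covers both constraints at once, a rough follow-up is to compare the wave-specific estimates of each path directly. The sketch below does this in base R with the estimates and standard errors printed in the summary(med3b) output. It assumes the two estimates of a path are uncorrelated, which is unlikely to hold exactly in a panel model, so it is a heuristic only. A more defensible check would be to refit the model with only a, or only b, constrained and compare fits. The helper name diff_z is made up for this illustration.

```r
# Approximate z-test for the difference between two coefficients.
# ASSUMPTION: the two estimates are treated as independent (covariance ignored).
diff_z <- function(est1, se1, est2, se2) {
  z <- (est2 - est1) / sqrt(se1^2 + se2^2)
  p <- 2 * pnorm(abs(z), lower.tail = FALSE)
  c(z = z, p = p)
}

# Estimates and standard errors copied from the summary(med3b) output
res_a <- diff_z(est1 = 0.013, se1 = 0.007, est2 = 0.031, se2 = 0.008)  # a1 vs a2
res_b <- diff_z(est1 = 1.624, se1 = 0.052, est2 = 1.095, se2 = 0.044)  # b1 vs b2
print(rbind(a_path = res_a, b_path = res_b))

# p-value of the joint test reported by anova(med3b, med3c)
p_joint <- pchisq(58.811, df = 2, lower.tail = FALSE)
print(p_joint)
```

Under these assumptions the b path differs much more between waves (|z| close to 7.8) than the a path (|z| close to 1.7), which hints that the change in the effect of satisfaction on mental health accounts for most of the loss of fit.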

Conclusions regarding longitudinal mediation

Mediation analysis enables researchers to move beyond simple associations and explore the mechanisms that link causes and effects. When applied to longitudinal data, it becomes an even more powerful tool. This approach can address key challenges of cross-sectional mediation, such as uncertainty about the causal direction or timing of effects.

Our example also illustrated how longitudinal mediation can reveal whether specific pathways remain stable or evolve as people’s circumstances change, as well as the impact of confounders on key statistics.

