If you're an experimental psychologist, you likely know that journals expect more than "main effects" or "one cause fits all" explanations of psychological phenomena. That is, they expect any given phenomenon to "depend on" some circumstance or some trait. Or they expect there to be some process by which the phenomenon occurs, where first X happens, which leads to M, followed by Y.
In the social sciences, researchers overwhelmingly prefer two statistical methods for demonstrating such psychological processes: moderation and mediation. Since at least 1986 when Baron and Kenny1 explained the difference between the two, researchers have gone sorta hog-wild in shoehorning every hypothesis into a moderation and/or mediation framework. The statistical techniques for testing these relationships are so popular that I'd bet any given empirical article published in the past 20 years is no more than 1 degree removed from a reference to Baron & Kenny (1986), Aiken, West, & Reno (1991)2, or Preacher & Hayes (2008)3.
This post is about that last reference, specifically Andrew Hayes, who had the brilliant idea to code and release macros (for free) that add user-friendly drop-down menus for conducting not 1, not 2, but 76 statistical models in SPSS and SAS. If you're a graduate student or a "younger" faculty member, you're probably familiar with at least 2 of these models: Model 1 for moderation and Model 4 for mediation.
I'm going to show you how to reproduce Hayes's output in R. I'll even throw in Model 7 (moderated mediation). It's my hope that learning how to run these models in R will provide you with a framework for running similar and even more complex models without a tutorial.
Here's where to find the data I use in the post (mediation_data.sav): UCLA SPSS Statistical Computing Workshop Moderation and Mediation using the Process Macro
Hayes's PROCESS Model 1: moderation with a continuous moderator (see Hayes's template for all 76 models here)
Here's how you do it in SPSS
Download the folder with the PROCESS macros and documentation. Open the file called 'process.sps', select everything in the syntax window, and click the giant green arrow to run it. Then open a new syntax file.
GET
file = YOUR.FILE.PATH.HERE.
PROCESS vars = socst read math
/ x = socst
/ m = math
/ y = read
/ model = 1
/ center = 1
/ boot = 1000.
Run MATRIX procedure:
*************** PROCESS Procedure for SPSS Release 2.16 ******************
Written by Andrew F. Hayes, Ph.D. www.afhayes.com
Documentation available in Hayes (2013). www.guilford.com/p/hayes3
**************************************************************************
Model = 1
Y = read
X = socst
M = math
Sample size
200
**************************************************************************
Outcome: read
Model Summary
R R-sq MSE F df1 df2 p
.7390 .5461 48.4421 78.6145 3.0000 196.0000 .0000
Model
coeff se t p LLCI ULCI
constant 51.6153 .5687 90.7626 .0000 50.4938 52.7369
math .4807 .0637 7.5455 .0000 .3550 .6063
socst .3738 .0555 6.7301 .0000 .2643 .4834
int_1 .0113 .0052 2.1572 .0322 .0010 .0216
Product terms key:
int_1 socst X math
R-square increase due to interaction(s):
R2-chng F df1 df2 p
int_1 .0108 4.6534 1.0000 196.0000 .0322
*************************************************************************
Conditional effect of X on Y at values of the moderator(s):
math Effect se t p LLCI ULCI
-9.3684 .2681 .0678 3.9574 .0001 .1345 .4018
.0000 .3738 .0555 6.7301 .0000 .2643 .4834
9.3684 .4795 .0799 6.0034 .0000 .3220 .6370
Values for quantitative moderators are the mean and plus/minus one SD from mean.
Values for dichotomous moderators are the two values of the moderator.
******************** ANALYSIS NOTES AND WARNINGS *************************
Level of confidence for all confidence intervals in output:
95.00
NOTE: The following variables were mean centered prior to analysis:
socst math
------ END MATRIX -----
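Before reproducing this in R, it helps to see where those conditional effects come from: each one is just the socst coefficient plus the interaction coefficient times a value of (centered) math. A quick sketch using the rounded coefficients printed above; the tiny discrepancies from the PROCESS output are only rounding:

```r
# conditional effect of X (socst) on Y (read) at a value m of the
# centered moderator (math): effect(m) = b1 + b3 * m
b1 <- 0.3738       # socst coefficient from the PROCESS output above
b3 <- 0.0113       # int_1 (socst x math) coefficient
sd_math <- 9.3684  # SD of centered math

effect_at <- function(m) b1 + b3 * m

round(effect_at(-sd_math), 4)  # ~ .2681 in the PROCESS output
round(effect_at(0), 4)         # .3738
round(effect_at(sd_math), 4)   # ~ .4795 in the PROCESS output
```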
Here's how you do it in R (highlights = output to compare)
Read in the data
setwd(dir = "~/Desktop/analysis-examples/Replicating Hayes's PROCESS Models in R/")

want_packages <- c("Hmisc", "lavaan", "knitr")
have_packages <- want_packages %in% rownames(installed.packages())
if(any(!have_packages)) install.packages(want_packages[!have_packages])

library(Hmisc)
egdata <- spss.get(file = "~/Desktop/analysis-examples/Replicating Hayes's PROCESS Models in R/mediation_data.sav",
                   lowernames = TRUE, to.data.frame = TRUE)
Functions used
- library() activates a package in your library. Think of it as taking a package off the shelf.
- installed.packages() gives you a list of all installed packages.
- install.packages() downloads the package whose name you give it (R has so many packages and people develop more every day) and puts it in your library.
- %in% matches the elements of one vector with the elements of another.
- any() looks at a logical vector (TRUEs and FALSEs) and asks: is at least one of the values true?
- if() use this for executing functions with if-then logic.
- spss.get() reads in SPSS files. I like this function more than the foreign package's read.spss() function just because it has more features.
Packages used
- Hmisc is a grab bag of functions for data analysis and management (e.g., spss.get(), but also way more cool stuff)
Center variables
Below I create a function for centering the variables I'll use in my examples. Then I create those centered variables. We know the centering "worked" if the centered variable's mean = 0 but its standard deviation is the same as the non-centered variable's.
# function for centering variables
center.var <- function(x){
  centered <- as.vector(scale(x = x, center = TRUE, scale = FALSE))
  return(centered)
}

# apply function to many variables and name them oldname_c (_c for centered)
egdata[, c("math_c", "read_c", "write_c", "socst_c")] <-
  lapply(X = egdata[, c("math", "read", "write", "socst")],
         FUN = function(x) center.var(x))

# means of the vars
sapply(egdata[, c("math", "math_c", "read", "read_c", "write", "write_c", "socst", "socst_c")],
       FUN = function(x){ round(mean(x), 2) },
       simplify = TRUE, USE.NAMES = TRUE)
## math math_c read read_c write write_c socst socst_c
## 52.65 0.00 52.23 0.00 52.77 0.00 52.41 0.00
# sd of the vars
sapply(egdata[, c("math", "math_c", "read", "read_c", "write", "write_c", "socst", "socst_c")],
       FUN = function(x){ round(sd(x), 2) },
       simplify = TRUE, USE.NAMES = TRUE)
## math math_c read read_c write write_c socst socst_c
## 9.37 9.37 10.25 10.25 9.48 9.48 10.74 10.74
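If centering is new to you, here's a toy illustration with made-up numbers (not from the dataset): subtracting the mean shifts the variable's mean to 0 but leaves its spread untouched.

```r
# toy vector (hypothetical values, for illustration only)
x <- c(2, 4, 9)
x_c <- as.vector(scale(x, center = TRUE, scale = FALSE))  # i.e., x - mean(x)

mean(x_c)         # 0: the centered variable is centered on zero
sd(x) == sd(x_c)  # TRUE: the spread is unchanged
```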
Functions used
- lapply() takes a vector and applies a function to each element of that vector; then it returns a list of the outputs.
- sapply() is like lapply() above but with some more features, like simplify (basically tries to make the output prettier) and USE.NAMES (names each output with the names of the input).
- function() use this to make your own function.
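A tiny illustration of the lapply()/sapply() difference, using toy data (not from the example dataset):

```r
# a small named list to apply a function over
vals <- list(a = 1:3, b = 4:6)

lapply(vals, mean)  # returns a list: element $a holds 2, element $b holds 5
sapply(vals, mean)  # simplifies to a named numeric vector: a = 2, b = 5
```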
Hayes's PROCESS Model 1 results for moderation (continuous moderator)
The lavaan code is tricky at first, but I think it's way easier to learn than Stata's or Mplus's (yes, SPSS's AMOS is easier; it's like Microsoft Paint for SEM). I suggest looking over the basic tutorials on lavaan's webpage. For now, fitting and running a model essentially takes 2 steps:
1. Write the model using regression syntax (~ plays the role of =), making sure to name coefficients (a1*variablename means the coefficient for variablename will be called a1) and to define new parameters (indirect := a1*b1 will give you an output line for a1 x b1 called indirect).
2. Run that model using sem() or cfa(), making sure to give the function the name of the dataframe with your variables.
The new part is writing a model as a character element and using it as an argument in the lavaan functions. The other new parts are the lavaan operators. They're easy to learn, I think.
library(lavaan)

# create interaction term between X (socst) and M (math)
egdata$socst_c.math_c <- with(egdata, socst_c*math_c)

# parameters
hayes1 <- '# regressions
           read ~ b1*socst_c
           read ~ b2*math_c
           read ~ b3*socst_c.math_c

           # mean of centered math (for use in simple slopes)
           math_c ~ math.mean*1

           # variance of centered math (for use in simple slopes)
           math_c ~~ math.var*math_c

           # simple slopes
           SD.below := b1 + b3*(math.mean - sqrt(math.var))
           mean := b1 + b3*(math.mean)
           SD.above := b1 + b3*(math.mean + sqrt(math.var))
          '

# fit the model (this takes some time)
sem1 <- sem(model = hayes1, data = egdata, se = "bootstrap", bootstrap = 1000)

# fit measures
summary(sem1, fit.measures = TRUE, standardize = TRUE, rsquare = TRUE)
## lavaan (0.5-20) converged normally after 17 iterations
##
## Number of observations 200
##
## Estimator ML
## Minimum Function Test Statistic 76.103
## Degrees of freedom 2
## P-value (Chi-square) 0.000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 233.790
## Degrees of freedom 5
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.676
## Tucker-Lewis Index (TLI) 0.190
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -3354.138
## Loglikelihood unrestricted model (H1) -3316.086
##
## Number of free parameters 7
## Akaike (AIC) 6722.276
## Bayesian (BIC) 6745.364
## Sample-size adjusted Bayesian (BIC) 6723.187
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.430
## 90 Percent Confidence Interval 0.351 0.516
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.180
##
## Parameter Estimates:
##
## Information Observed
## Standard Errors Bootstrap
## Number of requested bootstrap draws 1000
## Number of successful bootstrap draws 1000
##
## Regressions:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## read ~
## socst_c (b1) 0.374 0.054 6.974 0.000 0.374 0.437
## math_c (b2) 0.481 0.062 7.799 0.000 0.481 0.490
## scst_c.m_ (b3) 0.011 0.005 2.270 0.023 0.011 0.118
##
## Intercepts:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## math_c (mth.) -0.000 0.648 -0.000 1.000 -0.000 -0.000
## read 51.615 0.564 91.462 0.000 51.615 5.628
##
## Variances:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## math_c (mth.) 87.329 7.115 12.274 0.000 87.329 1.000
## read 47.473 4.611 10.295 0.000 47.473 0.564
##
## R-Square:
## Estimate
## read 0.436
##
## Defined Parameters:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## SD.below 0.268 0.067 4.025 0.000 0.268 0.319
## mean 0.374 0.054 6.929 0.000 0.374 0.437
## SD.above 0.479 0.075 6.359 0.000 0.479 0.554
# bootstraps
parameterEstimates(sem1, boot.ci.type = "bca.simple", level = .95, ci = TRUE, standardized = FALSE)
## lhs op rhs label est
## 1 read ~ socst_c b1 0.374
## 2 read ~ math_c b2 0.481
## 3 read ~ socst_c.math_c b3 0.011
## 4 math_c ~1 math.mean 0.000
## 5 math_c ~~ math_c math.var 87.329
## 6 read ~~ read 47.473
## 7 socst_c ~~ socst_c 114.681
## 8 socst_c ~~ socst_c.math_c -88.161
## 9 socst_c.math_c ~~ socst_c.math_c 9184.507
## 10 read ~1 51.615
## 11 socst_c ~1 0.000
## 12 socst_c.math_c ~1 54.489
## 13 SD.below := b1+b3*(math.mean-sqrt(math.var)) SD.below 0.268
## 14 mean := b1+b3*(math.mean) mean 0.374
## 15 SD.above := b1+b3*(math.mean+sqrt(math.var)) SD.above 0.479
## se z pvalue ci.lower ci.upper
## 1 0.054 6.974 0.000 0.267 0.474
## 2 0.062 7.799 0.000 0.353 0.601
## 3 0.005 2.270 0.023 0.002 0.022
## 4 0.648 0.000 1.000 -1.345 1.210
## 5 7.115 12.274 0.000 74.678 102.638
## 6 4.611 10.295 0.000 39.785 59.013
## 7 0.000 NA NA 114.681 114.681
## 8 0.000 NA NA -88.161 -88.161
## 9 0.000 NA NA 9184.507 9184.507
## 10 0.564 91.462 0.000 50.514 52.725
## 11 0.000 NA NA 0.000 0.000
## 12 0.000 NA NA 54.489 54.489
## 13 0.067 4.025 0.000 0.125 0.402
## 14 0.054 6.929 0.000 0.268 0.476
## 15 0.075 6.359 0.000 0.323 0.625
Functions used
- with() I use this function when I don't want to call a dataframe every time I name a variable (i.e., mydata$myvariable).
- $ extracts an element from an object.
- sem() fits a structural equation model.
- summary() summarizes objects like linear models, dataframes, and more.
- parameterEstimates() outputs parameter estimates from an sem() or cfa() model.
Packages used
- lavaan is a badass package for structural equation modeling; it's like Mplus but free, easier to use, and prettier
Hayes's PROCESS Model 4: "classic" mediation
Here's how you do it in SPSS
PROCESS vars = science read math
/ x = math
/ m = read
/ y = science
/ model = 4
/ center = 1
/ total = 1
/ effsize = 1
/ boot = 1000.
Run MATRIX procedure:
*************** PROCESS Procedure for SPSS Release 2.16 ******************
Written by Andrew F. Hayes, Ph.D. www.afhayes.com
Documentation available in Hayes (2013). www.guilford.com/p/hayes3
**************************************************************************
Model = 4
Y = science
X = math
M = read
Sample size
200
**************************************************************************
Outcome: read
Model Summary
R R-sq MSE F df1 df2 p
.6623 .4386 59.3124 154.6991 1.0000 198.0000 .0000
Model
coeff se t p LLCI ULCI
constant 14.0725 3.1158 4.5165 .0000 7.9281 20.2170
math .7248 .0583 12.4378 .0000 .6099 .8397
**************************************************************************
Outcome: science
Model Summary
R R-sq MSE F df1 df2 p
.6915 .4782 51.6688 90.2743 2.0000 197.0000 .0000
Model
coeff se t p LLCI ULCI
constant 11.6155 3.0543 3.8030 .0002 5.5923 17.6387
read .3654 .0663 5.5091 .0000 .2346 .4962
math .4017 .0726 5.5339 .0000 .2586 .5449
************************** TOTAL EFFECT MODEL ****************************
Outcome: science
Model Summary
R R-sq MSE F df1 df2 p
.6307 .3978 59.3280 130.8077 1.0000 198.0000 .0000
Model
coeff se t p LLCI ULCI
constant 16.7579 3.1162 5.3776 .0000 10.6126 22.9032
math .6666 .0583 11.4371 .0000 .5516 .7815
***************** TOTAL, DIRECT, AND INDIRECT EFFECTS ********************
Total effect of X on Y
Effect SE t p LLCI ULCI
.6666 .0583 11.4371 .0000 .5516 .7815
Direct effect of X on Y
Effect SE t p LLCI ULCI
.4017 .0726 5.5339 .0000 .2586 .5449
Indirect effect of X on Y
Effect Boot SE BootLLCI BootULCI
read .2649 .0550 .1590 .3760
Partially standardized indirect effect of X on Y
Effect Boot SE BootLLCI BootULCI
read .0268 .0054 .0158 .0378
Completely standardized indirect effect of X on Y
Effect Boot SE BootLLCI BootULCI
read .2506 .0519 .1482 .3537
Ratio of indirect to total effect of X on Y
Effect Boot SE BootLLCI BootULCI
read .3973 .0962 .2090 .5989
Ratio of indirect to direct effect of X on Y
Effect Boot SE BootLLCI BootULCI
read .6593 .3179 .2642 1.4929
R-squared mediation effect size (R-sq_med)
Effect Boot SE BootLLCI BootULCI
read .3167 .0396 .2421 .4000
******************** ANALYSIS NOTES AND WARNINGS *************************
Number of bootstrap samples for bias corrected bootstrap confidence intervals:
1000
Level of confidence for all confidence intervals in output:
95.00
NOTE: Kappa-squared is disabled from output as of version 2.16.
------ END MATRIX -----
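The arithmetic behind those effects is worth seeing once: the indirect effect is the product of the a path (math → read) and the b path (read → science), and the total effect is the direct effect plus the indirect effect. A quick check with the rounded coefficients from the output above (tiny discrepancies are rounding):

```r
a <- 0.7248       # a path: math -> read, from the PROCESS output above
b <- 0.3654       # b path: read -> science
c_dash <- 0.4017  # direct effect of math on science

indirect <- a * b           # ~ .2649 in the output
total <- c_dash + indirect  # ~ .6666 in the output
round(c(indirect = indirect, total = total), 4)
```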
Here's how you do it in R
Reproducing Hayes's PROCESS Model 4 results for "classic" mediation
# parameters
hayes4 <- ' # direct effect
            science ~ c*math_c
            direct := c

            # regressions
            read_c ~ a*math_c
            science ~ b*read_c

            # indirect effect (a*b)
            indirect := a*b

            # total effect
            total := c + (a*b)'

# fit model
sem2 <- sem(model = hayes4, data = egdata, se = "bootstrap", bootstrap = 1000)

# fit measures
summary(sem2, fit.measures = TRUE, standardize = TRUE, rsquare = TRUE)
## lavaan (0.5-20) converged normally after 14 iterations
##
## Number of observations 200
##
## Estimator ML
## Minimum Function Test Statistic 0.000
## Degrees of freedom 0
##
## Model test baseline model:
##
## Minimum Function Test Statistic 245.569
## Degrees of freedom 3
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.000
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2098.582
## Loglikelihood unrestricted model (H1) -2098.582
##
## Number of free parameters 5
## Akaike (AIC) 4207.164
## Bayesian (BIC) 4223.656
## Sample-size adjusted Bayesian (BIC) 4207.816
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent Confidence Interval 0.000 0.000
## P-value RMSEA <= 0.05 1.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.000
##
## Parameter Estimates:
##
## Information Observed
## Standard Errors Bootstrap
## Number of requested bootstrap draws 1000
## Number of successful bootstrap draws 997
##
## Regressions:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## science ~
## math_c (c) 0.402 0.084 4.771 0.000 0.402 0.380
## read_c ~
## math_c (a) 0.725 0.058 12.502 0.000 0.725 0.662
## science ~
## read_c (b) 0.365 0.076 4.818 0.000 0.365 0.378
##
## Variances:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## science 50.894 4.937 10.309 0.000 50.894 0.522
## read_c 58.719 5.853 10.033 0.000 58.719 0.561
##
## R-Square:
## Estimate
## science 0.478
## read_c 0.439
##
## Defined Parameters:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## direct 0.402 0.084 4.768 0.000 0.402 0.380
## indirect 0.265 0.055 4.773 0.000 0.265 0.251
## total 0.667 0.058 11.418 0.000 0.667 0.631
# bootstraps
parameterEstimates(sem2, boot.ci.type = "bca.simple", level = .95, ci = TRUE, standardized = FALSE)
## lhs op rhs label est se z pvalue ci.lower
## 1 science ~ math_c c 0.402 0.084 4.771 0 0.255
## 2 read_c ~ math_c a 0.725 0.058 12.502 0 0.609
## 3 science ~ read_c b 0.365 0.076 4.818 0 0.216
## 4 science ~~ science 50.894 4.937 10.309 0 42.729
## 5 read_c ~~ read_c 58.719 5.853 10.033 0 48.068
## 6 math_c ~~ math_c 87.329 0.000 NA NA 87.329
## 7 direct := c direct 0.402 0.084 4.768 0 0.255
## 8 indirect := a*b indirect 0.265 0.055 4.773 0 0.156
## 9 total := c+(a*b) total 0.667 0.058 11.418 0 0.562
## ci.upper
## 1 0.573
## 2 0.833
## 3 0.511
## 4 63.073
## 5 71.236
## 6 87.329
## 7 0.573
## 8 0.375
## 9 0.790
Hayes's PROCESS Model 7: moderated mediation with a continuous moderator
Here's how you do it in SPSS
PROCESS vars = science read math write
/ x = math
/ m = read
/ w = write
/ y = science
/ model = 7
/ center = 1
/ total = 1
/ boot = 1000.
Run MATRIX procedure:
*************** PROCESS Procedure for SPSS Release 2.16 ******************
Written by Andrew F. Hayes, Ph.D. www.afhayes.com
Documentation available in Hayes (2013). www.guilford.com/p/hayes3
**************************************************************************
Model = 7
Y = science
X = math
M = read
W = write
Sample size
200
**************************************************************************
Outcome: read
Model Summary
R R-sq MSE F df1 df2 p
.7048 .4968 53.7100 64.4961 3.0000 196.0000 .0000
Model
coeff se t p LLCI ULCI
constant 51.9853 .6362 81.7178 .0000 50.7307 53.2399
math .5075 .0728 6.9667 .0000 .3638 .6511
write .3403 .0720 4.7292 .0000 .1984 .4823
int_1 .0045 .0068 .6633 .5079 -.0089 .0178
Product terms key:
int_1 math X write
**************************************************************************
Outcome: science
Model Summary
R R-sq MSE F df1 df2 p
.6915 .4782 51.6688 90.2743 2.0000 197.0000 .0000
Model
coeff se t p LLCI ULCI
constant 32.7641 3.5015 9.3572 .0000 25.8589 39.6693
read .3654 .0663 5.5091 .0000 .2346 .4962
math .4017 .0726 5.5339 .0000 .2586 .5449
******************** DIRECT AND INDIRECT EFFECTS *************************
Direct effect of X on Y
Effect SE t p LLCI ULCI
.4017 .0726 5.5339 .0000 .2586 .5449
Conditional indirect effect(s) of X on Y at values of the moderator(s):
Mediator
write Effect Boot SE BootLLCI BootULCI
read -9.4786 .1699 .0525 .0843 .2993
read .0000 .1854 .0435 .1099 .2847
read 9.4786 .2010 .0441 .1231 .3046
Values for quantitative moderators are the mean and plus/minus one SD from mean.
Values for dichotomous moderators are the two values of the moderator.
******************** INDEX OF MODERATED MEDIATION ************************
Mediator
Index SE(Boot) BootLLCI BootULCI
read .0016 .0023 -.0034 .0059
******************** ANALYSIS NOTES AND WARNINGS *************************
Number of bootstrap samples for bias corrected bootstrap confidence intervals:
1000
Level of confidence for all confidence intervals in output:
95.00
NOTE: The following variables were mean centered prior to analysis:
math write
------ END MATRIX -----
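Here the indirect effect depends on the moderator: the a path becomes a1 + a3*w, so the conditional indirect effect at moderator value w is (a1 + a3*w)*b1. Plugging in the rounded coefficients from the output above reproduces the three conditional effects (up to rounding):

```r
a1 <- 0.5075       # math -> read, from the PROCESS output above
a3 <- 0.0045       # int_1 (math x write) -> read
b1 <- 0.3654       # read -> science
sd_write <- 9.4786 # SD of centered write

cond_indirect <- function(w) (a1 + a3 * w) * b1

round(cond_indirect(-sd_write), 4)  # ~ .1699 in the PROCESS output
round(cond_indirect(0), 4)          # ~ .1854
round(cond_indirect(sd_write), 4)   # ~ .2010
```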
Here's how you do it in R
Note: cdash = direct effect (c')
Reproducing Hayes's PROCESS Model 7 results for moderated mediation
# create interaction term between X (math) and W (write)
egdata$math_c.write_c <- with(egdata, math_c*write_c)

hayes7 <- ' # regressions
            read_c ~ a1*math_c
            science ~ b1*read_c
            read_c ~ a2*write_c
            read_c ~ a3*math_c.write_c
            science ~ cdash*math_c

            # mean of centered write (for use in simple slopes)
            write_c ~ write.mean*1

            # variance of centered write (for use in simple slopes)
            write_c ~~ write.var*write_c

            # indirect effects conditional on moderator (a1 + a3*a2.value)*b1
            indirect.SDbelow := a1*b1 + a3*-sqrt(write.var)*b1
            indirect.mean := a1*b1 + a3*write.mean*b1
            indirect.SDabove := a1*b1 + a3*sqrt(write.var)*b1'

# fit model
sem3 <- sem(model = hayes7, data = egdata, se = "bootstrap", bootstrap = 1000)

# fit measures
summary(sem3, fit.measures = TRUE, standardize = TRUE, rsquare = TRUE)
## lavaan (0.5-20) converged normally after 26 iterations
##
## Number of observations 200
##
## Estimator ML
## Minimum Function Test Statistic 119.179
## Degrees of freedom 4
## P-value (Chi-square) 0.000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 386.820
## Degrees of freedom 9
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.695
## Tucker-Lewis Index (TLI) 0.314
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -3978.754
## Loglikelihood unrestricted model (H1) -3919.164
##
## Number of free parameters 11
## Akaike (AIC) 7979.508
## Bayesian (BIC) 8015.789
## Sample-size adjusted Bayesian (BIC) 7980.940
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.379
## 90 Percent Confidence Interval 0.323 0.440
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.199
##
## Parameter Estimates:
##
## Information Observed
## Standard Errors Bootstrap
## Number of requested bootstrap draws 1000
## Number of successful bootstrap draws 1000
##
## Regressions:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## read_c ~
## math_c (a1) 0.507 0.073 6.929 0.000 0.507 0.511
## science ~
## read_c (b1) 0.365 0.076 4.832 0.000 0.365 0.358
## read_c ~
## write_c (a2) 0.340 0.070 4.832 0.000 0.340 0.347
## mth_c._ (a3) 0.004 0.006 0.737 0.461 0.004 0.039
## science ~
## math_c (cdsh) 0.402 0.085 4.753 0.000 0.402 0.397
##
## Intercepts:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## write_c (wrt.) 0.000 0.680 0.000 1.000 0.000 0.000
## read_c -0.245 0.645 -0.379 0.705 -0.245 -0.026
## science 51.850 0.508 102.017 0.000 51.850 5.477
##
## Variances:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## write_c (wrt.) 89.394 7.323 12.207 0.000 89.394 1.000
## read_c 52.636 4.674 11.261 0.000 52.636 0.612
## science 50.894 4.810 10.582 0.000 50.894 0.568
##
## R-Square:
## Estimate
## read_c 0.388
## science 0.432
##
## Defined Parameters:
## Estimate Std.Err Z-value P(>|z|) Std.lv Std.all
## indirect.SDblw 0.170 0.050 3.372 0.001 0.170 0.169
## indirect.mean 0.185 0.041 4.488 0.000 0.185 0.183
## indirect.SDabv 0.201 0.043 4.715 0.000 0.201 0.197
# bootstraps
parameterEstimates(sem3, boot.ci.type = "bca.simple", level = .95, ci = TRUE, standardized = FALSE)
## lhs op rhs label
## 1 read_c ~ math_c a1
## 2 science ~ read_c b1
## 3 read_c ~ write_c a2
## 4 read_c ~ math_c.write_c a3
## 5 science ~ math_c cdash
## 6 write_c ~1 write.mean
## 7 write_c ~~ write_c write.var
## 8 read_c ~~ read_c
## 9 science ~~ science
## 10 math_c ~~ math_c
## 11 math_c ~~ math_c.write_c
## 12 math_c.write_c ~~ math_c.write_c
## 13 read_c ~1
## 14 science ~1
## 15 math_c ~1
## 16 math_c.write_c ~1
## 17 indirect.SDbelow := a1*b1+a3*-sqrt(write.var)*b1 indirect.SDbelow
## 18 indirect.mean := a1*b1+a3*write.mean*b1 indirect.mean
## 19 indirect.SDabove := a1*b1+a3*sqrt(write.var)*b1 indirect.SDabove
## est se z pvalue ci.lower ci.upper
## 1 0.507 0.073 6.929 0.000 0.359 0.647
## 2 0.365 0.076 4.832 0.000 0.213 0.521
## 3 0.340 0.070 4.832 0.000 0.202 0.481
## 4 0.004 0.006 0.737 0.461 -0.008 0.016
## 5 0.402 0.085 4.753 0.000 0.232 0.578
## 6 0.000 0.680 0.000 1.000 -1.439 1.201
## 7 89.394 7.323 12.207 0.000 77.074 106.572
## 8 52.636 4.674 11.261 0.000 44.066 62.911
## 9 50.894 4.810 10.582 0.000 42.479 61.583
## 10 87.329 0.000 NA NA 87.329 87.329
## 11 91.757 0.000 NA NA 91.757 91.757
## 12 6358.530 0.000 NA NA 6358.530 6358.530
## 13 -0.245 0.645 -0.379 0.705 -1.430 1.138
## 14 51.850 0.508 102.017 0.000 50.885 52.868
## 15 0.000 0.000 NA NA 0.000 0.000
## 16 54.555 0.000 NA NA 54.555 54.555
## 17 0.170 0.050 3.372 0.001 0.085 0.287
## 18 0.185 0.041 4.488 0.000 0.110 0.270
## 19 0.201 0.043 4.715 0.000 0.125 0.299
For now I'm too lazy to explore assumptions or diagnostics or interpretations. Of course, whether I "ran these analyses right" depends on the data meeting assumptions, but my goal here was simpler than that: reproduce Hayes's macro outputs. Win.
If you're interested in assumptions or diagnostics or interpretations of these models—I think you should be—then I recommend checking out Hayes's book. Also, if you haven't guessed, these models are SEM models; any good material on SEM = good material for learning when and how to run and interpret these kinds of models. The lavaan people recommend Alex Beaujean's book: Latent Variable Modeling using R: A Step-By-Step Guide.
I linked Hayes's template for his 76 models above, but here it is again. With some time and some gumption, you should be able to use the spirit of the code in this post to run every (or nearly every) model in that PDF. You can find even more models here. I also suggest visiting lavaan's webpage for really simple (and aesthetically pleasing) tutorials on how to use their package.
Happy R
Papers I cited
- Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173-1182.
- Aiken, L. S., West, S. G., & Reno, R. R. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
- Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40(3), 879-891.
- Hayes, A. F. (2015). An index and test of linear moderated mediation. Multivariate Behavioral Research, 50(1), 1-22.
Update: I forgot Hayes's Index of Moderated Mediation4. What's that?
"The heart of the test is a quantification of the association between an indirect effect and a moderator—an "index of moderated mediation"—followed by an inference as to whether this index is different from zero. I show that if the outcome of this test results in a claim that an indirect effect is moderated, this means that any two conditional indirect effects estimated at different values of the moderator are significantly different from each other."
I ran the model again and pasted some abbreviated code and output below. Again, highlights = output to compare to SPSS above. One interpretation I'll offer here: one should not conclude that the indirect effect differs at different values of the moderator. Notice that although the "simple slopes" are significantly different from zero at 1 standard deviation above and below the mean, they're not significantly different from each other. This is indexed by the aptly named index of moderated mediation (its CI captured zero), but you could also guess this to be the case by looking at the CIs for each simple slope; their CIs capture each other's estimates even if they don't capture zero.
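The index itself is just the product of the interaction path and the b path. With the rounded Model 7 coefficients from the SPSS output above, a3*b1 reproduces the .0016 index:

```r
a3 <- 0.0045  # math x write interaction on read, from the output above
b1 <- 0.3654  # read -> science

index_mod_med <- a3 * b1
round(index_mod_med, 4)  # ~ .0016; its bootstrap CI above includes zero
```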
hayes7 <- ' # regressions
            read_c ~ a1*math_c
...
            # Index of moderated mediation
            index.mod.med := a3*b1

            # indirect effects conditional on moderator (a1 + a3*a2.value)*b1
            indirect.SDbelow := a1*b1 + a3*-sqrt(write.var)*b1
            indirect.mean := a1*b1 + a3*write.mean*b1
            indirect.SDabove := a1*b1 + a3*sqrt(write.var)*b1'

# fit model
sem3 <- sem(model = hayes7, data = egdata, se = "bootstrap", bootstrap = 1000)
# bootstraps
parameterEstimates(sem3, boot.ci.type = "bca.simple", level = .95, ci = TRUE, standardized = FALSE)
## lhs op rhs label
## 1 read_c ~ math_c a1
...
## 17 index.mod.med := a3*b1 index.mod.med
## 18 indirect.SDbelow := a1*b1+a3*-sqrt(write.var)*b1 indirect.SDbelow
## 19 indirect.mean := a1*b1+a3*write.mean*b1 indirect.mean
## 20 indirect.SDabove := a1*b1+a3*sqrt(write.var)*b1 indirect.SDabove
## est se z pvalue ci.lower ci.upper
## 1 0.507 0.074 6.816 0.000 0.358 0.650
...
## 17 0.002 0.002 0.734 0.463 -0.003 0.006
## 18 0.170 0.052 3.290 0.001 0.089 0.297
## 19 0.185 0.042 4.384 0.000 0.119 0.295
## 20 0.201 0.042 4.739 0.000 0.129 0.304