Friday, November 18, 2016

Reproducing Hayes's PROCESS Model 1 with Dichotomous Moderator (in R)

I'm posting a supplement to my earlier post, Reproducing Hayes’s PROCESS Models' results in R.


Here's how you test the effects of a dichotomous moderator (i.e., how you test the simple slopes as well as the interaction).

First I fake some data.
library(MASS)

# set seed so that you can reproduce the same numbers
set.seed(1)

# 1000 observations
# 500 control, 500 treatment
# positive relationship between X1 and X2 in the control condition (r = .5); means = 1
# negative relationship between X1 and X2 in the treatment condition (r = -.5); means = 0

fake <- data.frame(id = 1:1000,
                   cond = rep(x = c("Control","Treatment"), each = 500),
                   rbind(mvrnorm(n = 500, mu = c(1,1), Sigma = diag(x = .5, nrow = 2, ncol = 2) + .5),
                         mvrnorm(n = 500, mu = c(0,0), Sigma = diag(x = 1.5, nrow = 2, ncol = 2) - .5)))

# center X1
fake$X1.c <- with(data = fake, as.numeric(scale(x = X1, center = TRUE, scale = FALSE)))

# effect code condition variable
fake$cond.e <- with(data = fake, ifelse(cond == "Control", -1, 1))

# create numeric interaction variable
fake$cond.X1.int <- with(data = fake, cond.e*X1.c)
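
To see why those Sigma arguments give correlations of .5 and -.5, it helps to print the matrices themselves. The sketch below uses only base R; the names sigma.control and sigma.treatment are mine and don't appear in the code above. Because both variances equal 1, the off-diagonal covariance is also the population correlation.

```r
# covariance matrices passed to mvrnorm() above

# diag(x = .5, ...) puts .5 on the diagonal; adding .5 to every cell
# gives variances of 1 and a covariance of .5
sigma.control <- diag(x = .5, nrow = 2, ncol = 2) + .5

# diag(x = 1.5, ...) puts 1.5 on the diagonal; subtracting .5 from every cell
# gives variances of 1 and a covariance of -.5
sigma.treatment <- diag(x = 1.5, nrow = 2, ncol = 2) - .5

sigma.control
sigma.treatment

# convert the covariance matrices to correlation matrices
cov2cor(sigma.control)
cov2cor(sigma.treatment)
```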

Next, I plot the slopes at different levels of the moderator.

library(ggplot2)

## devtools::install_github('bart6114/artyfarty',force=TRUE)
library(artyfarty)

ggplot(data = fake, aes(x = X1.c, y = X2, color = cond)) +
  geom_point() +
  geom_smooth(method = "lm") +
  scale_y_continuous(breaks = seq(-5,5,1), limits = c(-5,5)) +
  labs(x = "X1 (Centered)", y = "X2", color = "Condition") +
  theme_five38() +
  scale_color_manual(values = pal("five38")) +
  theme(legend.position = "top")

Now I run the model. The main difference from my previous post is how you specify the simple slopes. There, the code computed simple slopes in a way that many PROCESS users understand: test the effect of the predictor at the mean of the moderator and at 1 standard deviation above and below it.


The code below is actually simpler. You don't need to specify the mean or the standard deviation. All you have to do is test the slope at the values you set for the moderator. In this case, those are the effect codes -1 (control) and 1 (treatment).
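
The arithmetic behind a simple slope is just b1 + b3 * (moderator value). As a rough cross-check outside lavaan, the sketch below fits the same regression with lm() on data simulated with the same recipe as above, then compares the simple slopes with the slopes from separate regressions within each condition. The object names sim, ols, and slope.at are my own. Because the model gives each group its own intercept and slope, the two sets of slopes should agree.

```r
library(MASS)

set.seed(1)

# same simulation recipe as above: 500 control, 500 treatment
sim <- data.frame(cond = rep(x = c("Control", "Treatment"), each = 500),
                  rbind(mvrnorm(n = 500, mu = c(1, 1), Sigma = diag(x = .5, nrow = 2, ncol = 2) + .5),
                        mvrnorm(n = 500, mu = c(0, 0), Sigma = diag(x = 1.5, nrow = 2, ncol = 2) - .5)))

# center X1 and effect code condition
sim$X1.c <- sim$X1 - mean(sim$X1)
sim$cond.e <- ifelse(sim$cond == "Control", -1, 1)

# OLS version of the moderation model
ols <- lm(X2 ~ X1.c + cond.e + X1.c:cond.e, data = sim)
b <- coef(ols)

# simple slope of X1.c at a given moderator value: b1 + b3 * value
slope.at <- function(value) unname(b["X1.c"] + b["X1.c:cond.e"] * value)

slope.at(-1) # control
slope.at(1)  # treatment

# slopes from separate regressions within each condition
coef(lm(X2 ~ X1, data = subset(sim, cond == "Control")))["X1"]
coef(lm(X2 ~ X1, data = subset(sim, cond == "Treatment")))["X1"]
```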

library(lavaan)

# write model
fake.mod1 <- '# regressions
                X2 ~ b1*X1.c
                X2 ~ b2*cond.e
                X2 ~ b3*cond.X1.int

              # simple slopes
              # cond == -1
                control := b1 + b3*-1

              # cond == 0
                center := b1 + b3*0

              # cond == 1
                treatment := b1 + b3*1'

# fit model
mod1.fit <- sem(model = fake.mod1,
            data = fake,
            se = "bootstrap",
            bootstrap = 1000)

# summarize
summary(mod1.fit,
        fit.measures = TRUE,
        standardized = TRUE,
        rsquare = TRUE)
## lavaan (0.5-22) converged normally after  13 iterations
## 
##   Number of observations                          1000
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic                0.000
##   Degrees of freedom                                 0
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic              449.669
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -5632.756
##   Loglikelihood unrestricted model (H1)      -5632.756
## 
##   Number of free parameters                          4
##   Akaike (AIC)                               11273.512
##   Bayesian (BIC)                             11293.143
##   Sample-size adjusted Bayesian (BIC)        11280.438
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent Confidence Interval          0.000  0.000
##   P-value RMSEA <= 0.05                             NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Standard Errors                            Bootstrap
##   Number of requested bootstrap draws             1000
##   Number of successful bootstrap draws            1000
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   X2 ~                                                                  
##     X1.c      (b1)   -0.004    0.028   -0.137    0.891   -0.004   -0.004
##     cond.e    (b2)   -0.494    0.032  -15.427    0.000   -0.494   -0.438
##     cnd.X1.nt (b3)   -0.454    0.028  -16.423    0.000   -0.454   -0.415
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .X2                0.812    0.034   23.891    0.000    0.812    0.638
## 
## R-Square:
##                    Estimate
##     X2                0.362
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     control           0.450    0.039   11.418    0.000    0.450    0.411
##     center           -0.004    0.028   -0.137    0.891   -0.004   -0.004
##     treatment        -0.458    0.039  -11.621    0.000   -0.458   -0.419
# bootstrapped estimates
parameterEstimates(mod1.fit,
                   boot.ci.type = "bca.simple",
                   level = .95,
                   ci = TRUE,
                   standardized = FALSE)
##            lhs op         rhs     label    est    se       z pvalue
## 1           X2  ~        X1.c        b1 -0.004 0.028  -0.137  0.891
## 2           X2  ~      cond.e        b2 -0.494 0.032 -15.427  0.000
## 3           X2  ~ cond.X1.int        b3 -0.454 0.028 -16.423  0.000
## 4           X2 ~~          X2            0.812 0.034  23.891  0.000
## 5         X1.c ~~        X1.c            1.326 0.000      NA     NA
## 6         X1.c ~~      cond.e           -0.513 0.000      NA     NA
## 7         X1.c ~~ cond.X1.int           -0.020 0.000      NA     NA
## 8       cond.e ~~      cond.e            1.000 0.000      NA     NA
## 9       cond.e ~~ cond.X1.int            0.000 0.000      NA     NA
## 10 cond.X1.int ~~ cond.X1.int            1.063 0.000      NA     NA
## 11     control :=    b1+b3*-1   control  0.450 0.039  11.418  0.000
## 12      center :=     b1+b3*0    center -0.004 0.028  -0.137  0.891
## 13   treatment :=     b1+b3*1 treatment -0.458 0.039 -11.621  0.000
##    ci.lower ci.upper
## 1    -0.059    0.049
## 2    -0.563   -0.433
## 3    -0.509   -0.401
## 4     0.750    0.886
## 5     1.326    1.326
## 6    -0.513   -0.513
## 7    -0.020   -0.020
## 8     1.000    1.000
## 9     0.000    0.000
## 10    1.063    1.063
## 11    0.371    0.524
## 12   -0.059    0.049
## 13   -0.534   -0.378
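
The defined parameters can be checked by hand against the regression table. Here's a small sketch that plugs in the rounded estimates printed above (b1 = -0.004, b3 = -0.454); the variable names are mine.

```r
# rounded estimates copied from the lavaan output above
b1 <- -0.004 # X2 ~ X1.c
b3 <- -0.454 # X2 ~ cond.X1.int

# simple slope at each value of the effect-coded moderator
moderator <- c(control = -1, center = 0, treatment = 1)
simple.slopes <- b1 + b3 * moderator

round(simple.slopes, 3)
```

These come out to 0.450, -0.004, and -0.458, which match the control, center, and treatment rows. With -1/1 effect codes, the center slope is simply b1, the unweighted average of the two conditions' slopes.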

Next, I write the data to a csv file that I'll use to run the analyses in SPSS using Hayes's macro.

# write data to file
setwd("~/Desktop/analysis-examples/Reproducing Hayes's Model 1 with a Dichotomous Moderator/")
write.csv(x = fake, file = "fake.csv", row.names = FALSE)

I read the csv as text using SPSS v21.

GET DATA  /TYPE=TXT
  /FILE="/Users/nicholasmmichalak/Desktop/analysis-examples/Reproducing Hayes's Model 1 with a "+
    "Dichotomous Moderator/fake.csv"
  /ENCODING='Locale'
  /DELCASE=LINE
  /DELIMITERS=","
  /QUALIFIER='"'
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES=
  id F3.0
  cond A7
  X1 F19
  X2 F19
  X1.c F19
  cond.e F2.0
  cond.X1.int F19.
CACHE.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.

Then I run the syntax that comes with PROCESS (the .sps file that comes with the download here).

After that, I can run the code below in the syntax editor. The formula for Model 1 is here.

process vars = X1.c cond.e X2
/ y = X2
/ x = X1.c
/ m = cond.e
/ model = 1
/ boot = 1000.




Run MATRIX procedure:

************** PROCESS Procedure for SPSS Release 2.16.1 *****************

          Written by Andrew F. Hayes, Ph.D.       www.afhayes.com
    Documentation available in Hayes (2013). www.guilford.com/p/hayes3

**************************************************************************
Model = 1
    Y = X2
    X = X1.c
    M = cond.e

Sample size
       1000

**************************************************************************
Outcome: X2

Model Summary
          R       R-sq        MSE          F        df1        df2          p
      .6018      .3622      .8155   188.5075     3.0000   996.0000      .0000

Model
              coeff         se          t          p       LLCI       ULCI
constant      .2718      .0319     8.5216      .0000      .2092      .3344
cond.e       -.4942      .0319   -15.4958      .0000     -.5568     -.4316
X1.c         -.0038      .0277     -.1387      .8897     -.0582      .0505
int_1        -.4538      .0277   -16.3823      .0000     -.5082     -.3995

Product terms key:

 int_1    X1.c        X     cond.e

R-square increase due to interaction(s):
         R2-chng          F        df1        df2          p
int_1      .1719   268.3809     1.0000   996.0000      .0000

*************************************************************************

Conditional effect of X on Y at values of the moderator(s):
     cond.e     Effect         se          t          p       LLCI       ULCI
    -1.0000      .4500      .0388    11.5979      .0000      .3738      .5261
     1.0000     -.4577      .0396   -11.5716      .0000     -.5353     -.3800

Values for quantitative moderators are the mean and plus/minus one SD from mean.
Values for dichotomous moderators are the two values of the moderator.

******************** ANALYSIS NOTES AND WARNINGS *************************

Level of confidence for all confidence intervals in output:
    95.00

------ END MATRIX -----
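
To line the two sets of results up, the sketch below puts the conditional effects from lavaan and PROCESS side by side. All of the numbers are copied from the outputs above; the data frame name compare is mine.

```r
# conditional effects of X1.c on X2, copied from the outputs above
compare <- data.frame(cond.e = c(-1, 1),
                      lavaan = c(0.450, -0.458),   # Defined Parameters: control, treatment
                      process = c(0.4500, -0.4577)) # Conditional effect of X on Y

# absolute difference between the two sets of estimates
compare$abs.diff <- abs(compare$lavaan - compare$process)
compare

# do the estimates agree once the PROCESS values are rounded to 3 decimals?
isTRUE(all.equal(round(compare$process, 3), compare$lavaan))
```

The point estimates agree to three decimals. The standard errors differ a little (0.039 in lavaan versus .0388 and .0396 in PROCESS), presumably because the lavaan standard errors come from the bootstrap.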


That's it!

Happy R,

Nick
