Monday, August 15, 2016

Computing reliability coefficients (like Cronbach's Alpha) in R and SPSS

This post is about measurement reliability. I think it'll be useful to define some concepts before I start my usual tutorial.


First, we should expect that any test or scale that supposedly measures something real will contain measurement error. Formally, a measurement score boils down to "true" stuff plus error stuff:

X = T + ε

In the above equation, X is the observed score, T is the true score, and ε is measurement error.



Reliability is, in a nutshell, the proportion of observed variance that is true variance: reliability = var(T) / var(X).

We don't know the true score or the true variance, so we have to collect data and do our best maths to estimate reliability. One way to do so is pretty intuitive: give people your measurement more than once.


To me, this idea is really neat. Measure people twice, X1 and X2, and assume that the true score stays the same and that measurement error is independent between time points (i.e., ε at time 1 doesn't correlate with ε at time 2).




Design and run a study or an experiment where it's reasonable to make these assumptions and you can be confident that what X1 and X2 have in common is the true score.

That is, the correlation between X1 and X2 is reliability. Pretty sick.
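To make that concrete (this simulation is mine, not part of the original post), you can fake the whole setup in a few lines of R: generate true scores, add independent error at two time points, and check that the test-retest correlation lands near the true reliability.

```r
# toy simulation: the test-retest correlation estimates reliability
set.seed(1)
n <- 100000
true_score <- rnorm(n)            # var(T) = 1
x1 <- true_score + rnorm(n)       # time 1: independent error with var = 1
x2 <- true_score + rnorm(n)       # time 2: independent error with var = 1

# true reliability = var(T) / var(X) = 1 / (1 + 1) = .5
cor(x1, x2)
```

With this many simulated people, the correlation should come out very close to .50.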


Even cooler, in my opinion, is that you can use the correlations among many related measurement items to estimate how their composite (e.g., their sum or average) will correlate with a composite of a similar number of items, or with the same items measured at another time.


This is the gist of coefficient alpha (i.e., Cronbach's alpha), which changes as a function of the average inter-item correlation and the number of items. More items and greater inter-item correlations result in larger alphas.


Source: LECTURE NOTES #13: Reliability, Structural Equation Modeling, and Hierarchical Modeling by Rich Gonzalez
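To play with that relationship directly, here's a tiny helper (my own illustration; alpha_std isn't from the post or the psych package) implementing the standardized alpha formula, alpha = k * r.bar / (1 + (k - 1) * r.bar), where k is the number of items and r.bar is the average inter-item correlation.

```r
# standardized alpha from the number of items and the average inter-item correlation
alpha_std <- function(k, r.bar) {
  k * r.bar / (1 + (k - 1) * r.bar)
}

alpha_std(k = 5, r.bar = .39)    # about .76: the extraversion example later in this post
alpha_std(k = 10, r.bar = .39)   # same correlations, twice the items: alpha climbs to about .86
```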

To demonstrate how to compute coefficient alpha (and the intraclass correlation coefficient, and an alternative coefficient, omega), I'll use data that comes with the psych package in R. I'll load that first.


Before I get into it, keep these points in mind:



  • I've highlighted values to compare in the R and SPSS outputs.
  • SPSS doesn't compute confidence intervals for coefficient alpha directly, but when you ask for the ICC, it gives you confidence intervals around the average-measures estimate, which equals coefficient alpha.
  • SPSS hasn't implemented the alternative coefficient I'll discuss, omega.
  • There are many ways to compute omega—it's complicated.

Here's how you do it in R: Coefficient Alpha

First, I'll load the psych and knitr packages.

# set working directory
setwd(dir = "~/Desktop/analysis-examples/Cronbach's alpha and ICC in R and SPSS/")

# packages I want
want_packages <- c("psych", "knitr")

# packages that are already downloaded in my library
have_packages   <- want_packages %in% rownames(installed.packages())

# if I don't have these packages, download the ones I don't have
if(any(!have_packages)) install.packages(want_packages[!have_packages])

# load psych package from library
library(psych)
library(knitr)

25 Personality items representing 5 factors

25 personality self-report items taken from the International Personality Item Pool (ipip.ori.org) were included as part of the Synthetic Aperture Personality Assessment (SAPA) web-based personality assessment project. The data from 2800 subjects are included here as a demonstration set for scale construction, factor analysis, and Item Response Theory analysis. Three additional demographic variables (sex, education, and age) are also included.
# load data
data("bfi")

# example from psych package help page on bfi
keys.list <- list(agree = c("-A1","A2","A3","A4","A5"),
                  conscientious = c("C1","C2","C3","-C4","-C5"),
                  extraversion = c("-E1","-E2","E3","E4","E5"),
                  neuroticism = c("N1","N2","N3","N4","N5"),
                  openness = c("O1","-O2","O3","O4","-O5"))

keys <- make.keys(bfi,keys.list)

# using knitr's kable function to make the output prettier
keys <- data.frame(fa.lookup(f = keys,
          dictionary = bfi.dictionary[,1:4])
)
keys$variable <- row.names(keys)
row.names(keys) <- NULL

Cronbach's alpha "from scratch"

# standardized alpha "by hand": alpha = k * mean(r) / (1 + (k - 1) * mean(r))
alpha.f <- function(mat){
  # average inter-item correlation from the lower triangle of the correlation matrix
  r11 <- mean(mat[lower.tri(mat)])
  # step up from the average correlation to a composite of k items
  alpha <- ncol(mat) * r11 / (1 + (ncol(mat) - 1) * r11)
  return(alpha)
}

alpha.f(cor(reverse.code(bfi[,11:15],keys = c(-1,-1,1,1,1)),use = "complete.obs"))
## [1] 0.7609641

Cronbach's alpha from the psych package

# select the extraversion items and save them as an object
extraversion.items <- names(bfi)[11:15]

# alpha with 5000 bootstraps
psych::alpha(x = bfi[,colnames(bfi) %in% extraversion.items],
             keys = c(-1, -1, 1, 1, 1),
             title = "Cronbach's alpha for 5 extraversion items",
             n.iter = 5000)
## 
## Reliability analysis  Cronbach's alpha for 5 extraversion items  
## Call: psych::alpha(x = bfi[, colnames(bfi) %in% extraversion.items], 
##     keys = c(-1, -1, 1, 1, 1), title = "Cronbach's alpha for 5 extraversion items", 
##     n.iter = 5000)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd
##       0.76      0.76    0.73      0.39 3.2 0.007  4.1 1.1
## 
##  lower alpha upper     95% confidence boundaries
## 0.75 0.76 0.78 
## 
##  lower median upper bootstrapped confidence intervals
##  0.75 0.76 0.78
##  Reliability if an item is dropped:
##     raw_alpha std.alpha G6(smc) average_r S/N alpha se
## E1-      0.73      0.73    0.67      0.40 2.6   0.0084
## E2-      0.69      0.69    0.63      0.36 2.3   0.0095
## E3       0.73      0.73    0.67      0.40 2.7   0.0082
## E4       0.70      0.70    0.65      0.37 2.4   0.0091
## E5       0.74      0.74    0.69      0.42 2.9   0.0078
## 
##  Item statistics 
##        n raw.r std.r r.cor r.drop mean  sd
## E1- 2777  0.72  0.70  0.59   0.52  4.0 1.6
## E2- 2784  0.78  0.76  0.69   0.61  3.9 1.6
## E3  2775  0.68  0.70  0.58   0.50  4.0 1.4
## E4  2791  0.75  0.75  0.66   0.58  4.4 1.5
## E5  2779  0.64  0.66  0.52   0.45  4.4 1.3
## 
## Non missing response frequency for each item
##       1    2    3    4    5    6 miss
## E1 0.24 0.23 0.15 0.16 0.13 0.09 0.01
## E2 0.19 0.24 0.12 0.22 0.14 0.09 0.01
## E3 0.05 0.11 0.15 0.30 0.27 0.13 0.01
## E4 0.05 0.09 0.10 0.16 0.34 0.26 0.00
## E5 0.03 0.08 0.10 0.22 0.34 0.22 0.01

Informal test of unidimensionality (principal components analysis)

# checking for dimensionality with principal component analysis
psych::principal(r = bfi[,colnames(bfi) %in% extraversion.items])
## Principal Components Analysis
## Call: psych::principal(r = bfi[, colnames(bfi) %in% extraversion.items])
## Standardized loadings (pattern matrix) based upon correlation matrix
##      PC1   h2   u2 com
## E1 -0.70 0.49 0.51   1
## E2 -0.78 0.61 0.39   1
## E3  0.69 0.48 0.52   1
## E4  0.76 0.57 0.43   1
## E5  0.64 0.41 0.59   1
## 
##                 PC1
## SS loadings    2.57
## Proportion Var 0.51
## 
## Mean item complexity =  1
## Test of the hypothesis that 1 component is sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.13 
##  with the empirical chi square  895.79  with prob <  2.2e-191 
## 
## Fit based upon off diagonal values = 0.9
# eigen values
psych::principal(r = bfi[,colnames(bfi) %in% extraversion.items])$values
## [1] 2.5681044 0.7640221 0.6410349 0.5620838 0.4647548
# percent variance explained
psych::principal(r = bfi[,colnames(bfi) %in% extraversion.items])$values /
  sum(psych::principal(r = bfi[,colnames(bfi) %in% extraversion.items])$values)
## [1] 0.51362088 0.15280441 0.12820699 0.11241677 0.09295096

Here's how you do it in SPSS: Coefficient Alpha

Load the BFI dataset from R.


* load BFI data

GET DATA  /TYPE=TXT
  /FILE="/Users/nicholasmmichalak/Desktop/analysis-examples/Cronbach's alpha and ICC in R and SPSS/bfi.spss.csv"
  /ENCODING='Locale'
  /DELCASE=LINE
  /DELIMITERS=","
  /QUALIFIER='"'
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES=
  A1 F4.0  A2 F4.0  A3 F1.0  A4 F4.0  A5 F1.0
  C1 F4.0  C2 F4.0  C3 F4.0  C4 F1.0  C5 F1.0
  E1 F1.0  E2 F4.0  E3 F4.0  E4 F1.0  E5 F4.0
  N1 F4.0  N2 F4.0  N3 F1.0  N4 F4.0  N5 F4.0
  O1 F1.0  O2 F1.0  O3 F1.0  O4 F1.0  O5 F1.0
  gender F1.0  education F4.0  age F2.0.
MISSING VALUES A1 TO age (9999).
CACHE.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.


* reverse code first two items

COMPUTE E1.rev = abs(E1 - 7).
COMPUTE E2.rev = abs(E2 - 7).
EXECUTE.


* cronbachs alpha with ICC for 95% confidence intervals

RELIABILITY
  /VARIABLES=E1.rev E2.rev E3 E4 E5
  /SCALE("Cronbach's alpha for 5 extraversion items") ALL
  /MODEL=ALPHA
  /STATISTICS=DESCRIPTIVE SCALE CORR
  /SUMMARY=TOTAL MEANS VARIANCE CORR
  /ICC=MODEL(RANDOM) TYPE(CONSISTENCY) CIN=95 TESTVAL=0.



Case Processing Summary

                  N      %
Cases Valid       2713   96.9
      Excluded(a)   87    3.1
      Total       2800  100.0

a. Listwise deletion based on all variables in the procedure.

Reliability Statistics

Cronbach's Alpha   Cronbach's Alpha Based on Standardized Items   N of Items
.761               .761                                           5

Item Statistics

         Mean     Std. Deviation   N
E1.rev   4.0284   1.63241          2713
E2.rev   3.8555   1.60720          2713
E3       4.0000   1.35237          2713
E4       4.4209   1.46131          2713
E5       4.4184   1.33676          2713

Inter-Item Correlation Matrix

         E1.rev   E2.rev   E3      E4      E5
E1.rev   1.000    .468     .320    .419    .298
E2.rev   .468     1.000    .377    .514    .377
E3       .320     .377     1.000   .418    .383
E4       .419     .514     .418    1.000   .316
E5       .298     .377     .383    .316    1.000

Summary Item Statistics

                          Mean    Minimum   Maximum   Range   Maximum / Minimum   Variance   N of Items
Item Means                4.145   3.856     4.421     .565    1.147               .067       5
Item Variances            2.200   1.787     2.665     .878    1.491               .169       5
Inter-Item Correlations   .389    .298      .514      .216    1.723               .004       5

Item-Total Statistics

         Scale Mean if   Scale Variance if   Corrected Item-     Squared Multiple   Cronbach's Alpha
         Item Deleted    Item Deleted        Total Correlation   Correlation        if Item Deleted
E1.rev   16.6948         18.280              .513                .278               .725
E2.rev   16.8677         17.399              .606                .380               .688
E3       16.7232         20.196              .501                .266               .728
E4       16.3022         18.678              .578                .351               .701
E5       16.3048         20.784              .455                .222               .742

Scale Statistics

Mean      Variance   Std. Deviation   N of Items
20.7232   28.113     5.30212          5

Intraclass Correlation Coefficient

                   Intraclass       95% Confidence Interval     F Test with True Value 0
                   Correlation(b)   Lower Bound   Upper Bound   Value   df1    df2     Sig
Single Measures    .389(a)          .371          .408          4.183   2712   10848   .000
Average Measures   .761             .746          .775          4.183   2712   10848   .000

Two-way random effects model where both people effects and measures effects are random.
a. The estimator is the same, whether the interaction effect is present or not.
b. Type C intraclass correlation coefficients using a consistency definition: the between-measure variance is excluded from the denominator variance.

* factor analysis for informally testing unidimensionality

FACTOR
  /VARIABLES E1.rev E2.rev E3 E4 E5
  /FORMAT SORT.


Communalities

         Initial   Extraction
E1.rev   1.000     .490
E2.rev   1.000     .608
E3       1.000     .477
E4       1.000     .575
E5       1.000     .414

Extraction Method: Principal Component Analysis.

Total Variance Explained

            Initial Eigenvalues                    Extraction Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1           2.565   51.298          51.298         2.565   51.298          51.298
2           .768    15.368          66.666
3           .643    12.851          79.517
4           .561    11.211          90.728
5           .464    9.272           100.000

Extraction Method: Principal Component Analysis.

Component Matrix(a)

         Component 1
E2.rev   .780
E4       .758
E1.rev   .700
E3       .691
E5       .644

Extraction Method: Principal Component Analysis.
a. 1 components extracted.

Rotated Component Matrix(a)
a. Only one component was extracted. The solution cannot be rotated.


Here's how you do it in SPSS: Intraclass Correlation Coefficient


Here's a great resource for the ICC: Computing Intraclass Correlations (ICC) as Estimates of Interrater Reliability in SPSS

* load Shrout and Fleiss data

GET DATA  /TYPE=TXT
  /FILE="/Users/nicholasmmichalak/Desktop/analysis-examples/Cronbach's alpha and ICC in R and SPSS/sf.csv"
  /ENCODING='Locale'
  /DELCASE=LINE
  /DELIMITERS=","
  /QUALIFIER='"'
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES=
subj F2.0
  J1 F2.0
  J2 F1.0
  J3 F1.0
  J4 F1.0.
CACHE.
EXECUTE.
DATASET NAME DataSet2 WINDOW=FRONT.


* ICC 2-way random effects and averages

RELIABILITY
  /VARIABLES=J1 J2 J3 J4
  /SCALE('ALL VARIABLES') ALL
  /MODEL=ALPHA
  /STATISTICS=DESCRIPTIVE SCALE CORR
  /SUMMARY=TOTAL MEANS VARIANCE CORR
  /ICC=MODEL(RANDOM) TYPE(CONSISTENCY) CIN=95 TESTVAL=0.

Case Processing Summary

                  N   %
Cases Valid       6   100.0
      Excluded(a) 0   .0
      Total       6   100.0

a. Listwise deletion based on all variables in the procedure.

Reliability Statistics

Cronbach's Alpha   Cronbach's Alpha Based on Standardized Items   N of Items
.909               .927                                           4

Item Statistics

     Mean   Std. Deviation   N
J1   7.67   1.633            6
J2   2.50   1.643            6
J3   4.33   1.633            6
J4   6.67   2.503            6

Inter-Item Correlation Matrix

     J1      J2      J3      J4
J1   1.000   .745    .725    .750
J2   .745    1.000   .894    .729
J3   .725    .894    1.000   .718
J4   .750    .729    .718    1.000

Summary Item Statistics

                          Mean    Minimum   Maximum   Range   Maximum / Minimum   Variance   N of Items
Item Means                5.292   2.500     7.667     5.167   3.067               5.414      4
Item Variances            3.575   2.667     6.267     3.600   2.350               3.220      4
Inter-Item Correlations   .760    .718      .894      .177    1.246               .004       4

Item-Total Statistics

     Scale Mean if   Scale Variance if   Corrected Item-     Squared Multiple   Cronbach's Alpha
     Item Deleted    Item Deleted        Total Correlation   Correlation        if Item Deleted
J1   13.50           28.300              .806                .651               .883
J2   18.67           27.467              .859                .824               .867
J3   16.83           27.767              .844                .812               .872
J4   14.50           20.700              .790                .635               .918

Scale Statistics

Mean    Variance   Std. Deviation   N of Items
21.17   44.967     6.706            4

Intraclass Correlation Coefficient

                   Intraclass       95% Confidence Interval     F Test with True Value 0
                   Correlation(b)   Lower Bound   Upper Bound   Value    df1   df2   Sig
Single Measures    .715(a)          .342          .946          11.027   5     15    .000
Average Measures   .909             .676          .986          11.027   5     15    .000

Two-way random effects model where both people effects and measures effects are random.
a. The estimator is the same, whether the interaction effect is present or not.
b. Type C intraclass correlation coefficients using a consistency definition: the between-measure variance is excluded from the denominator variance.

* ICC 2-way random effects and absolute agreement

RELIABILITY
  /VARIABLES=J1 J2 J3 J4
  /SCALE('ALL VARIABLES') ALL
  /ICC=MODEL(RANDOM) TYPE(ABSOLUTE) CIN=95 TESTVAL=0.

Case Processing Summary

                  N   %
Cases Valid       6   100.0
      Excluded(a) 0   .0
      Total       6   100.0

a. Listwise deletion based on all variables in the procedure.

Reliability Statistics

Cronbach's Alpha   N of Items
.909               4

Intraclass Correlation Coefficient

                   Intraclass       95% Confidence Interval     F Test with True Value 0
                   Correlation(b)   Lower Bound   Upper Bound   Value    df1   df2   Sig
Single Measures    .290(a)          .019          .761          11.027   5     15    .000
Average Measures   .620             .071          .927          11.027   5     15    .000

Two-way random effects model where both people effects and measures effects are random.
a. The estimator is the same, whether the interaction effect is present or not.
b. Type A intraclass correlation coefficients using an absolute agreement definition.

* ICC 1-way random effects and averages

RELIABILITY
  /VARIABLES=J1 J2 J3 J4
  /SCALE('ALL VARIABLES') ALL
  /ICC=MODEL(ONEWAY) TYPE(CONSISTENCY) CIN=95 TESTVAL=0.

Case Processing Summary

                  N   %
Cases Valid       6   100.0
      Excluded(a) 0   .0
      Total       6   100.0

a. Listwise deletion based on all variables in the procedure.

Reliability Statistics

Cronbach's Alpha   N of Items
.909               4

Intraclass Correlation Coefficient

                   Intraclass    95% Confidence Interval     F Test with True Value 0
                   Correlation   Lower Bound   Upper Bound   Value   df1   df2   Sig
Single Measures    .166          -.133         .723          1.795   5     18    .165
Average Measures   .443          -.884         .912          1.795   5     18    .165

One-way random effects model where people effects are random.

* ICC 1-way random effects and absolute agreement

RELIABILITY
  /VARIABLES=J1 J2 J3 J4
  /SCALE('ALL VARIABLES') ALL
  /ICC=MODEL(ONEWAY) TYPE(ABSOLUTE) CIN=95 TESTVAL=0.

Case Processing Summary

                  N   %
Cases Valid       6   100.0
      Excluded(a) 0   .0
      Total       6   100.0

a. Listwise deletion based on all variables in the procedure.

Reliability Statistics

Cronbach's Alpha   N of Items
.909               4

Intraclass Correlation Coefficient

                   Intraclass    95% Confidence Interval     F Test with True Value 0
                   Correlation   Lower Bound   Upper Bound   Value   df1   df2   Sig
Single Measures    .166          -.133         .723          1.795   5     18    .165
Average Measures   .443          -.884         .912          1.795   5     18    .165

One-way random effects model where people effects are random.

* ICC mixed effects and averages

RELIABILITY
  /VARIABLES=J1 J2 J3 J4
  /SCALE('ALL VARIABLES') ALL
  /ICC=MODEL(MIXED) TYPE(CONSISTENCY) CIN=95 TESTVAL=0.

Case Processing Summary

                  N   %
Cases Valid       6   100.0
      Excluded(a) 0   .0
      Total       6   100.0

a. Listwise deletion based on all variables in the procedure.

Reliability Statistics

Cronbach's Alpha   N of Items
.909               4

Intraclass Correlation Coefficient

                   Intraclass       95% Confidence Interval     F Test with True Value 0
                   Correlation(b)   Lower Bound   Upper Bound   Value    df1   df2   Sig
Single Measures    .715(a)          .342          .946          11.027   5     15    .000
Average Measures   .909(c)          .676          .986          11.027   5     15    .000

Two-way mixed effects model where people effects are random and measures effects are fixed.
a. The estimator is the same, whether the interaction effect is present or not.
b. Type C intraclass correlation coefficients using a consistency definition: the between-measure variance is excluded from the denominator variance.
c. This estimate is computed assuming the interaction effect is absent, because it is not estimable otherwise.

* ICC mixed effects and absolute agreement

RELIABILITY
  /VARIABLES=J1 J2 J3 J4
  /SCALE('ALL VARIABLES') ALL
  /ICC=MODEL(MIXED) TYPE(ABSOLUTE) CIN=95 TESTVAL=0.

Case Processing Summary

                  N   %
Cases Valid       6   100.0
      Excluded(a) 0   .0
      Total       6   100.0

a. Listwise deletion based on all variables in the procedure.

Reliability Statistics

Cronbach's Alpha   N of Items
.909               4

Intraclass Correlation Coefficient

                   Intraclass       95% Confidence Interval     F Test with True Value 0
                   Correlation(b)   Lower Bound   Upper Bound   Value    df1   df2   Sig
Single Measures    .290(a)          .019          .761          11.027   5     15    .000
Average Measures   .620(c)          .071          .927          11.027   5     15    .000

Two-way mixed effects model where people effects are random and measures effects are fixed.
a. The estimator is the same, whether the interaction effect is present or not.
b. Type A intraclass correlation coefficients using an absolute agreement definition.
c. This estimate is computed assuming the interaction effect is absent, because it is not estimable otherwise.

Here's how you do it in R: Intraclass Correlation Coefficient

Intraclass Correlations (ICC1, ICC2, ICC3 from Shrout and Fleiss) from the psych package


The intraclass correlation is used as a measure of association when studying the reliability of raters. Shrout and Fleiss (1979) outline 6 different estimates that depend upon the particular experimental design. All are implemented and given confidence limits.
# data matrix: column 1 is the subject ID, columns 2-5 are the four judges
sf <- matrix(c(1,  9, 2, 5, 8,
               2,  6, 1, 3, 2,
               3,  8, 4, 6, 8,
               4,  7, 1, 2, 6,
               5, 10, 5, 6, 9,
               6,  6, 2, 4, 7),
             ncol = 5,
             byrow = TRUE)

# add column names
colnames(x = sf) <- c("subj",paste("J", 1:4,sep = ""))

# run ICC
ICC(x = sf[,2:5],
    missing = TRUE,
    alpha = .05)
## Call: ICC(x = sf[, 2:5], missing = TRUE, alpha = 0.05)
## 
## Intraclass correlation coefficients 
##                          type  ICC    F df1 df2       p lower bound
## Single_raters_absolute   ICC1 0.17  1.8   5  18 0.16477      -0.133
## Single_random_raters     ICC2 0.29 11.0   5  15 0.00013       0.019
## Single_fixed_raters      ICC3 0.71 11.0   5  15 0.00013       0.342
## Average_raters_absolute ICC1k 0.44  1.8   5  18 0.16477      -0.884
## Average_random_raters   ICC2k 0.62 11.0   5  15 0.00013       0.071
## Average_fixed_raters    ICC3k 0.91 11.0   5  15 0.00013       0.676
##                         upper bound
## Single_raters_absolute         0.72
## Single_random_raters           0.76
## Single_fixed_raters            0.95
## Average_raters_absolute        0.91
## Average_random_raters          0.93
## Average_fixed_raters           0.99
## 
##  Number of subjects = 6     Number of Judges =  4

The alpha() function's Cronbach's alpha ~= ICC() function's ICC3k (average_fixed_raters)

ICC3: A fixed set of k judges rate each target. There is no generalization to a larger population of judges. ICC3 = (MSB - MSE) / (MSB + (nr - 1) * MSE)

alpha (Cronbach, 1951) is the same as Guttman's lambda3 (Guttman, 1945) and may be found by

lambda3 = (n / (n - 1)) * (1 - tr(Vx) / Vx) = (n / (n - 1)) * ((Vx - tr(Vx)) / Vx) = alpha
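To connect those formulas back to the SPSS and ICC() output, here's a sketch (mine, not from the original post) that computes ICC3 and ICC3k by hand from the two-way ANOVA mean squares on the Shrout and Fleiss ratings. If I've set it up right, the results should match the .715 (Single_fixed_raters) and .909 (Average_fixed_raters) printed above.

```r
# Shrout & Fleiss ratings: 6 targets rated by 4 judges
ratings <- matrix(c(9, 2, 5, 8,
                    6, 1, 3, 2,
                    8, 4, 6, 8,
                    7, 1, 2, 6,
                   10, 5, 6, 9,
                    6, 2, 4, 7),
                  ncol = 4, byrow = TRUE)

# long format for the two-way ANOVA (targets x judges)
long <- data.frame(rating = as.vector(ratings),
                   subj   = factor(rep(1:6, times = 4)),
                   judge  = factor(rep(1:4, each = 6)))

# mean squares, in the order: subj, judge, Residuals
ms  <- summary(aov(rating ~ subj + judge, data = long))[[1]][["Mean Sq"]]
msb <- ms[1]    # between-targets mean square
mse <- ms[3]    # residual mean square
k   <- 4        # number of judges

(msb - mse) / (msb + (k - 1) * mse)   # ICC3, single fixed rater
(msb - mse) / msb                     # ICC3k, average of k fixed raters (= alpha)
```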

# alpha on the ICC formatted data
psych::alpha(x = sf[,2:5])
## 
## Reliability analysis   
## Call: psych::alpha(x = sf[, 2:5])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean  sd
##       0.91      0.93    0.92      0.76  13 0.057  5.3 1.7
## 
##  lower alpha upper     95% confidence boundaries
## 0.8 0.91 1.02 
## 
##  Reliability if an item is dropped:
##    raw_alpha std.alpha G6(smc) average_r  S/N alpha se
## J1      0.88      0.91    0.89      0.78 10.7    0.080
## J2      0.87      0.89    0.85      0.73  8.1    0.083
## J3      0.87      0.90    0.85      0.74  8.6    0.080
## J4      0.92      0.92    0.90      0.79 11.2    0.060
## 
##  Item statistics 
##    n raw.r std.r r.cor r.drop mean  sd
## J1 6  0.88  0.89  0.83   0.81  7.7 1.6
## J2 6  0.92  0.93  0.92   0.86  2.5 1.6
## J3 6  0.91  0.92  0.91   0.84  4.3 1.6
## J4 6  0.91  0.88  0.82   0.79  6.7 2.5
## 
## Non missing response frequency for each item
##       1    2    3    4    5    6    7    8    9   10 miss
## J1 0.00 0.00 0.00 0.00 0.00 0.33 0.17 0.17 0.17 0.17    0
## J2 0.33 0.33 0.00 0.17 0.17 0.00 0.00 0.00 0.00 0.00    0
## J3 0.00 0.17 0.17 0.17 0.17 0.33 0.00 0.00 0.00 0.00    0
## J4 0.00 0.17 0.00 0.00 0.00 0.17 0.17 0.33 0.17 0.00    0

Write data to .csv files for use in SPSS


# set working directory
setwd("~/Desktop/analysis-examples/Cronbach's alpha and ICC in R and SPSS/")

# SPSS reads NAs and turns the entire variable into a string. It's dumb. So I make NAs into 9999 (a number)
# in SPSS syntax, I code 9999 as missing
bfi.spss <- bfi
bfi.spss[is.na(bfi.spss)] <- 9999

# BFI data
write.csv(x = bfi.spss,
          file = "bfi.spss.csv",
          row.names = FALSE)

# Shrout and Fleiss example
write.csv(x = sf,
          file = "sf.csv",
          row.names = FALSE)


The problem(s) with coefficient alpha


I went through all of that R code and SPSS syntax to tell you this: you probably shouldn't rely on or report coefficient alpha, or, if you do, you should run some other reliability and related analyses as well. Here's why.


Coefficient alpha tells you about the variance shared by all the items used in a composite score. The calculation of alpha depends on the spread of the inter-item correlations, so the point estimate will reflect that spread regardless of its source. The spread could come from measurement error, or more than one factor dimension could underlie the scale; coefficient alpha can't tell the difference.

Coefficient alpha will equal reliability only under conditions that are rarely met in practice:

  1. Residuals don't correlate.
  2. Items have identical loadings (tau-equivalence).
  3. The scale is unidimensional.

When these conditions are not met, coefficient alpha may underestimate or overestimate reliability.
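Here's a small simulation (mine, not the post's) of the second condition: items that load unequally on a single factor. The standardized alpha formula from earlier in the post comes out below the true reliability of the sum score.

```r
# simulate a unidimensional scale whose items have unequal loadings
set.seed(2)
n <- 50000
g <- rnorm(n)                      # one common factor
loadings <- c(.9, .7, .5, .3)      # deliberately unequal
items <- sapply(loadings, function(l) l * g + rnorm(n, sd = sqrt(1 - l^2)))

# true reliability of the unit-weighted sum score
true_rel <- sum(loadings)^2 / (sum(loadings)^2 + sum(1 - loadings^2))

# standardized alpha from the average inter-item correlation
r.bar <- mean(cor(items)[lower.tri(cor(items))])
alpha <- ncol(items) * r.bar / (1 + (ncol(items) - 1) * r.bar)

# alpha should come out around .68 here, under the true reliability of about .71
round(c(true = true_rel, alpha = alpha), 2)
```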

Here's one particularly damning quote from Zinbarg, Revelle, Yovel, & Li (2005):

Equation (10) and its reduced forms in Cases I, II and III show that α is multiply determined under all but the most highly restrictive of conditions — it often reflects not only general factor saturation but also group factor saturation and even variability in factor loadings. Given such complexity, one could question whether the meaning of α is terribly clear in many conditions.

Here's Cronbach himself in Cronbach & Shavelson (2004):

Coefficients are a crude device that does not bring to the surface many subtleties implied by variance components.

OK, but we still want a coefficient that tells us our scale is reliable (or not) and unidimensional (or not). Luckily, some smart statisticians made some more coefficients and generally agreed that one is best for unidimensional measures: omega.

Source: user chl's response to a StackExchange question


Here's how you do it in R: Omega

McDonald's omega estimates of general and total factor saturation

McDonald has proposed coefficient omega as an estimate of the general factor saturation of a test. One way to find omega is to do a factor analysis of the original data set, rotate the factors obliquely, do a Schmid Leiman transformation, and then find omega. This function estimates omega as suggested by McDonald by using hierarchical factor analysis (following Jensen).

# with psych package
psych::omega(m = bfi[,colnames(bfi) %in% extraversion.items],
             key = c(-1, -1, 1, 1, 1),
             nfactors = 3,
             n.iter = 100,
             plot = FALSE)
## Loading required namespace: GPArotation
## Omega 
## Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip, 
##     digits = digits, title = title, sl = sl, labels = labels, 
##     plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option, 
##     covar = covar)
## Alpha:                 0.76 
## G.6:                   0.73 
## Omega Hierarchical:    0.69 
## Omega H asymptotic:    0.87 
## Omega Total            0.8 
## 
## Schmid Leiman Factor loadings greater than  0.2 
##        g   F1*  F2*   F3*   h2   u2   p2
## E1- 0.56                  0.36 0.64 0.89
## E2- 0.73  0.30            0.62 0.38 0.85
## E3  0.53       0.37       0.44 0.56 0.65
## E4  0.65             0.36 0.58 0.42 0.74
## E5  0.50       0.36       0.39 0.61 0.63
## 
## With eigenvalues of:
##    g  F1*  F2*  F3* 
## 1.81 0.14 0.28 0.16 
## 
## general/max  6.45   max/min =   1.96
## mean percent general =  0.75    with sd =  0.12 and cv of  0.15 
## Explained Common Variance of the general factor =  0.76 
## 
## The degrees of freedom are -2  and the fit is  0 
## The number of observations was  2800  with Chi Square =  0  with prob <  NA
## The root mean square of the residuals is  0 
## The df corrected root mean square of the residuals is  NA
## 
## Compare this with the adequacy of just a general factor and no group factors
## The degrees of freedom for just the general factor are 5  and the fit is  0.04 
## The number of observations was  2800  with Chi Square =  113.96  with prob <  6e-23
## The root mean square of the residuals is  0.05 
## The df corrected root mean square of the residuals is  0.07 
## 
## RMSEA index =  0.088  and the 90 % confidence intervals are  0.075 0.103
## BIC =  74.27 
## 
## Measures of factor score adequacy             
##                                                  g   F1*   F2*   F3*
## Correlation of scores with factors            0.85  0.35  0.50  0.47
## Multiple R square of scores with factors      0.72  0.12  0.25  0.22
## Minimum correlation of factor score estimates 0.44 -0.76 -0.50 -0.56
## 
##  Total, General and Subset omega for each subset
##                                                  g  F1*  F2*  F3*
## Omega total for total scores and subscales    0.80 0.65 0.58 0.56
## Omega general for total scores and subscales  0.69 0.57 0.38 0.43
## Omega group for total scores and subscales    0.07 0.08 0.20 0.13
## 
##  Estimates and bootstrapped confidence intervals
##           lower estimate upper
## omega_h    0.55     0.69  0.70
## alpha      0.68     0.76  0.76
## omega_tot  0.76     0.80  0.84
## G6         0.67     0.73  0.74
## omega_lim  0.68     0.87  0.89

Estimating omega (hierarchical and total) using ci.reliability() from the MBESS package


MBESS::ci.reliability(data = data.frame(reverse.code(items = na.omit(bfi[,colnames(bfi) %in% extraversion.items]),
                                                     keys = c(-1, -1, 1, 1, 1))),
                      type = "hierarchical",
                      interval.type = "none")
## $est
## [1] 0.7668617
## 
## $se
## [1] NA
## 
## $ci.lower
## [1] NA
## 
## $ci.upper
## [1] NA
## 
## $conf.level
## [1] 0.95
## 
## $type
## [1] "hierarchical omega"
## 
## $interval.type
## [1] "none"
MBESS::ci.reliability(data = data.frame(reverse.code(items = na.omit(bfi[,colnames(bfi) %in% extraversion.items]),
                                                     keys = c(-1, -1, 1, 1, 1))),
                      type = "omega",
                      interval.type = "none")
## $est
## [1] 0.7673336
## 
## $se
## [1] NA
## 
## $ci.lower
## [1] NA
## 
## $ci.upper
## [1] NA
## 
## $conf.level
## [1] 0.95
## 
## $type
## [1] "omega"
## 
## $interval.type
## [1] "none"



Final thoughts

I'm genuinely confused why ci.reliability() essentially returns the same estimate as alpha in other functions. If anyone knows why this is the case, please email me or post your ideas in the comments.

There is a lot of code and commentary in this post, mostly because reliability turned out to be more complicated than I thought. You can't just compute alpha and dust off your hands.


Use some sort of dimension reduction technique (e.g., principal components analysis) to check whether the first eigenvalue is large and the remaining eigenvalues aren't; that pattern suggests a unidimensional scale.


Source: SPSS FAQ What does Cronbach's alpha mean? 


If the scale you're using is well-developed, compute its alpha and omega, and treat them like point estimates (because they are). That is, report their confidence intervals.


If it's not well-developed, then you have some factor analyses and some theory-guided decisions to make that go beyond the scope of this post. In any case...


Happy R

References



  • Cronbach, L. J., & Shavelson, R. J. (2004). My current thoughts on coefficient alpha and successor procedures. Educational and Psychological Measurement, 64(3), 391-418.
  • Dunn, T. J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105(3), 399-412.
  • Zinbarg, R. E., Revelle, W., Yovel, I., & Li, W. (2005). Cronbach's α, Revelle's β, and McDonald's ωH: Their relations with each other and two alternative conceptualizations of reliability. Psychometrika, 70(1), 123-133.
