SEM model with multiple mediators and multiple independent variables in lavaan - r

I have cross-sectional data and I am trying to specify a model with multiple mediations.
My independent variable (IV) is measured by a tool with 24 items that make up 5 subscales (latent variables), which in turn load onto a "higher-order" total factor. I am trying to estimate this in two different ways: (1) a five-factor model (without a higher-order factor) in which the 5 subscale factors (FNR, FOB, ..., FAA) are allowed to correlate and have their variances fixed to 1, and (2) a higher-order model with a TOTAL latent variable made up of those 5 subscales.
The structural model has an IV (FTOTAL), 4 mediators (ER1, ER2, ER3 and ER4), and one DV (PH). The effect of the IV on ER4 is itself mediated by one of the other mediators (ER3), i.e., a serial mediation. I was able to specify the SEM with the higher-order TOTAL factor without a problem; the code is shown below:
"#Measurements model
FNR =~ FNR1 + FNR2 + FNR3 +FNR4 +FNR5
FOB =~ FOB1 + FOB2 +FOB3 +FOB4
FDS =~ FDS1 +FDS2 +FDS3 + FDS4 + FDS5
FNJ =~ FNJ1 + FNJ2 + FNJ3 +FNJ4 + FNJ5
FAA =~ FAA1 + FAA2 +FAA3 + FAA4 +FAA5
FTOTAL =~ FNR + FOB + FDS + FNJ+ FAA
#Regressions
ER3~ a*FTOTAL
ER4~ b*RSTOTAL +FTOTAL
ER1 ~ u1*FTOTAL
ER2 ~ u2*FTOTAL
PHQTOTAL ~ z1*ER1 + z2*ER2 + d*FTOTAL + c*ER4 + ER3
indirect1 := u1 * z1
indirect2 := u2 * z2
indirect3 := a*b*c
total := d + (u1 * z1) + (u2 * z2) + a*b*c
#Residual correlations
CRTOTAL~~SUPTOTAL+SLEEPTOTAL+RSTOTAL
SUPTOTAL~~SLEEPTOTAL+RSTOTAL
"
fitPHtotal <- sem(model = multipleMediationPH, data = SEMDATA, std.lv=TRUE)
summary(fitPHtotal)
However, I cannot figure out how to specify the model with all 5 subscales as independent variables in the same model. I tried following the same logic but including the 5 IVs in the model, and it did not work. Could anyone suggest a solution? Or is the only way to run 5 different models, each with one subscale as the IV?
Thank you in advance for the help.

Since you did not provide data, I will show you how to do it using the HolzingerSwineford1939 dataset, which comes with the lavaan package.
First, a mediation using a second-order latent factor (3 first-order factors):
library(lavaan)
#> This is lavaan 0.6-8
#> lavaan is FREE software! Please report any bugs.
model_2L <- "
visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
higher =~ visual + textual + speed
#grade will be your Y
#higher order latent factor will be your X
#agemo will be your M
grade ~ c*higher + b*agemo
agemo ~ a*higher
# indirect effect (a*b)
ab := a*b
# total effect
total := c + (a*b)
"
fit_2L <- sem(model = model_2L, data = HolzingerSwineford1939)
summary(object = fit_2L, std=T)
#> lavaan 0.6-8 ended normally after 48 iterations
#>
#> Estimator ML
#> Optimization method NLMINB
#> Number of model parameters 26
#>
#> Used Total
#> Number of observations 300 301
#>
#> Model Test User Model:
#>
#> Test statistic 116.110
#> Degrees of freedom 40
#> P-value (Chi-square) 0.000
#>
#> Parameter Estimates:
#>
#> Standard errors Standard
#> Information Expected
#> Information saturated (h1) model Structured
#>
#> Latent Variables:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> visual =~
#> x1 1.000 0.849 0.727
#> x2 0.621 0.109 5.680 0.000 0.527 0.448
#> x3 0.824 0.124 6.641 0.000 0.699 0.619
#> textual =~
#> x4 1.000 0.990 0.851
#> x5 1.117 0.066 16.998 0.000 1.106 0.859
#> x6 0.922 0.056 16.563 0.000 0.913 0.834
#> speed =~
#> x7 1.000 0.648 0.595
#> x8 1.130 0.148 7.612 0.000 0.732 0.726
#> x9 1.010 0.135 7.465 0.000 0.655 0.649
#> higher =~
#> visual 1.000 0.673 0.673
#> textual 0.849 0.185 4.586 0.000 0.490 0.490
#> speed 0.810 0.179 4.519 0.000 0.714 0.714
#>
#> Regressions:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> grade ~
#> higher (c) 0.421 0.089 4.730 0.000 0.241 0.482
#> agemo (b) -0.004 0.008 -0.519 0.604 -0.004 -0.029
#> agemo ~
#> higher (a) 0.322 0.469 0.687 0.492 0.184 0.053
#>
#> Variances:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> .x1 0.641 0.110 5.822 0.000 0.641 0.471
#> .x2 1.108 0.102 10.848 0.000 1.108 0.799
#> .x3 0.786 0.094 8.398 0.000 0.786 0.616
#> .x4 0.373 0.048 7.750 0.000 0.373 0.276
#> .x5 0.436 0.058 7.453 0.000 0.436 0.263
#> .x6 0.364 0.044 8.369 0.000 0.364 0.304
#> .x7 0.767 0.080 9.629 0.000 0.767 0.646
#> .x8 0.482 0.070 6.924 0.000 0.482 0.474
#> .x9 0.589 0.068 8.686 0.000 0.589 0.579
#> .grade 0.192 0.020 9.767 0.000 0.192 0.768
#> .agemo 11.881 0.972 12.220 0.000 11.881 0.997
#> .visual 0.394 0.111 3.535 0.000 0.547 0.547
#> .textual 0.745 0.101 7.397 0.000 0.760 0.760
#> .speed 0.206 0.062 3.312 0.001 0.490 0.490
#> higher 0.327 0.097 3.375 0.001 1.000 1.000
#>
#> Defined Parameters:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> ab -0.001 0.004 -0.366 0.715 -0.001 -0.002
#> total 0.420 0.089 4.728 0.000 0.240 0.481
Second, a mediation using the three first-order factors directly. Three indirect effects and three total effects are estimated:
library(lavaan)
model_1L <- "
visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
#grade will be your Y
#the three first-order factors will be your X (i.e., your subscales)
#agemo will be your M
grade ~ c1*visual + c2*textual + c3*speed + b*agemo
agemo ~ a1*visual + a2*textual + a3*speed
# indirect effect (a*b)
a1b := a1*b
a2b := a2*b
a3b := a3*b
# total effect
total1 := c1 + (a1*b)
total2 := c2 + (a2*b)
total3 := c3 + (a3*b)
"
fit_1L <- sem(model = model_1L, data = HolzingerSwineford1939)
summary(object = fit_1L, std=T)
#> lavaan 0.6-8 ended normally after 55 iterations
#>
#> Estimator ML
#> Optimization method NLMINB
#> Number of model parameters 30
#>
#> Used Total
#> Number of observations 300 301
#>
#> Model Test User Model:
#>
#> Test statistic 101.925
#> Degrees of freedom 36
#> P-value (Chi-square) 0.000
#>
#> Parameter Estimates:
#>
#> Standard errors Standard
#> Information Expected
#> Information saturated (h1) model Structured
#>
#> Latent Variables:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> visual =~
#> x1 1.000 0.904 0.775
#> x2 0.555 0.100 5.564 0.000 0.501 0.426
#> x3 0.724 0.109 6.657 0.000 0.655 0.580
#> textual =~
#> x4 1.000 0.993 0.853
#> x5 1.108 0.065 17.017 0.000 1.101 0.855
#> x6 0.921 0.055 16.667 0.000 0.915 0.836
#> speed =~
#> x7 1.000 0.668 0.613
#> x8 1.115 0.142 7.840 0.000 0.744 0.737
#> x9 0.945 0.125 7.540 0.000 0.631 0.625
#>
#> Regressions:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> grade ~
#> visual (c1) 0.012 0.048 0.246 0.806 0.011 0.021
#> textual (c2) 0.048 0.035 1.376 0.169 0.047 0.095
#> speed (c3) 0.295 0.063 4.689 0.000 0.197 0.394
#> agemo (b) -0.003 0.008 -0.361 0.718 -0.003 -0.020
#> agemo ~
#> visual (a1) 0.354 0.355 0.996 0.319 0.320 0.093
#> textual (a2) -0.233 0.256 -0.912 0.362 -0.231 -0.067
#> speed (a3) 0.098 0.421 0.232 0.817 0.065 0.019
#>
#> Covariances:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> visual ~~
#> textual 0.412 0.074 5.565 0.000 0.459 0.459
#> speed 0.265 0.058 4.554 0.000 0.438 0.438
#> textual ~~
#> speed 0.180 0.052 3.448 0.001 0.271 0.271
#>
#> Variances:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> .x1 0.545 0.115 4.747 0.000 0.545 0.400
#> .x2 1.135 0.102 11.115 0.000 1.135 0.819
#> .x3 0.846 0.091 9.322 0.000 0.846 0.664
#> .x4 0.368 0.048 7.698 0.000 0.368 0.272
#> .x5 0.447 0.058 7.657 0.000 0.447 0.270
#> .x6 0.361 0.043 8.343 0.000 0.361 0.301
#> .x7 0.741 0.079 9.422 0.000 0.741 0.624
#> .x8 0.465 0.069 6.724 0.000 0.465 0.456
#> .x9 0.620 0.067 9.217 0.000 0.620 0.609
#> .grade 0.201 0.018 11.307 0.000 0.201 0.806
#> .agemo 11.813 0.969 12.191 0.000 11.813 0.991
#> visual 0.817 0.147 5.564 0.000 1.000 1.000
#> textual 0.986 0.113 8.752 0.000 1.000 1.000
#> speed 0.446 0.091 4.906 0.000 1.000 1.000
#>
#> Defined Parameters:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> a1b -0.001 0.003 -0.344 0.731 -0.001 -0.002
#> a2b 0.001 0.002 0.335 0.738 0.001 0.001
#> a3b -0.000 0.002 -0.183 0.855 -0.000 -0.000
#> total1 0.011 0.048 0.226 0.821 0.010 0.020
#> total2 0.048 0.035 1.399 0.162 0.048 0.096
#> total3 0.295 0.063 4.685 0.000 0.197 0.394
Created on 2021-03-30 by the reprex package (v1.0.0)
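Mapping the second example back onto the variable names in your question, a sketch of the five-subscale version could look like the following. This is only a template: the path labels (u11, a1, d1, ...) and defined-effect names are placeholders I made up, it assumes your mediators and DV are observed total scores, and it keeps the serial ER3 -> ER4 path from your original model. Adjust the labels and defined effects to the paths you actually want to test.
model_5IV <- "
# Measurement model: five correlated first-order factors (no higher-order factor)
FNR =~ FNR1 + FNR2 + FNR3 + FNR4 + FNR5
FOB =~ FOB1 + FOB2 + FOB3 + FOB4
FDS =~ FDS1 + FDS2 + FDS3 + FDS4 + FDS5
FNJ =~ FNJ1 + FNJ2 + FNJ3 + FNJ4 + FNJ5
FAA =~ FAA1 + FAA2 + FAA3 + FAA4 + FAA5
# Structural model: each mediator regressed on all five subscale factors
ER1 ~ u11*FNR + u12*FOB + u13*FDS + u14*FNJ + u15*FAA
ER2 ~ u21*FNR + u22*FOB + u23*FDS + u24*FNJ + u25*FAA
ER3 ~ a1*FNR + a2*FOB + a3*FDS + a4*FNJ + a5*FAA
ER4 ~ b*ER3 + FNR + FOB + FDS + FNJ + FAA
# Outcome
PHQTOTAL ~ z1*ER1 + z2*ER2 + c*ER4 + ER3 + d1*FNR + d2*FOB + d3*FDS + d4*FNJ + d5*FAA
# Example defined effects for the FNR subscale (repeat the pattern for the other subscales)
ind_FNR_ER1 := u11*z1
ind_FNR_ER2 := u21*z2
ind_FNR_ER3_ER4 := a1*b*c
total_FNR := d1 + u11*z1 + u21*z2 + a1*b*c
"
fit_5IV <- sem(model = model_5IV, data = SEMDATA, std.lv = TRUE)
summary(fit_5IV, standardized = TRUE)
With std.lv = TRUE the five subscale factors are standardized and, being exogenous latent variables, are allowed to correlate by default. If you want confidence intervals for the defined indirect effects, you can also refit with se = "bootstrap" in the sem() call.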

Related

CFA in Lavaan is not setting factor loading to 1

I am running a CFA in R using the lavaan package for the first time. I have everything up and running; however, for some reason none of my factor loadings are set to 1 like they are supposed to be. I want to know why lavaan isn't automatically setting one of the loadings to 1 for each of the factors.
This is the code I used:
library(lavaan)
library(lavaanPlot)

model1 <- 'comm =~ relimport + relthink + relhurt
           Ind =~ attend + prayer + relread
           relimport ~~ relthink'
fit1 <- cfa(model1, data = SIM1, std.lv = TRUE)
summary(fit1, ci = TRUE, standardized = TRUE, fit.measures = TRUE)
modindices(fit1, minimum.value = 10, sort = TRUE)
lavaanPlot(model = fit1, node_options = list(shape = "box", fontname = "Helvetica"),
           edge_options = list(color = "grey"), coefs = TRUE, stand = TRUE)
Here is my output:
lavaan 0.6.13 ended normally after 30 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 14
Used Total
Number of observations 796 1769
Model Test User Model:
Test statistic 2.707
Degrees of freedom 7
P-value (Chi-square) 0.911
Model Test Baseline Model:
Test statistic 1394.558
Degrees of freedom 15
P-value 0.000
User Model versus Baseline Model:
Comparative Fit Index (CFI) 1.000
Tucker-Lewis Index (TLI) 1.007
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -7374.779
Loglikelihood unrestricted model (H1) -7373.425
Akaike (AIC) 14777.558
Bayesian (BIC) 14843.072
Sample-size adjusted Bayesian (SABIC) 14798.615
Root Mean Square Error of Approximation:
RMSEA 0.000
90 Percent confidence interval - lower 0.000
90 Percent confidence interval - upper 0.017
P-value H_0: RMSEA <= 0.050 1.000
P-value H_0: RMSEA >= 0.080 0.000
Standardized Root Mean Square Residual:
SRMR 0.008
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper   Std.lv  Std.all
  comm =~
    relimport         0.796    0.050   15.875    0.000    0.698    0.894    0.796    0.926
    relthink          0.735    0.062   11.784    0.000    0.613    0.857    0.735    0.672
    relhurt           0.660    0.061   10.827    0.000    0.540    0.779    0.660    0.455
  Ind =~
    attend            0.685    0.048   14.408    0.000    0.591    0.778    0.685    0.523
    prayer            1.605    0.065   24.794    0.000    1.478    1.732    1.605    0.844
    relread           1.134    0.052   21.960    0.000    1.033    1.235    1.134    0.757

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper   Std.lv  Std.all
 .relimport ~~
   .relthink         -0.007    0.069   -0.104    0.917   -0.143    0.129   -0.007   -0.027
  comm ~~
    Ind               0.609    0.043   14.108    0.000    0.525    0.694    0.609    0.609

Variances:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper   Std.lv  Std.all
   .relimport         0.106    0.071    1.489    0.137   -0.033    0.245    0.106    0.143
   .relthink          0.658    0.084    7.874    0.000    0.494    0.822    0.658    0.549
   .relhurt           1.668    0.097   17.268    0.000    1.479    1.857    1.668    0.793
   .attend            1.242    0.068   18.253    0.000    1.109    1.376    1.242    0.726
   .prayer            1.040    0.125    8.286    0.000    0.794    1.286    1.040    0.288
   .relread           0.955    0.075   12.676    0.000    0.807    1.103    0.955    0.426
    comm              1.000                               1.000    1.000    1.000    1.000
    Ind               1.000                               1.000    1.000    1.000    1.000
I figured out that std.lv = TRUE tells lavaan to fix the latent variances to 1 instead of fixing the first factor loading of each factor to 1. So problem fixed!
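For comparison, here is a minimal sketch of the two identification choices, assuming the same model1 and SIM1 data from above: lavaan's default marker-variable approach fixes the first loading of each factor to 1, while std.lv = TRUE frees all loadings and fixes the latent variances to 1 instead.
library(lavaan)

# Default (marker-variable) identification: first loading per factor fixed to 1
fit_marker <- cfa(model1, data = SIM1)

# Standardized-latent identification: all loadings free, latent variances fixed to 1
fit_stdlv <- cfa(model1, data = SIM1, std.lv = TRUE)

# The two parameterizations are equivalent and give the same model fit
fitMeasures(fit_marker, c("chisq", "df"))
fitMeasures(fit_stdlv, c("chisq", "df"))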

Generic solution update to sort multiple columns and filter a cut.off

Hi, here is an update proposal for the solution presented by Sam Dickson.
Can anyone contribute a solution that gets the output closer to the [Expected output] and uses dplyr functions to generalize it?
data.frame(
RC1=c(0.902,0.9,0.899,0.825,0.802,0.745,0.744,0.74,0.382,0.356,0.309,0.295,0.194,0.162,0.162,0.156,0.153,0.147,0.144,0.142,0.123,0.113,0.098,0.062),
RC2=c(0.206,0.282,0.133,0.057,0.091,0.243,-0.068,0.105,0.143,0.173,0.329,0.683,0.253,0.896,-0.155,-0.126,0.06,-0.158,0.952,0.932,-0.077,-0.062,0.322,-0.065),
RC3=c(0.153,-0.029,0.093,0.138,0.289,0.071,0.413,-0.011,-0.069,0.181,0.123,-0.035,0.807,0.104,-0.044,0.504,0.15,-0.004,-0.013,0.106,0.785,-0.053,0.751,0.858),
RC4=c(0.078,0.05,0.219,0.216,0.218,0.114,0.122,0.249,0.726,0.108,0.725,-0.089,0.249,0.146,0.622,-0.189,0.099,0.406,0.05,0.026,-0.018,-0.095,0.007,-0.118),
RC5=c(0.217,0.021,-0.058,0.166,0.352,0.09,0.26,-0.354,0.065,-0.014,0.064,0.359,0.134,-0.114,0.212,0.178,0.878,0.71,-0.019,-0.021,0.015,-0.055,0.165,-0.074),
RC6=c(0.027,-0.007,0.087,0.104,0.045,0.319,0.296,0.205,0.088,0.816,0.229,0.302,0.163,0.059,-0.256,0.604,-0.07,0.394,-0.02,-0.041,0.071,-0.008,0.219,-0.068),
RC7=c(-0.015,-0.15,0.073,0.126,0.06,0.347,0.082,-0.093,-0.155,0.093,-0.045,-0.175,-0.021,0.004,0.052,-0.184,-0.054,-0.008,0.012,-0.004,0.094,0.951,-0.001,-0.118))->df
row.names(df)<- c("X5","X12","X13","X2","X6","X4","X3","X11","X15","X10","X16","X8","X20","X19","X17","X21","X9","X7","X22","X24","X1","X14","X23","X18")
# Sam Dickson's approach: order rows by the first column with a loading >= 0.4,
# then by the size of that loading
ord1 <- apply(as.matrix(df), 1, function(x) min(which(abs(x) >= 0.4), ncol(df)))
ord2 <- df[cbind(1:nrow(df), ord1)]
df[order(ord1, -abs(ord2)), ]
# Logical matrix flagging loadings above the 0.4 cut-off
df1 <- df[ , ] > 0.4
row.names(df1) <- c("X5","X12","X13","X2","X6","X4","X3","X11","X15","X10","X16","X8","X20","X19","X17","X21","X9","X7","X22","X24","X1","X14","X23","X18")
df1
# Blank out loadings below the cut-off
df[df[ , ] < 0.4] <- ""
df
Output:
RC1 RC2 RC3 RC4 RC5 RC6 RC7
X5 0.902
X12 0.9
X13 0.899
X2 0.825
X6 0.802
X4 0.745
X3 0.744 0.413
X11 0.74
X15 0.726
X10 0.816
X16 0.725
X8 0.683
X20 0.807
X19 0.896
X17 0.622
X21 0.504 0.604
X9 0.878
X7 0.406 0.71
X22 0.952
X24 0.932
X1 0.785
X14 0.951
X23 0.751
X18 0.858
Expected output:
Now that the question is cleared up, I think this does what you want:
library(dplyr)
df %>%
  mutate(across(everything(),
                ~ ifelse(. < 0.4, "", format(., digits = 3)))) %>%
  arrange(across(everything(), desc))
# RC1 RC2 RC3 RC4 RC5 RC6 RC7
# 1 0.902
# 2 0.900
# 3 0.899
# 4 0.825
# 5 0.802
# 6 0.745
# 7 0.744 0.413
# 8 0.740
# 9 0.952
# 10 0.932
# 11 0.896
# 12 0.683
# 13 0.858
# 14 0.807
# 15 0.785
# 16 0.751
# 17 0.504 0.604
# 18 0.726
# 19 0.725
# 20 0.622
# 21 0.406 0.710
# 22 0.878
# 23 0.816
# 24 0.951
library(tidyverse)
data <-
data.frame(
id=c("X5","X12","X13","X2","X6", "X4","X3","X11","X15","X10","X16","X8","X20","X19","X17","X21","X9","X7","X22","X24","X1","X14","X23","X18"),
RC1=c(0.902,0.9,0.899,0.825,0.802,0.745,0.744,0.74,0.382,0.356,0.309,0.295,0.194,0.162,0.162,0.156,0.153,0.147,0.144,0.142,0.123,0.113,0.098,0.062),
RC2=c(0.206,0.282,0.133,0.057,0.091,0.243,-0.068,0.105,0.143,0.173,0.329,0.683,0.253,0.896,-0.155,-0.126,0.06,-0.158,0.952,0.932,-0.077,-0.062,0.322,-0.065),
RC3=c(0.153,-0.029,0.093,0.138,0.289,0.071,0.413,-0.011,-0.069,0.181,0.123,-0.035,0.807,0.104,-0.044,0.504,0.15,-0.004,-0.013,0.106,0.785,-0.053,0.751,0.858),
RC4=c(0.078,0.05,0.219,0.216,0.218,0.114,0.122,0.249,0.726,0.108,0.725,-0.089,0.249,0.146,0.622,-0.189,0.099,0.406,0.05,0.026,-0.018,-0.095,0.007,-0.118),
RC5=c(0.217,0.021,-0.058,0.166,0.352,0.09,0.26,-0.354,0.065,-0.014,0.064,0.359,0.134,-0.114,0.212,0.178,0.878,0.71,-0.019,-0.021,0.015,-0.055,0.165,-0.074),
RC6=c(0.027,-0.007,0.087,0.104,0.045,0.319,0.296,0.205,0.088,0.816,0.229,0.302,0.163,0.059,-0.256,0.604,-0.07,0.394,-0.02,-0.041,0.071,-0.008,0.219,-0.068),
RC7=c(-0.015,-0.15,0.073,0.126,0.06,0.347,0.082,-0.093,-0.155,0.093,-0.045,-0.175,-0.021,0.004,0.052,-0.184,-0.054,-0.008,0.012,-0.004,0.094,0.951,-0.001,-0.118)
)
# Question 1: How to sort the columns, from largest to smallest, in each column, as in the image?
data %>% arrange(-RC1)
#> id RC1 RC2 RC3 RC4 RC5 RC6 RC7
#> 1 X5 0.902 0.206 0.153 0.078 0.217 0.027 -0.015
#> 2 X12 0.900 0.282 -0.029 0.050 0.021 -0.007 -0.150
#> 3 X13 0.899 0.133 0.093 0.219 -0.058 0.087 0.073
#> 4 X2 0.825 0.057 0.138 0.216 0.166 0.104 0.126
#> 5 X6 0.802 0.091 0.289 0.218 0.352 0.045 0.060
#> 6 X4 0.745 0.243 0.071 0.114 0.090 0.319 0.347
#> 7 X3 0.744 -0.068 0.413 0.122 0.260 0.296 0.082
#> 8 X11 0.740 0.105 -0.011 0.249 -0.354 0.205 -0.093
#> 9 X15 0.382 0.143 -0.069 0.726 0.065 0.088 -0.155
#> 10 X10 0.356 0.173 0.181 0.108 -0.014 0.816 0.093
#> 11 X16 0.309 0.329 0.123 0.725 0.064 0.229 -0.045
#> 12 X8 0.295 0.683 -0.035 -0.089 0.359 0.302 -0.175
#> 13 X20 0.194 0.253 0.807 0.249 0.134 0.163 -0.021
#> 14 X19 0.162 0.896 0.104 0.146 -0.114 0.059 0.004
#> 15 X17 0.162 -0.155 -0.044 0.622 0.212 -0.256 0.052
#> 16 X21 0.156 -0.126 0.504 -0.189 0.178 0.604 -0.184
#> 17 X9 0.153 0.060 0.150 0.099 0.878 -0.070 -0.054
#> 18 X7 0.147 -0.158 -0.004 0.406 0.710 0.394 -0.008
#> 19 X22 0.144 0.952 -0.013 0.050 -0.019 -0.020 0.012
#> 20 X24 0.142 0.932 0.106 0.026 -0.021 -0.041 -0.004
#> 21 X1 0.123 -0.077 0.785 -0.018 0.015 0.071 0.094
#> 22 X14 0.113 -0.062 -0.053 -0.095 -0.055 -0.008 0.951
#> 23 X23 0.098 0.322 0.751 0.007 0.165 0.219 -0.001
#> 24 X18 0.062 -0.065 0.858 -0.118 -0.074 -0.068 -0.118
# Question 2: How to hide the values in each column when the value is <= 0.04?
data %>% filter(RC1 > 0.04)
#> id RC1 RC2 RC3 RC4 RC5 RC6 RC7
#> 1 X5 0.902 0.206 0.153 0.078 0.217 0.027 -0.015
#> 2 X12 0.900 0.282 -0.029 0.050 0.021 -0.007 -0.150
#> 3 X13 0.899 0.133 0.093 0.219 -0.058 0.087 0.073
#> 4 X2 0.825 0.057 0.138 0.216 0.166 0.104 0.126
#> 5 X6 0.802 0.091 0.289 0.218 0.352 0.045 0.060
#> 6 X4 0.745 0.243 0.071 0.114 0.090 0.319 0.347
#> 7 X3 0.744 -0.068 0.413 0.122 0.260 0.296 0.082
#> 8 X11 0.740 0.105 -0.011 0.249 -0.354 0.205 -0.093
#> 9 X15 0.382 0.143 -0.069 0.726 0.065 0.088 -0.155
#> 10 X10 0.356 0.173 0.181 0.108 -0.014 0.816 0.093
#> 11 X16 0.309 0.329 0.123 0.725 0.064 0.229 -0.045
#> 12 X8 0.295 0.683 -0.035 -0.089 0.359 0.302 -0.175
#> 13 X20 0.194 0.253 0.807 0.249 0.134 0.163 -0.021
#> 14 X19 0.162 0.896 0.104 0.146 -0.114 0.059 0.004
#> 15 X17 0.162 -0.155 -0.044 0.622 0.212 -0.256 0.052
#> 16 X21 0.156 -0.126 0.504 -0.189 0.178 0.604 -0.184
#> 17 X9 0.153 0.060 0.150 0.099 0.878 -0.070 -0.054
#> 18 X7 0.147 -0.158 -0.004 0.406 0.710 0.394 -0.008
#> 19 X22 0.144 0.952 -0.013 0.050 -0.019 -0.020 0.012
#> 20 X24 0.142 0.932 0.106 0.026 -0.021 -0.041 -0.004
#> 21 X1 0.123 -0.077 0.785 -0.018 0.015 0.071 0.094
#> 22 X14 0.113 -0.062 -0.053 -0.095 -0.055 -0.008 0.951
#> 23 X23 0.098 0.322 0.751 0.007 0.165 0.219 -0.001
#> 24 X18 0.062 -0.065 0.858 -0.118 -0.074 -0.068 -0.118
# Question 3: Can the solution be made generic for n columns?
data %>% filter_at(vars(starts_with("RC")), ~ .x > 0.04)
#> id RC1 RC2 RC3 RC4 RC5 RC6 RC7
#> 1 X2 0.825 0.057 0.138 0.216 0.166 0.104 0.126
#> 2 X6 0.802 0.091 0.289 0.218 0.352 0.045 0.060
#> 3 X4 0.745 0.243 0.071 0.114 0.090 0.319 0.347
# Question 4: If possible, how can the R output be presented visually in table format (expected output)?
# The output is already a table; you can use knitr::kable() for HTML table rendering
Created on 2021-09-09 by the reprex package (v2.0.1)
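Since filter_at() is superseded in current dplyr, here is a sketch of a more generic version using across()/if_any(), with the cut-off stored in a variable so the same code works for any number of RC columns. It assumes the data object defined above and uses the 0.4 cut-off from the question.
library(dplyr)

cutoff <- 0.4

# Keep only rows with at least one loading above the cut-off,
# then blank out loadings below the cut-off across every RC column
data %>%
  filter(if_any(starts_with("RC"), ~ .x > cutoff)) %>%
  mutate(across(starts_with("RC"),
                ~ ifelse(.x < cutoff, "", format(.x, digits = 3))))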

Model fit of SEM in Lavaan

What is the reason for CFI = 0 in an SEM model in lavaan? Statistic values are attached.
Well, first let's check how the CFI is computed:
CFI = 1 - (chi²_model - df_model) / (chi²_baseline - df_baseline)
Usually, SEM programs do not present CFI values below 0, so if a negative value is obtained, the software shows 0.
An example:
library(lavaan)
#> This is lavaan 0.6-8
#> lavaan is FREE software! Please report any bugs.
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data = HolzingerSwineford1939)
summary(fit, fit.measures = TRUE)
#> lavaan 0.6-8 ended normally after 35 iterations
#>
#> Estimator ML
#> Optimization method NLMINB
#> Number of model parameters 21
#>
#> Number of observations 301
#>
#> Model Test User Model:
#>
#> Test statistic 85.306
#> Degrees of freedom 24
#> P-value (Chi-square) 0.000
#>
#> Model Test Baseline Model:
#>
#> Test statistic 918.852
#> Degrees of freedom 36
#> P-value 0.000
#>
#> User Model versus Baseline Model:
#>
#> Comparative Fit Index (CFI) 0.931
#> Tucker-Lewis Index (TLI) 0.896
#>
#> Loglikelihood and Information Criteria:
#>
#> Loglikelihood user model (H0) -3737.745
#> Loglikelihood unrestricted model (H1) -3695.092
#>
#> Akaike (AIC) 7517.490
#> Bayesian (BIC) 7595.339
#> Sample-size adjusted Bayesian (BIC) 7528.739
#>
#> Root Mean Square Error of Approximation:
#>
#> RMSEA 0.092
#> 90 Percent confidence interval - lower 0.071
#> 90 Percent confidence interval - upper 0.114
#> P-value RMSEA <= 0.05 0.001
#>
#> Standardized Root Mean Square Residual:
#>
#> SRMR 0.065
#>
#> Parameter Estimates:
#>
#> Standard errors Standard
#> Information Expected
#> Information saturated (h1) model Structured
#>
#> Latent Variables:
#> Estimate Std.Err z-value P(>|z|)
#> visual =~
#> x1 1.000
#> x2 0.554 0.100 5.554 0.000
#> x3 0.729 0.109 6.685 0.000
#> textual =~
#> x4 1.000
#> x5 1.113 0.065 17.014 0.000
#> x6 0.926 0.055 16.703 0.000
#> speed =~
#> x7 1.000
#> x8 1.180 0.165 7.152 0.000
#> x9 1.082 0.151 7.155 0.000
#>
#> Covariances:
#> Estimate Std.Err z-value P(>|z|)
#> visual ~~
#> textual 0.408 0.074 5.552 0.000
#> speed 0.262 0.056 4.660 0.000
#> textual ~~
#> speed 0.173 0.049 3.518 0.000
#>
#> Variances:
#> Estimate Std.Err z-value P(>|z|)
#> .x1 0.549 0.114 4.833 0.000
#> .x2 1.134 0.102 11.146 0.000
#> .x3 0.844 0.091 9.317 0.000
#> .x4 0.371 0.048 7.779 0.000
#> .x5 0.446 0.058 7.642 0.000
#> .x6 0.356 0.043 8.277 0.000
#> .x7 0.799 0.081 9.823 0.000
#> .x8 0.488 0.074 6.573 0.000
#> .x9 0.566 0.071 8.003 0.000
#> visual 0.809 0.145 5.564 0.000
#> textual 0.979 0.112 8.737 0.000
#> speed 0.384 0.086 4.451 0.000
As you can see, this example model's chi² is 85.306 with 24 degrees of freedom, and the baseline model's is 918.852 with 36 degrees of freedom.
With that we can easily calculate CFI by hand:
1-((85.306-24)/(918.852-36))
#> [1] 0.9305591
Which you can compare with the CFI reported by the summary() function (i.e., 0.931).
The statistics you report allow us to check that your CFI would be negative if the software did not truncate it at 0:
1-((5552.006-94)/(3181.455-21))
#> [1] -0.7269684
Created on 2021-03-27 by the reprex package (v1.0.0)
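If you want to pull these ingredients straight from a fitted lavaan object instead of copying the numbers by hand, a small helper along these lines should work. This is only a sketch: the function name cfi_by_hand is made up, it uses lavaan's fitMeasures() to extract the user and baseline chi-square statistics, and it truncates at 0 to mirror the hand calculation above.
library(lavaan)

# Compute CFI by hand from a fitted lavaan object, truncated at 0
cfi_by_hand <- function(fit) {
  m <- fitMeasures(fit, c("chisq", "df", "baseline.chisq", "baseline.df"))
  cfi <- 1 - (m["chisq"] - m["df"]) / (m["baseline.chisq"] - m["baseline.df"])
  max(0, cfi)
}

cfi_by_hand(fit)   # about 0.931 for the HolzingerSwineford1939 example above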

How to extract the loadings as a matrix from princomp() in R

When using princomp() I get an object with a "loadings" attribute. The "loadings" element is a composite object which holds information that I would prefer to have separately, in ordinary matrix format, so that I can handle and manipulate it freely. In particular, I would like to extract the loadings as a matrix. I cannot find a way to extract this information from the object. Is this possible, and if yes, how?
I want to use princomp() because it accepts a covariance matrix as input, which is easier to provide since my dataset is very large (7000000 x 650).
For a reproducible example, please see below:
> data(mtcars)
>
> princomp_mtcars = princomp(mtcars)
>
> loadings = princomp_mtcars$loadings
>
> names(loadings)
NULL
>
> class(loadings)
[1] "loadings"
>
> loadings
Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9 Comp.10 Comp.11
mpg 0.982 0.144
cyl 0.228 -0.239 0.794 0.425 0.189 0.132 0.145
disp -0.900 -0.435
hp -0.435 0.899
drat 0.133 -0.227 0.939 0.184
wt -0.128 0.244 0.127 -0.187 -0.156 0.391 0.830
qsec -0.886 0.214 0.190 0.255 0.103 -0.204
vs -0.177 -0.103 -0.684 0.303 0.626
am 0.136 -0.205 0.201 0.572 -0.163 0.733
gear 0.130 0.276 -0.335 0.802 -0.217 -0.156 0.204 -0.191
carb -0.103 0.269 0.855 0.284 -0.165 -0.128 -0.240
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9 Comp.10 Comp.11
SS loadings 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
Proportion Var 0.091 0.091 0.091 0.091 0.091 0.091 0.091 0.091 0.091 0.091 0.091
Cumulative Var 0.091 0.182 0.273 0.364 0.455 0.545 0.636 0.727 0.818 0.909 1.000
>
Your advice will be appreciated.
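For what it's worth, the "loadings" object returned by princomp() is essentially a numeric matrix with a "loadings" class attribute and a custom print method, so one approach is to strip that class to get the underlying matrix. A minimal sketch based on the mtcars example above:
data(mtcars)
princomp_mtcars <- princomp(mtcars)

# Strip the "loadings" class to get the underlying numeric matrix
loadings_matrix <- unclass(princomp_mtcars$loadings)

class(loadings_matrix)             # "matrix" "array"
dim(loadings_matrix)               # 11 x 11
loadings_matrix["mpg", "Comp.1"]   # individual entries can now be indexed freely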

Goodness-of-fit indices "NA"

I'm running a non-recursive model with lavaan. However, two things happened that I don't quite understand. First, the goodness-of-fit indices and some standard errors were "NA". Second, the two coefficients between the same pair of variables in the two directions were not consistent (non-recursive part: ResidentialMobility -- Author): one was positive and the other negative (I would expect them to at least have the same sign; otherwise, how do I explain that?). Can someone help me out? Please let me know if you need me to clarify further. Thanks!
model01<-'ResidentialMobility~a*Coun
SavingMotherPercentage~e*Affect
SavingMotherPercentage~f*Author
SavingMotherPercentage~g*Recipro
Affect~b*ResidentialMobility
Author~c*ResidentialMobility
Recipro~d*ResidentialMobility
ResidentialMobility~h*Affect
ResidentialMobility~i*Author
ResidentialMobility~j*Recipro
Affect~~Author+Recipro+ResidentialMobility
Author~~Recipro+ResidentialMobility
Recipro~~ResidentialMobility
Coun~SavingMotherPercentage
ab:=a*b
ac:=a*c
ad:=a*d
be:=b*e
cf:=c*f
dg:=d*g
'
fit <- cfa(model01, estimator = "MLR", data = data01, missing = "FIML")
summary(fit, standardized = TRUE, fit.measures = TRUE)
Output:
lavaan (0.5-21) converged normally after 93 iterations
Used Total
Number of observations 502 506
Number of missing patterns 4
Estimator ML Robust
Minimum Function Test Statistic NA NA
Degrees of freedom -2 -2
Minimum Function Value 0.0005232772506
Scaling correction factor
for the Yuan-Bentler correction
User model versus baseline model:
Comparative Fit Index (CFI) NA NA
Tucker-Lewis Index (TLI) NA NA
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -5057.346 -5057.346
Loglikelihood unrestricted model (H1) -5057.084 -5057.084
Number of free parameters 29 29
Akaike (AIC) 10172.693 10172.693
Bayesian (BIC) 10295.032 10295.032
Sample-size adjusted Bayesian (BIC) 10202.984 10202.984
Root Mean Square Error of Approximation:
RMSEA NA NA
90 Percent Confidence Interval NA NA NA NA
P-value RMSEA <= 0.05 NA NA
Standardized Root Mean Square Residual:
SRMR 0.006 0.006
Parameter Estimates:
Information Observed
Standard Errors Robust.huber.white
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
ResidentialMobility ~
Coun (a) -1.543 0.255 -6.052 0.000 -1.543 -0.540
SavingMotherPercentage ~
Affect (e) 3.093 1.684 1.837 0.066 3.093 0.122
Author (f) 2.618 0.923 2.835 0.005 2.618 0.145
Recipro (g) 0.061 1.344 0.046 0.964 0.061 0.003
Affect ~
RsdntlMblt (b) -0.311 0.075 -4.125 0.000 -0.311 -0.570
Author ~
RsdntlMblt (c) -0.901 0.119 -7.567 0.000 -0.901 -1.180
Recipro ~
RsdntlMblt (d) -0.313 0.082 -3.841 0.000 -0.313 -0.512
ResidentialMobility ~
Affect (h) -0.209 0.193 -1.082 0.279 -0.209 -0.114
Author (i) 0.475 0.192 2.474 0.013 0.475 0.363
Recipro (j) 0.178 0.346 0.514 0.607 0.178 0.109
Coun ~
SvngMthrPr 0.003 0.001 2.225 0.026 0.003 0.108
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Affect ~~
.Author 0.667 0.171 3.893 0.000 0.667 0.534
.Recipro 0.669 0.119 5.623 0.000 0.669 0.773
.ResidentialMobility ~~
.Affect 0.624 0.144 4.347 0.000 0.624 0.474
.Author ~~
.Recipro 0.565 0.173 3.267 0.001 0.565 0.416
.ResidentialMobility ~~
.Author 1.029 0.288 3.572 0.000 1.029 0.499
.Recipro 0.564 0.304 1.851 0.064 0.564 0.395
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.ResidentlMblty 1.813 NA 1.813 1.270
.SvngMthrPrcntg 29.591 7.347 4.027 0.000 29.591 1.499
.Affect 5.701 0.169 33.797 0.000 5.701 7.320
.Author 5.569 0.275 20.259 0.000 5.569 5.109
.Recipro 5.149 0.186 27.642 0.000 5.149 5.889
.Coun 0.367 0.069 5.336 0.000 0.367 0.735
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.ResidentlMblty 2.169 0.259 8.378 0.000 2.169 1.064
.SvngMthrPrcntg 363.792 23.428 15.528 0.000 363.792 0.934
.Affect 0.797 0.129 6.153 0.000 0.797 1.314
.Author 1.957 0.343 5.713 0.000 1.957 1.647
.Recipro 0.941 0.126 7.439 0.000 0.941 1.231
.Coun 0.242 0.004 54.431 0.000 0.242 0.969
Defined Parameters:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
ab 0.480 0.120 3.991 0.000 0.480 0.308
ac 1.390 0.261 5.328 0.000 1.390 0.637
ad 0.483 0.133 3.640 0.000 0.483 0.276
be -0.962 0.548 -1.757 0.079 -0.962 -0.070
cf -2.359 0.851 -2.771 0.006 -2.359 -0.171
dg -0.019 0.421 -0.046 0.964 -0.019 -0.001
I think you get NA because you have specified a model with -2 degrees of freedom: with 6 observed variables and a mean structure there are only 6*7/2 + 6 = 27 unique sample moments, but your model estimates 29 free parameters, so df = 27 - 29 = -2. A model with negative degrees of freedom is not identified, so the test statistic and the fit indices that depend on it (CFI, TLI, RMSEA) cannot be computed. You should specify the model differently (e.g., drop some paths or residual covariances) so that you get a non-negative number of degrees of freedom.
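A quick way to confirm the parameter/moment count from the fitted object is to compare the number of free parameters with the model degrees of freedom. A minimal sketch, assuming the fit object from the code above:
# Number of free parameters and resulting degrees of freedom
# (should show 29 parameters and df = -2 for the model above)
fitMeasures(fit, c("npar", "df"))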
