SEM model with multiple mediators and multiple independent variables in lavaan - r
I have cross-sectional data and I am trying to specify a model with multiple mediations.
My independent variable (IV) is measured by a tool with 24 items, which make up 5 subscales (latent variables), which in turn load onto a total "higher-order" factor. I try to estimate this model in two different ways: (1) a five-factor model (without a higher-order factor) in which all 5 subscales are allowed to correlate, and (2) a higher-order model with a TOTAL latent variable made up of those 5 subscales. The first model has five correlated latent factors (FNR, FOB, ..., FAA), with variances fixed to 1.
My first model has an IV (FTOTAL), 4 mediators (ER1, ER2, ER3 and ER4), and one DV (PH). The relationship between ER4 and the IV is mediated by one of the other mediators (ER3), making it a mediated mediation. I was able to specify the SEM for the first model with the higher-order TOTAL factor without a problem, and the code is shown below:
multipleMediationPH <- "
#Measurement model
FNR =~ FNR1 + FNR2 + FNR3 + FNR4 + FNR5
FOB =~ FOB1 + FOB2 + FOB3 + FOB4
FDS =~ FDS1 + FDS2 + FDS3 + FDS4 + FDS5
FNJ =~ FNJ1 + FNJ2 + FNJ3 + FNJ4 + FNJ5
FAA =~ FAA1 + FAA2 + FAA3 + FAA4 + FAA5
FTOTAL =~ FNR + FOB + FDS + FNJ + FAA
#Regressions
ER3 ~ a*FTOTAL
ER4 ~ b*RSTOTAL + FTOTAL
ER1 ~ u1*FTOTAL
ER2 ~ u2*FTOTAL
PHQTOTAL ~ z1*ER1 + z2*ER2 + d*FTOTAL + c*ER4 + ER3
indirect1 := u1 * z1
indirect2 := u2 * z2
indirect3 := a*b*c
total := d + (u1 * z1) + (u2 * z2) + a*b*c
#Residual correlations
CRTOTAL ~~ SUPTOTAL + SLEEPTOTAL + RSTOTAL
SUPTOTAL ~~ SLEEPTOTAL + RSTOTAL
"
fitPHtotal <- sem(model = multipleMediationPH, data = SEMDATA, std.lv=TRUE)
summary(fitPHtotal)
However, I cannot figure out how to specify the model with all 5 subscales as independent variables in the same model. I tried following the same logic, including the 5 IVs in the model, but it did not work out. Could anyone suggest a solution, or is the only way to run 5 different models with one subscale as the IV at a time?
Thank you in advance for the help.
Since you did not provide data, I will show you how to do it using the HolzingerSwineford1939 dataset, which comes with the lavaan package.
First, a mediation using a second-order latent factor (3 first-order factors):
library(lavaan)
#> This is lavaan 0.6-8
#> lavaan is FREE software! Please report any bugs.
model_2L <- "
visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
higher =~ visual + textual + speed
#grade will be your Y
#higher order latent factor will be your X
#agemo will be your M
grade ~ c*higher + b*agemo
agemo ~ a*higher
# indirect effect (a*b)
ab := a*b
# total effect
total := c + (a*b)
"
fit_2L <- sem(model = model_2L, data = HolzingerSwineford1939)
summary(object = fit_2L, std=T)
#> lavaan 0.6-8 ended normally after 48 iterations
#>
#> Estimator ML
#> Optimization method NLMINB
#> Number of model parameters 26
#>
#> Used Total
#> Number of observations 300 301
#>
#> Model Test User Model:
#>
#> Test statistic 116.110
#> Degrees of freedom 40
#> P-value (Chi-square) 0.000
#>
#> Parameter Estimates:
#>
#> Standard errors Standard
#> Information Expected
#> Information saturated (h1) model Structured
#>
#> Latent Variables:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> visual =~
#> x1 1.000 0.849 0.727
#> x2 0.621 0.109 5.680 0.000 0.527 0.448
#> x3 0.824 0.124 6.641 0.000 0.699 0.619
#> textual =~
#> x4 1.000 0.990 0.851
#> x5 1.117 0.066 16.998 0.000 1.106 0.859
#> x6 0.922 0.056 16.563 0.000 0.913 0.834
#> speed =~
#> x7 1.000 0.648 0.595
#> x8 1.130 0.148 7.612 0.000 0.732 0.726
#> x9 1.010 0.135 7.465 0.000 0.655 0.649
#> higher =~
#> visual 1.000 0.673 0.673
#> textual 0.849 0.185 4.586 0.000 0.490 0.490
#> speed 0.810 0.179 4.519 0.000 0.714 0.714
#>
#> Regressions:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> grade ~
#> higher (c) 0.421 0.089 4.730 0.000 0.241 0.482
#> agemo (b) -0.004 0.008 -0.519 0.604 -0.004 -0.029
#> agemo ~
#> higher (a) 0.322 0.469 0.687 0.492 0.184 0.053
#>
#> Variances:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> .x1 0.641 0.110 5.822 0.000 0.641 0.471
#> .x2 1.108 0.102 10.848 0.000 1.108 0.799
#> .x3 0.786 0.094 8.398 0.000 0.786 0.616
#> .x4 0.373 0.048 7.750 0.000 0.373 0.276
#> .x5 0.436 0.058 7.453 0.000 0.436 0.263
#> .x6 0.364 0.044 8.369 0.000 0.364 0.304
#> .x7 0.767 0.080 9.629 0.000 0.767 0.646
#> .x8 0.482 0.070 6.924 0.000 0.482 0.474
#> .x9 0.589 0.068 8.686 0.000 0.589 0.579
#> .grade 0.192 0.020 9.767 0.000 0.192 0.768
#> .agemo 11.881 0.972 12.220 0.000 11.881 0.997
#> .visual 0.394 0.111 3.535 0.000 0.547 0.547
#> .textual 0.745 0.101 7.397 0.000 0.760 0.760
#> .speed 0.206 0.062 3.312 0.001 0.490 0.490
#> higher 0.327 0.097 3.375 0.001 1.000 1.000
#>
#> Defined Parameters:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> ab -0.001 0.004 -0.366 0.715 -0.001 -0.002
#> total 0.420 0.089 4.728 0.000 0.240 0.481
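As a quick sanity check, the defined parameters at the bottom can be reproduced by hand from the unstandardized regression estimates printed above (the printed values are rounded to three decimals, so the match is only to rounding precision):

```r
# Hand computation of the defined parameters from the printed estimates
a <- 0.322   # agemo ~ higher
b <- -0.004  # grade ~ agemo
c <- 0.421   # grade ~ higher (direct effect)

ab    <- a * b       # indirect effect: -0.001288, prints as -0.001
total <- c + a * b   # total effect:     0.419712, prints as  0.420
```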
Second, a mediation using three first-order factors. Three indirect effects and three total effects are estimated:
library(lavaan)
model_1L <- "
visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
#grade will be your Y
#the three first-order factors will be your Xs
#agemo will be your M
grade ~ c1*visual + c2*textual + c3*speed + b*agemo
agemo ~ a1*visual + a2*textual + a3*speed
# indirect effect (a*b)
a1b := a1*b
a2b := a2*b
a3b := a3*b
# total effect
total1 := c1 + (a1*b)
total2 := c2 + (a2*b)
total3 := c3 + (a3*b)
"
fit_1L <- sem(model = model_1L, data = HolzingerSwineford1939)
summary(object = fit_1L, std=T)
#> lavaan 0.6-8 ended normally after 55 iterations
#>
#> Estimator ML
#> Optimization method NLMINB
#> Number of model parameters 30
#>
#> Used Total
#> Number of observations 300 301
#>
#> Model Test User Model:
#>
#> Test statistic 101.925
#> Degrees of freedom 36
#> P-value (Chi-square) 0.000
#>
#> Parameter Estimates:
#>
#> Standard errors Standard
#> Information Expected
#> Information saturated (h1) model Structured
#>
#> Latent Variables:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> visual =~
#> x1 1.000 0.904 0.775
#> x2 0.555 0.100 5.564 0.000 0.501 0.426
#> x3 0.724 0.109 6.657 0.000 0.655 0.580
#> textual =~
#> x4 1.000 0.993 0.853
#> x5 1.108 0.065 17.017 0.000 1.101 0.855
#> x6 0.921 0.055 16.667 0.000 0.915 0.836
#> speed =~
#> x7 1.000 0.668 0.613
#> x8 1.115 0.142 7.840 0.000 0.744 0.737
#> x9 0.945 0.125 7.540 0.000 0.631 0.625
#>
#> Regressions:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> grade ~
#> visual (c1) 0.012 0.048 0.246 0.806 0.011 0.021
#> textual (c2) 0.048 0.035 1.376 0.169 0.047 0.095
#> speed (c3) 0.295 0.063 4.689 0.000 0.197 0.394
#> agemo (b) -0.003 0.008 -0.361 0.718 -0.003 -0.020
#> agemo ~
#> visual (a1) 0.354 0.355 0.996 0.319 0.320 0.093
#> textual (a2) -0.233 0.256 -0.912 0.362 -0.231 -0.067
#> speed (a3) 0.098 0.421 0.232 0.817 0.065 0.019
#>
#> Covariances:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> visual ~~
#> textual 0.412 0.074 5.565 0.000 0.459 0.459
#> speed 0.265 0.058 4.554 0.000 0.438 0.438
#> textual ~~
#> speed 0.180 0.052 3.448 0.001 0.271 0.271
#>
#> Variances:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> .x1 0.545 0.115 4.747 0.000 0.545 0.400
#> .x2 1.135 0.102 11.115 0.000 1.135 0.819
#> .x3 0.846 0.091 9.322 0.000 0.846 0.664
#> .x4 0.368 0.048 7.698 0.000 0.368 0.272
#> .x5 0.447 0.058 7.657 0.000 0.447 0.270
#> .x6 0.361 0.043 8.343 0.000 0.361 0.301
#> .x7 0.741 0.079 9.422 0.000 0.741 0.624
#> .x8 0.465 0.069 6.724 0.000 0.465 0.456
#> .x9 0.620 0.067 9.217 0.000 0.620 0.609
#> .grade 0.201 0.018 11.307 0.000 0.201 0.806
#> .agemo 11.813 0.969 12.191 0.000 11.813 0.991
#> visual 0.817 0.147 5.564 0.000 1.000 1.000
#> textual 0.986 0.113 8.752 0.000 1.000 1.000
#> speed 0.446 0.091 4.906 0.000 1.000 1.000
#>
#> Defined Parameters:
#> Estimate Std.Err z-value P(>|z|) Std.lv Std.all
#> a1b -0.001 0.003 -0.344 0.731 -0.001 -0.002
#> a2b 0.001 0.002 0.335 0.738 0.001 0.001
#> a3b -0.000 0.002 -0.183 0.855 -0.000 -0.000
#> total1 0.011 0.048 0.226 0.821 0.010 0.020
#> total2 0.048 0.035 1.399 0.162 0.048 0.096
#> total3 0.295 0.063 4.685 0.000 0.197 0.394
Created on 2021-03-30 by the reprex package (v1.0.0)
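Mapped back onto the variable names from the question (a sketch only; I am assuming the same measurement model and mediator names as in the question, and it is untested without the data):

```r
# Five correlated subscales as predictors instead of the higher-order FTOTAL.
# sem() lets exogenous latent variables covary by default.
model_5IV <- "
  # Measurement model
  FNR =~ FNR1 + FNR2 + FNR3 + FNR4 + FNR5
  FOB =~ FOB1 + FOB2 + FOB3 + FOB4
  FDS =~ FDS1 + FDS2 + FDS3 + FDS4 + FDS5
  FNJ =~ FNJ1 + FNJ2 + FNJ3 + FNJ4 + FNJ5
  FAA =~ FAA1 + FAA2 + FAA3 + FAA4 + FAA5
  # Regressions: each subscale predicts each mediator and the DV
  ER1 ~ a1*FNR + a2*FOB + a3*FDS + a4*FNJ + a5*FAA
  ER2 ~ b1*FNR + b2*FOB + b3*FDS + b4*FNJ + b5*FAA
  PHQTOTAL ~ z1*ER1 + z2*ER2 + d1*FNR + d2*FOB + d3*FDS + d4*FNJ + d5*FAA
  # One indirect effect per subscale-mediator pair, e.g.:
  ind_FNR_ER1 := a1 * z1
  ind_FOB_ER1 := a2 * z1
"
# fit_5IV <- sem(model_5IV, data = SEMDATA, std.lv = TRUE)  # requires lavaan and your data
```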
Related
CFA in Lavaan is not setting factor loading to 1
I am running a CFA in R using the lavaan package for the first time. I have everything up and running, however for some reason none of my factor loadings are set to 1 like they are supposed to be. I want to know why lavaan isn't automatically setting one of the loadings to one for each of the factors. This is the code I used:

model1 <- 'comm =~ relimport + relthink + relhurt
           Ind =~ attend + prayer + relread
           relimport ~~ relthink'
fit1 <- cfa(model1, data = SIM1, std.lv = TRUE)
summary(fit1, ci = T, standardized = T, fit.measures = T)
modindices(fit1, minimum.value = 10, sort = TRUE)
lavaanPlot(model = fit1,
           node_options = list(shape = "box", fontname = "Helvetica"),
           edge_options = list(color = "grey"),
           coefs = TRUE, stand = TRUE)

Here is my output:

lavaan 0.6.13 ended normally after 30 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        14

                                                  Used       Total
  Number of observations                           796        1769

Model Test User Model:
  Test statistic                                 2.707
  Degrees of freedom                                 7
  P-value (Chi-square)                           0.911

Model Test Baseline Model:
  Test statistic                              1394.558
  Degrees of freedom                                15
  P-value                                        0.000

User Model versus Baseline Model:
  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.007

Loglikelihood and Information Criteria:
  Loglikelihood user model (H0)              -7374.779
  Loglikelihood unrestricted model (H1)      -7373.425
  Akaike (AIC)                               14777.558
  Bayesian (BIC)                             14843.072
  Sample-size adjusted Bayesian (SABIC)      14798.615

Root Mean Square Error of Approximation:
  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.017
  P-value H_0: RMSEA <= 0.050                    1.000
  P-value H_0: RMSEA >= 0.080                    0.000

Standardized Root Mean Square Residual:
  SRMR                                           0.008

Parameter Estimates:
  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
               Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper  Std.lv  Std.all
  comm =~
    relimport     0.796    0.050   15.875    0.000    0.698    0.894   0.796    0.926
    relthink      0.735    0.062   11.784    0.000    0.613    0.857   0.735    0.672
    relhurt       0.660    0.061   10.827    0.000    0.540    0.779   0.660    0.455
  Ind =~
    attend        0.685    0.048   14.408    0.000    0.591    0.778   0.685    0.523
    prayer        1.605    0.065   24.794    0.000    1.478    1.732   1.605    0.844
    relread       1.134    0.052   21.960    0.000    1.033    1.235   1.134    0.757

Covariances:
               Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper  Std.lv  Std.all
 .relimport ~~
   .relthink     -0.007    0.069   -0.104    0.917   -0.143    0.129  -0.007   -0.027
  comm ~~
    Ind           0.609    0.043   14.108    0.000    0.525    0.694   0.609    0.609

Variances:
               Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper  Std.lv  Std.all
   .relimport     0.106    0.071    1.489    0.137   -0.033    0.245   0.106    0.143
   .relthink      0.658    0.084    7.874    0.000    0.494    0.822   0.658    0.549
   .relhurt       1.668    0.097   17.268    0.000    1.479    1.857   1.668    0.793
   .attend        1.242    0.068   18.253    0.000    1.109    1.376   1.242    0.726
   .prayer        1.040    0.125    8.286    0.000    0.794    1.286   1.040    0.288
   .relread       0.955    0.075   12.676    0.000    0.807    1.103   0.955    0.426
    comm          1.000                               1.000    1.000   1.000    1.000
    Ind           1.000                               1.000    1.000   1.000    1.000
I figured out that std.lv=TRUE was telling lavaan not to fix the first loading to one: with std.lv=TRUE, lavaan identifies each factor by fixing its variance to 1 instead of fixing the first indicator's loading. So problem fixed!
Generic solution update to sort multiple columns and filter a cut-off
Hi, here is an update proposal for the solution presented by @Sam Dickson, for whoever can contribute to creating an output closer to the expected output, using dplyr functions to generalize the solution.

data.frame(
  RC1 = c(0.902,0.9,0.899,0.825,0.802,0.745,0.744,0.74,0.382,0.356,0.309,0.295,0.194,0.162,0.162,0.156,0.153,0.147,0.144,0.142,0.123,0.113,0.098,0.062),
  RC2 = c(0.206,0.282,0.133,0.057,0.091,0.243,-0.068,0.105,0.143,0.173,0.329,0.683,0.253,0.896,-0.155,-0.126,0.06,-0.158,0.952,0.932,-0.077,-0.062,0.322,-0.065),
  RC3 = c(0.153,-0.029,0.093,0.138,0.289,0.071,0.413,-0.011,-0.069,0.181,0.123,-0.035,0.807,0.104,-0.044,0.504,0.15,-0.004,-0.013,0.106,0.785,-0.053,0.751,0.858),
  RC4 = c(0.078,0.05,0.219,0.216,0.218,0.114,0.122,0.249,0.726,0.108,0.725,-0.089,0.249,0.146,0.622,-0.189,0.099,0.406,0.05,0.026,-0.018,-0.095,0.007,-0.118),
  RC5 = c(0.217,0.021,-0.058,0.166,0.352,0.09,0.26,-0.354,0.065,-0.014,0.064,0.359,0.134,-0.114,0.212,0.178,0.878,0.71,-0.019,-0.021,0.015,-0.055,0.165,-0.074),
  RC6 = c(0.027,-0.007,0.087,0.104,0.045,0.319,0.296,0.205,0.088,0.816,0.229,0.302,0.163,0.059,-0.256,0.604,-0.07,0.394,-0.02,-0.041,0.071,-0.008,0.219,-0.068),
  RC7 = c(-0.015,-0.15,0.073,0.126,0.06,0.347,0.082,-0.093,-0.155,0.093,-0.045,-0.175,-0.021,0.004,0.052,-0.184,-0.054,-0.008,0.012,-0.004,0.094,0.951,-0.001,-0.118)
) -> df
row.names(df) <- c("X5","X12","X13","X2","X6","X4","X3","X11","X15","X10","X16","X8","X20","X19","X17","X21","X9","X7","X22","X24","X1","X14","X23","X18")

ord1 <- apply(as.matrix(df), 1, function(x) min(which(abs(x) >= 0.4), ncol(df)))
ord2 <- df[cbind(1:nrow(df), ord1)]
df[order(ord1, -abs(ord2)), ]

df1 <- df[ , ] > 0.4
row.names(df1) <- c("X5","X12","X13","X2","X6","X4","X3","X11","X15","X10","X16","X8","X20","X19","X17","X21","X9","X7","X22","X24","X1","X14","X23","X18")
df1

df[df[,] < 0.4] <- ""
df

Output:

       RC1    RC2    RC3    RC4    RC5    RC6    RC7
X5   0.902
X12  0.9
X13  0.899
X2   0.825
X6   0.802
X4   0.745
X3   0.744         0.413
X11  0.74
X15                       0.726
X10                                     0.816
X16                       0.725
X8          0.683
X20                0.807
X19         0.896
X17                       0.622
X21                0.504                0.604
X9                               0.878
X7                        0.406  0.71
X22         0.952
X24         0.932
X1                 0.785
X14                                            0.951
X23                0.751
X18                0.858

Expected output:
Now the question is cleared up, I think this does what you want:

library(dplyr)
df %>%
  mutate(across(everything(), ~ ifelse(. < 0.4, "", format(., digits = 3)))) %>%
  arrange(across(everything(), desc))

#      RC1    RC2    RC3    RC4    RC5    RC6    RC7
# 1  0.902
# 2  0.900
# 3  0.899
# 4  0.825
# 5  0.802
# 6  0.745
# 7  0.744         0.413
# 8  0.740
# 9         0.952
# 10        0.932
# 11        0.896
# 12        0.683
# 13               0.858
# 14               0.807
# 15               0.785
# 16               0.751
# 17               0.504                0.604
# 18                      0.726
# 19                      0.725
# 20                      0.622
# 21                      0.406  0.710
# 22                             0.878
# 23                                    0.816
# 24                                           0.951
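The same cut-off logic can also be done in base R without dplyr; a minimal sketch on a small made-up matrix (the 3x2 matrix here is purely illustrative):

```r
# Blank out entries below a cutoff, keeping the rest formatted as text
m <- matrix(c(0.9, 0.1, 0.3, 0.7, 0.5, 0.2), nrow = 3,
            dimnames = list(c("X1", "X2", "X3"), c("RC1", "RC2")))
out <- ifelse(m < 0.4, "", format(m, digits = 3))
out  # character matrix: values >= 0.4 kept, the rest blanked
```

Because ifelse() keeps the dim and dimnames of its test argument, out is still a labelled matrix and can be printed or passed to a table renderer directly.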
library(tidyverse)

data <- data.frame(
  id  = c("X5","X12","X13","X2","X6","X4","X3","X11","X15","X10","X16","X8","X20","X19","X17","X21","X9","X7","X22","X24","X1","X14","X23","X18"),
  RC1 = c(0.902,0.9,0.899,0.825,0.802,0.745,0.744,0.74,0.382,0.356,0.309,0.295,0.194,0.162,0.162,0.156,0.153,0.147,0.144,0.142,0.123,0.113,0.098,0.062),
  RC2 = c(0.206,0.282,0.133,0.057,0.091,0.243,-0.068,0.105,0.143,0.173,0.329,0.683,0.253,0.896,-0.155,-0.126,0.06,-0.158,0.952,0.932,-0.077,-0.062,0.322,-0.065),
  RC3 = c(0.153,-0.029,0.093,0.138,0.289,0.071,0.413,-0.011,-0.069,0.181,0.123,-0.035,0.807,0.104,-0.044,0.504,0.15,-0.004,-0.013,0.106,0.785,-0.053,0.751,0.858),
  RC4 = c(0.078,0.05,0.219,0.216,0.218,0.114,0.122,0.249,0.726,0.108,0.725,-0.089,0.249,0.146,0.622,-0.189,0.099,0.406,0.05,0.026,-0.018,-0.095,0.007,-0.118),
  RC5 = c(0.217,0.021,-0.058,0.166,0.352,0.09,0.26,-0.354,0.065,-0.014,0.064,0.359,0.134,-0.114,0.212,0.178,0.878,0.71,-0.019,-0.021,0.015,-0.055,0.165,-0.074),
  RC6 = c(0.027,-0.007,0.087,0.104,0.045,0.319,0.296,0.205,0.088,0.816,0.229,0.302,0.163,0.059,-0.256,0.604,-0.07,0.394,-0.02,-0.041,0.071,-0.008,0.219,-0.068),
  RC7 = c(-0.015,-0.15,0.073,0.126,0.06,0.347,0.082,-0.093,-0.155,0.093,-0.045,-0.175,-0.021,0.004,0.052,-0.184,-0.054,-0.008,0.012,-0.004,0.094,0.951,-0.001,-0.118)
)

# Question 1: How to sort the columns, from largest to smallest, in each column, as in the image?
data %>% arrange(-RC1)
#>     id   RC1    RC2    RC3    RC4    RC5    RC6    RC7
#> 1   X5 0.902  0.206  0.153  0.078  0.217  0.027 -0.015
#> 2  X12 0.900  0.282 -0.029  0.050  0.021 -0.007 -0.150
#> 3  X13 0.899  0.133  0.093  0.219 -0.058  0.087  0.073
#> 4   X2 0.825  0.057  0.138  0.216  0.166  0.104  0.126
#> 5   X6 0.802  0.091  0.289  0.218  0.352  0.045  0.060
#> 6   X4 0.745  0.243  0.071  0.114  0.090  0.319  0.347
#> 7   X3 0.744 -0.068  0.413  0.122  0.260  0.296  0.082
#> 8  X11 0.740  0.105 -0.011  0.249 -0.354  0.205 -0.093
#> 9  X15 0.382  0.143 -0.069  0.726  0.065  0.088 -0.155
#> 10 X10 0.356  0.173  0.181  0.108 -0.014  0.816  0.093
#> 11 X16 0.309  0.329  0.123  0.725  0.064  0.229 -0.045
#> 12  X8 0.295  0.683 -0.035 -0.089  0.359  0.302 -0.175
#> 13 X20 0.194  0.253  0.807  0.249  0.134  0.163 -0.021
#> 14 X19 0.162  0.896  0.104  0.146 -0.114  0.059  0.004
#> 15 X17 0.162 -0.155 -0.044  0.622  0.212 -0.256  0.052
#> 16 X21 0.156 -0.126  0.504 -0.189  0.178  0.604 -0.184
#> 17  X9 0.153  0.060  0.150  0.099  0.878 -0.070 -0.054
#> 18  X7 0.147 -0.158 -0.004  0.406  0.710  0.394 -0.008
#> 19 X22 0.144  0.952 -0.013  0.050 -0.019 -0.020  0.012
#> 20 X24 0.142  0.932  0.106  0.026 -0.021 -0.041 -0.004
#> 21  X1 0.123 -0.077  0.785 -0.018  0.015  0.071  0.094
#> 22 X14 0.113 -0.062 -0.053 -0.095 -0.055 -0.008  0.951
#> 23 X23 0.098  0.322  0.751  0.007  0.165  0.219 -0.001
#> 24 X18 0.062 -0.065  0.858 -0.118 -0.074 -0.068 -0.118

# Question 2: How to hide the values in each column when the value is =< 0.04?
data %>% filter(RC1 > 0.04)
#>     id   RC1    RC2    RC3    RC4    RC5    RC6    RC7
#> 1   X5 0.902  0.206  0.153  0.078  0.217  0.027 -0.015
#> 2  X12 0.900  0.282 -0.029  0.050  0.021 -0.007 -0.150
#> 3  X13 0.899  0.133  0.093  0.219 -0.058  0.087  0.073
#> 4   X2 0.825  0.057  0.138  0.216  0.166  0.104  0.126
#> 5   X6 0.802  0.091  0.289  0.218  0.352  0.045  0.060
#> 6   X4 0.745  0.243  0.071  0.114  0.090  0.319  0.347
#> 7   X3 0.744 -0.068  0.413  0.122  0.260  0.296  0.082
#> 8  X11 0.740  0.105 -0.011  0.249 -0.354  0.205 -0.093
#> 9  X15 0.382  0.143 -0.069  0.726  0.065  0.088 -0.155
#> 10 X10 0.356  0.173  0.181  0.108 -0.014  0.816  0.093
#> 11 X16 0.309  0.329  0.123  0.725  0.064  0.229 -0.045
#> 12  X8 0.295  0.683 -0.035 -0.089  0.359  0.302 -0.175
#> 13 X20 0.194  0.253  0.807  0.249  0.134  0.163 -0.021
#> 14 X19 0.162  0.896  0.104  0.146 -0.114  0.059  0.004
#> 15 X17 0.162 -0.155 -0.044  0.622  0.212 -0.256  0.052
#> 16 X21 0.156 -0.126  0.504 -0.189  0.178  0.604 -0.184
#> 17  X9 0.153  0.060  0.150  0.099  0.878 -0.070 -0.054
#> 18  X7 0.147 -0.158 -0.004  0.406  0.710  0.394 -0.008
#> 19 X22 0.144  0.952 -0.013  0.050 -0.019 -0.020  0.012
#> 20 X24 0.142  0.932  0.106  0.026 -0.021 -0.041 -0.004
#> 21  X1 0.123 -0.077  0.785 -0.018  0.015  0.071  0.094
#> 22 X14 0.113 -0.062 -0.053 -0.095 -0.055 -0.008  0.951
#> 23 X23 0.098  0.322  0.751  0.007  0.165  0.219 -0.001
#> 24 X18 0.062 -0.065  0.858 -0.118 -0.074 -0.068 -0.118

# Question 3: That the solution is, if possible, generic for n columns
data %>% filter_at(vars(starts_with("RC")), ~ .x > 0.04)
#>   id   RC1   RC2   RC3   RC4   RC5   RC6   RC7
#> 1 X2 0.825 0.057 0.138 0.216 0.166 0.104 0.126
#> 2 X6 0.802 0.091 0.289 0.218 0.352 0.045 0.060
#> 3 X4 0.745 0.243 0.071 0.114 0.090 0.319 0.347

# Question 4: If possible, how visually can the output be presented in table format (expected output)?
# Output is already a table; you can use the kable package for HTML table rendering

Created on 2021-09-09 by the reprex package (v2.0.1)
Model fit of SEM in Lavaan
What is the reason for CFI = 0 in a SEM model in lavaan? Statistic values are attached.
Well, first let's check how the CFI estimator works. Usually, SEM programs do not present CFI values below 0; if a negative value is obtained, the software shows 0. An example:

library(lavaan)
#> This is lavaan 0.6-8
#> lavaan is FREE software! Please report any bugs.

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data = HolzingerSwineford1939)
summary(fit, fit.measures = TRUE)
#> lavaan 0.6-8 ended normally after 35 iterations
#>
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                        21
#>
#>   Number of observations                           301
#>
#> Model Test User Model:
#>
#>   Test statistic                                85.306
#>   Degrees of freedom                                24
#>   P-value (Chi-square)                           0.000
#>
#> Model Test Baseline Model:
#>
#>   Test statistic                               918.852
#>   Degrees of freedom                                36
#>   P-value                                        0.000
#>
#> User Model versus Baseline Model:
#>
#>   Comparative Fit Index (CFI)                    0.931
#>   Tucker-Lewis Index (TLI)                       0.896
#>
#> Loglikelihood and Information Criteria:
#>
#>   Loglikelihood user model (H0)              -3737.745
#>   Loglikelihood unrestricted model (H1)      -3695.092
#>
#>   Akaike (AIC)                                7517.490
#>   Bayesian (BIC)                              7595.339
#>   Sample-size adjusted Bayesian (BIC)         7528.739
#>
#> Root Mean Square Error of Approximation:
#>
#>   RMSEA                                          0.092
#>   90 Percent confidence interval - lower         0.071
#>   90 Percent confidence interval - upper         0.114
#>   P-value RMSEA <= 0.05                          0.001
#>
#> Standardized Root Mean Square Residual:
#>
#>   SRMR                                           0.065
#>
#> Parameter Estimates:
#>
#>   Standard errors                             Standard
#>   Information                                 Expected
#>   Information saturated (h1) model          Structured
#>
#> Latent Variables:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   visual =~
#>     x1                1.000
#>     x2                0.554    0.100    5.554    0.000
#>     x3                0.729    0.109    6.685    0.000
#>   textual =~
#>     x4                1.000
#>     x5                1.113    0.065   17.014    0.000
#>     x6                0.926    0.055   16.703    0.000
#>   speed =~
#>     x7                1.000
#>     x8                1.180    0.165    7.152    0.000
#>     x9                1.082    0.151    7.155    0.000
#>
#> Covariances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   visual ~~
#>     textual           0.408    0.074    5.552    0.000
#>     speed             0.262    0.056    4.660    0.000
#>   textual ~~
#>     speed             0.173    0.049    3.518    0.000
#>
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .x1                0.549    0.114    4.833    0.000
#>    .x2                1.134    0.102   11.146    0.000
#>    .x3                0.844    0.091    9.317    0.000
#>    .x4                0.371    0.048    7.779    0.000
#>    .x5                0.446    0.058    7.642    0.000
#>    .x6                0.356    0.043    8.277    0.000
#>    .x7                0.799    0.081    9.823    0.000
#>    .x8                0.488    0.074    6.573    0.000
#>    .x9                0.566    0.071    8.003    0.000
#>     visual            0.809    0.145    5.564    0.000
#>     textual           0.979    0.112    8.737    0.000
#>     speed             0.384    0.086    4.451    0.000

As you can see, this model's X² is 85.306 with 24 degrees of freedom, and the baseline model's is 918.852 with 36 degrees of freedom. With that we can easily calculate the CFI by hand:

1-((85.306-24)/(918.852-36))
#> [1] 0.9305591

which you can compare with the CFI reported by the summary() function (i.e., 0.931). The statistics you reported allow us to check that your CFI would be negative if the software did not limit it to 0:

1-((5552.006-94)/(3181.455-21))
#> [1] -0.7269684

Created on 2021-03-27 by the reprex package (v1.0.0)
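Wrapped up as a small helper (cfi_by_hand is my own name, not a lavaan function), including the clamping at 0 that the software applies:

```r
# CFI from the user and baseline chi-square statistics, clamped at 0
cfi_by_hand <- function(chisq, df, chisq_base, df_base) {
  max(0, 1 - (chisq - df) / (chisq_base - df_base))
}

cfi_by_hand(85.306, 24, 918.852, 36)     # ~0.931 (HolzingerSwineford example)
cfi_by_hand(5552.006, 94, 3181.455, 21)  # 0 (raw value would be -0.727)
```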
How to extract the loadings as a matrix from princomp() in R
In using princomp() I get an object with a "loadings" attribute. The "loadings" is a composite object which holds information that I would prefer to have separate, in normal matrix format, so that I can handle and manipulate it freely. In particular I would like to extract the loadings in matrix format. I cannot find a way to extract this information from the object. Is this possible, and if yes, how? I want to use princomp() because it accepts a covariance matrix as input, which is easier to provide since my dataset is very large (7000000 x 650). For a reproducible example, please see below:

> data(mtcars)
> princomp_mtcars = princomp(mtcars)
> loadings = princomp_mtcars$loadings
> names(loadings)
NULL
> class(loadings)
[1] "loadings"
> loadings

Loadings (small loadings are suppressed in the print, so blanks collapse):
  mpg   0.982  0.144
  cyl   0.228 -0.239  0.794  0.425  0.189  0.132  0.145
  disp -0.900 -0.435
  hp   -0.435  0.899
  drat  0.133 -0.227  0.939  0.184
  wt   -0.128  0.244  0.127 -0.187 -0.156  0.391  0.830
  qsec -0.886  0.214  0.190  0.255  0.103 -0.204
  vs   -0.177 -0.103 -0.684  0.303  0.626
  am    0.136 -0.205  0.201  0.572 -0.163  0.733
  gear  0.130  0.276 -0.335  0.802 -0.217 -0.156  0.204 -0.191
  carb -0.103  0.269  0.855  0.284 -0.165 -0.128 -0.240

                 Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9 Comp.10 Comp.11
  SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000   1.000   1.000
  Proportion Var  0.091  0.091  0.091  0.091  0.091  0.091  0.091  0.091  0.091   0.091   0.091
  Cumulative Var  0.091  0.182  0.273  0.364  0.455  0.545  0.636  0.727  0.818   0.909   1.000

Your advice will be appreciated.
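One way to do this (a sketch using the mtcars example from the question): the "loadings" class is just a numeric matrix with a custom print method, so stripping the class with unclass() leaves an ordinary matrix.

```r
pm <- princomp(mtcars)      # mtcars ships with base R
L  <- unclass(pm$loadings)  # plain numeric matrix: 11 variables x 11 components
is.matrix(L)                # TRUE
L["mpg", "Comp.1"]          # individual entries are now directly accessible
# Note: the blanks in the printed loadings are only small values the print
# method suppresses; the matrix itself contains every loading.
```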
Goodness-of-fit indices "NA"
I'm running a non-recursive model with lavaan. However, two things happened that I didn't quite understand. First, goodness-of-fit indices and some standard errors were "NA". Second, the two coefficients between two variables of different directions were not consistent (non-recursive part: ResidentialMobility--Author): one was positive and the other was negative (at least they should be in the same direction; otherwise, how to explain it?). Can someone help me out? Please let me know if you want me to clarify it more. Thanks!

model01 <- '
ResidentialMobility ~ a*Coun
SavingMotherPercentage ~ e*Affect
SavingMotherPercentage ~ f*Author
SavingMotherPercentage ~ g*Recipro
Affect ~ b*ResidentialMobility
Author ~ c*ResidentialMobility
Recipro ~ d*ResidentialMobility
ResidentialMobility ~ h*Affect
ResidentialMobility ~ i*Author
ResidentialMobility ~ j*Recipro
Affect ~~ Author + Recipro + ResidentialMobility
Author ~~ Recipro + ResidentialMobility
Recipro ~~ ResidentialMobility
Coun ~ SavingMotherPercentage
ab := a*b
ac := a*c
ad := a*d
be := b*e
cf := c*f
dg := d*g
'
fit <- cfa(model01, estimator = "MLR", data = data01, missing = "FIML")
summary(fit, standardized = TRUE, fit.measures = TRUE)

Output:

lavaan (0.5-21) converged normally after 93 iterations

                                             Used    Total
  Number of observations                      502      506
  Number of missing patterns                    4

  Estimator                                    ML   Robust
  Minimum Function Test Statistic              NA       NA
  Degrees of freedom                           -2       -2
  Minimum Function Value            0.0005232772506
  Scaling correction factor for the Yuan-Bentler correction

User model versus baseline model:
  Comparative Fit Index (CFI)                  NA       NA
  Tucker-Lewis Index (TLI)                     NA       NA

Loglikelihood and Information Criteria:
  Loglikelihood user model (H0)         -5057.346 -5057.346
  Loglikelihood unrestricted model (H1) -5057.084 -5057.084
  Number of free parameters                    29        29
  Akaike (AIC)                          10172.693 10172.693
  Bayesian (BIC)                        10295.032 10295.032
  Sample-size adjusted Bayesian (BIC)   10202.984 10202.984

Root Mean Square Error of Approximation:
  RMSEA                                        NA       NA
  90 Percent Confidence Interval            NA NA    NA NA
  P-value RMSEA <= 0.05                        NA       NA

Standardized Root Mean Square Residual:
  SRMR                                      0.006    0.006

Parameter Estimates:
  Information                            Observed
  Standard Errors              Robust.huber.white

Regressions:
                      Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  ResidentialMobility ~
    Coun       (a)      -1.543    0.255   -6.052    0.000   -1.543   -0.540
  SavingMotherPercentage ~
    Affect     (e)       3.093    1.684    1.837    0.066    3.093    0.122
    Author     (f)       2.618    0.923    2.835    0.005    2.618    0.145
    Recipro    (g)       0.061    1.344    0.046    0.964    0.061    0.003
  Affect ~
    RsdntlMblt (b)      -0.311    0.075   -4.125    0.000   -0.311   -0.570
  Author ~
    RsdntlMblt (c)      -0.901    0.119   -7.567    0.000   -0.901   -1.180
  Recipro ~
    RsdntlMblt (d)      -0.313    0.082   -3.841    0.000   -0.313   -0.512
  ResidentialMobility ~
    Affect     (h)      -0.209    0.193   -1.082    0.279   -0.209   -0.114
    Author     (i)       0.475    0.192    2.474    0.013    0.475    0.363
    Recipro    (j)       0.178    0.346    0.514    0.607    0.178    0.109
  Coun ~
    SvngMthrPr           0.003    0.001    2.225    0.026    0.003    0.108

Covariances:
                      Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .Affect ~~
   .Author              0.667    0.171    3.893    0.000    0.667    0.534
   .Recipro             0.669    0.119    5.623    0.000    0.669    0.773
 .ResidentialMobility ~~
   .Affect              0.624    0.144    4.347    0.000    0.624    0.474
 .Author ~~
   .Recipro             0.565    0.173    3.267    0.001    0.565    0.416
 .ResidentialMobility ~~
   .Author              1.029    0.288    3.572    0.000    1.029    0.499
   .Recipro             0.564    0.304    1.851    0.064    0.564    0.395

Intercepts:
                      Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .ResidentlMblty      1.813       NA                      1.813    1.270
   .SvngMthrPrcntg     29.591    7.347    4.027    0.000   29.591    1.499
   .Affect              5.701    0.169   33.797    0.000    5.701    7.320
   .Author              5.569    0.275   20.259    0.000    5.569    5.109
   .Recipro             5.149    0.186   27.642    0.000    5.149    5.889
   .Coun                0.367    0.069    5.336    0.000    0.367    0.735

Variances:
                      Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .ResidentlMblty      2.169    0.259    8.378    0.000    2.169    1.064
   .SvngMthrPrcntg    363.792   23.428   15.528    0.000  363.792    0.934
   .Affect              0.797    0.129    6.153    0.000    0.797    1.314
   .Author              1.957    0.343    5.713    0.000    1.957    1.647
   .Recipro             0.941    0.126    7.439    0.000    0.941    1.231
   .Coun                0.242    0.004   54.431    0.000    0.242    0.969

Defined Parameters:
                      Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    ab                  0.480    0.120    3.991    0.000    0.480    0.308
    ac                  1.390    0.261    5.328    0.000    1.390    0.637
    ad                  0.483    0.133    3.640    0.000    0.483    0.276
    be                 -0.962    0.548   -1.757    0.079   -0.962   -0.070
    cf                 -2.359    0.851   -2.771    0.006   -2.359   -0.171
    dg                 -0.019    0.421   -0.046    0.964   -0.019   -0.001
The reason you get NA, I think, is that you have specified a model with -2 degrees of freedom. You should specify the model differently so that you get a positive number of degrees of freedom.
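To see where the -2 comes from (assuming the six observed variables in the model and the 29 free parameters reported in the output):

```r
# Degrees of freedom = sample moments - free parameters.
# With a mean structure (missing = "FIML"), p observed variables supply
# p*(p+1)/2 variances/covariances plus p means.
p <- 6
sample_moments  <- p * (p + 1) / 2 + p   # 21 + 6 = 27
free_parameters <- 29                    # "Number of free parameters" above
df_model <- sample_moments - free_parameters
df_model  # -2, matching the reported degrees of freedom
```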