So, I have to interpolate a function f(x) given by the following data.
x = 0:0.1:2.8;
y = [0 0.717 0.999 0.675 0.0583 0.7568 0.9961 0.6312];
Here's the code I have at the moment.
clc
clear
x = 0:0.1:2.8;
y = [0 0.717 0.999 0.675 0.0583 0.7568 0.9961 0.6312];
n = length(x);
del = %nan * ones (n ,7) ;
del (:,1) = y';
for j = 2:7
for i = 1: n - j +1
del (i,j) = del(i+1,j-1) - del(i,j-1);
end
end
del = [x'del];
del = round ( del *10^3) /10^3;
mprintf ("%5s,%7s,%8s,%9s,%8s,%8s,%8s",'x','y','dy','d2y','d3y','d4y','d5y')
disp ( del )
It gives me a "Submatrix incorrectly defined" error.
Where could the problem be?
x and y should have the same length, but that is not the case with your data. For example, you can set
y = [0 0.717 0.999 0.675 0.0583 0.7568 0.9961 0.6312];
x = linspace(0,2.8,length(y));
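To see the mismatch concretely, here is a small sketch in Python (not Scilab, purely for illustration). It counts the points produced by the range 0:0.1:2.8 and builds an evenly spaced grid with as many points as y. The helper `linspace` is a hypothetical stand-in for Scilab's function of the same name.

```python
# Data from the question.
y = [0, 0.717, 0.999, 0.675, 0.0583, 0.7568, 0.9961, 0.6312]

# Number of points generated by the range 0:0.1:2.8.
n_range = round((2.8 - 0) / 0.1) + 1
print(n_range, len(y))  # 29 8


def linspace(start, stop, n):
    """Return n evenly spaced points from start to stop (both included)."""
    step = (stop - start) / (n - 1)
    return [start + k * step for k in range(n)]


# Grid with exactly one x per y value.
x = linspace(0, 2.8, len(y))
print([round(v, 3) for v in x])
```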
The line del = [x'del]; fails because a space is missing. It should be written as
del = [x' del];
Then your script outputs the result:
x, y, dy, d2y, d3y, d4y, d5y
0. 0. 0.717 -0.435 -0.171 0.484 0.81 -5.487
0.4 0.717 0.282 -0.606 0.313 1.295 -4.677 9.689
0.8 0.999 -0.324 -0.293 1.608 -3.382 5.012 Nan
1.2 0.675 -0.617 1.315 -1.774 1.629 Nan Nan
1.6 0.058 0.699 -0.459 -0.145 Nan Nan Nan
2. 0.757 0.239 -0.604 Nan Nan Nan Nan
2.4 0.996 -0.365 Nan Nan Nan Nan Nan
2.8 0.631 Nan Nan Nan Nan Nan Nan
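For comparison, the same forward difference table can be sketched in Python. This is an illustrative re-implementation of the double loop above, not the Scilab code itself; the function name `forward_difference_table` is made up for this example, and missing entries are stored as None instead of %nan.

```python
def forward_difference_table(y, max_order):
    """Return rows [y_i, dy_i, d2y_i, ...]; entries that do not exist are None."""
    n = len(y)
    table = [[None] * (max_order + 1) for _ in range(n)]
    for i in range(n):
        table[i][0] = y[i]
    # Column j holds the j-th forward differences: d^j y_i = d^(j-1) y_(i+1) - d^(j-1) y_i
    for j in range(1, max_order + 1):
        for i in range(n - j):
            table[i][j] = table[i + 1][j - 1] - table[i][j - 1]
    return table


y = [0, 0.717, 0.999, 0.675, 0.0583, 0.7568, 0.9961, 0.6312]
table = forward_difference_table(y, 6)
for row in table:
    print(["NaN" if v is None else round(v, 3) for v in row])
```

Each column is one entry shorter than the previous one, which is why the lower right of the table is filled with NaN.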
Minimize
$$f(Z) = \sum_{t=2}^{IJ} (Z_t - Z_{t-1})^2$$
subject to the constraint
$$\sum_{j=1}^{J} Z_{(i-1)J+j+K} = y^f_i, \quad i = 1, \dots, I-1.$$
This optimization estimates quarterly values from a fiscal-year data series (y^f_i); the quarterly values are then summed to obtain annual values. I is the number of calendar years in the series interval (2<=I<=N), J is the number of quarters per year, and K is the number of periods of calendar year i that fall in fiscal year i-1.
In my case, I = 39, J = 4, K = 2
How can I solve this problem using R?
The way I tried to write code is provided below:
library(NlcOptim)
library(readxl)
Calendarization <- read_excel("C:/Users/HP/Desktop/Calendarization.xlsx")
View(Calendarization)
y<-Calendarization$`wholesale price`
objfun = function(z){
return(sum(z[t] - lag(z[t], k=1))^2)
}
for (t in 2:156){
objfun
} -> objfun
p0<-0:39
Aeq<-sum(z[((i-1)*4)+j+2])
for (j in 1:4){
for (i in 1:39){
Aeq
}->Aeq
}
Beq<- y[i]
x=p0
solnl(x, objfun=objfun, Aeq=Aeq, Beq=Beq)
Here is the data I have:
year wholesale price
1970-1971 0.99
1971-1972 1.32
1972-1973 20.9
1973-1974 2.83
1974-1975 5.78
1975-1976 3.38
1976-1977 3.02
1977-1978 2.88
1978-1979 4.08
1979-1980 5.4
1980-1981 4.51
1981-1982 5.91
1982-1983 6.42
1983-1984 7.07
1984-1985 7.68
1985-1986 8.04
1986-1987 9.62
1987-1988 10.05
1988-1989 9.81
1989-1990 9.6
1990-1991 10.59
1991-1992 11.08
1992-1993 9.42
1993-1994 9.6
1994-1995 12.28
1995-1996 12.58
1996-1997 10.87
1997-1998 12.09
1998-1999 13.66
1999-2000 12.28
2000-2001 11.75
2001-2002 11.49
2002-2003 13.08
2003-2004 13.43
2004-2005 15.06
2005-2006 16.5
2006-2007 18.48
2007-2008 24.74
2008-2009 26.69
There seems to be something wrong with the formulation. Just looking at the formulas in the image, the last constraint, for i=I-1=39-1=38, sums the z elements (38-1)*4 + 1 + 6, (38-1)*4 + 2 + 6, (38-1)*4 + 3 + 6 and (38-1)*4 + 4 + 6, that is, elements 155, 156, 157 and 158. However, z goes from 1 to 4*39 and so has only 156 elements. Furthermore, not all the z values participate in a constraint.
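The index arithmetic can be checked with a few lines of Python (used here only as a calculator). The offset of 6 is the shift used in the arithmetic of this answer, taken as given; the variable names are illustrative.

```python
I, J = 39, 4      # number of years and quarters per year, as in the question
offset = 6        # shift used in the arithmetic above (assumed as stated there)
i = I - 1         # last constrained year, i = 38

# Indices of z that the last constraint would sum over.
indices = [(i - 1) * J + j + offset for j in range(1, J + 1)]
n_z = I * J       # length of z
out_of_range = [t for t in indices if t > n_z]

print(indices)        # [155, 156, 157, 158]
print(n_z)            # 156
print(out_of_range)   # [157, 158]
```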
Given the problems cited, let us change the problem to one that makes sense and assume we want to minimize the sum of the squares of the successive differences of the I*J elements of z, subject to the first 4 elements of z summing to Cal[1, 2], the next 4 summing to Cal[2, 2], and so on up to the last 4 elements of z summing to Cal[39, 2]. In that case we can write the Aeq constraint matrix as a block diagonal matrix using the Kronecker product shown. We ignore K. (Cal is shown reproducibly in the Note at the end.)
library(NlcOptim)
I = 39; J = 4
objfun <- function(x) sum(diff(x)^2)
Aeq <- diag(I) %x% matrix(1, 1, J)
Beq <- Cal[, 2]
st <- rep(1, I*J)
res <- solnl(st, objfun, Aeq = Aeq, Beq = Beq)
giving
> res
List of 6
$ par : num [1:156, 1] 0.576 0.445 0.182 -0.213 -0.739 ...
$ fn : num 25.5
$ counts : num [1, 1:2] 19332 124
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:2] "nfval" "ngval"
$ lambda :List of 3
..$ lower: num [1:156, 1] 0 0 0 0 0 0 0 0 0 0 ...
..$ upper: num [1:156, 1] 0 0 0 0 0 0 0 0 0 0 ...
..$ eqlin: num [1:39] 0.263 1.486 2.427 1.662 0.68 ...
$ grad : num [1:156, 1] 0.263 0.263 0.263 0.263 -1.486 ...
$ hessian: num [1:156, 1:156] 1.65775 -1.10379 0.62425 -0.09085 0.00878 ...
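To make the structure of Aeq easier to see, here is a small Python sketch that builds the same kind of block matrix as diag(I) %x% matrix(1, 1, J) for a toy case with 3 years and 4 quarters. It only illustrates the constraint matrix and the objective and does not solve the optimization; the function names and the toy vector z are made up for the example.

```python
def block_sum_matrix(n_blocks, block_size):
    """Rows of kron(identity(n_blocks), ones(1, block_size)) as nested lists."""
    rows = []
    for i in range(n_blocks):
        row = [0] * (n_blocks * block_size)
        for j in range(block_size):
            row[i * block_size + j] = 1
        rows.append(row)
    return rows


def sum_sq_diff(z):
    """Objective: sum of squared successive differences."""
    return sum((b - a) ** 2 for a, b in zip(z, z[1:]))


Aeq = block_sum_matrix(3, 4)
for row in Aeq:
    print(row)

# Hypothetical quarterly values: constant within each year.
z = [1.0] * 4 + [2.0] * 4 + [3.0] * 4
block_sums = [sum(a * v for a, v in zip(row, z)) for row in Aeq]
print(block_sums)       # [4.0, 8.0, 12.0]  (one annual total per row of Aeq)
print(sum_sq_diff(z))   # 2.0
```

Each row of Aeq picks out the four quarters of one year, so Aeq times z gives the annual totals that are set equal to Beq.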
Note
Lines <- "year wholesale price
1970-1971 0.99
1971-1972 1.32
1972-1973 20.9
1973-1974 2.83
1974-1975 5.78
1975-1976 3.38
1976-1977 3.02
1977-1978 2.88
1978-1979 4.08
1979-1980 5.4
1980-1981 4.51
1981-1982 5.91
1982-1983 6.42
1983-1984 7.07
1984-1985 7.68
1985-1986 8.04
1986-1987 9.62
1987-1988 10.05
1988-1989 9.81
1989-1990 9.6
1990-1991 10.59
1991-1992 11.08
1992-1993 9.42
1993-1994 9.6
1994-1995 12.28
1995-1996 12.58
1996-1997 10.87
1997-1998 12.09
1998-1999 13.66
1999-2000 12.28
2000-2001 11.75
2001-2002 11.49
2002-2003 13.08
2003-2004 13.43
2004-2005 15.06
2005-2006 16.5
2006-2007 18.48
2007-2008 24.74
2008-2009 26.69"
Cal <- read.table(text = Lines, skip = 1, col.names = c("year", "wholesale price"),
check.names = FALSE, strip.white = TRUE)
I am trying to plot portfolio mean vs standard deviation (scatter plots) for several different values of the correlation between stock returns. However for each different value of the correlation coefficient, I can't seem to be able to get a different value of the standard deviation vector.
rA = .05; sA = .00; lA = "A" # mean and std.dev. and label of stock A returns
rB = .10; sB = .38; lB = "B" # mean and std.dev. and label of stock B returns
# several possible values of the correlation (There should be one curve for each value)
rrab1 = 1
rrab2 = .5
rrab3 = 0 # and other such values
# Create portfolio weight vectors (Needed to generate the curves)
x = collect(0 : .05 : 1) # weight on stock A (a vector)
y = 1 .-x # weight on stock B (a vector)
# Evaluate the portfolio mean and the portfolio std.dev.
rab1 = rA*x + rB*y # Portfolio return
sab1 = sqrt.(sA^2*x.^2 + sB^2*y.^2 - 2*rrab1*sA*sB .*x .* y) # Portfolio std.dev. for correl = 1
rab2 = rA*x + rB*y
sab2 = sqrt.(sA^2*x.^2 + sB^2*y.^2 - 2*rrab2*sA*sB .*x .* y ) # Portfolio std.dev. for correl = .5
rab3 = rA*x + rB*y
sab3 = sqrt.(sA^2*x.^2 + sB^2*y.^2 - 2*rrab3*sA*sB .*x .* y ) # Portfolio std.dev. for correl = 0
and so on for other possible values of correlation.
I want to plot all 3 curves on the same r-s axes using PGFPlotsX, but I am getting just a single curve, the one for rrab1.
If I check my std.dev. vectors I find that I am unable to get different vectors.
sab1 == sab2 == sab3 # outputs true
using DataFrames
s = DataFrame()
s.x = sab1
s.y = sab3
s
outputs:
21 rows × 2 columns
x y
Float64 Float64
1 0.38 0.38
2 0.361 0.361
3 0.342 0.342
4 0.323 0.323
5 0.304 0.304
6 0.285 0.285
7 0.266 0.266
8 0.247 0.247
9 0.228 0.228
10 0.209 0.209
11 0.19 0.19
12 0.171 0.171
13 0.152 0.152
14 0.133 0.133
15 0.114 0.114
16 0.095 0.095
17 0.076 0.076
18 0.057 0.057
19 0.038 0.038
20 0.019 0.019
21 0.0 0.0
That is, all the s vectors are the same although the correlations are different.
I am new to Julia and must be missing something elementary.
I think Bogumil's comment should be an answer - this isn't really a Julia issue but follows directly from your assumptions. You start from:
sA = .00
and then calculate:
sab1 = sqrt.(sA^2*x.^2 + sB^2*y.^2 - 2*rrab1*sA*sB .*x .* y)
sab2 = sqrt.(sA^2*x.^2 + sB^2*y.^2 - 2*rrab2*sA*sB .*x .* y )
sab3 = sqrt.(sA^2*x.^2 + sB^2*y.^2 - 2*rrab3*sA*sB .*x .* y )
Simplifying this by using the fact that sA = 0.0 and therefore all terms multiplied by sA vanish:
sab1 = sqrt.(sB^2*y.^2)
sab2 = sqrt.(sB^2*y.^2)
sab3 = sqrt.(sB^2*y.^2)
and indeed
julia> sqrt.(sB^2 * y.^2)
21-element Vector{Float64}:
0.38
0.361
0.342
0.323
0.304
0.28500000000000003
0.26599999999999996
0.24700000000000003
0.22799999999999998
0.20900000000000002
0.19
0.17099999999999999
0.152
0.13299999999999998
0.11400000000000002
0.095
0.07599999999999998
0.05700000000000001
0.03799999999999999
0.019000000000000017
0.0
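The same effect can be reproduced outside Julia. The Python sketch below is only an illustration: `portfolio_std` and `curves` are made-up names, and the value 0.20 for the second run is a hypothetical standard deviation chosen to show the contrast. It uses the standard two-asset variance formula, in which the covariance term carries a plus sign; the sign makes no difference to the point here, because that term is multiplied by sA.

```python
import math


def portfolio_std(w_a, s_a, s_b, rho):
    """Std. dev. of a two-asset portfolio with weight w_a on A and 1 - w_a on B."""
    w_b = 1.0 - w_a
    var = (s_a * w_a) ** 2 + (s_b * w_b) ** 2 + 2.0 * rho * s_a * s_b * w_a * w_b
    return math.sqrt(max(var, 0.0))  # guard against tiny negative rounding errors


weights = [k * 0.05 for k in range(21)]   # weight on stock A
rhos = [1.0, 0.5, 0.0]                    # correlations from the question


def curves(s_a, s_b=0.38):
    """One list of portfolio std. devs. per correlation value."""
    return [[portfolio_std(w, s_a, s_b, rho) for w in weights] for rho in rhos]


flat = curves(0.0)      # sA = 0 as in the question
spread = curves(0.20)   # hypothetical nonzero sA

print(flat[0] == flat[1] == flat[2])   # True: correlation has no effect
print(spread[0] == spread[1])          # False: curves now differ
```

So to get distinct curves, stock A needs a nonzero standard deviation.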
My dataset is very similar to the 'Melanoma' dataset included in the riskRegression package: 3307 patients, 502 events of interest (fracture), 264 deaths (competing risk). The time is the number of years after bone examination (DXA), and status is coded as 0=censored, 1=fracture, 2=death.
I am trying to fit a Fine-Gray model with an interaction, but when I introduce an interaction term of the form var1 * var2, I receive an error message:
« Error in design[pos, , drop = FALSE] : subscript out of bounds » .
Here is my code :
fgr<-FGR(Hist(time,status)~age+htot_bmd+tot_bmd+amof+PR+atcdtfam+AlcFR+PR+BMI3C+malchronFR+malchronFR*BMI3C+atcdtfam*PR,data=df2,cause=1)
I tried the code provided in the paper of Zhongheng et al. "Model validation for competing risks data" with the data set 'Melanoma' introducing an interaction but the same error message appears.
Is it possible to introduce an interaction with FGR, and if so, how?
Thanks
You can do this with your data using the following code:
> library(riskRegression)
> library(survival)
> library(prodlim)
> library(cmprsk)
> library(readxl)
> df2 <- read_xlsx("/Users/zhang/Downloads/df2.xlsx")
New names:
* `` -> ...1
> df2
# A tibble: 300 x 14
...1 neck_bmd htot_bmd tot_bmd age AlcFR PR atcdtfam malchronFR amof BMI3C time event
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 1 0.960 0.953 1.04 79.1 0 0 0 0 2 3 9.00 Cen
2 2 0.612 0.620 0.988 79.2 0 0 0 0 0 3 4.76 MOF
3 3 0.880 0.990 0.827 78.6 0 0 0 1 1 2 9.14 Cen
4 4 0.869 0.905 0.866 79.0 0 0 0 0 0 2 9.11 Cen
5 5 0.863 0.991 1.17 79.0 1 0 1 0 0 2 10.2 Cen
6 6 0.722 0.902 0.842 78.8 0 0 0 0 0 2 9.09 Cen
7 7 0.853 0.929 1.33 76.9 0 0 0 0 0 3 10.1 Cen
8 8 0.830 0.912 0.947 77.0 0 0 0 1 0 2 8.13 Cen
9 9 0.872 0.968 1.22 77.2 1 0 0 0 0 2 8.12 Cen
10 10 0.639 0.776 0.822 76.7 0 0 0 1 0 2 8.12 Cen
# … with 290 more rows, and 1 more variable: status <dbl>
> modMatrix <- model.matrix(~age+htot_bmd+tot_bmd+amof+atcdtfam+
+ AlcFR+PR+BMI3C*malchronFR+neck_bmd,df2)[,-1]
> dtInteraction <- cbind(data.frame( modMatrix),
+ status=df2$status, time=df2$time)
> fgr.Interaction<- FGR(as.formula(paste("Hist(time,status)~",
+ paste(names(dtInteraction[1:11]),collapse = "+"))),
+ data = dtInteraction,cause = 1)
> score.cv<-riskRegression::Score(list("Fine-Gray"= fgr.Interaction),
+ formula = Hist(time,status)~1,
+ data=dtInteraction,times = sort(unique(dtInteraction$time))[25:200],
+ cens.method="jackknife",
+ se.fit=1L,plots="calibration")
> plotCalibration(score.cv,times = df2$time[11],
+ cens.method="local")
You can use the model.matrix function as follows. The crr() function can handle the interaction.
> library(riskRegression)
> library(survival)
> library(prodlim)
> library(cmprsk)
> data(Melanoma)
> Melanoma$id<-1:nrow(Melanoma)
> set.seed(123)
> ind.split<-sample(1:nrow(Melanoma),
+ round(nrow(Melanoma)*4/5),
+ replace = F)
> dftrain<-Melanoma[ind.split,]
> dftest<-Melanoma[-ind.split,]
> fgr.full<-FGR(Hist(time,status)~age+thick+ici+
+ epicel+ulcer+sex+invasion,
+ data=dftrain,cause=1)
> modMatrix <- model.matrix(~thick+ici+
+ epicel+ulcer*age+invasion,dftrain)[,-1]
>
> fgrMod <- crr(ftime = dftrain$time,
+ fstatus = dftrain$status,
+ cov1 = modMatrix,failcode=2)
> summary(fgrMod)
Competing Risks Regression
Call:
crr(ftime = dftrain$time, fstatus = dftrain$status, cov1 = modMatrix,
failcode = 2)
coef exp(coef) se(coef) z p-value
thick 0.1194 1.127 0.1292 0.924 0.3600
ici1 -0.7607 0.467 1.0721 -0.710 0.4800
ici2 -0.8531 0.426 0.9379 -0.910 0.3600
ici3 -0.1924 0.825 1.0895 -0.177 0.8600
epicelpresent 0.8973 2.453 0.8434 1.064 0.2900
ulcerpresent -0.7101 0.492 1.9776 -0.359 0.7200
age 0.0627 1.065 0.0227 2.766 0.0057
invasionlevel.1 -1.2031 0.300 0.7068 -1.702 0.0890
invasionlevel.2 -2.0365 0.130 1.4121 -1.442 0.1500
ulcerpresent:age 0.0152 1.015 0.0320 0.473 0.6400
exp(coef) exp(-coef) 2.5% 97.5%
thick 1.127 0.887 0.87475 1.45
ici1 0.467 2.140 0.05716 3.82
ici2 0.426 2.347 0.06780 2.68
ici3 0.825 1.212 0.09752 6.98
epicelpresent 2.453 0.408 0.46968 12.81
ulcerpresent 0.492 2.034 0.01019 23.71
age 1.065 0.939 1.01844 1.11
invasionlevel.1 0.300 3.330 0.07515 1.20
invasionlevel.2 0.130 7.664 0.00819 2.08
ulcerpresent:age 1.015 0.985 0.95348 1.08
Num. cases = 164
Pseudo Log-likelihood = -52.3
Pseudo likelihood ratio test = 21.1 on 10 df,
Then you can try the following code:
library(riskRegression)
library(survival)
library(prodlim)
library(cmprsk)
data(Melanoma)
Melanoma$id<-1:nrow(Melanoma)
set.seed(123)
ind.split<-sample(1:nrow(Melanoma),
round(nrow(Melanoma)*4/5),
replace = F)
dftrain<-Melanoma[ind.split,]
dftest<-Melanoma[-ind.split,]
fgr.NoInteraction<-FGR(Hist(time,status)~age+thick+ici+
epicel+ulcer+sex+invasion,
data=dftrain,cause=1)
modMatrix <- model.matrix(~thick+ici+
epicel+ulcer*age+invasion,dftrain)[,-1]
dtInteraction <- cbind(data.frame( modMatrix),status=dftrain$status,
time=dftrain$time)
fgr.Interaction<- FGR(as.formula(paste("Hist(time,status)~",paste(names(dtInteraction[1:9]),collapse = "+"))),
data = dtInteraction,cause = 1)
score.cv<-riskRegression::Score(list("Fine-Gray"= fgr.Interaction),
formula = Hist(time,status)~1,
data=dtInteraction,times = sort(unique(dtInteraction$time)),
cens.method="jackknife",
se.fit=1L,plots="calibration")
plotCalibration(score.cv,times = 3330,cens.method="local")
We still need to add the interaction term with model.matrix. However, only an object returned by FGR can be used as input to the Score function. The other figures in the paper can be produced with similar tricks.
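The idea behind the model.matrix trick is simply to precompute the interaction as an ordinary column, so that the model formula only contains plain main-effect terms. A minimal Python sketch of that idea (not of model.matrix itself) is given below; the rows and the helper `add_interaction` are hypothetical.

```python
def add_interaction(rows, a, b):
    """Return copies of rows with an extra column 'a:b' equal to the product a * b."""
    name = f"{a}:{b}"
    return [{**row, name: row[a] * row[b]} for row in rows]


# Hypothetical design rows: a 0/1 dummy and a numeric covariate.
rows = [
    {"ulcerpresent": 1, "age": 60},
    {"ulcerpresent": 0, "age": 45},
]

design = add_interaction(rows, "ulcerpresent", "age")
for row in design:
    print(row)
# {'ulcerpresent': 1, 'age': 60, 'ulcerpresent:age': 60}
# {'ulcerpresent': 0, 'age': 45, 'ulcerpresent:age': 0}
```

Once the product column exists in the data, it can be listed in the formula like any other covariate, which is what the paste(names(...), collapse = "+") call above does.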
I'm trying to apply a CAP function to chemical data collected in different years.
I have a data archive:
head(ISPA_data)
SrCa MgCa MnCa RbCa CuCa ZnCa BaCa PbCa NaCa LiCa CoCa NiCa
1 5178 25.101 9.334 0.166 4.869 8.379 34.846 0.194 5464 0.313 2.510 25.181
2 6017 22.922 7.185 0.166 4.685 8.720 24.659 0.154 4600 0.300 2.475 25.060
3 5628 26.232 6.248 0.179 4.628 10.157 23.942 0.166 5378 0.300 2.529 25.252
4 4769 35.598 7.683 0.131 4.370 8.735 50.068 0.180 5938 0.568 2.159 21.645
5 5330 28.284 6.828 0.130 5.370 12.742 34.257 0.220 5614 0.397 2.275 23.852
6 5786 24.603 4.797 0.156 5.317 13.331 66.896 0.117 5001 0.423 2.298 24.361
and an environmental dataset:
head(ISPA.env)
Year OM Code Location
<dbl> <chr> <chr> <chr>
1 1975 0.04349 CSP75_25 CSP
2 1975 0.0433 CSP75_28 CSP
3 1975 0.04553 CSP75_31 CSP
4 1975 0.0439 CSP75_33 CSP
5 1975 0.02998 CSP75_37 CSP
6 1975 0.0246 CSP75_39 CSP
When performing CAPdiscrim,
Ordination.model1 <- CAPdiscrim(ISPA_data~Year,
ISPA.env,
dist="euclidean",
axes=4,
m=0,
add=FALSE,
permutations=999)
this Error occurs:
Error in eval(predvars, data, env) :
numeric 'envir' arg not of length one
Besides: Warning message:
In cmdscale(distmatrix, k = nrow(x) - 1, eig = T, add = add) :
only 13 of the first 19 eigenvalues are > 0
All data has the same length.
Can anyone help me? Thanks!
I am using semPaths (semPlot package) to draw my structural equation models. After some trial and error, I have a pretty good script to show what I want. Except, I haven’t been able to figure out how to include the p-value/significance levels of the estimates/regression coefficients in the figure.
Can/how can I include significance levels either as e.g. p-value in the edge labels below the estimate or as a broken line for insignificance or …?
I am also interested in including the R-square, but not as critically as the significance level.
This is the script I am using so far:
semPaths(fitmod.bac.class2,
what = "std",
whatLabels = "std",
style="ram",
edge.label.cex = 1.3,
layout = 'tree',
intercepts=FALSE,
residuals=FALSE,
nodeLabels = c("Negati-\nvicutes","cand_class\n_MB_A2_108", "CO2", "Bacilli","Ignavi-\nbacteria","C/N", "pH","Water\ncontent"),
sizeMan=7 )
Example of one of the SemPath outputs
In this example the following are not significant:
Ignavibacteria -> First_C_CO2_ugC_gC_day, p = 0.096
pH -> Ignavibacteria, p = 0.151
cand_class_MB_A2_108 <-> Bacilli correlation, p = 0.054
I am an R user and not really a coder, so I might just be missing a crucial point in the arguments.
I am testing a lot of different models at the moment, and would really like not to have to draw them all up by hand.
update:
Using semPlotModel: Am I right in understanding that semPlotModel doesn’t include the significance levels from the sem function (see my script and output below)? I am specifically looking to include the P(>|z|) for regressions and covariance.
Is it just me that is missing that, or is it not included? If it is not included, my solution is simply just to custom the edge labels.
{model.NA.UP.bac.class2 <- '
#LATANT VARIABLES
#REGRESSIONS
#soil organic carbon quality
c_Negativicutes ~ CN
#microorganisms
First_C_CO2_ugC_gC_day ~ c_Bacilli
First_C_CO2_ugC_gC_day ~ c_Ignavibacteria
First_C_CO2_ugC_gC_day ~ c_cand_class_MB_A2_108
First_C_CO2_ugC_gC_day ~ c_Negativicutes
#pH
c_Bacilli ~pH
c_Ignavibacteria ~pH
c_cand_class_MB_A2_108~pH
c_Negativicutes ~pH
#COVARIANCE
initial_water ~~ CN
c_cand_class_MB_A2_108 ~~ c_Bacilli
'
fitmod.bac.class2 <- sem(model.NA.UP.bac.class2, data=datapNA.UP.log, missing="ml", meanstructure=TRUE, fixed.x=FALSE, std.lv=FALSE, std.ov=FALSE)
summary(fitmod.bac.class2, standardized=TRUE, fit.measures=TRUE, rsq=TRUE)
out <- capture.output(summary(fitmod.bac.class2, standardized=TRUE, fit.measures=TRUE, rsq=TRUE))
}
Output:
lavaan 0.6-5 ended normally after 188 iterations
Estimator ML
Optimization method NLMINB
Number of free parameters 28
Number of observations 30
Number of missing patterns 1
Model Test User Model:
Test statistic 17.816
Degrees of freedom 16
P-value (Chi-square) 0.335
Model Test Baseline Model:
Test statistic 101.570
Degrees of freedom 28
P-value 0.000
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.975
Tucker-Lewis Index (TLI) 0.957
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) 472.465
Loglikelihood unrestricted model (H1) 481.373
Akaike (AIC) -888.930
Bayesian (BIC) -849.697
Sample-size adjusted Bayesian (BIC) -936.875
Root Mean Square Error of Approximation:
RMSEA 0.062
90 Percent confidence interval - lower 0.000
90 Percent confidence interval - upper 0.185
P-value RMSEA <= 0.05 0.414
Standardized Root Mean Square Residual:
SRMR 0.107
Parameter Estimates:
Information Observed
Observed information based on Hessian
Standard errors Standard
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
c_Negativicutes ~
CN 0.419 0.143 2.939 0.003 0.419 0.416
c_cand_class_MB_A2_108 ~
CN -0.433 0.160 -2.707 0.007 -0.433 -0.394
First_C_CO2_ugC_gC_day ~
c_Bacilli 0.525 0.128 4.092 0.000 0.525 0.496
c_Ignavibacter 0.207 0.124 1.667 0.096 0.207 0.195
c_c__MB_A2_108 0.310 0.125 2.475 0.013 0.310 0.301
c_Negativicuts 0.304 0.137 2.220 0.026 0.304 0.271
c_Bacilli ~
pH 0.624 0.135 4.604 0.000 0.624 0.643
c_Ignavibacteria ~
pH 0.245 0.171 1.436 0.151 0.245 0.254
c_cand_class_MB_A2_108 ~
pH 0.393 0.151 2.597 0.009 0.393 0.394
c_Negativicutes ~
pH 0.435 0.129 3.361 0.001 0.435 0.476
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
CN ~~
initial_water 0.001 0.000 2.679 0.007 0.001 0.561
.c_cand_class_MB_A2_108 ~~
.c_Bacilli -0.000 0.000 -1.923 0.054 -0.000 -0.388
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.c_Negativicuts 0.145 0.198 0.734 0.463 0.145 3.826
.c_c__MB_A2_108 1.038 0.226 4.594 0.000 1.038 25.076
.Frs_C_CO2_C_C_ -0.346 0.233 -1.485 0.137 -0.346 -8.115
.c_Bacilli 0.376 0.135 2.778 0.005 0.376 9.340
.c_Ignavibacter 0.754 0.170 4.424 0.000 0.754 18.796
CN 0.998 0.007 145.158 0.000 0.998 26.502
pH 0.998 0.008 131.642 0.000 0.998 24.034
initial_water 0.998 0.008 125.994 0.000 0.998 23.003
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.c_Negativicuts 0.001 0.000 3.873 0.000 0.001 0.600
.c_c__MB_A2_108 0.001 0.000 3.833 0.000 0.001 0.689
.Frs_C_CO2_C_C_ 0.001 0.000 3.873 0.000 0.001 0.408
.c_Bacilli 0.001 0.000 3.873 0.000 0.001 0.586
.c_Ignavibacter 0.002 0.000 3.873 0.000 0.002 0.936
CN 0.001 0.000 3.873 0.000 0.001 1.000
initial_water 0.002 0.000 3.873 0.000 0.002 1.000
pH 0.002 0.000 3.873 0.000 0.002 1.000
R-Square:
Estimate
c_Negativicuts 0.400
c_c__MB_A2_108 0.311
Frs_C_CO2_C_C_ 0.592
c_Bacilli 0.414
c_Ignavibacter 0.064
Warning message:
In lav_model_hessian(lavmodel = lavmodel, lavsamplestats = lavsamplestats, :
lavaan WARNING: Hessian is not fully symmetric. Max diff = 5.15131396241486e-05
This example is taken from ?semPaths since we don't have your object.
library('semPlot')
modFile <- tempfile(fileext = '.OUT')
download.file('http://sachaepskamp.com/files/mi1.OUT', modFile)
Use semPlotModel to get the object without plotting. There you can inspect what is to be plotted. I just dug around without reading the docs until I found what it seems to be using.
After you run semPlotModel, the object has an element x@Pars which contains the edges, nodes, and the std which is being used for the edge labels in your case. semPaths also has an argument that allows you to make custom edge labels, so you can take the data you need from x@Pars and add your p-values:
x <- semPlotModel(modFile)
x@Pars
# label lhs edge rhs est std group fixed par
# 1 lambda[11]^{(y)} perfIQ -> pc 1.000 0.6219648 Group 1 TRUE 0
# 2 lambda[21]^{(y)} perfIQ -> pa 0.923 0.5664888 Group 1 FALSE 1
# 3 lambda[31]^{(y)} perfIQ -> oa 1.098 0.6550159 Group 1 FALSE 2
# 4 lambda[41]^{(y)} perfIQ -> ma 0.784 0.4609990 Group 1 FALSE 3
# 5 theta[11]^{(epsilon)} pc <-> pc 5.088 0.6131598 Group 1 FALSE 5
# 10 theta[22]^{(epsilon)} pa <-> pa 5.787 0.6790905 Group 1 FALSE 6
# 15 theta[33]^{(epsilon)} oa <-> oa 5.150 0.5709541 Group 1 FALSE 7
# 20 theta[44]^{(epsilon)} ma <-> ma 7.311 0.7874800 Group 1 FALSE 8
# 21 psi[11] perfIQ <-> perfIQ 3.210 1.0000000 Group 1 FALSE 4
# 22 tau[1]^{(y)} int pc 10.500 NA Group 1 FALSE 9
# 23 tau[2]^{(y)} int pa 10.374 NA Group 1 FALSE 10
# 24 tau[3]^{(y)} int oa 10.663 NA Group 1 FALSE 11
# 25 tau[4]^{(y)} int ma 10.371 NA Group 1 FALSE 12
# 11 lambda[11]^{(y)} perfIQ -> pc 1.000 0.6515609 Group 2 TRUE 0
# 27 lambda[21]^{(y)} perfIQ -> pa 0.923 0.5876948 Group 2 FALSE 1
# 31 lambda[31]^{(y)} perfIQ -> oa 1.098 0.6981974 Group 2 FALSE 2
# 41 lambda[41]^{(y)} perfIQ -> ma 0.784 0.4621919 Group 2 FALSE 3
# 51 theta[11]^{(epsilon)} pc <-> pc 5.006 0.5754684 Group 2 FALSE 14
# 101 theta[22]^{(epsilon)} pa <-> pa 5.963 0.6546148 Group 2 FALSE 15
# 151 theta[33]^{(epsilon)} oa <-> oa 4.681 0.5125204 Group 2 FALSE 16
# 201 theta[44]^{(epsilon)} ma <-> ma 8.356 0.7863786 Group 2 FALSE 17
# 211 psi[11] perfIQ <-> perfIQ 3.693 1.0000000 Group 2 FALSE 13
# 221 tau[1]^{(y)} int pc 10.500 NA Group 2 FALSE 9
# 231 tau[2]^{(y)} int pa 10.374 NA Group 2 FALSE 10
# 241 tau[3]^{(y)} int oa 10.663 NA Group 2 FALSE 11
# 251 tau[4]^{(y)} int ma 10.371 NA Group 2 FALSE 12
# 26 alpha[1] int perfIQ -2.469 NA Group 2 FALSE 18
As you can see, there are more edge labels than ones that are plotted, and I have no idea how it chooses which to use, so I am just taking the first four from each group (since there are four edges shown and the stds match those). Maybe there is an option to plot all of them or select which ones you need; I haven't read the docs.
## take first four stds from each group, generate some p-values
l <- sapply(split(x@Pars$std, x@Pars$group), function(x) head(x, 4))
set.seed(1)
l <- sprintf('%.3f, p=%s', l, format.pval(runif(length(l)), digits = 2))
l
# [1] "0.622, p=0.27" "0.566, p=0.37" "0.655, p=0.57" "0.461, p=0.91" "0.652, p=0.20" "0.588, p=0.90" "0.698, p=0.94" "0.462, p=0.66"
Then you can plot the object with your new labels, edgeLabels = l
layout(1:2)
semPaths(
x,
edgeLabels = l,
ask = FALSE, title = FALSE,
what = 'std',
whatLabels = 'std',
style = 'ram',
edge.label.cex = 1.3,
layout = 'tree',
intercepts = FALSE,
residuals = FALSE,
sizeMan = 7
)
With the help from @rawr, I have worked it out. If anybody else needs to include estimates and p-values from lavaan in their semPaths, here is how it can be done.
#extracting the parameters from the sem model and selecting the interactions relevant for the semPaths (here, I need 12 estimates and p-values)
table2<-parameterEstimates(fitmod.bac.class2,standardized=TRUE) %>% head(12)
#turning the chosen parameters into text
b<-gettextf('%.3f \n p=%.3f', table2$std.all, digits=table2$pvalue)
I can honestly say that I do not understand how the last bit of script works. It was adapted from rawr's answer, followed by a lot of trial and error until it worked. There might (quite possibly) be a nicer way to write it, but it works :)
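For what it is worth, the formatting step is just printf-style string formatting applied element by element: each %.3f placeholder is filled, in order, by the next argument. As far as I can tell, the name digits= plays no role there, and the p-values simply fill the second placeholder. The Python sketch below uses the same format string on three std.all/p-value pairs read off the summary output above; the variable names are illustrative.

```python
# Standardized estimates and p-values taken from the lavaan summary above
# (CN -> c_Negativicutes, CN -> c_cand_class_MB_A2_108, c_Bacilli -> CO2).
std_all = [0.416, -0.394, 0.496]
p_values = [0.003, 0.007, 0.000]

# One label per edge: estimate on the first line, p-value on the second.
labels = ["%.3f \n p=%.3f" % (est, p) for est, p in zip(std_all, p_values)]

for label in labels:
    print(repr(label))
# '0.416 \n p=0.003'
# '-0.394 \n p=0.007'
# '0.496 \n p=0.000'
```

The resulting labels have to be in the same order as the edges that semPaths draws, which is why the number of rows kept from the parameter table matters.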
#putting that list into edgeLabels in sempaths
semPaths(fitmod.bac.class2,
what = "std",
edgeLabels = b,
style="ram",
edge.label.cex = 1,
layout = 'tree',
intercepts=FALSE,
residuals=FALSE,
nodeLabels = c("Negati-\nvicutes","cand_class\n_MB_A2_108", "CO2", "Bacilli","Ignavi-\nbacteria","C/N", "pH","Water\ncontent"),
sizeMan=7
)
Just a small but relevant improvement to the above answer.
The code above requires inspecting the parameter table to count how many rows to keep, as in %>% head(12).
Instead, we can drop from the extracted parameter table the rows whose lhs and rhs are equal (the variances).
#extracting the parameters from the sem model and selecting the interactions relevant for the semPaths
table2<-parameterEstimates(fitmod.bac.class2,standardized=TRUE)%>%as.data.frame()
table2<-table2[!table2$lhs==table2$rhs,]
If the model also contains extra lines, such as parameters defined with ':=', these also appear in the parameter table and should be removed.
The rest stays the same...
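The filtering logic can be sketched in Python on a toy parameter table. The rows below are a hypothetical excerpt (the ':=' row in particular is invented to show the second filter); only the lhs, op and rhs columns matter here.

```python
# Toy parameter table with the lhs / op / rhs columns of a lavaan-style table.
params = [
    {"lhs": "c_Negativicutes", "op": "~", "rhs": "CN"},
    {"lhs": "c_Bacilli", "op": "~", "rhs": "pH"},
    {"lhs": "CN", "op": "~~", "rhs": "initial_water"},
    {"lhs": "CN", "op": "~~", "rhs": "CN"},           # variance: lhs equals rhs
    {"lhs": "indirect", "op": ":=", "rhs": "a*b"},    # hypothetical defined parameter
]

# Keep only rows that connect two different variables and are not ':=' definitions.
edges = [r for r in params if r["lhs"] != r["rhs"] and r["op"] != ":="]

for r in edges:
    print(r["lhs"], r["op"], r["rhs"])
# c_Negativicutes ~ CN
# c_Bacilli ~ pH
# CN ~~ initial_water
```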
#turning the chosen parameters into text
b<-gettextf('%.3f \n p=%.3f', table2$std.all, digits=table2$pvalue)
#putting that list into edgeLabels in sempaths
semPaths(fitmod.bac.class2,
what = "std",
edgeLabels = b,
style="ram",
edge.label.cex = 1,
layout = 'tree',
intercepts=FALSE,
residuals=FALSE,
nodeLabels = c("Negati-\nvicutes","cand_class\n_MB_A2_108", "CO2", "Bacilli","Ignavi-\nbacteria","C/N", "pH","Water\ncontent"),
sizeMan=7
)