I'm trying to fit a von Bertalanffy growth function (VBGF) in R to my data, grouped by a serial number.
This is a snippet of my data:
Serial_No <- c(315,315,315,315,315,315,315,316,316,316,316,317,317,317,317,317,317,317,317,317,318,318,318,318,319,319,319,319)
Year <- rep(1945, 28)
tl <- c(19,33,46,55,63,66,70,22,39,55,65,20,40,51,56,60,62,63,64,65,26,43,54,60,28,43,53,61)
age <- c(1,2,3,4,5,6,7,1,2,3,4,1,2,3,4,5,6,7,8,9,1,2,3,4,1,2,3,4)
df <- data.frame(Serial_No, Year, tl, age)
I've been following this example: https://www.r-bloggers.com/2020/01/von-bertalanffy-growth-plots-ii/
and have adapted my code as follows:
library(FSA)    # vbFuns(), vbStarts()
library(dplyr)  # group_by(), summarize(), filter()
library(car)    # Boot()

vb <- vbFuns()
predict2 <- function(x) predict(x, data.frame(age = ages))

agesum <- group_by(df, Serial_No) %>%
  summarize(minage = min(age), maxage = max(age))
Serial_Nos <- unique(df$Serial_No)
nSerial_Nos <- length(Serial_Nos)

cfs <- cis <- preds1 <- preds2 <- NULL
for (i in 1:nSerial_Nos) {
  cat(Serial_Nos[i], "Loop\n")
  tmp1 <- filter(df, Serial_No == Serial_Nos[i])
  sv1 <- vbStarts(tl ~ age, data = tmp1)
  fit1 <- nls(tl ~ vb(age, Linf, K, t0), data = tmp1, start = sv1)
  cfs <- rbind(cfs, coef(fit1))
  boot1 <- Boot(fit1)
  tmp2 <- confint(boot1)
  cis <- rbind(cis, c(tmp2["Linf", ], tmp2["K", ], tmp2["t0", ]))
  ages <- seq(-1, 16, 0.2)
  boot2 <- Boot(fit1, f = predict2)
  tmp2 <- data.frame(Serial_No = Serial_Nos[i], age = ages,
                     predict(fit1, data.frame(age = ages)),
                     confint(boot2))
  preds1 <- rbind(preds1, tmp2)
  tmp2 <- filter(tmp2, age >= agesum$minage[i], age <= agesum$maxage[i])
  preds2 <- rbind(preds2, tmp2)
}
The code runs, but the VBGF results returned are the same for every Serial_No, which can't be right. I think it's the filter() call in the code above that is not working.
I've searched for solutions but can't get it to work.
If anyone can help, or knows of a solution, I'd really appreciate it.
Thank you in advance.
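A quick sanity check worth running first (a sketch, not from the original post): qualify filter() with dplyr:: to rule out any masking, and print what each iteration actually sees. If every iteration reports the same row count and coefficients, the subsetting is the culprit; otherwise the problem is further downstream.
# hypothetical debugging aid, assuming the df, vb and Serial_Nos objects above
for (i in 1:nSerial_Nos) {
  tmp1 <- dplyr::filter(df, Serial_No == Serial_Nos[i])
  cat("Serial_No", Serial_Nos[i], "- rows:", nrow(tmp1), "\n")
  print(coef(nls(tl ~ vb(age, Linf, K, t0), data = tmp1,
                 start = vbStarts(tl ~ age, data = tmp1))))
}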
Model fit with package growthrates
The following answer describes an alternative approach without a for loop and without filter(). Similar loop-free solutions can be implemented with the plain nls() function and lapply() in base R, or with group_by() in the tidyverse.
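For comparison, here is a minimal sketch of the base-R route with split() and lapply() (my addition, assuming the df from the question and rough hand-picked start values):
# loop-free base R: one nls() fit per Serial_No
vbgf <- function(age, Linf, K, t0) Linf * (1 - exp(-K * (age - t0)))
fits <- lapply(split(df, df$Serial_No), function(d)
  nls(tl ~ vbgf(age, Linf, K, t0), data = d,
      start = c(Linf = 70, K = 0.3, t0 = 0)))
t(sapply(fits, coef))  # one row of estimates per Serial_No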
Model definition
The growthrates package does not contain a von Bertalanffy function, so it has to be provided as a user-supplied model, as described in the package vignette. Here I borrowed the function from the FSA package and adapted it accordingly:
library("growthrates")
grow_von_bert <- function(time, parms) {
  with(as.list(parms), {
    y <- Linf * (1 - exp(-K * (time - t0)))
    # growthrates expects a matrix with columns "time" and "y"
    as.matrix(data.frame(time = time, y = y))
  })
}
Test of the model with a single example
p <- c(t0=5, Linf=10, K=.1)
time <- seq(5, 100)
plot(grow_von_bert(time, p), type="l")
Fit of a single data example
It is always a good idea to fit one or a few single examples first, before fitting them all.
df1 <- subset(df, Serial_No == 315)
fit1 <- fit_growthmodel(time = df1$age, y = df1$tl,
                        FUN = grow_von_bert, p = c(t0 = 0, Linf = 70, K = 0.1))
summary(fit1)
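To pull out the fitted parameters of this single fit, coef() should work on the returned fit object (my note; the growthrates vignette uses it the same way):
coef(fit1)  # fitted t0, Linf and K for Serial_No 315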
Fit of all data sets
This can be done in a loop or with appropriate tidyverse functions, but package growthrates already has such a function built in, so all models can be fitted with a single function call. It is of course necessary to specify good start parameters, either the same for all curves or individual parameter sets, depending on the quality of the data. Here is the complete code, including the data of the OP:
library("growthrates")
df <- data.frame(
  Serial_No = factor(c(315,315,315,315,315,315,315,316,316,316,316,317,317,317,317,
                       317,317,317,317,317,318,318,318,318,319,319,319,319)),
  year = rep(1945, 28),
  tl = c(19,33,46,55,63,66,70,22,39,55,65,20,40,51,56,60,62,63,64,65,26,43,54,60,28,
         43,53,61),
  age = c(1,2,3,4,5,6,7,1,2,3,4,1,2,3,4,5,6,7,8,9,1,2,3,4,1,2,3,4)
)

grow_von_bert <- function(time, parms) {
  with(as.list(parms), {
    y <- Linf * (1 - exp(-K * (time - t0)))
    as.matrix(data.frame(time = time, y = y))
  })
}
fit <- all_growthmodels(tl ~ age | Serial_No,
                        data = df,
                        FUN = grow_von_bert,
                        p = c(t0 = 0, Linf = 70, K = 0.1))
results(fit)

par(mfrow = c(2, 3))
plot(fit, las = 1)
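As a small follow-up (my addition): results() returns a plain data frame with one row per group, so the per-group estimates can be post-processed like any other data frame. The column names below assume the parameter vector p plus the r2 column that results() reports:
res <- results(fit)     # one row of t0, Linf, K (plus fit statistics) per Serial_No
res[order(res$Linf), ]  # e.g. rank the groups by asymptotic length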
I want to be able to input the variable name that I'll be using in the weights argument of the lmer() function, so that I can change the dataset, cycle through the weights, and pull the correct variable.
I want to pull the correct column for weights within the for loop.
So for y, the call would be:
lmer(y~x+(1|study), weights = weight.var)
And for y1:
lmer(y1~x+(1|study), weights = weight.var1)
So I named the weighting variables (weight.opt) and want to use them in the formula within the for loop. I can use as.formula() to get the formula working and connected to the dataset, but I'm not sure how to do something similar with the weights.
library(lme4)

x <- rnorm(300, 0, 1)
y <- x * rnorm(300, 2, 0.5)
y1 <- x * rnorm(300, 0.1, 0.1)
study <- rep(c("a", "b", "c"), each = 100)
weight.var <- rep(c(0.5, 2, 4), each = 100)
weight.var1 <- rep(c(0.1, 0.2, 0.15), each = 100)
dataset <- data.frame(x, y, y1, study, weight.var, weight.var1)

resp1 <- c("y", "y1")
weight.opt <- c("weight.var", "weight.var1")
for(i in 1:2){
  lmer(as.formula(paste(resp1[i], "~x+(1|study)")),
       weights = weight.opt[i], data = dataset)  # fails: weight.opt[i] is just a string
}
This seems to work fine:
res_list <- list()
for(i in 1:2){
  res_list[[i]] <- lmer(as.formula(paste(resp1[i], "~x+(1|study)")),
                        weights = dataset[[weight.opt[i]]], data = dataset)
}
While using regsubsets() from package leaps on data with linear dependencies, I found that the results given by coef() and by summary()$which differ. It seems that, when linear dependencies are found, reordering changes the position of the coefficients and coef() returns wrong values.
I use mtcars just to simulate the problem I had with other data. In the first example there is no issue of linear dependencies, the best model by BIC is mpg~wt+cyl, and coef() and summary()$which give the same result. In the second example I add a dummy variable so that perfect multicollinearity is possible, but with the variables in this order (dummy in the last column) the problem does not appear. In the last example, after changing the order of variables in the dataset, the problem finally appears and coef() and summary()$which give different models. Is there anything incorrect in this approach? Is there any other way to get coefficients from regsubsets?
require("leaps") #install.packages("leaps")
###Example1
dta <- mtcars[,c("mpg","cyl","am","wt","hp") ]
bestSubset.cars <- regsubsets(mpg~., data=dta)
(best.sum <- summary(bestSubset.cars))
#
w <- which.min(best.sum$bic)
best.sum$which[w,]
#
best.sum$outmat
coef(bestSubset.cars, w)
#
###Example2
dta2 <- cbind(dta, manual=as.numeric(!dta$am))
bestSubset.cars2 <- regsubsets(mpg~., data=dta2)  # dta2: includes the added dummy
(best.sum2 <- summary(bestSubset.cars2))
#
w <- which.min(best.sum2$bic)
best.sum2$which[w,]
#
coef(bestSubset.cars2, w)
#
###Example3
bestSubset.cars3 <- regsubsets(mpg~., data=dta2[,c("mpg","manual","am","cyl","wt","hp")])
(best.sum3 <- summary(bestSubset.cars3))
#
w <- which.min(best.sum3$bic)
best.sum3$which[w,]
#
coef(bestSubset.cars3, w)
#
best.sum2$which
coef(bestSubset.cars2,1:4)
best.sum3$which
coef(bestSubset.cars3,1:4)
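One other way to get the coefficients, independent of coef.regsubsets(): read the selected variables from summary()$which and refit with lm(). A sketch, assuming the Example 3 objects:
# refit the BIC-best subset with lm() and take its coefficients
w <- which.min(best.sum3$bic)
vars <- setdiff(names(which(best.sum3$which[w, ])), "(Intercept)")
coef(lm(reformulate(vars, response = "mpg"), data = dta2))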
The orders of variables used by summary.regsubsets() and by regsubsets() itself are different. The coef() generic for regsubsets combines the two, and the results get mixed up if you use force.in or a formula with a fixed variable order. Changing some lines in the coef() function might help. Try the code below and see if it works!
coef.regsubsets <- function (object, id, vcov = FALSE, ...)
{
  s <- summary(object)
  invars <- s$which[id, , drop = FALSE]
  betas <- vector("list", length(id))
  for (i in 1:length(id)) {
    # added: map the selected variable names onto the original column order
    var.name <- names(which(invars[i, ]))
    thismodel <- which(object$xnames %in% var.name)
    names(thismodel) <- var.name
    # deleted:
    # thismodel <- which(invars[i, ])
    qr <- .Fortran("REORDR", np = as.integer(object$np),
                   nrbar = as.integer(object$nrbar), vorder = as.integer(object$vorder),
                   d = as.double(object$d), rbar = as.double(object$rbar),
                   thetab = as.double(object$thetab), rss = as.double(object$rss),
                   tol = as.double(object$tol), list = as.integer(thismodel),
                   n = as.integer(length(thismodel)), pos1 = 1L, ier = integer(1))
    beta <- .Fortran("REGCF", np = as.integer(qr$np), nrbar = as.integer(qr$nrbar),
                     d = as.double(qr$d), rbar = as.double(qr$rbar), thetab = as.double(qr$thetab),
                     tol = as.double(qr$tol), beta = numeric(length(thismodel)),
                     nreq = as.integer(length(thismodel)), ier = numeric(1))$beta
    names(beta) <- object$xnames[qr$vorder[1:qr$n]]
    reorder <- order(qr$vorder[1:qr$n])
    beta <- beta[reorder]
    if (vcov) {
      p <- length(thismodel)
      R <- diag(qr$np)
      R[row(R) > col(R)] <- qr$rbar
      R <- t(R)
      R <- sqrt(qr$d) * R
      R <- R[1:p, 1:p, drop = FALSE]
      R <- chol2inv(R)
      dimnames(R) <- list(object$xnames[qr$vorder[1:p]],
                          object$xnames[qr$vorder[1:p]])
      V <- R * s$rss[id[i]]/(object$nn - p)
      V <- V[reorder, reorder]
      attr(beta, "vcov") <- V
    }
    betas[[i]] <- beta
  }
  if (length(id) == 1)
    beta
  else betas
}
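A brief usage note (my addition): once the patched coef.regsubsets() above is sourced into the global environment, S3 dispatch picks it up instead of the package version, so the Example 3 extraction can simply be repeated:
w <- which.min(best.sum3$bic)
coef(bestSubset.cars3, w)  # should now agree with best.sum3$which[w, ]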
Another solution that works for me is to randomize the order of the columns (independent variables) in your dataset before running regsubsets(). The idea is that, after reordering, the highly correlated columns will hopefully be far apart from each other and will not trigger the reordering behavior in the regsubsets algorithm.
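A minimal sketch of that idea (my addition, assuming the dta2 data frame from the question):
# shuffle the predictor columns, keeping the response first
set.seed(1)
shuffled <- dta2[, c("mpg", sample(setdiff(names(dta2), "mpg")))]
bestSubset.rand <- regsubsets(mpg ~ ., data = shuffled)
coef(bestSubset.rand, which.min(summary(bestSubset.rand)$bic))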
I am trying to compute prediction intervals for my neural network created with the neuralnet package.
I use R inside Tableau Software, by creating .RData files that contain my functions and are loaded in Tableau.
It's a simple NN with one hidden layer containing 5 nodes. I searched and found this package: nnetpredint
So I tried to use it, following their examples.
I also tried to change the way I use it (train/test in the same data frame, separate data frames with the same column names, etc.).
The best result I got was the prediction, but without the lowerBound and upperBound columns.
In fact, I got exactly the same result as when I use compute(myNN, etc.), but I don't get the second and third columns.
Thanks for your help,
EDIT:
My data comes from Tableau; my function takes five parameters:
ValuesToExplain, train1, train2, test1, test2.
I then create and train my NN with the first three and try to predict from the last two.
(test1 = k*train1 and test2 = k2*train2 for now, but this will probably change in the future.)
Here is my whole code:
NNetwork <- function(objectiveValues, knownValues1, knownValues2, newData, newData2){
  numberOfColumn <- 3

  ## Create the training data frame
  training <- data.frame(objectiveValues, knownValues1, knownValues2)
  training[which(is.na(training[,"objectiveValues"])), "objectiveValues"] <- mean(training[,"objectiveValues"], na.rm = TRUE)
  training[which(is.na(training[,"knownValues1"])), "knownValues1"] <- mean(training[,"knownValues1"], na.rm = TRUE)
  training[which(is.na(training[,"knownValues2"])), "knownValues2"] <- mean(training[,"knownValues2"], na.rm = TRUE)

  ## Create the testing data frame
  testing <- data.frame(objectiveValues, newData, newData2)
  names(testing) <- c("objectiveValues", "knownValues1", "knownValues2")
  testing[which(is.na(testing[,"objectiveValues"])), "objectiveValues"] <- mean(testing[,"objectiveValues"], na.rm = TRUE)
  testing[which(is.na(testing[,"knownValues1"])), "knownValues1"] <- mean(testing[,"knownValues1"], na.rm = TRUE)
  testing[which(is.na(testing[,"knownValues2"])), "knownValues2"] <- mean(testing[,"knownValues2"], na.rm = TRUE)

  ## Scaling (min-max to [0, 1], using the training ranges)
  maxs <- apply(training, 2, max)
  mins <- apply(training, 2, min)
  trainingScaled <- as.data.frame(scale(training, center = mins, scale = maxs - mins))
  testingScaled <- as.data.frame(scale(testing, center = mins, scale = maxs - mins))

  ### Neural network part
  library(neuralnet)
  n <- names(trainingScaled)
  f <- as.formula(paste("objectiveValues ~", paste(n[!n %in% "objectiveValues"], collapse = " + ")))

  # Training the NN
  nn <- neuralnet(f, data = trainingScaled, hidden = 5, linear.output = TRUE)

  # Using the NN
  computedTrainingScaled <- compute(nn, trainingScaled[, 2:numberOfColumn])
  computedFromNNScaled <- compute(nn, testingScaled[, 2:numberOfColumn])

  # Unscaling
  computedTraining <- computedTrainingScaled$net.result * (max(training$objectiveValues) - min(training$objectiveValues)) + min(training$objectiveValues)
  computedFromNN <- computedFromNNScaled$net.result * (max(training$objectiveValues) - min(training$objectiveValues)) + min(training$objectiveValues)

  RSquare <- (1 - ((sum((training$objectiveValues - computedTraining)^2)) / (sum((training$objectiveValues - mean(training$objectiveValues))^2)))) * 100
  RSE <- sum((training$objectiveValues - computedTraining)^2) / nrow(training)

  res <- (1:nrow(training))

  library(nnetpredint) # Getting prediction intervals
  x <- trainingScaled[, 2:numberOfColumn]
  y <- trainingScaled[1]
  newData <- testingScaled[, 2:numberOfColumn]

  # S3 generic method: object of class nn
  yPredInt <- nnetPredInt(nn, x, y, newData)

  for(i in 1:nrow(training)){
    res[i] <- paste(computedFromNN[i], RSquare, RSE, sep = "#")
  }
  return(res)
}
save(NNetwork, file = "NNetwork.RData")
Here, I removed the part using the nnetpredint package because it was not working, but it was like this:
library(nnetpredint)
y <- trainingScaled
x <- trainingScaled[,2:3]
newData <- testingScaled[,2:3]
yPredInt <- nnetPredInt(nn, x, y, newData)
My problem is that when I try to access yPredInt$lowerBound or yPredInt$upperBound, they don't exist.
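A quick check that may narrow this down (my addition, not from the original post): inspect what nnetPredInt() actually returned, and make sure y is the response column only, since the snippet above passes the whole trainingScaled data frame as y.
str(yPredInt)  # what did nnetPredInt() really return?
yPredInt <- nnetPredInt(nn, trainingScaled[, 2:3], trainingScaled[, 1],
                        testingScaled[, 2:3])
head(yPredInt)  # the docs describe yPredValue, lowerBound and upperBound columns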